Question

Where can I find code (preferably JavaScript) that strips the www prefix and the domain suffix (the top-level domain and, where applicable, the second-level domain, such as co.il) from URLs?

Example:

www.ynet.co.il -> ynet (stripped 'co.il' - two tokens)
www.nike.com -> nike (stripped 'com' - one token)

etc

Failing that, a full list of second-level domains (preferably as CSV, though any format would do) would also be welcome.

Solution

If you use Java, Guava can help you here.

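You can use InternetDomainName.topPrivateDomain() together with publicSuffix() to solve your problem: topPrivateDomain() gives the registrable domain (ynet.co.il for www.ynet.co.il) and publicSuffix() gives the suffix (co.il), so removing the suffix from the former leaves ynet.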

Guava (as well as Mozilla/Firefox, Chrome and Opera) uses the Public Suffix List for this functionality (the raw data is published at publicsuffix.org).
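Since a downloadable list was the second-best option asked for, here is a rough sketch of reading that raw data in JavaScript. The file is plain text with one rule per line: lines starting with // are comments, a leading *. marks a wildcard rule, and a leading ! marks an exception to a wildcard. The SAMPLE_LIST excerpt and the parseSuffixRules name are made up for illustration; in practice you would feed in the downloaded file contents.

```javascript
// Hypothetical excerpt written in the Public Suffix List text format.
// The real file is far longer; this is only for illustration.
const SAMPLE_LIST = [
  "// comment lines start with two slashes",
  "com",
  "il",
  "co.il",
  "",
  "*.ck",
  "!www.ck",
].join("\n");

// Hypothetical helper: splits the list text into three sets of rules.
function parseSuffixRules(text) {
  const rules = { exact: new Set(), wildcard: new Set(), exception: new Set() };
  for (const rawLine of text.split(/\r?\n/)) {
    // Only the text up to the first whitespace counts as the rule.
    const rule = rawLine.trim().split(/\s+/)[0].toLowerCase();
    if (rule === "" || rule.startsWith("//")) continue; // blank line or comment
    if (rule.startsWith("!")) {
      rules.exception.add(rule.slice(1)); // "!www.ck" is stored as "www.ck"
    } else if (rule.startsWith("*.")) {
      rules.wildcard.add(rule.slice(2)); // "*.ck" is stored as "ck"
    } else {
      rules.exact.add(rule);
    }
  }
  return rules;
}

const parsedRules = parseSuffixRules(SAMPLE_LIST);
```

This only loads the rules; the wildcard and exception sets still have to be honoured when matching a host name, which is the part a library handles for you.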

tld.js is a JavaScript library that uses that data as well.
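To show the idea without any library, here is a minimal sketch that strips the longest known suffix and returns the label just before it. It is not tld.js's API: SAMPLE_SUFFIXES is a tiny hand-picked subset of the list and nameBeforeSuffix is a hypothetical helper. It also ignores the list's wildcard and exception rules, so treat it as an illustration rather than a replacement for a real library.

```javascript
// Illustrative subset only (an assumption for this sketch);
// the real Public Suffix List has thousands of entries.
const SAMPLE_SUFFIXES = new Set(["com", "il", "co.il", "uk", "co.uk"]);

// Hypothetical helper: returns the label directly left of the longest
// matching suffix, or null if no suffix matches or nothing precedes it.
function nameBeforeSuffix(hostname, suffixes = SAMPLE_SUFFIXES) {
  // Normalise case and drop a trailing dot before splitting into labels.
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  // Candidates run from longest to shortest, so the first hit is the
  // longest matching suffix ("co.il" is tried before "il").
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (suffixes.has(candidate)) {
      return i > 0 ? labels[i - 1] : null;
    }
  }
  return null;
}

console.log(nameBeforeSuffix("www.ynet.co.il")); // ynet
console.log(nameBeforeSuffix("www.nike.com"));   // nike
```

Because the label next to the suffix is returned, the www prefix (or any other subdomain) drops out without special handling.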

Other tips

Something like https://gist.github.com/2428561? Otherwise, search Google for 'javascript url parser'.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow