Question

What is the best way to extract just the domain (like example.com) from a URL, taking into account multi-part TLDs like .co.uk?

Is it only possible with a manual list of the .xx.xx suffixes, or do they follow a pattern?

My first thought was to filter out any domain part under 3 characters, but there are suffixes like .org.xx whose first part is 3 characters long.

Solution

The suffixes don't follow a pattern, so yes, you will need a per-TLD (or even per-domain) list. It doesn't have to be "manual" per se; in fact, I would strongly recommend using the Public Suffix List from http://publicsuffix.org/
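
To illustrate the list-based approach, here is a minimal sketch in Python. It assumes a tiny hand-picked set of suffixes (SAMPLE_SUFFIXES) standing in for the real list, and the function name registrable_domain is made up for this example. It only does plain longest-suffix matching; the real list also contains wildcard and exception rules, which this sketch ignores.

```python
from urllib.parse import urlsplit

# Illustrative excerpt only (an assumption for this example). The real list
# has thousands of entries, plus wildcard and exception rules not handled here.
SAMPLE_SUFFIXES = {"com", "org", "uk", "co.uk", "org.uk"}


def registrable_domain(url, suffixes=SAMPLE_SUFFIXES):
    """Return the matched public suffix plus one more label, or None."""
    host = urlsplit(url).hostname  # lowercased host without port, or None
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    # Check the longest candidate first, so "co.uk" is found before "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # nothing in the sample list matched


print(registrable_domain("http://www.example.co.uk/page"))
print(registrable_domain("https://blog.example.com:8080/"))
```

Note that the URL needs a scheme (or a leading //) for urlsplit to recognise the host. In practice you would load the full list from publicsuffix.org, or use an existing library that wraps it, rather than hard-coding suffixes like this.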

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow