Question

I have an array of data containing some domains with TLD extensions. I want to collect the domain name and TLD extension separately.

E.g. From "hello.com" I want to collect "hello" as one variable, and then collect ".com" as another variable.

Another example, and the important one: from "hello.co.uk" I want to collect "hello" as one variable, and then collect ".co.uk" as another variable.

My current code using pathinfo() will work correctly on "hello.com", but not "hello.co.uk". For "hello.co.uk" it will collect "hello.co" as one variable, and then collect ".uk" as another variable.

Here is the code I am using:

// Get a file into an array
$lines = file($_FILES['file']['tmp_name']);

// Loop through array
foreach ($lines as $line_num => $line) {
    echo $line;

    //Find TLD
    $tld = ".".pathinfo($line, PATHINFO_EXTENSION);
    echo $tld;

    //Find Domain
    $domain = pathinfo($line, PATHINFO_FILENAME);
    echo $domain;
}

Hopefully I explained that well enough. I use stackoverflow a lot but couldn't find a specific example of this.

Thanks


Solution

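Instead of using functions intended for file paths, you could use some simple string manipulation and split at the first dot:

$line = trim($line);               // file() keeps the trailing newline on each line
$dot = strpos($line, ".");         // position of the first dot
$domain = substr($line, 0, $dot);  // everything before the first dot
$tld = substr($line, $dot);        // first dot onwards, e.g. ".co.uk"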
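
As a fuller sketch of the same idea, the helper below trims each line and returns null when there is nothing to split. The name splitAtFirstDot is hypothetical, and the sketch assumes each line holds a bare domain with no subdomain and no URL scheme.

```php
<?php
// Hypothetical helper: split a bare domain at its first dot.
function splitAtFirstDot(string $line): ?array
{
    $line = trim($line);            // drop the newline that file() leaves on each line
    $dot = strpos($line, '.');
    if ($dot === false || $dot === 0) {
        return null;                // no dot, or nothing before it
    }
    return [
        'domain' => substr($line, 0, $dot), // text before the first dot
        'tld'    => substr($line, $dot),    // first dot onwards, dot included
    ];
}

foreach (["hello.com\n", "hello.co.uk\n"] as $line) {
    $parts = splitAtFirstDot($line);
    echo $parts['domain'], ' ', $parts['tld'], PHP_EOL;
}
// prints:
// hello .com
// hello .co.uk
```

Returning null instead of an empty pair makes lines without a dot easy to skip in the calling loop.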

OTHER TIPS

First method:

$domains = array("hello.co.uk", "hello.com");

foreach ($domains as $d) {

    $ext = strstr($d, '.');          // extension: from the first dot onwards, e.g. ".co.uk"
    $domain = strstr($d, '.', true); // domain name: everything before the first dot

    echo "domain: $domain, extension: $ext <br/>";

}

Second method: (Thanks to hakre)

$domains = array("hello.co.uk", "hello.com");

foreach ($domains as $d) {

    // Split at the first dot only; note that $ext has no leading dot here (e.g. "co.uk")
    list($domain, $ext) = explode('.', $d, 2);
    echo "domain: $domain, extension: $ext <br/>";

}

Here's a fairly flexible function that accepts anything from a bare example.com to full URLs such as http://username:password@example.com/public_html/test.zip, ftp://username@example.com, or http://www.reddit.com/r/aww/comments/165v9u/shes_allergic_couldnt_help_herself/

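function splitDomain($url) {
    $parts = parse_url($url);
    // Without a scheme (e.g. "example.com"), parse_url() puts the host in 'path'
    $host = isset($parts['host']) ? $parts['host'] : $parts['path'];
    // Strip a leading "www." only, not every occurrence in the host
    $host = preg_replace('/^www\./', '', $host);
    // Split at the first dot so "hello.co.uk" gives "hello" and "co.uk"
    $tmp = explode('.', $host, 2);
    $name = $tmp[0];
    $tld = isset($tmp[1]) ? $tmp[1] : '';
    return array('name' => $name, 'tld' => $tld);
}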
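
To see why the function checks both 'host' and 'path', here is a small sketch of how parse_url() treats input with and without a scheme. The helper name hostOf is hypothetical, and cutting the path at the first "/" is an added assumption for scheme-less input such as example.com/page.

```php
<?php
// Inputs taken from the examples mentioned in the text.
$inputs = [
    'example.com',
    'http://username:password@example.com/public_html/test.zip',
    'ftp://username@example.com',
];

// Hypothetical helper: return the host whether or not a scheme is present.
function hostOf(string $url): string
{
    $parts = parse_url($url);
    if (isset($parts['host'])) {
        return $parts['host'];
    }
    // No scheme: the whole string was parsed as a path,
    // so keep only the part before the first "/".
    $path = $parts['path'] ?? '';
    return explode('/', $path)[0];
}

foreach ($inputs as $url) {
    echo hostOf($url), PHP_EOL; // prints example.com for each input
}
```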

There's no reliable way of doing this other than to use a large table of legal extensions.

A popular table is the one known as the Public Suffix List.
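
A rough sketch of the table-driven approach follows. The suffix array is a tiny hand-picked sample, not the real list, and the helper name splitBySuffix is hypothetical. The real Public Suffix List also has wildcard and exception rules, which this sketch ignores.

```php
<?php
// Tiny hand-picked sample; the real Public Suffix List has thousands of entries.
const SAMPLE_SUFFIXES = ['com', 'uk', 'co.uk', 'au', 'edu.au', 'act.edu.au'];

// Hypothetical helper: split a host into a name and the longest known suffix.
function splitBySuffix(string $host, array $suffixes = SAMPLE_SUFFIXES): ?array
{
    $labels = explode('.', strtolower(trim($host)));
    // Try the longest candidate suffix first by dropping one leading label at a time.
    for ($i = 1; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            return [
                'name' => implode('.', array_slice($labels, 0, $i)),
                'tld'  => '.' . $candidate,
            ];
        }
    }
    return null; // no known suffix matched
}

print_r(splitBySuffix('hello.co.uk'));   // name: hello, tld: .co.uk
print_r(splitBySuffix('www.hello.com')); // name: www.hello, tld: .com
```

Matching the longest suffix first is what lets "co.uk" win over "uk", which a plain split at the first or last dot cannot guarantee once subdomains are involved.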

To handle two-level (co.uk) and three-level (act.edu.au) TLDs, you need a library that uses the Public Suffix List (a list of top-level domains). I recommend TLDExtract.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow