Question

My array consists of URLs, and I noticed several are “somewhat” duplicates. Some URLs have a www. in front, and some have the same domain without the www. How would I find the duplicates and then kick out the one that has the lower domain value?

I played around with array_unique(), but the problem is that my entries are not exact duplicates because of the www.

Current array:

Array
(
    [0] => Array
        (
            [url] => www.domain1.com
            [domain_value] => 653
        )
    [1] => Array
        (
            [url] => www.domain2.com
            [domain_value] => 412
        )
    [2] => Array
        (
            [url] => www.domain3.com
            [domain_value] => 723
        )
    [3] => Array
        (
            [url] => domain1.com
            [domain_value] => 543
        )
    [4] => Array
        (
            [url] => domain2.com
            [domain_value] => 956
        )

)

My goal:

Array
(
    [0] => Array
        (
            [url] => www.domain1.com
            [domain_value] => 653
        )
    [1] => Array
        (
            [url] => www.domain3.com
            [domain_value] => 723
        )
    [2] => Array
        (
            [url] => domain2.com
            [domain_value] => 956
        )

)

Solution

You could do this a number of ways.

The first option is to split them into two different arrays: WWW and NONWWW. You could do that using either preg_match or strpos. (The strpos example is commented out below.)

An example of that would be something like this:

<?php
$www = array();
$nonwww = array();
foreach ($array as $domain) {
    // USING PREG_MATCH: match "www." only at the start of the url
    if (preg_match('/^www\./', $domain['url'])) {
        $www[] = $domain;
    } else {
        $nonwww[] = $domain;
    }

    // USING STRPOS: position 0 means the url starts with "www."
    //if (strpos($domain['url'], 'www.') === 0) {
    //    $www[] = $domain;
    //} else {
    //    $nonwww[] = $domain;
    //}
}

?>

This would leave you with two arrays:

WWW

Array
(
    [0] => Array
        (
            [url] => www.domain1.com
            [domain_value] => 653
        )

    [1] => Array
        (
            [url] => www.domain2.com
            [domain_value] => 412
        )

    [2] => Array
        (
            [url] => www.domain3.com
            [domain_value] => 723
        )

)

NONWWW

Array
(
    [0] => Array
        (
            [url] => domain1.com
            [domain_value] => 543
        )

    [1] => Array
        (
            [url] => domain2.com
            [domain_value] => 956
        )

)

Now all you have to do is match each www entry against its non-www counterpart, drop whichever of the pair has the lower domain_value, and merge the two arrays back together.
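
One way to finish that step, as a minimal sketch rather than a definitive implementation: index the non-www entries by url, then compare each www entry against its bare counterpart and keep whichever has the higher domain_value. The function name mergeKeepingHighest is made up for illustration, and the code assumes $www and $nonwww are shaped like the two arrays above and that every url in $www starts with "www.".

```php
<?php
// Hypothetical helper (not part of the original answer): merges the two arrays,
// keeping the entry with the higher domain_value when a domain appears in both.
function mergeKeepingHighest(array $www, array $nonwww): array
{
    // Index the non-www entries by their url for direct lookup.
    $byHost = array();
    foreach ($nonwww as $domain) {
        $byHost[$domain['url']] = $domain;
    }

    foreach ($www as $domain) {
        // Assumption: the url starts with "www.", so drop those four characters.
        $host = substr($domain['url'], 4);
        // Keep the www entry if it has no counterpart or a strictly higher value.
        if (!isset($byHost[$host]) || $domain['domain_value'] > $byHost[$host]['domain_value']) {
            $byHost[$host] = $domain;
        }
    }

    // Drop the url keys to get a plain list again.
    return array_values($byHost);
}

$www = array(
    array('url' => 'www.domain1.com', 'domain_value' => 653),
    array('url' => 'www.domain2.com', 'domain_value' => 412),
    array('url' => 'www.domain3.com', 'domain_value' => 723),
);
$nonwww = array(
    array('url' => 'domain1.com', 'domain_value' => 543),
    array('url' => 'domain2.com', 'domain_value' => 956),
);

$result = mergeKeepingHighest($www, $nonwww);
print_r($result);
```

With the sample data this keeps www.domain1.com (653), domain2.com (956) and www.domain3.com (723). The order of the result follows the NONWWW array first, so it may differ from the order shown in the question; on a tie, the non-www entry wins because the comparison is strict.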

Other tips

Loop through your array and, for each item, check whether it has www in it and whether another entry exists that is the same except with the www removed. If there is one, remove the entry.
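
A variation on that loop, sketched under assumptions: instead of always removing the www entry, it keeps whichever entry of a pair has the higher domain_value, which is what the question asks for. The function name dedupeByHost is hypothetical, and the code assumes each entry has the url and domain_value keys shown in the question.

```php
<?php
// Hypothetical helper: a single pass over the array, grouping entries by
// their url with any leading "www." removed.
function dedupeByHost(array $domains): array
{
    $best = array();
    foreach ($domains as $domain) {
        // Strip a leading "www." (case-insensitive) to build the comparison key.
        $host = preg_replace('/^www\./i', '', $domain['url']);
        // Keep the first entry seen for a host, or a later one with a higher value.
        if (!isset($best[$host]) || $domain['domain_value'] > $best[$host]['domain_value']) {
            $best[$host] = $domain;
        }
    }
    // Re-index numerically so the result looks like the input array.
    return array_values($best);
}

$array = array(
    array('url' => 'www.domain1.com', 'domain_value' => 653),
    array('url' => 'www.domain2.com', 'domain_value' => 412),
    array('url' => 'www.domain3.com', 'domain_value' => 723),
    array('url' => 'domain1.com', 'domain_value' => 543),
    array('url' => 'domain2.com', 'domain_value' => 956),
);

$deduped = dedupeByHost($array);
print_r($deduped);
```

Because the key is the stripped url, no second lookup loop is needed, and each host keeps the position where it was first seen in the input.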

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow