Question

I have text files that contain lists of thousands of names, like this:

DallasWebJobs
DallasWebJobs
DallasWebJobs
php_gigs
brotherjudkins
goldbergwb
SanDiegoWebJobs
brinteractive
muracms
browan85
php_gigs
php_gigs
php_gigs
php_gigs

One name per line. A single file may have up to 30,000 names on it, and I need to remove all the duplicate names, because probably as many as half are duplicates.

I would like to do this in PHP. One thought was importing each line into a MySQL database and de-duplicating there, but that seems like overkill; I'm sure there is an easier way.

Please help if you can.


Update: I found this for emails; it should work here too:

$list = file('./Emailist.txt');
$list_unique = array_unique($list);
foreach ($list_unique as $mail) {
    echo $mail;
}

Solution

From php.net: serg dot podtynnyi at gmail dot com 06-Feb-2009 11:21

// Remove duplicates from a text file and dump the result into one file, for example: an email list, a links list, etc.

<?php

// Read the file as an array of lines, dropping the trailing newlines
// so the last line (which may lack one) still matches its duplicates.
$data1 = file("data1.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// "\n" must be double-quoted; a single-quoted '\n' would write a literal backslash-n.
file_put_contents('unique.txt', implode("\n", array_unique($data1)));
?>

This will remove all duplicates and save the result as unique.txt.

or

<?php

$data1 = file("data1.txt", FILE_IGNORE_NEW_LINES);

$uniqueArray = array_unique($data1);
?>

This will store the unique lines in $uniqueArray.

OTHER TIPS

$lines = file("test-file");

// Strip line endings and use each name as an array key,
// so duplicates collapse automatically.
$new = array();
foreach ($lines as $line)
{
    $new[str_replace(array("\n", "\r"), "", $line)] = 1;
}

print_r(array_keys($new));
$file = file_get_contents($filename);
// explode() with a double-quoted "\n" replaces the deprecated split(),
// which would also have treated a single-quoted '\n' as a literal backslash-n.
$arr = explode("\n", $file);
$arr = array_unique($arr);

Then write the contents of $arr back out to a text file, as sketched below.
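A minimal sketch of that last step, assuming the result should go to a hypothetical unique.txt in the current directory:

// Re-join the unique names, one per line, and write them back out.
// 'unique.txt' is just an example output filename.
file_put_contents('unique.txt', implode("\n", $arr));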

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow