How do I remove duplicate items from an array in Perl?

https://stackoverflow.com/questions/7651

08-06-2019

Question

I have an array in Perl:

my @my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

Solution

You can do something like this, as demonstrated in perlfaq4:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Output:

one two three

If you want to use a module, try the uniq function from List::MoreUtils.
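
As an aside, and as an assumption about your environment rather than part of the original answer: List::Util, which is bundled with Perl, also exports a uniq function from version 1.45 onwards. A minimal sketch using it on the array from the question:

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # uniq was added to List::Util in version 1.45

my @my_array = ("one", "two", "three", "two", "three");

# uniq keeps the first occurrence of each element, in original order
my @filtered = uniq(@my_array);

print "@filtered\n";   # prints: one two three
```

If the installed List::Util is older than 1.45, the use line fails at compile time, and the hand-written sub above remains the portable option.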

Other tips

Perl's documentation comes with a nice collection of frequently asked questions. Yours is one of them:

% perldoc -q duplicate

The answer, copied and pasted from the output of the command above, appears below:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;

Install List::MoreUtils from CPAN.

Then in your code:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);

My usual way of doing this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you use a hash and add the items to it, you also have the advantage of knowing how many times each item appears in the list.
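
A small sketch of what that count makes possible, using made-up sample data: listing each item with its frequency and picking out the items that occur more than once. The variable @repeated is introduced here for illustration only.

```perl
use strict;
use warnings;

my @myarray = qw(one two three two three three);

# Count occurrences while collecting the unique items
my %unique;
$unique{$_}++ for @myarray;

# Items that appeared more than once, sorted for stable output
my @repeated = sort { $a cmp $b } grep { $unique{$_} > 1 } keys %unique;

for my $item (sort { $a cmp $b } keys %unique) {
    print "$item: $unique{$item}\n";
}
print "repeated: @repeated\n";
```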

It can be done with a simple Perl one-liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

The PFM block does this:

The data in @in is fed into map. map builds an anonymous hash. The keys are extracted from the hash and fed into @out.
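
The same three steps can be written out one at a time. This is only an unrolled sketch of the one-liner; the intermediate names @pairs, %lookup and @sorted are not in the original.

```perl
use strict;
use warnings;

my @in = qw(1 3 4 6 2 4 3 2 6 3 2 3 4 4 3 2 5 5 32 3);

# Step 1: map turns every element into a (key, value) pair
my @pairs = map { $_ => 1 } @in;

# Step 2: building a hash from the pairs collapses repeated keys
my %lookup = @pairs;

# Step 3: the keys of the hash are the unique elements, in no particular order
my @out = keys %lookup;

# Sort numerically to get a stable, readable result
my @sorted = sort { $a <=> $b } @out;
print "@sorted\n";   # prints: 1 2 3 4 5 6 32
```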

The variable @array is the list with duplicate elements:

my %seen = ();
my @unique = grep { ! $seen{$_}++ } @array;

That last one was pretty good. I would just tweak it a bit:

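my @arr;        # input list, possibly containing duplicates
my @uniqarr;    # result, in first-seen order

foreach my $var ( @arr ){
  # compare with eq: /$var/ would also match substrings and regex metacharacters
  if ( ! grep { $_ eq $var } @uniqarr ){
     push( @uniqarr, $var );
  }
}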

I think this is probably the most readable way to do it.
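
One caveat with this style of membership test: it needs an exact string comparison, because a pattern match treats the value as a regular expression and also matches substrings. The sketch below, with made-up sample data, runs both variants side by side.

```perl
use strict;
use warnings;

my @arr = qw(money one oneXtwo one.two);

# Pattern match: /$var/ treats $var as a regex and matches substrings
my @by_regex;
for my $var (@arr) {
    push @by_regex, $var unless grep { /$var/ } @by_regex;
}

# Exact comparison: only identical strings count as duplicates
my @by_eq;
for my $var (@arr) {
    push @by_eq, $var unless grep { $_ eq $var } @by_eq;
}

print "regex: @by_regex\n";   # prints: regex: money oneXtwo
print "eq:    @by_eq\n";      # prints: eq:    money one oneXtwo one.two
```

With the pattern match, "one" is dropped because it occurs inside "money", and "one.two" is dropped because the dot matches the X in "oneXtwo". Either way, the nested grep rescans the result list for every element, so for long lists the hash-based versions are likely to be faster.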

Method 1: Use a hash

Logic: A hash can only have unique keys, so iterate over the array and assign any value to each element, using the element itself as the hash key. The keys of the hash are then your unique array.

# keys needs a real hash, so dereference the anonymous hash explicitly
my @unique = keys %{ +{ map { $_ => 1 } @array } };
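
One thing to keep in mind with this method is that hash keys are strings, so values that are numerically equal but spelled differently stay distinct. A minimal sketch with made-up data, assuming you want numeric equality in that case:

```perl
use strict;
use warnings;

my @array = ("1", "1.0", "01", "2");

# As strings, "1", "1.0" and "01" are three different keys
my %as_string = map { $_ => 1 } @array;
my @unique_strings = sort { $a cmp $b } keys %as_string;

# Normalising to a number first ($_ + 0) makes them collapse into one key
my %as_number;
$as_number{ $_ + 0 } = 1 for @array;
my @unique_numbers = sort { $a <=> $b } keys %as_number;

print "strings: @unique_strings\n";   # prints: strings: 01 1 1.0 2
print "numbers: @unique_numbers\n";   # prints: numbers: 1 2
```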

Method 2: Extension of method 1 for reusability

It is better to write a subroutine if we are going to use this functionality several times in our code.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Method 3: Use the `List::MoreUtils` module

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);

The previous answers pretty much summarize the possible ways of accomplishing this task.

However, I suggest a modification for those who don't care about counting the duplicates, but do care about order.

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );
my %record;
print grep !$record{$_} && ++$record{$_}, @record;

Note that the previously suggested grep !$seen{$_}++ ... increments $seen{$_} before negating, so the increment occurs regardless of whether the element has already been %seen or not. The above, however, short-circuits when $record{$_} is true, leaving what has been heard once 'off the %record'.
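
The difference shows up in what the hash holds afterwards. In this sketch (the names @with_counts and @flag_only are made up for the comparison) both variants return the same list, but only the first one ends up with occurrence counts:

```perl
use strict;
use warnings;

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );

# Variant 1: the post-increment runs for every element, so %seen holds counts
my %seen;
my @with_counts = grep { !$seen{$_}++ } @record;

# Variant 2: && short-circuits once the value is true, so every value stays at 1
my %record;
my @flag_only = grep { !$record{$_} && ++$record{$_} } @record;

print "same result: ", ("@with_counts" eq "@flag_only" ? "yes" : "no"), "\n";
print "seen{right} = $seen{right}, record{right} = $record{right}\n";
```

This prints "same result: yes" followed by "seen{right} = 3, record{right} = 1".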

You could also go with this ridiculousness, which takes advantage of autovivification and the existence of hash keys:

...
grep !(exists $record{$_} || undef $record{$_}), @record;

That, however, might lead to some confusion.
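
To reduce that confusion, here is the same idea spelled out as an explicit loop. It is a sketch of the logic rather than the author's code; @deduped is a name introduced for illustration. The point is that exists is true for a key even when its value is undef.

```perl
use strict;
use warnings;

my @record = qw( yeah I mean uh right right uh yeah );
my %record;
my @deduped;

for my $word (@record) {
    # Seen before: the key already exists, even though its value is undef
    next if exists $record{$word};

    # First sighting: create the key with an undefined value and keep the word
    $record{$word} = undef;
    push @deduped, $word;
}

print "@deduped\n";   # prints: yeah I mean uh right
```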

And if you care about neither order nor duplicate counts, you could use another hack based on hash slices and the trick I just mentioned:

...
undef @record{@record};
keys %record; # your record, now probably scrambled but at least deduped

Try this. The uniq helper below works on unsorted input; sorting first only changes the order of the result.

use strict;

# Helper function to remove duplicates in a list.
sub uniq {
  my %seen;
  grep !$seen{$_}++, @_;
}

my @teststrings = ("one", "two", "three", "one");

my @filtered = uniq @teststrings;
print "uniq: @filtered\n";
my @sorted = sort @teststrings;
print "sort: @sorted\n";
my @sortedfiltered = uniq sort @teststrings;
print "uniq sort : @sortedfiltered\n";
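
A sorted list matters only for approaches that compare neighbouring elements, which is how the Unix uniq command behaves. A hash-based helper like the one above has no such requirement. For contrast, here is a sketch of the adjacent-only approach; uniq_adjacent is a hypothetical helper written for this comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: drops only consecutive repeats, like the Unix uniq command
sub uniq_adjacent {
    my @out;
    for my $item (@_) {
        push @out, $item unless @out && $out[-1] eq $item;
    }
    return @out;
}

my @teststrings = ("one", "two", "three", "one");

my @unsorted_result = uniq_adjacent(@teststrings);
my @sorted_result   = uniq_adjacent(sort @teststrings);

print "adjacent, unsorted: @unsorted_result\n";   # one two three one
print "adjacent, sorted:   @sorted_result\n";     # one three two
```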

Using the concept of unique hash keys:

my @array  = ("a","b","c","b","a","d","c","a","d");
my %hash   = map { $_ => 1 } @array;
my @unique = keys %hash;
print "@unique","\n";

Output (hash key order is not guaranteed, so yours may differ): a c b d
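
If a predictable order matters, two small variations help. This is a sketch on the same data; @unique_sorted and @unique_ordered are names introduced here.

```perl
use strict;
use warnings;

my @array = ("a","b","c","b","a","d","c","a","d");
my %hash  = map { $_ => 1 } @array;

# Sorting the keys gives the same output on every run
my @unique_sorted = sort { $a cmp $b } keys %hash;

# Filtering the original array keeps first-occurrence order instead
my %seen;
my @unique_ordered = grep { !$seen{$_}++ } @array;

print "@unique_sorted\n";    # prints: a b c d
print "@unique_ordered\n";   # prints: a b c d
```

For this input both happen to give the same list, because the first occurrences are already in alphabetical order.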

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow
