How do I remove duplicate items from an array in Perl?

https://stackoverflow.com/questions/7651

08-06-2019

Question

I have an array in Perl:

my @my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

Solution

You can do something like this, as demonstrated in perlfaq4:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Output:

one two three

If you want to use a module, try the uniq function from List::MoreUtils.
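As an aside that the original answer does not mention: to the best of my knowledge, the core List::Util module also exports a uniq function from version 1.45 onward, so a separate CPAN install may not be needed on a recent Perl. A minimal sketch, assuming such a version is available:

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # assumption: List::Util 1.45 or later is installed

my @my_array = ("one", "two", "three", "two", "three");

# uniq keeps the first occurrence of each value, in the original order
my @filtered = uniq(@my_array);

print "@filtered\n";   # one two three
```

If the version check fails at compile time, fall back to List::MoreUtils or to the hand-written uniq subroutine.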

Other tips

The Perl documentation comes with a nice collection of FAQs. Your question is a frequently asked one:

% perldoc -q duplicate

Here is the answer, copied and pasted from the output of the command above:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;

Install List::MoreUtils from CPAN.

Then in your code:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);

My usual way of doing this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you use a hash and add the items to it, you also get the bonus of knowing how many times each item appears in the list.
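The counting side effect can be sketched as follows. The sample data is made up for illustration, and the keys are sorted only to make the report order predictable:

```perl
use strict;
use warnings;

my @myarray = qw(one two three two three three);

# Count occurrences; each distinct item becomes exactly one hash key
my %unique;
$unique{$_}++ for @myarray;

# keys returns no particular order, so sort for a stable report
for my $item (sort keys %unique) {
    print "$item appears $unique{$item} time(s)\n";
}
```

This prints one line per distinct item, with "three" reported 3 times, "two" 2 times, and "one" once.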

It can be done with a simple Perl one-liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

The PFM block does this:

The data in @in is fed into map. map builds an anonymous hash. The keys are extracted from the hash and fed into @out.
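To make those three steps visible, here is the same one-liner unrolled into separate statements. The intermediate names @pairs and $hashref are introduced only for this illustration:

```perl
use strict;
use warnings;

my @in = qw(1 3 4 6 2 4 3 2 6 3 2 3 4 4 3 2 5 5 32 3);

# Step 1: map turns each element into a key/value pair (the value is a dummy 1)
my @pairs = map { $_ => 1 } @in;

# Step 2: { ... } builds an anonymous hash; duplicate keys collapse into one
my $hashref = { @pairs };

# Step 3: %{ ... } dereferences the hash and keys extracts the unique elements
my @out = keys %{$hashref};

# keys returns no particular order, so sort numerically before printing
print join(' ', sort { $a <=> $b } @out), "\n";   # 1 2 3 4 5 6 32
```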

The variable @array is the list with duplicate elements:

my %seen = ();
my @unique = grep { ! $seen{$_}++ } @array;

That last one was pretty good. I'd just tweak it a bit:

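my @arr;        # input list, possibly containing duplicates
my @uniqarr;    # result: first occurrence of each element

foreach my $var ( @arr ){
  # compare with eq: a pattern match (/$var/) would also hit substrings
  if ( ! grep { $_ eq $var } @uniqarr ){
     push( @uniqarr, $var );
  }
}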

I think this is probably the most readable way to do it.
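One caveat when testing membership with grep: a pattern match on the element succeeds on substrings and treats regex metacharacters specially, whereas eq compares whole strings. A minimal sketch of the difference, using made-up sample data:

```perl
use strict;
use warnings;

my @arr = qw(foobar foo bar foo);

# Membership test by exact string equality
my @by_eq;
for my $var (@arr) {
    push @by_eq, $var unless grep { $_ eq $var } @by_eq;
}

# Membership test by pattern match: "foo" and "bar" both match inside "foobar"
my @by_regex;
for my $var (@arr) {
    push @by_regex, $var unless grep { /$var/ } @by_regex;
}

print "eq:    @by_eq\n";      # eq:    foobar foo bar
print "regex: @by_regex\n";   # regex: foobar
```

Either loop rescans the result list for every element, so the running time grows quadratically with the list length; the hash-based versions avoid that.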

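Method 1: Use a hash

Logic: A hash can only have unique keys, so iterate over the array and store each element as a key of the hash, with any value you like. Then return the keys of the hash; that is your unique array.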

my @unique = keys %{ +{ map { $_ => 1 } @array } };

Method 2: Extension of method 1 for reusability

If you need this functionality several times in your code, it is better to write a subroutine.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Method 3: Use the module `List::MoreUtils`

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);

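The previous answers pretty much summarize the possible ways of accomplishing this task.

However, I suggest a modification for those who don't care about counting the duplicates, but do care about order.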

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );
my %record;
print grep !$record{$_} && ++$record{$_}, @record;

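Note that the previously suggested grep !$seen{$_}++ ... increments $seen{$_} on every element: the post-increment runs whether or not the element is already in %seen, and the negation applies to the value from before the increment. The version above, however, short-circuits when $record{$_} is true, so an element that has already been seen is left untouched in %record.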
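The difference shows up in what the hash holds afterwards. A small sketch comparing the two variants side by side (the names %flag, @with_counts and @flags_only are introduced only for this illustration):

```perl
use strict;
use warnings;

my @record = qw(yeah I mean uh right right uh yeah);

# Variant 1: the post-increment runs on every element, so %seen ends up with counts
my %seen;
my @with_counts = grep { !$seen{$_}++ } @record;

# Variant 2: && short-circuits once the flag is set, so values never exceed 1
my %flag;
my @flags_only = grep { !$flag{$_} && ++$flag{$_} } @record;

print "@with_counts\n";                 # yeah I mean uh right
print "@flags_only\n";                  # yeah I mean uh right
print "seen{right} = $seen{right}\n";   # seen{right} = 2
print "flag{right} = $flag{right}\n";   # flag{right} = 1
```

Both variants return the same ordered, de-duplicated list; they differ only in the bookkeeping left behind in the hash.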

You could also go with this bit of silliness, which takes advantage of autovivification and the existence of hash keys:

...
grep !(exists $record{$_} || undef $record{$_}), @record;

That, however, might lead to some confusion.

And if you care about neither order nor duplicate count, another hack is possible using hash slices and the trick just mentioned:

...
undef @record{@record};
keys %record; # your record, now probably scrambled but at least deduped

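Try this. Note that the hash-based uniq function below does not need a sorted list to work correctly; sorting only changes the order of the result, as the script shows.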

use strict;

# Helper function to remove duplicates in a list.
sub uniq {
  my %seen;
  grep !$seen{$_}++, @_;
}

my @teststrings = ("one", "two", "three", "one");

my @filtered = uniq @teststrings;
print "uniq: @filtered\n";
my @sorted = sort @teststrings;
print "sort: @sorted\n";
my @sortedfiltered = uniq sort @teststrings;
print "uniq sort : @sortedfiltered\n";

Using the concept of unique hash keys:

my @array  = ("a","b","c","b","a","d","c","a","d");
my %hash   = map { $_ => 1 } @array;
my @unique = keys %hash;
print "@unique","\n";

Output (hash key order is not guaranteed, so yours may differ): a c b d

License: CC-BY-SA with attribution

Not affiliated with StackOverflow
