How do I remove duplicate items from an array in Perl?

https://stackoverflow.com/questions/7651

08-06-2019

Question

I have an array in Perl:

my @my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

Solution

You can do something like this, as demonstrated in perlfaq4:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Outputs:

one two three

If you want to use a module, try the uniq function from List::MoreUtils.
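On newer Perls a uniq function is also exported by the core List::Util module (version 1.45 and later, bundled with Perl 5.26 and up), so a CPAN install may not be needed. A minimal sketch, assuming such a version is installed:

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # assumption: List::Util 1.45 or newer is available

my @my_array = ("one", "two", "three", "two", "three");

# uniq keeps the first occurrence of each value, in the original order
my @filtered = uniq(@my_array);

print "@filtered\n";   # one two three
```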

Other answers

Perl's documentation comes with a nice collection of FAQs. Your question is a frequently asked one:

% perldoc -q duplicate

The answer, copied and pasted from the output of the command above, appears below:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;

Install List::MoreUtils from CPAN.

Then in your code:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);

My usual way of doing this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you use a hash and add the items to it, you also get the bonus of knowing how many times each item appears in the list.
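As a small illustration of that bonus, the counts can be read straight back out of the hash. This is a sketch with made-up sample data; the keys are sorted only so the output order is repeatable:

```perl
use strict;
use warnings;

my @myarray = qw(one two three two three three);   # sample data (assumed)

my %unique;
$unique{$_}++ for @myarray;   # each value ends up as the number of occurrences

# hash order is arbitrary, so sort the keys before printing
for my $item (sort keys %unique) {
    print "$item: $unique{$item}\n";
}
```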

It can be done with a simple Perl one-liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

The PFM block does this:

The data in @in is fed into map. map builds an anonymous hash. The keys are extracted from the hash and fed into @out.
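The same three steps can be unrolled with a named hash instead of an anonymous one, which may be easier to follow. A sketch; the variable name %seen is chosen here for illustration:

```perl
use strict;
use warnings;

my @in = qw(1 3 4 6 2 4 3 2 6 3 2 3 4 4 3 2 5 5 32 3);   # sample data

# Step 1: map turns each element into a key/value pair (the value is a dummy 1)
my %seen = map { $_ => 1 } @in;

# Step 2: duplicate keys collapse, so keys() yields each value once
my @out = keys %seen;

# Step 3: hash order is arbitrary, so sort numerically before printing
print join(' ', sort { $a <=> $b } @out), "\n";   # 1 2 3 4 5 6 32
```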

The variable @array is the list with duplicate elements:

my %seen = ();
my @unique = grep { ! $seen{$_}++ } @array;

That last one was pretty good. I'd just tweak it a bit:

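my @arr;       # input list, possibly with duplicates
my @uniqarr;   # collects each value once, in first-seen order

foreach my $var ( @arr ){
  # compare with eq: /$var/ would match substrings and treat $var as a regex
  if ( ! grep( $_ eq $var, @uniqarr ) ){
     push( @uniqarr, $var );
  }
}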

I think this is the most readable way to do it.
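When testing membership with grep, an exact string comparison (eq) and a pattern match (/$var/) behave differently: the pattern match also hits on substrings and interprets $var as a regular expression, so distinct values can be mistaken for duplicates. A small sketch with made-up sample data comparing the two tests:

```perl
use strict;
use warnings;

my @arr = qw(tone one tone);   # sample data (assumed)

my (@by_regex, @by_eq);
foreach my $var (@arr) {
    # pattern match: a stored value that merely contains $var counts as a hit
    push @by_regex, $var unless grep { /$var/ } @by_regex;
    # string equality: only an exact match counts as a hit
    push @by_eq, $var unless grep { $_ eq $var } @by_eq;
}

print "regex: @by_regex\n";   # regex: tone
print "eq:    @by_eq\n";      # eq:    tone one
```

Either way, this loop rescans the result list for every element, so for long lists the hash-based versions are likely the better choice.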

Method 1: Use a hash

Logic: A hash can have only unique keys, so iterate over the array and assign any value to each element, using the element as the hash key. Then return the keys of the hash; they are your unique array.

my @unique = keys %{ +{ map { $_ => 1 } @array } };   # keys needs a hash, so dereference the anonymous hash

Method 2: Extend Method 1 for reuse

It is better to write a subroutine if we are going to use this functionality several times in our code.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Method 3: Use the `List::MoreUtils` module

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);

The previous answers pretty much summarize the possible ways of accomplishing this task.

However, I suggest a modification for those who don't care about counting the duplicates, but do care about order.

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );
my %record;
print grep !$record{$_} && ++$record{$_}, @record;

Note that the previously suggested grep !$seen{$_}++ ... applies the postfix increment to $seen{$_} on every element, so the increment happens regardless of whether the element is already in %seen. The version above, however, short-circuits when $record{$_} is true, so anything already heard once is left "off the %record" and its count stays at 1.
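The difference between the two forms shows up only in the hash contents, not in the filtered list. A sketch that runs both side by side on the same data; the names @with_count and @first_only are made up for illustration:

```perl
use strict;
use warnings;

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );

my (%seen, %record);
my @with_count = grep { !$seen{$_}++ } @record;                    # increments on every occurrence
my @first_only = grep { !$record{$_} && ++$record{$_} } @record;   # increments only the first time

print "@with_count\n";   # yeah I mean uh right so well maybe
print "@first_only\n";   # yeah I mean uh right so well maybe
print "seen{right} = $seen{right}, record{right} = $record{right}\n";   # 3 versus 1
```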

You could also go with this bit of silliness, which takes advantage of autovivification and the existence of hash keys:

...
grep !(exists $record{$_} || undef $record{$_}), @record;

That, however, might lead to some confusion.

And if you care about neither order nor duplicate count, you could use another hack with hash slices and the trick I just mentioned:

...
undef @record{@record};
keys %record; # your record, now probably scrambled but at least deduped
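A complete, runnable variant of the hash-slice idea might look like the sketch below. It uses the slice-assignment form shown in the perlfaq4 excerpt (@hash{ @array } = ()) and sorts the keys only so the output is repeatable:

```perl
use strict;
use warnings;

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );

my %record;
@record{@record} = ();            # hash slice: one key per distinct element, values undef

my @unique = sort keys %record;   # original order is lost, so sort for stable output
print "@unique\n";                # I maybe mean right so uh well yeah
```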

Try this. The hash-based uniq below works on unsorted input and keeps first-occurrence order; sorting first only changes the order of the result.

use strict;

# Helper function to remove duplicates in a list.
sub uniq {
  my %seen;
  grep !$seen{$_}++, @_;
}

my @teststrings = ("one", "two", "three", "one");

my @filtered = uniq @teststrings;
print "uniq: @filtered\n";
my @sorted = sort @teststrings;
print "sort: @sorted\n";
my @sortedfiltered = uniq sort @teststrings;
print "uniq sort : @sortedfiltered\n";

Using the concept of unique hash keys:

my @array  = ("a","b","c","b","a","d","c","a","d");
my %hash   = map { $_ => 1 } @array;
my @unique = keys %hash;
print "@unique","\n";

Output (hash key order is not guaranteed, so yours may differ): a c b d

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow