How do I compare two strings in Perl?
19-09-2019
Question
How do I compare two strings in Perl?
I am learning Perl. I had this basic question, looked it up here on StackOverflow, and found no good answer, so I thought I would ask.
Solution
See perldoc perlop. Use lt, gt, eq, ne, and cmp as appropriate for string comparisons:
Binary eq returns true if the left argument is stringwise equal to the right argument.
Binary ne returns true if the left argument is stringwise not equal to the right argument.
Binary cmp returns -1, 0, or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument.
Binary ~~ does a smartmatch between its arguments. ...
lt, le, ge, gt and cmp use the collation (sort) order specified by the current locale if a legacy use locale (but not use locale ':not_characters') is in effect. See perllocale. Do not mix these with Unicode, only with legacy binary encodings. The standard Unicode::Collate and Unicode::Collate::Locale modules offer much more powerful solutions to collation issues.
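To make the collation point concrete, here is a minimal sketch contrasting the built-in cmp with Unicode::Collate. It assumes no use locale is in effect and that the Unicode::Collate module bundled with Perl is available with its default settings; the variable names and sample words are only illustrative.

```perl
use strict;
use warnings;
use Unicode::Collate;   # ships with Perl

my $collator = Unicode::Collate->new;

# Without "use locale", cmp compares by code point, so uppercase 'B' (0x42)
# sorts before lowercase 'a' (0x61).
my $codepoint_order = 'a' cmp 'B';               # 1

# The collator follows the Unicode Collation Algorithm, where the letter
# itself matters before its case.
my $collated_order = $collator->cmp('a', 'B');   # -1

print "cmp:      $codepoint_order\n";
print "collator: $collated_order\n";

# The same difference shows up when sorting a list.
my @words        = qw(banana Cherry apple);
my @by_codepoint = sort { $a cmp $b } @words;
my @by_collation = $collator->sort(@words);

print "@by_codepoint\n";
print "@by_collation\n";
```

For plain ASCII data in a single case the two orders usually agree; the collator is worth reaching for when the text mixes case or contains accented characters.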
OTHER TIPS
cmp  Compare
    'a' cmp 'b' # -1
    'b' cmp 'a' # 1
    'a' cmp 'a' # 0
eq  Equal to
    'a' eq 'b' # 0
    'b' eq 'a' # 0
    'a' eq 'a' # 1
ne  Not-Equal to
    'a' ne 'b' # 1
    'b' ne 'a' # 1
    'a' ne 'a' # 0
lt  Less than
    'a' lt 'b' # 1
    'b' lt 'a' # 0
    'a' lt 'a' # 0
le  Less than or equal to
    'a' le 'b' # 1
    'b' le 'a' # 0
    'a' le 'a' # 1
gt  Greater than
    'a' gt 'b' # 0
    'b' gt 'a' # 1
    'a' gt 'a' # 0
ge  Greater than or equal to
    'a' ge 'b' # 0
    'b' ge 'a' # 1
    'a' ge 'a' # 1
See perldoc perlop for more information.
(I'm simplifying this a little bit: all of these operators except cmp return, for false, a value that is both an empty string and a numeric zero rather than a plain 0, and, for true, a value that is both the string '1' and the numeric value 1. These are the same values you will always get from boolean operators in Perl. You should really only be using the return values for boolean or numeric operations, in which case the difference doesn't really matter.)
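The true and false values described above can be inspected directly. This is only a small sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $true  = ('a' eq 'a');
my $false = ('a' eq 'b');

print "true as string:  '$true'\n";            # '1'
print "false as string: '$false'\n";           # '' (an empty string, not '0')

# The false value is also numerically zero, and using it as a number
# does not trigger an "isn't numeric" warning.
print "false as number: ", $false + 0, "\n";   # 0

print "false is still defined\n" if defined $false;
```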
In addition to Sinan Ünür's comprehensive listing of string comparison operators, Perl 5.10 adds the smart match operator.
The smart match operator compares two items based on their type. See the chart below for the 5.10 behavior (I believe this behavior is changing slightly in 5.10.1):
perldoc perlsyn, "Smart matching in detail":
The behaviour of a smart match depends on what type of thing its arguments are. It is always commutative, i.e. $a ~~ $b behaves the same as $b ~~ $a. The behaviour is determined by the following table: the first row that applies, in either order, determines the match behaviour.
    $a      $b        Type of Match Implied    Matching Code
    ======  =====     =====================    =============
    (overloading trumps everything)

    Code[+] Code[+]   referential equality     $a == $b
    Any     Code[+]   scalar sub truth         $b->($a)

    Hash    Hash      hash keys identical      [sort keys %$a]~~[sort keys %$b]
    Hash    Array     hash slice existence     grep {exists $a->{$_}} @$b
    Hash    Regex     hash key grep            grep /$b/, keys %$a
    Hash    Any       hash entry existence     exists $a->{$b}

    Array   Array     arrays are identical[*]
    Array   Regex     array grep               grep /$b/, @$a
    Array   Num       array contains number    grep $_ == $b, @$a
    Array   Any       array contains string    grep $_ eq $b, @$a

    Any     undef     undefined                !defined $a
    Any     Regex     pattern match            $a =~ /$b/
    Code()  Code()    results are equal        $a->() eq $b->()
    Any     Code()    simple closure truth     $b->() # ignoring $a
    Num     numish[!] numeric equality         $a == $b
    Any     Str       string equality          $a eq $b
    Any     Num       numeric equality         $a == $b

    Any     Any       string equality          $a eq $b

    + - this must be a code reference whose prototype (if present) is not ""
        (subs with a "" prototype are dealt with by the 'Code()' entry lower down)
    * - that is, each element matches the element of same index in the other
        array. If a circular reference is found, we fall back to referential
        equality.
    ! - either a real number, or a string that looks like a number

The "matching code" doesn't represent the real matching code, of course: it's just there to explain the intended meaning. Unlike grep, the smart match operator will short-circuit whenever it can.
Custom matching via overloading
You can change the way that an object is matched by overloading the ~~ operator. This trumps the usual smart match semantics. See overload.
print "Matched!\n" if ($str1 eq $str2)
Perl has separate string comparison and numeric comparison operators to help with the loose typing in the language. You should read perlop for all the different operators.
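One place where the two operator families visibly diverge is sorting. The following sketch, with made-up sample values, sorts the same list once with the string operator cmp and once with its numeric counterpart <=>:

```perl
use strict;
use warnings;

my @values = (10, 9, 2, 33);

# cmp compares character by character, so "10" comes before "2".
my @as_strings = sort { $a cmp $b } @values;   # 10 2 33 9

# <=> compares numeric values.
my @as_numbers = sort { $a <=> $b } @values;   # 2 9 10 33

print "@as_strings\n";
print "@as_numbers\n";
```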
The obvious subtext of this question is: why can't you just use == to check if two strings are the same?
Perl doesn't have distinct data types for text vs. numbers. They are both represented by the type "scalar". Put another way, strings are numbers if you use them as such.
if ( 4 == "4" ) { print "true"; } else { print "false"; }
true
if ( "4" == "4.0" ) { print "true"; } else { print "false"; }
true
print "3"+4
7
Since text and numbers aren't differentiated by the language, we can't simply overload the == operator to do the right thing for both cases. Therefore, Perl provides eq to compare values as text:
if ( "4" eq "4.0" ) { print "true"; } else { print "false"; }
false
if ( "4.0" eq "4.0" ) { print "true"; } else { print "false"; }
true
In short:
- Perl doesn't have a data-type exclusively for text strings
- use == or != to compare two operands as numbers
- use eq or ne to compare two operands as text
There are many other functions and operators that can be used to compare scalar values, but knowing the distinction between these two forms is an important first step.
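As a rough illustration of why the distinction matters, the sketch below applies both operators to two strings that do not look like numbers. The variable names and sample strings are hypothetical. Under use warnings Perl would normally report that the arguments are not numeric; the warning is silenced here only to keep the output clean.

```perl
use strict;
use warnings;

my ($left, $right) = ('apple', 'orange');

my $numeric_equal;
{
    # Both strings convert to the number 0, so == reports them as equal.
    no warnings 'numeric';
    $numeric_equal = ($left == $right) ? 1 : 0;
}

# eq compares the text itself.
my $string_equal = ($left eq $right) ? 1 : 0;

print "==: $numeric_equal\n";   # 1
print "eq: $string_equal\n";    # 0
```

Leaving warnings enabled in real code is a cheap way to catch an accidental == on text.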
And if you'd like to extract the differences between the two strings, you can use String::Diff.