Question

I was in a code review this morning and came across a bit of code that was wrong, but I couldn't tell why.

$line =~ /^[1-C]/;

This line was supposed to match a hex character between 1 and C, but I assume it does not do that. The question is not what it should match, but what it actually matches. Can I print out all the characters in a character class? Something like the following?

say join(', ', [1-C]);

Alas,

# Examples:
say join(', ', 1..9);
say join(', ', 'A'..'C');
say join(', ', 1..'C');

# Output
Argument "C" isn't numeric in range (or flop) at X:\developers\PERL\Test.pl line 33.

1, 2, 3, 4, 5, 6, 7, 8, 9
A, B, C

Solution

It matches every code point from U+0031 ("1") to U+0043 ("C").

The simple answer is to use

map chr, ord("1")..ord("C")

instead of

"1".."C"

as you can see in the following demonstration:

$ perl -Mcharnames=:full -E'
   say sprintf " %s  U+%05X %s", chr($_), $_, charnames::viacode($_)
      for ord("1")..ord("C");
'
 1  U+00031 DIGIT ONE
 2  U+00032 DIGIT TWO
 3  U+00033 DIGIT THREE
 4  U+00034 DIGIT FOUR
 5  U+00035 DIGIT FIVE
 6  U+00036 DIGIT SIX
 7  U+00037 DIGIT SEVEN
 8  U+00038 DIGIT EIGHT
 9  U+00039 DIGIT NINE
 :  U+0003A COLON
 ;  U+0003B SEMICOLON
 <  U+0003C LESS-THAN SIGN
 =  U+0003D EQUALS SIGN
 >  U+0003E GREATER-THAN SIGN
 ?  U+0003F QUESTION MARK
 @  U+00040 COMMERCIAL AT
 A  U+00041 LATIN CAPITAL LETTER A
 B  U+00042 LATIN CAPITAL LETTER B
 C  U+00043 LATIN CAPITAL LETTER C

If you have Unicode::Tussle installed, you can get the same output from the following shell command:

unichars -au '[1-C]'
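If you want to answer the original question for an arbitrary character class, one option is to test every code point in a range against the pattern and keep the ones that match. The following is only a sketch: chars_matching is a hypothetical helper name, and the default range of U+0000 to U+007F is an assumption made to keep the output short.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: return every character in the given code point
# range (ASCII by default) that matches the supplied pattern.
sub chars_matching {
    my ($re, $from, $to) = @_;
    $from //= 0x00;   # assumed default: start of ASCII
    $to   //= 0x7F;   # assumed default: end of ASCII
    return grep { $_ =~ $re } map { chr($_) } $from .. $to;
}

# Anchor with \A and \z so that exactly one character is tested.
my @matched = chars_matching(qr/\A[1-C]\z/);
say join(', ', @matched);
# prints: 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @, A, B, C
```

Pass a larger upper bound (for example 0x10FFFF) as the third argument if the class might reach beyond ASCII; the scan is then slower but still simple.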

You might be interested in wasting time browsing the Unicode code charts. (This particular range is covered by "Basic Latin (ASCII)".)

Other tips

This is a simple program to test the range of that regex:

use strict;
use warnings;
use Test::More qw(no_plan);

for(my $i=ord('1'); $i<=ord('C'); $i++ ) {
   my $char = chr($i);
   ok $char =~ /^[1-C]/, "match: $char";
}

It generates this result:

ok 1 - match: 1
ok 2 - match: 2
ok 3 - match: 3
ok 4 - match: 4
ok 5 - match: 5
ok 6 - match: 6
ok 7 - match: 7
ok 8 - match: 8
ok 9 - match: 9
ok 10 - match: :
ok 11 - match: ;
ok 12 - match: <
ok 13 - match: =
ok 14 - match: >
ok 15 - match: ?
ok 16 - match: @
ok 17 - match: A
ok 18 - match: B
ok 19 - match: C
1..19

[1-9A-C] is the class that matches a hex digit between 1 and C.
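A quick way to check the class [1-9A-C] against [1-C] is to run both over a few sample characters. This is a sketch under assumptions: is_hex_1_to_C is a hypothetical helper name, and the sample list is illustrative only. Note that the class as written accepts uppercase letters only; whether lowercase a to c should also be accepted depends on the input and is not stated in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for a single character in 1-9 or A-C.
sub is_hex_1_to_C {
    my ($char) = @_;
    return $char =~ /\A[1-9A-C]\z/ ? 1 : 0;
}

# Illustrative inputs: valid digits, the punctuation that sits between
# "9" and "A" in the code point order, and values outside the range.
for my $char ('1', '9', ':', '@', 'A', 'C', 'D', '0') {
    my $loose  = $char =~ /\A[1-C]\z/ ? 'match' : 'no match';
    my $strict = is_hex_1_to_C($char) ? 'match' : 'no match';
    printf "%s  [1-C]: %-8s  [1-9A-C]: %s\n", $char, $loose, $strict;
}
```

Only ':' and '@' should differ between the two columns in this sample, since they lie inside the 1 to C code point span but are not hex digits.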

[X-Y] matches every character whose code point lies between those of the two endpoint characters X and Y in the Unicode table, inclusive.
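Because a range is defined by code point order, the endpoints have to be ascending. The sketch below prints the code points of the two endpoints and then checks whether a class compiles; range_compiles is a hypothetical helper name, and the pattern is built from a string so that a bad range fails at run time inside eval rather than stopping the script at compile time.

```perl
use strict;
use warnings;
use feature 'say';

# Show where the two endpoints of [1-C] sit in the code point order.
printf "%s = U+%04X, %s = U+%04X\n", '1', ord('1'), 'C', ord('C');
# prints: 1 = U+0031, C = U+0043

# Hypothetical helper: true if the given character class compiles.
sub range_compiles {
    my ($class) = @_;
    my $re = eval { qr/$class/ };   # dies on an invalid range
    return defined $re ? 1 : 0;
}

say range_compiles('[1-C]') ? 'ok' : 'invalid';   # ascending endpoints
say range_compiles('[C-1]') ? 'ok' : 'invalid';   # descending endpoints
```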

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow