Perl: basic question, function functionality
Question
What does this function do?
sub MyDigit {
return <<END;
0030\t0039
END
}
Solution
That's called a "here-document", and is used for breaking strings up over multiple lines as an alternative to concatenation or list operations:
print "this is ",
"one line when printed, ",
"because print takes multiple ",
"arguments and prints them all!\n";
print "however, you can also " .
"concatenate strings together " .
"and print them all as one string.\n";
print <<DOC;
But if you have a lot of text to print,
you can use a "here document" and create
a literal string that runs until the
delimiter that was declared with <<.
DOC
print "..and now we're back to regular code.\n";
You can read more about here-documents in the manual: see perldoc perlop.
OTHER TIPS
You’ve all missed the point!
It’s defining a user-defined property for use in \p{MyDigit} and \P{MyDigit} using regular expressions.
It’s like these:
sub InKana {
return <<'END';
3040 309F
30A0 30FF
END
}
Alternatively, you could define it in terms of existing property names:
sub InKana {
return <<'END';
+utf8::InHiragana
+utf8::InKatakana
END
}
You can also do set subtraction using a "-" prefix. Suppose you only wanted the actual characters, not just the block ranges of characters. You could weed out all the undefined ones like this:
sub IsKana {
return <<'END';
+utf8::InHiragana
+utf8::InKatakana
-utf8::IsCn
END
}
You can also start with a complemented character set using the "!" prefix:
sub IsNotKana {
return <<'END';
!utf8::InHiragana
-utf8::InKatakana
+utf8::IsCn
END
}
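To see a property sub like this in action, here is a minimal sketch. One caveat: perlunicode says user-defined property subs must have names beginning with "In" or "Is", so the sketch uses a hypothetical name, IsMyDigit, rather than the question's MyDigit. The test strings are made up too.

```perl
use strict;
use warnings;

# Hypothetical property sub: the name starts with "Is", as perlunicode
# requires for user-defined properties. The range is U+0030..U+0039,
# i.e. the ASCII digits 0-9, same as in the question.
sub IsMyDigit {
    return <<END;
0030\t0039
END
}

for my $s ('42', '4a2') {
    my $verdict = $s =~ /\A\p{IsMyDigit}+\z/ ? "all digits" : "not all digits";
    print "$s: $verdict\n";
}
```

\P{IsMyDigit} is the complement, so /\P{IsMyDigit}/ matches any string containing at least one character outside that range.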
I figure I must be right, since I’m speaking ex camelis. :)
It uses something called a Here Document to return a string "0030\t0039"
It returns the string "0030\t0039\n", where \t is a tab and \n is a newline, added because the heredoc line itself ends in a newline.
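A small sketch of why the \t ends up as a tab, in case it helps: an unquoted terminator (<<END) interpolates like a double-quoted string, while a single-quoted terminator (<<'END') keeps the backslash and the t as two literal characters. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Unquoted terminator: behaves like a double-quoted string,
# so \t is turned into a single tab character.
my $interpolated = <<END;
0030\t0039
END

# Single-quoted terminator: no interpolation, so the backslash
# and the "t" stay as two separate characters.
my $literal = <<'END';
0030\t0039
END

printf "interpolated: %d characters\n", length $interpolated;   # 10
printf "literal:      %d characters\n", length $literal;        # 11
```

Both strings end with the newline that terminates the heredoc's body line.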
<<FOO
sometext
FOO
is a so-called heredoc, a convenient way to write multi-line strings (though here it is used with only one line).
You can help yourself by trying a simple experiment:
C:\Temp> cat t.pl
#!/usr/bin/perl
use strict; use warnings;
print MyDigit();
sub MyDigit {
return <<END;
0030\t0039
END
}
Output:
C:\Temp> t | xxd
0000000: 2020 2020 3030 3330 0930 3033 390d 0a      0030.0039..
Now, in your case, the END is not lined up at the beginning of the line, so you should have gotten the message:
Can't find string terminator "END" anywhere before EOF at …
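If you would rather keep the heredoc indented along with the surrounding code, one option is the indented form <<~, assuming Perl 5.26 or later. It strips the common leading whitespace from the body and lets the terminator be indented as well. A sketch, with a hypothetical sub name:

```perl
use strict;
use warnings;
use 5.026;   # <<~ (indented heredocs) needs Perl 5.26 or later

# Hypothetical variant of the question's sub, written with <<~ so the
# body and the END terminator can be indented with the code.
sub MyDigitIndented {
    return <<~END;
        0030\t0039
        END
}

print MyDigitIndented();
```

On older Perls the terminator has to sit at the very start of the line, as described above.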