Clarification on chomp

https://stackoverflow.com/questions/7290445

perl
chomp

19-01-2021

Question

I'm on break from classes right now and decided to spend my time learning Perl. I'm working with Beginning Perl (http://www.perl.org/books/beginning-perl/) and I'm finishing up the exercises at the end of chapter three.

One of the exercises asked that I "Store your important phone numbers in a hash. Write a program to look up numbers by the person's name."

Anyway, I had come up with this:

#!/usr/bin/perl
use warnings;
use strict;

my %name_number=
(
Me => "XXX XXX XXXX",
Home => "YYY YYY YYYY",
Emergency => "ZZZ ZZZ ZZZZ",
Lookup => "411"
);

print "Enter the name of who you want to call (Me, Home, Emergency, Lookup)", "\n";
my $input = <STDIN>;
print "$input can be reached at $name_number{$input}\n";

And it just wouldn't work. I kept getting this error message:

Use of uninitialized value in concatenation (.) or string at hello.plx line 17, <STDIN> line 1.

I tried playing around with the code some more but each "solution" looked more complex than the "solution" that came before it. Finally, I decided to check the answers.

The only difference between my code and the answer was the presence of chomp ($input); after <STDIN>;.

Now, the author had used chomp in previous examples but didn't really cover what chomp was doing. So, I found this answer on www.perlmeme.org:

The chomp() function will remove (usually) any newline character from the end of a string. The reason we say usually is that it actually removes any character that matches the current value of $/ (the input record separator), and $/ defaults to a newline.

Anyway, my questions are:

1. What newlines are getting removed? Does Perl automatically append a "\n" to the input from <STDIN>? I'm just a little unclear because when I read "it actually removes any character that matches the current value of $/", I can't help but think "I don't remember putting a $/ anywhere in my code."
2. I'd like to develop best practices as soon as possible - is it best to always include chomp after <STDIN> or are there scenarios where it's unnecessary?

Solution

<STDIN> reads one line of input, up to and including the newline that ends it. That newline is there because you press return to enter the line, which you probably do.
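To see the trailing newline for yourself, here is a small sketch. It reads from an in-memory filehandle that stands in for STDIN; the string "Me\n" and the variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Assumption: an in-memory filehandle stands in for what a user would type
# at STDIN ("Me" followed by Enter).
my $typed = "Me\n";
open my $fh, '<', \$typed or die "Cannot open in-memory handle: $!";

my $line = <$fh>;                     # reads one line, newline included
my $raw_length = length $line;        # counts 'M', 'e' and the newline

chomp $line;                          # drop the trailing newline
my $chomped_length = length $line;    # counts only 'M' and 'e'

print "before chomp: $raw_length characters, after chomp: $chomped_length\n";
```

The same thing happens with a real <STDIN>; the in-memory handle is only there so the example runs without any typing.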

chomp removes the newline at the end of a string. $/ is a variable (as you found, defaulting to newline) that you probably don't have to worry about; it just tells perl what the 'input record separator' is, which, as far as I can tell, defines where a read from <FILEHANDLE> stops. You can pretty much forget about it for now, since it seems like an advanced topic. Just pretend chomp chomps off a trailing newline. Honestly, I've never even heard of $/ before.
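If you are curious how chomp and $/ fit together, the following is a minimal sketch. The strings and variable names are invented for the example, and changing $/ like this is not something the original program needs.

```perl
use warnings;
use strict;

# With the default $/ (a newline), chomp removes one trailing newline.
my $name = "Home\n";
my $removed = chomp $name;        # chomp returns the number of characters removed

# Inside this block $/ is temporarily ";" so chomp strips a trailing ";" instead.
my $record = "Home;";
my $removed_custom;
{
    local $/ = ";";
    $removed_custom = chomp $record;
}

print "name='$name' ($removed removed), record='$record' ($removed_custom removed)\n";
```

Using local keeps the change to $/ confined to the block, so the rest of the script still sees the default.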

As for your other question, it is generally cleaner to always chomp variables and add newlines as needed later, because you don't always know if a variable has a newline or not; by always chomping variables you always get the same behavior. There are scenarios where it is unnecessary, but if you're not sure it can't hurt to chomp it.

Hope this helps!

OTHER TIPS

OK, as for 1), perl doesn't add any \n to the input. You are the one who hits Enter when you finish typing the name. If you don't set $/ yourself, it defaults to \n (under UNIX, at least).

As for 2), chomp is needed whenever input comes from the user, or whenever you want to remove the line-ending character (when reading from a file, for example).
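For the file case, a typical loop might look like the sketch below. To keep it self-contained, an in-memory string plays the role of the file; that string and the variable names are assumptions, not part of the original question.

```perl
use warnings;
use strict;

# Assumption: this string stands in for the contents of a file of names.
my $file_contents = "Me\nHome\nEmergency\n";
open my $fh, '<', \$file_contents or die "Cannot open in-memory handle: $!";

my @names;
while (my $line = <$fh>) {
    chomp $line;              # strip the line ending before storing the name
    push @names, $line;
}
close $fh;

print join(", ", @names), "\n";
```

With a real file you would pass its path to open instead of a reference to a string.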

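Finally, the error you're getting does not come from the _ character in the variable name: perl interpolates $name_number{$input} inside double quotes without trouble. The warning appears because the lookup returns undef while $input still carries its trailing newline. If you find it clearer, you can also write the string as follows, which should behave the same way:

print "$input can be reached at ${name_number{$input}}\n";

(note the {} around the last variable).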

<STDIN> is shortcut notation for readline(*STDIN). In scalar context, readline() reads from the filehandle until it encounters the contents of $/ (aka $INPUT_RECORD_SEPARATOR) and returns everything it has read, including the contents of $/. chomp() removes one trailing occurrence of the contents of $/, if present.
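A rough sketch of that relationship, assuming a made-up record format where ";" separates the entries (the data and variable names are hypothetical):

```perl
use warnings;
use strict;

# Assumption: semicolon-separated records in an in-memory "file".
my $data = "Me;Home;Emergency";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my (@raw, @records);
{
    local $/ = ";";          # readline now stops after each ";"
    @raw     = <$fh>;        # list context: every record, separator included
    @records = @raw;
    chomp @records;          # removes one trailing ";" from each element, if present
}
close $fh;

print "raw: @raw\n";
print "chomped: @records\n";
```

The last record has no trailing ";", so chomp leaves it unchanged.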

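The contents are often called a newline character, but $/ may hold more than one character. In Perl, the default $/ is "\n" on every platform; on Windows, the CR-LF sequence in a text file is normally translated to "\n" as it is read, so chomp still does the right thing there.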
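One case to watch for is input that still carries a CR-LF ending when it reaches your code, for example a file written on Windows and read on Linux. A sketch of one common workaround, using a substitution instead of chomp (the sample strings are made up):

```perl
use warnings;
use strict;

my $unix_line    = "Me\n";
my $windows_line = "Me\r\n";

# chomp with the default $/ removes only the "\n", so the "\r" stays behind.
my $chomped = $windows_line;
chomp $chomped;

# A substitution that accepts either line ending.
(my $clean_unix    = $unix_line)    =~ s/\r?\n\z//;
(my $clean_windows = $windows_line) =~ s/\r?\n\z//;

printf "chomp left %d characters; the substitution left %d\n",
    length($chomped), length($clean_windows);
```

Whether you need this depends on where your input comes from; for text typed at a terminal, plain chomp is usually enough.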

See:

perldoc -f readline
perldoc -f chomp
perldoc perlvar and search for /\$INPUT_RECORD_SEPARATOR/

I think best practice here is to write:

chomp(my $input = <STDIN>);
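Dropped into the program from the question, that idiom might look like the sketch below. The in-memory handle $in is an assumption standing in for STDIN so the example runs without typing anything, and the exists check is an addition that is not in the original code.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my %name_number = (
    Me        => "XXX XXX XXXX",
    Home      => "YYY YYY YYYY",
    Emergency => "ZZZ ZZZ ZZZZ",
    Lookup    => "411",
);

# Assumption: simulated user input; use <STDIN> instead of <$in> in a real script.
my $typed = "Home\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

chomp(my $input = <$in>);    # read one line and strip its newline in one step

# Check the key first so an unknown name gives a message rather than a warning.
my $message = exists $name_number{$input}
    ? "$input can be reached at $name_number{$input}"
    : "No number stored for '$input'";
print "$message\n";
```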

Here is a quick example of how the chomp function works (the meaning of $/ is explained there), removing just one trailing newline (if any):

chomp (my $input = "Me\n"); # OK
chomp ($input = "Me"); # OK (nothing done)
chomp ($input = "Me\n\n"); # $input now is "Me\n";
chomp ($input); # finally "Me"

print "$input can be reached at $name_number{$input}\n";

BTW: The funny thing is that I am learning Perl too, and I reached hashes five minutes ago.

Though it may be obvious, it's still worth mentioning why the chomp is needed here.

The hash created contains 4 lookup keys: "Me", "Home", "Emergency" and "Lookup"

When $input is specified from <STDIN>, it'll contain "Me\n", "Me\r\n" or some other line-ending variant depending on what operating system is being used.

The uninitialized value error comes about because the "Me\n" key does not exist in the hash. And this is why the chomp is needed:

my $input = <STDIN>; # "Me\n" --> key does not exist, $name_number{$input} not defined
chomp $input;        # "Me"   --> key exists, $name_number{$input} defined
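You can check this directly with exists. The snippet below is only an illustration: it uses a shortened copy of the hash and a hard-coded string in place of real input.

```perl
use warnings;
use strict;

my %name_number = (Me => "XXX XXX XXXX", Home => "YYY YYY YYYY");

my $input = "Me\n";                                        # what <STDIN> would hand back
my $found_before = exists $name_number{$input} ? 1 : 0;    # "Me\n" is not a key

chomp $input;
my $found_after  = exists $name_number{$input} ? 1 : 0;    # "Me" is a key

print "before chomp: $found_before, after chomp: $found_after\n";
```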

Licensed under: CC-BY-SA with attribution