If your columns are separated by multiple spaces, Text::CSV is useless. Your code contains a lot of repetition that tries to work around Text::CSV's limitations.
Also, your code has bad style, contains multiple syntax errors and typos, and uses confusing variable names.
So You Want To Parse A Header.
We need a definition of the header line for our code. Let's take “the first comment line that contains non-space characters”. It may not be preceded by non-comment lines.
use strict; use warnings; use autodie;
open my $fh, '<:encoding(UTF-8)', "filename.tsv"; # error handling by autodie
my @headers;
while (<$fh>) {
    # no need to copy to a $line variable, the $_ is just fine.
    chomp; # remove line ending
    s/\A#\s*// or die "No header line found"; # remove comment char, or die
    /\S/ or next; # skip if there is nothing here
    @headers = split; # split the header names.
    # A bare `split` means `split ' ', $_`: it splits on runs of
    # whitespace and ignores leading whitespace.
    last; # break out of the loop: the header was found
}
The \s character class matches space characters (spaces, tabs, newlines, etc.). The \S is the inverse and matches all non-space characters.
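A small sketch of how whitespace splitting behaves may help here. The sample line is made up for illustration. A bare split is the same as split ' ', $_, which differs from an explicit /\s+/ pattern only when the line starts with whitespace:

```perl
use strict;
use warnings;

# Hypothetical input line with leading spaces and a mix of spaces and a tab.
my $line = "  name   address\ttype  scale";

# split ' ' (what a bare split does) skips leading whitespace first.
my @bare = split ' ', $line;

# An explicit /\s+/ pattern keeps an empty leading field instead.
my @regex = split /\s+/, $line;

print scalar(@bare),  " fields with split ' '\n";
print scalar(@regex), " fields with split /\\s+/\n";
```

In the header loop the leading whitespace is already removed by the substitution, so the difference only matters for data lines that are indented.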
The Rest
Now we have our header names, and can proceed to normal parsing:
my @records;
while (<$fh>) {
    chomp;
    next if /\A#/; # skip comments
    my @fields = split;
    my %hash;
    @hash{@headers} = @fields; # use hash slice to assign fields to headers
    push @records, \%hash; # add this hashref to our records
}
Voilà.
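The hash slice is the step that does the real work, so here is a minimal sketch of it in isolation. The header order and the short row are assumptions made up for this example, not taken from your file. It shows what happens when a row has fewer fields than there are headers:

```perl
use strict;
use warnings;

# Hypothetical header names and a row that lacks its last field.
my @headers = qw(name address type scale);
my @fields  = qw(test.data.one 0x1234fde0 float);

my %hash;
# Keys come from @headers, values from @fields, paired by position.
@hash{@headers} = @fields;

print "name: $hash{name}\n";
# The key for the missing field still exists, but its value is undef.
print "scale key exists\n" if exists $hash{scale};
print "scale is undef\n"   if !defined $hash{scale};
```

Extra fields beyond the number of headers are silently dropped, so you may want to compare scalar(@fields) with scalar(@headers) if malformed rows are a concern.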
The Result
This code produces the following data structure from your example data:
@records = (
    {
        address => "0x1234fde0",
        name => "test.data.one",
        scale => 32768,
        type => "float",
    },
    {
        address => "0x1234fde4",
        name => "test.data.two",
        scale => 32768,
        type => "float",
    },
    {
        address => "0x1234fde8",
        name => "test.data.the",
        scale => 32768,
        type => "float",
    },
    {
        address => "0x1234fdec",
        name => "test.data.for",
        scale => 32768,
        type => "float",
    },
    {
        address => "0x1234fdf0",
        name => "test.data.fiv",
        scale => 32768,
        type => "float",
    },
);
This data structure could be used like
use feature 'say'; # say is not enabled by default
for my $record (@records) {
    say $record->{name};
}
or
for my $i (0 .. $#records) {
    say "$i: $records[$i]{name}";
}
Criticism Of Your Code
- You declare all your variables at the top of your script, effectively making them global variables. Don't. Create your variables in the smallest scope possible. My code uses just three variables in the outer scope: $fh, @headers and @records.
- This line
  my $csv=Text::CSV({sep_char = ","})
  doesn't work as expected.
  - Text::CSV is not a function; it is the name of a module. You meant Text::CSV->new(...).
  - The options should be a hashref, but sep_char = "," tries to assign something to sep_char. Sadly, this could be valid syntax. But you actually meant to specify a key-value relationship. Use the => operator instead (called fat comma or hash rocket).
- Neither does this work:
  or die "Text::CSV error: " Text::CSV=error_diag
  - To concatenate strings, use the . (dot) concatenation operator. What you wrote is a syntax error: a string literal must be followed by an operator, not by another term.
  - You really like assignments? The Text::CSV=error_diag does not work. You intended to call the error_diag method on the Text::CSV class. Therefore, use the correct operator, ->, as in Text::CSV->error_diag.
- The substitution s/t+/,/g replaces all sequences of the letter t with commas. To replace tabs, use the \t escape: s/\t+/,/g.
- %arrayofhashes is not an array of hashes: it is a hash (as evidenced by the % sigil), but you use integer numbers as keys. Arrays have the @ sigil.
- To add something to the end of an array, I'd rather not keep the index of the last item in an extra variable. Rather, use the push function to add an item to the end. This reduces the amount of bookkeeping code.
- If you find yourself writing a loop like
  my $i = 0; while (condition) { do stuff; $i++ }
  then you usually want a C-style for loop:
  for (my $i = 0; condition; $i++) { do stuff; }
  This also helps with proper scoping of variables.
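The points above can be gathered into one short sketch. The sample line and the name @array_of_hashes are made up for illustration, and the Text::CSV part assumes the module is installed from CPAN (it is not a core module), so it is skipped when the module is missing:

```perl
use strict;
use warnings;

# Replace runs of tabs (not runs of the letter t) with a single comma.
my $line = "test.data.one\t\t0x1234fde0\tfloat";
(my $csv_line = $line) =~ s/\t+/,/g;
print "$csv_line\n";

# An array of hashes carries the @ sigil; push appends without any
# bookkeeping of the last index.
my @array_of_hashes;
push @array_of_hashes, { name => "test.data.one", type => "float" };
push @array_of_hashes, { name => "test.data.two", type => "float" };

# C-style for loop: $i is visible only inside the loop.
for (my $i = 0; $i < @array_of_hashes; $i++) {
    print "$i: $array_of_hashes[$i]{name}\n";
}

# Assumption: Text::CSV is available. The constructor is a class method,
# the options are a hashref built with =>, and the error message is
# joined with the . operator.
if (eval { require Text::CSV; 1 }) {
    my $csv = Text::CSV->new({ sep_char => "," })
        or die "Text::CSV error: " . Text::CSV->error_diag;
    $csv->parse($csv_line) or die "Cannot parse line";
    my @fields = $csv->fields;
    print scalar(@fields), " fields parsed\n";
}
```

For whitespace-separated input like yours, the split-based code earlier in this answer is still the simpler route; the Text::CSV part is only here to show the corrected syntax.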