Help combining Perl code routines for file processing
15-11-2019
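Question
I need some Perl help getting these two processes/pieces of code to work together. I was able to get each one working on its own for testing, but I need help bringing them together, especially with the loop constructs. I'm not sure whether I should use foreach. Anyway, the code is below.
Also, since I am still learning this language, any best practices would be great too. Thanks for your help.
Here is the process flow I am looking for:
- Read a directory
- Look for particular files
- Use the file name to strip out some key information, which is used to name the newly processed file
- Process the input file
- Create a newly processed file for each input file read (if I read in 10, I create 10 new files)
Part 1: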
my $target_dir = "/backups/test/";
opendir my $dh, $target_dir or die "can't opendir $target_dir: $!";
while (defined(my $file = readdir($dh))) {
    next if ($file =~ /^\.+$/);
    #Get filename attributes
    if ($file =~ /^foo(\d{3})\.name\.(\w{3})-foo_p(\d{1,4})\.\d+.csv$/) {
        print "$1\n";
        print "$2\n";
        print "$3\n";
    }
    print "$file\n";
}
Part 2:
use strict;
use Digest::MD5 qw(md5_hex);
#Create new file
open (NEWFILE, ">/backups/processed/foo$1.name.$2-foo_p$3.out") || die "cannot create file";
my $data = '';
my $line1 = <>;
chomp $line1;
my @heading = split /,/, $line1;
my ($sep1, $sep2, $eorec) = ( "^A", "^E", "^D");
while (<>) {
    my $digest = md5_hex($data);
    chomp;
    my (@values) = split /,/;
    my $extra = "__mykey__$sep1$digest$sep2" ;
    $extra .= "$heading[$_]$sep1$values[$_]$sep2" for (0..scalar(@values));
    $data .= "$extra$eorec";
    print NEWFILE "$data";
}
#print $data;
close (NEWFILE);
Solution
I've bashed your two code fragments together (making the second a sub that the first calls for each matching file) and, if I understood your description of the objective correctly, this should do what you want. Comments on style and syntax are inline:
#!/usr/bin/env perl
# - Never forget these!
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
my $target_dir = "/backups/test/";
opendir my $dh, $target_dir or die "can't opendir $target_dir: $!";
while (defined(my $file = readdir($dh))) {
    # Parens on postfix "if" are optional; I prefer to omit them
    next if $file =~ /^\.+$/;
    # - Escaped the "." before "csv" so that it only matches a literal dot
    if ($file =~ /^foo(\d{3})\.name\.(\w{3})-foo_p(\d{1,4})\.\d+\.csv$/) {
        # - readdir returns bare file names, so prepend the directory
        #   before handing the path to the sub that opens it
        process_file("$target_dir$file", $1, $2, $3);
    }
    print "$file\n";
}
sub process_file {
    my ($orig_name, $foo_x, $name_x, $p_x) = @_;
    my $new_name = "/backups/processed/foo$foo_x.name.$name_x-foo_p$p_x.out";
    # - From your description of the task, it sounds like we actually want to
    #   read from the found file, not from <>, so opening it here to read
    # - Better to use lexical ("my") filehandle and three-arg form of open
    # - "or" has lower operator precedence than "||", so less chance of
    #   things being grouped in the wrong order (though either works here)
    # - Including $! in the error will tell why the file open failed
    open my $in_fh, '<', $orig_name or die "cannot read $orig_name: $!";
    open(my $out_fh, '>', $new_name) or die "cannot create $new_name: $!";
    my $data = '';
    my $line1 = <$in_fh>;
    chomp $line1;
    my @heading = split /,/, $line1;
    my ($sep1, $sep2, $eorec) = ("^A", "^E", "^D");
    while (<$in_fh>) {
        chomp;
        my $digest = md5_hex($data);
        my (@values) = split /,/;
        my $extra = "__mykey__$sep1$digest$sep2";
        # - 0 .. scalar(@values) runs one index past the end of the array;
        #   $#values is the index of the last element
        $extra .= "$heading[$_]$sep1$values[$_]$sep2"
            for 0 .. $#values;
        # - Useless use of double quotes removed on next two lines
        $data .= $extra . $eorec;
        #print $out_fh $data;
    }
    # - Moved print to output file to here (where it will print the complete
    #   output all at once) rather than within the loop (where it will print
    #   all previous lines each time a new line is read in) to prevent
    #   duplicate output records. This could also be achieved by printing
    #   $extra inside the loop. Printing $data at the end will be slightly
    #   faster, but requires more memory; printing $extra within the loop and
    #   getting rid of $data entirely would require less memory, so that may
    #   be the better option if you find yourself needing to read huge input
    #   files.
    print $out_fh $data;
    # - $in_fh and $out_fh will be closed automatically when they go out of
    #   scope at the end of the block/sub, so there's no real point to
    #   explicitly closing them unless you're going to check whether the close
    #   succeeded or failed (which can happen in odd cases usually involving
    #   full or failing disks when writing; I'm not aware of any way that
    #   closing a file open for reading can fail, so that's just being left
    #   implicit)
    close $out_fh or die "Failed to close file: $!";
}
Disclaimer: perl -c reports that this code is syntactically valid, but it is otherwise untested.
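The inline comment about printing each record inside the loop, instead of accumulating $data, can be sketched as follows. One wrinkle is that the digest is computed over everything accumulated so far, so dropping $data means tracking that digest some other way. This sketch assumes the object interface of Digest::MD5 (new, add, clone, hexdigest) is acceptable for that; the sub name convert_csv is hypothetical, and the separators are copied verbatim from the code above.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: reads CSV from $in_fh and writes one record per
# data line to $out_fh without keeping the whole output in memory.
sub convert_csv {
    my ($in_fh, $out_fh) = @_;
    my ($sep1, $sep2, $eorec) = ("^A", "^E", "^D");

    my $line1 = <$in_fh>;
    return unless defined $line1;    # empty input: nothing to write
    chomp $line1;
    my @heading = split /,/, $line1;

    # Running MD5 over every record written so far; this stands in for
    # md5_hex($data) in the accumulate-everything version.
    my $ctx = Digest::MD5->new;

    while (my $line = <$in_fh>) {
        chomp $line;
        my @values = split /,/, $line;

        # hexdigest resets the object, so take the digest from a clone.
        my $digest = $ctx->clone->hexdigest;

        my $record = "__mykey__$sep1$digest$sep2";
        $record .= "$heading[$_]$sep1$values[$_]$sep2" for 0 .. $#values;
        $record .= $eorec;

        print {$out_fh} $record;    # write this record immediately
        $ctx->add($record);         # fold it into the running digest
    }
    return;
}
```

Called as convert_csv($in_fh, $out_fh) from inside process_file, this should write the same bytes as building $data and printing it once, while holding only one record in memory at a time.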
Other tips
You are using an old style of Perl programming. I recommend using functions and CPAN modules (http://search.cpan.org). Perl pseudocode:
use Modern::Perl;
# use...
sub get_input_files {
    # returns an array of files (@)
}
sub extract_file_info {
    # takes the file name and returns an array of values (filename attrs)
}
sub process_file {
    # reads the input file, takes the previous attribs and builds the output file
}
my @ifiles = get_input_files();
foreach my $ifile (@ifiles) {
    my @attrs = extract_file_info($ifile);
    process_file($ifile, @attrs);
}
Hope it helps
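As a rough illustration of how the first two stubs above might be filled in, here is a minimal sketch. It uses only core Perl rather than Modern::Perl, takes the directory as a parameter (an assumption; the skeleton passes none), and reuses the file name pattern from the question with the dot before csv escaped. The bodies are one possible reading of the skeleton, not the answerer's code.

```perl
use strict;
use warnings;

# Pattern from the question, with the "." before "csv" escaped.
my $NAME_RE = qr/^foo(\d{3})\.name\.(\w{3})-foo_p(\d{1,4})\.\d+\.csv$/;

# Returns the sorted names (without directory) of matching files in $dir.
sub get_input_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "can't opendir $dir: $!";
    my @files = grep { $_ =~ $NAME_RE } readdir $dh;
    closedir $dh;
    @files = sort @files;
    return @files;
}

# In list context, returns the three captured attributes,
# or an empty list if the name does not match.
sub extract_file_info {
    my ($name) = @_;
    return $name =~ $NAME_RE;
}
```

With these in place, the driver loop from the skeleton could call process_file("$dir/$ifile", extract_file_info($ifile)) for each name returned by get_input_files($dir).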