Question

Forgive my n00bosity:

I am looking to do a find and replace on a large file of MARC records. I want to search for all strings starting with newline =586 and then remove the period at the end of the line, keeping the data in between intact.

I have tried quite a few permutations and none of them seemed to work. I feel I am missing something obvious here. Help?!?

Solution

Try this:

Search: (^=586.*)\.$
Replace: \1

I think this would be the equivalent Perl substitution (Perl uses s/// for replacement and $1 for the captured group):

s/^(=586.*)\.$/$1/

Note: I don't speak Perl, so the syntax might be a little off.

Other tips

While a regex may help you in this case, if you manipulate MARC records regularly, I suggest that you use one of the MARC processing modules on CPAN. You can read your records out of the file, manipulate what you need to as objects, and then write them back out.

http://search.cpan.org/dist/MARC-Record/ is the one that I wrote back in 2001 and is still being maintained today.

You may also be interested in perl4lib: http://perl4lib.perl.org/

In-place replace:

perl -i -pe '/^ =586/x and s| [.]$||x' file

I imagine that you tried to build a regex that would match the entire line, each part of it as precisely as possible, and couldn't get it right. In general, if you want to make a quick change to every line in a file that has some distinctive feature, just start with:

perl -pe 'if (distinctive) { changes }' oldfile > newfile

So in this case:

perl -pe 'if (/^=586/) { s/\.$// }' oldfile > newfile

Or:

# saves original in thefile.bak
perl -i.bak -pe 'if (/^=586/) { s/\.$// }' thefile

If what distinguishes the line is the value in a particular whitespace-separated column (and no columns are missing), pass the -a flag and find the columns in the @F array:

# censor 4k-sized files
ls -l|perl -ape 'if ($F[4] == 4096) { s/./-/g }'

If you don't want to change the file, but rather get some information from it, -n and final processing in an END block can take you quite far:

# sum file sizes
ls -l|perl -lane 'next if /^d/; $bytes += $F[4]; END { print $bytes }'

# print unique owners of files in this directory, preceded by the
# number of occurrences of the owner
ls -l|perl -lane '$users{$F[2]}++; END { print "$users{$_} $_" for keys %users }'

mpapec's answer (the one-liner with -i above) is neatly expressed if you know at the outset that there will only be one change (you could also write it as s/\.$// if /^=586/).

Note that this isn't the kind of Perl you'd want to write in a fully-featured, not-once-off, for-use-even-by-people-who-may-not-know-what-they're-doing program. It doesn't use strict, and it doesn't declare its variables. I don't even pass the -w flag, and that only costs a letter!

But Perl wants to be useful. If your thought is "I want to remove the final . from any line beginning with =586", then you can do this as simply as in the second or third example above. You may still want to come here and say "hey, I'm modifying MARC records with a hundred untitled one-liners and I'm starting to feel bad...", to learn that MARC modules exist on CPAN, but there's no reason to have any difficulty coming up with the one-liners that get your job done.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow