Unable to separate codes in one file to many files in AWK/Python
Question
I need to split the different programs bundled in one file into many separate files. The file is apparently shared by AWK's creators on their homepage. The file is also here for easy use.
My attempt to the problem
I can get the first word of each line, which shows where each program is located, with
awk '{ print $1 }'
However, I do not know how
- to get the exact line numbers so that I can use them
- to collect codes between the specific lines so that the first word of each line is ignored
- to put these separate codes into new files, each named after the first word of its lines
I am sure that the problem can be solved by AWK and with Python too. Perhaps, we need to use them together.
[edit] after the first answer
I get the following error when I try to execute it with awk
$awk awkcode.txt
awk: syntax error at source line 1
context is
>>> awkcode <<< .txt
awk: bailing out at source line 1
Solution
Did you try to:
- Create a file unbundle.awk with the following content:
$1 != prev { close(prev); prev = $1 } { print substr($0, index($0, " ") + 1) >$1 }
- Remove the following lines from the file awkcode.txt:
# unbundle - unpack a bundle into separate files
$1 != prev { close(prev); prev = $1 } { print substr($0, index($0, " ") + 1) >$1 }
- Run the following command:
awk -f unbundle.awk awkcode.txt
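To see what the one-liner does with each line, here is a rough Python equivalent of its two expressions, $1 and substr($0, index($0, " ") + 1). This is only a sketch: the name split_bundle_line is made up for illustration, and it assumes every line starts with the file name followed by a single space.

```python
def split_bundle_line(line):
    """Return (file_name, text) for one bundle line.

    Mirrors the awk expressions $1 and substr($0, index($0, " ") + 1).
    """
    fields = line.split()
    # $1: the first whitespace-separated field (empty on a blank line).
    name = fields[0] if fields else ""
    # awk's index() is 0 when there is no space, so substr() returns the
    # whole line; str.find() returning -1 gives the same result after +1.
    text = line[line.find(" ") + 1:]
    return name, text


print(split_bundle_line("hello.awk BEGIN { print 1 }"))
# ('hello.awk', 'BEGIN { print 1 }')
```

Only the first space is consumed, so any indentation after it survives in the unpacked file.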
OTHER TIPS
Are you trying to unpack a file in that format? It's a kind of shell archive. For more information, see http://en.wikipedia.org/wiki/Shar
If you execute that program with awk, awk will create all those files. You don't need to write or rewrite much. You can simply run that awk program, and it should still work.
First, view the file in "plain" format. http://dpaste.com/12282/plain/
Second, save the plain version of the file as 'awkcode.shar'
Third, I think you need to use the following command.
awk -f awkcode.shar
If you want to replace it with a Python program, it would be something like this.
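import urllib.request

# Download the bundle; each line has the form "<filename> <text>".
data = urllib.request.urlopen("http://dpaste.com/12282/plain/")
currName, currFile = None, None
for raw in data:
    line = raw.decode("utf-8").rstrip("\r\n")
    fileName, _, text = line.partition(" ")
    if not fileName:
        continue  # skip blank lines, which carry no file name
    if fileName != currName:
        # A new first word starts a new output file.
        if currFile is not None:
            currFile.close()
        currName = fileName
        currFile = open(currName, "w")
    # Write the line without its leading file name.
    currFile.write(text + "\n")
if currFile is not None:
    currFile.close()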
The file awkcode.txt must not contain any blank lines. The awk program has no check for them: on a blank line $1 is empty, so the output redirection has no file name and awk stops with an error. It took me several days of struggling to find this out.
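One way to check for this before running awk is to list the blank lines by number, or to drop them. A minimal Python sketch, assuming the bundle has already been read into a list of lines; the helper names blank_line_numbers and drop_blank_lines are chosen here for illustration:

```python
def blank_line_numbers(lines):
    """Return the 1-based numbers of lines that are empty or whitespace-only."""
    return [n for n, line in enumerate(lines, start=1) if not line.strip()]


def drop_blank_lines(lines):
    """Return the lines with empty and whitespace-only lines removed."""
    return [line for line in lines if line.strip()]


sample = ["a.awk { print }\n", "\n", "b.awk { n++ }\n", "   \n"]
print(blank_line_numbers(sample))  # [2, 4]
print(drop_blank_lines(sample))    # ['a.awk { print }\n', 'b.awk { n++ }\n']
```

In a well-formed bundle, an empty line of a bundled program should still carry its file name prefix, so completely blank lines are presumably stray and dropping them should not lose content.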