Question

I'm writing a program in C that creates an archive file from a given list of file names, much like the ar command on Linux. This is what the archive file looks like:

!<arch>
file1.txt/      1350248044  45503 13036 100660  28        `
hello
this is sample file 1
file2.txt/      1350512270  45503 13036 100660  72        `
hello
this is sample file 2
this file is a little larger than file1.txt

But I'm having difficulty extracting a file from the archive. Let's say the user wants to extract file1.txt. The idea is to find the index/location of the file name (in this case file1.txt), skip 58 characters to reach the content of the file, read the content, and write it to a new file. So here are my questions:

1) How can I get the index/location of the file name in the archive file? Note that duplicate file names are NOT allowed, so I don't have to worry about having two different indices.

2) How can I skip several characters (in this case 58) when reading a file?

3) How can I figure out when the content of a file ends? i.e. I need it to read the content and stop right before the file2.txt/ header.

Solution

My approach to solving this problem would be:

Keep header information that records each file's name, its size, and its location in the archive.

Then parse the header, use fseek() and ftell() together with fgetc() or fread() to read that file's bytes, and create a new file and write the data to it. This is the simplest way I can think of.

http://en.wikipedia.org/wiki/Ar_(Unix)#File_header <- Header of ar archives.

EXAMPLE: @programmer93 Suppose your header is 80 bytes long (the header contains the metadata of the archive file) and you have two files, one of 112 bytes and the other of 182 bytes. They are laid out one after another in a flat file (the archive file), so the layout is 80(header).112(file1.txt).182(file2.txt).EOF. If you know the size of each file, you can navigate with fseek() to a particular file and extract only that file. To extract file2.txt, I would call fseek(fp, (112+80), SEEK_SET), where fp is the archive's FILE*, and then call fgetc() 182 times.
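
The seek-then-read step from the example can be sketched roughly as below. The function name extract_at and its parameters are made up for illustration, and it uses fread() in chunks rather than fgetc() one byte at a time; it assumes you already know the offset and size of the file you want.

```c
#include <stdio.h>

/* Hypothetical helper: copy `size` bytes starting at byte `offset` of the
 * open archive into a new file called `out_name`.
 * Returns 0 on success, -1 on any error (including a short read). */
int extract_at(FILE *archive, long offset, long size, const char *out_name)
{
    /* Jump straight to the first byte of the wanted file. */
    if (fseek(archive, offset, SEEK_SET) != 0)
        return -1;

    FILE *out = fopen(out_name, "wb");
    if (out == NULL)
        return -1;

    char buf[4096];
    long left = size;                   /* bytes still to copy */
    while (left > 0) {
        size_t want = left < (long)sizeof buf ? (size_t)left : sizeof buf;
        size_t got = fread(buf, 1, want, archive);
        if (got == 0)
            break;                      /* premature EOF or read error */
        if (fwrite(buf, 1, got, out) != got)
            break;                      /* write error */
        left -= (long)got;
    }

    if (fclose(out) != 0)
        return -1;
    return left == 0 ? 0 : -1;
}
```

With the 80/112/182 layout above, extracting file2.txt would be extract_at(fp, 80 + 112, 182, "file2.txt"), and file1.txt would be extract_at(fp, 80, 112, "file1.txt").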

OTHER TIPS

If the format of the file cannot be changed by adding additional header information to help, you'll have to search through it and work things out as you go.

This should not be too hard. Just read the file, and when you read a header line such as

file1.txt/      1350248044  45503 13036 100660  28        `

you can check the filename, size, etc. (You know the first header line comes right after the !<arch> line.) If this is the file you want, the ftell() function from stdio.h will tell you exactly where you are in the archive. Since the file size in bytes is given in the header line, you can read the file by reading that many bytes ahead in the normal manner. Similarly, if it is not the file you want, you can use fseek() to move forward by the size of the file you are skipping, so that you are ready to read the header of the next file and repeat the process.
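
A minimal sketch of that scan is below. It assumes the standard ar layout described on the Wikipedia page linked above: an 8-byte "!<arch>\n" magic, then 60-byte headers (name 16, timestamp 12, owner 6, group 6, mode 8, size 10, and a 2-byte "`\n" terminator), with file names terminated by '/' and each file's content padded to an even length. The question mentions skipping 58 characters, so if your own format uses a different header length, the constants need adjusting. The name ar_find is invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8
#define AR_HDR_LEN   60   /* assumed standard ar header length */

/* Hypothetical helper: find member `name` in an open archive.
 * On success, returns the member's size and leaves the stream positioned
 * at the first byte of its content. Returns -1 if the member is not found
 * or the archive does not look like the assumed format. */
long ar_find(FILE *ar, const char *name)
{
    char magic[AR_MAGIC_LEN];
    char hdr[AR_HDR_LEN];
    size_t name_len = strlen(name);

    /* Check the global magic at the start of the archive. */
    if (fseek(ar, 0L, SEEK_SET) != 0 ||
        fread(magic, 1, AR_MAGIC_LEN, ar) != AR_MAGIC_LEN ||
        memcmp(magic, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    while (fread(hdr, 1, AR_HDR_LEN, ar) == AR_HDR_LEN) {
        /* Every header ends with a backquote and a newline. */
        if (hdr[58] != '`' || hdr[59] != '\n')
            return -1;

        /* The size field is 10 characters of decimal text at offset 48. */
        char size_text[11];
        memcpy(size_text, hdr + 48, 10);
        size_text[10] = '\0';
        long size = strtol(size_text, NULL, 10);
        if (size < 0)
            return -1;

        /* Names are stored as "file1.txt/" padded with spaces. */
        if (name_len > 0 && name_len < 16 &&
            memcmp(hdr, name, name_len) == 0 && hdr[name_len] == '/')
            return size;            /* stream is now at the content */

        /* Not the wanted file: skip its content, plus one padding byte
         * if the size is odd. */
        if (fseek(ar, size + (size & 1), SEEK_CUR) != 0)
            return -1;
    }
    return -1;                      /* reached the end without a match */
}
```

After long n = ar_find(fp, "file1.txt") returns a non-negative value, reading exactly n bytes with fread() gives the file's content, and the read stops right before the next header. Calling ftell(fp) at that point would give the content's location, if you want to record it.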

Licensed under: CC-BY-SA with attribution