Question

Suppose I have a text file that says: This file has plain text

Now, I want to divide this text file into n parts with an equal number of characters. Suppose the user wants three divisions; then 1.txt, 2.txt and 3.txt should be created, with about 8 characters each.

(The next part of this program is to rejoin these files back into the original state, but I'm sure I'll be able to do that myself if I can get help with this first part.)

Can you guys guide me in this matter?

Language used is C.

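#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    /* Expected call: program <file> <number of chunks> */
    if (argc < 3)
    {
        fprintf(stderr, "Usage: %s <file> <chunks>\n", argv[0]);
        return 1;
    }

    const char* cFileName = argv[1];
    int iChunkNo = atoi(argv[2]);
    if (iChunkNo <= 0)
    {
        fprintf(stderr, "The number of chunks must be positive\n");
        return 1;
    }

    printf("The file will be divided into %d chunks \n", iChunkNo);

    FILE* file_read_pointer = fopen(cFileName, "r");
    if (file_read_pointer == NULL)
    {
        perror(cFileName);
        return 1;
    }

    /* Count the characters; ch must be an int so that EOF can be detected */
    int iCount = 0;
    int ch;
    while ((ch = fgetc(file_read_pointer)) != EOF)
        ++iCount;

    printf("The number of characters in the file is: %d \n", iCount);

    int iCharPerFile = iCount / iChunkNo;
    int iRemainder = iCount % iChunkNo;   /* leftover characters go into the last chunk */

    printf("The number of characters per chunk file will be: %d \n", iCharPerFile);

    /* Go back to the start of the file before reading it a second time */
    rewind(file_read_pointer);

    for (int j = 1; j <= iChunkNo; j++)
    {
        char name[32];
        snprintf(name, sizeof name, "%d.txt", j);

        /* Open the chunk file, fill it, and close it before moving on */
        FILE* file_write_pointer = fopen(name, "w");
        if (file_write_pointer == NULL)
        {
            perror(name);
            fclose(file_read_pointer);
            return 1;
        }

        int iToWrite = iCharPerFile + (j == iChunkNo ? iRemainder : 0);
        for (int i = 0; i < iToWrite; i++)
        {
            ch = fgetc(file_read_pointer);
            if (ch == EOF)
                break;
            fputc(ch, file_write_pointer);
        }

        fclose(file_write_pointer);
    }

    fclose(file_read_pointer);
    return 0;
}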

Solution

You could do something like this:

  • Find the size of the file (using fseek and ftell). Then seek back to the beginning
  • Determine the size you need to write to each file (size / n)
  • Read one character at a time (use getc) and write to the corresponding file (putc)
    • Use sprintf(fname, "%d.txt", index) to build names like "1.txt"
    • Use fopen to open files and keep a FILE *current to which you write at each step
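The steps above can be sketched roughly as follows. This is only one possible reading of the list, not a definitive implementation: the function name split_file is made up for illustration, the file is opened in binary mode so that ftell gives a byte count, and leftover bytes are assumed to go into the last part.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: splits `path` into n files named 1.txt .. n.txt.
   Returns 0 on success, -1 on failure. */
int split_file(const char *path, int n)
{
    assert(path != NULL && n > 0);

    /* Binary mode, so that ftell reports a byte count */
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    /* Find the size: seek to the end, ask for the position, seek back */
    if (fseek(in, 0L, SEEK_END) != 0) { fclose(in); return -1; }
    long size = ftell(in);
    if (size < 0) { fclose(in); return -1; }
    rewind(in);

    long per_file = size / n;   /* base size of each part */
    long extra = size % n;      /* assumption: leftover bytes go to the last part */

    for (int index = 1; index <= n; index++) {
        char fname[32];
        snprintf(fname, sizeof fname, "%d.txt", index);

        /* `current` is the file being written at this step */
        FILE *current = fopen(fname, "wb");
        if (current == NULL) { fclose(in); return -1; }

        long to_write = per_file + (index == n ? extra : 0);
        for (long i = 0; i < to_write; i++) {
            int c = getc(in);   /* int, not char, so EOF is distinguishable */
            if (c == EOF)
                break;
            putc(c, current);
        }
        fclose(current);
    }

    fclose(in);
    return 0;
}
```

Called as split_file("input.txt", 3) on the 24-character example from the question, this should leave 8 characters in each of 1.txt, 2.txt and 3.txt.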

That being said, you should start slow. First make a program that simply copies one file into another using getc + putc and work your way up.
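A minimal sketch of that first exercise, copying one file into another with getc and putc, might look like this. The function name copy_file and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies src to dst one character at a time.
   Returns the number of characters copied, or -1 on failure. */
long copy_file(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    long copied = 0;
    int c;   /* int, not char: getc returns EOF as an out-of-band value */
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF) {
            fclose(in);
            fclose(out);
            return -1;
        }
        copied++;
    }

    fclose(in);
    if (fclose(out) != 0)
        return -1;
    return copied;
}
```

Once this works, splitting is mostly a matter of switching the output file after every size / n characters.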

OTHER TIPS

You're going to need to work out how much text there is in total, possibly by 'slurping' the whole file into memory. This will work fine for files up to megabytes in size, but not for gigabytes and beyond.
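As a rough illustration of 'slurping', the sketch below reads a whole file into a heap buffer. The name slurp is hypothetical, and the code assumes the file is small enough for a single malloc and that fseek/ftell on a binary stream report its size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file into a malloc'd buffer.
   On success returns the buffer (NUL-terminated for convenience) and
   stores its length in *len; the caller must free() it.
   Returns NULL on failure. */
char *slurp(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return NULL;

    /* Work out the size by seeking to the end and back */
    if (fseek(in, 0L, SEEK_END) != 0) { fclose(in); return NULL; }
    long size = ftell(in);
    if (size < 0) { fclose(in); return NULL; }
    rewind(in);

    char *data = malloc((size_t)size + 1);
    if (data == NULL) { fclose(in); return NULL; }

    size_t got = fread(data, 1, (size_t)size, in);
    fclose(in);
    if (got != (size_t)size) { free(data); return NULL; }

    data[got] = '\0';
    *len = got;
    return data;
}
```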

Then you need to know how many parts to split it into (command line argument processing?), and therefore how much data to write to each part. This will be followed by creating each of the output files, writing the right segment of data to the file, and closing it.
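Assuming the data is already in memory, the create/write/close step could be sketched like this. The name write_parts, the 1.txt .. n.txt naming (borrowed from the question) and the choice to give the last part any leftover bytes are all assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `len` bytes from `data` into n files
   named 1.txt .. n.txt. Returns 0 on success, -1 on failure. */
int write_parts(const char *data, size_t len, int n)
{
    assert(data != NULL && n > 0);

    size_t base = len / (size_t)n;   /* size of every part except the last */
    size_t offset = 0;               /* start of the current segment */

    for (int index = 1; index <= n; index++) {
        /* Assumption: the last part takes whatever is left over */
        size_t part = (index == n) ? len - offset : base;

        char fname[32];
        snprintf(fname, sizeof fname, "%d.txt", index);

        FILE *out = fopen(fname, "wb");
        if (out == NULL)
            return -1;

        size_t written = fwrite(data + offset, 1, part, out);
        if (fclose(out) != 0 || written != part)
            return -1;

        offset += part;
    }
    return 0;
}
```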

Suppose the file has 29 bytes and you need to split it into 3 parts. Will you create 2 parts with 9 bytes and one with 11, or 2 with 10 and one with 9? What if it is 28 bytes long and you need 5 parts? (4 x 5 bytes + 1 x 8 bytes? 3 x 6 bytes + 2 x 5 bytes? Or is that 2 x 5 bytes + 3 x 6 bytes?)
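One way to settle this, shown here only as a sketch, is to spread the remainder over the first parts so that no two parts differ by more than one byte. The function name part_size and the zero-based index are assumptions for the example; putting the remainder in the last part instead is just as valid a policy.

```c
#include <assert.h>

/* Hypothetical helper: size of part `index` (0-based) when `total` bytes
   are split into n parts, with the first (total % n) parts getting one
   extra byte each. */
long part_size(long total, int n, int index)
{
    assert(total >= 0 && n > 0 && index >= 0 && index < n);

    long base = total / n;    /* every part gets at least this much */
    long extra = total % n;   /* this many parts get one more byte */

    return base + (index < extra ? 1 : 0);
}
```

With this policy, 29 bytes in 3 parts come out as 10, 10 and 9, and 28 bytes in 5 parts as 6, 6, 6, 5 and 5.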

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow