質問

Im trying to implement External Sorting in C.

I have to read N integers (fixed depending on main memory) from a file initially so that I can apply quicksort on them and then continue with the merging process.

I can think of these 2 ways:

  1. read N integers one by one from the file and put them in an array then sort them.
  2. read a bulk of data into a big char array and then reading integers from it using sscanf.

1st method is clearly slow and 2nd method is using lot of extra memory (but we have a limited main memory)

Is there any better way?

役に立ちましたか?

解決

Don't try to be more clever than your OS, it probably supports some clever memory management functions, which will make your life easier, and your code faster.

Assuming you are using a POSIX compliant operating system, then you can use mmap(2).

  1. Map your file into memory with mmap
  2. Sort it
  3. Sync it

This way the OS will handle swapping out data when room is tight, and swap it in when you need it.

他のヒント

Since stdio file operations are buffered, you won't really need to worry about the first option, especially if the file isn't huge. Remember you're not operating directly on a file, but a representation of that file in memory.

For example, if you scan in one number at a time, the system will read in a much larger section from the file (on my system it's 4096 bytes, or the entire file if it's shorter).

you can use below function to read ints from file one by one and continue sorting and merging on the go....

the function takes filename and integer count as argument and it returns int from file.

int read_int (const char *file_name, int count)
{
  int err = -1;
  int num = 0;

  int fd = open(filename, O_RDONLY);
  if(fd < 0)
  {
    printf("error opening file\n");
    return (fd);
  }

  err = pread(fd, &num, sizeof(int), count*sizeof(int));
  if(err < 0)
  {
    printf("End of file reached\n");
    return (err);
  }

  close(fd);
  return (num);  
}

Sort at the same time you read is the best way. and save your data into linked list instead of array is more efficient in the sort

you can use fscanf() to read integer by integer from file. and try to sort at the moment you read integer from the file. I mean when you read integer from the file put it in the array in the right place to get the array sorted when you finish reading.

The following example read from file integer by integer and insert them with sort at the same time of reading. the integer are saved into arrays and not into linked list

void sort_insert(int x, int *array, int len)
{
    int i=0, j;
    for(i=0; i<(len-1); i++)
    {
        if (x > array[i])
            continue;
        for (j=(len-1); j>i; j--)
            array[j] = array[j-1];
        break;
    }
    array[i] = x;
}

void main() {
    int x, i;
    int len = 0;
    int array[50];
    FILE *fp = fopen("myfile.txt", "r");

    while (len<50 && fscanf(fp, " %d",&x)>0)
    {
        len++;
        sort_insert(x, array, len);
    }
    for (i=0; i<len; i++)
    {
        printf("array[%d] = %d\n", i, array[i]);
    }

}
ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top