Question

I'm working on a project and have run into a problem. I have searched but cannot find a satisfactory answer.

I have a huge file consisting of 0s and 1s. I read 1024 bits (my chunk size) into an array chunk, and then I apply the SHA1() function, which is declared in openssl/sha.h.

char chunk[1024];
while ((fgets(chunk, 1024, fp)) != NULL)

My file can contain identical chunks, and I want to count how many chunks are the same.

After I get 1024 bits in my array chunk, I apply:

unsigned char obuf[20];

SHA1(chunk, strlen(chunk), obuf);

to get the result of the hash function.

Here is how the SHA1 function is declared:

unsigned char *SHA1(const unsigned char *d, unsigned long n, unsigned char *md);

After that, I want to store the hash result in an array. Once I have read the whole file, I'll use this array to check whether any hash results are the same, and from there I can get started on my project. But I'm stuck at this point: I cannot put the result obuf into an array.

I tried memcpy(), strcpy(), or just myarray[N][20]=obuf; etc.

If you can suggest a way, I'd be grateful. Thanks.

So the biggest problem is finding out how many hashes are unique.


Solution

Firstly, you say that your chunks of input are of size 1024. However, this line will read at most 1023 characters from your file per call (it reserves one byte for the null terminator), and it will stop early if it hits a newline:

char chunk[1024];
while ((fgets(chunk, 1024, fp)) != NULL)

(I think fread might well be closer to what you're trying to do here)
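
To make the difference concrete, here is a minimal sketch that compares how many bytes one call to each function delivers into a 1024-byte buffer. The helper names bytes_via_fgets and bytes_via_fread are made up for illustration, and the sketch assumes the file has at least 1024 bytes and no newline in that range.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 1024

/* Hypothetical helper: bytes stored by a single fgets call.
   fgets keeps one byte for the '\0' terminator, so the result is
   at most CHUNK_SIZE - 1, and less if a newline is met first. */
size_t bytes_via_fgets(FILE *fp)
{
    char buf[CHUNK_SIZE];

    if (fgets(buf, sizeof buf, fp) == NULL)
        return 0;
    return strlen(buf);
}

/* Hypothetical helper: bytes stored by a single fread call.
   fread treats the data as raw bytes, so it can fill the whole
   buffer and does not stop at newlines. */
size_t bytes_via_fread(FILE *fp)
{
    unsigned char buf[CHUNK_SIZE];

    return fread(buf, 1, sizeof buf, fp);
}
```

On a long file of 0s and 1s with no line breaks, the first helper should report 1023 and the second 1024, which is why every fgets-based chunk would be one character short.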

Secondly, you can just do something like:

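#include <stdio.h>
#include <openssl/sha.h>

#define MAX_CHUNKS 1000

unsigned char chunk[1024];
unsigned char obuf[MAX_CHUNKS][20];   /* one 20-byte SHA-1 digest per chunk */
int chunk_n = 0;

/* Check capacity first so that no chunk is read and then thrown away.
   fread returns 1 only for a full 1024-byte chunk; a short final chunk is skipped. */
while (chunk_n < MAX_CHUNKS && fread(chunk, sizeof chunk, 1, fp) > 0)
{
    SHA1(chunk, sizeof chunk, obuf[chunk_n++]);
}

/* Now have chunk_n SHA1s stored in obuf[0] through obuf[chunk_n - 1] */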
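
As for counting how many hashes are unique, one possible approach (a sketch, not the only way) is to sort the stored digests and count the places where neighbouring entries differ. The names DIGEST_LEN, store_digest and count_unique_digests below are hypothetical helpers introduced for illustration. The sketch also shows why plain assignment did not work in the question: arrays cannot be assigned with = in C, and strcpy would stop at the first zero byte in the digest, so memcpy with an explicit length of 20 is used instead.

```c
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 20   /* SHA-1 digest size in bytes */

/* Hypothetical helper: copy one digest into slot idx of a digest table.
   memcpy copies exactly DIGEST_LEN bytes, including any zero bytes. */
void store_digest(unsigned char (*table)[DIGEST_LEN], size_t idx,
                  const unsigned char *digest)
{
    memcpy(table[idx], digest, DIGEST_LEN);
}

/* qsort comparator: orders two digests byte by byte. */
static int cmp_digest(const void *a, const void *b)
{
    return memcmp(a, b, DIGEST_LEN);
}

/* Hypothetical helper: sorts the table in place (so the original chunk
   order is lost) and returns the number of distinct digests. */
size_t count_unique_digests(unsigned char (*digests)[DIGEST_LEN], size_t n)
{
    size_t unique, i;

    if (n == 0)
        return 0;

    qsort(digests, n, DIGEST_LEN, cmp_digest);

    unique = 1;   /* the first digest is always new */
    for (i = 1; i < n; i++) {
        /* After sorting, equal digests are adjacent. */
        if (memcmp(digests[i - 1], digests[i], DIGEST_LEN) != 0)
            unique++;
    }
    return unique;
}
```

With the loop above, calling count_unique_digests(obuf, chunk_n) after the file has been read would give the number of distinct chunks. Sorting costs O(n log n) comparisons; for very large files a hash table keyed on the digest may be a better fit.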
Licensed under: CC-BY-SA with attribution