Question

I am trying to understand a code in word2vec project. The file I am referring to is word2vec.c. The snippet of the code is :

#define EXP_TABLE_SIZE 1000
#define MAX_EXP 6
//...snip...
expTable = (real *)malloc((EXP_TABLE_SIZE + 1) * sizeof(real));
  for (i = 0; i < EXP_TABLE_SIZE; i++) {
    // Precompute the exp() table
    expTable[i] = exp((i / (real)EXP_TABLE_SIZE * 2 - 1) * MAX_EXP); 
    // Precompute f(x) = x / (x + 1)
    expTable[i] = expTable[i] / (expTable[i] + 1);
  }
//...snip...

It is not clear what is the benefit of precomputing these values. Could someone explain?

Was it helpful?

Solution 2

The table holds the values of exp for arguments in the range -6 to 6. The function is sampled at 1001 points.

The following code in the same source file word2vec.c uses this table to calculate exponents:

    ...
    if (f <= -MAX_EXP) continue;
    else if (f >= MAX_EXP) continue;
    else f = expTable[(int)((f + MAX_EXP) * (EXP_TABLE_SIZE / MAX_EXP / 2))];
    ...

(So if you wonder what value there is at the first cell of the table - it's exp(-6))

OTHER TIPS

this this piece of code calculate a logistic function table in range [-6, 6) and the step size is 1 / EXP_TABLE_SIZE * 2 = 0.002. I will explain in detail.

if x in range [0, 1) then x * 2 - 1 will in range [-1, 1) and (x * 2 - 1) * MAX_EXP will in range [-MAX_EXP, MAX_EXP). Replace the x with i / EXP_TABLE_SIZE, you get the first line in the code. So, the first line calculate EXP(x), x in range [-6, 6), and the step size is 1 / EXP_TABLE_SIZE * 2. the second line, EXP(X) / (EXP(X) + 1) is a logistic function.

When using the table, you need to transform a real number in range (-6, 6) to an index number in range [0, 1000). In the code f is in range (-6, 6). f + MAX_EXP is in range (0, 12), (f + MAX_EXP) / MAX_EXP / 2 is in range (0, 1). The multiply this value with EXP_TABLE_SIZE you get the index according to the f value.

Finally, you get logistic(f)

You frequently see these types of tables used in finite state machines. These tables are often generated at runtime and basically offer a faster execution time at the expense of the memory allocated to store it.

Once the table is calculated/generated, all you need to do to get an answer is properly index the table. The idea is that finding and indexing an existing table is much faster than performing the calculations from scratch each time.

This is a great Question, there are some great answers!

What is an ExpTable used for? Writing exponential functions from tables

It might be worth going through the excellent Kahn Academy Series on exp if you are not already familiar with theie uses.

You can see here:

enter image description here

The 'exp' function is applied to Vectors (v"sub wo Transpose) multiplied by (v sub wI)

You can read the paper on Word2Vec here, Distributed Representations of Words and Phrases and their Compositionality

else f = expTable[(int)((f + MAX_EXP) * (EXP_TABLE_SIZE / MAX_EXP / 2))];

As you can see, at line: 443, above snippet, again at: 468, and 499 of the word2vec.c Code File a Lookup is performed as Peng Liu says.

A fantastic blog can be read here for further information: Word2Vec Implementations of Gradient and expTable

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top