Question

I'm not sure whether this question belongs in this community, since it's more of a linguistics question than a data science one. I've searched the web extensively without finding an answer, and the Linguistics Stack Exchange (beta) community also doesn't seem to be able to help. If it's off-topic here, please close it.

Heaps' Law is an empirical law stating that the number of distinct words in a document grows as a function of the document's length. The equation given on the Wikipedia page is

$$V_R(n) = Kn^\beta$$

where $V_R$ is the number of distinct words in a document of size $n$, and $K$ and $\beta$ are free parameters that are chosen empirically (usually $0 \le K \le 100$ and $0.4 \le \beta \le 0.6$).
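For what it's worth, taking logarithms of both sides (my own rearrangement, not something stated in the course) gives a log-linear form, which I suspect is related to how the slide writes it:

$$\log V_R(n) = \log K + \beta \log n$$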

I'm currently following a course on YouTube called Deep Learning for NLP, by Oxford University and DeepMind. One of the lecture slides demonstrates Heaps' Law in a rather different form:

[Slide image: vocabulary size versus corpus size plotted on log axes for several corpora, including Twitter and the Wall Street Journal, alongside an equation written in logarithmic form]

The equation given there, written with logarithms, is apparently also Heaps' Law. The fastest-growing curve is for a corpus of Twitter data and the slowest is for the Wall Street Journal. Tweets usually have less structure and more spelling errors than WSJ articles, which would explain the faster-growing curve.
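To build some intuition, I tried reproducing this kind of curve myself. Below is a minimal Python sketch (my own, not from the lecture); the file path and whitespace tokenization are assumptions on my part:

```python
# Minimal sketch: measure vocabulary growth (a Heaps'-style curve) on a
# plain-text corpus. Whitespace tokenization and the file path are my own
# assumptions, not taken from the slide.

def vocab_growth(path, step=1000):
    """Return (tokens_seen, distinct_words) pairs sampled every `step` tokens."""
    seen = set()
    points = []
    n = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            for token in line.lower().split():
                seen.add(token)
                n += 1
                if n % step == 0:
                    points.append((n, len(seen)))
    return points

# Example usage with a hypothetical corpus file:
# for n, v in vocab_growth("corpus.txt"):
#     print(n, v)
```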

My main question is: how does Heaps' Law take on the form the author has given? It's a bit of a reach, but the author didn't specify what any of the parameters are (i.e. $C$, $\alpha$, $r(w)$, $b$), and I was wondering if anybody familiar with Heaps' Law could give me some advice on how to make sense of it.
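In case it helps, this is how I've been estimating $K$ and $\beta$ for a corpus so far: an ordinary least-squares fit in log-log space, using the points produced by the sketch above (again my own attempt, not anything shown in the lecture):

```python
import numpy as np

# Minimal sketch: fit K and beta of Heaps' law V = K * n**beta by ordinary
# least squares in log-log space, assuming (n, V) pairs like those returned
# by vocab_growth above.

def fit_heaps(points):
    n = np.array([p[0] for p in points], dtype=float)
    v = np.array([p[1] for p in points], dtype=float)
    # log V = log K + beta * log n, so a degree-1 polyfit gives [beta, log K]
    beta, log_k = np.polyfit(np.log(n), np.log(v), 1)
    return np.exp(log_k), beta

# K, beta = fit_heaps(vocab_growth("corpus.txt"))
# print(f"K ≈ {K:.1f}, beta ≈ {beta:.2f}")
```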

