How to do vocabulary estimation based on observed writings?
01-11-2019
Question
Below is a scatter plot of the data set I am dealing with. The X axis is the total number of words per essay for a particular individual, and the Y axis is the number of unique words. In principle, the number of unique words should approach the individual's vocabulary as the essays get longer.
I am attempting to estimate that individual's vocabulary from the data below, but I don't know what kind of fit would work. A logarithmic fit is unbounded, so it has no limiting value to read off as the vocabulary. A quadratic fit doesn't make sense either: a concave quadratic eventually turns downward, whereas the gradient should remain non-negative over the entire domain.
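To make those two constraints concrete (a finite limit and a gradient that never goes negative), here is a minimal sketch of one candidate shape, a saturating exponential u(n) = V * (1 - exp(-n / k)), where V would play the role of the vocabulary. This form is only an assumption for illustration; I have not checked that it suits the essays in the plot. The names `saturating` and `fit_saturating`, the grid bounds, and the synthetic numbers are all hypothetical.

```python
import math

def saturating(n, v, k):
    """Bounded growth curve: 0 at n=0, rising monotonically toward v."""
    return v * (1.0 - math.exp(-n / k))

def fit_saturating(total_words, unique_words, k_min=10.0, k_max=1e5, steps=400):
    """Least-squares fit of saturating() by scanning k on a log-spaced grid.

    For a fixed k the model is linear in v, so the best v has a closed form.
    The grid bounds are arbitrary assumptions and limit the precision of k.
    """
    best = None
    for i in range(steps):
        k = k_min * (k_max / k_min) ** (i / (steps - 1))
        f = [1.0 - math.exp(-n / k) for n in total_words]
        denom = sum(fi * fi for fi in f)
        if denom == 0.0:
            continue
        # Closed-form least-squares estimate of v for this k.
        v = sum(u * fi for u, fi in zip(unique_words, f)) / denom
        sse = sum((u - v * fi) ** 2 for u, fi in zip(unique_words, f))
        if best is None or sse < best[0]:
            best = (sse, v, k)
    _, v, k = best
    return v, k

# Synthetic, made-up data for illustration only (not the essays in the plot).
demo_n = [100, 200, 400, 800, 1200, 1600, 2000, 2500, 3000]
demo_u = [saturating(n, 5000.0, 800.0) for n in demo_n]

v_hat, k_hat = fit_saturating(demo_n, demo_u)
print(f"estimated vocabulary ~ {v_hat:.0f} words, scale k ~ {k_hat:.0f}")
```

With real essays, `demo_n` and `demo_u` would be replaced by the observed total and unique word counts. The fitted V is an extrapolation beyond the longest essay, so it should be read with caution when the data have not visibly started to level off.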
In short, I am looking for a decent model to fit the data below, and don't know where to start.
Thank you.
No accepted answer
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange