What does the number after a machine learning model name mean?

https://datascience.stackexchange.com/questions/69045

09-12-2020

Question

I'm not sure if this is off-topic, but I'm posting here anyway.

I've seen that lots of machine learning models have something like an ID after their names, for example resnet101, resnet152, densenet201, etc. What exactly do the numbers 101, 152 and 201 mean, and how are they determined?

Solution

As @Icrmorin said, naming conventions vary, but for the examples you gave, ResNet and DenseNet, the number in the name corresponds to the number of layers:

DenseNet

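Table 1 in the DenseNet paper provides an overview of the architectures.

In the DenseNet-121 column, for example, the network has $1+6*2+1+12*2+1+24*2+1+16*2 + 1 = 121$ layers, and that is where the name comes from. The terms are, in order: the initial convolution, then four dense blocks of 6, 12, 24 and 16 units with two convolutions each, separated by three transition layers with one convolution each, and finally the fully connected classification layer.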
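The same count can be reproduced for the other variants. This is a minimal Python sketch, assuming the dense-block sizes listed in Table 1 of the DenseNet paper (the 169 and 201 configurations come from that table, not from the text above). `densenet_depth` is a hypothetical helper name, not part of any library:

```python
def densenet_depth(block_config):
    """Count the conv and dense layers of a DenseNet, ignoring pooling."""
    initial_conv = 1
    # Each unit inside a dense block is a 1x1 conv followed by a 3x3 conv.
    dense_block_convs = 2 * sum(block_config)
    # One 1x1 transition conv sits between consecutive dense blocks.
    transition_convs = len(block_config) - 1
    classifier = 1  # final fully connected layer
    return initial_conv + dense_block_convs + transition_convs + classifier


# Dense-block sizes per variant (assumed from Table 1 of the DenseNet paper).
densenet_configs = {
    "DenseNet-121": (6, 12, 24, 16),
    "DenseNet-169": (6, 12, 32, 32),
    "DenseNet-201": (6, 12, 48, 32),
}

for name, config in densenet_configs.items():
    print(name, densenet_depth(config))
```

Each computed depth should match the number in the model's name.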

ResNet

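The ResNet paper provides a similar overview in its Table 1.

Again, you can see how the names are derived: for example, ResNet-18 has $1+2*2+2*2+2*2+2*2+1=18$ layers, that is, the initial convolution, four stages of two residual blocks with two convolutions each, and the final fully connected layer.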
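The ResNet arithmetic generalizes the same way. This is a rough Python sketch, assuming the per-stage block counts from Table 1 of the ResNet paper (only ResNet-18 is worked out in the text above; the other rows are taken from that table). `resnet_depth` is a hypothetical helper name:

```python
def resnet_depth(blocks_per_stage, convs_per_block):
    """Count the conv and dense layers of a ResNet, ignoring pooling.

    Convolutions on shortcut (projection) connections are not counted.
    """
    initial_conv = 1
    residual_convs = convs_per_block * sum(blocks_per_stage)
    classifier = 1  # final fully connected layer
    return initial_conv + residual_convs + classifier


# (blocks per stage, convs per block), assumed from Table 1 of the ResNet paper.
# Basic blocks have 2 convs; bottleneck blocks have 3.
resnet_configs = {
    "ResNet-18": ((2, 2, 2, 2), 2),
    "ResNet-34": ((3, 4, 6, 3), 2),
    "ResNet-50": ((3, 4, 6, 3), 3),
    "ResNet-101": ((3, 4, 23, 3), 3),
    "ResNet-152": ((3, 8, 36, 3), 3),
}

for name, (blocks, convs) in resnet_configs.items():
    print(name, resnet_depth(blocks, convs))
```

ResNet-34 and ResNet-50 use the same block counts; the deeper one simply swaps two-convolution basic blocks for three-convolution bottleneck blocks.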

Note that both papers count only the convolutional and dense (fully connected) layers; pooling layers are not counted.

OTHER TIPS

Sometimes it refers to a version (like Windows 10), and sometimes it refers to a size: the size of a layer, the number of parameters (as with GPT models), or the size of the parameters in memory. They are just names so that we know what we are talking about. I don't think there is a general convention.

Licensed under: CC-BY-SA with attribution
