Gumbel Softmax vs Vanilla Softmax for GAN training

https://datascience.stackexchange.com/questions/36551

machine-learning
sampling
training
text-generation
gan

31-10-2019
Question

When training a GAN for text generation, I have seen many people feed the Gumbel-softmax of the generator output into the discriminator. This is done to bypass the problem of having to sample a token from the softmax distribution, which is a non-differentiable operation and hence stops gradients from flowing back to the generator.
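For reference, here is a minimal sketch of the forward computation behind a Gumbel-softmax sample, written in plain Python so it runs without a deep learning framework. The function names (`softmax`, `sample_gumbel`, `gumbel_softmax`), the example logits, and the temperature value are all illustrative assumptions, not taken from any particular GAN implementation. The randomness enters only through noise that does not depend on the logits, which is why an autodiff framework can backpropagate through the result.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gumbel(rng):
    """Draw Gumbel(0, 1) noise as -log(-log(U)) with U ~ Uniform(0, 1)."""
    u = max(rng.random(), 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed (soft) sample: softmax of noise-perturbed logits."""
    noisy = [x + sample_gumbel(rng) for x in logits]
    return softmax(noisy, temperature)


# Hypothetical generator logits over a 3-token vocabulary.
rng = random.Random(0)
logits = [2.0, 1.0, 0.1]
y = gumbel_softmax(logits, temperature=0.5, rng=rng)
print(y)  # a probability vector that changes from draw to draw
```

This only shows the forward pass; in a real model the same operations would be applied to framework tensors so that gradients with respect to `logits` are available.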

My question, though, is: why not just feed the regular softmax output (no argmax!) from the generator directly into the discriminator? What is the benefit of using the Gumbel-softmax instead?
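To make the two options being compared concrete, the sketch below puts them side by side for one set of logits. It is only an illustration under assumed values (the logits, the temperature of 0.1, and the sample count are made up), and all helper names are hypothetical. It checks three properties: the plain softmax is deterministic, low-temperature Gumbel-softmax samples are much closer to one-hot vectors, and the argmax of those samples is distributed roughly according to the softmax probabilities.

```python
import math
import random


def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, temperature, rng):
    # Add independent Gumbel(0, 1) noise to each logit, then apply softmax.
    noisy = [
        x - math.log(-math.log(max(rng.random(), 1e-12))) for x in logits
    ]
    return softmax(noisy, temperature)


logits = [2.0, 1.0, 0.1]   # assumed generator logits for one position
rng = random.Random(0)
n_samples = 2000

# Option 1: plain softmax. Same logits always give the same dense vector.
plain = softmax(logits)
plain_again = softmax(logits)

# Option 2: Gumbel-softmax at a low temperature. Each draw differs.
samples = [gumbel_softmax(logits, 0.1, rng) for _ in range(n_samples)]

# Average size of the largest entry: near 1.0 means "almost one-hot".
avg_peak = sum(max(s) for s in samples) / n_samples

# How often token 0 is the argmax of a sample.
freq_token0 = sum(1 for s in samples if s.index(max(s)) == 0) / n_samples

print("plain softmax:       ", plain)
print("avg peak of samples: ", avg_peak)
print("argmax==0 frequency: ", freq_token0, "vs softmax p0:", plain[0])
```

Whether this difference matters for a given discriminator is exactly what the question asks; the code just shows what each kind of input looks like.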

Thanks.

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange