Question

BERT is an NLP model developed by Google. The original BERT implementation was built by the TensorFlow team, but there is also a version of BERT built with PyTorch. What is the main difference between these two models?


Solution

There are not just two, but many implementations of BERT, and most of them are essentially equivalent.

The implementations you mentioned are the original TensorFlow implementation released by Google Research and the PyTorch one provided by HuggingFace's transformers library.

These are the differences regarding several aspects:

  • In terms of results, there is no difference between using one or the other, as they both use the same checkpoints (same weights) and their outputs have been verified to be equal.
  • In terms of reusability, the HuggingFace library is probably more reusable, as it is designed specifically for that. It also gives you the freedom to choose either TensorFlow or PyTorch as the deep learning framework.
  • In terms of performance, they should be the same.
  • In terms of community support (e.g. asking questions on GitHub or Stack Overflow about them), the HuggingFace library is better suited, as it has a large user base.

Apart from BERT, the transformers library by HuggingFace has implementations of many other models: OpenAI GPT-2, RoBERTa, ELECTRA, and more.
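All of these models share the same loading interface through the library's Auto* classes, which resolve the right architecture from the checkpoint name. A minimal sketch, assuming transformers (with PyTorch) is installed and using "roberta-base" as an example checkpoint:

```python
# Sketch: the Auto* classes pick the correct model class for a checkpoint,
# so the same two lines work for BERT, RoBERTa, GPT-2, ELECTRA, etc.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")

# The class is resolved automatically from the checkpoint's config.
resolved_class = type(model).__name__
print(resolved_class)
```

Swapping in a different checkpoint name is all that is needed to switch architectures, which is a large part of why the library is considered reusable.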

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange