DistilBERT Text Classification using Keras
Oct 19, 2020
In the previous blog, I covered the text classification task using BERT. In this blog, let's cover DistilBERT, a smaller version of BERT developed and open-sourced by the team at HuggingFace. It's a lighter, faster version of BERT that roughly matches its performance.
DistilBERT has 40% fewer parameters than bert-base-uncased and runs 60% faster, while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.
DistilBERT Implementation in Keras
- First, the pretrained DistilBERT model was used to generate sentence embeddings (768 dimensions) for the dataset (see the first sketch after this list).
- Then a basic NN architecture (with Dense and Dropout layers) was used for the classification task and training (see the second sketch below).
- Finally, the model was evaluated on held-out data.
- The TensorBoard visualization is not clearly visible here; the output of Cell 23 shows the training curves.
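
Here is a minimal sketch of the embedding step, assuming the Hugging Face transformers library and a list of raw sentences called texts (both are assumptions, not the exact code from the notebook). It feeds each batch through DistilBERT and keeps the 768-dimensional hidden state of the first ([CLS]) token as the sentence embedding.

```python
import numpy as np
from transformers import DistilBertTokenizer, TFDistilBertModel

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
distilbert = TFDistilBertModel.from_pretrained("distilbert-base-uncased")

def embed_sentences(texts, max_length=128, batch_size=32):
    """Return one 768-dim vector per sentence (the [CLS] token's last hidden state)."""
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        enc = tokenizer(batch, padding=True, truncation=True,
                        max_length=max_length, return_tensors="tf")
        out = distilbert(enc)  # last_hidden_state has shape (batch, seq_len, 768)
        embeddings.append(out.last_hidden_state[:, 0, :].numpy())  # first token = [CLS]
    return np.vstack(embeddings)

# X = embed_sentences(texts)  # shape: (num_sentences, 768)
```

Note that DistilBERT itself is frozen here; only its outputs are used as features, which keeps the downstream training fast.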
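And here is a sketch of the classifier, training, and evaluation steps, assuming 768-dimensional embeddings X, binary labels y, and a TensorBoard callback for the visualization mentioned above (the layer sizes, dropout rate, and binary setup are illustrative assumptions; adjust the output layer and loss for more classes).

```python
import tensorflow as tf
from sklearn.model_selection import train_test_split

# X: (num_sentences, 768) embeddings, y: integer labels -- assumed to exist already
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Basic NN: Dense + Dropout layers on top of the frozen DistilBERT embeddings
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(768,)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# TensorBoard callback so the training curves can be inspected
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs")

model.fit(X_train, y_train, validation_split=0.1,
          epochs=10, batch_size=32, callbacks=[tensorboard_cb])

# Final evaluation on the held-out test set
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

After training, running `tensorboard --logdir logs` lets you browse the loss and accuracy curves interactively.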
Happy Learning :)