DeBERTa: Decoding-enhanced BERT with Disentangled Attention

This repository is a fork of the official implementation of DeBERTa: Decoding-enhanced BERT with Disentangled Attention and DeBERTa V3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing.

It extends DeBERTa with a hierarchical character-to-word architecture that combines the advantages of character and word tokenizers: word-level self-attention with an unlimited vocabulary, as described in From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding. To use the hierarchical encoder, pass --token_format char_to_word to the data preparation and training scripts (see the example after the list below).

Supported token formats

  • --token_format char: character-level tokens
  • --token_format subword: standard subword tokens, as in the original DeBERTa
  • --token_format char_to_word: the hierarchical character-to-word encoding described above
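
As a quick illustration, the commands below sketch how the flag might be passed through data preparation and pre-training. The script names, paths, and other arguments are hypothetical placeholders rather than part of this repository; only the --token_format flag and its three values come from this README.

```bash
# Hypothetical script names and arguments; only --token_format is documented above.

# Prepare pre-training data with hierarchical character-to-word tokenization.
python prepare_data.py --input corpus.txt --output data/ctw --token_format char_to_word

# Train with the matching token format so the model sees the same representation.
python train.py --data data/ctw --token_format char_to_word

# Character-level or subword baselines use the same flag with a different value.
python prepare_data.py --input corpus.txt --output data/char --token_format char
```

Whichever format is chosen, the same value should be passed to both the data preparation and training steps so the prepared data and the model configuration agree.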
