BERT model functionality
Adding BERT functionality, including a WordPiece subword tokenizer (as well as other common subword tokenizers), pretraining, fine-tuning and saving BERT models, and loading and using pretrained BERT models.
- Interface into the Hugging Face tokenizers library for training new tokenizers (see the tokenizer sketch below)
- Interface into BERT models in the transformers library. For each task class: initialization, saving and loading, training for the associated task, extracting outputs for the specific task, task prediction, and unit and integration tests as needed (see the feature-extraction sketch below)
  - Base class for extracting features (aka hidden states) from any BERT task model
  - BERT pretraining class
  - BERT masked language model class
  - BERT sequence classification class
  - BERT token classification class
  - Optional (for now):
    - BERT multiple choice class
    - BERT question answering class
    - BERT co-reference fine-tuning class
- Add custom dataset and collator functionality for training (see the dataset sketch below)
- Generalize the interface for any Transformer model
- Documentation
  - API docs
  - Use cases
  - Code snippet usage examples
  - Update readme
- Passing all tests
- Developer documentation
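
The sketches below are illustrative only, not a final API. First, training a new WordPiece tokenizer directly through the Hugging Face tokenizers library, which the planned interface would wrap; the corpus path, vocabulary size, and special-token set are assumptions.

```python
# Sketch: train a new WordPiece tokenizer with the huggingface tokenizers
# library. `corpus.txt` is a hypothetical raw-text corpus, one document
# or sentence per line.
from tokenizers import Tokenizer, models, normalizers, pre_tokenizers, trainers

# BERT-style pipeline: WordPiece model, BERT normalization and pre-tokenization.
tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))
tokenizer.normalizer = normalizers.BertNormalizer(lowercase=True)
tokenizer.pre_tokenizer = pre_tokenizers.BertPreTokenizer()

# Learn a vocabulary from the raw corpus.
trainer = trainers.WordPieceTrainer(
    vocab_size=30000,
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
)
tokenizer.train(files=["corpus.txt"], trainer=trainer)

# Serialize the full tokenizer (vocab + pipeline) to a single JSON file.
tokenizer.save("wordpiece.json")
```

The saved JSON can then be loaded back on the transformers side (e.g. via PreTrainedTokenizerFast), which is what would let trained tokenizers flow into the model classes below.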
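Second, a feature-extraction sketch showing what a task class would wrap: loading a pretrained model from the transformers library, getting a task prediction, and pulling out hidden states (the "features" the base class would expose). The checkpoint name and label count are placeholders, and a real task class would fine-tune the head before predicting.

```python
# Sketch: load a pretrained BERT sequence-classification model, run a
# prediction, and extract hidden-state features from the same forward pass.
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,                 # placeholder label count
    output_hidden_states=True,    # expose per-layer hidden states
)
model.eval()

inputs = tokenizer("A sentence to classify.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

prediction = outputs.logits.argmax(dim=-1)  # task prediction
features = outputs.hidden_states[-1]        # last-layer features, (1, seq_len, 768)
```

The same pattern generalizes to the other task classes (masked LM, token classification, etc.), since every transformers task model exposes logits plus optional hidden states, which is what makes a shared feature-extraction base class feasible.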
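Third, a dataset sketch for the custom dataset and collator item, assuming PyTorch and the masked-language-model collator shipped with transformers; the dataset class, sample sentences, and hyperparameters are illustrative.

```python
# Sketch: a custom dataset that tokenizes raw text lazily, batched through
# a collator that pads and applies BERT-style random masking for MLM
# pretraining.
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

class TextDataset(Dataset):
    """Tokenizes one raw line per sample; padding is left to the collator."""
    def __init__(self, lines, tokenizer, max_length=128):
        self.lines = lines
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, idx):
        return self.tokenizer(
            self.lines[idx], truncation=True, max_length=self.max_length
        )

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = TextDataset(["First sentence.", "Second sentence."], tokenizer)

# Pads each batch dynamically and replaces 15% of tokens with [MASK]-style
# corruption, producing input_ids, attention_mask, and labels.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
loader = DataLoader(dataset, batch_size=2, collate_fn=collator)
batch = next(iter(loader))
```

Keeping tokenization in the dataset and padding/masking in the collator is what would let the same dataset code serve pretraining, masked-LM fine-tuning, and the downstream task classes with only a collator swap.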