---
license: afl-3.0
datasets:
- WillHeld/hinglish_top
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: fill-mask
---

### SRDberta

This is a BERT model trained for masked language modeling on the Hinglish-TOP dataset (English queries and their Hindi-English code-switched counterparts).
|
|
### Dataset
Columns of the Hinglish-TOP [dataset](https://huggingface.co/datasets/WillHeld/hinglish_top):
- en_query
- cs_query
- en_parse
- cs_parse
- domain
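For orientation, here is a minimal sketch of what one row looks like. The field *values* below are invented for illustration; only the column names come from the dataset card, and the bracketed parse format is an assumption based on TOP-style semantic parses.

```python
# Each row pairs an English query ("en_") with its Hindi-English
# code-switched ("cs_") counterpart, plus a semantic parse for each.
# Loading the real data would be:
#   from datasets import load_dataset
#   ds = load_dataset("WillHeld/hinglish_top")

# Illustrative record -- values are made up for this sketch:
record = {
    "en_query": "set an alarm for 7 am",
    "cs_query": "7 baje ka alarm set karo",
    "en_parse": "[IN:CREATE_ALARM [SL:DATE_TIME for 7 am ] ]",
    "cs_parse": "[IN:CREATE_ALARM [SL:DATE_TIME 7 baje ] ]",
    "domain": "alarm",
}

# For masked-language-model training only the raw query text is needed;
# the parse columns are for the dataset's original semantic-parsing task.
mlm_text = record["cs_query"]
```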
|
|
### Training
| Epoch | Loss |
|:--:|:--:|
| 1 | 0.0485 |
| 2 | 0.00837 |
| 3 | 0.00812 |
| 4 | 0.0029 |
| 5 | 0.014 |
| 6 | 0.00748 |
| 7 | 0.0041 |
| 8 | 0.00543 |
| 9 | 0.00304 |
| 10 | 0.000574 |
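During each epoch the loss above is computed only at corrupted positions. As a refresher on the standard BERT masking rule (not this model's exact training code, which is not published here), a minimal self-contained sketch:

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, mlm_prob=0.15, seed=0):
    """BERT-style MLM corruption: pick ~15% of positions as prediction
    targets; of those, 80% become [MASK], 10% become a random token,
    10% stay unchanged. Returns (corrupted_ids, labels), where labels
    is -100 at non-target positions so the loss ignores them."""
    rng = random.Random(seed)
    corrupted = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mlm_prob:
            labels[i] = tok           # predict the original token here
            r = rng.random()
            if r < 0.8:
                corrupted[i] = mask_id
            elif r < 0.9:
                corrupted[i] = rng.randrange(vocab_size)
            # else: leave the token unchanged
    return corrupted, labels

# Toy example (IDs are arbitrary; 103 plays the role of [MASK]):
ids = [101, 2023, 2003, 1037, 7953, 102]
corrupted, labels = mask_tokens(ids, mask_id=103, vocab_size=30522)
```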
|
|
### Inference
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("SRDdev/SRDBerta")
model = AutoModelForMaskedLM.from_pretrained("SRDdev/SRDBerta")

fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
```
```python
mask = fill.tokenizer.mask_token
fill(f"Aap {mask} ho?")
```
|
|
### Citation
Author: [@SRDdev](https://huggingface.co/SRDdev)
```
Name: Shreyas Dixit
Framework: PyTorch
Year: Jan 2023
Pipeline: fill-mask
GitHub: https://github.com/SRDdev
LinkedIn: https://www.linkedin.com/in/srddev/
```