We have currently indexed 37 BERT-based models, 21 languages, and 33 tasks, for a total of 212 entries in this table.
We also show Multilingual BERT (mBERT) results where available (see our paper).
Curious which BERT model is the best for named entity recognition in Italian? Just type "Italian NER" in the search bar!
Do you want to add your model or update some rows? Click Here
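If you prefer to query the table offline instead of using the search bar, the minimal sketch below shows one way to do it with pandas. It assumes you have exported the table to a CSV file with the same column headers; the filename `bertlang_table.csv` is only a placeholder, not something we ship.

```python
import pandas as pd

# Placeholder filename: export this table to CSV first, keeping the column
# headers (Language, Model, NLP Task, Dataset, Measure, Performance, mBERT, ...).
df = pd.read_csv("bertlang_table.csv")

# The "Italian NER" query from above: Italian models evaluated on NER,
# sorted by their self-reported performance.
italian_ner = df[(df["Language"] == "Italian") & (df["NLP Task"] == "NER")]
print(italian_ner.sort_values("Performance", ascending=False)[["Model", "Dataset", "Measure", "Performance"]])
```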
Language | Model | NLP Task | Dataset | Dataset-Domain | Measure | Performance | mBERT | Difference with mBERT | Source |
---|---|---|---|---|---|---|---|---|---|
Estonian | EstBERT (128) | POS (coarse) | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 97.89 | 97.42 | 0.47 | |
Estonian | EstBERT (128) | XPOS | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 98.4 | 98.06 | 0.34 | |
Estonian | EstBERT (128) | Morph | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 96.93 | 96.24 | 0.69 | |
Estonian | EstBERT (128) | DP | UDv2.5 EDT | fiction, newspapers, scientific texts | LAS | 83.94 | N/A | N/A | |
Estonian | EstBERT (128) | DP | UDv2.5 EDT | fiction, newspapers, scientific texts | UAS | 86.7 | N/A | N/A | |
Estonian | EstBERT (128) | TC | Estonian Valency Corpus | news | Accuracy | 81.7 | 75.67 | 6.03 | |
Estonian | EstBERT (128) | SA | Estonian Valency Corpus | news | Accuracy | 74.36 | 70.23 | 4.13 | |
Estonian | EstBERT (128) | NER | EstNER | news | F1 | 90.11 | 86.51 | 3.6 | |
Estonian | EstBERT (512) | POS (coarse) | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 97.84 | 97.43 | 0.41 | |
Estonian | EstBERT (512) | XPOS | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 98.43 | 98.13 | 0.3 | |
Estonian | EstBERT (512) | Morph | UDv2.5 EDT | fiction, newspapers, scientific texts | Accuracy | 96.8 | 96.13 | 0.67 | |
Estonian | EstBERT (512) | TC | Estonian Valency Corpus | news | Accuracy | 80.96 | 74.94 | 6.02 | |
Estonian | EstBERT (512) | SA | Estonian Valency Corpus | news | Accuracy | 74.5 | 69.52 | 4.98 | |
Estonian | EstBERT (512) | NER | EstNER | news | F1 | 89.04 | 88.37 | 0.67 | |
Basque | BERTeus | NER | EIEC | news | F1 (test) | 87.06 | 81.52 | 5.54 | |
Basque | BERTeus | POS | UD-1.2 | news | Accuracy (test) | 97.76 | 96.37 | 1.39 | |
Basque | BERTeus | TC | BHTC | news | F1 (test) | 76.77 | 68.42 | 8.35 | |
Basque | BERTeus | SA | Basque Cultural Heritage Tweets Corpus | tweets | F1 (test) | 78.1 | 71.02 | 7.08 | |
Dutch | BERTje | NER | CoNLL-2002 | news | F1 (test) | 88.3 | 80.7 | 7.6 | |
Dutch | BERTje | NER | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | F1 (test) | 82.1 | 79.7 | 2.4 | |
Dutch | BERTje | POS | UD-LassySmall | wiki | Accuracy (test) | 96.3 | 92.5 | 3.8 | |
Dutch | BERTje | POS (C) | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | Accuracy (test) | 98.5 | 98.3 | 0.2 | |
Dutch | BERTje | POS (FG) | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | Accuracy (test) | 96.8 | 96.2 | 0.6 | |
Dutch | BERTje | SRL-PA | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | F1 (test) | 85.3 | 80.4 | 4.9 | |
Dutch | BERTje | SRL-M | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | F1 (test) | 67.2 | 62.4 | 4.8 | |
Dutch | BERTje | SRT | SoNaR-1 | news, social, legal, manual, wiki, web, press, proceedings, misc | macro F1 (test) | 64.3 | 57.3 | 7.0 | |
Dutch | BERTje | SA | 110k Dutch Book Reviews Dataset | book reviews | Accuracy (test) | 93.0 | 89.1 | 3.9 | |
Dutch | BERTje | DDD | Europarl | proceedings | Accuracy (test) | 98.27 | 98.28 | -0.01 | |
Dutch | RobBERT | SA | 110k Dutch Book Reviews Dataset | book reviews | Accuracy (test) | 94.42 | N/A | N/A | |
Dutch | RobBERT | DDD | Europarl | proceedings | Accuracy (test) | 98.41 | 98.28 | 0.13 | |
French | CamemBERT | POS | GSD | blogs, news, reviews, wiki | UPOS | 98.19 | 97.48 | 0.71 | |
French | CamemBERT | DP-UAS | GSD | blogs, news, reviews, wiki | Accuracy (test) | 94.82 | 92.72 | 2.1 | |
French | CamemBERT | DP-LAS | GSD | blogs, news, reviews, wiki | Accuracy (test) | 92.47 | 89.73 | 2.74 | |
French | CamemBERT | POS | Sequoia | politics, news, wiki, agency | UPOS | 99.21 | 98.41 | 0.8 | |
French | CamemBERT | DP-UAS | Sequoia | politics, news, wiki, agency | Accuracy (test) | 95.56 | 93.24 | 2.32 | |
French | CamemBERT | DP-LAS | Sequoia | politics, news, wiki, agency | Accuracy (test) | 94.39 | 91.24 | 3.15 | |
French | CamemBERT | POS | Spoken | transcription | UPOS | 96.68 | 96.02 | 0.66 | |
French | CamemBERT | DP-UAS | Spoken | transcription | Accuracy (test) | 86.05 | 84.65 | 1.4 | |
French | CamemBERT | DP-LAS | Spoken | transcription | Accuracy (test) | 80.07 | 78.63 | 1.44 | |
French | CamemBERT | POS | ParTUT | legal, license, misc | UPOS | 97.63 | 97.35 | 0.28 | |
French | CamemBERT | DP-UAS | ParTUT | legal, license, misc | Accuracy (test) | 95.21 | 94.18 | 1.03 | |
French | CamemBERT | DP-LAS | ParTUT | legal, license, misc | Accuracy (test) | 92.2 | 91.37 | 0.83 | |
French | CamemBERT | POS | French Treebank | news | F1 | 87.93 | 82.75 | 5.18 | |
French | CamemBERT | NLI | XNLI (French) | transcription, politics, news, literature, misc | Accuracy | 81.2 | 76.9 | 4.3 | |
French | CamemBERT | SA | CLS (French) | book reviews | Accuracy | 93.4 | 86.15 | 7.25 | |
French | CamemBERT | SA | CLS (French) | dvd reviews | Accuracy | 92.7 | 86.9 | 5.8 | |
French | CamemBERT | SA | CLS (French) | music reviews | Accuracy | 94.15 | 86.65 | 7.5 | |
French | CamemBERT | PI | PAWS-X (French) | wiki | Accuracy | 89.8 | 89.3 | 0.5 | |
French | CamemBERT | POS | French Treebank | news | F1 (test) | 88.39 | 87.52 | 0.87 | |
French | CamemBERT | VSD | FrenchSemeval | wiki | F1 | 51.9 | 44.93 | 6.97 | |
French | CamemBERT | NSD | Semeval 2013 Task 12 (French) | news | F1 | 51.52 | 53.48 | -1.96 | |
French | FlauBERT | SA | CLS (French) | book reviews | Accuracy | 93.4 | 86.15 | 7.25 | |
French | FlauBERT | SA | CLS (French) | dvd reviews | Accuracy | 92.5 | 86.9 | 5.6 | |
French | FlauBERT | SA | CLS (French) | music reviews | Accuracy | 94.3 | 86.65 | 7.65 | |
French | FlauBERT | PI | PAWS-X (French) | wiki | Accuracy | 89.9 | 89.3 | 0.6 | |
French | FlauBERT | NLI | XNLI | transcription, politics, news, literature, misc | Accuracy | 81.3 | 76.9 | 4.4 | |
French | FlauBERT | NER | French Treebank | news | F1 (test) | 89.05 | 87.52 | 1.53 | |
French | FlauBERT | VSD | FrenchSemeval | wiki | F1 | 47.4 | 44.93 | 2.47 | |
French | FlauBERT | NSD | Semeval 2013 Task 12 (French) | news | F1 | 50.78 | 53.48 | -2.7 | |
Finnish | FinBERT | POS | Turku Dependency Treebank | wiki, news, blog, speech, legislative, fiction | UPOS | 98.23 | 96.97 | 1.26 | |
Finnish | FinBERT | POS | FinnTreeBank | grammar, news, literature, politics, legislative, misc | UPOS | 98.39 | 95.87 | 2.52 | |
Finnish | FinBERT | POS | Parallel UD treebank | wiki, news | UPOS | 98.08 | 97.58 | 0.5 | |
Finnish | FinBERT | NER | FiNER | wiki, news | F1 | 92.4 | 90.29 | 2.11 | |
Finnish | FinBERT | DP | Turku Dependency Treebank | wiki, news, blog, speech, legislative, fiction | LAS (predicted segmentation) | 91.93 | 86.32 | 5.61 | |
Finnish | FinBERT | DP | FinnTreeBank | grammar, news, literature, politics, legislative, misc | LAS (predicted segmentation) | 92.16 | 85.52 | 6.64 | |
Finnish | FinBERT | DP | Parallel UD treebank | wiki, news | LAS (predicted segmentation) | 92.54 | 89.18 | 3.36 | |
Finnish | FinBERT | TC | Yle news | news | Accuracy (test size 10K) | 90.57 | 88.44 | 2.13 | |
Finnish | FinBERT | TC | Ylilauta online discussion | social | Accuracy (test size 10K) | 79.18 | 67.92 | 11.26 | |
Italian | Italian BERT (XXL) | NER | WikiNER | wiki | F1 (test) | 93.61 | 93.53 | 0.08 | |
Italian | Italian BERT (XXL) | NER | I-CAB 2009 | news | F1 | 88.13 | 85.18 | 2.95 | |
Italian | Italian BERT (XXL) | POS | PoSTWITA | | Accuracy | 93.75 | 91.54 | 2.21 | |
Italian | ALBERTO | SA | SENTIPOLC 2016 | | F1 (test) | 72.23 | N/A | N/A | |
Italian | ALBERTO | SC | SENTIPOLC 2016 | | F1 (test) | 79.06 | N/A | N/A | |
Italian | ALBERTO | ID | SENTIPOLC 2016 | | F1 (test) | 60.9 | N/A | N/A | |
Italian | Gilberto | POS | ParTUT | legal, license, misc | UPOS | 98.8 | 98.0 | 0.8 | |
Italian | Gilberto | POS | ISDT | legal, news, wiki, misc | UPOS | 98.6 | 98.5 | 0.1 | |
Italian | Gilberto | NER | WikiNER | wiki | F1 | 92.7 | 92.2 | 0.5 | |
Italian | Umberto | POS | ParTUT | legal, license, misc | Accuracy | 98.9 | N/A | N/A | |
Italian | Umberto | POS | ISDT | legal, news, wiki, misc | Accuracy | 98.98 | N/A | N/A | |
Italian | Umberto | NER | WikiNER | wiki | F1 | 92.53 | N/A | N/A | |
Italian | Umberto | NER | I-CAB 2007 | news | F1 | 92.53 | N/A | N/A | |
German | deepset-GermanBERT | IOL | germEval18Fine | | F1 | 74.7 | 71.0 | 3.7 | |
German | deepset-GermanBERT | IOL | germEval18coarse | | F1 | 48.8 | 44.1 | 4.7 | |
German | deepset-GermanBERT | NER | germEval14 | wiki, news | F1 | 84.0 | 83.4 | 0.6 | |
German | deepset-GermanBERT | NER | CoNLL-2003 | news | F1 | 80.4 | 79.2 | 1.2 | |
German | deepset-GermanBERT | TC | 10kGNAD | news | Accuracy | 90.5 | 88.8 | 1.7 | |
German | deepset-GermanBERT | NER | CoNLL-2003 | news | F1 (test) | 83.7 | 82.55 | 1.15 | |
German | deepset-GermanBERT | NER | germEval14 | wiki, news | F1 (test) | 86.61 | 86.26 | 0.35 | |
German | deepset-GermanBERT | POS | Parallel UD treebank | wiki, news | Accuracy (test) | 98.56 | 98.58 | -0.02 | |
German | German BERT | POS | Parallel UD treebank | wiki, news | Accuracy (test) | 98.57 | 98.58 | -0.01 | |
German | German BERT | NER | germEval14 | wiki, news | F1 (test) | 86.89 | 86.26 | 0.63 | |
German | German BERT | NER | CoNLL-2003 | news | F1 (test) | 84.52 | 82.55 | 1.97 | |
German | German Europeana BERT | NER | LFT | news | F1 | 80.55 | 77.26 | 3.29 | |
German | German Europeana BERT | NER | ONB | news | F1 | 85.5 | 83.44 | 2.06 | |
Spanish | BETO | POS | Turku Dependency Treebank | wiki, news, blog, speech, legislative, fiction | UPOS | 98.97 | 97.1 | 1.87 | |
Spanish | BETO | NER | CoNLL 2000, 2002, 2007 | news | F1 | 88.43 | 87.38 | 1.05 | |
Spanish | BETO | TC | MLDoc | news | Accuracy | 95.6 | 95.7 | -0.1 | |
Spanish | BETO | PI | PAWS-X | wiki | Accuracy | 89.05 | 90.7 | -1.65 | |
Spanish | BETO | NLI | XNLI | transcription, politics, news, literature, misc | Accuracy | 82.01 | 78.5 | 3.51 | |
Spanish | BETO | SA | TASS 2020 | | F1 (test) | 66.5 | N/A | N/A | |
Spanish | BETO | Emotion Analysis | TASS 2020 | | F1 (test) | 52.1 | N/A | N/A | |
Spanish | BETO | Hate Speech Detection | SemEval 2019 Task 5: HatEval | | F1 (test) | 76.8 | N/A | N/A | |
Spanish | BETO | ID | SemEval 2019 Task 5: HatEval | social media | F1 (test) | 70.6 | N/A | N/A | |
Russian | RuBERT | PI | Paraphraser | news | Accuracy | 84.99 | 81.66 | 3.33 | |
Russian | RuBERT | SA | RuSentiment | social | F1 | 72.63 | 70.82 | 1.81 | |
Russian | RuBERT | QA | SDSJ Task B | wiki | F1 (dev) | 84.6 | 83.39 | 1.21 | |
Slavic | SlavicBERT | NER | BSNLP-2019 dataset | web | Recall | 91.8 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | MRC | CMRC 2018 | wiki | F1 (test) | 90.0 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | MRC | DRCD | wiki | F1 (test) | 94.1 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | MRC | CJRC | law | F1 (test) | 81.0 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | NLI | XNLI | transcription, politics, news, literature, misc | Accuracy (test) | 80.6 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | SA | ChnSentiCorp | social, misc | Accuracy (test) | 94.9 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | SPM | LCQMC | social | Accuracy (test) | 86.8 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | SPM | BQ Corpus | log | Accuracy (test) | 84.9 | N/A | N/A | |
Chinese | Ch-RoBERTa-wwm-ext-large | TC | THUCNews | news | Accuracy (test) | 97.6 | N/A | N/A | |
Japanese | BERT Japanese | TC | Livedoor | news | F1 (macro) | 97.0 | N/A | N/A | |
Korean | KoBERT | SA | Naver | movie reviews | Accuracy | 90.1 | 87.5 | 2.6 | |
Thai | BERT-th | NLI | XNLI | transcription, politics, news, literature, misc | Accuracy | 68.9 | 66.1 | 2.8 | |
Thai | BERT-th | SA | Wongnai Review Dataset | restaurant reviews | Accuracy | 57.06 | N/A | N/A | |
Mongolian | Mongolian BERT | NER | Mongolian NER | news | F1 (test) | 81.46 | N/A | N/A | |
Turkish | BERTurk | POS | IMST dataset | misc | Accuracy | 96.93 | 95.38 | 1.55 | |
Turkish | BERTurk | NER | | | F1 | 94.85 | 93.61 | 1.24 | |
Arabic | Arabert v1 | SA | AJGT | | Accuracy | 93.8 | 83.6 | 10.2 | |
Arabic | Arabert v1 | SA | HARD | hotel reviews | Accuracy | 96.1 | 95.7 | 0.4 | |
Arabic | Arabert v1 | SA | ASTD | | Accuracy | 92.6 | 80.1 | 12.5 | |
Arabic | Arabert v1 | SA | ArSenTD-Lev | | Accuracy | 59.4 | 51.0 | 8.4 | |
Arabic | Arabert v1 | SA | LABR | book reviews | Accuracy | 86.7 | 83.0 | 3.7 | |
Arabic | Arabert v1 | NER | ANER-corp | news | F1 (macro) | 81.9 | 78.4 | 3.5 | |
Arabic | Arabert v1 | QA | ARCD | wiki | F1 (macro) | 62.7 | 61.3 | 1.4 | |
Portuguese | BERT-Large Portuguese | NER | Harem | web, politics, fiction, email, transcriptions, news, misc | F1 (5 classes) | 83.3 | 79.44 | 3.86 | |
English | BERT-Base | NLI | MNLI (matched) | misc | Accuracy (dev) | 84.6 | N/A | N/A | |
English | BERT-Base | PI | Quora Question Pairs | social | F1 (dev) | 71.2 | N/A | N/A | |
English | BERT-Base | NLI | QNLI | wiki | Accuracy (dev) | 90.5 | N/A | N/A | |
English | BERT-Base | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 93.5 | N/A | N/A | |
English | BERT-Base | LA | CoLA | misc | Matthews Correlation (dev) | 52.1 | N/A | N/A | |
English | BERT-Base | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 85.8 | N/A | N/A | |
English | BERT-Base | PI | MRPC | news | F1 (dev) | 88.9 | N/A | N/A | |
English | BERT-Base | TER | RTE | news, wiki | Accuracy (dev) | 66.4 | N/A | N/A | |
English | BERT-Base | QA | SQuAD v1.1 | wiki | F1 (dev) | 88.5 | N/A | N/A | |
English | BERT-Base | CI | SWAG | video captions | Accuracy (dev) | 81.6 | N/A | N/A | |
English | BERT-Base | NLI | WNLI | fiction | Accuracy | 45.1 | N/A | N/A | |
English | BERT-Base | SA | IMDb | movie reviews | Accuracy | 93.46 | N/A | N/A | |
English | BERT-Large | NLI | MNLI (matched) | misc | Accuracy (dev) | 86.7 | N/A | N/A | |
English | BERT-Large | PI | Quora Question Pairs | social | F1 (dev) | 72.1 | N/A | N/A | |
English | BERT-Large | NLI | QNLI | wiki | Accuracy (dev) | 92.7 | N/A | N/A | |
English | BERT-Large | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 94.9 | N/A | N/A | |
English | BERT-Large | LA | CoLA | misc | Matthews Correlation (dev) | 60.5 | N/A | N/A | |
English | BERT-Large | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 86.5 | N/A | N/A | |
English | BERT-Large | PI | MRPC | news | F1 (dev) | 89.3 | N/A | N/A | |
English | BERT-Large | TER | RTE | news, wiki | Accuracy (dev) | 70.1 | N/A | N/A | |
English | BERT-Large | QA | SQuAD v1.1 | wiki | F1 (dev) | 90.9 | N/A | N/A | |
English | BERT-Large | QA | SQuAD v2.0 | wiki | F1 (dev) | 81.9 | N/A | N/A | |
English | BERT-Large | CI | SWAG | video captions | Accuracy (dev) | 86.6 | N/A | N/A | |
English | BERT-Large | RC | RACE | examinations | Accuracy | 72.0 | N/A | N/A | |
English | RoBERTa | NLI | MNLI (matched) | misc | Accuracy (dev) | 90.2 | N/A | N/A | |
English | RoBERTa | PI | QQP | social | F1 (dev) | 92.2 | N/A | N/A | |
English | RoBERTa | NLI | QNLI | wiki | Accuracy (dev) | 94.7 | N/A | N/A | |
English | RoBERTa | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 96.4 | N/A | N/A | |
English | RoBERTa | LA | CoLA | misc | Matthews Correlation (dev) | 68.0 | N/A | N/A | |
English | RoBERTa | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 92.4 | N/A | N/A | |
English | RoBERTa | PI | MRPC | news | F1 (dev) | 90.9 | N/A | N/A | |
English | RoBERTa | TER | RTE | news, wiki | Accuracy (dev) | 86.6 | N/A | N/A | |
English | RoBERTa | QA | SQuAD v1.1 | wiki | F1 (dev) | 94.6 | N/A | N/A | |
English | RoBERTa | QA | SQuAD v2.0 | wiki | F1 (dev) | 89.4 | N/A | N/A | |
English | RoBERTa | RC | RACE | examinations | Accuracy | 83.2 | N/A | N/A | |
English | ALBERT (1M) | NLI | MNLI (matched) | misc | Accuracy (dev) | 90.4 | N/A | N/A | |
English | ALBERT (1M) | PI | QQP | social | F1 (dev) | 92.0 | N/A | N/A | |
English | ALBERT (1M) | NLI | QNLI | wiki | Accuracy (dev) | 95.2 | N/A | N/A | |
English | ALBERT (1M) | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 96.8 | N/A | N/A | |
English | ALBERT (1M) | LA | CoLA | misc | Matthews Correlation (dev) | 68.7 | N/A | N/A | |
English | ALBERT (1M) | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 92.7 | N/A | N/A | |
English | ALBERT (1M) | PI | MRPC | news | F1 (dev) | 90.2 | N/A | N/A | |
English | ALBERT (1M) | TER | RTE | news, wiki | Accuracy (dev) | 88.1 | N/A | N/A | |
English | ALBERT (1M) | QA | SQuAD v1.1 | wiki | F1 (dev) | 94.8 | N/A | N/A | |
English | ALBERT (1M) | QA | SQuAD v2.0 | wiki | F1 (dev) | 89.9 | N/A | N/A | |
English | ALBERT (1M) | RC | RACE | examinations | Accuracy | 86.0 | N/A | N/A | |
English | ALBERT (1.5M) | NLI | MNLI (matched) | misc | Accuracy (dev) | 90.8 | N/A | N/A | |
English | ALBERT (1.5M) | PI | QQP | social | F1 (dev) | 92.2 | N/A | N/A | |
English | ALBERT (1.5M) | NLI | QNLI | wiki | Accuracy (dev) | 95.3 | N/A | N/A | |
English | ALBERT (1.5M) | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 96.9 | N/A | N/A | |
English | ALBERT (1.5M) | LA | CoLA | misc | Matthews Correlation (dev) | 71.4 | N/A | N/A | |
English | ALBERT (1.5M) | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 93.0 | N/A | N/A | |
English | ALBERT (1.5M) | PI | MRPC | news | F1 (dev) | 90.9 | N/A | N/A | |
English | ALBERT (1.5M) | TER | RTE | news, wiki | Accuracy (dev) | 89.2 | N/A | N/A | |
English | ALBERT (1.5M) | QA | SQuAD v1.1 | wiki | F1 (dev) | 94.8 | N/A | N/A | |
English | ALBERT (1.5M) | QA | SQuAD v2.0 | wiki | F1 (dev) | 90.2 | N/A | N/A | |
English | ALBERT (1.5M) | RC | RACE | examinations | Accuracy | 86.5 | N/A | N/A | |
English | DistilBERT | NLI | MNLI (matched) | misc | Accuracy (dev) | 79.0 | N/A | N/A | |
English | DistilBERT | PI | QQP | social | F1 (dev) | 84.9 | N/A | N/A | |
English | DistilBERT | NLI | QNLI | wiki | Accuracy (dev) | 85.3 | N/A | N/A | |
English | DistilBERT | SA | Stanford Sentiment Treebank | movie reviews | Accuracy (dev) | 90.7 | N/A | N/A | |
English | DistilBERT | LA | CoLA | misc | Matthews Correlation (dev) | 43.6 | N/A | N/A | |
English | DistilBERT | STS | STS-B | misc | Pearson-Spearman Correlation (dev) | 81.2 | N/A | N/A | |
English | DistilBERT | PI | MRPC | news | F1 (dev) | 87.5 | N/A | N/A | |
English | DistilBERT | TER | RTE | news, wiki | Accuracy (dev) | 59.9 | N/A | N/A | |
English | DistilBERT | QA | SQuAD v1.1 | wiki | F1 (dev) | 78.7 | N/A | N/A | |
English | DistilBERT | NLI | WNLI | fiction | Accuracy | 56.3 | N/A | N/A | |
English | DistilBERT | SA | IMDb | movie reviews | Accuracy | 92.82 | N/A | N/A | |
Yorùbá | Fine-Tuned mBERT | NER | Global Voices Yorùbá | news | F1 | 52.5 | 0.0 | 52.5 | |
Filipino | BERT-Tagalog | SA | - | electronic products reviews | Accuracy | 88.17 | N/A | N/A | |
Spanish | RoBERTuito | SA | TASS 2020 | | F1 (test) | 70.07 | N/A | N/A | |
Spanish | RoBERTuito | Emotion Analysis | TASS 2020 | | F1 (test) | 55.1 | N/A | N/A | |
Spanish | RoBERTuito | Hate Speech Detection | SemEval 2019 Task 5: HatEval | | F1 (test) | 80.1 | N/A | N/A | |
Spanish | RoBERTuito | ID | SemEval 2019 Task 5: HatEval | social media | F1 (test) | 74.0 | N/A | N/A | |
Spanish | Spanish RoBERTa | SA | TASS 2020 | | F1 (test) | 66.9 | N/A | N/A | |
Spanish | Spanish RoBERTa | Emotion Analysis | TASS 2020 | | F1 (test) | 53.3 | N/A | N/A | |
Spanish | Spanish RoBERTa | Hate Speech Detection | SemEval 2019 Task 5: HatEval | | F1 (test) | 76.6 | N/A | N/A | |
Spanish | Spanish RoBERTa | ID | SemEval 2019 Task 5: HatEval | social media | F1 (test) | 72.3 | N/A | N/A | |
Spanish | BERTin | SA | TASS 2020 | | F1 (test) | 66.5 | N/A | N/A | |
Spanish | BERTin | Emotion Analysis | TASS 2020 | | F1 (test) | 51.8 | N/A | N/A | |
Spanish | BERTin | Hate Speech Detection | SemEval 2019 Task 5: HatEval | | F1 (test) | 76.7 | N/A | N/A | |
Spanish | BERTin | ID | SemEval 2019 Task 5: HatEval | social media | F1 (test) | 71.6 | N/A | N/A | |
What the [MASK]? Making Sense of Language-Specific BERT Models
@article{nozza2020what,
  title={What the [MASK]? Making Sense of Language-Specific BERT Models},
  author={Nozza, Debora and Bianchi, Federico and Hovy, Dirk},
  journal={arXiv preprint arXiv:2003.02912},
  year={2020}
}
This is a collaborative resource to help researchers understand and find the best BERT model for a given dataset, task, and language. The numbers here rely on self-reported performance, so we can give no guarantee of their accuracy; in the future, we hope to verify each model independently.
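Once you have picked a model from the table, most of them can be loaded from the Hugging Face Hub with the transformers library and then fine-tuned on your task as usual. The sketch below is only a minimal example; it uses the mBERT baseline identifier, so substitute the Hub id published by the authors of the model you chose, and it assumes transformers and PyTorch are installed.

```python
from transformers import AutoTokenizer, AutoModel

# mBERT baseline; replace with the Hub id of the language-specific model you picked
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a sample sentence and inspect the contextual embeddings
inputs = tokenizer("Questo è un esempio.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```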
Do you want to add your model? Click Here