ronak committed on
Commit 1b0ca67 · verified · 1 Parent(s): 68e1b61

Create README.md

Files changed (1)
  1. README.md +30 -0
README.md ADDED
@@ -0,0 +1,30 @@
+ ---
+ language:
+ - en
+ base_model:
+ - google/electra-base-discriminator
+ pipeline_tag: text-ranking
+ ---
+
+ monoELECTRA is a cross-encoder reranker built on `google/electra-base-discriminator` and trained on the MS MARCO passage data for 300K steps with a batch size of 16.
+ It is trained with hard negatives from strong first-stage retrievers and the Localized Contrastive Estimation (LCE) loss with large group sizes (up to 31 negatives per positive).
+ This setup consistently outperforms standard monoBERT and baselines trained with hinge or cross-entropy (CE) loss, especially in the top-$k$ pool where near-duplicate passages matter.
+ If you want a compact, supervised reranker tuned to squeeze every last bit of signal from hard negatives, use this one.
+
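The LCE loss named in the description can be illustrated with a minimal sketch. It assumes the usual formulation: for each query, the positive passage and its hard negatives form one group, and the loss is the negative log-softmax of the positive's score within that group. The function names `lce_loss` and `batch_lce_loss` and the scores below are hypothetical, chosen for illustration only; this is not the authors' training code.

```python
import math


def lce_loss(pos_score, neg_scores):
    """LCE loss for one query group: -log softmax of the positive's score.

    pos_score:  reranker score of the relevant passage (assumed input)
    neg_scores: scores of the hard negatives in the same group
    """
    scores = [pos_score] + list(neg_scores)
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - pos_score


def batch_lce_loss(groups):
    """Mean LCE loss over a batch of (pos_score, neg_scores) groups."""
    return sum(lce_loss(p, n) for p, n in groups) / len(groups)


# Illustrative scores only, not model outputs.
# A group of 1 positive + 31 negatives with identical scores gives log(32).
uniform = lce_loss(0.0, [0.0] * 31)
# A positive scored well above its negatives gives a loss close to zero.
confident = lce_loss(5.0, [-5.0] * 31)
```

With 31 negatives per positive, each group has 32 candidates, so an untrained model that scores everything equally starts at a loss of `math.log(32)`.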
+ If you use the monoELECTRA model, please cite the following paper:
+
+ [Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking](https://arxiv.org/abs/2312.02724)
+ <!-- {% raw %} -->
+ ```
+ @inproceedings{squeezemonoelectra2022,
+ author = {Pradeep, Ronak and Liu, Yuqi and Zhang, Xinyu and Li, Yilin and Yates, Andrew and Lin, Jimmy},
+ title = {Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking},
+ year = {2022},
+ publisher = {Springer-Verlag},
+ address = {Berlin, Heidelberg},
+ booktitle = {Advances in Information Retrieval: 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part I},
+ pages = {655–670},
+ numpages = {16},
+ location = {Stavanger, Norway}
+ }
+ ```