Spaces: HiTZ/Critical_Questions_Leaderboard

Blanca committed on May 12

Commit 38ffc99 (verified) · 1 parent: 6bf01ee

Upload content.py


Files changed (1)

content.py +70 -0

content.py ADDED

	@@ -0,0 +1,70 @@

+TITLE = """<h1 align="center" id="space-title">Critical Questions Leaderboard</h1>"""
+INTRODUCTION_TEXT = """
+The Critical Questions Leaderboard is a benchmark that aims to evaluate READ. (See our [paper](https://aclanthology.org/2024.conll-1.9/) for more details.)
+## Data
+The Critical Questions Dataset consists of more than 220 interventions, each associated with potentially critical questions.
+The data can be found in [this dataset](Blanca/critical_question_generation). The test set is contained in `metadata.jsonl`. Some questions come with an additional file, which can be found in the same folder and whose id is given in the field `file_name`.
+## Leaderboard
+Submissions made by our team are labelled "CQ authors". While our paper reports average scores over different runs when possible, the leaderboard reports only the best run.
+See below for submissions.
+"""
+SUBMISSION_TEXT = """
+## Submissions
+Results can be submitted for the test set only. Scores are expressed as the percentage of correct answers for a given split.
+Each question calls for an answer that is either a string (one or a few words), a number, or a comma-separated list of strings or floats, unless specified otherwise. There is only one correct answer.
+Hence, evaluation is done via quasi-exact match between a model’s answer and the ground truth (up to some normalization that is tied to the “type” of the ground truth).
+We expect submissions to be json-line files with the following format. The first two fields are mandatory; `reasoning_trace` is optional:
+```json
+{
+    "CLINTON_1_1": {
+        "intervention_id": "CLINTON_1_1",
+        "intervention": "CLINTON: \"The central question in this election is really what kind of country we want to be and what kind of future we 'll build together\nToday is my granddaughter 's second birthday\nI think about this a lot\nwe have to build an economy that works for everyone , not just those at the top\nwe need new jobs , good jobs , with rising incomes\nI want us to invest in you\nI want us to invest in your future\njobs in infrastructure , in advanced manufacturing , innovation and technology , clean , renewable energy , and small business\nmost of the new jobs will come from small business\nWe also have to make the economy fairer\nThat starts with raising the national minimum wage and also guarantee , finally , equal pay for women 's work\nI also want to see more companies do profit-sharing\"",
+        "dataset": "US2016",
+        "cqs": [
+            {
+                "id": 0,
+                "cq": "What does the author mean by \"build an economy that works for everyone, not just those at the top\"?"
+            },
+            {
+                "id": 1,
+                "cq": "What is the author's definition of \"new jobs\" and \"good jobs\"?"
+            },
+            {
+                "id": 2,
+                "cq": "How will the author's plan to \"make the economy fairer\" benefit the working class?"
+            }
+        ]
+    },
+...
+}
+```
+Our scoring function can be found [here](https://huggingface.co/spaces/gaia-benchmark/leaderboard/blob/main/scorer.py).
+"""
+CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
+CITATION_BUTTON_TEXT = r"""BIBTEX HERE
+}"""
+def format_error(msg):
+    return f"<p style='color: red; font-size: 20px; text-align: center;'>{msg}</p>"
+def format_warning(msg):
+    return f"<p style='color: orange; font-size: 20px; text-align: center;'>{msg}</p>"
+def format_log(msg):
+    return f"<p style='color: green; font-size: 20px; text-align: center;'>{msg}</p>"
+def model_hyperlink(link, model_name):
+    return f'<a target="_blank" href="{link}" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">{model_name}</a>'
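
The submission format shown in `SUBMISSION_TEXT` can be checked before upload with a small validator. The following is only a minimal sketch: the function name `check_submission` and the constants `ENTRY_KEYS` and `CQ_KEYS` are hypothetical, the key names are copied from the example entry in the diff, and the assumption that all four entry-level keys are required is ours, since the source does not say which fields the leaderboard actually enforces. The intervention text in the sample is shortened for brevity.

```python
# Assumption: key names are copied from the example entry in SUBMISSION_TEXT.
# Which of them the leaderboard really requires is not stated in the source.
ENTRY_KEYS = {"intervention_id", "intervention", "dataset", "cqs"}
CQ_KEYS = {"id", "cq"}


def check_submission(submission):
    """Return a list of human-readable problems found in a submission dict."""
    if not isinstance(submission, dict):
        return ["top level must be a JSON object keyed by intervention id"]
    problems = []
    for key, entry in submission.items():
        if not isinstance(entry, dict):
            problems.append(f"{key}: entry must be an object")
            continue
        missing = ENTRY_KEYS - entry.keys()
        if missing:
            problems.append(f"{key}: missing fields {sorted(missing)}")
        # The example uses the intervention id both as the key and as a field.
        if entry.get("intervention_id", key) != key:
            problems.append(f"{key}: intervention_id does not match its key")
        cqs = entry.get("cqs", [])
        if not isinstance(cqs, list):
            problems.append(f"{key}: cqs must be a list")
            continue
        for i, cq in enumerate(cqs):
            if not isinstance(cq, dict) or CQ_KEYS - cq.keys():
                problems.append(f"{key}: cqs[{i}] needs 'id' and 'cq'")
    return problems


# Shortened sample entry modelled on the CLINTON_1_1 example in the diff.
example = {
    "CLINTON_1_1": {
        "intervention_id": "CLINTON_1_1",
        "intervention": "CLINTON: (shortened)",
        "dataset": "US2016",
        "cqs": [
            {"id": 0, "cq": 'What is the author\'s definition of "new jobs" and "good jobs"?'},
        ],
    }
}

print(check_submission(example))  # prints []
```

In practice the dict would come from `json.load` on the submission file; returning a list of problems rather than raising on the first one lets a caller pass the whole report to something like `format_error`.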