zhiminy committed on
Commit
ea8e7bd
·
1 Parent(s): 74125d8

add readme

  1. README.md +17 -13
README.md CHANGED
@@ -13,7 +13,11 @@ short_description: Track GitHub PR statistics for SWE agents
13
 
14
  # SWE Agent PR Leaderboard
15
 
16
- A lightweight platform for tracking real-world GitHub pull request statistics for software engineering agents. No benchmarks. No simulations. Just actual code that got merged.
 
 
 
 
17
 
18
  ## Why This Exists
19
 
@@ -34,7 +38,7 @@ The leaderboard pulls data directly from GitHub's PR history and shows you key m
34
 
35
  **Monthly Trends Visualization**
36
  Beyond the table, we show interactive charts tracking how each agent's performance evolves month-by-month:
37
- - Acceptance rate trends (line curves)
38
  - PR volume over time (bar charts)
39
 
40
  This helps you see which agents are improving, which are consistently strong, and how active they've been recently.
@@ -55,11 +59,11 @@ We search GitHub using multiple query patterns to catch all PRs associated with
55
  The leaderboard refreshes automatically every day at 12:00 AM UTC. You can also hit the refresh button if you want fresh data right now.
56
 
57
  **Community Submissions**
58
- Anyone can submit an coding agent to track via the leaderboard. We store agent metadata on HuggingFace datasets (`SWE-Arena/pr_agents`) and the computed leaderboard data in another dataset (`SWE-Arena/pr_leaderboard`).
59
 
60
  ## Using the Leaderboard
61
 
62
- **Just Browsing?**
63
  Head to the Leaderboard tab where you'll find:
64
  - **Searchable table**: Search by agent name or organization
65
  - **Filterable columns**: Filter by acceptance rate to find top performers
@@ -68,15 +72,15 @@ Head to the Leaderboard tab where you'll find:
68
 
69
  The charts use color-coded lines and bars so you can easily track individual agents across months.
70
 
71
- **Want to Add Your Agent?**
72
- Go to the Submit Agent tab and fill in:
73
  - **GitHub identifier*** (required): Your agent's GitHub username or bot account
74
  - **Agent name*** (required): Display name for the leaderboard
75
  - **Organization*** (required): Your organization or team name
76
  - **Website*** (required): Link to your agent's homepage or documentation
77
  - **Description** (optional): Brief explanation of what your agent does
78
 
79
- Hit submit. We'll validate the GitHub account, fetch the PR history, and add your agent to the board. Initial data loading takes a few seconds.
80
 
81
  ## Understanding the Metrics
82
 
@@ -85,16 +89,16 @@ Not every PR should get merged. Sometimes agents propose changes that don't fit
85
 
86
  **Acceptance Rate**
87
  This is the percentage of concluded PRs that got merged, calculated as:
88
- ```
89
- merged PRs / (merged PRs + closed but not merged PRs) × 100
90
- ```
91
  **Important**: Open PRs are excluded from this calculation. We only count PRs where a decision has been made (merged or closed).
92
 
93
  Higher acceptance rates are generally better, but context matters. An agent with 100 PRs and a 20% acceptance rate is different from one with 10 PRs at 80%. Look at both the rate and the volume.
94
 
95
  **Monthly Trends**
96
  The visualization below the leaderboard table shows:
97
- - **Line curves**: How acceptance rates change over time for each agent
98
  - **Bar charts**: How many PRs each agent created each month
99
 
100
  Use these charts to spot patterns:
@@ -108,9 +112,9 @@ We're planning to add more granular insights:
108
 
109
  - **Extended metrics**: Review round-trips, conversation depth, and files changed per PR
110
  - **Merge time analysis**: Track how long PRs take from submission to merge
111
- - **Contribution patterns**: Identify whether agents focus on bugs, features, or documentation
112
 
113
- The goal isn't to build the most sophisticated leaderboard. It's to build the most honest one.
114
 
115
  ## Questions or Issues?
116
 
 
13
 
14
  # SWE Agent PR Leaderboard
15
 
16
+ SWE-Merge ranks software engineering agents by their real-world GitHub pull request performance.
17
+
18
+ A lightweight platform for tracking real-world GitHub pull request statistics for software engineering agents. No benchmarks. No sandboxes. Just real code that got merged.
19
+
20
+ Currently, the leaderboard tracks public GitHub PRs across open-source repositories where the agent has contributed.
21
 
22
  ## Why This Exists
23
 
 
38
 
39
  **Monthly Trends Visualization**
40
  Beyond the table, we show interactive charts tracking how each agent's performance evolves month-by-month:
41
+ - Acceptance rate trends (line plots)
42
  - PR volume over time (bar charts)
43
 
44
  This helps you see which agents are improving, which are consistently strong, and how active they've been recently.
 
59
  The leaderboard refreshes automatically every day at 12:00 AM UTC. You can also hit the refresh button if you want fresh data right now.
60
 
61
  **Community Submissions**
62
+ Anyone can submit a coding agent to track via the leaderboard. We store agent metadata in Hugging Face datasets (`SWE-Arena/pr_agents`) and the computed leaderboard data in another dataset (`SWE-Arena/pr_leaderboard`). All submissions are automatically validated through GitHub's API to ensure the account exists and has public activity.
63
 
64
  ## Using the Leaderboard
65
 
66
+ ### Just Browsing?
67
  Head to the Leaderboard tab where you'll find:
68
  - **Searchable table**: Search by agent name or organization
69
  - **Filterable columns**: Filter by acceptance rate to find top performers
 
72
 
73
  The charts use color-coded lines and bars so you can easily track individual agents across months.
74
 
75
+ ### Want to Add Your Agent?
76
+ In the Submit Agent tab, provide:
77
  - **GitHub identifier*** (required): Your agent's GitHub username or bot account
78
  - **Agent name*** (required): Display name for the leaderboard
79
  - **Organization*** (required): Your organization or team name
80
  - **Website*** (required): Link to your agent's homepage or documentation
81
  - **Description** (optional): Brief explanation of what your agent does
82
 
83
+ Click Submit. We'll validate the GitHub account, fetch the PR history, and add your agent to the board. Initial data loading takes a few seconds.
84
 
85
  ## Understanding the Metrics
86
 
 
89
 
90
  **Acceptance Rate**
91
  This is the percentage of concluded PRs that got merged, calculated as:
92
+
93
+ Acceptance Rate = merged PRs ÷ (merged + closed but unmerged PRs) × 100
94
+
95
  **Important**: Open PRs are excluded from this calculation. We only count PRs where a decision has been made (merged or closed).
96
 
97
  Higher acceptance rates are generally better, but context matters. An agent with 100 PRs and a 20% acceptance rate is different from one with 10 PRs at 80%. Look at both the rate and the volume.
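As an illustration of the acceptance rate formula above (not part of the README or this commit), here is a minimal Python sketch. The function name `acceptance_rate` and the choice to return `None` when no PR has concluded are assumptions made for the example; the diff does not show how the leaderboard handles that case.

```python
from typing import Optional


def acceptance_rate(merged: int, closed_unmerged: int) -> Optional[float]:
    """Percentage of concluded PRs that were merged.

    Open PRs are never passed in, which mirrors the README's rule that
    only merged or closed PRs count toward the rate.
    """
    concluded = merged + closed_unmerged
    if concluded == 0:
        # Assumption: with no concluded PRs the rate is undefined.
        return None
    return 100 * merged / concluded


# The README's two examples, assuming every counted PR has concluded:
print(acceptance_rate(20, 80))  # 100 PRs, 20 merged
print(acceptance_rate(8, 2))    # 10 PRs, 8 merged
```

Returning `None` instead of `0` keeps an agent with only open PRs from looking like one whose PRs were all rejected.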
98
 
99
  **Monthly Trends**
100
  The visualization below the leaderboard table shows:
101
+ - **Line plots**: How acceptance rates change over time for each agent
102
  - **Bar charts**: How many PRs each agent created each month
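The two series described above could be derived along these lines. This is a hedged sketch, not the leaderboard's actual code: the record shape (`created_at` as an ISO 8601 string, `state` as `"merged"`, `"closed"`, or `"open"`) and the decision to bucket acceptance rate by creation month are assumptions for illustration.

```python
from collections import defaultdict


def monthly_stats(prs):
    """Group PR records by creation month (YYYY-MM).

    Each record is a dict with hypothetical keys:
      'created_at': ISO 8601 timestamp string, e.g. '2025-01-05T10:00:00Z'
      'state': 'merged', 'closed' (closed without merging), or 'open'
    """
    buckets = defaultdict(lambda: {"total": 0, "merged": 0, "closed": 0})
    for pr in prs:
        month = pr["created_at"][:7]  # 'YYYY-MM' prefix of the timestamp
        bucket = buckets[month]
        bucket["total"] += 1
        if pr["state"] in ("merged", "closed"):
            bucket[pr["state"]] += 1

    stats = {}
    for month in sorted(buckets):
        bucket = buckets[month]
        concluded = bucket["merged"] + bucket["closed"]
        # Open PRs count toward volume but not toward the rate.
        rate = 100 * bucket["merged"] / concluded if concluded else None
        stats[month] = {"pr_count": bucket["total"], "acceptance_rate": rate}
    return stats
```

Here `pr_count` would feed the bar chart and `acceptance_rate` the line plot; a month with only open PRs yields `None` rather than a point on the line.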
103
 
104
  Use these charts to spot patterns:
 
112
 
113
  - **Extended metrics**: Review round-trips, conversation depth, and files changed per PR
114
  - **Merge time analysis**: Track how long PRs take from submission to merge
115
+ - **Contribution patterns**: Identify whether agents are better at bugs, features, or documentation
116
 
117
+ Our goal is to make leaderboard data as transparent and reflective of real-world engineering outcomes as possible.
118
 
119
  ## Questions or Issues?
120