Spaces:
Sleeping
Sleeping
File size: 9,950 Bytes
c266ed1 f209cc2 c266ed1 f209cc2 c266ed1 8c1f582 24d65a0 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 3478ac2 8c1f582 3478ac2 8c1f582 f209cc2 8c1f582 3478ac2 f209cc2 8c1f582 3478ac2 8c1f582 3478ac2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 3478ac2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 3478ac2 8c1f582 f209cc2 8c1f582 3478ac2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 f209cc2 8c1f582 c266ed1 8c1f582 f209cc2 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 |
---
title: AI Agent UN - Multi-Agent Simulation Framework
emoji: 🏛️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# AI Agent United Nations: Multi-Agent Simulation Framework
A structured system for simulating international diplomatic decision-making using 195 AI agents with constrained JSON outputs.
## System Overview
This is an experimental framework demonstrating:
- **Multi-agent coordination** across 195 independent AI agents
- **Structured output constraints** with strict JSON schema validation
- **Generic prompt templates** producing country-specific behaviors
- **Task execution model** for running resolutions through all agents
### High-Level Concept
```mermaid
graph TB
subgraph "Input Layer"
RES[UN Resolution Text]
end
subgraph "Agent Layer - 195 Independent Agents"
A1[Agent: USA<br/>System Prompt]
A2[Agent: China<br/>System Prompt]
A3[Agent: Russia<br/>System Prompt]
ADOT[...]
A195[Agent: Tuvalu<br/>System Prompt]
end
subgraph "LLM Processing"
LLM[Claude 3.5 Sonnet<br/>Structured JSON Output]
end
subgraph "Output Layer"
V1[Vote: yes<br/>Statement: ...]
V2[Vote: no<br/>Statement: ...]
V3[Vote: yes<br/>Statement: ...]
VDOT[...]
V195[Vote: yes<br/>Statement: ...]
end
subgraph "Aggregation"
AGG[Combined Results<br/>Vote Counts + All Statements]
end
RES --> A1
RES --> A2
RES --> A3
RES --> ADOT
RES --> A195
A1 --> LLM
A2 --> LLM
A3 --> LLM
ADOT --> LLM
A195 --> LLM
LLM --> V1
LLM --> V2
LLM --> V3
LLM --> VDOT
LLM --> V195
V1 --> AGG
V2 --> AGG
V3 --> AGG
VDOT --> AGG
V195 --> AGG
style RES fill:#6366f1
style LLM fill:#8b5cf6
style AGG fill:#22c55e
style A1 fill:#f59e0b
style A2 fill:#f59e0b
style A3 fill:#f59e0b
style A195 fill:#f59e0b
```
## System Architecture
```mermaid
graph TB
subgraph Input
M[Motion Text<br/>tasks/motions/]
C[Country List<br/>195 UN Members]
end
subgraph "Agent Processing"
SP[System Prompt<br/>Generic Template]
UP[User Prompt<br/>+ Motion Text]
LLM[Claude 3.5 Sonnet<br/>Temperature: 0.7]
end
subgraph "Output Validation"
JSON[JSON Parser]
V[Schema Validator]
E[Error Handler]
end
subgraph Results
AGG[Aggregated Results]
META[Metadata]
FILE[JSON Output File]
end
M --> UP
C --> SP
SP --> LLM
UP --> LLM
LLM --> JSON
JSON --> V
V --> E
E --> AGG
AGG --> META
META --> FILE
style LLM fill:#6366f1
style JSON fill:#22c55e
style V fill:#f59e0b
style FILE fill:#8b5cf6
```
## Agent Processing Flow
```mermaid
sequenceDiagram
participant CLI as CLI Runner
participant Agent as Country Agent
participant LLM as Claude 3.5
participant Val as Validator
participant Store as Storage
CLI->>Agent: Load system prompt
CLI->>Agent: Send motion text
Agent->>LLM: System + User Prompt
LLM->>Agent: Raw text response
Agent->>Val: Parse JSON
alt Valid JSON
Val->>Val: Check schema
alt Valid Schema
Val->>Store: Save vote + statement
else Invalid Schema
Val->>Store: Save as abstain + error
end
else Invalid JSON
Val->>Store: Save as abstain + error
end
Store->>CLI: Continue to next country
```
## Core Components
### 1. Agent System Prompts
```mermaid
graph LR
subgraph "Generic Template"
T[Template Structure]
end
subgraph "Variables"
CN[Country Name]
P5[P5 Status]
end
subgraph "195 Agents"
US[United States]
CN2[China]
RU[Russia]
DOT[...]
TV[Tuvalu]
end
T --> CN
T --> P5
CN --> US
CN --> CN2
CN --> RU
CN --> DOT
CN --> TV
style T fill:#6366f1
style US fill:#22c55e
style CN2 fill:#22c55e
style RU fill:#22c55e
style TV fill:#22c55e
```
- 195 country-specific agents (one per UN member state)
- Generic template structure (identical for all countries)
- Only country name and P5 status differ between prompts
- AI infers policy positions from training data
### 2. Structured Output Schema
```json
{
"vote": "yes" | "no" | "abstain",
"statement": "Brief explanation (2-4 sentences)"
}
```
### 3. Validation Pipeline
```mermaid
graph TD
A[LLM Response] --> B{Valid JSON?}
B -->|Yes| C{Has vote field?}
B -->|No| ERR1[Error: Parse Failure]
C -->|Yes| D{Has statement field?}
C -->|No| ERR2[Error: Missing Vote]
D -->|Yes| E{Vote is yes/no/abstain?}
D -->|No| ERR3[Error: Missing Statement]
E -->|Yes| SUCCESS[Save Response]
E -->|No| ERR4[Error: Invalid Vote]
ERR1 --> DEFAULT[Save as Abstain + Error Flag]
ERR2 --> DEFAULT
ERR3 --> DEFAULT
ERR4 --> DEFAULT
style SUCCESS fill:#22c55e
style DEFAULT fill:#f59e0b
style ERR1 fill:#ef4444
style ERR2 fill:#ef4444
style ERR3 fill:#ef4444
style ERR4 fill:#ef4444
```
### 4. Model Configuration
- **Primary:** Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
- **Temperature:** 0.7 (balance consistency + variation)
- **Max tokens:** 800 per response
- **Provider:** Anthropic API
## What This Tests
- **LLM Geopolitical Knowledge**: How well models understand different countries' foreign policies
- **Structured Outputs**: Consistency in producing valid JSON under constraints
- **Multi-Agent Systems**: Coordinating hundreds of independent AI agents
- **Prompt Engineering**: Generic templates yielding specific behaviors
- **Error Handling**: Graceful degradation when agents produce invalid outputs
## Technical Implementation
### Execution Flow
```mermaid
graph TD
START[Start Simulation] --> LOAD_MOTION[Load Motion Text<br/>tasks/motions/motion_id.md]
LOAD_MOTION --> LOAD_COUNTRIES[Load Country List<br/>195 UN Members]
LOAD_COUNTRIES --> LOOP_START{For Each Country}
LOOP_START -->|Country 1-195| LOAD_PROMPT[Load System Prompt<br/>agents/representatives/country/]
LOAD_PROMPT --> BUILD_USER[Build User Prompt<br/>Motion + Instructions]
BUILD_USER --> API_CALL[API Call to Claude<br/>System + User Prompt]
API_CALL --> PARSE[Parse JSON Response]
PARSE --> VALIDATE[Validate Schema]
VALIDATE -->|Valid| STORE[Store Result]
VALIDATE -->|Invalid| ERROR[Store Error + Abstain]
STORE --> LOOP_START
ERROR --> LOOP_START
LOOP_START -->|All Done| AGGREGATE[Aggregate Results]
AGGREGATE --> CALC_STATS[Calculate Vote Summary]
CALC_STATS --> ADD_META[Add Metadata<br/>model, timestamp, etc]
ADD_META --> SAVE_TIME[Save Timestamped File<br/>motion_id_timestamp.json]
SAVE_TIME --> SAVE_LATEST[Save Latest File<br/>motion_id_latest.json]
SAVE_LATEST --> END[Complete]
style API_CALL fill:#6366f1
style VALIDATE fill:#f59e0b
style STORE fill:#22c55e
style ERROR fill:#ef4444
style END fill:#8b5cf6
```
### Command Line Interface
```bash
# Run simulation
python scripts/run_motion.py 01_gaza_ceasefire_resolution
# With specific model
python scripts/run_motion.py 01_gaza_ceasefire_resolution --model claude-3-5-sonnet-20241022
# Test with sample
python scripts/run_motion.py 01_gaza_ceasefire_resolution --sample 5
```
### Output Structure
```mermaid
graph LR
subgraph "JSON Output"
ROOT[Root Object]
META[Metadata]
VOTES[Votes Array]
end
subgraph "Metadata Fields"
ID[motion_id]
TS[timestamp]
MODEL[model]
TOTAL[total_votes]
SUMMARY[vote_summary]
end
subgraph "Vote Summary"
YES[yes: count]
NO[no: count]
ABS[abstain: count]
end
subgraph "Individual Votes"
V1[Vote 1: Country, vote, statement]
V2[Vote 2: Country, vote, statement]
V3[...]
V195[Vote 195: Country, vote, statement]
end
ROOT --> META
ROOT --> VOTES
META --> ID
META --> TS
META --> MODEL
META --> TOTAL
META --> SUMMARY
SUMMARY --> YES
SUMMARY --> NO
SUMMARY --> ABS
VOTES --> V1
VOTES --> V2
VOTES --> V3
VOTES --> V195
style ROOT fill:#8b5cf6
style META fill:#6366f1
style VOTES fill:#22c55e
```
## Case Study: Gaza Ceasefire Resolution
The Space includes a case study demonstrating the system with a Gaza ceasefire resolution voted on by all 195 agents.
### Results Overview
```mermaid
pie title Vote Distribution (195 Countries)
"Yes" : 190
"No" : 3
"Abstain" : 2
```
**Key Statistics:**
- **Yes:** 190 countries (97.4%)
- **No:** 3 countries (1.5%)
- **Abstain:** 2 countries (1.0%)
This serves as a concrete example of the framework in action, showing how generic prompts + model knowledge produce diverse, country-specific diplomatic responses.
## Research Applications
- Testing LLM knowledge of international relations
- Evaluating structured output consistency
- Studying emergent behavior in multi-agent systems
- Educational demonstrations of diplomatic complexity
## Limitations
This is a simulation for research and education:
- AI positions based on training data, not actual policies
- Does NOT predict real government decisions
- Should NOT be considered authoritative
- Real diplomacy involves classified information and human judgment
## Open Source
All code, prompts, and data available on GitHub:
- Repository: https://github.com/danielrosehill/AI-Agent-UN
- System Prompts: https://github.com/danielrosehill/AI-Agent-UN/tree/main/agents/representatives
- Execution Script: https://github.com/danielrosehill/AI-Agent-UN/blob/main/scripts/run_motion.py
---
Built with Gradio | Powered by Anthropic Claude
|