danielrosehill Claude committed on
Commit 3478ac2 · 1 Parent(s): f209cc2

Add visual flowcharts and diagrams to README


Enhanced README with Mermaid diagrams for better visual understanding:

New diagrams added:
1. High-Level Concept - multi-agent overview showing all 195 agents
2. System Architecture - complete data flow from input to output
3. Agent Processing Flow - sequence diagram of execution
4. Agent System Prompts - template-to-agents visualization
5. Validation Pipeline - decision tree for error handling
6. Execution Flow - detailed loop through 195 countries
7. Output Structure - JSON result organization
8. Case Study Results - pie chart of vote distribution

Visual improvements:
- Color-coded nodes (blue=input, purple=processing, green=success, orange=validation, red=errors)
- Clear subgraphs for logical grouping
- Arrows showing data flow
- Consistent styling throughout

Makes the technical architecture immediately understandable through visuals rather than text-only descriptions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1)
  1. README.md +305 -23
README.md CHANGED
@@ -22,17 +22,188 @@ This is an experimental framework demonstrating:
22
  - **Generic prompt templates** producing country-specific behaviors
23
  - **Task execution model** for running resolutions through all agents
24
 
25
- ## Architecture
26
 
27
- ### Core Components
28
 
29
- **Agent System Prompts**
30
  - 195 country-specific agents (one per UN member state)
31
  - Generic template structure (identical for all countries)
32
  - Only country name and P5 status differ between prompts
33
  - AI infers policy positions from training data
34
 
35
- **Structured Output Schema**
 
36
  ```json
37
  {
38
  "vote": "yes" | "no" | "abstain",
@@ -40,17 +211,39 @@ This is an experimental framework demonstrating:
40
  }
41
  ```
42
 
43
- **Task Execution**
44
- - Python CLI for running simulations
45
- - Sequential processing of all 195 agents
46
- - JSON validation and error handling
47
- - Aggregated results with metadata
48
 
49
- **Model Configuration**
50
- - Primary: Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
51
- - Temperature: 0.7
52
- - Max tokens: 800 per response
53
- - Provider: Anthropic API
54
 
55
  ## What This Tests
56
 
@@ -62,14 +255,40 @@ This is an experimental framework demonstrating:
62
 
63
  ## Technical Implementation
64
 
65
- **Execution Flow:**
66
- 1. Load motion text from `tasks/motions/`
67
- 2. Load 195 country agents
68
- 3. For each agent: system prompt + user prompt → JSON response
69
- 4. Validate and aggregate responses
70
- 5. Save results with metadata
71
 
72
- **Command Line Interface:**
73
  ```bash
74
  # Run simulation
75
  python scripts/run_motion.py 01_gaza_ceasefire_resolution
@@ -81,11 +300,74 @@ python scripts/run_motion.py 01_gaza_ceasefire_resolution --model claude-3-5-son
81
  python scripts/run_motion.py 01_gaza_ceasefire_resolution --sample 5
82
  ```
83
 
84
- ## Case Study
85
 
86
  The Space includes a case study demonstrating the system with a Gaza ceasefire resolution voted on by all 195 agents.
87
 
88
- **Results:** 190 Yes, 3 No, 2 Abstain
89
 
90
  This serves as a concrete example of the framework in action, showing how generic prompts + model knowledge produce diverse, country-specific diplomatic responses.
91
 
 
22
  - **Generic prompt templates** producing country-specific behaviors
23
  - **Task execution model** for running resolutions through all agents
24
 
25
+ ### High-Level Concept
26
 
27
+ ```mermaid
28
+ graph TB
29
+ subgraph "Input Layer"
30
+ RES[UN Resolution Text]
31
+ end
32
+
33
+ subgraph "Agent Layer - 195 Independent Agents"
34
+ A1[Agent: USA<br/>System Prompt]
35
+ A2[Agent: China<br/>System Prompt]
36
+ A3[Agent: Russia<br/>System Prompt]
37
+ ADOT[...]
38
+ A195[Agent: Tuvalu<br/>System Prompt]
39
+ end
40
+
41
+ subgraph "LLM Processing"
42
+ LLM[Claude 3.5 Sonnet<br/>Structured JSON Output]
43
+ end
44
+
45
+ subgraph "Output Layer"
46
+ V1[Vote: yes<br/>Statement: ...]
47
+ V2[Vote: no<br/>Statement: ...]
48
+ V3[Vote: yes<br/>Statement: ...]
49
+ VDOT[...]
50
+ V195[Vote: yes<br/>Statement: ...]
51
+ end
52
+
53
+ subgraph "Aggregation"
54
+ AGG[Combined Results<br/>Vote Counts + All Statements]
55
+ end
56
+
57
+ RES --> A1
58
+ RES --> A2
59
+ RES --> A3
60
+ RES --> ADOT
61
+ RES --> A195
62
+
63
+ A1 --> LLM
64
+ A2 --> LLM
65
+ A3 --> LLM
66
+ ADOT --> LLM
67
+ A195 --> LLM
68
+
69
+ LLM --> V1
70
+ LLM --> V2
71
+ LLM --> V3
72
+ LLM --> VDOT
73
+ LLM --> V195
74
+
75
+ V1 --> AGG
76
+ V2 --> AGG
77
+ V3 --> AGG
78
+ VDOT --> AGG
79
+ V195 --> AGG
80
+
81
+ style RES fill:#6366f1
82
+ style LLM fill:#8b5cf6
83
+ style AGG fill:#22c55e
84
+ style A1 fill:#f59e0b
85
+ style A2 fill:#f59e0b
86
+ style A3 fill:#f59e0b
87
+ style A195 fill:#f59e0b
88
+ ```
89
+
90
+ ## System Architecture
91
+
92
+ ```mermaid
93
+ graph TB
94
+ subgraph Input
95
+ M[Motion Text<br/>tasks/motions/]
96
+ C[Country List<br/>195 UN Members]
97
+ end
98
+
99
+ subgraph "Agent Processing"
100
+ SP[System Prompt<br/>Generic Template]
101
+ UP[User Prompt<br/>+ Motion Text]
102
+ LLM[Claude 3.5 Sonnet<br/>Temperature: 0.7]
103
+ end
104
+
105
+ subgraph "Output Validation"
106
+ JSON[JSON Parser]
107
+ V[Schema Validator]
108
+ E[Error Handler]
109
+ end
110
+
111
+ subgraph Results
112
+ AGG[Aggregated Results]
113
+ META[Metadata]
114
+ FILE[JSON Output File]
115
+ end
116
+
117
+ M --> UP
118
+ C --> SP
119
+ SP --> LLM
120
+ UP --> LLM
121
+ LLM --> JSON
122
+ JSON --> V
123
+ V --> E
124
+ E --> AGG
125
+ AGG --> META
126
+ META --> FILE
127
+
128
+ style LLM fill:#6366f1
129
+ style JSON fill:#22c55e
130
+ style V fill:#f59e0b
131
+ style FILE fill:#8b5cf6
132
+ ```
133
+
134
+ ## Agent Processing Flow
135
+
136
+ ```mermaid
137
+ sequenceDiagram
138
+ participant CLI as CLI Runner
139
+ participant Agent as Country Agent
140
+ participant LLM as Claude 3.5
141
+ participant Val as Validator
142
+ participant Store as Storage
143
+
144
+ CLI->>Agent: Load system prompt
145
+ CLI->>Agent: Send motion text
146
+ Agent->>LLM: System + User Prompt
147
+ LLM->>Agent: Raw text response
148
+ Agent->>Val: Parse JSON
149
+ alt Valid JSON
150
+ Val->>Val: Check schema
151
+ alt Valid Schema
152
+ Val->>Store: Save vote + statement
153
+ else Invalid Schema
154
+ Val->>Store: Save as abstain + error
155
+ end
156
+ else Invalid JSON
157
+ Val->>Store: Save as abstain + error
158
+ end
159
+ Store->>CLI: Continue to next country
160
+ ```
161
+
162
+ ## Core Components
163
+
164
+ ### 1. Agent System Prompts
165
+
166
+ ```mermaid
167
+ graph LR
168
+ subgraph "Generic Template"
169
+ T[Template Structure]
170
+ end
171
+
172
+ subgraph "Variables"
173
+ CN[Country Name]
174
+ P5[P5 Status]
175
+ end
176
+
177
+ subgraph "195 Agents"
178
+ US[United States]
179
+ CN2[China]
180
+ RU[Russia]
181
+ DOT[...]
182
+ TV[Tuvalu]
183
+ end
184
+
185
+ T --> CN
186
+ T --> P5
187
+ CN --> US
188
+ CN --> CN2
189
+ CN --> RU
190
+ CN --> DOT
191
+ CN --> TV
192
+
193
+ style T fill:#6366f1
194
+ style US fill:#22c55e
195
+ style CN2 fill:#22c55e
196
+ style RU fill:#22c55e
197
+ style TV fill:#22c55e
198
+ ```
199
 
 
200
  - 195 country-specific agents (one per UN member state)
201
  - Generic template structure (identical for all countries)
202
  - Only country name and P5 status differ between prompts
203
  - AI infers policy positions from training data
204
 
205
+ ### 2. Structured Output Schema
206
+
207
  ```json
208
  {
209
  "vote": "yes" | "no" | "abstain",
 
211
  }
212
  ```
213
 
214
+ ### 3. Validation Pipeline
215
 
216
+ ```mermaid
217
+ graph TD
218
+ A[LLM Response] --> B{Valid JSON?}
219
+ B -->|Yes| C{Has vote field?}
220
+ B -->|No| ERR1[Error: Parse Failure]
221
+ C -->|Yes| D{Has statement field?}
222
+ C -->|No| ERR2[Error: Missing Vote]
223
+ D -->|Yes| E{Vote is yes/no/abstain?}
224
+ D -->|No| ERR3[Error: Missing Statement]
225
+ E -->|Yes| SUCCESS[Save Response]
226
+ E -->|No| ERR4[Error: Invalid Vote]
227
+
228
+ ERR1 --> DEFAULT[Save as Abstain + Error Flag]
229
+ ERR2 --> DEFAULT
230
+ ERR3 --> DEFAULT
231
+ ERR4 --> DEFAULT
232
+
233
+ style SUCCESS fill:#22c55e
234
+ style DEFAULT fill:#f59e0b
235
+ style ERR1 fill:#ef4444
236
+ style ERR2 fill:#ef4444
237
+ style ERR3 fill:#ef4444
238
+ style ERR4 fill:#ef4444
239
+ ```
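
The decision tree above could be implemented roughly as follows. This is a sketch under assumptions, not the repository's code: `validate_response`, the `error` field name, and the error labels are illustrative, and the check order simply follows the diagram.

```python
import json

VALID_VOTES = {"yes", "no", "abstain"}


def _fallback(reason: str) -> dict:
    """Every error path in the diagram ends in an abstain plus an error flag."""
    return {"vote": "abstain", "statement": "", "error": reason}


def validate_response(raw_text: str) -> dict:
    """Apply the diagram's checks in order and return a normalized record."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return _fallback("parse_failure")
    # Valid JSON that is not an object cannot carry a vote field.
    if not isinstance(data, dict) or "vote" not in data:
        return _fallback("missing_vote")
    if "statement" not in data:
        return _fallback("missing_statement")
    vote = data["vote"]
    if not isinstance(vote, str) or vote not in VALID_VOTES:
        return _fallback("invalid_vote")
    return {"vote": vote, "statement": data["statement"], "error": None}


if __name__ == "__main__":
    print(validate_response('{"vote": "yes", "statement": "We support this."}'))
    print(validate_response("I vote yes!"))
```

Defaulting failures to an abstention keeps the run going, but it also means abstain counts can include malformed responses, so the error flag is worth checking when reading results.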
240
+
241
+ ### 4. Model Configuration
242
+
243
+ - **Primary:** Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
244
+ - **Temperature:** 0.7 (balance consistency + variation)
245
+ - **Max tokens:** 800 per response
246
+ - **Provider:** Anthropic API
247
 
248
  ## What This Tests
249
 
 
255
 
256
  ## Technical Implementation
257
 
258
+ ### Execution Flow
259
+
260
+ ```mermaid
261
+ graph TD
262
+ START[Start Simulation] --> LOAD_MOTION[Load Motion Text<br/>tasks/motions/motion_id.md]
263
+ LOAD_MOTION --> LOAD_COUNTRIES[Load Country List<br/>195 UN Members]
264
+ LOAD_COUNTRIES --> LOOP_START{For Each Country}
265
+
266
+ LOOP_START -->|Country 1-195| LOAD_PROMPT[Load System Prompt<br/>agents/representatives/country/]
267
+ LOAD_PROMPT --> BUILD_USER[Build User Prompt<br/>Motion + Instructions]
268
+ BUILD_USER --> API_CALL[API Call to Claude<br/>System + User Prompt]
269
+ API_CALL --> PARSE[Parse JSON Response]
270
+ PARSE --> VALIDATE[Validate Schema]
271
+ VALIDATE -->|Valid| STORE[Store Result]
272
+ VALIDATE -->|Invalid| ERROR[Store Error + Abstain]
273
+ STORE --> LOOP_START
274
+ ERROR --> LOOP_START
275
+
276
+ LOOP_START -->|All Done| AGGREGATE[Aggregate Results]
277
+ AGGREGATE --> CALC_STATS[Calculate Vote Summary]
278
+ CALC_STATS --> ADD_META[Add Metadata<br/>model, timestamp, etc]
279
+ ADD_META --> SAVE_TIME[Save Timestamped File<br/>motion_id_timestamp.json]
280
+ SAVE_TIME --> SAVE_LATEST[Save Latest File<br/>motion_id_latest.json]
281
+ SAVE_LATEST --> END[Complete]
282
+
283
+ style API_CALL fill:#6366f1
284
+ style VALIDATE fill:#f59e0b
285
+ style STORE fill:#22c55e
286
+ style ERROR fill:#ef4444
287
+ style END fill:#8b5cf6
288
+ ```
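
The per-country loop in the flowchart above can be sketched as below. The model call is passed in as a plain function (`call_model`) so the example runs without an API key. That parameter, the placeholder prompt strings, and the record field names are assumptions made for illustration, not the behavior of `scripts/run_motion.py`.

```python
import json
from typing import Callable

ALLOWED_VOTES = ("yes", "no", "abstain")


def run_motion(
    motion_text: str,
    countries: list[str],
    call_model: Callable[[str, str], str],
) -> list[dict]:
    """Process each country in order and collect one record per country."""
    results = []
    for country in countries:
        # Placeholder prompts; the real ones are loaded from repository files.
        system_prompt = f"You represent {country} at the United Nations."
        user_prompt = (
            f"{motion_text}\n\nReply with JSON containing 'vote' and 'statement'."
        )
        raw = call_model(system_prompt, user_prompt)
        try:
            parsed = json.loads(raw)
            vote, statement = parsed["vote"], parsed["statement"]
            if vote not in ALLOWED_VOTES:
                raise ValueError(f"invalid vote: {vote!r}")
            record = {"country": country, "vote": vote, "statement": statement}
        except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
            # Any parse or schema failure is stored as an abstention with the error.
            record = {
                "country": country,
                "vote": "abstain",
                "statement": "",
                "error": str(exc),
            }
        results.append(record)
    return results


if __name__ == "__main__":
    def fake_model(system_prompt: str, user_prompt: str) -> str:
        return '{"vote": "yes", "statement": "We support this resolution."}'

    print(run_motion("Example motion text", ["France", "Tuvalu"], fake_model))
```

Injecting the model call also makes the loop easy to test with a stub before spending API credits on a full 195-country run.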
289
+
290
+ ### Command Line Interface
291
 
 
292
  ```bash
293
  # Run simulation
294
  python scripts/run_motion.py 01_gaza_ceasefire_resolution
 
300
  python scripts/run_motion.py 01_gaza_ceasefire_resolution --sample 5
301
  ```
302
 
303
+ ### Output Structure
304
+
305
+ ```mermaid
306
+ graph LR
307
+ subgraph "JSON Output"
308
+ ROOT[Root Object]
309
+ META[Metadata]
310
+ VOTES[Votes Array]
311
+ end
312
+
313
+ subgraph "Metadata Fields"
314
+ ID[motion_id]
315
+ TS[timestamp]
316
+ MODEL[model]
317
+ TOTAL[total_votes]
318
+ SUMMARY[vote_summary]
319
+ end
320
+
321
+ subgraph "Vote Summary"
322
+ YES[yes: count]
323
+ NO[no: count]
324
+ ABS[abstain: count]
325
+ end
326
+
327
+ subgraph "Individual Votes"
328
+ V1[Vote 1: Country, vote, statement]
329
+ V2[Vote 2: Country, vote, statement]
330
+ V3[...]
331
+ V195[Vote 195: Country, vote, statement]
332
+ end
333
+
334
+ ROOT --> META
335
+ ROOT --> VOTES
336
+ META --> ID
337
+ META --> TS
338
+ META --> MODEL
339
+ META --> TOTAL
340
+ META --> SUMMARY
341
+ SUMMARY --> YES
342
+ SUMMARY --> NO
343
+ SUMMARY --> ABS
344
+ VOTES --> V1
345
+ VOTES --> V2
346
+ VOTES --> V3
347
+ VOTES --> V195
348
+
349
+ style ROOT fill:#8b5cf6
350
+ style META fill:#6366f1
351
+ style VOTES fill:#22c55e
352
+ ```
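
As a rough illustration of the structure in the diagram above, the result object could be assembled like this. The metadata field names are taken from the diagram. The top-level keys `metadata` and `votes`, the per-vote keys, and the `build_result` helper are assumptions, since the actual JSON layout is not shown in this diff.

```python
from collections import Counter
from datetime import datetime, timezone


def build_result(motion_id: str, model: str, votes: list[dict]) -> dict:
    """Combine per-country records with run metadata and a vote summary."""
    counts = Counter(v["vote"] for v in votes)
    return {
        "metadata": {
            "motion_id": motion_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model": model,
            "total_votes": len(votes),
            # Always report all three options, even when a count is zero.
            "vote_summary": {k: counts.get(k, 0) for k in ("yes", "no", "abstain")},
        },
        "votes": votes,
    }


if __name__ == "__main__":
    sample = [
        {"country": "France", "vote": "yes", "statement": "..."},
        {"country": "Tuvalu", "vote": "abstain", "statement": "..."},
    ]
    print(build_result("01_gaza_ceasefire_resolution", "claude-3-5-sonnet-20241022", sample))
```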
353
+
354
+ ## Case Study: Gaza Ceasefire Resolution
355
 
356
  The Space includes a case study demonstrating the system with a Gaza ceasefire resolution voted on by all 195 agents.
357
 
358
+ ### Results Overview
359
+
360
+ ```mermaid
361
+ pie title Vote Distribution (195 Countries)
362
+ "Yes" : 190
363
+ "No" : 3
364
+ "Abstain" : 2
365
+ ```
366
+
367
+ **Key Statistics:**
368
+ - **Yes:** 190 countries (97.4%)
369
+ - **No:** 3 countries (1.5%)
370
+ - **Abstain:** 2 countries (1.0%)
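
The percentages above follow directly from the vote counts. A short check, using only the counts reported in this case study:

```python
# Vote counts as reported in the case study.
counts = {"yes": 190, "no": 3, "abstain": 2}
total = sum(counts.values())

# Share of each option, rounded to one decimal place as in the list above.
shares = {option: round(100 * n / total, 1) for option, n in counts.items()}

for option, share in shares.items():
    print(f"{option}: {counts[option]} of {total} ({share}%)")
```

Because each share is rounded independently, the three figures add up to 99.9% rather than exactly 100%.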
371
 
372
  This serves as a concrete example of the framework in action, showing how generic prompts + model knowledge produce diverse, country-specific diplomatic responses.
373