Commit 2a10e9c
Parent(s): df042c8
Committed by Tom and Claude
Add Datawrapper chart generation mode with clean iframe display
- Add Chart Generation Mode with CSV upload and AI-powered chart creation
- Integrate Datawrapper API via custom MCP handlers for create, publish, and retrieve operations
- Implement RAG-powered chart type selection and configuration
- Display charts as embedded iframes with reasoning and edit button
- Clean up debug output for production-ready UI
- Update README with dual-mode functionality
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- .env.example +7 -0
- README.md +34 -246
- app.py +188 -21
- datawrapper_mcp/__init__.py +1 -0
- datawrapper_mcp/config.py +24 -0
- datawrapper_mcp/handlers/__init__.py +19 -0
- datawrapper_mcp/handlers/create.py +52 -0
- datawrapper_mcp/handlers/delete.py +25 -0
- datawrapper_mcp/handlers/export.py +48 -0
- datawrapper_mcp/handlers/publish.py +26 -0
- datawrapper_mcp/handlers/retrieve.py +27 -0
- datawrapper_mcp/handlers/schema.py +31 -0
- datawrapper_mcp/handlers/update.py +61 -0
- datawrapper_mcp/server.py +101 -0
- datawrapper_mcp/tools.py +286 -0
- datawrapper_mcp/utils.py +118 -0
- requirements.txt +5 -0
- src/datawrapper_client.py +336 -0
- src/prompts.py +107 -0
- start.sh +31 -0
.env.example
CHANGED
|
@@ -21,6 +21,13 @@ HF_TOKEN=hf_your_token_here
|
|
| 21 |
# This is used for Jina-CLIP-v2 embeddings
|
| 22 |
JINA_API_KEY=jina_your_token_here
|
| 23 |
|
| 24 |
# =============================================================================
|
| 25 |
# OPTIONAL: LLM Configuration
|
| 26 |
# =============================================================================
|
|
|
|
| 21 |
# This is used for Jina-CLIP-v2 embeddings
|
| 22 |
JINA_API_KEY=jina_your_token_here
|
| 23 |
|
| 24 |
+
# =============================================================================
|
| 25 |
+
# REQUIRED: Datawrapper API Token
|
| 26 |
+
# =============================================================================
|
| 27 |
+
# Get your token from: https://app.datawrapper.de/account/api-tokens
|
| 28 |
+
# This is used for creating and publishing charts via Datawrapper API
|
| 29 |
+
DATAWRAPPER_ACCESS_TOKEN=your_datawrapper_token_here
|
| 30 |
+
|
| 31 |
# =============================================================================
|
| 32 |
# OPTIONAL: LLM Configuration
|
| 33 |
# =============================================================================
|
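The new `DATAWRAPPER_ACCESS_TOKEN` entry is marked as required. A minimal sketch of failing early when it is missing, assuming the token is read from the process environment. `require_env` is a hypothetical helper for illustration and is not part of this commit:

```python
import os


def require_env(name: str, env=None) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of this commit.
    """
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or environment."
        )
    return value
```

Calling `require_env("DATAWRAPPER_ACCESS_TOKEN")` once at startup would surface a missing token before the first chart request rather than during it.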
README.md
CHANGED
|
@@ -7,278 +7,66 @@ sdk: gradio
|
|
| 7 |
sdk_version: 5.49.1
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
-
short_description: AI assistant for visualization guidance and
|
| 11 |
license: mit
|
| 12 |
---
|
| 13 |
|
| 14 |
-
# 📊
|
| 15 |
|
| 16 |
-
|
| 17 |
|
| 18 |
-
|
|
|
|
| 19 |
|
| 20 |
-
|
| 21 |
-
- **📚 Research-Backed Guidance**: Access insights from academic papers and design best practices
|
| 22 |
-
- **🔍 Context-Aware Retrieval**: Semantic search finds the most relevant examples and knowledge for your needs
|
| 23 |
-
- **🚀 API Access**: Built-in REST API for integration with external applications
|
| 24 |
-
- **💬 Chat Interface**: User-friendly conversational interface
|
| 25 |
-
- **⚡ Technical Implementation**: Practical guidance on tools, techniques, and code examples
|
| 26 |
|
| 27 |
-
|
| 28 |
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
┌──────────────▼───────────────────────┐
|
| 35 |
-
│ RAG Pipeline │
|
| 36 |
-
│ • Query Understanding │
|
| 37 |
-
│ • Document Retrieval (PGVector) │
|
| 38 |
-
│ • Response Generation (LLM) │
|
| 39 |
-
└──────────────┬───────────────────────┘
|
| 40 |
-
│
|
| 41 |
-
┌──────────┴──────────┐
|
| 42 |
-
│ │
|
| 43 |
-
┌───▼───────────┐ ┌─────▼────────────┐
|
| 44 |
-
│ Supabase │ │ HF Inference │
|
| 45 |
-
│ PGVector DB │ │ Providers │
|
| 46 |
-
│ (198 docs) │ │ (Llama 3.1) │
|
| 47 |
-
└───────────────┘ └──────────────────┘
|
| 48 |
-
```
|
| 49 |
|
| 50 |
-
##
|
| 51 |
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
1. **Clone the repository**
|
| 55 |
-
```bash
|
| 56 |
-
git clone <your-repo-url>
|
| 57 |
-
cd graphics-llm
|
| 58 |
-
```
|
| 59 |
-
|
| 60 |
-
2. **Install dependencies**
|
| 61 |
```bash
|
| 62 |
pip install -r requirements.txt
|
| 63 |
```
|
| 64 |
|
| 65 |
-
|
| 66 |
```bash
|
| 67 |
cp .env.example .env
|
| 68 |
-
# Edit .env with your credentials
|
| 69 |
```
|
| 70 |
|
| 71 |
-
Required
|
| 72 |
-
- `SUPABASE_URL
|
| 73 |
-
- `SUPABASE_KEY
|
| 74 |
-
- `HF_TOKEN
|
| 75 |
-
- `
|
| 76 |
|
| 77 |
-
|
| 78 |
```bash
|
| 79 |
python app.py
|
| 80 |
```
|
| 81 |
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
### Hugging Face Spaces Deployment
|
| 85 |
-
|
| 86 |
-
1. **Create a new Space** on Hugging Face
|
| 87 |
-
2. **Push this repository** to your Space
|
| 88 |
-
3. **Set environment variables** in Space settings:
|
| 89 |
-
- `SUPABASE_URL`
|
| 90 |
-
- `SUPABASE_KEY`
|
| 91 |
-
- `HF_TOKEN`
|
| 92 |
-
- `JINA_API_KEY`
|
| 93 |
-
4. **Deploy** - The Space will automatically build and launch
|
| 94 |
-
|
| 95 |
-
## 📚 Usage
|
| 96 |
-
|
| 97 |
-
### Chat Interface
|
| 98 |
-
|
| 99 |
-
Simply ask your design questions:
|
| 100 |
-
|
| 101 |
-
```
|
| 102 |
-
"What's the best chart type for showing trends over time?"
|
| 103 |
-
"How do I create an effective infographic for complex data?"
|
| 104 |
-
"What are best practices for data visualization accessibility?"
|
| 105 |
-
```
|
| 106 |
-
|
| 107 |
-
The assistant will provide:
|
| 108 |
-
1. Design recommendations based on your intent
|
| 109 |
-
2. WHY each visualization type is suitable
|
| 110 |
-
3. HOW to implement it (tools, techniques, code)
|
| 111 |
-
4. Best practices from research and examples
|
| 112 |
-
5. Accessibility and effectiveness considerations
|
| 113 |
-
|
| 114 |
-
### API Access
|
| 115 |
-
|
| 116 |
-
This app automatically exposes REST API endpoints for external integration.
|
| 117 |
-
|
| 118 |
-
**Python Client:**
|
| 119 |
-
|
| 120 |
-
```python
|
| 121 |
-
from gradio_client import Client
|
| 122 |
-
|
| 123 |
-
client = Client("your-space-url")
|
| 124 |
-
result = client.predict(
|
| 125 |
-
"What's the best chart for time series?",
|
| 126 |
-
api_name="/recommend"
|
| 127 |
-
)
|
| 128 |
-
print(result)
|
| 129 |
-
```
|
| 130 |
-
|
| 131 |
-
**JavaScript Client:**
|
| 132 |
-
|
| 133 |
-
```javascript
|
| 134 |
-
import { Client } from "@gradio/client";
|
| 135 |
-
|
| 136 |
-
const client = await Client.connect("your-space-url");
|
| 137 |
-
const result = await client.predict("/recommend", {
|
| 138 |
-
message: "What's the best chart for time series?"
|
| 139 |
-
});
|
| 140 |
-
console.log(result.data);
|
| 141 |
-
```
|
| 142 |
-
|
| 143 |
-
**cURL:**
|
| 144 |
-
|
| 145 |
-
```bash
|
| 146 |
-
curl -X POST "https://your-space.hf.space/call/recommend" \
|
| 147 |
-
-H "Content-Type: application/json" \
|
| 148 |
-
-d '{"data": ["What's the best chart for time series?"]}'
|
| 149 |
-
```
|
| 150 |
-
|
| 151 |
-
**Available Endpoints:**
|
| 152 |
-
- `/call/recommend` - Main design recommendation assistant
|
| 153 |
-
- `/gradio_api/openapi.json` - OpenAPI specification
|
| 154 |
-
|
| 155 |
-
## 🗄️ Database
|
| 156 |
-
|
| 157 |
-
The app uses Supabase with PGVector extension to store and retrieve document chunks from graphics research and examples.
|
| 158 |
-
|
| 159 |
-
**Database Schema:**
|
| 160 |
-
```sql
|
| 161 |
-
CREATE TABLE document_embeddings (
|
| 162 |
-
id BIGINT PRIMARY KEY,
|
| 163 |
-
source_type TEXT, -- pdf, url, or image
|
| 164 |
-
source_id TEXT, -- filename or URL
|
| 165 |
-
title TEXT,
|
| 166 |
-
content_type TEXT, -- text or image
|
| 167 |
-
chunk_index INTEGER,
|
| 168 |
-
chunk_text TEXT,
|
| 169 |
-
page_number INTEGER,
|
| 170 |
-
embedding VECTOR(1024), -- 1024-dimensional vectors
|
| 171 |
-
metadata JSONB,
|
| 172 |
-
word_count INTEGER,
|
| 173 |
-
image_metadata JSONB,
|
| 174 |
-
created_at TIMESTAMPTZ
|
| 175 |
-
);
|
| 176 |
-
```
|
| 177 |
-
|
| 178 |
-
**Knowledge Base Content:**
|
| 179 |
-
- Research papers on data visualization
|
| 180 |
-
- Design principles and best practices
|
| 181 |
-
- Visual narrative techniques
|
| 182 |
-
- Accessibility guidelines
|
| 183 |
-
- Chart type selection guidance
|
| 184 |
-
- Real-world examples and case studies
|
| 185 |
-
|
| 186 |
-
## 🛠️ Technology Stack
|
| 187 |
-
|
| 188 |
-
- **UI/API**: [Gradio](https://gradio.app/) - Automatic API generation
|
| 189 |
-
- **Vector Database**: [Supabase](https://supabase.com/) with PGVector extension
|
| 190 |
-
- **Embeddings**: Jina-CLIP-v2 (1024-dimensional)
|
| 191 |
-
- **LLM**: [Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers/) - Llama 3.1
|
| 192 |
-
- **Language**: Python 3.9+
|
| 193 |
-
|
| 194 |
-
## 📁 Project Structure
|
| 195 |
-
|
| 196 |
-
```
|
| 197 |
-
graphics-llm/
|
| 198 |
-
├── app.py # Main Gradio application
|
| 199 |
-
├── requirements.txt # Python dependencies
|
| 200 |
-
├── .env.example # Environment variables template
|
| 201 |
-
├── README.md # This file
|
| 202 |
-
└── src/
|
| 203 |
-
├── __init__.py
|
| 204 |
-
├── vectorstore.py # Supabase PGVector connection
|
| 205 |
-
├── rag_pipeline.py # RAG pipeline logic
|
| 206 |
-
├── llm_client.py # Inference Provider client
|
| 207 |
-
└── prompts.py # Design recommendation prompt templates
|
| 208 |
-
```
|
| 209 |
-
|
| 210 |
-
## ⚙️ Configuration
|
| 211 |
-
|
| 212 |
-
### Environment Variables
|
| 213 |
-
|
| 214 |
-
See `.env.example` for all available configuration options.
|
| 215 |
-
|
| 216 |
-
**Required:**
|
| 217 |
-
- `SUPABASE_URL` - Supabase project URL
|
| 218 |
-
- `SUPABASE_KEY` - Supabase anon key
|
| 219 |
-
- `HF_TOKEN` - Hugging Face API token (for LLM generation)
|
| 220 |
-
- `JINA_API_KEY` - Jina AI API token (for Jina-CLIP-v2 embeddings)
|
| 221 |
-
|
| 222 |
-
**Optional:**
|
| 223 |
-
- `LLM_MODEL` - Model to use (default: meta-llama/Llama-3.1-8B-Instruct)
|
| 224 |
-
- `LLM_TEMPERATURE` - Generation temperature (default: 0.2)
|
| 225 |
-
- `LLM_MAX_TOKENS` - Max tokens to generate (default: 2000)
|
| 226 |
-
- `RETRIEVAL_K` - Number of documents to retrieve (default: 5)
|
| 227 |
-
- `EMBEDDING_MODEL` - Embedding model (default: jina-clip-v2)
|
| 228 |
-
|
| 229 |
-
### Supported LLM Models
|
| 230 |
-
|
| 231 |
-
- `meta-llama/Llama-3.1-8B-Instruct` (recommended)
|
| 232 |
-
- `meta-llama/Meta-Llama-3-8B-Instruct`
|
| 233 |
-
- `Qwen/Qwen2.5-72B-Instruct`
|
| 234 |
-
- `mistralai/Mistral-7B-Instruct-v0.3`
|
| 235 |
-
|
| 236 |
-
## 💰 Cost Considerations
|
| 237 |
-
|
| 238 |
-
### Hugging Face Inference Providers
|
| 239 |
-
- Free tier: $0.10/month credits
|
| 240 |
-
- PRO tier: $2.00/month credits + pay-as-you-go
|
| 241 |
-
- Typical cost: ~$0.001-0.01 per query
|
| 242 |
-
- Recommended budget: $10-50/month for moderate usage
|
| 243 |
-
|
| 244 |
-
### Supabase
|
| 245 |
-
- Free tier sufficient for most use cases
|
| 246 |
-
- PGVector operations are standard database queries
|
| 247 |
-
|
| 248 |
-
### Hugging Face Spaces
|
| 249 |
-
- Free CPU hosting available
|
| 250 |
-
- GPU upgrade: ~$0.60/hour (optional, not required)
|
| 251 |
-
|
| 252 |
-
## 🔮 Future Enhancements
|
| 253 |
-
|
| 254 |
-
- [ ] Multi-turn conversation with memory
|
| 255 |
-
- [ ] Code generation for visualization implementations
|
| 256 |
-
- [ ] Interactive visualization previews
|
| 257 |
-
- [ ] User-uploaded data analysis
|
| 258 |
-
- [ ] Export recommendations as PDF/markdown
|
| 259 |
-
- [ ] Community-contributed examples
|
| 260 |
-
- [ ] Support for more design domains (UI/UX, print graphics)
|
| 261 |
-
|
| 262 |
-
## 🤝 Contributing
|
| 263 |
-
|
| 264 |
-
Contributions are welcome! Please feel free to submit issues or pull requests.
|
| 265 |
-
|
| 266 |
-
## 📄 License
|
| 267 |
-
|
| 268 |
-
MIT License - See LICENSE file for details
|
| 269 |
-
|
| 270 |
-
## 🙏 Acknowledgments
|
| 271 |
|
| 272 |
-
-
|
| 273 |
-
-
|
| 274 |
|
| 275 |
-
##
|
| 276 |
|
| 277 |
-
|
| 278 |
-
- Open an issue on GitHub
|
| 279 |
-
- Check the [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
|
| 280 |
-
- Review the [Gradio documentation](https://gradio.app/docs/)
|
| 281 |
|
| 282 |
---
|
| 283 |
|
| 284 |
-
Built
|
|
|
|
| 7 |
sdk_version: 5.49.1
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
+
short_description: AI assistant for visualization guidance and chart generation
|
| 11 |
license: mit
|
| 12 |
---
|
| 13 |
|
| 14 |
+
# 📊 Viz LLM
|
| 15 |
|
| 16 |
+
AI-powered data visualization assistant with two modes:
|
| 17 |
|
| 18 |
+
- **💡 Ideation Mode**: Get design recommendations based on research and best practices
|
| 19 |
+
- **📊 Chart Generation Mode**: Upload CSV data and automatically generate publication-ready charts
|
| 20 |
|
| 21 |
+
## Features
|
| 22 |
|
| 23 |
+
**Ideation Mode:**
|
| 24 |
+
- Research-backed visualization guidance
|
| 25 |
+
- Chart type recommendations
|
| 26 |
+
- Design best practices and accessibility advice
|
| 27 |
+
- Powered by RAG with Jina-CLIP-v2 embeddings
|
| 28 |
|
| 29 |
+
**Chart Generation Mode:**
|
| 30 |
+
- Upload CSV data
|
| 31 |
+
- AI analyzes your data and selects optimal chart type
|
| 32 |
+
- Automatic chart creation via Datawrapper API
|
| 33 |
+
- Publication-ready visualizations with one click
|
| 34 |
|
| 35 |
+
## Quick Start
|
| 36 |
|
| 37 |
+
1. **Install dependencies:**
|
| 38 |
```bash
|
| 39 |
pip install -r requirements.txt
|
| 40 |
```
|
| 41 |
|
| 42 |
+
2. **Set up environment variables:**
|
| 43 |
```bash
|
| 44 |
cp .env.example .env
|
|
|
|
| 45 |
```
|
| 46 |
|
| 47 |
+
Required:
|
| 48 |
+
- `SUPABASE_URL` - Your Supabase project URL
|
| 49 |
+
- `SUPABASE_KEY` - Your Supabase anon key
|
| 50 |
+
- `HF_TOKEN` - Hugging Face API token
|
| 51 |
+
- `DATAWRAPPER_ACCESS_TOKEN` - Datawrapper API token
|
| 52 |
|
| 53 |
+
3. **Run the app:**
|
| 54 |
```bash
|
| 55 |
python app.py
|
| 56 |
```
|
| 57 |
|
| 58 |
+
## Technology Stack
|
| 59 |
|
| 60 |
+
- **UI**: Gradio
|
| 61 |
+
- **Vector Database**: Supabase PGVector
|
| 62 |
+
- **Embeddings**: Jina-CLIP-v2
|
| 63 |
+
- **LLM**: Llama 3.1 via Hugging Face Inference Providers
|
| 64 |
+
- **Charts**: Datawrapper API
|
| 65 |
|
| 66 |
+
## License
|
| 67 |
|
| 68 |
+
MIT License
|
| 69 |
|
| 70 |
---
|
| 71 |
|
| 72 |
+
Built for the data visualization community
|
app.py
CHANGED
|
@@ -3,12 +3,17 @@ Viz LLM - Gradio App
|
|
| 3 |
|
| 4 |
A RAG-powered assistant for data visualization guidance, powered by Jina-CLIP-v2
|
| 5 |
embeddings and research from the field of information graphics.
|
|
|
|
|
|
|
| 6 |
"""
|
| 7 |
|
| 8 |
import os
|
|
|
|
|
|
|
| 9 |
import gradio as gr
|
| 10 |
from dotenv import load_dotenv
|
| 11 |
from src.rag_pipeline import create_pipeline
|
|
|
|
| 12 |
from datetime import datetime, timedelta
|
| 13 |
from collections import defaultdict
|
| 14 |
|
|
@@ -90,7 +95,94 @@ def recommend_stream(message: str, history: list, request: gr.Request):
|
|
| 90 |
yield f"Error generating response: {str(e)}\n\nPlease check your environment variables (HF_TOKEN, SUPABASE_URL, SUPABASE_KEY) and try again."
|
| 91 |
|
| 92 |
|
| 93 |
-
|
| 94 |
custom_css = """
|
| 95 |
/* Hide retry/undo buttons that appear as artifacts */
|
| 96 |
.chatbot button[aria-label="Retry"],
|
|
@@ -102,9 +194,16 @@ custom_css = """
|
|
| 102 |
textarea[data-testid="textbox"] {
|
| 103 |
overflow-y: hidden !important;
|
| 104 |
}
|
| 105 |
"""
|
| 106 |
|
| 107 |
-
# Create Gradio interface
|
| 108 |
with gr.Blocks(
|
| 109 |
title="Viz LLM",
|
| 110 |
css=custom_css
|
|
@@ -112,29 +211,95 @@ with gr.Blocks(
|
|
| 112 |
gr.Markdown("""
|
| 113 |
# 📊 Viz LLM
|
| 114 |
|
| 115 |
-
Get design recommendations
|
| 116 |
""")
|
| 117 |
|
| 118 |
-
#
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
"
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
)
|
| 132 |
|
| 133 |
-
|
| 134 |
gr.Markdown("""
|
| 135 |
-
###
|
|
|
|
|
|
|
| 136 |
|
| 137 |
-
|
| 138 |
|
| 139 |
**Credits:** Special thanks to the researchers whose work informed this model: Robert Kosara, Edward Segel, Jeffrey Heer, Matthew Conlen, John Maeda, Kennedy Elliott, Scott McCloud, and many others.
|
| 140 |
|
|
@@ -143,19 +308,21 @@ with gr.Blocks(
|
|
| 143 |
**Usage Limits:** This service is limited to 20 queries per day per user to manage costs. Responses are optimized for English.
|
| 144 |
|
| 145 |
<div style="text-align: center; margin-top: 20px; opacity: 0.6; font-size: 0.9em;">
|
| 146 |
-
Embeddings: Jina-CLIP-v2
|
| 147 |
</div>
|
| 148 |
""")
|
| 149 |
|
| 150 |
# Launch configuration
|
| 151 |
if __name__ == "__main__":
|
| 152 |
# Check for required environment variables
|
| 153 |
-
required_vars = ["SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN"]
|
| 154 |
missing_vars = [var for var in required_vars if not os.getenv(var)]
|
| 155 |
|
| 156 |
if missing_vars:
|
| 157 |
print(f"⚠️ Warning: Missing environment variables: {', '.join(missing_vars)}")
|
| 158 |
print("Please set these in your .env file or as environment variables")
|
|
|
|
|
|
|
| 159 |
|
| 160 |
# Launch the app
|
| 161 |
demo.launch(
|
|
|
|
| 3 |
|
| 4 |
A RAG-powered assistant for data visualization guidance, powered by Jina-CLIP-v2
|
| 5 |
embeddings and research from the field of information graphics.
|
| 6 |
+
|
| 7 |
+
Now with Datawrapper integration for chart generation!
|
| 8 |
"""
|
| 9 |
|
| 10 |
import os
|
| 11 |
+
import asyncio
|
| 12 |
+
import pandas as pd
|
| 13 |
import gradio as gr
|
| 14 |
from dotenv import load_dotenv
|
| 15 |
from src.rag_pipeline import create_pipeline
|
| 16 |
+
from src.datawrapper_client import create_and_publish_chart, get_iframe_html
|
| 17 |
from datetime import datetime, timedelta
|
| 18 |
from collections import defaultdict
|
| 19 |
|
|
|
|
| 95 |
yield f"Error generating response: {str(e)}\n\nPlease check your environment variables (HF_TOKEN, SUPABASE_URL, SUPABASE_KEY) and try again."
|
| 96 |
|
| 97 |
|
| 98 |
+
def generate_chart_from_csv(csv_file, user_prompt):
|
| 99 |
+
"""
|
| 100 |
+
Generate a Datawrapper chart from uploaded CSV and user prompt.
|
| 101 |
+
|
| 102 |
+
Args:
|
| 103 |
+
csv_file: Uploaded CSV file
|
| 104 |
+
user_prompt: User's description of the chart
|
| 105 |
+
|
| 106 |
+
Returns:
|
| 107 |
+
HTML string with iframe or error message
|
| 108 |
+
"""
|
| 109 |
+
if not csv_file:
|
| 110 |
+
return "<div style='padding: 50px; text-align: center;'>Please upload a CSV file to generate a chart.</div>"
|
| 111 |
+
|
| 112 |
+
if not user_prompt or user_prompt.strip() == "":
|
| 113 |
+
return "<div style='padding: 50px; text-align: center;'>Please describe what chart you want to create.</div>"
|
| 114 |
+
|
| 115 |
+
try:
|
| 116 |
+
# Loading message markup (built here but not yet displayed)
|
| 117 |
+
loading_html = """
|
| 118 |
+
<div style='padding: 100px; text-align: center;'>
|
| 119 |
+
<h3>🎨 Creating your chart...</h3>
|
| 120 |
+
<p>Analyzing your data and selecting the best visualization...</p>
|
| 121 |
+
</div>
|
| 122 |
+
"""
|
| 123 |
+
|
| 124 |
+
# Read CSV file
|
| 125 |
+
df = pd.read_csv(csv_file)
|
| 126 |
+
|
| 127 |
+
# Create and publish chart (async function, need to run in event loop)
|
| 128 |
+
loop = asyncio.new_event_loop()
|
| 129 |
+
asyncio.set_event_loop(loop)
|
| 130 |
+
result = loop.run_until_complete(
|
| 131 |
+
create_and_publish_chart(df, user_prompt, pipeline)
|
| 132 |
+
)
|
| 133 |
+
loop.close()
|
| 134 |
+
|
| 135 |
+
if result.get("success"):
|
| 136 |
+
# Get the iframe HTML
|
| 137 |
+
iframe_html = get_iframe_html(result.get('public_url'), height=500)
|
| 138 |
+
|
| 139 |
+
# Create HTML with iframe, reasoning, and edit button
|
| 140 |
+
chart_html = f"""
|
| 141 |
+
<div style='padding: 20px;'>
|
| 142 |
+
<!-- Chart iframe -->
|
| 143 |
+
<div style='margin-bottom: 20px;'>
|
| 144 |
+
{iframe_html}
|
| 145 |
+
</div>
|
| 146 |
+
|
| 147 |
+
<!-- Why this chart? -->
|
| 148 |
+
<div style='background: #f9f9f9; padding: 15px; border-radius: 5px; margin-bottom: 15px;'>
|
| 149 |
+
<strong>Why this chart?</strong><br>
|
| 150 |
+
<p style='margin: 10px 0 0 0;'>{result['reasoning']}</p>
|
| 151 |
+
</div>
|
| 152 |
+
|
| 153 |
+
<!-- Edit button -->
|
| 154 |
+
<div>
|
| 155 |
+
<a href="{result['edit_url']}" target="_blank"
|
| 156 |
+
style="display: inline-block; padding: 12px 24px; background: #1976d2; color: white;
|
| 157 |
+
text-decoration: none; border-radius: 5px; font-weight: bold;">
|
| 158 |
+
✏️ Open in Datawrapper
|
| 159 |
+
</a>
|
| 160 |
+
</div>
|
| 161 |
+
</div>
|
| 162 |
+
"""
|
| 163 |
+
|
| 164 |
+
return chart_html
|
| 165 |
+
else:
|
| 166 |
+
error_msg = result.get("error", "Unknown error")
|
| 167 |
+
return f"""
|
| 168 |
+
<div style='padding: 50px; text-align: center; color: red;'>
|
| 169 |
+
<h3>❌ Chart Generation Failed</h3>
|
| 170 |
+
<p>{error_msg}</p>
|
| 171 |
+
<p style='font-size: 0.9em; color: #666;'>Please check your CSV format and try again.</p>
|
| 172 |
+
</div>
|
| 173 |
+
"""
|
| 174 |
+
|
| 175 |
+
except Exception as e:
|
| 176 |
+
return f"""
|
| 177 |
+
<div style='padding: 50px; text-align: center; color: red;'>
|
| 178 |
+
<h3>❌ Error</h3>
|
| 179 |
+
<p>{str(e)}</p>
|
| 180 |
+
<p style='font-size: 0.9em; color: #666;'>Please ensure your CSV is properly formatted and try again.</p>
|
| 181 |
+
</div>
|
| 182 |
+
"""
|
| 183 |
+
|
| 184 |
+
|
| 185 |
+
# Minimal CSS to fix UI artifacts and style the mode selector
|
| 186 |
custom_css = """
|
| 187 |
/* Hide retry/undo buttons that appear as artifacts */
|
| 188 |
.chatbot button[aria-label="Retry"],
|
|
|
|
| 194 |
textarea[data-testid="textbox"] {
|
| 195 |
overflow-y: hidden !important;
|
| 196 |
}
|
| 197 |
+
|
| 198 |
+
/* Mode selector buttons */
|
| 199 |
+
.mode-button {
|
| 200 |
+
font-size: 1.1em;
|
| 201 |
+
padding: 12px 24px;
|
| 202 |
+
margin: 5px;
|
| 203 |
+
}
|
| 204 |
"""
|
| 205 |
|
| 206 |
+
# Create Gradio interface with dual-mode layout
|
| 207 |
with gr.Blocks(
|
| 208 |
title="Viz LLM",
|
| 209 |
css=custom_css
|
|
|
|
| 211 |
gr.Markdown("""
|
| 212 |
# 📊 Viz LLM
|
| 213 |
|
| 214 |
+
Get design recommendations or generate charts with AI-powered data visualization assistance.
|
| 215 |
""")
|
| 216 |
|
| 217 |
+
# Mode selector buttons
|
| 218 |
+
with gr.Row():
|
| 219 |
+
ideation_btn = gr.Button("💡 Ideation Mode", variant="primary", elem_classes="mode-button")
|
| 220 |
+
chart_gen_btn = gr.Button("📊 Chart Generation Mode", variant="secondary", elem_classes="mode-button")
|
| 221 |
+
|
| 222 |
+
# Ideation Mode: Chat interface (shown by default, wrapped in Column)
|
| 223 |
+
with gr.Column(visible=True) as ideation_container:
|
| 224 |
+
ideation_interface = gr.ChatInterface(
|
| 225 |
+
fn=recommend_stream,
|
| 226 |
+
type="messages",
|
| 227 |
+
examples=[
|
| 228 |
+
"What's the best chart type for showing trends over time?",
|
| 229 |
+
"How do I create an effective infographic for complex data?",
|
| 230 |
+
"What are best practices for data visualization accessibility?",
|
| 231 |
+
"How should I design a dashboard for storytelling?",
|
| 232 |
+
"What visualization works best for comparing categories?"
|
| 233 |
+
],
|
| 234 |
+
cache_examples=False,
|
| 235 |
+
api_name="recommend"
|
| 236 |
+
)
|
| 237 |
+
|
| 238 |
+
# Chart Generation Mode: Chart controls and output (hidden by default)
|
| 239 |
+
with gr.Column(visible=False) as chart_gen_container:
|
| 240 |
+
csv_upload = gr.File(
|
| 241 |
+
label="📁 Upload CSV File",
|
| 242 |
+
file_types=[".csv"],
|
| 243 |
+
type="filepath"
|
| 244 |
+
)
|
| 245 |
+
|
| 246 |
+
chart_prompt_input = gr.Textbox(
|
| 247 |
+
label="Describe your chart",
|
| 248 |
+
placeholder="E.g., 'Show sales trends over time' or 'Compare revenue by category'",
|
| 249 |
+
lines=2
|
| 250 |
+
)
|
| 251 |
+
|
| 252 |
+
generate_chart_btn = gr.Button("Generate Chart", variant="primary", size="lg")
|
| 253 |
+
|
| 254 |
+
chart_output = gr.HTML(
|
| 255 |
+
value="<div style='text-align:center; padding:100px; color: #666;'>Upload a CSV file and describe your visualization above, then click Generate Chart.</div>",
|
| 256 |
+
label="Generated Chart"
|
| 257 |
+
)
|
| 258 |
+
|
| 259 |
+
# Mode switching functions
|
| 260 |
+
def switch_to_ideation():
|
| 261 |
+
return [
|
| 262 |
+
gr.update(variant="primary"), # ideation_btn
|
| 263 |
+
gr.update(variant="secondary"), # chart_gen_btn
|
| 264 |
+
gr.update(visible=True), # ideation_container
|
| 265 |
+
gr.update(visible=False), # chart_gen_container
|
| 266 |
+
]
|
| 267 |
+
|
| 268 |
+
def switch_to_chart_gen():
|
| 269 |
+
return [
|
| 270 |
+
gr.update(variant="secondary"), # ideation_btn
|
| 271 |
+
gr.update(variant="primary"), # chart_gen_btn
|
| 272 |
+
gr.update(visible=False), # ideation_container
|
| 273 |
+
gr.update(visible=True), # chart_gen_container
|
| 274 |
+
]
|
| 275 |
+
|
| 276 |
+
# Wire up mode switching
|
| 277 |
+
ideation_btn.click(
|
| 278 |
+
fn=switch_to_ideation,
|
| 279 |
+
inputs=[],
|
| 280 |
+
outputs=[ideation_btn, chart_gen_btn, ideation_container, chart_gen_container]
|
| 281 |
)
|
| 282 |
|
| 283 |
+
chart_gen_btn.click(
|
| 284 |
+
fn=switch_to_chart_gen,
|
| 285 |
+
inputs=[],
|
| 286 |
+
outputs=[ideation_btn, chart_gen_btn, ideation_container, chart_gen_container]
|
| 287 |
+
)
|
| 288 |
+
|
| 289 |
+
# Generate chart when button is clicked
|
| 290 |
+
generate_chart_btn.click(
|
| 291 |
+
fn=generate_chart_from_csv,
|
| 292 |
+
inputs=[csv_upload, chart_prompt_input],
|
| 293 |
+
outputs=[chart_output]
|
| 294 |
+
)
|
| 295 |
+
|
| 296 |
+
# Knowledge base section (below both interfaces)
|
| 297 |
gr.Markdown("""
|
| 298 |
+
### About Viz LLM
|
| 299 |
+
|
| 300 |
+
**Ideation Mode:** Get design recommendations based on research papers, design principles, and examples from the field of information graphics and data visualization.
|
| 301 |
|
| 302 |
+
**Chart Generation Mode:** Upload your CSV data and describe your visualization goal. The AI will analyze your data, select the optimal chart type, and generate a publication-ready chart using Datawrapper.
|
| 303 |
|
| 304 |
**Credits:** Special thanks to the researchers whose work informed this model: Robert Kosara, Edward Segel, Jeffrey Heer, Matthew Conlen, John Maeda, Kennedy Elliott, Scott McCloud, and many others.
|
| 305 |
|
|
|
|
| 308 |
**Usage Limits:** This service is limited to 20 queries per day per user to manage costs. Responses are optimized for English.
|
| 309 |
|
| 310 |
<div style="text-align: center; margin-top: 20px; opacity: 0.6; font-size: 0.9em;">
|
| 311 |
+
Embeddings: Jina-CLIP-v2 | Charts: Datawrapper API
|
| 312 |
</div>
|
| 313 |
""")
|
| 314 |
|
| 315 |
# Launch configuration
|
| 316 |
if __name__ == "__main__":
|
| 317 |
# Check for required environment variables
|
| 318 |
+
required_vars = ["SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN", "DATAWRAPPER_ACCESS_TOKEN"]
|
| 319 |
missing_vars = [var for var in required_vars if not os.getenv(var)]
|
| 320 |
|
| 321 |
if missing_vars:
|
| 322 |
print(f"⚠️ Warning: Missing environment variables: {', '.join(missing_vars)}")
|
| 323 |
print("Please set these in your .env file or as environment variables")
|
| 324 |
+
if "DATAWRAPPER_ACCESS_TOKEN" in missing_vars:
|
| 325 |
+
print("Note: DATAWRAPPER_ACCESS_TOKEN is required for chart generation mode")
|
| 326 |
|
| 327 |
# Launch the app
|
| 328 |
demo.launch(
|
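The `generate_chart_from_csv` handler above creates an event loop by hand and interpolates `result['reasoning']` and error text straight into HTML. A minimal sketch of an alternative, assuming the handler runs in a thread with no running event loop. `fake_create_and_publish` and `run_chart_job` are hypothetical stand-ins for illustration, not functions from this commit:

```python
import asyncio
import html


async def fake_create_and_publish(prompt: str) -> dict:
    """Stand-in for an async chart call; hypothetical, for illustration only."""
    await asyncio.sleep(0)
    return {"success": True, "reasoning": f"<b>{prompt}</b> suits a line chart"}


def run_chart_job(prompt: str) -> str:
    """Run the coroutine from synchronous code and return safe HTML."""
    # asyncio.run creates a fresh event loop and closes it even if the
    # coroutine raises, so no manual loop.close() is needed.
    result = asyncio.run(fake_create_and_publish(prompt))
    if not result.get("success"):
        error = html.escape(str(result.get("error", "Unknown error")))
        return f"<p>{error}</p>"
    # Escape model-generated text before embedding it in markup.
    return f"<p>{html.escape(result['reasoning'])}</p>"
```

Escaping matters here because the reasoning text comes from an LLM and the error text may echo user-supplied CSV content.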
datawrapper_mcp/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
"""A Model Context Protocol server for creating Datawrapper charts."""
|
datawrapper_mcp/config.py
ADDED
|
@@ -0,0 +1,24 @@
|
| 1 |
+
"""Configuration and constants for the Datawrapper MCP server."""
|
| 2 |
+
|
| 3 |
+
from datawrapper import (
|
| 4 |
+
AreaChart,
|
| 5 |
+
ArrowChart,
|
| 6 |
+
BarChart,
|
| 7 |
+
ColumnChart,
|
| 8 |
+
LineChart,
|
| 9 |
+
MultipleColumnChart,
|
| 10 |
+
ScatterPlot,
|
| 11 |
+
StackedBarChart,
|
| 12 |
+
)
|
| 13 |
+
|
| 14 |
+
# Map of chart type names to their Pydantic classes
|
| 15 |
+
CHART_CLASSES = {
|
| 16 |
+
"bar": BarChart,
|
| 17 |
+
"line": LineChart,
|
| 18 |
+
"area": AreaChart,
|
| 19 |
+
"arrow": ArrowChart,
|
| 20 |
+
"column": ColumnChart,
|
| 21 |
+
"multiple_column": MultipleColumnChart,
|
| 22 |
+
"scatter": ScatterPlot,
|
| 23 |
+
"stacked_bar": StackedBarChart,
|
| 24 |
+
}
|
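`CHART_CLASSES` maps short names to chart classes, so callers need to pass an exact key. A minimal sketch of a tolerant lookup with a readable error, using empty stand-in classes in place of the real `datawrapper` models. `resolve_chart_class` is a hypothetical helper and is not part of this commit:

```python
class BarChart:
    """Stand-in for the real chart class; illustration only."""


class LineChart:
    """Stand-in for the real chart class; illustration only."""


class StackedBarChart:
    """Stand-in for the real chart class; illustration only."""


CHART_CLASSES = {
    "bar": BarChart,
    "line": LineChart,
    "stacked_bar": StackedBarChart,
}


def resolve_chart_class(name: str, registry=CHART_CLASSES):
    """Look up a chart class, normalizing case, spaces, and hyphens."""
    key = name.strip().lower().replace("-", "_").replace(" ", "_")
    try:
        return registry[key]
    except KeyError:
        valid = ", ".join(sorted(registry))
        raise ValueError(
            f"Unknown chart type '{name}'. Valid types: {valid}"
        ) from None
```

Listing the valid keys in the error gives an LLM caller something concrete to retry with.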
datawrapper_mcp/handlers/__init__.py
ADDED
|
@@ -0,0 +1,19 @@
|
| 1 |
+
"""Handler functions for MCP tool implementations."""
|
| 2 |
+
|
| 3 |
+
from .create import create_chart
|
| 4 |
+
from .delete import delete_chart
|
| 5 |
+
from .export import export_chart_png
|
| 6 |
+
from .publish import publish_chart
|
| 7 |
+
from .retrieve import get_chart_info
|
| 8 |
+
from .schema import get_chart_schema
|
| 9 |
+
from .update import update_chart
|
| 10 |
+
|
| 11 |
+
__all__ = [
|
| 12 |
+
"create_chart",
|
| 13 |
+
"delete_chart",
|
| 14 |
+
"export_chart_png",
|
| 15 |
+
"get_chart_info",
|
| 16 |
+
"get_chart_schema",
|
| 17 |
+
"publish_chart",
|
| 18 |
+
"update_chart",
|
| 19 |
+
]
|
datawrapper_mcp/handlers/create.py
ADDED
|
@@ -0,0 +1,52 @@
|
| 1 |
+
"""Handler for creating Datawrapper charts."""
|
| 2 |
+
|
| 3 |
+
import json
|
| 4 |
+
|
| 5 |
+
from mcp.types import TextContent
|
| 6 |
+
|
| 7 |
+
from ..config import CHART_CLASSES
|
| 8 |
+
from ..utils import get_api_token, json_to_dataframe
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
async def create_chart(arguments: dict) -> list[TextContent]:
|
| 12 |
+
"""Create a chart with full Pydantic model configuration."""
|
| 13 |
+
api_token = get_api_token()
|
| 14 |
+
|
| 15 |
+
# Convert data to DataFrame
|
| 16 |
+
df = json_to_dataframe(arguments["data"])
|
| 17 |
+
|
| 18 |
+
# Get chart class and validate config
|
| 19 |
+
chart_type = arguments["chart_type"]
|
| 20 |
+
chart_class = CHART_CLASSES[chart_type]
|
| 21 |
+
|
| 22 |
+
# Validate and create chart using Pydantic model
|
| 23 |
+
try:
|
| 24 |
+
chart = chart_class.model_validate(arguments["chart_config"])
|
| 25 |
+
except Exception as e:
|
| 26 |
+
return [
|
| 27 |
+
TextContent(
|
| 28 |
+
type="text",
|
| 29 |
+
text=f"Invalid chart configuration: {str(e)}\n\n"
|
| 30 |
+
f"Use get_chart_schema with chart_type '{chart_type}' "
|
| 31 |
+
f"to see the valid schema.",
|
| 32 |
+
)
|
| 33 |
+
]
|
| 34 |
+
|
| 35 |
+
# Set data on chart instance
|
| 36 |
+
chart.data = df
|
| 37 |
+
|
| 38 |
+
# Create chart using Pydantic instance method
|
| 39 |
+
chart.create(access_token=api_token)
|
| 40 |
+
|
| 41 |
+
result = {
|
| 42 |
+
"chart_id": chart.chart_id,
|
| 43 |
+
"chart_type": chart_type,
|
| 44 |
+
"title": chart.title,
|
| 45 |
+
"edit_url": chart.get_editor_url(),
|
| 46 |
+
"message": (
|
| 47 |
+
f"Chart created successfully! Edit it at: {chart.get_editor_url()}\n"
|
| 48 |
+
f"Use publish_chart with chart_id '{chart.chart_id}' to make it public."
|
| 49 |
+
),
|
| 50 |
+
}
|
| 51 |
+
|
| 52 |
+
return [TextContent(type="text", text=json.dumps(result, indent=2))]
|
datawrapper_mcp/handlers/delete.py
ADDED
|
@@ -0,0 +1,25 @@
+"""Handler for deleting Datawrapper charts."""
+
+import json
+
+from datawrapper import get_chart
+from mcp.types import TextContent
+
+from ..utils import get_api_token
+
+
+async def delete_chart(arguments: dict) -> list[TextContent]:
+    """Delete a chart permanently."""
+    api_token = get_api_token()
+    chart_id = arguments["chart_id"]
+
+    # Get chart and delete using Pydantic instance method
+    chart = get_chart(chart_id, access_token=api_token)
+    chart.delete(access_token=api_token)
+
+    result = {
+        "chart_id": chart_id,
+        "message": "Chart deleted successfully!",
+    }
+
+    return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/export.py
ADDED
|
@@ -0,0 +1,48 @@
+"""Handler for exporting Datawrapper charts."""
+
+import base64
+
+from datawrapper import get_chart
+from mcp.types import ImageContent
+
+from ..utils import get_api_token
+
+
+async def export_chart_png(arguments: dict) -> list[ImageContent]:
+    """Export a chart as PNG and return it as inline image."""
+    api_token = get_api_token()
+    chart_id = arguments["chart_id"]
+
+    # Get chart using factory function
+    chart = get_chart(chart_id, access_token=api_token)
+
+    # Build export parameters
+    export_params = {}
+    if "width" in arguments:
+        export_params["width"] = arguments["width"]
+    if "height" in arguments:
+        export_params["height"] = arguments["height"]
+    if "plain" in arguments:
+        export_params["plain"] = arguments["plain"]
+    if "zoom" in arguments:
+        export_params["zoom"] = arguments["zoom"]
+    if "transparent" in arguments:
+        export_params["transparent"] = arguments["transparent"]
+    if "border_width" in arguments:
+        export_params["borderWidth"] = arguments["border_width"]
+    if "border_color" in arguments:
+        export_params["borderColor"] = arguments["border_color"]
+
+    # Export PNG using Pydantic instance method
+    png_bytes = chart.export_png(access_token=api_token, **export_params)
+
+    # Encode to base64
+    base64_data = base64.b64encode(png_bytes).decode("utf-8")
+
+    return [
+        ImageContent(
+            type="image",
+            data=base64_data,
+            mimeType="image/png",
+        )
+    ]
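Review note: the seven `if ... in arguments` checks in `export_chart_png` all follow one pattern, copying an option only when the caller supplied it and renaming two keys to camelCase. A minimal sketch of the same logic as a table-driven helper follows. The names `EXPORT_KEYS` and `build_export_params` are hypothetical; the key names themselves are taken from the handler above.

```python
# Argument names as exposed by the tool schema, mapped to the keyword names
# that export.py forwards to the export call.
EXPORT_KEYS = {
    "width": "width",
    "height": "height",
    "plain": "plain",
    "zoom": "zoom",
    "transparent": "transparent",
    "border_width": "borderWidth",
    "border_color": "borderColor",
}


def build_export_params(arguments: dict) -> dict:
    """Copy only the export options that the caller actually supplied."""
    return {
        target: arguments[source]
        for source, target in EXPORT_KEYS.items()
        if source in arguments
    }
```

Options the caller omits are left out entirely, so the export call falls back to its own defaults, as in the handler.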
datawrapper_mcp/handlers/publish.py
ADDED
|
@@ -0,0 +1,26 @@
+"""Handler for publishing Datawrapper charts."""
+
+import json
+
+from datawrapper import get_chart
+from mcp.types import TextContent
+
+from ..utils import get_api_token
+
+
+async def publish_chart(arguments: dict) -> list[TextContent]:
+    """Publish a chart to make it publicly accessible."""
+    api_token = get_api_token()
+    chart_id = arguments["chart_id"]
+
+    # Get chart and publish using Pydantic instance method
+    chart = get_chart(chart_id, access_token=api_token)
+    chart.publish(access_token=api_token)
+
+    result = {
+        "chart_id": chart_id,
+        "public_url": chart.get_public_url(),
+        "message": "Chart published successfully!",
+    }
+
+    return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/retrieve.py
ADDED
|
@@ -0,0 +1,27 @@
+"""Handler for retrieving chart information."""
+
+import json
+
+from datawrapper import get_chart
+from mcp.types import TextContent
+
+from ..utils import get_api_token
+
+
+async def get_chart_info(arguments: dict) -> list[TextContent]:
+    """Get information about an existing chart."""
+    api_token = get_api_token()
+    chart_id = arguments["chart_id"]
+
+    # Get chart using factory function
+    chart = get_chart(chart_id, access_token=api_token)
+
+    result = {
+        "chart_id": chart.chart_id,
+        "title": chart.title,
+        "type": chart.chart_type,
+        "public_url": chart.get_public_url(),
+        "edit_url": chart.get_editor_url(),
+    }
+
+    return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/schema.py
ADDED
|
@@ -0,0 +1,31 @@
+"""Handler for retrieving chart schemas."""
+
+import json
+
+from mcp.types import TextContent
+
+from ..config import CHART_CLASSES
+
+
+async def get_chart_schema(arguments: dict) -> list[TextContent]:
+    """Get the Pydantic schema for a chart type."""
+    chart_type = arguments["chart_type"]
+    chart_class = CHART_CLASSES[chart_type]
+
+    schema = chart_class.model_json_schema()
+
+    # Remove examples that contain DataFrames (not JSON serializable)
+    if "examples" in schema:
+        del schema["examples"]
+
+    result = {
+        "chart_type": chart_type,
+        "class_name": chart_class.__name__,
+        "schema": schema,
+        "usage": (
+            "Use this schema to construct a chart_config dict for create_chart. "
+            "The schema shows all available properties, their types, and descriptions."
+        ),
+    }
+
+    return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/update.py
ADDED
|
@@ -0,0 +1,61 @@
+"""Handler for updating Datawrapper charts."""
+
+import json
+
+from datawrapper import get_chart
+from mcp.types import TextContent
+
+from ..utils import get_api_token, json_to_dataframe
+
+
+async def update_chart(arguments: dict) -> list[TextContent]:
+    """Update an existing chart's data or configuration."""
+    api_token = get_api_token()
+    chart_id = arguments["chart_id"]
+
+    # Get chart using factory function - returns correct Pydantic class instance
+    chart = get_chart(chart_id, access_token=api_token)
+
+    # Update data if provided
+    if "data" in arguments:
+        df = json_to_dataframe(arguments["data"])
+        chart.data = df
+
+    # Update config if provided
+    if "chart_config" in arguments:
+        # Directly set attributes on the chart instance
+        # Pydantic will validate each assignment automatically due to validate_assignment=True
+        try:
+            # Build a mapping of aliases to field names
+            alias_to_field = {}
+            for field_name, field_info in chart.model_fields.items():
+                # Add the field name itself
+                alias_to_field[field_name] = field_name
+                # Add any aliases
+                if field_info.alias:
+                    alias_to_field[field_info.alias] = field_name
+
+            for key, value in arguments["chart_config"].items():
+                # Convert alias to field name if needed
+                field_name = alias_to_field.get(key, key)
+                setattr(chart, field_name, value)
+        except Exception as e:
+            return [
+                TextContent(
+                    type="text",
+                    text=f"Invalid chart configuration: {str(e)}\n\n"
+                    f"Use get_chart_schema to see the valid schema for this chart type. "
+                    f"Only high-level Pydantic fields are accepted.",
+                )
+            ]
+
+    # Update using Pydantic instance method
+    chart.update(access_token=api_token)
+
+    result = {
+        "chart_id": chart.chart_id,
+        "message": "Chart updated successfully!",
+        "edit_url": chart.get_editor_url(),
+    }
+
+    return [TextContent(type="text", text=json.dumps(result, indent=2))]
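Review note: the alias handling in `update_chart` can be read in isolation as "accept either a field name or its alias, then write to the field name". A minimal sketch of that step, separated from the chart object, follows. `FieldInfo` here is a hypothetical stand-in for Pydantic's per-field metadata, and the `source-name` alias is invented for illustration; neither is taken from the `datawrapper` package.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldInfo:
    """Hypothetical stand-in for Pydantic's per-field metadata."""

    alias: Optional[str] = None


# Hypothetical field table; in update.py this comes from chart.model_fields.
MODEL_FIELDS = {
    "title": FieldInfo(),
    "source_name": FieldInfo(alias="source-name"),
}


def build_alias_map(model_fields: dict) -> dict:
    """Map every accepted key (field name or alias) to its field name."""
    alias_to_field = {}
    for field_name, field_info in model_fields.items():
        alias_to_field[field_name] = field_name
        if field_info.alias:
            alias_to_field[field_info.alias] = field_name
    return alias_to_field


def normalize_config(config: dict, model_fields: dict) -> dict:
    """Rewrite alias keys to field names; unknown keys pass through unchanged."""
    alias_to_field = build_alias_map(model_fields)
    return {alias_to_field.get(key, key): value for key, value in config.items()}
```

As in the handler, unknown keys are passed through rather than rejected here; in `update_chart` it is the subsequent `setattr` on the Pydantic instance that is expected to raise for them.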
datawrapper_mcp/server.py
ADDED
|
@@ -0,0 +1,101 @@
+"""Main MCP server implementation for Datawrapper chart creation."""
+
+import json
+from typing import Any, Sequence
+
+from mcp.server import Server
+from mcp.types import ImageContent, Resource, TextContent
+from pydantic import AnyUrl
+
+from .config import CHART_CLASSES
+from .handlers import (
+    create_chart,
+    delete_chart,
+    export_chart_png,
+    get_chart_info,
+    get_chart_schema,
+    publish_chart,
+    update_chart,
+)
+from .tools import list_tools as get_tool_list
+
+# Initialize the MCP server
+app = Server("datawrapper-mcp")
+
+
+@app.list_resources()
+async def list_resources() -> list[Resource]:
+    """List available resources."""
+    return [
+        Resource(
+            uri=AnyUrl("datawrapper://chart-types"),
+            name="Available Chart Types",
+            mimeType="application/json",
+            description="List of available Datawrapper chart types and their Pydantic schemas",
+        )
+    ]
+
+
+@app.read_resource()
+async def read_resource(uri: AnyUrl) -> str:
+    """Read a resource by URI."""
+    if str(uri) == "datawrapper://chart-types":
+        chart_info = {}
+        for name, chart_class in CHART_CLASSES.items():
+            chart_info[name] = {
+                "class_name": chart_class.__name__,
+                "schema": chart_class.model_json_schema(),
+            }
+        return json.dumps(chart_info, indent=2)
+
+    raise ValueError(f"Unknown resource URI: {uri}")
+
+
+@app.list_tools()
+async def list_tools():
+    """List available tools."""
+    return await get_tool_list()
+
+
+@app.call_tool()
+async def call_tool(name: str, arguments: Any) -> Sequence[TextContent | ImageContent]:
+    """Handle tool calls."""
+    try:
+        if name == "create_chart":
+            return await create_chart(arguments)
+        elif name == "get_chart_schema":
+            return await get_chart_schema(arguments)
+        elif name == "publish_chart":
+            return await publish_chart(arguments)
+        elif name == "get_chart":
+            return await get_chart_info(arguments)
+        elif name == "update_chart":
+            return await update_chart(arguments)
+        elif name == "delete_chart":
+            return await delete_chart(arguments)
+        elif name == "export_chart_png":
+            return await export_chart_png(arguments)
+        else:
+            raise ValueError(f"Unknown tool: {name}")
+    except Exception as e:
+        return [TextContent(type="text", text=f"Error: {str(e)}")]
+
+
+def main():
+    """Run the MCP server."""
+    import asyncio
+    from mcp.server.stdio import stdio_server
+
+    async def run():
+        async with stdio_server() as (read_stream, write_stream):
+            await app.run(
+                read_stream,
+                write_stream,
+                app.create_initialization_options(),
+            )
+
+    asyncio.run(run())
+
+
+if __name__ == "__main__":
+    main()
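Review note: the `if`/`elif` chain in `call_tool` maps tool names to coroutine handlers one branch at a time. A dictionary lookup is a common alternative; the sketch below shows the idea with the same error-to-text behaviour. The two handlers here are hypothetical stand-ins that return plain strings so the example runs without `mcp` or `datawrapper`; the real handlers return `TextContent` or `ImageContent` lists.

```python
import asyncio


# Hypothetical stand-ins for the real handlers imported in server.py.
async def create_chart(arguments: dict) -> str:
    return f"created {arguments['chart_type']}"


async def delete_chart(arguments: dict) -> str:
    return f"deleted {arguments['chart_id']}"


# Tool name -> coroutine function; adding a tool means adding one entry.
HANDLERS = {
    "create_chart": create_chart,
    "delete_chart": delete_chart,
}


async def call_tool(name: str, arguments: dict) -> str:
    """Look up the handler by name and report any failure as text."""
    try:
        handler = HANDLERS.get(name)
        if handler is None:
            raise ValueError(f"Unknown tool: {name}")
        return await handler(arguments)
    except Exception as e:
        return f"Error: {e}"
```

For example, `asyncio.run(call_tool("create_chart", {"chart_type": "bar"}))` awaits the stand-in handler, while an unregistered name comes back as an `Error: Unknown tool: ...` string rather than an exception, mirroring the behaviour in `server.py`.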
datawrapper_mcp/tools.py
ADDED
|
@@ -0,0 +1,286 @@
+"""Tool definitions for the Datawrapper MCP server."""
+
+from mcp.types import Tool
+
+from .config import CHART_CLASSES
+
+
+async def list_tools() -> list[Tool]:
+    """List available tools."""
+    return [
+        Tool(
+            name="create_chart",
+            description=(
+                "⚠️ THIS IS THE DATAWRAPPER INTEGRATION ⚠️\n"
+                "Use this MCP tool for ALL Datawrapper chart creation.\n\n"
+                "DO NOT:\n"
+                "❌ Install the 'datawrapper' Python package\n"
+                "❌ Use the Datawrapper API directly\n"
+                "❌ Import 'from datawrapper import ...'\n"
+                "❌ Run pip install datawrapper\n\n"
+                "This MCP server IS the complete Datawrapper integration. All Datawrapper operations "
+                "should use the MCP tools provided by this server.\n\n"
+                "---\n\n"
+                "Create a Datawrapper chart with full control using Pydantic models. "
+                "This allows you to specify all chart properties including title, description, "
+                "visualization settings, axes, colors, and more. The chart_config should "
+                "be a complete Pydantic model dict matching the schema for the chosen chart type.\n\n"
+                "STYLING WORKFLOW:\n"
+                "1. Use get_chart_schema to explore all available options for your chart type\n"
+                "2. Refer to https://datawrapper.readthedocs.io/en/latest/ for detailed examples\n"
+                "3. Build your chart_config with the desired styling properties\n\n"
+                "Common styling patterns:\n"
+                '- Colors: {"color_category": {"sales": "#1d81a2", "profit": "#15607a"}}\n'
+                '- Line styling: {"lines": [{"column": "sales", "width": "style1", "interpolation": "curved"}]}\n'
+                '- Axis ranges: {"custom_range_y": [0, 100], "custom_range_x": [2020, 2024]}\n'
+                '- Grid formatting: {"y_grid_format": "0", "x_grid": "on", "y_grid": "on"}\n'
+                '- Tooltips: {"tooltip_number_format": "00.00", "tooltip_x_format": "YYYY"}\n'
+                '- Annotations: {"text_annotations": [{"x": "2023", "y": 50, "text": "Peak"}]}\n\n'
+                "See the documentation for chart-type specific examples and advanced patterns.\n\n"
+                'Example data format: [{"date": "2024-01", "value": 100}, {"date": "2024-02", "value": 150}]'
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "data": {
+                        "type": ["string", "array", "object"],
+                        "description": (
+                            "Chart data. RECOMMENDED: Pass data inline as a list or dict.\n\n"
+                            "PREFERRED FORMATS (use these first):\n\n"
+                            "1. List of records (RECOMMENDED):\n"
+                            ' [{"year": 2020, "sales": 100}, {"year": 2021, "sales": 150}]\n\n'
+                            "2. Dict of arrays:\n"
+                            ' {"year": [2020, 2021], "sales": [100, 150]}\n\n'
+                            "3. JSON string of format 1 or 2:\n"
+                            ' \'[{"year": 2020, "sales": 100}]\'\n\n'
+                            "ALTERNATIVE (only for extremely large datasets where inline data is impractical):\n\n"
+                            "4. File path to CSV or JSON:\n"
+                            ' "/path/to/data.csv" or "/path/to/data.json"\n'
+                            " - Use only when inline data would be too large to pass directly\n"
+                            " - CSV files are read directly\n"
+                            " - JSON files must contain list of dicts or dict of arrays"
+                        ),
+                    },
+                    "chart_type": {
+                        "type": "string",
+                        "enum": list(CHART_CLASSES.keys()),
+                        "description": "Type of chart to create",
+                    },
+                    "chart_config": {
+                        "type": "object",
+                        "description": (
+                            "Complete chart configuration as a Pydantic model dict. "
+                            "Must match the schema for the chosen chart_type. "
+                            "Use get_chart_schema to see the full schema."
+                        ),
+                    },
+                },
+                "required": ["data", "chart_type", "chart_config"],
+            },
+        ),
+        Tool(
+            name="get_chart_schema",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Get the Pydantic JSON schema for a specific chart type. This is your primary tool "
+                "for discovering styling and configuration options.\n\n"
+                "The schema shows:\n"
+                "- All available properties and their types\n"
+                "- Enum values (e.g., line widths, interpolation methods)\n"
+                "- Default values\n"
+                "- Detailed descriptions for each property\n\n"
+                "WORKFLOW: Use this tool first to explore options, then refer to "
+                "https://datawrapper.readthedocs.io/en/latest/ for detailed examples and patterns "
+                "showing how to use these properties in practice."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_type": {
+                        "type": "string",
+                        "enum": list(CHART_CLASSES.keys()),
+                        "description": "Chart type to get schema for",
+                    },
+                },
+                "required": ["chart_type"],
+            },
+        ),
+        Tool(
+            name="publish_chart",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Publish a Datawrapper chart to make it publicly accessible. "
+                "Returns the public URL of the published chart. "
+                "IMPORTANT: Only use this tool when the user explicitly requests to publish the chart. "
+                "Do not automatically publish charts after creation unless specifically asked."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_id": {
+                        "type": "string",
+                        "description": "ID of the chart to publish",
+                    },
+                },
+                "required": ["chart_id"],
+            },
+        ),
+        Tool(
+            name="get_chart",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Get information about an existing Datawrapper chart: "
+                "its ID, title, chart type, public URL, and editor URL."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_id": {
+                        "type": "string",
+                        "description": "ID of the chart to retrieve",
+                    },
+                },
+                "required": ["chart_id"],
+            },
+        ),
+        Tool(
+            name="update_chart",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Update an existing Datawrapper chart's data or configuration using Pydantic models. "
+                "IMPORTANT: The chart_config must use high-level Pydantic fields only (title, intro, "
+                "byline, source_name, source_url, etc.). Do NOT use low-level serialized structures "
+                "like 'metadata', 'visualize', or other internal API fields.\n\n"
+                "STYLING UPDATES:\n"
+                "Use get_chart_schema to see available fields, then apply styling changes:\n"
+                '- Colors: {"color_category": {"sales": "#ff0000"}}\n'
+                '- Line properties: {"lines": [{"column": "sales", "width": "style2"}]}\n'
+                '- Axis settings: {"custom_range_y": [0, 200], "y_grid_format": "0,0"}\n'
+                '- Tooltips: {"tooltip_number_format": "0.0"}\n\n'
+                "See https://datawrapper.readthedocs.io/en/latest/ for detailed examples. "
+                "The provided config will be validated through Pydantic and merged with the existing "
+                "chart configuration."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_id": {
+                        "type": "string",
+                        "description": "ID of the chart to update",
+                    },
+                    "data": {
+                        "type": ["string", "array", "object"],
+                        "description": (
+                            "Chart data. RECOMMENDED: Pass data inline as a list or dict.\n\n"
+                            "PREFERRED FORMATS (use these first):\n\n"
+                            "1. List of records (RECOMMENDED):\n"
+                            ' [{"year": 2020, "sales": 100}, {"year": 2021, "sales": 150}]\n\n'
+                            "2. Dict of arrays:\n"
+                            ' {"year": [2020, 2021], "sales": [100, 150]}\n\n'
+                            "3. JSON string of format 1 or 2:\n"
+                            ' \'[{"year": 2020, "sales": 100}]\'\n\n'
+                            "ALTERNATIVE (only for extremely large datasets where inline data is impractical):\n\n"
+                            "4. File path to CSV or JSON:\n"
+                            ' "/path/to/data.csv" or "/path/to/data.json"\n'
+                            " - Use only when inline data would be too large to pass directly\n"
+                            " - CSV files are read directly\n"
+                            " - JSON files must contain list of dicts or dict of arrays"
+                        ),
+                    },
+                    "chart_config": {
+                        "type": "object",
+                        "description": (
+                            "Updated chart configuration using high-level Pydantic fields (optional). "
+                            "Must use Pydantic model fields like 'title', 'intro', 'byline', etc. "
+                            "Do NOT use raw API structures like 'metadata' or 'visualize'. "
+                            "Use get_chart_schema to see valid fields. Will be validated and merged "
+                            "with existing config."
+                        ),
+                    },
+                },
+                "required": ["chart_id"],
+            },
+        ),
+        Tool(
+            name="delete_chart",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Delete a Datawrapper chart permanently."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_id": {
+                        "type": "string",
+                        "description": "ID of the chart to delete",
+                    },
+                },
+                "required": ["chart_id"],
+            },
+        ),
+        Tool(
+            name="export_chart_png",
+            description=(
+                "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
+                "This is part of the Datawrapper MCP server integration.\n\n"
+                "---\n\n"
+                "Export a Datawrapper chart as PNG and display it inline. "
+                "The chart must be created first using create_chart. "
+                "Supports high-resolution output via the zoom parameter. "
+                "IMPORTANT: Only use this tool when the user explicitly requests to see the chart image "
+                "or export it as PNG. Do not automatically export charts after creation unless specifically asked."
+            ),
+            inputSchema={
+                "type": "object",
+                "properties": {
+                    "chart_id": {
+                        "type": "string",
+                        "description": "ID of the chart to export",
+                    },
+                    "width": {
+                        "type": "integer",
+                        "description": "Width of the image in pixels (optional, uses chart width if not specified)",
+                    },
+                    "height": {
+                        "type": "integer",
+                        "description": "Height of the image in pixels (optional, uses chart height if not specified)",
+                    },
+                    "plain": {
+                        "type": "boolean",
+                        "description": "If true, exports only the visualization without header/footer (default: false)",
+                        "default": False,
+                    },
+                    "zoom": {
+                        "type": "integer",
+                        "description": "Scale multiplier for resolution, e.g., 2 = 2x resolution (default: 2)",
+                        "default": 2,
+                    },
+                    "transparent": {
+                        "type": "boolean",
+                        "description": "If true, exports with transparent background (default: false)",
+                        "default": False,
+                    },
+                    "border_width": {
+                        "type": "integer",
+                        "description": "Margin around visualization in pixels (default: 0)",
+                        "default": 0,
+                    },
+                    "border_color": {
+                        "type": "string",
+                        "description": "Color of the border, e.g., '#FFFFFF' (optional, uses chart background if not specified)",
+                    },
+                },
+                "required": ["chart_id"],
+            },
+        ),
+    ]
datawrapper_mcp/utils.py
ADDED
|
@@ -0,0 +1,118 @@
+"""Utility functions for the Datawrapper MCP server."""
+
+import json
+import os
+
+import pandas as pd
+
+
+def get_api_token() -> str:
+    """Get the Datawrapper API token from environment."""
+    api_token = os.environ.get("DATAWRAPPER_ACCESS_TOKEN")
+    if not api_token:
+        raise ValueError(
+            "DATAWRAPPER_ACCESS_TOKEN environment variable is required. "
+            "Get your token from https://app.datawrapper.de/account/api-tokens"
+        )
+    return api_token
+
+
+def json_to_dataframe(data: str | list | dict) -> pd.DataFrame:
+    """Convert JSON data to a pandas DataFrame.
+
+    Args:
+        data: One of:
+            - File path to CSV or JSON file (e.g., "/path/to/data.csv")
+            - List of records: [{"col1": val1, "col2": val2}, ...]
+            - Dict of arrays: {"col1": [val1, val2], "col2": [val3, val4]}
+            - JSON string in either format above
+
+    Returns:
+        pandas DataFrame
+
+    Examples:
+        >>> json_to_dataframe("/tmp/data.csv")
+        >>> json_to_dataframe("/tmp/data.json")
+        >>> json_to_dataframe([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
+        >>> json_to_dataframe({"a": [1, 3], "b": [2, 4]})
+        >>> json_to_dataframe('[{"a": 1, "b": 2}]')
+    """
+    if isinstance(data, str):
+        # Check if it's a file path that exists
+        if os.path.isfile(data):
+            if data.endswith(".csv"):
+                return pd.read_csv(data)
+            elif data.endswith(".json"):
+                with open(data) as f:
+                    file_data = json.load(f)
+                # Recursively process the loaded JSON data
+                return json_to_dataframe(file_data)
+            else:
+                raise ValueError(
+                    f"Unsupported file type: {data}\n\n"
+                    "Supported file types:\n"
+                    " - .csv (CSV files)\n"
+                    " - .json (JSON files containing list of dicts or dict of arrays)"
+                )
+
+        # Check if it looks like CSV content (not a file path)
+        if "\n" in data and "," in data and not data.strip().startswith(("[", "{")):
+            raise ValueError(
+                "CSV strings are not supported. Please save to a file first.\n\n"
+                "Options:\n"
+                " 1. Save CSV to a file and pass the file path\n"
+                ' 2. Parse CSV to list of dicts: [{"col": val}, ...]\n'
+                ' 3. Parse CSV to dict of arrays: {"col": [vals]}\n\n'
+                "Example:\n"
+                ' data = [{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]'
+            )
+
+        # Try to parse as JSON string
+        try:
+            data = json.loads(data)
+        except json.JSONDecodeError as e:
|
| 74 |
+
raise ValueError(
|
| 75 |
+
f"Invalid JSON string: {e}\n\n"
|
| 76 |
+
"Expected one of:\n"
|
| 77 |
+
" 1. File path: '/path/to/data.csv' or '/path/to/data.json'\n"
|
| 78 |
+
' 2. JSON string: \'[{"year": 2020, "value": 100}, ...]\'\n'
|
| 79 |
+
' 3. JSON string: \'{"year": [2020, 2021], "value": [100, 150]}\''
|
| 80 |
+
)
|
| 81 |
+
|
| 82 |
+
if isinstance(data, list):
|
| 83 |
+
if not data:
|
| 84 |
+
raise ValueError(
|
| 85 |
+
"Data list is empty. Please provide at least one row of data."
|
| 86 |
+
)
|
| 87 |
+
if not all(isinstance(item, dict) for item in data):
|
| 88 |
+
raise ValueError(
|
| 89 |
+
"List format must contain dictionaries.\n\n"
|
| 90 |
+
"Expected format:\n"
|
| 91 |
+
' [{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]\n\n'
|
| 92 |
+
f"Got: {type(data[0]).__name__} in list"
|
| 93 |
+
)
|
| 94 |
+
# List of records: [{"col1": val1, "col2": val2}, ...]
|
| 95 |
+
return pd.DataFrame(data)
|
| 96 |
+
elif isinstance(data, dict):
|
| 97 |
+
if not data:
|
| 98 |
+
raise ValueError(
|
| 99 |
+
"Data dict is empty. Please provide at least one column of data."
|
| 100 |
+
)
|
| 101 |
+
# Check if it's a dict of arrays (all values should be lists)
|
| 102 |
+
if not all(isinstance(v, list) for v in data.values()):
|
| 103 |
+
raise ValueError(
|
| 104 |
+
"Dict format must have lists as values.\n\n"
|
| 105 |
+
"Expected format:\n"
|
| 106 |
+
' {"year": [2020, 2021], "value": [100, 150]}\n\n'
|
| 107 |
+
f"Got dict with values of type: {[type(v).__name__ for v in data.values()]}"
|
| 108 |
+
)
|
| 109 |
+
# Dict of arrays: {"col1": [val1, val2], "col2": [val3, val4]}
|
| 110 |
+
return pd.DataFrame(data)
|
| 111 |
+
else:
|
| 112 |
+
raise ValueError(
|
| 113 |
+
f"Unsupported data type: {type(data).__name__}\n\n"
|
| 114 |
+
"Data must be one of:\n"
|
| 115 |
+
' 1. List of dicts: [{"year": 2020, "value": 100}, ...]\n'
|
| 116 |
+
' 2. Dict of arrays: {"year": [2020, 2021], "value": [100, 150]}\n'
|
| 117 |
+
" 3. JSON string in either format above"
|
| 118 |
+
)
|
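Review note: the CSV-content guard in `json_to_dataframe` is a heuristic, so it may help to see what it accepts and rejects. The sketch below isolates the same condition in a hypothetical helper, `looks_like_csv` (that name is not part of this commit and is used only for illustration).

```python
def looks_like_csv(text: str) -> bool:
    """Mirror of the guard in json_to_dataframe: multi-line, comma-separated,
    and not starting with a JSON opener. (Hypothetical helper name.)"""
    return (
        "\n" in text
        and "," in text
        and not text.strip().startswith(("[", "{"))
    )


# Multi-line CSV content is flagged, so the caller gets the "save to a file" hint.
print(looks_like_csv("year,value\n2020,100"))

# Pretty-printed JSON also contains newlines and commas, but starts with "[".
print(looks_like_csv('[{"a": 1,\n "b": 2}]'))

# A single-line CSV string has no newline, so it falls through to json.loads
# and is reported as an invalid JSON string instead.
print(looks_like_csv("year,value"))
```

One consequence of this design is that a one-line CSV string is not recognised as CSV and produces the "Invalid JSON string" error instead of the CSV-specific message.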
requirements.txt
CHANGED
|
@@ -12,3 +12,8 @@ python-dotenv>=1.0.0
|
| 12 |
|
| 13 |
# Utilities
|
| 14 |
pydantic>=2.0.0
|
| 15 |
+
|
| 16 |
+
# Datawrapper chart creation
|
| 17 |
+
datawrapper>=2.0.7
|
| 18 |
+
mcp>=1.20.0
|
| 19 |
+
pandas>=2.0.0
|
src/datawrapper_client.py
ADDED
|
@@ -0,0 +1,336 @@
|
| 1 |
+
"""
|
| 2 |
+
Datawrapper Chart Generation Client
|
| 3 |
+
|
| 4 |
+
Integrates RAG pipeline with Datawrapper API for intelligent chart creation.
|
| 5 |
+
"""
|
| 6 |
+
|
| 7 |
+
import json
|
| 8 |
+
import os
|
| 9 |
+
from typing import Optional, Tuple
|
| 10 |
+
import pandas as pd
|
| 11 |
+
|
| 12 |
+
from .prompts import (
|
| 13 |
+
CHART_SELECTION_SYSTEM_PROMPT,
|
| 14 |
+
get_chart_selection_prompt,
|
| 15 |
+
get_chart_styling_prompt
|
| 16 |
+
)
|
| 17 |
+
from .llm_client import create_llm_client
|
| 18 |
+
from .rag_pipeline import GraphicsDesignPipeline
|
| 19 |
+
|
| 20 |
+
# Import Datawrapper MCP handlers directly
|
| 21 |
+
from datawrapper_mcp.handlers.create import create_chart as mcp_create_chart
|
| 22 |
+
from datawrapper_mcp.handlers.publish import publish_chart as mcp_publish_chart
|
| 23 |
+
from datawrapper_mcp.handlers.retrieve import get_chart_info as mcp_get_chart_info
|
| 24 |
+
|
| 25 |
+
|
| 26 |
+
def get_data_summary(df: pd.DataFrame) -> str:
|
| 27 |
+
"""
|
| 28 |
+
Generate a summary of the DataFrame structure and content.
|
| 29 |
+
|
| 30 |
+
Args:
|
| 31 |
+
df: Input DataFrame
|
| 32 |
+
|
| 33 |
+
Returns:
|
| 34 |
+
String summary of data characteristics
|
| 35 |
+
"""
|
| 36 |
+
summary_parts = []
|
| 37 |
+
|
| 38 |
+
# Basic info
|
| 39 |
+
summary_parts.append(f"Rows: {len(df)}, Columns: {len(df.columns)}")
|
| 40 |
+
summary_parts.append(f"Column names: {', '.join(map(str, df.columns))}")
|
| 41 |
+
|
| 42 |
+
# Column types
|
| 43 |
+
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()
|
| 44 |
+
text_cols = df.select_dtypes(include=['object']).columns.tolist()
|
| 45 |
+
date_cols = df.select_dtypes(include=['datetime']).columns.tolist()
|
| 46 |
+
|
| 47 |
+
if numeric_cols:
|
| 48 |
+
summary_parts.append(f"Numeric columns: {', '.join(map(str, numeric_cols))}")
|
| 49 |
+
if text_cols:
|
| 50 |
+
summary_parts.append(f"Text columns: {', '.join(map(str, text_cols))}")
|
| 51 |
+
if date_cols:
|
| 52 |
+
summary_parts.append(f"Date columns: {', '.join(map(str, date_cols))}")
|
| 53 |
+
|
| 54 |
+
# Data preview (first 3 rows)
|
| 55 |
+
summary_parts.append(f"\nData preview:\n{df.head(3).to_string()}")
|
| 56 |
+
|
| 57 |
+
return "\n".join(summary_parts)
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
def analyze_csv_for_chart_type(
|
| 61 |
+
df: pd.DataFrame,
|
| 62 |
+
user_prompt: str,
|
| 63 |
+
rag_pipeline: GraphicsDesignPipeline
|
| 64 |
+
) -> Tuple[str, str]:
|
| 65 |
+
"""
|
| 66 |
+
Use RAG and LLM to determine the best chart type for the data.
|
| 67 |
+
|
| 68 |
+
Args:
|
| 69 |
+
df: Input DataFrame
|
| 70 |
+
user_prompt: User's description of what they want to visualize
|
| 71 |
+
rag_pipeline: RAG pipeline for retrieving best practices
|
| 72 |
+
|
| 73 |
+
Returns:
|
| 74 |
+
Tuple of (chart_type, reasoning)
|
| 75 |
+
"""
|
| 76 |
+
# Get data summary
|
| 77 |
+
data_summary = get_data_summary(df)
|
| 78 |
+
|
| 79 |
+
# Query RAG for chart selection best practices
|
| 80 |
+
rag_query = f"chart type selection for {user_prompt}"
|
| 81 |
+
relevant_docs = rag_pipeline.retrieve_documents(rag_query, k=3)
|
| 82 |
+
rag_context = rag_pipeline.vectorstore.format_documents_for_context(relevant_docs)
|
| 83 |
+
|
| 84 |
+
# Generate chart type recommendation using LLM
|
| 85 |
+
chart_prompt = get_chart_selection_prompt()
|
| 86 |
+
full_prompt = chart_prompt.format(
|
| 87 |
+
user_prompt=user_prompt,
|
| 88 |
+
data_summary=data_summary,
|
| 89 |
+
rag_context=rag_context
|
| 90 |
+
)
|
| 91 |
+
|
| 92 |
+
llm_client = create_llm_client(
|
| 93 |
+
model=os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
|
| 94 |
+
temperature=0.3, # Lower temperature for more deterministic chart selection
|
| 95 |
+
max_tokens=500
|
| 96 |
+
)
|
| 97 |
+
|
| 98 |
+
response = llm_client.generate(
|
| 99 |
+
prompt=full_prompt,
|
| 100 |
+
system_prompt=CHART_SELECTION_SYSTEM_PROMPT
|
| 101 |
+
)
|
| 102 |
+
|
| 103 |
+
# Parse JSON response
|
| 104 |
+
try:
|
| 105 |
+
# Extract JSON from response (handle markdown code blocks)
|
| 106 |
+
response_clean = response.strip()
|
| 107 |
+
if "```json" in response_clean:
|
| 108 |
+
response_clean = response_clean.split("```json")[1].split("```")[0].strip()
|
| 109 |
+
elif "```" in response_clean:
|
| 110 |
+
response_clean = response_clean.split("```")[1].split("```")[0].strip()
|
| 111 |
+
|
| 112 |
+
result = json.loads(response_clean)
|
| 113 |
+
chart_type = result.get("chart_type", "line")
|
| 114 |
+
reasoning = result.get("reasoning", "")
|
| 115 |
+
|
| 116 |
+
# Validate chart type
|
| 117 |
+
valid_types = ["bar", "line", "area", "scatter", "column", "stacked_bar", "arrow", "multiple_column"]
|
| 118 |
+
if chart_type not in valid_types:
|
| 119 |
+
chart_type = "line" # Default fallback
|
| 120 |
+
|
| 121 |
+
return chart_type, reasoning
|
| 122 |
+
except Exception as e:
|
| 123 |
+
print(f"Error parsing chart type response: {e}")
|
| 124 |
+
print(f"Response was: {response}")
|
| 125 |
+
# Default to line chart
|
| 126 |
+
return "line", "Using default line chart due to parsing error"
|
| 127 |
+
|
| 128 |
+
|
| 129 |
+
def generate_chart_config(
|
| 130 |
+
chart_type: str,
|
| 131 |
+
df: pd.DataFrame,
|
| 132 |
+
user_prompt: str,
|
| 133 |
+
rag_pipeline: GraphicsDesignPipeline
|
| 134 |
+
) -> dict:
|
| 135 |
+
"""
|
| 136 |
+
Generate Datawrapper chart configuration using RAG and LLM.
|
| 137 |
+
|
| 138 |
+
Args:
|
| 139 |
+
chart_type: Type of chart to create
|
| 140 |
+
df: Input DataFrame
|
| 141 |
+
user_prompt: User's visualization request
|
| 142 |
+
rag_pipeline: RAG pipeline for retrieving design best practices
|
| 143 |
+
|
| 144 |
+
Returns:
|
| 145 |
+
Dictionary with chart configuration
|
| 146 |
+
"""
|
| 147 |
+
# Get data summary
|
| 148 |
+
data_summary = get_data_summary(df)
|
| 149 |
+
|
| 150 |
+
# Query RAG for styling and design best practices
|
| 151 |
+
rag_query = f"chart design best practices colors accessibility {chart_type}"
|
| 152 |
+
relevant_docs = rag_pipeline.retrieve_documents(rag_query, k=3)
|
| 153 |
+
rag_context = rag_pipeline.vectorstore.format_documents_for_context(relevant_docs)
|
| 154 |
+
|
| 155 |
+
# Generate chart configuration using LLM
|
| 156 |
+
styling_prompt = get_chart_styling_prompt()
|
| 157 |
+
full_prompt = styling_prompt.format(
|
| 158 |
+
chart_type=chart_type,
|
| 159 |
+
user_prompt=user_prompt,
|
| 160 |
+
data_summary=data_summary,
|
| 161 |
+
rag_context=rag_context
|
| 162 |
+
)
|
| 163 |
+
|
| 164 |
+
llm_client = create_llm_client(
|
| 165 |
+
model=os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
|
| 166 |
+
temperature=0.5,
|
| 167 |
+
max_tokens=800
|
| 168 |
+
)
|
| 169 |
+
|
| 170 |
+
response = llm_client.generate(
|
| 171 |
+
prompt=full_prompt,
|
| 172 |
+
system_prompt="You are a data visualization expert. Generate valid JSON configuration for Datawrapper charts."
|
| 173 |
+
)
|
| 174 |
+
|
| 175 |
+
# Parse JSON response
|
| 176 |
+
try:
|
| 177 |
+
# Extract JSON from response
|
| 178 |
+
response_clean = response.strip()
|
| 179 |
+
if "```json" in response_clean:
|
| 180 |
+
response_clean = response_clean.split("```json")[1].split("```")[0].strip()
|
| 181 |
+
elif "```" in response_clean:
|
| 182 |
+
response_clean = response_clean.split("```")[1].split("```")[0].strip()
|
| 183 |
+
|
| 184 |
+
config = json.loads(response_clean)
|
| 185 |
+
|
| 186 |
+
# Ensure basic required fields
|
| 187 |
+
if "title" not in config:
|
| 188 |
+
config["title"] = user_prompt[:100] # Use prompt as fallback title
|
| 189 |
+
|
| 190 |
+
return config
|
| 191 |
+
except Exception as e:
|
| 192 |
+
print(f"Error parsing chart config: {e}")
|
| 193 |
+
print(f"Response was: {response}")
|
| 194 |
+
# Return minimal config
|
| 195 |
+
return {
|
| 196 |
+
"title": user_prompt[:100] if user_prompt else "Data Visualization",
|
| 197 |
+
"source_name": "User Data"
|
| 198 |
+
}
|
| 199 |
+
|
| 200 |
+
|
| 201 |
+
async def create_and_publish_chart(
|
| 202 |
+
df: pd.DataFrame,
|
| 203 |
+
user_prompt: str,
|
| 204 |
+
rag_pipeline: GraphicsDesignPipeline,
|
| 205 |
+
api_token: Optional[str] = None
|
| 206 |
+
) -> dict:
|
| 207 |
+
"""
|
| 208 |
+
Complete workflow: analyze data, select chart type, create and publish chart.
|
| 209 |
+
|
| 210 |
+
Args:
|
| 211 |
+
df: Input DataFrame
|
| 212 |
+
user_prompt: User's visualization request
|
| 213 |
+
rag_pipeline: RAG pipeline instance
|
| 214 |
+
api_token: Datawrapper API token (defaults to env var)
|
| 215 |
+
|
| 216 |
+
Returns:
|
| 217 |
+
Dictionary with chart info including iframe URL
|
| 218 |
+
"""
|
| 219 |
+
if api_token is None:
|
| 220 |
+
api_token = os.getenv("DATAWRAPPER_ACCESS_TOKEN")
|
| 221 |
+
if not api_token:
|
| 222 |
+
raise ValueError("DATAWRAPPER_ACCESS_TOKEN not found in environment")
|
| 223 |
+
|
| 224 |
+
try:
|
| 225 |
+
# Step 1: Analyze data and select chart type
|
| 226 |
+
chart_type, reasoning = analyze_csv_for_chart_type(df, user_prompt, rag_pipeline)
|
| 227 |
+
|
| 228 |
+
# Step 2: Generate chart configuration
|
| 229 |
+
chart_config = generate_chart_config(chart_type, df, user_prompt, rag_pipeline)
|
| 230 |
+
|
| 231 |
+
# Step 3: Convert DataFrame to list of dicts for Datawrapper
|
| 232 |
+
data_list = df.to_dict('records')
|
| 233 |
+
|
| 234 |
+
# Step 4: Create chart using MCP handler
|
| 235 |
+
create_args = {
|
| 236 |
+
"data": data_list,
|
| 237 |
+
"chart_type": chart_type,
|
| 238 |
+
"chart_config": chart_config
|
| 239 |
+
}
|
| 240 |
+
|
| 241 |
+
create_result = await mcp_create_chart(create_args)
|
| 242 |
+
|
| 243 |
+
if not create_result or len(create_result) == 0:
|
| 244 |
+
raise ValueError("Empty response from chart creation")
|
| 245 |
+
|
| 246 |
+
result_text = create_result[0].text
|
| 247 |
+
|
| 248 |
+
if not result_text or result_text.strip() == "":
|
| 249 |
+
raise ValueError("Empty text in chart creation response")
|
| 250 |
+
|
| 251 |
+
result_data = json.loads(result_text)
|
| 252 |
+
|
| 253 |
+
chart_id = result_data.get("chart_id")
|
| 254 |
+
if not chart_id:
|
| 255 |
+
raise ValueError(f"Failed to get chart_id from creation response. Response was: {result_data}")
|
| 256 |
+
|
| 257 |
+
# Step 5: Try to publish chart using MCP handler
|
| 258 |
+
publish_success = False
|
| 259 |
+
publish_message = ""
|
| 260 |
+
try:
|
| 261 |
+
publish_args = {"chart_id": chart_id}
|
| 262 |
+
publish_result = await mcp_publish_chart(publish_args)
|
| 263 |
+
publish_text = publish_result[0].text
|
| 264 |
+
publish_data = json.loads(publish_text)
|
| 265 |
+
publish_success = True
|
| 266 |
+
publish_message = publish_data.get("message", "Published successfully")
|
| 267 |
+
except Exception as publish_error:
|
| 268 |
+
publish_message = f"Publish failed: {str(publish_error)}"
|
| 269 |
+
|
| 270 |
+
# Step 6: Get full chart info using MCP handler
|
| 271 |
+
chart_info_args = {"chart_id": chart_id}
|
| 272 |
+
chart_info_result = await mcp_get_chart_info(chart_info_args)
|
| 273 |
+
chart_info_text = chart_info_result[0].text
|
| 274 |
+
chart_info = json.loads(chart_info_text)
|
| 275 |
+
|
| 276 |
+
# Return complete info
|
| 277 |
+
return {
|
| 278 |
+
"success": True,
|
| 279 |
+
"chart_id": chart_id,
|
| 280 |
+
"chart_type": chart_type,
|
| 281 |
+
"reasoning": reasoning,
|
| 282 |
+
"public_url": chart_info.get("public_url"),
|
| 283 |
+
"edit_url": chart_info.get("edit_url"),
|
| 284 |
+
"published": publish_success,
|
| 285 |
+
"publish_message": publish_message,
|
| 286 |
+
"title": chart_config.get("title", "Chart")
|
| 287 |
+
}
|
| 288 |
+
|
| 289 |
+
except json.JSONDecodeError as e:
|
| 290 |
+
error_msg = f"JSON parsing error: {str(e)}"
|
| 291 |
+
print(f"Error in chart creation: {error_msg}")
|
| 292 |
+
print(f"Failed to parse: {result_text if 'result_text' in locals() else 'N/A'}")
|
| 293 |
+
return {
|
| 294 |
+
"success": False,
|
| 295 |
+
"error": error_msg,
|
| 296 |
+
"chart_type": chart_type if 'chart_type' in locals() else None,
|
| 297 |
+
"public_url": None
|
| 298 |
+
}
|
| 299 |
+
except Exception as e:
|
| 300 |
+
error_msg = f"{type(e).__name__}: {str(e)}"
|
| 301 |
+
print(f"Error in chart creation: {error_msg}")
|
| 302 |
+
import traceback
|
| 303 |
+
traceback.print_exc()
|
| 304 |
+
return {
|
| 305 |
+
"success": False,
|
| 306 |
+
"error": error_msg,
|
| 307 |
+
"chart_type": chart_type if 'chart_type' in locals() else None,
|
| 308 |
+
"public_url": None
|
| 309 |
+
}
|
| 310 |
+
|
| 311 |
+
|
| 312 |
+
def get_iframe_html(chart_url: str, height: int = 600) -> str:
|
| 313 |
+
"""
|
| 314 |
+
Generate iframe HTML for embedding a Datawrapper chart.
|
| 315 |
+
|
| 316 |
+
Args:
|
| 317 |
+
chart_url: Public URL of the chart
|
| 318 |
+
height: Height of iframe in pixels
|
| 319 |
+
|
| 320 |
+
Returns:
|
| 321 |
+
HTML string with iframe
|
| 322 |
+
"""
|
| 323 |
+
if not chart_url:
|
| 324 |
+
return "<div style='padding: 50px; text-align: center;'>No chart available</div>"
|
| 325 |
+
|
| 326 |
+
return f"""
|
| 327 |
+
<div style="width: 100%; height: {height}px;">
|
| 328 |
+
<iframe
|
| 329 |
+
src="{chart_url}"
|
| 330 |
+
style="width: 100%; height: 100%; border: none;"
|
| 331 |
+
frameborder="0"
|
| 332 |
+
scrolling="no"
|
| 333 |
+
aria-label="Chart">
|
| 334 |
+
</iframe>
|
| 335 |
+
</div>
|
| 336 |
+
"""
|
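Review note: `analyze_csv_for_chart_type` and `generate_chart_config` repeat the same steps to pull JSON out of an LLM reply that may be wrapped in a Markdown code fence. The sketch below shows that logic factored into one place, together with the chart-type fallback. The helper names `extract_json` and `pick_chart_type` are hypothetical and not part of this commit; the fence marker is built at runtime only so the example nests cleanly inside documentation.

```python
import json

FENCE = "`" * 3  # three backticks, assembled at runtime

# Same list as valid_types in analyze_csv_for_chart_type.
VALID_TYPES = ("bar", "line", "area", "scatter", "column",
               "stacked_bar", "arrow", "multiple_column")


def extract_json(response: str):
    """Parse JSON from a reply, unwrapping a fenced block if one is present."""
    text = response.strip()
    if FENCE + "json" in text:
        text = text.split(FENCE + "json")[1].split(FENCE)[0].strip()
    elif FENCE in text:
        text = text.split(FENCE)[1].split(FENCE)[0].strip()
    return json.loads(text)


def pick_chart_type(response: str, default: str = "line") -> str:
    """Return a validated chart type, falling back to `default` on any problem."""
    try:
        chart_type = extract_json(response).get("chart_type", default)
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all, or JSON that is not an object (e.g. a list).
        return default
    return chart_type if chart_type in VALID_TYPES else default


reply = "Sure:\n" + FENCE + 'json\n{"chart_type": "bar", "reasoning": "categories"}\n' + FENCE
print(pick_chart_type(reply))                    # fenced reply with a valid type
print(pick_chart_type('{"chart_type": "pie"}'))  # type not in the allowed list
print(pick_chart_type("not json"))               # unparseable reply
```

Catching only the parsing-related exceptions, rather than a bare `Exception`, is a design choice of this sketch; the committed code catches everything and logs the raw response.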
src/prompts.py
CHANGED
|
@@ -126,3 +126,110 @@ def get_followup_prompt() -> SimplePromptTemplate:
|
| 126 |
def get_technique_recommendation_prompt() -> SimplePromptTemplate:
|
| 127 |
"""Get the technique recommendation prompt template"""
|
| 128 |
return TECHNIQUE_RECOMMENDATION_PROMPT
|
| 129 |
+
|
| 130 |
+
|
| 131 |
+
# =============================================================================
|
| 132 |
+
# CHART GENERATION PROMPTS (for Datawrapper integration)
|
| 133 |
+
# =============================================================================
|
| 134 |
+
|
| 135 |
+
CHART_SELECTION_SYSTEM_PROMPT = """You are an expert data visualization advisor specialized in selecting the optimal chart type for data storytelling.
|
| 136 |
+
|
| 137 |
+
Your task is to analyze:
|
| 138 |
+
1. The user's intent and goal (what story they want to tell)
|
| 139 |
+
2. The structure and characteristics of their data
|
| 140 |
+
3. Best practices from visualization research
|
| 141 |
+
|
| 142 |
+
You must respond with a JSON object containing:
|
| 143 |
+
- "chart_type": one of [bar, line, area, scatter, column, stacked_bar, arrow, multiple_column]
|
| 144 |
+
- "reasoning": brief explanation of why this chart type is best
|
| 145 |
+
- "data_insights": key patterns or features in the data that inform the choice"""
|
| 146 |
+
|
| 147 |
+
CHART_SELECTION_PROMPT_TEMPLATE = """USER REQUEST: {user_prompt}
|
| 148 |
+
|
| 149 |
+
DATA STRUCTURE:
|
| 150 |
+
{data_summary}
|
| 151 |
+
|
| 152 |
+
VISUALIZATION BEST PRACTICES (from knowledge base):
|
| 153 |
+
{rag_context}
|
| 154 |
+
|
| 155 |
+
Based on the user's request, the data characteristics, and visualization best practices:
|
| 156 |
+
|
| 157 |
+
1. Analyze the data type:
|
| 158 |
+
- Time series → line, area charts
|
| 159 |
+
- Categorical comparisons → bar, column charts
|
| 160 |
+
- Correlations/relationships → scatter plots
|
| 161 |
+
- Part-to-whole → stacked bar charts
|
| 162 |
+
- Change/movement → arrow charts
|
| 163 |
+
- Multiple categories over time → multiple column charts
|
| 164 |
+
|
| 165 |
+
2. Consider the user's storytelling goal:
|
| 166 |
+
- Showing trends over time
|
| 167 |
+
- Comparing categories
|
| 168 |
+
- Revealing correlations
|
| 169 |
+
- Displaying composition
|
| 170 |
+
- Highlighting change
|
| 171 |
+
|
| 172 |
+
3. Apply best practices from research:
|
| 173 |
+
- Accessibility and clarity
|
| 174 |
+
- Appropriate for data density
|
| 175 |
+
- Effective for the message
|
| 176 |
+
|
| 177 |
+
Respond with a JSON object only:
|
| 178 |
+
{{
|
| 179 |
+
"chart_type": "one of [bar, line, area, scatter, column, stacked_bar, arrow, multiple_column]",
|
| 180 |
+
"reasoning": "why this chart type is optimal for this data and intent",
|
| 181 |
+
"data_insights": "key patterns that inform the visualization approach"
|
| 182 |
+
}}"""
|
| 183 |
+
|
| 184 |
+
CHART_STYLING_PROMPT_TEMPLATE = """You are creating a Datawrapper {chart_type} chart configuration.
|
| 185 |
+
|
| 186 |
+
USER REQUEST: {user_prompt}
|
| 187 |
+
|
| 188 |
+
DATA STRUCTURE:
|
| 189 |
+
{data_summary}
|
| 190 |
+
|
| 191 |
+
DESIGN BEST PRACTICES (from knowledge base):
|
| 192 |
+
{rag_context}
|
| 193 |
+
|
| 194 |
+
IMPORTANT: You must ONLY include these fields in your JSON response:
|
| 195 |
+
- title (string, required): Clear, descriptive chart title
|
| 196 |
+
- intro (string, optional): Brief explanation
|
| 197 |
+
- byline (string, optional): Author/source attribution
|
| 198 |
+
- source_name (string, optional): Data source name
|
| 199 |
+
- source_url (string, optional): Link to data source
|
| 200 |
+
|
| 201 |
+
DO NOT include any other fields like:
|
| 202 |
+
- styling, options, data, chart_type, colors, labels, annotations, tooltips
|
| 203 |
+
- metadata, visualize, or any internal fields
|
| 204 |
+
|
| 205 |
+
These other fields will cause validation errors. Keep it simple with just the 5 fields listed above.
|
| 206 |
+
|
| 207 |
+
Example valid response:
|
| 208 |
+
{{
|
| 209 |
+
"title": "Sales Trends 2024",
|
| 210 |
+
"intro": "Monthly sales showing 30% growth",
|
| 211 |
+
"source_name": "Company Data",
|
| 212 |
+
"source_url": "https://example.com"
|
| 213 |
+
}}
|
| 214 |
+
|
| 215 |
+
Generate a minimal, valid JSON configuration with ONLY the allowed fields above."""
|
| 216 |
+
|
| 217 |
+
CHART_SELECTION_PROMPT = SimplePromptTemplate(
|
| 218 |
+
template=CHART_SELECTION_PROMPT_TEMPLATE,
|
| 219 |
+
input_variables=["user_prompt", "data_summary", "rag_context"]
|
| 220 |
+
)
|
| 221 |
+
|
| 222 |
+
CHART_STYLING_PROMPT = SimplePromptTemplate(
|
| 223 |
+
template=CHART_STYLING_PROMPT_TEMPLATE,
|
| 224 |
+
input_variables=["chart_type", "user_prompt", "data_summary", "rag_context"]
|
| 225 |
+
)
|
| 226 |
+
|
| 227 |
+
|
| 228 |
+
def get_chart_selection_prompt() -> SimplePromptTemplate:
|
| 229 |
+
"""Get the chart type selection prompt template"""
|
| 230 |
+
return CHART_SELECTION_PROMPT
|
| 231 |
+
|
| 232 |
+
|
| 233 |
+
def get_chart_styling_prompt() -> SimplePromptTemplate:
|
| 234 |
+
"""Get the chart styling configuration prompt template"""
|
| 235 |
+
return CHART_STYLING_PROMPT
|
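Review note: the prompt templates above write the JSON example with doubled braces (`{{` and `}}`). `SimplePromptTemplate` is not shown in this diff, so the sketch below assumes it delegates to Python's built-in `str.format`, where doubled braces render as literal braces and single-brace names are substituted.

```python
# Assumption: SimplePromptTemplate.format behaves like str.format.
template = (
    "USER REQUEST: {user_prompt}\n"
    "Respond with a JSON object only:\n"
    "{{\n"
    '  "chart_type": "line"\n'
    "}}"
)

rendered = template.format(user_prompt="Show monthly sales")
print(rendered)
```

If the braces in the JSON example were left single, `str.format` would try to treat the JSON keys as placeholder names and fail, which is why the templates escape them.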
start.sh
ADDED
|
@@ -0,0 +1,31 @@
|
| 1 |
+
#!/bin/bash
|
| 2 |
+
|
| 3 |
+
# Start script for Viz LLM with Datawrapper integration
|
| 4 |
+
|
| 5 |
+
echo "🚀 Starting Viz LLM..."
|
| 6 |
+
echo ""
|
| 7 |
+
|
| 8 |
+
# Check that the .env file exists
|
| 9 |
+
if [ ! -f .env ]; then
|
| 10 |
+
echo "⚠️ Error: .env file not found!"
|
| 11 |
+
echo "Please create a .env file based on .env.example"
|
| 12 |
+
exit 1
|
| 13 |
+
fi
|
| 14 |
+
|
| 15 |
+
# Check if required packages are installed
|
| 16 |
+
echo "📦 Checking dependencies..."
|
| 17 |
+
python -c "import gradio; import datawrapper; import pandas; import mcp" 2>/dev/null
|
| 18 |
+
if [ $? -ne 0 ]; then
|
| 19 |
+
echo "⚠️ Some dependencies are missing. Installing..."
|
| 20 |
+
pip install -r requirements.txt
|
| 21 |
+
fi
|
| 22 |
+
|
| 23 |
+
echo ""
|
| 24 |
+
echo "✓ Dependencies OK"
|
| 25 |
+
echo ""
|
| 26 |
+
echo "Starting Gradio app..."
|
| 27 |
+
echo "Once started, open your browser to: http://localhost:7860"
|
| 28 |
+
echo ""
|
| 29 |
+
|
| 30 |
+
# Run the app
|
| 31 |
+
python app.py
|