Tom Claude committed on
Commit
2a10e9c
·
1 Parent(s): df042c8

Add Datawrapper chart generation mode with clean iframe display

- Add Chart Generation Mode with CSV upload and AI-powered chart creation
- Integrate Datawrapper API via custom MCP handlers for create, publish, and retrieve operations
- Implement RAG-powered chart type selection and configuration
- Display charts as embedded iframes with reasoning and edit button
- Clean up debug output for production-ready UI
- Update README with dual-mode functionality

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

.env.example CHANGED
@@ -21,6 +21,13 @@ HF_TOKEN=hf_your_token_here
21
  # This is used for Jina-CLIP-v2 embeddings
22
  JINA_API_KEY=jina_your_token_here
23
24
  # =============================================================================
25
  # OPTIONAL: LLM Configuration
26
  # =============================================================================
 
21
  # This is used for Jina-CLIP-v2 embeddings
22
  JINA_API_KEY=jina_your_token_here
23
 
24
+ # =============================================================================
25
+ # REQUIRED: Datawrapper API Token
26
+ # =============================================================================
27
+ # Get your token from: https://app.datawrapper.de/account/api-tokens
28
+ # This is used for creating and publishing charts via Datawrapper API
29
+ DATAWRAPPER_ACCESS_TOKEN=your_datawrapper_token_here
30
+
31
  # =============================================================================
32
  # OPTIONAL: LLM Configuration
33
  # =============================================================================
README.md CHANGED
@@ -7,278 +7,66 @@ sdk: gradio
7
  sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
- short_description: AI assistant for visualization guidance and design
11
  license: mit
12
  ---
13
 
14
- # 📊 Graphics Guide / Design Assistant
15
 
16
- A RAG-powered AI assistant that helps users select appropriate visualizations and provides technical implementation guidance for creating effective information graphics. Built with Supabase PGVector and Hugging Face Inference Providers, powered by a knowledge base of graphics research and design principles.
17
 
18
- ## Features
 
19
 
20
- - **🎯 Design Recommendations**: Get tailored visualization suggestions based on your intent and data characteristics
21
- - **📚 Research-Backed Guidance**: Access insights from academic papers and design best practices
22
- - **🔍 Context-Aware Retrieval**: Semantic search finds the most relevant examples and knowledge for your needs
23
- - **🚀 API Access**: Built-in REST API for integration with external applications
24
- - **💬 Chat Interface**: User-friendly conversational interface
25
- - **⚡ Technical Implementation**: Practical guidance on tools, techniques, and code examples
26
 
27
- ## 🏗️ Architecture
28
 
29
- ```
30
- ┌──────────────────────────────────────┐
31
- │ Gradio UI + API Endpoints │
32
- └──────────────┬───────────────────────┘
33
-
34
- ┌──────────────▼───────────────────────┐
35
- │ RAG Pipeline │
36
- │ • Query Understanding │
37
- │ • Document Retrieval (PGVector) │
38
- │ • Response Generation (LLM) │
39
- └──────────────┬───────────────────────┘
40
-
41
- ┌──────────┴──────────┐
42
- │ │
43
- ┌───▼───────────┐ ┌─────▼────────────┐
44
- │ Supabase │ │ HF Inference │
45
- │ PGVector DB │ │ Providers │
46
- │ (198 docs) │ │ (Llama 3.1) │
47
- └───────────────┘ └──────────────────┘
48
- ```
49
 
50
- ## 🚀 Quick Start
51
 
52
- ### Local Development
53
-
54
- 1. **Clone the repository**
55
- ```bash
56
- git clone <your-repo-url>
57
- cd graphics-llm
58
- ```
59
-
60
- 2. **Install dependencies**
61
  ```bash
62
  pip install -r requirements.txt
63
  ```
64
 
65
- 3. **Set up environment variables**
66
  ```bash
67
  cp .env.example .env
68
- # Edit .env with your credentials
69
  ```
70
 
71
- Required variables:
72
- - `SUPABASE_URL`: Your Supabase project URL
73
- - `SUPABASE_KEY`: Your Supabase anon key
74
- - `HF_TOKEN`: Your Hugging Face API token (for LLM generation)
75
- - `JINA_API_KEY`: Your Jina AI API token (for embeddings)
76
 
77
- 4. **Run the application**
78
  ```bash
79
  python app.py
80
  ```
81
 
82
- The app will be available at `http://localhost:7860`
83
-
84
- ### Hugging Face Spaces Deployment
85
-
86
- 1. **Create a new Space** on Hugging Face
87
- 2. **Push this repository** to your Space
88
- 3. **Set environment variables** in Space settings:
89
- - `SUPABASE_URL`
90
- - `SUPABASE_KEY`
91
- - `HF_TOKEN`
92
- - `JINA_API_KEY`
93
- 4. **Deploy** - The Space will automatically build and launch
94
-
95
- ## 📚 Usage
96
-
97
- ### Chat Interface
98
-
99
- Simply ask your design questions:
100
-
101
- ```
102
- "What's the best chart type for showing trends over time?"
103
- "How do I create an effective infographic for complex data?"
104
- "What are best practices for data visualization accessibility?"
105
- ```
106
-
107
- The assistant will provide:
108
- 1. Design recommendations based on your intent
109
- 2. WHY each visualization type is suitable
110
- 3. HOW to implement it (tools, techniques, code)
111
- 4. Best practices from research and examples
112
- 5. Accessibility and effectiveness considerations
113
-
114
- ### API Access
115
-
116
- This app automatically exposes REST API endpoints for external integration.
117
-
118
- **Python Client:**
119
-
120
- ```python
121
- from gradio_client import Client
122
-
123
- client = Client("your-space-url")
124
- result = client.predict(
125
- "What's the best chart for time series?",
126
- api_name="/recommend"
127
- )
128
- print(result)
129
- ```
130
-
131
- **JavaScript Client:**
132
-
133
- ```javascript
134
- import { Client } from "@gradio/client";
135
-
136
- const client = await Client.connect("your-space-url");
137
- const result = await client.predict("/recommend", {
138
- message: "What's the best chart for time series?"
139
- });
140
- console.log(result.data);
141
- ```
142
-
143
- **cURL:**
144
-
145
- ```bash
146
- curl -X POST "https://your-space.hf.space/call/recommend" \
147
- -H "Content-Type: application/json" \
148
- -d '{"data": ["What's the best chart for time series?"]}'
149
- ```
150
-
151
- **Available Endpoints:**
152
- - `/call/recommend` - Main design recommendation assistant
153
- - `/gradio_api/openapi.json` - OpenAPI specification
154
-
155
- ## 🗄️ Database
156
-
157
- The app uses Supabase with PGVector extension to store and retrieve document chunks from graphics research and examples.
158
-
159
- **Database Schema:**
160
- ```sql
161
- CREATE TABLE document_embeddings (
162
- id BIGINT PRIMARY KEY,
163
- source_type TEXT, -- pdf, url, or image
164
- source_id TEXT, -- filename or URL
165
- title TEXT,
166
- content_type TEXT, -- text or image
167
- chunk_index INTEGER,
168
- chunk_text TEXT,
169
- page_number INTEGER,
170
- embedding VECTOR(1024), -- 1024-dimensional vectors
171
- metadata JSONB,
172
- word_count INTEGER,
173
- image_metadata JSONB,
174
- created_at TIMESTAMPTZ
175
- );
176
- ```
177
-
178
- **Knowledge Base Content:**
179
- - Research papers on data visualization
180
- - Design principles and best practices
181
- - Visual narrative techniques
182
- - Accessibility guidelines
183
- - Chart type selection guidance
184
- - Real-world examples and case studies
185
-
186
- ## 🛠️ Technology Stack
187
-
188
- - **UI/API**: [Gradio](https://gradio.app/) - Automatic API generation
189
- - **Vector Database**: [Supabase](https://supabase.com/) with PGVector extension
190
- - **Embeddings**: Jina-CLIP-v2 (1024-dimensional)
191
- - **LLM**: [Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers/) - Llama 3.1
192
- - **Language**: Python 3.9+
193
-
194
- ## 📁 Project Structure
195
-
196
- ```
197
- graphics-llm/
198
- ├── app.py # Main Gradio application
199
- ├── requirements.txt # Python dependencies
200
- ├── .env.example # Environment variables template
201
- ├── README.md # This file
202
- └── src/
203
- ├── __init__.py
204
- ├── vectorstore.py # Supabase PGVector connection
205
- ├── rag_pipeline.py # RAG pipeline logic
206
- ├── llm_client.py # Inference Provider client
207
- └── prompts.py # Design recommendation prompt templates
208
- ```
209
-
210
- ## ⚙️ Configuration
211
-
212
- ### Environment Variables
213
-
214
- See `.env.example` for all available configuration options.
215
-
216
- **Required:**
217
- - `SUPABASE_URL` - Supabase project URL
218
- - `SUPABASE_KEY` - Supabase anon key
219
- - `HF_TOKEN` - Hugging Face API token (for LLM generation)
220
- - `JINA_API_KEY` - Jina AI API token (for Jina-CLIP-v2 embeddings)
221
-
222
- **Optional:**
223
- - `LLM_MODEL` - Model to use (default: meta-llama/Llama-3.1-8B-Instruct)
224
- - `LLM_TEMPERATURE` - Generation temperature (default: 0.2)
225
- - `LLM_MAX_TOKENS` - Max tokens to generate (default: 2000)
226
- - `RETRIEVAL_K` - Number of documents to retrieve (default: 5)
227
- - `EMBEDDING_MODEL` - Embedding model (default: jina-clip-v2)
228
-
229
- ### Supported LLM Models
230
-
231
- - `meta-llama/Llama-3.1-8B-Instruct` (recommended)
232
- - `meta-llama/Meta-Llama-3-8B-Instruct`
233
- - `Qwen/Qwen2.5-72B-Instruct`
234
- - `mistralai/Mistral-7B-Instruct-v0.3`
235
-
236
- ## 💰 Cost Considerations
237
-
238
- ### Hugging Face Inference Providers
239
- - Free tier: $0.10/month credits
240
- - PRO tier: $2.00/month credits + pay-as-you-go
241
- - Typical cost: ~$0.001-0.01 per query
242
- - Recommended budget: $10-50/month for moderate usage
243
-
244
- ### Supabase
245
- - Free tier sufficient for most use cases
246
- - PGVector operations are standard database queries
247
-
248
- ### Hugging Face Spaces
249
- - Free CPU hosting available
250
- - GPU upgrade: ~$0.60/hour (optional, not required)
251
-
252
- ## 🔮 Future Enhancements
253
-
254
- - [ ] Multi-turn conversation with memory
255
- - [ ] Code generation for visualization implementations
256
- - [ ] Interactive visualization previews
257
- - [ ] User-uploaded data analysis
258
- - [ ] Export recommendations as PDF/markdown
259
- - [ ] Community-contributed examples
260
- - [ ] Support for more design domains (UI/UX, print graphics)
261
-
262
- ## 🤝 Contributing
263
-
264
- Contributions are welcome! Please feel free to submit issues or pull requests.
265
-
266
- ## 📄 License
267
-
268
- MIT License - See LICENSE file for details
269
-
270
- ## 🙏 Acknowledgments
271
 
272
- - Knowledge base includes research papers on data visualization and information design
273
- - Built to support designers, journalists, and data practitioners
 
 
 
274
 
275
- ## 📞 Support
276
 
277
- For issues or questions:
278
- - Open an issue on GitHub
279
- - Check the [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
280
- - Review the [Gradio documentation](https://gradio.app/docs/)
281
 
282
  ---
283
 
284
- Built with ❤️ for the design and visualization community
 
7
  sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
+ short_description: AI assistant for visualization guidance and chart generation
11
  license: mit
12
  ---
13
 
14
+ # 📊 Viz LLM
15
 
16
+ AI-powered data visualization assistant with two modes:
17
 
18
+ - **💡 Ideation Mode**: Get design recommendations based on research and best practices
19
+ - **📊 Chart Generation Mode**: Upload CSV data and automatically generate publication-ready charts
20
 
21
+ ## Features
22
 
23
+ **Ideation Mode:**
24
+ - Research-backed visualization guidance
25
+ - Chart type recommendations
26
+ - Design best practices and accessibility advice
27
+ - Powered by RAG with Jina-CLIP-v2 embeddings
28
 
29
+ **Chart Generation Mode:**
30
+ - Upload CSV data
31
+ - AI analyzes your data and selects optimal chart type
32
+ - Automatic chart creation via Datawrapper API
33
+ - Publication-ready visualizations with one click
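
Under the hood, Chart Generation Mode roughly follows the flow sketched below (based on `generate_chart_from_csv` in `app.py`; the file name and result keys are illustrative assumptions drawn from that code):

```python
import asyncio
import pandas as pd

from src.datawrapper_client import create_and_publish_chart, get_iframe_html
from src.rag_pipeline import create_pipeline

# Read the uploaded CSV, let the RAG pipeline + LLM pick a chart type,
# then create and publish the chart via the Datawrapper API.
df = pd.read_csv("sales.csv")  # hypothetical example file
pipeline = create_pipeline()
result = asyncio.run(create_and_publish_chart(df, "Show sales trends over time", pipeline))

if result.get("success"):
    print(get_iframe_html(result["public_url"], height=500))  # embeddable iframe
    print("Why this chart:", result["reasoning"])
    print("Edit in Datawrapper:", result["edit_url"])
```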
34
 
35
+ ## Quick Start
36
 
37
+ 1. **Install dependencies:**
38
  ```bash
39
  pip install -r requirements.txt
40
  ```
41
 
42
+ 2. **Set up environment variables:**
43
  ```bash
44
  cp .env.example .env
 
45
  ```
46
 
47
+ Required:
48
+ - `SUPABASE_URL` - Your Supabase project URL
49
+ - `SUPABASE_KEY` - Your Supabase anon key
50
+ - `HF_TOKEN` - Hugging Face API token
51
+ - `DATAWRAPPER_ACCESS_TOKEN` - Datawrapper API token
52
 
53
+ 3. **Run the app:**
54
  ```bash
55
  python app.py
56
  ```
57
 
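Once the app is running, Ideation Mode is also exposed as a Gradio API endpoint (`api_name="recommend"` in `app.py`). A minimal sketch of calling it with the Gradio Python client (the URL is a placeholder for your local instance or Space):

```python
from gradio_client import Client

client = Client("http://localhost:7860")  # or your Hugging Face Space URL
result = client.predict(
    "What's the best chart type for showing trends over time?",
    api_name="/recommend",
)
print(result)
```
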
58
+ ## Technology Stack
59
 
60
+ - **UI**: Gradio
61
+ - **Vector Database**: Supabase PGVector
62
+ - **Embeddings**: Jina-CLIP-v2
63
+ - **LLM**: Llama 3.1 via Hugging Face Inference Providers
64
+ - **Charts**: Datawrapper API
65
 
66
+ ## License
67
 
68
+ MIT License
 
 
 
69
 
70
  ---
71
 
72
+ Built for the data visualization community
app.py CHANGED
@@ -3,12 +3,17 @@ Viz LLM - Gradio App
3
 
4
  A RAG-powered assistant for data visualization guidance, powered by Jina-CLIP-v2
5
  embeddings and research from the field of information graphics.
 
 
6
  """
7
 
8
  import os
 
 
9
  import gradio as gr
10
  from dotenv import load_dotenv
11
  from src.rag_pipeline import create_pipeline
 
12
  from datetime import datetime, timedelta
13
  from collections import defaultdict
14
 
@@ -90,7 +95,94 @@ def recommend_stream(message: str, history: list, request: gr.Request):
90
  yield f"Error generating response: {str(e)}\n\nPlease check your environment variables (HF_TOKEN, SUPABASE_URL, SUPABASE_KEY) and try again."
91
 
92
 
93
- # Minimal CSS to fix UI artifacts
94
  custom_css = """
95
  /* Hide retry/undo buttons that appear as artifacts */
96
  .chatbot button[aria-label="Retry"],
@@ -102,9 +194,16 @@ custom_css = """
102
  textarea[data-testid="textbox"] {
103
  overflow-y: hidden !important;
104
  }
105
  """
106
 
107
- # Create Gradio interface
108
  with gr.Blocks(
109
  title="Viz LLM",
110
  css=custom_css
@@ -112,29 +211,95 @@ with gr.Blocks(
112
  gr.Markdown("""
113
  # 📊 Viz LLM
114
 
115
- Get design recommendations for creating effective data visualizations based on research and best practices.
116
  """)
117
 
118
- # Main chat interface
119
- chatbot = gr.ChatInterface(
120
- fn=recommend_stream,
121
- type="messages",
122
- examples=[
123
- "What's the best chart type for showing trends over time?",
124
- "How do I create an effective infographic for complex data?",
125
- "What are best practices for data visualization accessibility?",
126
- "How should I design a dashboard for storytelling?",
127
- "What visualization works best for comparing categories?"
128
- ],
129
- cache_examples=False,
130
- api_name="recommend"
131
  )
132
 
133
- # Knowledge base section (below chat interface)
134
  gr.Markdown("""
135
- ### Knowledge Base
 
 
136
 
137
- This assistant draws on research papers, design principles, and examples from the field of information graphics and data visualization.
138
 
139
  **Credits:** Special thanks to the researchers whose work informed this model: Robert Kosara, Edward Segel, Jeffrey Heer, Matthew Conlen, John Maeda, Kennedy Elliott, Scott McCloud, and many others.
140
 
@@ -143,19 +308,21 @@ with gr.Blocks(
143
  **Usage Limits:** This service is limited to 20 queries per day per user to manage costs. Responses are optimized for English.
144
 
145
  <div style="text-align: center; margin-top: 20px; opacity: 0.6; font-size: 0.9em;">
146
- Embeddings: Jina-CLIP-v2
147
  </div>
148
  """)
149
 
150
  # Launch configuration
151
  if __name__ == "__main__":
152
  # Check for required environment variables
153
- required_vars = ["SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN"]
154
  missing_vars = [var for var in required_vars if not os.getenv(var)]
155
 
156
  if missing_vars:
157
  print(f"⚠️ Warning: Missing environment variables: {', '.join(missing_vars)}")
158
  print("Please set these in your .env file or as environment variables")
 
 
159
 
160
  # Launch the app
161
  demo.launch(
 
3
 
4
  A RAG-powered assistant for data visualization guidance, powered by Jina-CLIP-v2
5
  embeddings and research from the field of information graphics.
6
+
7
+ Now with Datawrapper integration for chart generation!
8
  """
9
 
10
  import os
11
+ import asyncio
12
+ import pandas as pd
13
  import gradio as gr
14
  from dotenv import load_dotenv
15
  from src.rag_pipeline import create_pipeline
16
+ from src.datawrapper_client import create_and_publish_chart, get_iframe_html
17
  from datetime import datetime, timedelta
18
  from collections import defaultdict
19
 
 
95
  yield f"Error generating response: {str(e)}\n\nPlease check your environment variables (HF_TOKEN, SUPABASE_URL, SUPABASE_KEY) and try again."
96
 
97
 
98
+ def generate_chart_from_csv(csv_file, user_prompt):
99
+ """
100
+ Generate a Datawrapper chart from uploaded CSV and user prompt.
101
+
102
+ Args:
103
+ csv_file: Uploaded CSV file
104
+ user_prompt: User's description of the chart
105
+
106
+ Returns:
107
+ HTML string with iframe or error message
108
+ """
109
+ if not csv_file:
110
+ return "<div style='padding: 50px; text-align: center;'>Please upload a CSV file to generate a chart.</div>"
111
+
112
+ if not user_prompt or user_prompt.strip() == "":
113
+ return "<div style='padding: 50px; text-align: center;'>Please describe what chart you want to create.</div>"
114
+
115
+ try:
116
+ # Show loading message
117
+ loading_html = """
118
+ <div style='padding: 100px; text-align: center;'>
119
+ <h3>🎨 Creating your chart...</h3>
120
+ <p>Analyzing your data and selecting the best visualization...</p>
121
+ </div>
122
+ """
123
+
124
+ # Read CSV file
125
+ df = pd.read_csv(csv_file)
126
+
127
+ # Create and publish chart (async function, need to run in event loop)
128
+ loop = asyncio.new_event_loop()
129
+ asyncio.set_event_loop(loop)
130
+ result = loop.run_until_complete(
131
+ create_and_publish_chart(df, user_prompt, pipeline)
132
+ )
133
+ loop.close()
134
+
135
+ if result.get("success"):
136
+ # Get the iframe HTML
137
+ iframe_html = get_iframe_html(result.get('public_url'), height=500)
138
+
139
+ # Create HTML with iframe, reasoning, and edit button
140
+ chart_html = f"""
141
+ <div style='padding: 20px;'>
142
+ <!-- Chart iframe -->
143
+ <div style='margin-bottom: 20px;'>
144
+ {iframe_html}
145
+ </div>
146
+
147
+ <!-- Why this chart? -->
148
+ <div style='background: #f9f9f9; padding: 15px; border-radius: 5px; margin-bottom: 15px;'>
149
+ <strong>Why this chart?</strong><br>
150
+ <p style='margin: 10px 0 0 0;'>{result['reasoning']}</p>
151
+ </div>
152
+
153
+ <!-- Edit button -->
154
+ <div>
155
+ <a href="{result['edit_url']}" target="_blank"
156
+ style="display: inline-block; padding: 12px 24px; background: #1976d2; color: white;
157
+ text-decoration: none; border-radius: 5px; font-weight: bold;">
158
+ ✏️ Open in Datawrapper
159
+ </a>
160
+ </div>
161
+ </div>
162
+ """
163
+
164
+ return chart_html
165
+ else:
166
+ error_msg = result.get("error", "Unknown error")
167
+ return f"""
168
+ <div style='padding: 50px; text-align: center; color: red;'>
169
+ <h3>❌ Chart Generation Failed</h3>
170
+ <p>{error_msg}</p>
171
+ <p style='font-size: 0.9em; color: #666;'>Please check your CSV format and try again.</p>
172
+ </div>
173
+ """
174
+
175
+ except Exception as e:
176
+ return f"""
177
+ <div style='padding: 50px; text-align: center; color: red;'>
178
+ <h3>❌ Error</h3>
179
+ <p>{str(e)}</p>
180
+ <p style='font-size: 0.9em; color: #666;'>Please ensure your CSV is properly formatted and try again.</p>
181
+ </div>
182
+ """
183
+
184
+
185
+ # Minimal CSS to fix UI artifacts and style the mode selector
186
  custom_css = """
187
  /* Hide retry/undo buttons that appear as artifacts */
188
  .chatbot button[aria-label="Retry"],
 
194
  textarea[data-testid="textbox"] {
195
  overflow-y: hidden !important;
196
  }
197
+
198
+ /* Mode selector buttons */
199
+ .mode-button {
200
+ font-size: 1.1em;
201
+ padding: 12px 24px;
202
+ margin: 5px;
203
+ }
204
  """
205
 
206
+ # Create Gradio interface with dual-mode layout
207
  with gr.Blocks(
208
  title="Viz LLM",
209
  css=custom_css
 
211
  gr.Markdown("""
212
  # 📊 Viz LLM
213
 
214
+ Get design recommendations or generate charts with AI-powered data visualization assistance.
215
  """)
216
 
217
+ # Mode selector buttons
218
+ with gr.Row():
219
+ ideation_btn = gr.Button("💡 Ideation Mode", variant="primary", elem_classes="mode-button")
220
+ chart_gen_btn = gr.Button("📊 Chart Generation Mode", variant="secondary", elem_classes="mode-button")
221
+
222
+ # Ideation Mode: Chat interface (shown by default, wrapped in Column)
223
+ with gr.Column(visible=True) as ideation_container:
224
+ ideation_interface = gr.ChatInterface(
225
+ fn=recommend_stream,
226
+ type="messages",
227
+ examples=[
228
+ "What's the best chart type for showing trends over time?",
229
+ "How do I create an effective infographic for complex data?",
230
+ "What are best practices for data visualization accessibility?",
231
+ "How should I design a dashboard for storytelling?",
232
+ "What visualization works best for comparing categories?"
233
+ ],
234
+ cache_examples=False,
235
+ api_name="recommend"
236
+ )
237
+
238
+ # Chart Generation Mode: Chart controls and output (hidden by default)
239
+ with gr.Column(visible=False) as chart_gen_container:
240
+ csv_upload = gr.File(
241
+ label="📁 Upload CSV File",
242
+ file_types=[".csv"],
243
+ type="filepath"
244
+ )
245
+
246
+ chart_prompt_input = gr.Textbox(
247
+ label="Describe your chart",
248
+ placeholder="E.g., 'Show sales trends over time' or 'Compare revenue by category'",
249
+ lines=2
250
+ )
251
+
252
+ generate_chart_btn = gr.Button("Generate Chart", variant="primary", size="lg")
253
+
254
+ chart_output = gr.HTML(
255
+ value="<div style='text-align:center; padding:100px; color: #666;'>Upload a CSV file and describe your visualization above, then click Generate Chart.</div>",
256
+ label="Generated Chart"
257
+ )
258
+
259
+ # Mode switching functions
260
+ def switch_to_ideation():
261
+ return [
262
+ gr.update(variant="primary"), # ideation_btn
263
+ gr.update(variant="secondary"), # chart_gen_btn
264
+ gr.update(visible=True), # ideation_container
265
+ gr.update(visible=False), # chart_gen_container
266
+ ]
267
+
268
+ def switch_to_chart_gen():
269
+ return [
270
+ gr.update(variant="secondary"), # ideation_btn
271
+ gr.update(variant="primary"), # chart_gen_btn
272
+ gr.update(visible=False), # ideation_container
273
+ gr.update(visible=True), # chart_gen_container
274
+ ]
275
+
276
+ # Wire up mode switching
277
+ ideation_btn.click(
278
+ fn=switch_to_ideation,
279
+ inputs=[],
280
+ outputs=[ideation_btn, chart_gen_btn, ideation_container, chart_gen_container]
281
  )
282
 
283
+ chart_gen_btn.click(
284
+ fn=switch_to_chart_gen,
285
+ inputs=[],
286
+ outputs=[ideation_btn, chart_gen_btn, ideation_container, chart_gen_container]
287
+ )
288
+
289
+ # Generate chart when button is clicked
290
+ generate_chart_btn.click(
291
+ fn=generate_chart_from_csv,
292
+ inputs=[csv_upload, chart_prompt_input],
293
+ outputs=[chart_output]
294
+ )
295
+
296
+ # Knowledge base section (below both interfaces)
297
  gr.Markdown("""
298
+ ### About Viz LLM
299
+
300
+ **Ideation Mode:** Get design recommendations based on research papers, design principles, and examples from the field of information graphics and data visualization.
301
 
302
+ **Chart Generation Mode:** Upload your CSV data and describe your visualization goal. The AI will analyze your data, select the optimal chart type, and generate a publication-ready chart using Datawrapper.
303
 
304
  **Credits:** Special thanks to the researchers whose work informed this model: Robert Kosara, Edward Segel, Jeffrey Heer, Matthew Conlen, John Maeda, Kennedy Elliott, Scott McCloud, and many others.
305
 
 
308
  **Usage Limits:** This service is limited to 20 queries per day per user to manage costs. Responses are optimized for English.
309
 
310
  <div style="text-align: center; margin-top: 20px; opacity: 0.6; font-size: 0.9em;">
311
+ Embeddings: Jina-CLIP-v2 | Charts: Datawrapper API
312
  </div>
313
  """)
314
 
315
  # Launch configuration
316
  if __name__ == "__main__":
317
  # Check for required environment variables
318
+ required_vars = ["SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN", "DATAWRAPPER_ACCESS_TOKEN"]
319
  missing_vars = [var for var in required_vars if not os.getenv(var)]
320
 
321
  if missing_vars:
322
  print(f"⚠️ Warning: Missing environment variables: {', '.join(missing_vars)}")
323
  print("Please set these in your .env file or as environment variables")
324
+ if "DATAWRAPPER_ACCESS_TOKEN" in missing_vars:
325
+ print("Note: DATAWRAPPER_ACCESS_TOKEN is required for chart generation mode")
326
 
327
  # Launch the app
328
  demo.launch(
datawrapper_mcp/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """A Model Context Protocol server for creating Datawrapper charts."""
datawrapper_mcp/config.py ADDED
@@ -0,0 +1,24 @@
1
+ """Configuration and constants for the Datawrapper MCP server."""
2
+
3
+ from datawrapper import (
4
+ AreaChart,
5
+ ArrowChart,
6
+ BarChart,
7
+ ColumnChart,
8
+ LineChart,
9
+ MultipleColumnChart,
10
+ ScatterPlot,
11
+ StackedBarChart,
12
+ )
13
+
14
+ # Map of chart type names to their Pydantic classes
15
+ CHART_CLASSES = {
16
+ "bar": BarChart,
17
+ "line": LineChart,
18
+ "area": AreaChart,
19
+ "arrow": ArrowChart,
20
+ "column": ColumnChart,
21
+ "multiple_column": MultipleColumnChart,
22
+ "scatter": ScatterPlot,
23
+ "stacked_bar": StackedBarChart,
24
+ }
datawrapper_mcp/handlers/__init__.py ADDED
@@ -0,0 +1,19 @@
1
+ """Handler functions for MCP tool implementations."""
2
+
3
+ from .create import create_chart
4
+ from .delete import delete_chart
5
+ from .export import export_chart_png
6
+ from .publish import publish_chart
7
+ from .retrieve import get_chart_info
8
+ from .schema import get_chart_schema
9
+ from .update import update_chart
10
+
11
+ __all__ = [
12
+ "create_chart",
13
+ "delete_chart",
14
+ "export_chart_png",
15
+ "get_chart_info",
16
+ "get_chart_schema",
17
+ "publish_chart",
18
+ "update_chart",
19
+ ]
datawrapper_mcp/handlers/create.py ADDED
@@ -0,0 +1,52 @@
1
+ """Handler for creating Datawrapper charts."""
2
+
3
+ import json
4
+
5
+ from mcp.types import TextContent
6
+
7
+ from ..config import CHART_CLASSES
8
+ from ..utils import get_api_token, json_to_dataframe
9
+
10
+
11
+ async def create_chart(arguments: dict) -> list[TextContent]:
12
+ """Create a chart with full Pydantic model configuration."""
13
+ api_token = get_api_token()
14
+
15
+ # Convert data to DataFrame
16
+ df = json_to_dataframe(arguments["data"])
17
+
18
+ # Get chart class and validate config
19
+ chart_type = arguments["chart_type"]
20
+ chart_class = CHART_CLASSES[chart_type]
21
+
22
+ # Validate and create chart using Pydantic model
23
+ try:
24
+ chart = chart_class.model_validate(arguments["chart_config"])
25
+ except Exception as e:
26
+ return [
27
+ TextContent(
28
+ type="text",
29
+ text=f"Invalid chart configuration: {str(e)}\n\n"
30
+ f"Use get_chart_schema with chart_type '{chart_type}' "
31
+ f"to see the valid schema.",
32
+ )
33
+ ]
34
+
35
+ # Set data on chart instance
36
+ chart.data = df
37
+
38
+ # Create chart using Pydantic instance method
39
+ chart.create(access_token=api_token)
40
+
41
+ result = {
42
+ "chart_id": chart.chart_id,
43
+ "chart_type": chart_type,
44
+ "title": chart.title,
45
+ "edit_url": chart.get_editor_url(),
46
+ "message": (
47
+ f"Chart created successfully! Edit it at: {chart.get_editor_url()}\n"
48
+ f"Use publish_chart with chart_id '{chart.chart_id}' to make it public."
49
+ ),
50
+ }
51
+
52
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/delete.py ADDED
@@ -0,0 +1,25 @@
1
+ """Handler for deleting Datawrapper charts."""
2
+
3
+ import json
4
+
5
+ from datawrapper import get_chart
6
+ from mcp.types import TextContent
7
+
8
+ from ..utils import get_api_token
9
+
10
+
11
+ async def delete_chart(arguments: dict) -> list[TextContent]:
12
+ """Delete a chart permanently."""
13
+ api_token = get_api_token()
14
+ chart_id = arguments["chart_id"]
15
+
16
+ # Get chart and delete using Pydantic instance method
17
+ chart = get_chart(chart_id, access_token=api_token)
18
+ chart.delete(access_token=api_token)
19
+
20
+ result = {
21
+ "chart_id": chart_id,
22
+ "message": "Chart deleted successfully!",
23
+ }
24
+
25
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/export.py ADDED
@@ -0,0 +1,48 @@
1
+ """Handler for exporting Datawrapper charts."""
2
+
3
+ import base64
4
+
5
+ from datawrapper import get_chart
6
+ from mcp.types import ImageContent
7
+
8
+ from ..utils import get_api_token
9
+
10
+
11
+ async def export_chart_png(arguments: dict) -> list[ImageContent]:
12
+ """Export a chart as PNG and return it as inline image."""
13
+ api_token = get_api_token()
14
+ chart_id = arguments["chart_id"]
15
+
16
+ # Get chart using factory function
17
+ chart = get_chart(chart_id, access_token=api_token)
18
+
19
+ # Build export parameters
20
+ export_params = {}
21
+ if "width" in arguments:
22
+ export_params["width"] = arguments["width"]
23
+ if "height" in arguments:
24
+ export_params["height"] = arguments["height"]
25
+ if "plain" in arguments:
26
+ export_params["plain"] = arguments["plain"]
27
+ if "zoom" in arguments:
28
+ export_params["zoom"] = arguments["zoom"]
29
+ if "transparent" in arguments:
30
+ export_params["transparent"] = arguments["transparent"]
31
+ if "border_width" in arguments:
32
+ export_params["borderWidth"] = arguments["border_width"]
33
+ if "border_color" in arguments:
34
+ export_params["borderColor"] = arguments["border_color"]
35
+
36
+ # Export PNG using Pydantic instance method
37
+ png_bytes = chart.export_png(access_token=api_token, **export_params)
38
+
39
+ # Encode to base64
40
+ base64_data = base64.b64encode(png_bytes).decode("utf-8")
41
+
42
+ return [
43
+ ImageContent(
44
+ type="image",
45
+ data=base64_data,
46
+ mimeType="image/png",
47
+ )
48
+ ]
datawrapper_mcp/handlers/publish.py ADDED
@@ -0,0 +1,26 @@
1
+ """Handler for publishing Datawrapper charts."""
2
+
3
+ import json
4
+
5
+ from datawrapper import get_chart
6
+ from mcp.types import TextContent
7
+
8
+ from ..utils import get_api_token
9
+
10
+
11
+ async def publish_chart(arguments: dict) -> list[TextContent]:
12
+ """Publish a chart to make it publicly accessible."""
13
+ api_token = get_api_token()
14
+ chart_id = arguments["chart_id"]
15
+
16
+ # Get chart and publish using Pydantic instance method
17
+ chart = get_chart(chart_id, access_token=api_token)
18
+ chart.publish(access_token=api_token)
19
+
20
+ result = {
21
+ "chart_id": chart_id,
22
+ "public_url": chart.get_public_url(),
23
+ "message": "Chart published successfully!",
24
+ }
25
+
26
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/retrieve.py ADDED
@@ -0,0 +1,27 @@
1
+ """Handler for retrieving chart information."""
2
+
3
+ import json
4
+
5
+ from datawrapper import get_chart
6
+ from mcp.types import TextContent
7
+
8
+ from ..utils import get_api_token
9
+
10
+
11
+ async def get_chart_info(arguments: dict) -> list[TextContent]:
12
+ """Get information about an existing chart."""
13
+ api_token = get_api_token()
14
+ chart_id = arguments["chart_id"]
15
+
16
+ # Get chart using factory function
17
+ chart = get_chart(chart_id, access_token=api_token)
18
+
19
+ result = {
20
+ "chart_id": chart.chart_id,
21
+ "title": chart.title,
22
+ "type": chart.chart_type,
23
+ "public_url": chart.get_public_url(),
24
+ "edit_url": chart.get_editor_url(),
25
+ }
26
+
27
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/schema.py ADDED
@@ -0,0 +1,31 @@
1
+ """Handler for retrieving chart schemas."""
2
+
3
+ import json
4
+
5
+ from mcp.types import TextContent
6
+
7
+ from ..config import CHART_CLASSES
8
+
9
+
10
+ async def get_chart_schema(arguments: dict) -> list[TextContent]:
11
+ """Get the Pydantic schema for a chart type."""
12
+ chart_type = arguments["chart_type"]
13
+ chart_class = CHART_CLASSES[chart_type]
14
+
15
+ schema = chart_class.model_json_schema()
16
+
17
+ # Remove examples that contain DataFrames (not JSON serializable)
18
+ if "examples" in schema:
19
+ del schema["examples"]
20
+
21
+ result = {
22
+ "chart_type": chart_type,
23
+ "class_name": chart_class.__name__,
24
+ "schema": schema,
25
+ "usage": (
26
+ "Use this schema to construct a chart_config dict for create_chart_advanced. "
27
+ "The schema shows all available properties, their types, and descriptions."
28
+ ),
29
+ }
30
+
31
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/handlers/update.py ADDED
@@ -0,0 +1,61 @@
1
+ """Handler for updating Datawrapper charts."""
2
+
3
+ import json
4
+
5
+ from datawrapper import get_chart
6
+ from mcp.types import TextContent
7
+
8
+ from ..utils import get_api_token, json_to_dataframe
9
+
10
+
11
+ async def update_chart(arguments: dict) -> list[TextContent]:
12
+ """Update an existing chart's data or configuration."""
13
+ api_token = get_api_token()
14
+ chart_id = arguments["chart_id"]
15
+
16
+ # Get chart using factory function - returns correct Pydantic class instance
17
+ chart = get_chart(chart_id, access_token=api_token)
18
+
19
+ # Update data if provided
20
+ if "data" in arguments:
21
+ df = json_to_dataframe(arguments["data"])
22
+ chart.data = df
23
+
24
+ # Update config if provided
25
+ if "chart_config" in arguments:
26
+ # Directly set attributes on the chart instance
27
+ # Pydantic will validate each assignment automatically due to validate_assignment=True
28
+ try:
29
+ # Build a mapping of aliases to field names
30
+ alias_to_field = {}
31
+ for field_name, field_info in chart.model_fields.items():
32
+ # Add the field name itself
33
+ alias_to_field[field_name] = field_name
34
+ # Add any aliases
35
+ if field_info.alias:
36
+ alias_to_field[field_info.alias] = field_name
37
+
38
+ for key, value in arguments["chart_config"].items():
39
+ # Convert alias to field name if needed
40
+ field_name = alias_to_field.get(key, key)
41
+ setattr(chart, field_name, value)
42
+ except Exception as e:
43
+ return [
44
+ TextContent(
45
+ type="text",
46
+ text=f"Invalid chart configuration: {str(e)}\n\n"
47
+ f"Use get_chart_schema to see the valid schema for this chart type. "
48
+ f"Only high-level Pydantic fields are accepted.",
49
+ )
50
+ ]
51
+
52
+ # Update using Pydantic instance method
53
+ chart.update(access_token=api_token)
54
+
55
+ result = {
56
+ "chart_id": chart.chart_id,
57
+ "message": "Chart updated successfully!",
58
+ "edit_url": chart.get_editor_url(),
59
+ }
60
+
61
+ return [TextContent(type="text", text=json.dumps(result, indent=2))]
datawrapper_mcp/server.py ADDED
@@ -0,0 +1,101 @@
1
+ """Main MCP server implementation for Datawrapper chart creation."""
2
+
3
+ import json
4
+ from typing import Any, Sequence
5
+
6
+ from mcp.server import Server
7
+ from mcp.types import ImageContent, Resource, TextContent
8
+ from pydantic import AnyUrl
9
+
10
+ from .config import CHART_CLASSES
11
+ from .handlers import (
12
+ create_chart,
13
+ delete_chart,
14
+ export_chart_png,
15
+ get_chart_info,
16
+ get_chart_schema,
17
+ publish_chart,
18
+ update_chart,
19
+ )
20
+ from .tools import list_tools as get_tool_list
21
+
22
+ # Initialize the MCP server
23
+ app = Server("datawrapper-mcp")
24
+
25
+
26
+ @app.list_resources()
27
+ async def list_resources() -> list[Resource]:
28
+ """List available resources."""
29
+ return [
30
+ Resource(
31
+ uri=AnyUrl("datawrapper://chart-types"),
32
+ name="Available Chart Types",
33
+ mimeType="application/json",
34
+ description="List of available Datawrapper chart types and their Pydantic schemas",
35
+ )
36
+ ]
37
+
38
+
39
+ @app.read_resource()
40
+ async def read_resource(uri: AnyUrl) -> str:
41
+ """Read a resource by URI."""
42
+ if str(uri) == "datawrapper://chart-types":
43
+ chart_info = {}
44
+ for name, chart_class in CHART_CLASSES.items():
45
+ chart_info[name] = {
46
+ "class_name": chart_class.__name__,
47
+ "schema": chart_class.model_json_schema(),
48
+ }
49
+ return json.dumps(chart_info, indent=2)
50
+
51
+ raise ValueError(f"Unknown resource URI: {uri}")
52
+
53
+
54
+ @app.list_tools()
55
+ async def list_tools():
56
+ """List available tools."""
57
+ return await get_tool_list()
58
+
59
+
60
+ @app.call_tool()
61
+ async def call_tool(name: str, arguments: Any) -> Sequence[TextContent | ImageContent]:
62
+ """Handle tool calls."""
63
+ try:
64
+ if name == "create_chart":
65
+ return await create_chart(arguments)
66
+ elif name == "get_chart_schema":
67
+ return await get_chart_schema(arguments)
68
+ elif name == "publish_chart":
69
+ return await publish_chart(arguments)
70
+ elif name == "get_chart":
71
+ return await get_chart_info(arguments)
72
+ elif name == "update_chart":
73
+ return await update_chart(arguments)
74
+ elif name == "delete_chart":
75
+ return await delete_chart(arguments)
76
+ elif name == "export_chart_png":
77
+ return await export_chart_png(arguments)
78
+ else:
79
+ raise ValueError(f"Unknown tool: {name}")
80
+ except Exception as e:
81
+ return [TextContent(type="text", text=f"Error: {str(e)}")]
82
+
83
+
84
+ def main():
85
+ """Run the MCP server."""
86
+ import asyncio
87
+ from mcp.server.stdio import stdio_server
88
+
89
+ async def run():
90
+ async with stdio_server() as (read_stream, write_stream):
91
+ await app.run(
92
+ read_stream,
93
+ write_stream,
94
+ app.create_initialization_options(),
95
+ )
96
+
97
+ asyncio.run(run())
98
+
99
+
100
+ if __name__ == "__main__":
101
+ main()
datawrapper_mcp/tools.py ADDED
@@ -0,0 +1,286 @@
1
+ """Tool definitions for the Datawrapper MCP server."""
2
+
3
+ from mcp.types import Tool
4
+
5
+ from .config import CHART_CLASSES
6
+
7
+
8
+ async def list_tools() -> list[Tool]:
9
+ """List available tools."""
10
+ return [
11
+ Tool(
12
+ name="create_chart",
13
+ description=(
14
+ "⚠️ THIS IS THE DATAWRAPPER INTEGRATION ⚠️\n"
15
+ "Use this MCP tool for ALL Datawrapper chart creation.\n\n"
16
+ "DO NOT:\n"
17
+ "❌ Install the 'datawrapper' Python package\n"
18
+ "❌ Use the Datawrapper API directly\n"
19
+ "❌ Import 'from datawrapper import ...'\n"
20
+ "❌ Run pip install datawrapper\n\n"
21
+ "This MCP server IS the complete Datawrapper integration. All Datawrapper operations "
22
+ "should use the MCP tools provided by this server.\n\n"
23
+ "---\n\n"
24
+ "Create a Datawrapper chart with full control using Pydantic models. "
25
+ "This allows you to specify all chart properties including title, description, "
26
+ "visualization settings, axes, colors, and more. The chart_config should "
27
+ "be a complete Pydantic model dict matching the schema for the chosen chart type.\n\n"
28
+ "STYLING WORKFLOW:\n"
29
+ "1. Use get_chart_schema to explore all available options for your chart type\n"
30
+ "2. Refer to https://datawrapper.readthedocs.io/en/latest/ for detailed examples\n"
31
+ "3. Build your chart_config with the desired styling properties\n\n"
32
+ "Common styling patterns:\n"
33
+ '- Colors: {"color_category": {"sales": "#1d81a2", "profit": "#15607a"}}\n'
34
+ '- Line styling: {"lines": [{"column": "sales", "width": "style1", "interpolation": "curved"}]}\n'
35
+ '- Axis ranges: {"custom_range_y": [0, 100], "custom_range_x": [2020, 2024]}\n'
36
+ '- Grid formatting: {"y_grid_format": "0", "x_grid": "on", "y_grid": "on"}\n'
37
+ '- Tooltips: {"tooltip_number_format": "00.00", "tooltip_x_format": "YYYY"}\n'
38
+ '- Annotations: {"text_annotations": [{"x": "2023", "y": 50, "text": "Peak"}]}\n\n'
39
+ "See the documentation for chart-type specific examples and advanced patterns.\n\n"
40
+ 'Example data format: [{"date": "2024-01", "value": 100}, {"date": "2024-02", "value": 150}]'
41
+ ),
42
+ inputSchema={
43
+ "type": "object",
44
+ "properties": {
45
+ "data": {
46
+ "type": ["string", "array", "object"],
47
+ "description": (
48
+ "Chart data. RECOMMENDED: Pass data inline as a list or dict.\n\n"
49
+ "PREFERRED FORMATS (use these first):\n\n"
50
+ "1. List of records (RECOMMENDED):\n"
51
+ ' [{"year": 2020, "sales": 100}, {"year": 2021, "sales": 150}]\n\n'
52
+ "2. Dict of arrays:\n"
53
+ ' {"year": [2020, 2021], "sales": [100, 150]}\n\n'
54
+ "3. JSON string of format 1 or 2:\n"
55
+ ' \'[{"year": 2020, "sales": 100}]\'\n\n'
56
+ "ALTERNATIVE (only for extremely large datasets where inline data is impractical):\n\n"
57
+ "4. File path to CSV or JSON:\n"
58
+ ' "/path/to/data.csv" or "/path/to/data.json"\n'
59
+ " - Use only when inline data would be too large to pass directly\n"
60
+ " - CSV files are read directly\n"
61
+ " - JSON files must contain list of dicts or dict of arrays"
62
+ ),
63
+ },
64
+ "chart_type": {
65
+ "type": "string",
66
+ "enum": list(CHART_CLASSES.keys()),
67
+ "description": "Type of chart to create",
68
+ },
69
+ "chart_config": {
70
+ "type": "object",
71
+ "description": (
72
+ "Complete chart configuration as a Pydantic model dict. "
73
+ "Must match the schema for the chosen chart_type. "
74
+ "Use get_chart_schema to see the full schema."
75
+ ),
76
+ },
77
+ },
78
+ "required": ["data", "chart_type", "chart_config"],
79
+ },
80
+ ),
81
+ Tool(
82
+ name="get_chart_schema",
83
+ description=(
84
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
85
+ "This is part of the Datawrapper MCP server integration.\n\n"
86
+ "---\n\n"
87
+ "Get the Pydantic JSON schema for a specific chart type. This is your primary tool "
88
+ "for discovering styling and configuration options.\n\n"
89
+ "The schema shows:\n"
90
+ "- All available properties and their types\n"
91
+ "- Enum values (e.g., line widths, interpolation methods)\n"
92
+ "- Default values\n"
93
+ "- Detailed descriptions for each property\n\n"
94
+ "WORKFLOW: Use this tool first to explore options, then refer to "
95
+ "https://datawrapper.readthedocs.io/en/latest/ for detailed examples and patterns "
96
+ "showing how to use these properties in practice."
97
+ ),
98
+ inputSchema={
99
+ "type": "object",
100
+ "properties": {
101
+ "chart_type": {
102
+ "type": "string",
103
+ "enum": list(CHART_CLASSES.keys()),
104
+ "description": "Chart type to get schema for",
105
+ },
106
+ },
107
+ "required": ["chart_type"],
108
+ },
109
+ ),
110
+ Tool(
111
+ name="publish_chart",
112
+ description=(
113
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
114
+ "This is part of the Datawrapper MCP server integration.\n\n"
115
+ "---\n\n"
116
+ "Publish a Datawrapper chart to make it publicly accessible. "
117
+ "Returns the public URL of the published chart. "
118
+ "IMPORTANT: Only use this tool when the user explicitly requests to publish the chart. "
119
+ "Do not automatically publish charts after creation unless specifically asked."
120
+ ),
121
+ inputSchema={
122
+ "type": "object",
123
+ "properties": {
124
+ "chart_id": {
125
+ "type": "string",
126
+ "description": "ID of the chart to publish",
127
+ },
128
+ },
129
+ "required": ["chart_id"],
130
+ },
131
+ ),
132
+ Tool(
133
+ name="get_chart",
134
+ description=(
135
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
136
+ "This is part of the Datawrapper MCP server integration.\n\n"
137
+ "---\n\n"
138
+ "Get information about an existing Datawrapper chart, "
139
+ "including its metadata, data, and public URL if published."
140
+ ),
141
+ inputSchema={
142
+ "type": "object",
143
+ "properties": {
144
+ "chart_id": {
145
+ "type": "string",
146
+ "description": "ID of the chart to retrieve",
147
+ },
148
+ },
149
+ "required": ["chart_id"],
150
+ },
151
+ ),
152
+ Tool(
153
+ name="update_chart",
154
+ description=(
155
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
156
+ "This is part of the Datawrapper MCP server integration.\n\n"
157
+ "---\n\n"
158
+ "Update an existing Datawrapper chart's data or configuration using Pydantic models. "
159
+ "IMPORTANT: The chart_config must use high-level Pydantic fields only (title, intro, "
160
+ "byline, source_name, source_url, etc.). Do NOT use low-level serialized structures "
161
+ "like 'metadata', 'visualize', or other internal API fields.\n\n"
162
+ "STYLING UPDATES:\n"
163
+ "Use get_chart_schema to see available fields, then apply styling changes:\n"
164
+ '- Colors: {"color_category": {"sales": "#ff0000"}}\n'
165
+ '- Line properties: {"lines": [{"column": "sales", "width": "style2"}]}\n'
166
+ '- Axis settings: {"custom_range_y": [0, 200], "y_grid_format": "0,0"}\n'
167
+ '- Tooltips: {"tooltip_number_format": "0.0"}\n\n'
168
+ "See https://datawrapper.readthedocs.io/en/latest/ for detailed examples. "
169
+ "The provided config will be validated through Pydantic and merged with the existing "
170
+ "chart configuration."
171
+ ),
172
+ inputSchema={
173
+ "type": "object",
174
+ "properties": {
175
+ "chart_id": {
176
+ "type": "string",
177
+ "description": "ID of the chart to update",
178
+ },
179
+ "data": {
180
+ "type": ["string", "array", "object"],
181
+ "description": (
182
+ "Chart data. RECOMMENDED: Pass data inline as a list or dict.\n\n"
183
+ "PREFERRED FORMATS (use these first):\n\n"
184
+ "1. List of records (RECOMMENDED):\n"
185
+ ' [{"year": 2020, "sales": 100}, {"year": 2021, "sales": 150}]\n\n'
186
+ "2. Dict of arrays:\n"
187
+ ' {"year": [2020, 2021], "sales": [100, 150]}\n\n'
188
+ "3. JSON string of format 1 or 2:\n"
189
+ ' \'[{"year": 2020, "sales": 100}]\'\n\n'
190
+ "ALTERNATIVE (only for extremely large datasets where inline data is impractical):\n\n"
191
+ "4. File path to CSV or JSON:\n"
192
+ ' "/path/to/data.csv" or "/path/to/data.json"\n'
193
+ " - Use only when inline data would be too large to pass directly\n"
194
+ " - CSV files are read directly\n"
195
+ " - JSON files must contain list of dicts or dict of arrays"
196
+ ),
197
+ },
198
+ "chart_config": {
199
+ "type": "object",
200
+ "description": (
201
+ "Updated chart configuration using high-level Pydantic fields (optional). "
202
+ "Must use Pydantic model fields like 'title', 'intro', 'byline', etc. "
203
+ "Do NOT use raw API structures like 'metadata' or 'visualize'. "
204
+ "Use get_chart_schema to see valid fields. Will be validated and merged "
205
+ "with existing config."
206
+ ),
207
+ },
208
+ },
209
+ "required": ["chart_id"],
210
+ },
211
+ ),
212
+ Tool(
213
+ name="delete_chart",
214
+ description=(
215
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
216
+ "This is part of the Datawrapper MCP server integration.\n\n"
217
+ "---\n\n"
218
+ "Delete a Datawrapper chart permanently."
219
+ ),
220
+ inputSchema={
221
+ "type": "object",
222
+ "properties": {
223
+ "chart_id": {
224
+ "type": "string",
225
+ "description": "ID of the chart to delete",
226
+ },
227
+ },
228
+ "required": ["chart_id"],
229
+ },
230
+ ),
231
+ Tool(
232
+ name="export_chart_png",
233
+ description=(
234
+ "⚠️ DATAWRAPPER MCP TOOL ⚠️\n"
235
+ "This is part of the Datawrapper MCP server integration.\n\n"
236
+ "---\n\n"
237
+ "Export a Datawrapper chart as PNG and display it inline. "
238
+ "The chart must be created first using create_chart. "
239
+ "Supports high-resolution output via the zoom parameter. "
240
+ "IMPORTANT: Only use this tool when the user explicitly requests to see the chart image "
241
+ "or export it as PNG. Do not automatically export charts after creation unless specifically asked."
242
+ ),
243
+ inputSchema={
244
+ "type": "object",
245
+ "properties": {
246
+ "chart_id": {
247
+ "type": "string",
248
+ "description": "ID of the chart to export",
249
+ },
250
+ "width": {
251
+ "type": "integer",
252
+ "description": "Width of the image in pixels (optional, uses chart width if not specified)",
253
+ },
254
+ "height": {
255
+ "type": "integer",
256
+ "description": "Height of the image in pixels (optional, uses chart height if not specified)",
257
+ },
258
+ "plain": {
259
+ "type": "boolean",
260
+ "description": "If true, exports only the visualization without header/footer (default: false)",
261
+ "default": False,
262
+ },
263
+ "zoom": {
264
+ "type": "integer",
265
+ "description": "Scale multiplier for resolution, e.g., 2 = 2x resolution (default: 2)",
266
+ "default": 2,
267
+ },
268
+ "transparent": {
269
+ "type": "boolean",
270
+ "description": "If true, exports with transparent background (default: false)",
271
+ "default": False,
272
+ },
273
+ "border_width": {
274
+ "type": "integer",
275
+ "description": "Margin around visualization in pixels (default: 0)",
276
+ "default": 0,
277
+ },
278
+ "border_color": {
279
+ "type": "string",
280
+ "description": "Color of the border, e.g., '#FFFFFF' (optional, uses chart background if not specified)",
281
+ },
282
+ },
283
+ "required": ["chart_id"],
284
+ },
285
+ ),
286
+ ]
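
For reference, a hypothetical `create_chart` tool call following the input schema above might look like the sketch below (the `chart_config` keys are taken from the styling patterns quoted in the tool description and depend on the installed `datawrapper` package version):

```python
# Example arguments an MCP client might pass to the create_chart handler.
arguments = {
    "data": [
        {"year": 2020, "sales": 100},
        {"year": 2021, "sales": 150},
    ],
    "chart_type": "line",
    "chart_config": {
        "title": "Sales over time",
        "color_category": {"sales": "#1d81a2"},
        "custom_range_y": [0, 200],
    },
}
# result = await create_chart(arguments)
# -> list[TextContent] whose JSON payload includes chart_id and edit_url
```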
datawrapper_mcp/utils.py ADDED
@@ -0,0 +1,118 @@
1
+ """Utility functions for the Datawrapper MCP server."""
2
+
3
+ import json
4
+ import os
5
+
6
+ import pandas as pd
7
+
8
+
9
+ def get_api_token() -> str:
10
+ """Get the Datawrapper API token from environment."""
11
+ api_token = os.environ.get("DATAWRAPPER_ACCESS_TOKEN")
12
+ if not api_token:
13
+ raise ValueError(
14
+ "DATAWRAPPER_ACCESS_TOKEN environment variable is required. "
15
+ "Get your token from https://app.datawrapper.de/account/api-tokens"
16
+ )
17
+ return api_token
18
+
19
+
20
+ def json_to_dataframe(data: str | list | dict) -> pd.DataFrame:
21
+ """Convert JSON data to a pandas DataFrame.
22
+
23
+ Args:
24
+ data: One of:
25
+ - File path to CSV or JSON file (e.g., "/path/to/data.csv")
26
+ - List of records: [{"col1": val1, "col2": val2}, ...]
27
+ - Dict of arrays: {"col1": [val1, val2], "col2": [val3, val4]}
28
+ - JSON string in either format above
29
+
30
+ Returns:
31
+ pandas DataFrame
32
+
33
+ Examples:
34
+ >>> json_to_dataframe("/tmp/data.csv")
35
+ >>> json_to_dataframe("/tmp/data.json")
36
+ >>> json_to_dataframe([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
37
+ >>> json_to_dataframe({"a": [1, 3], "b": [2, 4]})
38
+ >>> json_to_dataframe('[{"a": 1, "b": 2}]')
39
+ """
40
+ if isinstance(data, str):
41
+ # Check if it's a file path that exists
42
+ if os.path.isfile(data):
43
+ if data.endswith(".csv"):
44
+ return pd.read_csv(data)
45
+ elif data.endswith(".json"):
46
+ with open(data) as f:
47
+ file_data = json.load(f)
48
+ # Recursively process the loaded JSON data
49
+ return json_to_dataframe(file_data)
50
+ else:
51
+ raise ValueError(
52
+ f"Unsupported file type: {data}\n\n"
53
+ "Supported file types:\n"
54
+ " - .csv (CSV files)\n"
55
+ " - .json (JSON files containing list of dicts or dict of arrays)"
56
+ )
57
+
58
+ # Check if it looks like CSV content (not a file path)
59
+ if "\n" in data and "," in data and not data.strip().startswith(("[", "{")):
60
+ raise ValueError(
61
+ "CSV strings are not supported. Please save to a file first.\n\n"
62
+ "Options:\n"
63
+ " 1. Save CSV to a file and pass the file path\n"
64
+ ' 2. Parse CSV to list of dicts: [{"col": val}, ...]\n'
65
+ ' 3. Parse CSV to dict of arrays: {"col": [vals]}\n\n'
66
+ "Example:\n"
67
+ ' data = [{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]'
68
+ )
69
+
70
+ # Try to parse as JSON string
71
+ try:
72
+ data = json.loads(data)
73
+ except json.JSONDecodeError as e:
74
+ raise ValueError(
75
+ f"Invalid JSON string: {e}\n\n"
76
+ "Expected one of:\n"
77
+ " 1. File path: '/path/to/data.csv' or '/path/to/data.json'\n"
78
+ ' 2. JSON string: \'[{"year": 2020, "value": 100}, ...]\'\n'
79
+ ' 3. JSON string: \'{"year": [2020, 2021], "value": [100, 150]}\''
80
+ )
81
+
82
+ if isinstance(data, list):
83
+ if not data:
84
+ raise ValueError(
85
+ "Data list is empty. Please provide at least one row of data."
86
+ )
87
+ if not all(isinstance(item, dict) for item in data):
88
+ raise ValueError(
89
+ "List format must contain dictionaries.\n\n"
90
+ "Expected format:\n"
91
+ ' [{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]\n\n'
92
+ f"Got: {type(data[0]).__name__} in list"
93
+ )
94
+ # List of records: [{"col1": val1, "col2": val2}, ...]
95
+ return pd.DataFrame(data)
96
+ elif isinstance(data, dict):
97
+ if not data:
98
+ raise ValueError(
99
+ "Data dict is empty. Please provide at least one column of data."
100
+ )
101
+ # Check if it's a dict of arrays (all values should be lists)
102
+ if not all(isinstance(v, list) for v in data.values()):
103
+ raise ValueError(
104
+ "Dict format must have lists as values.\n\n"
105
+ "Expected format:\n"
106
+ ' {"year": [2020, 2021], "value": [100, 150]}\n\n'
107
+ f"Got dict with values of type: {[type(v).__name__ for v in data.values()]}"
108
+ )
109
+ # Dict of arrays: {"col1": [val1, val2], "col2": [val3, val4]}
110
+ return pd.DataFrame(data)
111
+ else:
112
+ raise ValueError(
113
+ f"Unsupported data type: {type(data).__name__}\n\n"
114
+ "Data must be one of:\n"
115
+ ' 1. List of dicts: [{"year": 2020, "value": 100}, ...]\n'
116
+ ' 2. Dict of arrays: {"year": [2020, 2021], "value": [100, 150]}\n'
117
+ " 3. JSON string in either format above"
118
+ )
requirements.txt CHANGED
@@ -12,3 +12,8 @@ python-dotenv>=1.0.0
12
 
13
  # Utilities
14
  pydantic>=2.0.0
12
 
13
  # Utilities
14
  pydantic>=2.0.0
15
+
16
+ # Datawrapper chart creation
17
+ datawrapper>=2.0.7
18
+ mcp>=1.20.0
19
+ pandas>=2.0.0
src/datawrapper_client.py ADDED
@@ -0,0 +1,336 @@
1
+ """
2
+ Datawrapper Chart Generation Client
3
+
4
+ Integrates RAG pipeline with Datawrapper API for intelligent chart creation.
5
+ """
6
+
7
+ import json
8
+ import os
9
+ from typing import Optional, Tuple
10
+ import pandas as pd
11
+
12
+ from .prompts import (
13
+ CHART_SELECTION_SYSTEM_PROMPT,
14
+ get_chart_selection_prompt,
15
+ get_chart_styling_prompt
16
+ )
17
+ from .llm_client import create_llm_client
18
+ from .rag_pipeline import GraphicsDesignPipeline
19
+
20
+ # Import Datawrapper MCP handlers directly
21
+ from datawrapper_mcp.handlers.create import create_chart as mcp_create_chart
22
+ from datawrapper_mcp.handlers.publish import publish_chart as mcp_publish_chart
23
+ from datawrapper_mcp.handlers.retrieve import get_chart_info as mcp_get_chart_info
24
+
25
+
26
+ def get_data_summary(df: pd.DataFrame) -> str:
27
+ """
28
+ Generate a summary of the DataFrame structure and content.
29
+
30
+ Args:
31
+ df: Input DataFrame
32
+
33
+ Returns:
34
+ String summary of data characteristics
35
+ """
36
+ summary_parts = []
37
+
38
+ # Basic info
39
+ summary_parts.append(f"Rows: {len(df)}, Columns: {len(df.columns)}")
40
+ summary_parts.append(f"Column names: {', '.join(df.columns.tolist())}")
41
+
42
+ # Column types
43
+ numeric_cols = df.select_dtypes(include=['number']).columns.tolist()
44
+ text_cols = df.select_dtypes(include=['object']).columns.tolist()
45
+ date_cols = df.select_dtypes(include=['datetime']).columns.tolist()
46
+
47
+ if numeric_cols:
48
+ summary_parts.append(f"Numeric columns: {', '.join(numeric_cols)}")
49
+ if text_cols:
50
+ summary_parts.append(f"Text columns: {', '.join(text_cols)}")
51
+ if date_cols:
52
+ summary_parts.append(f"Date columns: {', '.join(date_cols)}")
53
+
54
+ # Data preview (first 3 rows)
55
+ summary_parts.append(f"\nData preview:\n{df.head(3).to_string()}")
56
+
57
+ return "\n".join(summary_parts)
58
+
59
+
60
+ def analyze_csv_for_chart_type(
61
+ df: pd.DataFrame,
62
+ user_prompt: str,
63
+ rag_pipeline: GraphicsDesignPipeline
64
+ ) -> Tuple[str, str]:
65
+ """
66
+ Use RAG and LLM to determine the best chart type for the data.
67
+
68
+ Args:
69
+ df: Input DataFrame
70
+ user_prompt: User's description of what they want to visualize
71
+ rag_pipeline: RAG pipeline for retrieving best practices
72
+
73
+ Returns:
74
+ Tuple of (chart_type, reasoning)
75
+ """
76
+ # Get data summary
77
+ data_summary = get_data_summary(df)
78
+
79
+ # Query RAG for chart selection best practices
80
+ rag_query = f"chart type selection for {user_prompt}"
81
+ relevant_docs = rag_pipeline.retrieve_documents(rag_query, k=3)
82
+ rag_context = rag_pipeline.vectorstore.format_documents_for_context(relevant_docs)
83
+
84
+ # Generate chart type recommendation using LLM
85
+ chart_prompt = get_chart_selection_prompt()
86
+ full_prompt = chart_prompt.format(
87
+ user_prompt=user_prompt,
88
+ data_summary=data_summary,
89
+ rag_context=rag_context
90
+ )
91
+
92
+ llm_client = create_llm_client(
93
+ model=os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
94
+ temperature=0.3, # Lower temperature for more deterministic chart selection
95
+ max_tokens=500
96
+ )
97
+
98
+ response = llm_client.generate(
99
+ prompt=full_prompt,
100
+ system_prompt=CHART_SELECTION_SYSTEM_PROMPT
101
+ )
102
+
103
+ # Parse JSON response
104
+ try:
105
+ # Extract JSON from response (handle markdown code blocks)
106
+ response_clean = response.strip()
107
+ if "```json" in response_clean:
108
+ response_clean = response_clean.split("```json")[1].split("```")[0].strip()
109
+ elif "```" in response_clean:
110
+ response_clean = response_clean.split("```")[1].split("```")[0].strip()
111
+
112
+ result = json.loads(response_clean)
113
+ chart_type = result.get("chart_type", "line")
114
+ reasoning = result.get("reasoning", "")
115
+
116
+ # Validate chart type
117
+ valid_types = ["bar", "line", "area", "scatter", "column", "stacked_bar", "arrow", "multiple_column"]
118
+ if chart_type not in valid_types:
119
+ chart_type = "line" # Default fallback
120
+
121
+ return chart_type, reasoning
122
+ except Exception as e:
123
+ print(f"Error parsing chart type response: {e}")
124
+ print(f"Response was: {response}")
125
+ # Default to line chart
126
+ return "line", "Using default line chart due to parsing error"
127
+
128
+
129
+ def generate_chart_config(
130
+ chart_type: str,
131
+ df: pd.DataFrame,
132
+ user_prompt: str,
133
+ rag_pipeline: GraphicsDesignPipeline
134
+ ) -> dict:
135
+ """
136
+ Generate Datawrapper chart configuration using RAG and LLM.
137
+
138
+ Args:
139
+ chart_type: Type of chart to create
140
+ df: Input DataFrame
141
+ user_prompt: User's visualization request
142
+ rag_pipeline: RAG pipeline for retrieving design best practices
143
+
144
+ Returns:
145
+ Dictionary with chart configuration
146
+ """
147
+ # Get data summary
148
+ data_summary = get_data_summary(df)
149
+
150
+ # Query RAG for styling and design best practices
151
+ rag_query = f"chart design best practices colors accessibility {chart_type}"
152
+ relevant_docs = rag_pipeline.retrieve_documents(rag_query, k=3)
153
+ rag_context = rag_pipeline.vectorstore.format_documents_for_context(relevant_docs)
154
+
155
+ # Generate chart configuration using LLM
156
+ styling_prompt = get_chart_styling_prompt()
157
+ full_prompt = styling_prompt.format(
158
+ chart_type=chart_type,
159
+ user_prompt=user_prompt,
160
+ data_summary=data_summary,
161
+ rag_context=rag_context
162
+ )
163
+
164
+ llm_client = create_llm_client(
165
+ model=os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
166
+ temperature=0.5,
167
+ max_tokens=800
168
+ )
169
+
170
+ response = llm_client.generate(
171
+ prompt=full_prompt,
172
+ system_prompt="You are a data visualization expert. Generate valid JSON configuration for Datawrapper charts."
173
+ )
174
+
175
+ # Parse JSON response
176
+ try:
177
+ # Extract JSON from response
178
+ response_clean = response.strip()
179
+ if "```json" in response_clean:
180
+ response_clean = response_clean.split("```json")[1].split("```")[0].strip()
181
+ elif "```" in response_clean:
182
+ response_clean = response_clean.split("```")[1].split("```")[0].strip()
183
+
184
+ config = json.loads(response_clean)
185
+
186
+ # Ensure basic required fields
187
+ if "title" not in config:
188
+ config["title"] = user_prompt[:100] # Use prompt as fallback title
189
+
190
+ return config
191
+ except Exception as e:
192
+ print(f"Error parsing chart config: {e}")
193
+ print(f"Response was: {response}")
194
+ # Return minimal config
195
+ return {
196
+ "title": user_prompt[:100] if user_prompt else "Data Visualization",
197
+ "source_name": "User Data"
198
+ }
199
+
200
+
201
+ async def create_and_publish_chart(
202
+ df: pd.DataFrame,
203
+ user_prompt: str,
204
+ rag_pipeline: GraphicsDesignPipeline,
205
+ api_token: Optional[str] = None
206
+ ) -> dict:
207
+ """
208
+ Complete workflow: analyze data, select chart type, create and publish chart.
209
+
210
+ Args:
211
+ df: Input DataFrame
212
+ user_prompt: User's visualization request
213
+ rag_pipeline: RAG pipeline instance
214
+ api_token: Datawrapper API token (defaults to env var)
215
+
216
+ Returns:
217
+ Dictionary with chart info including iframe URL
218
+ """
219
+ if api_token is None:
220
+ api_token = os.getenv("DATAWRAPPER_ACCESS_TOKEN")
221
+ if not api_token:
222
+ raise ValueError("DATAWRAPPER_ACCESS_TOKEN not found in environment")
223
+
224
+ try:
225
+ # Step 1: Analyze data and select chart type
226
+ chart_type, reasoning = analyze_csv_for_chart_type(df, user_prompt, rag_pipeline)
227
+
228
+ # Step 2: Generate chart configuration
229
+ chart_config = generate_chart_config(chart_type, df, user_prompt, rag_pipeline)
230
+
231
+ # Step 3: Convert DataFrame to list of dicts for Datawrapper
232
+ data_list = df.to_dict('records')
233
+
234
+ # Step 4: Create chart using MCP handler
235
+ create_args = {
236
+ "data": data_list,
237
+ "chart_type": chart_type,
238
+ "chart_config": chart_config
239
+ }
240
+
241
+ create_result = await mcp_create_chart(create_args)
242
+
243
+ if not create_result or len(create_result) == 0:
244
+ raise ValueError("Empty response from chart creation")
245
+
246
+ result_text = create_result[0].text
247
+
248
+ if not result_text or result_text.strip() == "":
249
+ raise ValueError("Empty text in chart creation response")
250
+
251
+ result_data = json.loads(result_text)
252
+
253
+ chart_id = result_data.get("chart_id")
254
+ if not chart_id:
255
+ raise ValueError(f"Failed to get chart_id from creation response. Response was: {result_data}")
256
+
257
+ # Step 5: Try to publish chart using MCP handler
258
+ publish_success = False
259
+ publish_message = ""
260
+ try:
261
+ publish_args = {"chart_id": chart_id}
262
+ publish_result = await mcp_publish_chart(publish_args)
263
+ publish_text = publish_result[0].text
264
+ publish_data = json.loads(publish_text)
265
+ publish_success = True
266
+ publish_message = publish_data.get("message", "Published successfully")
267
+ except Exception as publish_error:
268
+ publish_message = f"Publish failed: {str(publish_error)}"
269
+
270
+ # Step 6: Get full chart info using MCP handler
271
+ chart_info_args = {"chart_id": chart_id}
272
+ chart_info_result = await mcp_get_chart_info(chart_info_args)
273
+ chart_info_text = chart_info_result[0].text
274
+ chart_info = json.loads(chart_info_text)
275
+
276
+ # Return complete info
277
+ return {
278
+ "success": True,
279
+ "chart_id": chart_id,
280
+ "chart_type": chart_type,
281
+ "reasoning": reasoning,
282
+ "public_url": chart_info.get("public_url"),
283
+ "edit_url": chart_info.get("edit_url"),
284
+ "published": publish_success,
285
+ "publish_message": publish_message,
286
+ "title": chart_config.get("title", "Chart")
287
+ }
288
+
289
+ except json.JSONDecodeError as e:
290
+ error_msg = f"JSON parsing error: {str(e)}"
291
+ print(f"Error in chart creation: {error_msg}")
292
+ print(f"Failed to parse: {result_text if 'result_text' in locals() else 'N/A'}")
293
+ return {
294
+ "success": False,
295
+ "error": error_msg,
296
+ "chart_type": chart_type if 'chart_type' in locals() else None,
297
+ "public_url": None
298
+ }
299
+ except Exception as e:
300
+ error_msg = f"{type(e).__name__}: {str(e)}"
301
+ print(f"Error in chart creation: {error_msg}")
302
+ import traceback
303
+ traceback.print_exc()
304
+ return {
305
+ "success": False,
306
+ "error": error_msg,
307
+ "chart_type": chart_type if 'chart_type' in locals() else None,
308
+ "public_url": None
309
+ }
310
+
311
+
312
+ def get_iframe_html(chart_url: str, height: int = 600) -> str:
313
+ """
314
+ Generate iframe HTML for embedding a Datawrapper chart.
315
+
316
+ Args:
317
+ chart_url: Public URL of the chart
318
+ height: Height of iframe in pixels
319
+
320
+ Returns:
321
+ HTML string with iframe
322
+ """
323
+ if not chart_url:
324
+ return "<div style='padding: 50px; text-align: center;'>No chart available</div>"
325
+
326
+ return f"""
327
+ <div style="width: 100%; height: {height}px;">
328
+ <iframe
329
+ src="{chart_url}"
330
+ style="width: 100%; height: 100%; border: none;"
331
+ frameborder="0"
332
+ scrolling="no"
333
+ aria-label="Chart">
334
+ </iframe>
335
+ </div>
336
+ """
src/prompts.py CHANGED
@@ -126,3 +126,110 @@ def get_followup_prompt() -> SimplePromptTemplate:
 def get_technique_recommendation_prompt() -> SimplePromptTemplate:
     """Get the technique recommendation prompt template"""
     return TECHNIQUE_RECOMMENDATION_PROMPT
+
+
+# =============================================================================
+# CHART GENERATION PROMPTS (for Datawrapper integration)
+# =============================================================================
+
+CHART_SELECTION_SYSTEM_PROMPT = """You are an expert data visualization advisor specialized in selecting the optimal chart type for data storytelling.
+
+Your task is to analyze:
+1. The user's intent and goal (what story they want to tell)
+2. The structure and characteristics of their data
+3. Best practices from visualization research
+
+You must respond with a JSON object containing:
+- "chart_type": one of [bar, line, area, scatter, column, stacked_bar, arrow, multiple_column]
+- "reasoning": brief explanation of why this chart type is best
+- "data_insights": key patterns or features in the data that inform the choice"""
+
+CHART_SELECTION_PROMPT_TEMPLATE = """USER REQUEST: {user_prompt}
+
+DATA STRUCTURE:
+{data_summary}
+
+VISUALIZATION BEST PRACTICES (from knowledge base):
+{rag_context}
+
+Based on the user's request, the data characteristics, and visualization best practices:
+
+1. Analyze the data type:
+   - Time series → line, area charts
+   - Categorical comparisons → bar, column charts
+   - Correlations/relationships → scatter plots
+   - Part-to-whole → stacked bar charts
+   - Change/movement → arrow charts
+   - Multiple categories over time → multiple column charts
+
+2. Consider the user's storytelling goal:
+   - Showing trends over time
+   - Comparing categories
+   - Revealing correlations
+   - Displaying composition
+   - Highlighting change
+
+3. Apply best practices from research:
+   - Accessibility and clarity
+   - Appropriate for data density
+   - Effective for the message
+
+Respond with a JSON object only:
+{{
+    "chart_type": "one of [bar, line, area, scatter, column, stacked_bar, arrow, multiple_column]",
+    "reasoning": "why this chart type is optimal for this data and intent",
+    "data_insights": "key patterns that inform the visualization approach"
+}}"""
+
+CHART_STYLING_PROMPT_TEMPLATE = """You are creating a Datawrapper {chart_type} chart configuration.
+
+USER REQUEST: {user_prompt}
+
+DATA STRUCTURE:
+{data_summary}
+
+DESIGN BEST PRACTICES (from knowledge base):
+{rag_context}
+
+IMPORTANT: You must ONLY include these fields in your JSON response:
+- title (string, required): Clear, descriptive chart title
+- intro (string, optional): Brief explanation
+- byline (string, optional): Author/source attribution
+- source_name (string, optional): Data source name
+- source_url (string, optional): Link to data source
+
+DO NOT include any other fields like:
+- styling, options, data, chart_type, colors, labels, annotations, tooltips
+- metadata, visualize, or any internal fields
+
+These other fields will cause validation errors. Keep it simple with just the 5 fields listed above.
+
+Example valid response:
+{{
+    "title": "Sales Trends 2024",
+    "intro": "Monthly sales showing 30% growth",
+    "source_name": "Company Data",
+    "source_url": "https://example.com"
+}}
+
+Generate a minimal, valid JSON configuration with ONLY the allowed fields above."""
+
+CHART_SELECTION_PROMPT = SimplePromptTemplate(
+    template=CHART_SELECTION_PROMPT_TEMPLATE,
+    input_variables=["user_prompt", "data_summary", "rag_context"]
+)
+
+CHART_STYLING_PROMPT = SimplePromptTemplate(
+    template=CHART_STYLING_PROMPT_TEMPLATE,
+    input_variables=["chart_type", "user_prompt", "data_summary", "rag_context"]
+)
+
+
+def get_chart_selection_prompt() -> SimplePromptTemplate:
+    """Get the chart type selection prompt template"""
+    return CHART_SELECTION_PROMPT
+
+
+def get_chart_styling_prompt() -> SimplePromptTemplate:
+    """Get the chart styling configuration prompt template"""
+    return CHART_STYLING_PROMPT
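For orientation, this is roughly how the selection template is filled in by `analyze_csv_for_chart_type` above; the literal strings are placeholder inputs, not real retrieval output, and the `src.` import path is assumed:

```python
from src.prompts import CHART_SELECTION_SYSTEM_PROMPT, get_chart_selection_prompt

# Fill the template the same way datawrapper_client.py does
prompt = get_chart_selection_prompt().format(
    user_prompt="Compare quarterly revenue across regions",
    data_summary="Rows: 8, Columns: 3\nColumn names: quarter, region, revenue",
    rag_context="(retrieved best-practice passages would be inserted here)",
)

# The pair (CHART_SELECTION_SYSTEM_PROMPT, prompt) is sent to the LLM client,
# which is expected to reply with the JSON object described in the template.
```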
start.sh ADDED
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+# Start script for Viz LLM with Datawrapper integration
+
+echo "🚀 Starting Viz LLM..."
+echo ""
+
+# Check for required environment variables
+if [ ! -f .env ]; then
+    echo "⚠️ Error: .env file not found!"
+    echo "Please create a .env file based on .env.example"
+    exit 1
+fi
+
+# Check if required packages are installed
+echo "📦 Checking dependencies..."
+python -c "import gradio; import datawrapper; import pandas; import mcp" 2>/dev/null
+if [ $? -ne 0 ]; then
+    echo "⚠️ Some dependencies are missing. Installing..."
+    pip install -r requirements.txt
+fi
+
+echo ""
+echo "✓ Dependencies OK"
+echo ""
+echo "Starting Gradio app..."
+echo "Once started, open your browser to: http://localhost:7860"
+echo ""
+
+# Run the app
+python app.py