A newer version of the Gradio SDK is available:
6.1.0
title: Science storyteller
sdk: gradio
emoji: ๐
pinned: true
short_description: science told with ease
๐ง Science Storyteller: Research to Podcast
MCP's 1st Birthday Hackathon Submission
Track: Track 2 - MCP in Action (Multimodal)
Tag: mcp-in-action-track-multimodal
๐ฏ Project Overview
Science Storyteller transforms complex scientific research papers into accessible, engaging audio podcasts. Enter any research topic, and our AI-powered system will:
- Search for relevant papers using Semantic Scholar API (all research fields)
- Analyze and summarize the research using Claude AI
- Generate an engaging podcast script optimized for storytelling
- Convert to audio using Kokoro-82M (HF Inference API) - high-quality, open-source
- Deliver a complete podcast episode you can listen to anywhere
This project makes cutting-edge science accessible to everyoneโfrom researchers to curious learnersโthrough the power of audio storytelling.
โจ Key Features
๐ค Autonomous Agent Behavior
- Planning: Intelligently enhances search queries for better results
- Reasoning: Evaluates and selects the most relevant paper from multiple results
- Execution: Orchestrates multi-step workflow from search to audio generation
- Self-correction: Implements fallback strategies when API calls fail
๐ง Direct API Integration
- Semantic Scholar API: Research paper retrieval across all scientific fields
- Direct HTTP requests: Simple, reliable, production-ready (no MCP subprocess overhead)
- Claude AI: Advanced summarization and script generation via Anthropic API
- Proper error handling: Retry logic, rate limiting, fallback strategies
๐จ Polished User Experience
- Clean, responsive Gradio interface
- Real-time progress indicators
- Mobile-friendly design
- Example topics for quick start
- Tabbed output (Audio, Summary, Script, Source)
๐ต Multimodal Output
- Text: Comprehensive summaries and podcast scripts
- Audio: High-quality WAV podcasts via Kokoro-82M (HF Inference API)
- Metadata: Full source paper citations and links
๐๏ธ Architecture
โโโโโโโโโโโโโโโ
โ User โ Enters research topic
โโโโโโโโฌโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Gradio Interface (app.py) โ
โ - User input handling โ
โ - Progress tracking โ
โ - Result display โ
โโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Science Storyteller Orchestrator โ
โ - Autonomous workflow planning โ
โ - Agent coordination โ
โ - Error handling & recovery โ
โโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโบ ResearchAgent โโโบ Semantic Scholar API (Direct HTTP)
โ (Search & retrieve papers across all fields)
โ
โโโโบ AnalysisAgent โโโบ Claude AI โโโบ Anthropic API
โ (Summarize & create script)
โ
โโโโบ AudioAgent โโโบ Kokoro-82M โโโบ HF Inference API
(Text-to-speech conversion - high quality, open-source)
Directory Structure
app/
โโโ app.py # Main Gradio application
โโโ requirements.txt # Python dependencies
โโโ README.md # This file
โโโ .env.example # Environment variable template
โโโ .gitignore # Git ignore rules
โ
โโโ agents/ # Autonomous agents
โ โโโ __init__.py
โ โโโ research_agent.py # Paper search & retrieval
โ โโโ analysis_agent.py # Summarization & scripting
โ โโโ audio_agent.py # Text-to-speech conversion
โ
โโโ mcp_tools/ # API integrations
โ โโโ __init__.py
โ โโโ scholar_tool.py # Semantic Scholar Direct API client
โ โโโ llm_tool.py # Claude AI wrapper
โ
โโโ utils/ # Utility functions
โ โโโ __init__.py
โ โโโ script_formatter.py # Script formatting
โ โโโ audio_processor.py # Audio file handling
โ
โโโ assets/ # Generated content
โโโ audio/ # Generated podcasts
โโโ examples/ # Example outputs
๐ Getting Started
Prerequisites
- Python 3.10+
- API Keys:
- Semantic Scholar API (optional, for higher rate limits)
- Anthropic API for Claude AI
- Hugging Face Token for Kokoro-82M TTS
Installation
Clone the repository:
git clone <your-repo-url> cd appInstall Python dependencies:
pip install -r requirements.txtSet up environment variables:
cp .env.example .env # Edit .env and add your API keysConfigure your
.envfile:SEMANTIC_SCHOLAR_API=your_semantic_scholar_api_key_here # Optional ANTHROPIC_API_KEY=your_anthropic_api_key_here HUGGINGFACE_TOKEN=your_hf_token_here # For Kokoro-82M TTSRun the application:
python app.pyOpen your browser: Navigate to
http://localhost:7860
Using in Hugging Face Spaces
This project is designed to run seamlessly on Hugging Face Spaces:
Add your API keys in Space Settings โ Secrets:
SEMANTIC_SCHOLAR_API(optional, but recommended for higher rate limits)ANTHROPIC_API_KEYHUGGINGFACE_TOKEN(for Kokoro-82M TTS via Inference API)
The Space will automatically install dependencies and launch
๐ฌ Usage
- Enter a research topic (e.g., "AlphaFold", "CRISPR gene editing", "quantum computing")
- Click "Generate Podcast"
- Wait for the AI agents to search, analyze, and generate content (~1-2 minutes)
- Listen to your podcast in the Audio tab
- Read the summary and script in their respective tabs
- Check the source paper in the Source Paper tab
Example Topics
Artificial Intelligence:
- Transformer neural network architecture
- AlphaFold 3 protein structure prediction
- GPT language models
- Diffusion models for image generation
Medicine & Health:
- mRNA vaccine technology and development
- Tuberculosis vaccine BCG immunotherapy
- Cancer immunotherapy checkpoint inhibitors
- CRISPR Cas9 gene editing applications
Astronomy & Physics:
- Comet 3I/ATLAS interstellar trajectory
- Gravitational waves detection
- Quantum entanglement Bell inequality
- Dark matter detection experiments
Climate & Environment:
- Climate change ocean acidification
- Carbon capture and storage technologies
- Renewable energy grid integration
- Arctic ice sheet dynamics
Biology:
- Gut microbiome metabolic pathways
- Neuroscience brain plasticity
- Evolutionary genetics adaptation
๐ ๏ธ Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Frontend | Gradio 5.x | Interactive web interface |
| Backend | Python 3.10+ | Application logic |
| Research API | Semantic Scholar | Direct HTTP API for paper retrieval |
| AI Analysis | Claude 3.5 Sonnet | Summarization & script generation |
| Audio | Kokoro-82M | HF Inference API TTS (Apache-2.0) |
| HTTP Client | requests library | Reliable API communication |
| Deployment | Hugging Face Spaces | Cloud hosting |
๐ฏ Hackathon Requirements Coverage
โ Track 2: MCP in Action
Autonomous Agent Behavior:
- Planning (query enhancement, paper selection)
- Reasoning (best paper evaluation)
- Execution (multi-step workflow orchestration)
- Self-correction (fallback strategies)
API Integration:
- Uses Semantic Scholar API directly for reliable research retrieval
- Follows REST API best practices
- Demonstrates proper async HTTP client usage
- Rate limiting and retry logic implemented
Gradio Application:
- Built with Gradio 5.x
- Professional UI/UX
- Progress indicators
- Mobile-responsive
Real-world Value:
- Makes research accessible to non-experts
- Saves time for researchers doing literature review
- Educational tool for science communication
- Multimodal output (text + audio)
๐๏ธ Advanced Features (Bonus)
- Context Engineering: Optimized prompts for summarization and script generation
- Error Handling: Comprehensive fallback strategies with retry logic
- Caching: Efficient file management
- Multimodal: Combines text analysis with audio generation
- Production-ready: Direct API calls, no subprocess dependencies
๐ Performance
- Search Speed: < 5 seconds for paper retrieval
- Analysis Time: 10-20 seconds for summarization
- Script Generation: 10-20 seconds
- Audio Synthesis: 30-60 seconds (varies by length)
- Total Time: ~1-2 minutes for complete workflow
๐ฅ Demo & Links
๐น Demo Video
Coming Soon: Watch the demo (1-5 minutes)
The demo showcases:
- Complete workflow from topic input to podcast output
- Autonomous agent behavior
- Direct API integration
- User interface features
๐ฑ Social Media
Coming Soon: Social media post link
๐งช Testing
Run the test suite to verify all components:
# Test Semantic Scholar API integration
python test_scholar_direct.py
# Test individual components
python test_components.py
๐ค Contributing
This project was created for the MCP's 1st Birthday Hackathon (November 14-30, 2025). Feel free to:
- Report bugs via Issues
- Suggest improvements
- Fork and extend for your own use cases
๐ License
MIT License - feel free to use this project for learning and development.
๐ Acknowledgments
- Anthropic for the Model Context Protocol and Claude AI
- Gradio for the amazing web framework
- Semantic Scholar for comprehensive research paper access across all fields
- Kokoro-82M (@hexgrad) for the excellent open-source TTS model
- Hugging Face for hosting, infrastructure, and Inference API
- MCP Community for the hackathon opportunity
๐ฎ Future Enhancements
Potential improvements for future versions:
- Support for multiple research sources (arXiv, PubMed, etc.)
- Multiple voice options for narration
- Podcast series generation for related topics
- Export to various audio formats
- Integration with podcast platforms
- Multi-language support
- User accounts for saving favorite podcasts
- Custom voice training
- Background music and sound effects
- Batch processing for multiple topics
๐ง Contact
Created for MCP's 1st Birthday Hackathon 2025
Track 2: MCP in Action (Multimodal)
Made with โค๏ธ for science communication and AI innovation