# Science Storyteller - Learning Guide

> **For developers new to async Python, OOP, and the Model Context Protocol (MCP)**  
> A step-by-step guide to understanding the Science Storyteller codebase

---

## 📚 Table of Contents

0. [Architecture](#architecture)
1. [Learning Philosophy](#learning-philosophy)
2. [Object-Oriented Programming Basics](#object-oriented-programming-basics)
3. [Async/Await Deep Dive](#asyncawait-deep-dive)
4. [Module-by-Module Learning Path](#module-by-module-learning-path)
5. [Hands-On Exercises](#hands-on-exercises)
6. [Common Patterns Explained](#common-patterns-explained)
7. [Debugging Tips](#debugging-tips)
8. [Further Resources](#further-resources)
9. [Testing Strategy](#-testing-strategy)

---

## Architecture 

This diagram shows how a user request flows through the system.

```mermaid
graph TD
    subgraph User Interface
        A[Gradio UI]
    end

    subgraph Orchestration Layer
        B(app.py: ScienceStoryteller)
    end

    subgraph Agent Layer
        C[agents/research_agent.py]
        D[agents/analysis_agent.py]
        E[agents/audio_agent.py]
    end

    subgraph Tool Layer
        F(mcp_tools/arxiv_tool.py)
        G(mcp_tools/llm_tool.py)
        H(ElevenLabs API)
    end

    subgraph External Services
        I[arXiv MCP Server]
        J[Anthropic Claude API]
        K[ElevenLabs TTS Service]
    end

    A -- User Input (Topic) --> B
    B -- 1. search(topic) --> C
    C -- 2. search_papers(query) --> F
    F -- 3. call_tool --> I
    I -- 4. Paper Results --> F
    F -- 5. Papers --> C
    C -- 6. Papers --> B
    B -- 7. summarize_and_script(paper) --> D
    D -- 8. summarize_paper(paper) --> G
    G -- 9. API Call --> J
    J -- 10. Summary --> G
    G -- 11. Summary --> D
    D -- 12. Script --> B
    B -- 13. text_to_speech(script) --> E
    E -- 14. API Call --> H
    H -- 15. API Call --> K
    K -- 16. Audio MP3 --> H
    H -- 17. Audio File Path --> E
    E -- 18. Audio Path --> B
    B -- 19. Results (Summary, Audio, etc.) --> A
```

---

## Python Logging Module

### What is Logging?

Logging is Python's built-in system for tracking events, debugging, and monitoring your application. Unlike `print()` statements, log messages can be filtered by severity, routed to files, and traced back to the module that produced them.

### Basic Setup

```python
import logging

# Create a logger instance specific to this module
logger = logging.getLogger(__name__)

# Configure logging to display messages
logging.basicConfig(
    level=logging.INFO,  # Show INFO and above (INFO, WARNING, ERROR, CRITICAL)
    format='%(levelname)s - %(name)s - %(message)s'
)

# Now you can log messages
logger.info("Audio processor functions module loaded.")
```

### Why Use `__name__` with Logger?

**Benefits of `getLogger(__name__)`:**

1. **Hierarchical organization**: If your code is imported as a module (like `utils.audio_processor`), the logger name will be `"utils.audio_processor"` instead of `"__main__"`. This creates a logger hierarchy that helps organize logs from different parts of your app.

2. **Filtering by module**: You can configure different log levels for different parts of your application:
   ```python
   logging.getLogger("agents").setLevel(logging.DEBUG)  # Verbose for agents
   logging.getLogger("utils").setLevel(logging.WARNING)  # Quiet for utils
   ```

3. **Identifies source**: In log output, you can see exactly which module generated each message, making debugging much easier.

4. **Best practice**: Prevents logger name conflicts and follows Python conventions.
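
To see the hierarchy in action, here is a minimal sketch. The logger names are assumptions chosen to mirror the project layout; no real project modules are imported. Levels are set only on the parent loggers, and the children inherit them through the dotted names.

```python
import logging

# Hypothetical names, chosen only to mirror the project's package layout.
agents_logger = logging.getLogger("agents")
research_logger = logging.getLogger("agents.research_agent")
utils_logger = logging.getLogger("utils.audio_processor")

# Set levels on the parent packages only.
agents_logger.setLevel(logging.DEBUG)
logging.getLogger("utils").setLevel(logging.WARNING)

# Child loggers have no level of their own, so they inherit from the parent.
print(research_logger.parent.name)                   # agents
print(research_logger.isEnabledFor(logging.DEBUG))   # True
print(utils_logger.isEnabledFor(logging.INFO))       # False
```

This is why `getLogger(__name__)` pays off: the dotted module path becomes the logger name, so one `setLevel()` call on a package name covers every module inside it.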

### Log Levels

From least to most severe:

| Level | When to Use | Example |
|-------|-------------|---------|
| `DEBUG` | Detailed diagnostic information | `logger.debug(f"Variable x = {x}")` |
| `INFO` | General informational messages | `logger.info("Processing started")` |
| `WARNING` | Something unexpected, but not an error | `logger.warning("Cache miss, fetching from API")` |
| `ERROR` | An error occurred, but app can continue | `logger.error(f"Failed to load file: {e}")` |
| `CRITICAL` | Serious error, app may crash | `logger.critical("Database connection lost!")` |

### Why Logging Doesn't Show by Default

**The problem:** By default, loggers only show messages at WARNING level and above. Your `logger.info()` calls are ignored!

**The solution:** Configure logging with `basicConfig()` to set the minimum level:

```python
logging.basicConfig(level=logging.INFO)  # Now INFO messages will appear
```

### Format String Explained

```python
format='%(levelname)s - %(name)s - %(message)s'
```

This creates output like:
```
INFO - __main__ - Audio processor functions module loaded.
```

- `%(levelname)s` → Log level (INFO, ERROR, etc.)
- `%(name)s` → Logger name (from `__name__`)
- `%(message)s` → Your actual message

**Note:** You can add timestamps with `%(asctime)s` if you need them, but for simple learning it's cleaner without.
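
If you do want timestamps later, the sketch below shows the same format with `%(asctime)s` added. It builds a `LogRecord` by hand so the output does not depend on any global logging configuration; the logger name and message are made up for illustration.

```python
import logging

# Same fields as above, plus a timestamp in front.
formatter = logging.Formatter(
    fmt="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    datefmt="%H:%M:%S",
)

# Build a record by hand so the example is independent of basicConfig().
record = logging.LogRecord(
    name="utils.audio_processor",   # illustrative logger name
    level=logging.INFO,
    pathname="example.py",
    lineno=1,
    msg="Saved %s",
    args=("podcast_123.mp3",),
    exc_info=None,
)

line = formatter.format(record)
print(line)  # e.g. 14:02:31 - INFO - utils.audio_processor - Saved podcast_123.mp3
```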

### Practical Example

```python
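import logging
from pathlib import Path

logger = logging.getLogger(__name__)

def process_audio(file_path):
    logger.debug(f"Starting audio processing for: {file_path}")  # Only in DEBUG mode
    
    try:
        # Process the file (reading it raises FileNotFoundError if it is missing)
        data = Path(file_path).read_bytes()
        logger.info(f"Successfully processed: {file_path} ({len(data)} bytes)")  # Normal operation
        return True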
    except FileNotFoundError:
        logger.error(f"File not found: {file_path}")  # Error, but continue
        return False
    except Exception as e:
        logger.critical(f"Critical error processing {file_path}: {e}")  # Serious problem
        raise
```

### Why Use Logging Instead of Print?

| Feature | `print()` | `logging` |
|---------|-----------|-----------|
| **Control output** | ❌ Always prints | ✅ Can turn on/off by level |
| **Timestamps** | ❌ Manual | ✅ Built in (add `%(asctime)s` to the format) |
| **File output** | ❌ Manual redirection | ✅ Built-in handlers |
| **Severity levels** | ❌ No distinction | ✅ DEBUG, INFO, WARNING, etc. |
| **Production-ready** | ❌ Need to remove/comment | ✅ Just change log level |
| **Module identification** | ❌ Manual | ✅ Automatic with `__name__` |

### In Your Science Storyteller Project

You'll use logging to track:
- Which research papers were retrieved
- API call successes/failures
- Processing steps (search → summarize → TTS)
- Errors during workflow
- Performance timing

**Example from your project:**
```python
logger.info(f"Searching for papers on topic: {topic}")
logger.warning("No papers found, trying fallback query")
logger.error(f"API call failed: {e}")
```

---

## Working with File Paths: `pathlib.Path`

### What is `pathlib`?

`pathlib` is Python's modern, object-oriented way to work with file system paths. It was introduced in **Python 3.4** (2014) and is now the recommended approach for handling files and directories.

### Why Use `Path` Instead of Strings?

**Old way (strings and `os.path`):**
```python
import os

path = "/home/user/audio.mp3"
if os.path.exists(path):
    dirname = os.path.dirname(path)
    basename = os.path.basename(path)
    new_path = os.path.join(dirname, "new_audio.mp3")
```

**New way (`pathlib.Path`):**
```python
from pathlib import Path

path = Path("/home/user/audio.mp3")
if path.exists():
    dirname = path.parent
    basename = path.name
    new_path = path.parent / "new_audio.mp3"  # Use / operator!
```

**Benefits:**
- ✅ More readable and intuitive
- ✅ Works across Windows/Mac/Linux automatically
- ✅ Chainable methods
- ✅ Less error-prone than string concatenation
- ✅ Object-oriented design

### Creating Path Objects

```python
from pathlib import Path

# From a string
p = Path("/home/user/app/assets/audio/test.mp3")

# From current directory
p = Path.cwd()  # Current working directory (takes no arguments)

# From home directory
p = Path.home()  # User's home directory (~)

# Relative paths
p = Path("./assets/audio")
```

### Path Properties and Methods

```python
from pathlib import Path

p = Path("/home/user/app/assets/audio/podcast_123.mp3")

# Check existence and type
p.exists()          # True/False - does it exist?
p.is_file()         # True/False - is it a file?
p.is_dir()          # True/False - is it a directory?

# Get path components
p.name              # 'podcast_123.mp3' - filename with extension
p.stem              # 'podcast_123' - filename without extension
p.suffix            # '.mp3' - file extension
p.parent            # Path('/home/user/app/assets/audio') - parent directory
p.parts             # ('/', 'home', 'user', 'app', 'assets', 'audio', 'podcast_123.mp3')

# Path conversion
str(p)              # Convert Path to string
p.absolute()        # Get absolute path
p.resolve()         # Resolve symlinks and make absolute
```

### Common Operations

**1. Check if file exists:**
```python
path = Path("myfile.txt")
if path.exists():
    print("File found!")
```

**2. Create directories:**
```python
audio_dir = Path("./assets/audio")
audio_dir.mkdir(parents=True, exist_ok=True)
# parents=True: creates parent directories if needed
# exist_ok=True: doesn't raise error if already exists
```

**3. Join paths (the smart way):**
```python
base = Path("./assets")
audio_file = base / "audio" / "test.mp3"  # Use / operator!
# Result: Path('./assets/audio/test.mp3')

# Works with strings too!
file_path = base / "audio" / f"podcast_{123}.mp3"
```

**4. Find files (glob patterns):**
```python
audio_dir = Path("./assets/audio")

# All MP3 files in directory
mp3_files = list(audio_dir.glob("*.mp3"))

# Everything under the directory, recursively (files and subdirectories)
all_files = list(audio_dir.glob("**/*"))

# Specific pattern
podcasts = list(audio_dir.glob("podcast_*.mp3"))
```

**5. Read and write files:**
```python
path = Path("data.txt")

# Write text
path.write_text("Hello, world!")

# Read text
content = path.read_text()

# Write bytes (for binary files)
path.write_bytes(b'\x89PNG...')

# Read bytes
data = path.read_bytes()
```

**6. Get file metadata:**
```python
path = Path("myfile.txt")

stats = path.stat()
size_bytes = stats.st_size
modified_time = stats.st_mtime
```
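
The operations above can be combined into one short round trip. This is a sketch that works in a temporary directory, so it leaves nothing behind; the file names and contents are placeholders, not real audio.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    audio_dir = Path(tmp) / "assets" / "audio"
    audio_dir.mkdir(parents=True, exist_ok=True)   # create nested directories
    audio_dir.mkdir(parents=True, exist_ok=True)   # second call is a no-op

    # Write two placeholder files (the bytes are not real audio).
    (audio_dir / "podcast_1.mp3").write_bytes(b"\x00" * 2048)
    (audio_dir / "notes.txt").write_text("draft script", encoding="utf-8")

    mp3_names = sorted(p.name for p in audio_dir.glob("*.mp3"))
    size_bytes = (audio_dir / "podcast_1.mp3").stat().st_size
    notes = (audio_dir / "notes.txt").read_text(encoding="utf-8")

print(mp3_names)   # ['podcast_1.mp3']
print(size_bytes)  # 2048
print(notes)       # draft script
```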

### Real Example from Your Project

From `utils/audio_processor.py`:

```python
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)

def process_audio_file(audio_path: str) -> Optional[str]:
    """Validate an audio file using Path."""
    
    # Convert string to Path object
    path = Path(audio_path)
    
    # Check if file exists
    if not path.exists():
        logger.error(f"Audio file not found: {audio_path}")
        return None
    
    # Check file extension
    if path.suffix.lower() not in ['.mp3', '.wav', '.ogg']:
        logger.error(f"Invalid audio format: {path.suffix}")
        return None
    
    # Convert back to string for return
    return str(path)
```

**Why this is better than strings:**
- `path.exists()` is clearer than `os.path.exists(audio_path)`
- `path.suffix` is simpler than manually parsing the extension
- Cross-platform compatible (Windows uses `\`, Unix uses `/`)
- Editor-friendly: `Path` methods and properties show up in IDE autocomplete

### Advanced Example: Cleanup Old Files

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

def cleanup_old_files(directory: str, max_files: int = 10):
    """Remove oldest audio files, keeping only max_files."""
    
    dir_path = Path(directory)
    
    if not dir_path.exists():
        return
    
    # Get all MP3 files sorted by modification time
    audio_files = sorted(
        dir_path.glob('*.mp3'),              # Find all MP3s
        key=lambda p: p.stat().st_mtime,     # Sort by modified time
        reverse=True                          # Newest first
    )
    
    # Remove oldest files beyond max_files
    for old_file in audio_files[max_files:]:
        old_file.unlink()  # Delete the file
        logger.info(f"Removed old file: {old_file}")
```
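
To check which files a cleanup like this would delete, the sketch below repeats the same sorting logic in a variant that returns the removed names, then runs it on empty placeholder files in a temporary directory. The timestamps are set by hand with `os.utime`, and the returned list is an addition made for this illustration.

```python
import os
import tempfile
from pathlib import Path

def cleanup_old_files(directory: str, max_files: int = 10) -> list:
    """Same logic as above, but returns the names of the deleted files."""
    dir_path = Path(directory)
    if not dir_path.exists():
        return []
    audio_files = sorted(
        dir_path.glob("*.mp3"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = []
    for old_file in audio_files[max_files:]:
        old_file.unlink()
        removed.append(old_file.name)
    return removed

with tempfile.TemporaryDirectory() as tmp:
    # Create five empty files whose modification times are one minute apart.
    for i in range(5):
        f = Path(tmp) / f"podcast_{i}.mp3"
        f.write_bytes(b"")
        stamp = 1_700_000_000 + 60 * i
        os.utime(f, (stamp, stamp))

    removed = sorted(cleanup_old_files(tmp, max_files=2))
    remaining = sorted(p.name for p in Path(tmp).glob("*.mp3"))

print(removed)    # ['podcast_0.mp3', 'podcast_1.mp3', 'podcast_2.mp3']
print(remaining)  # ['podcast_3.mp3', 'podcast_4.mp3']
```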

### Path Version History

- **Python 3.4** (2014): `pathlib` introduced
- **Python 3.5** (2015): Added `Path.home()`, `read_text()`/`write_text()`, `read_bytes()`/`write_bytes()`, and the `exist_ok` argument to `mkdir()`
- **Python 3.6+** (2016+): Standard library functions accept `Path` objects

**Backward compatibility:** If you need to support Python 2.7 or 3.3, use `pathlib2` package. But for modern projects (like yours), just use built-in `pathlib`.

### Quick Reference Table

| Task | Old Way (`os.path`) | New Way (`pathlib.Path`) |
|------|---------------------|--------------------------|
| Check exists | `os.path.exists(path)` | `Path(path).exists()` |
| Get filename | `os.path.basename(path)` | `Path(path).name` |
| Get directory | `os.path.dirname(path)` | `Path(path).parent` |
| Join paths | `os.path.join(a, b)` | `Path(a) / b` |
| Get extension | Manual string split | `Path(path).suffix` |
| Create directory | `os.makedirs(path)` | `Path(path).mkdir(parents=True)` |
| List files | `os.listdir(path)` | `Path(path).iterdir()` |
| Read file | `open(path).read()` | `Path(path).read_text()` |

### When to Convert Between Path and String

**Rule of thumb:**
- Use `Path` objects internally for all file operations
- Convert to `str()` only when:
  - Passing to APIs that don't accept Path
  - Displaying to user
  - Storing in JSON or database

```python
# Internal: use Path
path = Path("./assets/audio") / "file.mp3"

# External API: convert to string
audio_url = upload_to_api(str(path))

# Display to user: convert to string
print(f"Audio saved to: {path}")  # Prints nicely automatically
```
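
The JSON case is worth seeing once, because `json.dumps` rejects path objects outright. The sketch below uses `PurePosixPath` only so that the printed output is the same on every operating system; in project code you would normally use `Path`.

```python
import json
from pathlib import PurePosixPath

# PurePosixPath keeps the output identical on Windows, macOS, and Linux.
path = PurePosixPath("assets/audio") / "file.mp3"

error = ""
try:
    json.dumps({"audio": path})              # path objects are not JSON serializable
except TypeError as exc:
    error = str(exc)

payload = json.dumps({"audio": str(path)})   # convert at the boundary
restored = PurePosixPath(json.loads(payload)["audio"])

print(error)              # Object of type PurePosixPath is not JSON serializable
print(payload)            # {"audio": "assets/audio/file.mp3"}
print(restored == path)   # True
```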

---

## Python Function Basics

Functions are the primary way to group code into reusable blocks. Let's break down a function from our codebase: `utils/audio_processor.py`.

```python
def process_audio_file(audio_path: str) -> Optional[str]:
    """
    Process and validate an audio file.
    
    Args:
        audio_path: Path to audio file
        
    Returns:
        Validated path or None if invalid
    """
    # ... function body ...
    return str(path)
```

### Anatomy of a Function

Let's look at each part of the function definition:

1.  **`def` keyword**: This signals the start of a function definition.
2.  **Function Name**: `process_audio_file`. This is how you'll call the function later. It should be descriptive and follow the `snake_case` convention (all lowercase with underscores).
3.  **Parameters (in `()`)**: `(audio_path: str)`. These are the inputs the function accepts.
    -   `audio_path`: The name of the parameter.
    -   `: str`: This is a **type hint**. It tells readers, editors, and type checkers such as `mypy` that `audio_path` should be a string. Python does not enforce hints at runtime, but they improve readability and let tools catch mistakes before the code runs.
4.  **Return Type Hint**: `-> Optional[str]`. This indicates what the function will return.
    -   `Optional[str]` means the function can return either a `str` (string) or `None`. This is very useful for functions that might not always have a valid result to give back.
5.  **Docstring**: The triple-quoted string `"""..."""` right after the definition. It explains the function's purpose, arguments (`Args`), and return value (`Returns`). This is essential for documentation.
6.  **Function Body**: The indented code block below the definition. This is where the function's logic is implemented.
7.  **`return` statement**: This keyword exits the function and passes back a value to whoever called it.
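
The `Optional[str]` return type matters most on the caller's side: the caller has to handle `None` before using the result. Here is a small sketch with a hypothetical helper, `audio_extension`, which is not part of the project and only illustrates the pattern.

```python
from typing import Optional

def audio_extension(file_name: str) -> Optional[str]:
    """Return the lowercase extension if it is a supported audio type, else None."""
    supported = {".mp3", ".wav", ".ogg"}
    dot = file_name.rfind(".")
    if dot == -1:
        return None
    ext = file_name[dot:].lower()
    return ext if ext in supported else None

for name in ["talk.MP3", "notes.txt"]:
    ext = audio_extension(name)
    if ext is None:            # always handle the None case explicitly
        print(f"{name}: unsupported")
    else:
        print(f"{name}: {ext}")
# talk.MP3: .mp3
# notes.txt: unsupported
```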

### Why Use Functions?

-   **Reusability**: Write code once and use it many times.
-   **Modularity**: Break down complex problems into smaller, manageable pieces.
-   **Readability**: Well-named functions make code easier to understand.

---

## Learning Philosophy

### Why Learn Module-by-Module?

**Bottom-up approach** is recommended for this project:
1. Start with simple utilities (pure Python functions)
2. Progress to MCP tools (understand protocol basics)
3. Study agents (business logic and coordination)
4. Finally tackle orchestration (integration)

**Benefits:**
- ✅ Build confidence with simple concepts first
- ✅ Understand dependencies before integration
- ✅ Easier to debug when you know each piece
- ✅ Can test components independently

### Learning vs Building Trade-off

For a hackathon project, you need to balance:
- **Deep understanding**: Takes time, prevents bugs
- **Quick delivery**: Ship working product by deadline

**Recommended approach for this project:**
- **Week 1**: Deep dive into 2-3 core modules
- **Week 2**: Implement and integrate
- **Week 3**: Test, polish, document

---

## Object-Oriented Programming Basics

### What is a Class?

A **class** is a blueprint for creating objects. Think of it as a cookie cutter.

```python
class ScienceStoryteller:  # The blueprint
    """Main orchestrator for the Science Storyteller workflow."""
```

### Creating Objects (Instantiation)

```python
# Creating an object from the class
storyteller = ScienceStoryteller()  # Now you have a specific storyteller object
```

### The `__init__` Method (Constructor)

The `__init__` method is called **automatically** when you create a new object.

```python
class ScienceStoryteller:
    def __init__(self):  # Runs when ScienceStoryteller() is called
        self.research_agent = ResearchAgent()
        self.analysis_agent = AnalysisAgent()
        self.audio_agent = AudioAgent()
```

**Purpose:** Set up the initial state of your object.

**When it runs:**
```python
storyteller = ScienceStoryteller()  # __init__ runs here automatically
```

### Understanding `self`

`self` refers to **this particular object instance**.

```python
class ScienceStoryteller:
    def __init__(self):
        self.research_agent = ResearchAgent()  # Attach to THIS object
    
    async def process_topic(self, topic: str):
        papers = await self.research_agent.search(topic)  # Use THIS object's agent
```

**Why `self`?** So each object can have its own separate data.

```python
storyteller1 = ScienceStoryteller()  # Has its own research_agent
storyteller2 = ScienceStoryteller()  # Has a different research_agent
```
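
A tiny stand-in class makes the "separate data" point concrete. `Playlist` is hypothetical and unrelated to the project; it only shows that each object gets its own attributes.

```python
class Playlist:
    """Hypothetical stand-in for a class with per-object state."""

    def __init__(self, owner: str):
        self.owner = owner      # stored on THIS object
        self.episodes = []      # a fresh list for every object

    def add(self, title: str) -> None:
        self.episodes.append(title)   # self is the object the method was called on

first = Playlist("alice")
second = Playlist("bob")
first.add("AlphaFold explained")

print(first.episodes)    # ['AlphaFold explained']
print(second.episodes)   # []
```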

### Attributes (Instance Variables)

**Attributes** store data that belongs to an object.

```python
self.research_agent = ResearchAgent()  # This is an attribute
self.analysis_agent = AnalysisAgent()  # This is an attribute
```

**Accessing attributes:**
```python
async def process_topic(self, topic: str):
    # Use the attributes we created in __init__
    papers = await self.research_agent.search(topic)
    best_paper = await self.analysis_agent.select_best(papers, topic)
```

### Methods (Functions in a Class)

**Methods** define what an object can **do**.

```python
class ScienceStoryteller:
    async def process_topic(self, topic: str):  # This is a method
        """Process a research topic into a podcast."""
        # ... implementation ...
    
    def _format_paper_info(self, paper: dict) -> str:  # Another method
        """Format paper metadata for display."""
        # ... implementation ...
```

**Key points:**
- First parameter is always `self`
- Called using dot notation: `storyteller.process_topic("AI")`
- Can access attributes: `self.research_agent`

### Public vs Private Naming Convention

```python
def process_topic(self, topic):     # Public - no underscore
    """Meant to be called from outside the class."""
    
def _format_paper_info(self, paper): # Private - starts with _
    """Internal helper, not meant to be called externally."""
```

**Convention (not enforced):**
- `method_name` → Public, part of the API
- `_method_name` → Private, internal use only

### Complete Example

```python
class ScienceStoryteller:
    """Main orchestrator for the Science Storyteller workflow."""
    
    # Constructor - runs when object is created
    def __init__(self):
        self.research_agent = ResearchAgent()      # Attribute
        self.analysis_agent = AnalysisAgent()      # Attribute
        self.audio_agent = AudioAgent()            # Attribute
    
    # Public method - main workflow
    async def process_topic(self, topic: str):
        papers = await self.research_agent.search(topic)  # Use attribute
        best_paper = await self.analysis_agent.select_best(papers, topic)
        paper_info = self._format_paper_info(best_paper)  # Call private method
        return paper_info
    
    # Private method - internal helper
    def _format_paper_info(self, paper: dict) -> str:
        return f"**Title:** {paper.get('title', 'Unknown')}"

# Usage (process_topic is async, so it has to run on an event loop)
import asyncio

storyteller = ScienceStoryteller()           # Create object (__init__ runs)
result = asyncio.run(storyteller.process_topic("AlphaFold"))  # Call method
```

### Quick Reference

| Concept | Syntax | Purpose |
|---------|--------|---------|
| **Class** | `class ClassName:` | Blueprint for objects |
| **Object** | `obj = ClassName()` | Instance created from class |
| **Constructor** | `def __init__(self):` | Initialize object state |
| **Self** | `self.attribute` | Reference to current object |
| **Attribute** | `self.name = value` | Data stored in object |
| **Method** | `def method(self, args):` | Function belonging to class |
| **Public** | `def method(self):` | External API |
| **Private** | `def _method(self):` | Internal helper |

---

## Async/Await Deep Dive

### Why Async? The Three Use Cases

Based on [RealPython's async guide](https://realpython.com/async-io-python/):

1. **Writing pausable/resumable functions**
2. **Managing I/O-bound tasks** (network, files, databases)
3. **Improving performance** (handle multiple tasks concurrently)

**Science Storyteller uses all three!**

### The Problem: Blocking I/O

**Without async (blocking):**
```python
def process_topic_sync(topic):
    papers = requests.get("arxiv_api")      # ⏸️ BLOCKS for 5 seconds
    summary = requests.post("claude_api")   # ⏸️ BLOCKS for 10 seconds
    audio = requests.post("elevenlabs_api") # ⏸️ BLOCKS for 60 seconds
    return papers, summary, audio  # Total: 75 seconds of BLOCKING

# During blocking:
# ❌ UI freezes
# ❌ Progress bar can't update
# ❌ Other users can't be served
# ❌ Event loop is stuck
```

**With async (non-blocking):**
```python
async def process_topic(topic):
    papers = await arxiv_tool.search()      # ⏸️ Yields control for 5 seconds
    summary = await llm_tool.summarize()    # ⏸️ Yields control for 10 seconds
    audio = await audio_tool.convert()      # ⏸️ Yields control for 60 seconds
    return papers, summary, audio  # Total: 75 seconds, but non-blocking

# During await:
# ✅ UI stays responsive
# ✅ Progress bar updates
# ✅ Other users can be served
# ✅ Event loop continues running
```

### Visualizing Blocking vs. Async

**Blocking (Sequential) Execution:**
```
Request 1:  [--arxiv--|----claude----|----------------audio----------------|]
Request 2:                                                                [--arxiv--|----claude----|---...
Time -----> 0s        5s           15s                                     75s       80s          90s
```
- The UI is frozen for the entire 75s duration of Request 1.
- Request 2 must wait for Request 1 to completely finish.

**Async (Concurrent) Execution:**
```
Request 1:  [--arxiv--] ... [----claude----] ... [----------------audio----------------]
Request 2:    [--arxiv--] ...   [----claude----] ... [----------------audio----------------]
Time -----> 0s        1s        5s           6s        15s           16s                   75s
```
- When Request 1 `await`s `arxiv`, the event loop is free to start Request 2.
- Both requests run concurrently, sharing time during I/O waits. The UI remains responsive throughout.

### How Async Works: The Event Loop

```
┌─────────────────────────────────────────┐
│        Python Asyncio Event Loop        │
│  (Single thread, multiple tasks)        │
└─────────────────────────────────────────┘
         ↓               ↓               ↓
    Task A          Task B          Task C
 (User 1 req)    (User 2 req)    (User 3 req)
```

**When `await` is hit:**
1. Function **pauses** at that line
2. Control **returns** to the event loop
3. Event loop **runs other code** (updates UI, handles requests)
4. When I/O completes, function **resumes** from where it paused

### Single VM, Multiple Users

**Key insight:** On Hugging Face Spaces, **all users share one Python process**.

```
Hugging Face Space (Single VM)
├─ Python Process (port 7860)
│  └─ Event Loop
│     ├─ Task: User A (paused at await)
│     ├─ Task: User B (paused at await)
│     └─ Task: User C (paused at await)
```

**Without async (sequential):**
```
User A: 0-75s    (completes at 75s)
User B: 75-150s  (WAITS 75s, then runs 75s = 150s total)
User C: 150-225s (WAITS 150s, then runs 75s = 225s total)
```

**With async (concurrent):**
```
User A: 0-75s    (completes at 75s)
User B: 1-76s    (starts 1s later, runs concurrently = 76s total)
User C: 2-77s    (starts 2s later, runs concurrently = 77s total)
```

### Performance Comparison

| Metric | Without Async | With Async |
|--------|--------------|------------|
| **User A wait** | 75s | 75s |
| **User B wait** | 150s | ~76s |
| **User C wait** | 225s | ~77s |
| **UI responsiveness** | Frozen | Live updates |
| **Progress tracking** | Can't update | Works |
| **Concurrent users** | Sequential | Interleaved |
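
The shape of these numbers can be reproduced at a smaller scale. The sketch below assumes three stages with waits of 0.05 s, 0.10 s, and 0.15 s as stand-ins for the arXiv, LLM, and TTS calls, and it starts all users at the same moment for simplicity. Exact timings vary by machine, so no fixed output is shown.

```python
import asyncio
import time

STAGES = [0.05, 0.10, 0.15]   # scaled-down stand-ins for arXiv, LLM and TTS waits

async def pipeline(user: str) -> str:
    for wait in STAGES:
        await asyncio.sleep(wait)   # simulated I/O wait
    return user

async def run_sequential(users):
    return [await pipeline(u) for u in users]      # one user at a time

async def run_concurrent(users):
    return await asyncio.gather(*(pipeline(u) for u in users))

def timed(coro) -> float:
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start

users = ["A", "B", "C"]
sequential_s = timed(run_sequential(users))   # roughly 3 x 0.30 s
concurrent_s = timed(run_concurrent(users))   # roughly 0.30 s
print(f"sequential: {sequential_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```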

### Gradio + Async Integration

Gradio uses **FastAPI** internally, which is async-native:

```python
# Gradio internals (simplified)
from fastapi import FastAPI

app = FastAPI()

@app.post("/api/predict")
async def predict(request):
    result = await your_gradio_function(request.data)
    return result
```

**Why this matters:**
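- `gr.Progress()` can push updates to the browser while your handler is still running, so each pipeline stage can report its status (Gradio also supports this for regular `def` handlers, which it runs in worker threads)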
- Gradio's event loop can handle multiple users
- Your async functions integrate seamlessly

### Async Syntax Rules

**Defining async functions:**
```python
async def my_function():  # Note the 'async' keyword
    result = await some_async_operation()
    return result
```

**Calling async functions:**
```python
# From another async function:
result = await my_function()

# From synchronous code:
import asyncio
result = asyncio.run(my_function())
```

**Common mistake:**
```python
# ❌ Wrong - missing await
async def process():
    result = some_async_function()  # This returns a coroutine, not the result!
    
# ✅ Correct - with await
async def process():
    result = await some_async_function()  # This waits and gets the actual result
```
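
You can confirm what a missing `await` produces with a few lines. `fetch_title` here is a made-up coroutine; the point is that calling it only creates a coroutine object, and nothing runs until it is awaited.

```python
import asyncio

async def fetch_title() -> str:
    await asyncio.sleep(0)          # stand-in for real I/O
    return "AlphaFold"

async def main():
    forgotten = fetch_title()                      # no await: nothing has run yet
    is_coroutine = asyncio.iscoroutine(forgotten)
    title = await forgotten                        # awaiting it produces the value
    return is_coroutine, title

is_coroutine, title = asyncio.run(main())
print(is_coroutine)   # True
print(title)          # AlphaFold
```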

### The Async Chain in Science Storyteller

```
app.py: process_topic (async)
  ↓ await
agents/research_agent.py: search (async)
  ↓ await
mcp_tools/arxiv_tool.py: search_papers (async)
  ↓ await
session.call_tool() (MCP I/O)
  ↓
[Network request to arXiv server]
```

**Every step must be async** because:
- MCP communication uses async I/O
- Can't `await` inside a non-async function
- Event loop requires async all the way up
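
The second point is enforced by the compiler, not just by convention. This sketch compiles a small source string (with made-up function names) instead of running it, so the error can be caught and inspected.

```python
# A regular function that tries to use await.
source = "def handler():\n    return await fetch()\n"

message = ""
try:
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    message = exc.msg

print(message)   # 'await' outside async function
```

Changing `def handler()` to `async def handler()` makes the same source compile, which is why `async` spreads upward through every caller in the chain.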

---

## Module-by-Module Learning Path

### Level 1: Foundation (Start Here)

#### 1. `utils/audio_processor.py`

**What it does:** File system operations for audio files

**Key concepts:**
- Creating directories with `Path.mkdir()`
- Checking file sizes with `os.path.getsize()`
- Working with file paths

**Learning exercise:**
```python
from utils.audio_processor import ensure_audio_dir, get_file_size_mb

# Create the audio directory
ensure_audio_dir()

# Check size of a file (if it exists)
# size = get_file_size_mb("assets/audio/podcast_123.mp3")
```

**What to look for:**
- How does it handle file paths in a cross-platform way (`pathlib.Path`)?
- The use of `exist_ok=True` to prevent errors.
- Small, single-purpose functions whose only side effects are filesystem operations.

**Questions to answer:**
- Why use `Path` instead of strings for file paths?
- What happens if the directory already exists?
- How is file size converted from bytes to MB?
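
As a hint for the last question, one plausible conversion is sketched below. This is an assumption about how `get_file_size_mb` might work, not the project's actual code: it divides the byte count by 1024 × 1024 and rounds to two decimals.

```python
import os
import tempfile

def get_file_size_mb(file_path: str) -> float:
    """Hypothetical version: size in megabytes, rounded to two decimals."""
    size_bytes = os.path.getsize(file_path)
    return round(size_bytes / (1024 * 1024), 2)   # 1 MB taken as 1,048,576 bytes

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.mp3")
    with open(sample, "wb") as f:
        f.write(b"\x00" * (3 * 1024 * 1024 // 2))   # 1.5 MB of placeholder bytes
    size_mb = get_file_size_mb(sample)

print(size_mb)   # 1.5
```

Compare this with the real function when you read the module; it may use a different rounding or a 1,000,000-byte megabyte.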

---

#### 2. `utils/script_formatter.py`

**What it does:** Clean and format podcast scripts for TTS

**Key concepts:**
- String manipulation (`strip()`, `replace()`)
- Regular expressions (if used)
- Estimating audio duration from text

**Learning exercise:**
```python
from utils.script_formatter import format_podcast_script, estimate_duration

script = """
Hello!   This is a test.

With extra   spaces and newlines.
"""

cleaned = format_podcast_script(script)
duration = estimate_duration(cleaned)

print(f"Cleaned: {cleaned}")
print(f"Duration: {duration} seconds")
```

**What to look for:**
- How simple string methods (`.strip()`, `.replace()`) are used for cleaning.
- The logic for `estimate_duration`: it's a heuristic, not an exact calculation.
- This is another example of pure functions that are easy to test.

**Questions to answer:**
- How does text length relate to audio duration?
- What characters need to be cleaned for TTS?
- Why estimate duration before generating audio?
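
Before reading the module, it can help to write your own guess at both functions. The sketch below is an assumption, not the project's code: it collapses whitespace with a regular expression and estimates duration from word count at an assumed speaking rate of 150 words per minute.

```python
import re

WORDS_PER_MINUTE = 150   # assumed speaking rate, not taken from the project

def clean_script(text: str) -> str:
    """Collapse runs of whitespace (spaces, tabs, newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

def estimate_duration_seconds(text: str) -> float:
    """Rough estimate: word count divided by an assumed speaking rate."""
    words = len(text.split())
    return words * 60 / WORDS_PER_MINUTE

raw = "Hello!   This is a test.\n\nWith extra   spaces and newlines."
cleaned = clean_script(raw)
print(cleaned)                              # Hello! This is a test. With extra spaces and newlines.
print(estimate_duration_seconds(cleaned))   # 4.0
```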

---

### Level 2: MCP Tools (Core Hackathon Requirement)

#### 3. `mcp_tools/arxiv_tool.py`

**What it does:** Connects to arXiv MCP server to search papers

**Key concepts:**
- Model Context Protocol (MCP)
- Stdio transport (stdin/stdout communication)
- Async context managers (`__aenter__`, `__aexit__`)
- JSON-RPC messaging

**Important code sections:**

**Connection setup:**
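```python
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Describe how to launch the arXiv MCP server as a subprocess via npx
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@blindnotation/arxiv-mcp-server"],
    env=None
)

# stdio_client(...) is an async context manager. Entering it by hand
# means the matching __aexit__ must be called later, during cleanup.
self.exit_stack = stdio_client(server_params)
stdio_transport = await self.exit_stack.__aenter__()
read_stream, write_stream = stdio_transport
self.session = ClientSession(read_stream, write_stream)
await self.session.__aenter__()
```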

**Calling tools:**
```python
result = await self.session.call_tool(
    "search_arxiv",
    {
        "query": query,
        "max_results": max_results,
        "sort_by": sort_by
    }
)
```

**Learning exercise:**
```python
import asyncio
from mcp_tools.arxiv_tool import ArxivTool

async def explore_arxiv():
    tool = ArxivTool()
    
    # Connect to MCP server
    connected = await tool.connect()
    print(f"Connected: {connected}")
    
    # Search for papers
    papers = await tool.search_papers("quantum computing", max_results=3)
    print(f"Found {len(papers)} papers:")
    
    for paper in papers:
        print(f"\n  Title: {paper.get('title', 'N/A')}")
        print(f"  Authors: {paper.get('authors', [])[:2]}")
    
    # Clean up
    await tool.disconnect()

asyncio.run(explore_arxiv())
```

**Questions to answer:**
- What is stdio transport and why use it?
- Why do we need both `exit_stack` and `session`?
- What happens if the MCP server crashes?
- How does `call_tool` send messages to the server?

**Deep dive topics:**
- JSON-RPC protocol format
- Async context managers (what `__aenter__` and `__aexit__` do)
- Process communication (pipes and streams)
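
For the async context manager topic, a self-contained sketch is easier to experiment with than a live MCP connection. `FakeConnection` below is hypothetical and talks to nothing; it only records when `__aenter__` and `__aexit__` run. The sketch uses `contextlib.AsyncExitStack` from the standard library, which enters several async context managers and closes them in reverse order.

```python
import asyncio
from contextlib import AsyncExitStack

class FakeConnection:
    """Hypothetical resource that records when it is opened and closed."""

    def __init__(self, name: str, log: list):
        self.name = name
        self.log = log

    async def __aenter__(self):
        self.log.append(f"open {self.name}")     # runs on entering `async with`
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.log.append(f"close {self.name}")    # runs on exit, even after an error
        return False                             # do not swallow exceptions

async def main() -> list:
    log = []
    async with AsyncExitStack() as stack:
        # Enter two context managers; the stack closes them in reverse order.
        await stack.enter_async_context(FakeConnection("transport", log))
        await stack.enter_async_context(FakeConnection("session", log))
        log.append("call_tool")
    return log

log = asyncio.run(main())
print(log)
# ['open transport', 'open session', 'call_tool', 'close session', 'close transport']
```

The reverse closing order is what you want for a layered connection: the session sits on top of the transport, so it should shut down first.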

---

#### 4. `mcp_tools/llm_tool.py`

**What it does:** Calls Anthropic Claude API for summarization

**Key concepts:**
- HTTP API requests with async
- Prompt engineering
- API authentication
- Response parsing

**Important code sections:**

**API call:**
```python
message = self.client.messages.create(
    model=self.model,
    max_tokens=max_tokens,
    messages=[
        {"role": "user", "content": prompt}
    ]
)

summary = message.content[0].text
```

**Learning exercise:**
```python
import asyncio
from mcp_tools.llm_tool import LLMTool

async def test_llm():
    tool = LLMTool()  # Needs ANTHROPIC_API_KEY in .env
    
    # Fake paper data
    paper = {
        "title": "Quantum Computing Fundamentals",
        "summary": "This paper explores the basic principles of quantum computing...",
        "authors": [{"name": "Alice"}, {"name": "Bob"}]
    }
    
    # Generate summary
    summary = await tool.summarize_paper(paper, max_tokens=500)
    print(f"Summary:\n{summary}")

asyncio.run(test_llm())
```

**Questions to answer:**
- How is the prompt structured for summarization?
- What's the difference between `max_tokens` in the request and actual tokens used?
- How does prompt engineering affect output quality?
- What happens if the API returns an error?

---

### Level 3: Agents (Business Logic)

#### 5. `agents/research_agent.py`

**What it does:** Autonomous paper retrieval and search optimization

**Key concepts:**
- Query enhancement (autonomous planning)
- Fallback strategies (self-correction)
- Agent initialization and cleanup

**Autonomous behaviors:**
```python
def _enhance_query(self, topic: str) -> str:
    """
    Autonomous planning - agent decides how to optimize search.
    """
    topic_lower = topic.lower()
    
    enhancements = {
        'ai': 'artificial intelligence machine learning',
        'ml': 'machine learning',
        'quantum': 'quantum computing physics',
    }
    
    for key, value in enhancements.items():
        if key in topic_lower and value not in topic_lower:
            return f"{topic} {value}"
    
    return topic
```
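
One thing to watch in the snippet above: `key in topic_lower` is a substring test, so `'ai'` also matches inside "brain" and `'ml'` inside "html". A possible variation, not the project's code, matches whole words with a regular expression. The standalone name `enhance_query` is used here only for illustration.

```python
import re

ENHANCEMENTS = {
    "ai": "artificial intelligence machine learning",
    "ml": "machine learning",
    "quantum": "quantum computing physics",
}


def enhance_query(topic: str) -> str:
    """Append extra search terms when a keyword appears as a whole word."""
    topic_lower = topic.lower()
    for key, value in ENHANCEMENTS.items():
        # \b marks a word boundary, so "ai" no longer matches inside "brain".
        whole_word = re.search(rf"\b{re.escape(key)}\b", topic_lower)
        if whole_word and value not in topic_lower:
            return f"{topic} {value}"
    return topic


print(enhance_query("AI"))
print(enhance_query("brain imaging"))
```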

**Self-correction:**
```python
papers = await self.arxiv_tool.search_papers(enhanced_query)

if not papers:
    # Fallback: try original query
    papers = await self.arxiv_tool.search_papers(topic)
```

**Learning exercise:**
```python
import asyncio
from agents.research_agent import ResearchAgent

async def test_research():
    agent = ResearchAgent()
    await agent.initialize()
    
    # Test query enhancement
    original = "AI"
    enhanced = agent._enhance_query(original)
    print(f"Original: {original}")
    print(f"Enhanced: {enhanced}")
    
    # Test search
    papers = await agent.search("AlphaFold", max_results=3)
    print(f"\nFound {len(papers)} papers")
    
    await agent.cleanup()

asyncio.run(test_research())
```

**Questions to answer:**
- Why enhance queries? What problem does it solve?
- When should you use the fallback strategy?
- Why initialize and cleanup separately from `__init__`?

---

#### 6. `agents/analysis_agent.py`

**What it does:** Paper analysis and podcast script generation

**Key concepts:**
- Paper selection (reasoning)
- LLM-based summarization
- Script generation with prompt engineering
- Fallback content for LLM failures

**Autonomous reasoning:**
```python
async def select_best(self, papers: list, topic: str):
    """
    Reasoning - evaluate and select most relevant paper.
    """
    scored_papers = []
    for paper in papers:
        score = 0
        
        # Has abstract
        if paper.get('summary') or paper.get('abstract'):
            score += 1
        
        # Recent paper
        pub_date = paper.get('published', '')
        if '2024' in pub_date or '2023' in pub_date:
            score += 2
        
        scored_papers.append((score, paper))
    
    scored_papers.sort(key=lambda x: x[0], reverse=True)
    return scored_papers[0][1] if scored_papers else papers[0]
```
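
Two details of the snippet above are worth noticing: the hard-coded years `'2023'` and `'2024'` go stale over time, and the final `papers[0]` raises `IndexError` when `papers` is empty. A minimal sketch of one alternative follows. It assumes `published` starts with a four-digit year (as in the mock data in this guide); the function name and the `now` and `recent_years` parameters are additions for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

Paper = Dict[str, Any]


def select_best_paper(
    papers: List[Paper],
    now: Optional[datetime] = None,
    recent_years: int = 2,
) -> Optional[Paper]:
    """Return the highest-scoring paper, or None if the list is empty."""
    if not papers:
        return None
    current_year = (now or datetime.now(timezone.utc)).year

    def score(paper: Paper) -> int:
        points = 0
        if paper.get("summary") or paper.get("abstract"):
            points += 1  # has an abstract
        year = str(paper.get("published", ""))[:4]
        if year.isdigit() and current_year - int(year) < recent_years:
            points += 2  # published recently
        return points

    # max() keeps the first paper among ties, like a stable descending sort.
    return max(papers, key=score)


papers = [
    {"title": "Old Paper", "published": "2020-01-01", "summary": "..."},
    {"title": "New Paper", "published": "2024-01-01", "summary": "..."},
]
best = select_best_paper(papers, now=datetime(2025, 1, 1))
print(best["title"])
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it can be left out.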

**Learning exercise:**
```python
import asyncio
from agents.analysis_agent import AnalysisAgent

async def test_analysis():
    agent = AnalysisAgent()
    
    # Mock paper data
    papers = [
        {"title": "Old Paper", "published": "2020-01-01", "summary": "..."},
        {"title": "New Paper", "published": "2024-01-01", "summary": "..."},
    ]
    
    best = await agent.select_best(papers, "quantum computing")
    print(f"Selected: {best['title']}")

asyncio.run(test_analysis())
```

**Questions to answer:**
- What criteria determine "best" paper?
- Why fall back to template content instead of failing?
- How does prompt engineering affect script quality?

---

#### 7. `agents/audio_agent.py`

**What it does:** Text-to-speech conversion via ElevenLabs

**Key concepts:**
- HTTP POST with binary response
- File I/O (saving MP3 bytes)
- API timeout handling
- Voice configuration
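
The "binary response" and "File I/O" points can be tried without calling the API. The sketch below assumes the response body is already available as `bytes`; `save_audio` and the fake payload are hypothetical and only illustrate writing in binary mode (`"wb"`), which stores the bytes as-is with no text encoding step.

```python
import tempfile
from pathlib import Path


def save_audio(audio_bytes: bytes, directory: str, filename: str = "podcast.mp3") -> Path:
    """Write raw audio bytes to disk and return the file path."""
    path = Path(directory) / filename
    # "wb" = write binary: no text encoding or newline translation is applied.
    with open(path, "wb") as f:
        f.write(audio_bytes)
    return path


fake_audio = b"ID3" + bytes(range(16))  # placeholder bytes, not a playable MP3
with tempfile.TemporaryDirectory() as tmp:
    saved = save_audio(fake_audio, tmp)
    round_trip = saved.read_bytes()
    print(f"Wrote {len(round_trip)} bytes to {saved.name}")
```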

**Learning exercise:**
```python
import asyncio
from agents.audio_agent import AudioAgent

async def test_audio():
    agent = AudioAgent()  # Needs ELEVENLABS_API_KEY
    
    script = "Welcome to Science Storyteller. Today we explore quantum computing."
    
    audio_path = await agent.text_to_speech(script)
    
    if audio_path:
        print(f"Audio saved to: {audio_path}")
    else:
        print("Audio generation failed")

asyncio.run(test_audio())
```

**Questions to answer:**
- Why does TTS take so long (30-60 seconds)?
- What happens if the API times out?
- How are MP3 bytes different from text?

---

### Level 4: Orchestration (Integration)

#### 8. `app.py` - `ScienceStoryteller` Class

**What it does:** Coordinates all agents into a complete workflow

**Key concepts:**
- Orchestrator pattern
- Error recovery
- Progress tracking
- State management
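
As a rough illustration of error recovery in an orchestrator, the sketch below runs three stand-in steps and keeps whatever succeeded when a later step fails. The step functions and `process_topic_sketch` are hypothetical; the return shape (results plus a status string) only loosely mirrors `process_topic`.

```python
import asyncio
from typing import Optional, Tuple


async def fake_summarize(topic: str) -> str:
    return f"Summary about {topic}"


async def fake_script(summary: str) -> str:
    return f"Welcome! {summary}"


async def fake_audio(script: str) -> str:
    raise TimeoutError("TTS service timed out")  # simulate a failing step


async def process_topic_sketch(
    topic: str,
) -> Tuple[Optional[str], Optional[str], Optional[str], str]:
    summary = script = audio = None
    try:
        summary = await fake_summarize(topic)
        script = await fake_script(summary)
        audio = await fake_audio(script)
        status = "Done"
    except Exception as exc:
        # Keep the earlier results; report the error that stopped the pipeline.
        status = f"Partial result: {exc}"
    return summary, script, audio, status


summary, script, audio, status = asyncio.run(
    process_topic_sketch("quantum entanglement")
)
print(status)
```

Here the user still gets a summary and a script even though audio generation failed, which is usually better than an empty screen.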

**Learning exercise:**
```python
import asyncio
from app import ScienceStoryteller

async def test_orchestrator():
    storyteller = ScienceStoryteller()
    
    # Test full workflow
    result = await storyteller.process_topic("quantum entanglement")
    summary, script, audio, paper_info, status = result
    
    print(f"Status: {status}")
    if summary:
        print(f"Summary length: {len(summary)} chars")

asyncio.run(test_orchestrator())
```

**Questions to answer:**
- How does the orchestrator handle partial failures?
- Why return a tuple instead of a dict?
- What's the role of `gr.Progress()`?

---

#### 9. `app.py` - Gradio Interface

**What it does:** Web UI for user interaction

**Key concepts:**
- Gradio Blocks API
- Event handlers
- Async in Gradio
- UI layout

**Learning exercise:**
```bash
# Just run the app
python app.py

# Then interact with the UI to see the flow
```

**Questions to answer:**
- How does Gradio handle async functions?
- What's the difference between `gr.Blocks` and `gr.Interface`?
- How are outputs mapped to UI components?

---

## Hands-On Exercises

### Exercise 1: Test Individual Tools

**Goal:** Verify MCP connection works

```python
# File: test_my_learning.py
import asyncio
from mcp_tools.arxiv_tool import ArxivTool

async def main():
    print("Testing ArxivTool...")
    
    tool = ArxivTool()
    connected = await tool.connect()
    
    if connected:
        print("✓ Connected to MCP server")
        
        papers = await tool.search_papers("AlphaFold", max_results=2)
        print(f"✓ Found {len(papers)} papers")
        
        for i, paper in enumerate(papers, 1):
            print(f"\n{i}. {paper.get('title', 'N/A')}")
        
        await tool.disconnect()
        print("\n✓ Disconnected")
    else:
        print("✗ Failed to connect")

if __name__ == "__main__":
    asyncio.run(main())
```

Run: `python test_my_learning.py`

---

### Exercise 2: Trace the Async Chain

**Goal:** Understand how async calls propagate

Add print statements to trace execution:

```python
# In arxiv_tool.py
async def search_papers(self, query: str, ...):
    print(f"[ArxivTool] Starting search for: {query}")
    result = await self.session.call_tool("search_arxiv", {...})
    print(f"[ArxivTool] Search complete, parsing results...")
    return papers

# In research_agent.py
async def search(self, topic: str, max_results: int = 5):
    print(f"[ResearchAgent] Enhancing query: {topic}")
    enhanced = self._enhance_query(topic)
    print(f"[ResearchAgent] Enhanced to: {enhanced}")
    papers = await self.arxiv_tool.search_papers(enhanced)
    print(f"[ResearchAgent] Got {len(papers)} papers")
    return papers
```

Then run and watch the flow!

---

### Exercise 3: Mock External Dependencies

**Goal:** Test without API keys

```python
# test_mock.py
import asyncio
from unittest.mock import AsyncMock
from agents.research_agent import ResearchAgent

async def test_with_mock():
    agent = ResearchAgent()
    
    # Mock the arxiv_tool to avoid real API calls
    agent.arxiv_tool.search_papers = AsyncMock(return_value=[
        {"title": "Fake Paper 1", "summary": "Test"},
        {"title": "Fake Paper 2", "summary": "Test"},
    ])
    
    papers = await agent.search("test topic")
    
    assert len(papers) == 2
    print(f"✓ Mock test passed: {len(papers)} papers")

asyncio.run(test_with_mock())
```
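
`AsyncMock` can also script a sequence of results, which is handy for testing the fallback path without a server. The sketch below uses a tiny stand-in class (`TinyAgent`, hypothetical) that repeats the enhance-then-fall-back shape shown earlier, rather than the real `ResearchAgent`.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


class TinyAgent:
    """Stand-in with the same fallback shape as the agent's search."""

    def __init__(self, arxiv_tool):
        self.arxiv_tool = arxiv_tool

    async def search(self, topic: str):
        papers = await self.arxiv_tool.search_papers(f"{topic} extra terms")
        if not papers:
            # Fallback: retry with the original topic
            papers = await self.arxiv_tool.search_papers(topic)
        return papers


async def run_fallback_test():
    tool = Mock()
    # First awaited call returns [], the second returns one paper.
    tool.search_papers = AsyncMock(side_effect=[[], [{"title": "Fake Paper"}]])
    agent = TinyAgent(tool)
    papers = await agent.search("test topic")
    return tool, papers


tool, papers = asyncio.run(run_fallback_test())
print(papers, tool.search_papers.await_count)
```

`await_count` and `assert_awaited_with` let the test check not just the result but also that the fallback call actually happened.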

---

### Exercise 4: Build a Mini Version

**Goal:** Understand the workflow by simplifying

```python
# mini_storyteller.py
import asyncio

class MiniStoryteller:
    """Simplified version to understand the flow"""
    
    def __init__(self):
        print("📚 Initializing agents...")
        self.research = "ResearchAgent"
        self.analysis = "AnalysisAgent"
        self.audio = "AudioAgent"
    
    async def process(self, topic):
        print(f"\n🔍 Step 1: Search for '{topic}'")
        await asyncio.sleep(1)  # Simulate API call
        papers = ["Paper 1", "Paper 2"]
        
        print(f"📝 Step 2: Select best paper")
        await asyncio.sleep(1)
        best = papers[0]
        
        print(f"✍️ Step 3: Summarize '{best}'")
        await asyncio.sleep(1)
        summary = "This is a summary..."
        
        print(f"🎙️ Step 4: Generate script")
        await asyncio.sleep(1)
        script = "Welcome to the podcast..."
        
        print(f"🔊 Step 5: Convert to audio")
        await asyncio.sleep(2)
        audio = "podcast.mp3"
        
        print(f"✅ Done!")
        return summary, script, audio

async def main():
    storyteller = MiniStoryteller()
    result = await storyteller.process("AlphaFold")
    print(f"\nResult: {result}")

asyncio.run(main())
```

---

## Common Patterns Explained

### Pattern 1: Async Context Managers

**What you see:**
```python
self.exit_stack = stdio_client(server_params)
stdio_transport = await self.exit_stack.__aenter__()
# ... use the connection ...
await self.exit_stack.__aexit__(None, None, None)
```

**What it means:**
- `__aenter__`: Setup (open connection, allocate resources)
- `__aexit__`: Cleanup (close connection, free resources)

**Better syntax:**
```python
async with stdio_client(server_params) as stdio_transport:
    # Connection is open here
    read_stream, write_stream = stdio_transport
    # ... use streams ...
# Connection automatically closed when block exits
```

**Why the manual version in the code?**
- Need to keep connection alive for multiple operations
- Can't use `async with` because connection persists beyond one function call
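
When a connection has to outlive a single function call, the standard library's `contextlib.AsyncExitStack` is one way to keep `async with`-style cleanup without calling `__aenter__` and `__aexit__` by hand. The sketch below uses a fake connection (`fake_connection`, hypothetical) in place of `stdio_client`; it is not the project's implementation.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events = []


@asynccontextmanager
async def fake_connection():
    events.append("open")
    try:
        yield "stream"
    finally:
        events.append("close")


class Tool:
    def __init__(self):
        self._stack = AsyncExitStack()
        self.stream = None

    async def connect(self):
        # Enters the context manager now; its exit is deferred until aclose().
        self.stream = await self._stack.enter_async_context(fake_connection())

    async def disconnect(self):
        # Runs every registered exit callback, most recent first.
        await self._stack.aclose()


async def main():
    tool = Tool()
    await tool.connect()
    events.append(f"use {tool.stream}")
    await tool.disconnect()


asyncio.run(main())
print(events)
```

The stack can hold several context managers, so a tool that opens both a transport and a session can close them in the right order with one `aclose()` call.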

---

### Pattern 2: Optional Parameters with Defaults

```python
async def search(self, topic: str, max_results: int = 5):
    """Search with default max_results"""
```

**Usage:**
```python
# Use default
papers = await agent.search("AI")  # max_results=5

# Override default  
papers = await agent.search("AI", max_results=10)
```

---

### Pattern 3: Type Hints

```python
async def search_papers(
    self,
    query: str,                    # Expected to be a string
    max_results: int = 5,          # Expected to be an int, defaults to 5
    sort_by: str = "relevance"     # Expected to be a string, defaults to "relevance"
) -> List[Dict[str, Any]]:         # Returns a list of dictionaries
```

**Benefits:**
- Self-documenting code
- IDE autocomplete
- Type checking tools (mypy)
- Easier to catch bugs
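
One caveat worth seeing in action: Python does not check type hints at runtime. In the small sketch below (the `double` function is made up for illustration), passing a string where an `int` is hinted runs without error; a checker such as mypy would flag it before the code runs.

```python
def double(value: int) -> int:
    """Hinted for ints, but Python itself will not enforce that."""
    return value * 2


print(double(21))    # int input, as the hint intends
print(double("ab"))  # str input still runs: "*" repeats the string

# The hints are stored as metadata that tools can read.
print(double.__annotations__)
```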

---

### Pattern 4: Dictionary `.get()` with Defaults

```python
title = paper.get('title', 'Unknown')  # Returns 'Unknown' if 'title' key missing
```

**Why not just `paper['title']`?**
- `paper['title']` → Raises `KeyError` if missing
- `paper.get('title', 'Unknown')` → Returns default if missing (safer)

---

### Pattern 5: List Comprehension

```python
author_names = [
    author.get('name', '')
    for author in authors[:5]
    if isinstance(author, dict)
]
```

**Equivalent to:**
```python
author_names = []
for author in authors[:5]:
    if isinstance(author, dict):
        author_names.append(author.get('name', ''))
```

---

### Pattern 6: Try/Except for Error Handling

```python
try:
    result = await api_call()
    return result
except Exception as e:
    logger.error(f"API error: {e}")
    return fallback_result()
```

**Why?**
- External APIs can fail
- Network can be unreliable
- Graceful degradation instead of crashes
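
Before falling back, it is often worth retrying a flaky call a few times. The helper below is a minimal sketch of retry with exponential backoff; `with_retries`, `flaky_call`, and the delay values are illustrative assumptions, not code from the project.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    func: Callable[[], Awaitable[T]],
    fallback: T,
    attempts: int = 3,
    base_delay: float = 0.01,
) -> T:
    """Await func() up to `attempts` times, then return `fallback`."""
    for attempt in range(attempts):
        try:
            return await func()
        except Exception as exc:
            print(f"Attempt {attempt + 1} failed: {exc}")
            if attempt < attempts - 1:
                # Wait longer after each failure: base, 2x base, 4x base, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    return fallback


calls = {"count": 0}


async def flaky_call() -> str:
    """Fails twice, then succeeds (simulates a temporary outage)."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network error")
    return "ok"


result = asyncio.run(with_retries(flaky_call, fallback="fallback"))
print(result)
```

In real use, catching a narrower exception type than `Exception` avoids retrying on programming errors that will never succeed.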

---

## Debugging Tips

### Tip 1: Use Print Debugging

Add strategic print statements:

```python
async def search(self, topic: str):
    print(f"🔍 [DEBUG] Searching for: {topic}")
    
    enhanced = self._enhance_query(topic)
    print(f"🔍 [DEBUG] Enhanced to: {enhanced}")
    
    papers = await self.arxiv_tool.search_papers(enhanced)
    print(f"🔍 [DEBUG] Found {len(papers)} papers")
    
    return papers
```

---

### Tip 2: Check Logs

The app uses Python's logging:

```python
logging.basicConfig(
    level=logging.INFO,  # Change to DEBUG for more detail
    format='%(levelname)s - %(name)s - %(message)s'
)
```

After switching the level to `DEBUG`, run the app and save a copy of its output (stdout and stderr) to `app.log`:
```bash
python app.py 2>&1 | tee app.log
```

---

### Tip 3: Use Python REPL

Test small pieces interactively:

```bash
$ python
>>> from utils.script_formatter import estimate_duration
>>> text = "Hello world, this is a test."
>>> duration = estimate_duration(text)
>>> print(duration)
5
```

---

### Tip 4: Check Environment Variables

```bash
# Verify API keys are set
echo $ANTHROPIC_API_KEY
echo $ELEVENLABS_API_KEY

# Or from Python
python -c 'import os; print(os.getenv("ANTHROPIC_API_KEY"))'
```

---

### Tip 5: Test Error Cases

```python
# Test with invalid input
result = await storyteller.process_topic("")  # Empty string
result = await storyteller.process_topic("xyzinvalidtopic999")  # No results
```

---

### Tip 6: Use Async Debugger

For complex async issues, enable asyncio's debug mode, which logs slow callbacks and coroutines that were never awaited:

```python
import asyncio
asyncio.run(my_function(), debug=True)  # Enables debug mode
```
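
A quick way to confirm that debug mode is active is to ask the running loop. `check_debug` below is a throwaway name for illustration. In debug mode, the threshold for reporting a slow callback is `loop.slow_callback_duration`, which defaults to 0.1 seconds.

```python
import asyncio


async def check_debug() -> bool:
    """Report whether the currently running event loop is in debug mode."""
    loop = asyncio.get_running_loop()
    return loop.get_debug()


debug_on = asyncio.run(check_debug(), debug=True)
debug_off = asyncio.run(check_debug(), debug=False)
print(debug_on, debug_off)
```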

---

## Further Resources

### Official Documentation

- **Python Async/Await**: [RealPython Guide](https://realpython.com/async-io-python/)
- **MCP Protocol**: [Official Docs](https://modelcontextprotocol.io/)
- **Anthropic Claude API**: [API Reference](https://docs.anthropic.com/claude/reference)
- **Gradio**: [Documentation](https://www.gradio.app/docs)
- **ElevenLabs**: [API Docs](https://elevenlabs.io/docs/api-reference)

### Learning Paths

**If you're new to async:**
1. Read RealPython's async guide
2. Practice with simple async examples
3. Understand event loops
4. Study this project's async chain

**If you're new to OOP:**
1. Python classes tutorial
2. Understand `self` and `__init__`
3. Practice with simple class examples
4. Study `ScienceStoryteller` class

**If you're new to MCP:**
1. Read MCP specification
2. Understand stdio transport
3. Study `ArxivTool` implementation
4. Try building your own MCP tool

### Practice Projects

**After understanding this codebase:**

1. **Add a new MCP tool**: Try Semantic Scholar instead of arXiv
2. **Add a new agent**: Create a fact-checking agent
3. **Extend functionality**: Add multiple podcast voices
4. **Improve error handling**: Better retry logic
5. **Add caching**: Cache arXiv results for 24 hours

---

## Review Checklist

Before moving on, can you answer:

- [ ] What's the difference between a class and an object?
- [ ] What does `self` refer to?
- [ ] When does `__init__` run?
- [ ] Why use `async`/`await`?
- [ ] How does the event loop work?
- [ ] What is MCP and why use it?
- [ ] How do the three agents differ?
- [ ] What does the orchestrator do?
- [ ] How does Gradio integrate with async?
- [ ] Where would you add error handling?
- [ ] What is the difference between a unit and an integration test?

---

## Your Learning Journey

**Recommended 3-Week Plan:**

### Week 1: Fundamentals
- Day 1-2: OOP basics (`__init__`, `self`, methods)
- Day 3-4: Async/await concepts
- Day 5-7: Study `utils/` and `mcp_tools/`

### Week 2: Implementation
- Day 8-10: Understand all three agents
- Day 11-12: Study orchestrator
- Day 13-14: Explore Gradio interface

### Week 3: Integration & Polish
- Day 15-17: Test full workflow
- Day 18-19: Fix bugs, improve error handling
- Day 20-21: Polish UI, prepare demo

---

**Remember:** Deep understanding takes time. Don't rush. Each module builds on the previous one. Master the basics before tackling integration!

---

**Last Updated:** November 17, 2025  
**Version:** 1.0  
**For:** MCP's 1st Birthday Hackathon 2025

---

## 🧪 Testing Strategy

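A good testing strategy is crucial for building reliable software. For this project, we can use a model called the "Testing Pyramid": many fast unit tests at the base, fewer integration tests in the middle, and a small number of end-to-end tests at the top. Load and security tests sit outside the pyramid and are listed here as complements.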

### Unit Tests

**Definition:** Test individual components in isolation.

- **What to test:** Pure functions, methods with no external dependencies.
- **Tools:** Python's built-in `unittest` or `pytest`.
- **Example:**
    ```python
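    import unittest
    from agents.research_agent import ResearchAgent

    class TestResearchAgent(unittest.TestCase):
        def test_enhance_query(self):
            # Pure method: no MCP server or API key needed
            agent = ResearchAgent()
            enhanced = agent._enhance_query("quantum")
            self.assertIn("quantum computing physics", enhanced)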
    ```

### Integration Tests

**Definition:** Test how components work together.

- **What to test:** Interactions between modules, like agent and tool communication.
- **Tools:** `pytest` with async support (for example, the `pytest-asyncio` plugin).
- **Example:**
    ```python
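    import pytest
    from agents.research_agent import ResearchAgent

    @pytest.mark.asyncio  # requires the pytest-asyncio plugin
    async def test_agent_tool_integration():
        agent = ResearchAgent()
        await agent.initialize()
        try:
            papers = await agent.search("AI")
            assert isinstance(papers, list)
            assert len(papers) > 0
        finally:
            await agent.cleanup()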
    ```

### End-to-End Tests

**Definition:** Test the complete workflow from start to finish.

- **What to test:** User scenarios, like submitting a topic and receiving audio.
- **Tools:** `gradio_client` for API-level tests, Selenium for browser UI tests.
- **Example:**
    ```python
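    # `client` is a hypothetical HTTP test-client fixture; the route and
    # payload shape depend on the installed Gradio version.
    def test_gradio_interface(client):
        response = client.post("/api/predict", json={"data": ["AI in healthcare"]})
        assert response.status_code == 200
        assert "data" in response.json()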
    ```

### Load Tests

**Definition:** Test system behavior under heavy load.

- **What to test:** How the system handles many requests at once.
- **Tools:** Locust, JMeter.
- **Example:**
    ```
    locust -f load_test.py
    ```

### Security Tests

**Definition:** Identify vulnerabilities in the application.

- **What to test:** API security, data validation, authentication.
- **Tools:** OWASP ZAP, Burp Suite.
- **Example:**
    ```
    zap-cli quick-scan --self-contained --spider -r http://localhost:7860
    ```

### Best Practices

- **Automate tests**: Use CI/CD pipelines to run tests automatically.
- **Test coverage**: Aim for at least 80% coverage, but prioritize critical paths.
- **Mock external services**: Use `unittest.mock.AsyncMock` for async methods, or tools like `vcr.py` or `responses` to replay or stub HTTP calls.
- **Data-driven tests**: Use parameterized tests to cover multiple scenarios.
- **Regularly review and update tests**: As the code evolves, so should the tests.

---