VibecoderMcSwaggins committed on
Commit e720905 · 1 Parent(s): 631e5fc

fix: complete audit fixes for documentation accuracy


Code Changes:
- Remove DeepCriticalError backwards-compat alias (src/utils/exceptions.py)

Documentation Accuracy Fixes:
- Update CLAUDE.md, AGENTS.md, GEMINI.md: "Phases 1-13" → "Phases 1-14"
- Replace 11_phase_biorxiv.md with 11_phase_europepmc.md (matches the actual implementation)
- Fix all bioRxiv → Europe PMC references in:
  - workflow-diagrams.md (mermaid diagrams)
  - 05_phase_magentic.md (code examples)
  - 04_phase_ui.md (imports)
  - roadmap.md (directory tree, phase list)
  - index.md (links)

All 127 tests still pass. Documentation now accurately reflects the codebase.

AGENTS.md CHANGED
@@ -6,7 +6,7 @@ This file provides guidance to AI agents when working with code in this reposito
6
 
7
  DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
- **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
11
  ## Development Commands
12
 
 
6
 
7
  DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
+ **Current Status:** Phases 1-14 COMPLETE (Foundation through Demo Submission).
10
 
11
  ## Development Commands
12
 
CLAUDE.md CHANGED
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
6
 
7
  DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
- **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
11
  ## Development Commands
12
 
 
6
 
7
  DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
+ **Current Status:** Phases 1-14 COMPLETE (Foundation through Demo Submission).
10
 
11
  ## Development Commands
12
 
GEMINI.md CHANGED
@@ -8,12 +8,7 @@
8
  **Architecture:**
9
  The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
10
 
11
- **Current Status:**
12
-
13
- - **Phases 1-9:** COMPLETE. Foundation, Search, Judge, UI, Orchestrator, Embeddings, Hypothesis, Report, Cleanup.
14
- - **Phases 10-11:** COMPLETE. ClinicalTrials.gov and Europe PMC integration.
15
- - **Phase 12:** COMPLETE. MCP Server integration (Gradio MCP at `/gradio_api/mcp/`).
16
- - **Phase 13:** COMPLETE. Modal sandbox for statistical analysis.
17
 
18
  ## Tech Stack & Tooling
19
 
 
8
  **Architecture:**
9
  The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
10
 
11
+ **Current Status:** Phases 1-14 COMPLETE (Foundation through Demo Submission).
 
12
 
13
  ## Tech Stack & Tooling
14
 
docs/implementation/04_phase_ui.md CHANGED
@@ -409,7 +409,7 @@ from typing import AsyncGenerator
409
  from src.orchestrator import Orchestrator
410
  from src.tools.pubmed import PubMedTool
411
  from src.tools.clinicaltrials import ClinicalTrialsTool
412
- from src.tools.biorxiv import BioRxivTool
413
  from src.tools.search_handler import SearchHandler
414
  from src.agent_factory.judges import JudgeHandler, HFInferenceJudgeHandler
415
  from src.utils.models import OrchestratorConfig, AgentEvent
@@ -443,7 +443,7 @@ def create_orchestrator(
443
 
444
  # Create search tools
445
  search_handler = SearchHandler(
446
- tools=[PubMedTool(), ClinicalTrialsTool(), BioRxivTool()],
447
  timeout=30.0,
448
  )
449
 
@@ -1033,7 +1033,7 @@ uv run python -m src.app
1033
  import asyncio
1034
  from src.orchestrator import Orchestrator
1035
  from src.tools.pubmed import PubMedTool
1036
- from src.tools.biorxiv import BioRxivTool
1037
  from src.tools.clinicaltrials import ClinicalTrialsTool
1038
  from src.tools.search_handler import SearchHandler
1039
  from src.agent_factory.judges import HFInferenceJudgeHandler, MockJudgeHandler
@@ -1041,7 +1041,7 @@ from src.utils.models import OrchestratorConfig
1041
 
1042
  async def test_full_flow():
1043
  # Create components
1044
- search_handler = SearchHandler([PubMedTool(), ClinicalTrialsTool(), BioRxivTool()])
1045
 
1046
  # Option 1: Use FREE HuggingFace Inference (real AI analysis)
1047
  judge_handler = HFInferenceJudgeHandler()
 
409
  from src.orchestrator import Orchestrator
410
  from src.tools.pubmed import PubMedTool
411
  from src.tools.clinicaltrials import ClinicalTrialsTool
412
+ from src.tools.europepmc import EuropePMCTool
413
  from src.tools.search_handler import SearchHandler
414
  from src.agent_factory.judges import JudgeHandler, HFInferenceJudgeHandler
415
  from src.utils.models import OrchestratorConfig, AgentEvent
 
443
 
444
  # Create search tools
445
  search_handler = SearchHandler(
446
+ tools=[PubMedTool(), ClinicalTrialsTool(), EuropePMCTool()],
447
  timeout=30.0,
448
  )
449
 
 
1033
  import asyncio
1034
  from src.orchestrator import Orchestrator
1035
  from src.tools.pubmed import PubMedTool
1036
+ from src.tools.europepmc import EuropePMCTool
1037
  from src.tools.clinicaltrials import ClinicalTrialsTool
1038
  from src.tools.search_handler import SearchHandler
1039
  from src.agent_factory.judges import HFInferenceJudgeHandler, MockJudgeHandler
 
1041
 
1042
  async def test_full_flow():
1043
  # Create components
1044
+ search_handler = SearchHandler([PubMedTool(), ClinicalTrialsTool(), EuropePMCTool()])
1045
 
1046
  # Option 1: Use FREE HuggingFace Inference (real AI analysis)
1047
  judge_handler = HFInferenceJudgeHandler()
docs/implementation/05_phase_magentic.md CHANGED
@@ -97,9 +97,9 @@ async def search_pubmed(query: str, max_results: int = 10) -> str:
97
  search_agent = ChatAgent(
98
  name="SearchAgent",
99
  description="Searches biomedical databases for drug repurposing evidence",
100
- instructions="You search PubMed, ClinicalTrials.gov, and bioRxiv for evidence.",
101
  chat_client=OpenAIChatClient(model_id="gpt-4o-mini"), # INTERNAL LLM
102
- tools=[search_pubmed, search_clinicaltrials, search_biorxiv], # TOOLS
103
  )
104
  ```
105
 
@@ -286,14 +286,14 @@ This preserves semantic deduplication and structured citation access.
286
  from agent_framework import AIFunction
287
 
288
  from src.agents.state import get_magentic_state
289
- from src.tools.biorxiv import BioRxivTool
290
  from src.tools.clinicaltrials import ClinicalTrialsTool
291
  from src.tools.pubmed import PubMedTool
292
 
293
  # Singleton tool instances
294
  _pubmed = PubMedTool()
295
  _clinicaltrials = ClinicalTrialsTool()
296
- _biorxiv = BioRxivTool()
297
 
298
 
299
  def _format_results(results: list, source_name: str, query: str) -> str:
@@ -382,21 +382,21 @@ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
382
 
383
 
384
  @AIFunction
385
- async def search_preprints(query: str, max_results: int = 10) -> str:
386
- """Search bioRxiv/medRxiv for preprint papers.
387
 
388
- Use this tool to find the latest research that hasn't been
389
- peer-reviewed yet. Good for cutting-edge findings.
390
 
391
  Args:
392
  query: Search terms (e.g., "long covid treatment")
393
  max_results: Maximum results to return (default 10)
394
 
395
  Returns:
396
- Formatted list of preprints with abstracts and links
397
  """
398
  # 1. Execute search
399
- results = await _biorxiv.search(query, max_results)
400
 
401
  # 2. Update shared state
402
  state = get_magentic_state()
@@ -406,7 +406,7 @@ async def search_preprints(query: str, max_results: int = 10) -> str:
406
  total_new = len(unique)
407
  total_stored = len(state.evidence_store)
408
 
409
- output = _format_results(results, "bioRxiv/medRxiv", query)
410
  output += f"\n[State: {total_new} new, {total_stored} total in evidence store]"
411
 
412
  return output
@@ -513,7 +513,7 @@ def create_search_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgen
513
 
514
  return ChatAgent(
515
  name="SearchAgent",
516
- description="Searches biomedical databases (PubMed, ClinicalTrials.gov, bioRxiv) for drug repurposing evidence",
517
  instructions="""You are a biomedical search specialist. When asked to find evidence:
518
 
519
  1. Analyze the request to determine what to search for
@@ -521,13 +521,13 @@ def create_search_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgen
521
  3. Use the appropriate search tools:
522
  - search_pubmed for peer-reviewed papers
523
  - search_clinical_trials for clinical studies
524
- - search_preprints for cutting-edge findings
525
  4. Summarize what you found and highlight key evidence
526
 
527
  Be thorough - search multiple databases when appropriate.
528
  Focus on finding: mechanisms of action, clinical evidence, and specific drug candidates.""",
529
  chat_client=client,
530
- tools=[search_pubmed, search_clinical_trials, search_preprints],
531
  temperature=0.3, # More deterministic for tool use
532
  )
533
 
@@ -790,7 +790,7 @@ class MagenticOrchestrator:
790
  task = f"""Research drug repurposing opportunities for: {query}
791
 
792
  Workflow:
793
- 1. SearchAgent: Find evidence from PubMed, ClinicalTrials.gov, and bioRxiv
794
 2. HypothesisAgent: Generate mechanistic hypotheses (Drug → Target → Pathway → Effect)
795
 3. JudgeAgent: Evaluate if evidence is sufficient
796
 4. If insufficient → SearchAgent refines search based on gaps
 
97
  search_agent = ChatAgent(
98
  name="SearchAgent",
99
  description="Searches biomedical databases for drug repurposing evidence",
100
+ instructions="You search PubMed, ClinicalTrials.gov, and Europe PMC for evidence.",
101
  chat_client=OpenAIChatClient(model_id="gpt-4o-mini"), # INTERNAL LLM
102
+ tools=[search_pubmed, search_clinicaltrials, search_europepmc], # TOOLS
103
  )
104
  ```
105
 
 
286
  from agent_framework import AIFunction
287
 
288
  from src.agents.state import get_magentic_state
289
+ from src.tools.europepmc import EuropePMCTool
290
  from src.tools.clinicaltrials import ClinicalTrialsTool
291
  from src.tools.pubmed import PubMedTool
292
 
293
  # Singleton tool instances
294
  _pubmed = PubMedTool()
295
  _clinicaltrials = ClinicalTrialsTool()
296
+ _europepmc = EuropePMCTool()
297
 
298
 
299
  def _format_results(results: list, source_name: str, query: str) -> str:
 
382
 
383
 
384
  @AIFunction
385
+ async def search_europepmc(query: str, max_results: int = 10) -> str:
386
+ """Search Europe PMC for preprints and papers.
387
 
388
+ Use this tool to find the latest research including preprints
389
+ from bioRxiv, medRxiv, and peer-reviewed papers.
390
 
391
  Args:
392
  query: Search terms (e.g., "long covid treatment")
393
  max_results: Maximum results to return (default 10)
394
 
395
  Returns:
396
+ Formatted list of papers with abstracts and links
397
  """
398
  # 1. Execute search
399
+ results = await _europepmc.search(query, max_results)
400
 
401
  # 2. Update shared state
402
  state = get_magentic_state()
 
406
  total_new = len(unique)
407
  total_stored = len(state.evidence_store)
408
 
409
+ output = _format_results(results, "Europe PMC", query)
410
  output += f"\n[State: {total_new} new, {total_stored} total in evidence store]"
411
 
412
  return output
 
513
 
514
  return ChatAgent(
515
  name="SearchAgent",
516
+ description="Searches biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) for drug repurposing evidence",
517
  instructions="""You are a biomedical search specialist. When asked to find evidence:
518
 
519
  1. Analyze the request to determine what to search for
 
521
  3. Use the appropriate search tools:
522
  - search_pubmed for peer-reviewed papers
523
  - search_clinical_trials for clinical studies
524
+ - search_europepmc for preprints and additional papers
525
  4. Summarize what you found and highlight key evidence
526
 
527
  Be thorough - search multiple databases when appropriate.
528
  Focus on finding: mechanisms of action, clinical evidence, and specific drug candidates.""",
529
  chat_client=client,
530
+ tools=[search_pubmed, search_clinical_trials, search_europepmc],
531
  temperature=0.3, # More deterministic for tool use
532
  )
533
 
 
790
  task = f"""Research drug repurposing opportunities for: {query}
791
 
792
  Workflow:
793
+ 1. SearchAgent: Find evidence from PubMed, ClinicalTrials.gov, and Europe PMC
794
 2. HypothesisAgent: Generate mechanistic hypotheses (Drug → Target → Pathway → Effect)
795
 3. JudgeAgent: Evaluate if evidence is sufficient
796
 4. If insufficient → SearchAgent refines search based on gaps
docs/implementation/11_phase_biorxiv.md DELETED
@@ -1,572 +0,0 @@
1
- # Phase 11 Implementation Spec: bioRxiv Preprint Integration
2
-
3
- **Goal**: Add cutting-edge preprint search for the latest research.
4
- **Philosophy**: "Preprints are where breakthroughs appear first."
5
- **Prerequisite**: Phase 10 complete (ClinicalTrials.gov working)
6
- **Estimated Time**: 2-3 hours
7
-
8
- ---
9
-
10
- ## 1. Why bioRxiv?
11
-
12
- ### Scientific Value
13
-
14
- | Feature | Value for Drug Repurposing |
15
- |---------|---------------------------|
16
- | **Cutting-edge research** | 6-12 months ahead of PubMed |
17
- | **Rapid publication** | Days, not months |
18
- | **Free full-text** | Complete papers, not just abstracts |
19
- | **medRxiv included** | Medical preprints via same API |
20
- | **No API key required** | Free and open |
21
-
22
- ### The Preprint Advantage
23
-
24
- ```
25
- Traditional Publication Timeline:
26
- Research → Submit → Review → Revise → Accept → Publish
27
- |___________________________ 6-18 months _______________|
28
-
29
- Preprint Timeline:
30
- Research → Upload → Available
31
- |______ 1-3 days ______|
32
- ```
33
-
34
- **For drug repurposing**: Preprints contain the newest hypotheses and evidence!
35
-
36
- ---
37
-
38
- ## 2. API Specification
39
-
40
- ### Endpoint
41
-
42
- ```
43
- Base URL: https://api.biorxiv.org/details/[server]/[interval]/[cursor]/[format]
44
- ```
45
-
46
- ### Servers
47
-
48
- | Server | Content |
49
- |--------|---------|
50
- | `biorxiv` | Biology preprints |
51
- | `medrxiv` | Medical preprints (more relevant for us!) |
52
-
53
- ### Interval Formats
54
-
55
- | Format | Example | Description |
56
- |--------|---------|-------------|
57
- | Date range | `2024-01-01/2024-12-31` | Papers between dates |
58
- | Recent N | `50` | Most recent N papers |
59
- | Recent N days | `30d` | Papers from last N days |
60
-
61
- ### Response Format
62
-
63
- ```json
64
- {
65
- "collection": [
66
- {
67
- "doi": "10.1101/2024.01.15.123456",
68
- "title": "Metformin repurposing for neurodegeneration",
69
- "authors": "Smith, J; Jones, A",
70
- "date": "2024-01-15",
71
- "category": "neuroscience",
72
- "abstract": "We investigated metformin's potential..."
73
- }
74
- ],
75
- "messages": [{"status": "ok", "count": 100}]
76
- }
77
- ```
78
-
79
- ### Rate Limits
80
-
81
- - No official limit, but be respectful
82
- - Results paginated (100 per call)
83
- - Use cursor for pagination
84
-
85
- ### Documentation
86
-
87
- - [bioRxiv API](https://api.biorxiv.org/)
88
- - [medrxivr R package docs](https://docs.ropensci.org/medrxivr/)
89
-
90
- ---
91
-
92
- ## 3. Search Strategy
93
-
94
- ### Challenge: bioRxiv API Limitations
95
-
96
- The bioRxiv API does NOT support keyword search directly. It returns papers by:
97
- - Date range
98
- - Recent count
99
-
100
- ### Solution: Client-Side Filtering
101
-
102
- ```python
103
- # Strategy:
104
- # 1. Fetch recent papers (e.g., last 90 days)
105
- # 2. Filter by keyword matching in title/abstract
106
- # 3. Use embeddings for semantic matching (leverage Phase 6!)
107
- ```
108
-
109
- ### Alternative: Content Search Endpoint
110
-
111
- ```
112
- https://api.biorxiv.org/pubs/[server]/[doi_prefix]
113
- ```
114
-
115
- For searching, we can use the publisher endpoint with filtering.
116
-
117
- ---
118
-
119
- ## 4. Data Model
120
-
121
- ### 4.1 Update Citation Source Type (`src/utils/models.py`)
122
-
123
- ```python
124
- # After Phase 11
125
- source: Literal["pubmed", "clinicaltrials", "biorxiv"]
126
- ```
127
-
128
- ### 4.2 Evidence from Preprints
129
-
130
- ```python
131
- Evidence(
132
- content=abstract[:2000],
133
- citation=Citation(
134
- source="biorxiv", # or "medrxiv"
135
- title=title,
136
- url=f"https://doi.org/{doi}",
137
- date=date,
138
- authors=authors.split("; ")[:5]
139
- ),
140
- relevance=0.75 # Preprints slightly lower than peer-reviewed
141
- )
142
- ```
143
-
144
- ---
145
-
146
- ## 5. Implementation
147
-
148
- ### 5.1 bioRxiv Tool (`src/tools/biorxiv.py`)
149
-
150
- ```python
151
- """bioRxiv/medRxiv preprint search tool."""
152
-
153
- import re
154
- from datetime import datetime, timedelta
155
-
156
- import httpx
157
- from tenacity import retry, stop_after_attempt, wait_exponential
158
-
159
- from src.utils.exceptions import SearchError
160
- from src.utils.models import Citation, Evidence
161
-
162
-
163
- class BioRxivTool:
164
- """Search tool for bioRxiv and medRxiv preprints."""
165
-
166
- BASE_URL = "https://api.biorxiv.org/details"
167
- # Use medRxiv for medical/clinical content (more relevant for drug repurposing)
168
- DEFAULT_SERVER = "medrxiv"
169
- # Fetch papers from last N days
170
- DEFAULT_DAYS = 90
171
-
172
- def __init__(self, server: str = DEFAULT_SERVER, days: int = DEFAULT_DAYS):
173
- """
174
- Initialize bioRxiv tool.
175
-
176
- Args:
177
- server: "biorxiv" or "medrxiv"
178
- days: How many days back to search
179
- """
180
- self.server = server
181
- self.days = days
182
-
183
- @property
184
- def name(self) -> str:
185
- return "biorxiv"
186
-
187
- @retry(
188
- stop=stop_after_attempt(3),
189
- wait=wait_exponential(multiplier=1, min=1, max=10),
190
- reraise=True,
191
- )
192
- async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
193
- """
194
- Search bioRxiv/medRxiv for preprints matching query.
195
-
196
- Note: bioRxiv API doesn't support keyword search directly.
197
- We fetch recent papers and filter client-side.
198
-
199
- Args:
200
- query: Search query (keywords)
201
- max_results: Maximum results to return
202
-
203
- Returns:
204
- List of Evidence objects from preprints
205
- """
206
- # Build date range for last N days
207
- end_date = datetime.now().strftime("%Y-%m-%d")
208
- start_date = (datetime.now() - timedelta(days=self.days)).strftime("%Y-%m-%d")
209
- interval = f"{start_date}/{end_date}"
210
-
211
- # Fetch recent papers
212
- url = f"{self.BASE_URL}/{self.server}/{interval}/0/json"
213
-
214
- async with httpx.AsyncClient(timeout=30.0) as client:
215
- try:
216
- response = await client.get(url)
217
- response.raise_for_status()
218
- except httpx.HTTPStatusError as e:
219
- raise SearchError(f"bioRxiv search failed: {e}") from e
220
-
221
- data = response.json()
222
- papers = data.get("collection", [])
223
-
224
- # Filter papers by query keywords
225
- query_terms = self._extract_terms(query)
226
- matching = self._filter_by_keywords(papers, query_terms, max_results)
227
-
228
- return [self._paper_to_evidence(paper) for paper in matching]
229
-
230
- def _extract_terms(self, query: str) -> list[str]:
231
- """Extract search terms from query."""
232
- # Simple tokenization, lowercase
233
- terms = re.findall(r'\b\w+\b', query.lower())
234
- # Filter out common stop words
235
- stop_words = {'the', 'a', 'an', 'in', 'on', 'for', 'and', 'or', 'of', 'to'}
236
- return [t for t in terms if t not in stop_words and len(t) > 2]
237
-
238
- def _filter_by_keywords(
239
- self, papers: list[dict], terms: list[str], max_results: int
240
- ) -> list[dict]:
241
- """Filter papers that contain query terms in title or abstract."""
242
- scored_papers = []
243
-
244
- for paper in papers:
245
- title = paper.get("title", "").lower()
246
- abstract = paper.get("abstract", "").lower()
247
- text = f"{title} {abstract}"
248
-
249
- # Count matching terms
250
- matches = sum(1 for term in terms if term in text)
251
-
252
- if matches > 0:
253
- scored_papers.append((matches, paper))
254
-
255
- # Sort by match count (descending)
256
- scored_papers.sort(key=lambda x: x[0], reverse=True)
257
-
258
- return [paper for _, paper in scored_papers[:max_results]]
259
-
260
- def _paper_to_evidence(self, paper: dict) -> Evidence:
261
- """Convert a preprint paper to Evidence."""
262
- doi = paper.get("doi", "")
263
- title = paper.get("title", "Untitled")
264
- authors_str = paper.get("authors", "Unknown")
265
- date = paper.get("date", "Unknown")
266
- abstract = paper.get("abstract", "No abstract available.")
267
- category = paper.get("category", "")
268
-
269
- # Parse authors (format: "Smith, J; Jones, A")
270
- authors = [a.strip() for a in authors_str.split(";")][:5]
271
-
272
- # Note this is a preprint in the content
273
- content = (
274
- f"[PREPRINT - Not peer-reviewed] "
275
- f"{abstract[:1800]}... "
276
- f"Category: {category}."
277
- )
278
-
279
- return Evidence(
280
- content=content[:2000],
281
- citation=Citation(
282
- source="biorxiv",
283
- title=title[:500],
284
- url=f"https://doi.org/{doi}" if doi else f"https://www.medrxiv.org/",
285
- date=date,
286
- authors=authors,
287
- ),
288
- relevance=0.75, # Slightly lower than peer-reviewed
289
- )
290
- ```
291
-
292
- ---
293
-
294
- ## 6. TDD Test Suite
295
-
296
- ### 6.1 Unit Tests (`tests/unit/tools/test_biorxiv.py`)
297
-
298
- ```python
299
- """Unit tests for bioRxiv tool."""
300
-
301
- import pytest
302
- import respx
303
- from httpx import Response
304
-
305
- from src.tools.biorxiv import BioRxivTool
306
- from src.utils.models import Evidence
307
-
308
-
309
- @pytest.fixture
310
- def mock_biorxiv_response():
311
- """Mock bioRxiv API response."""
312
- return {
313
- "collection": [
314
- {
315
- "doi": "10.1101/2024.01.15.24301234",
316
- "title": "Metformin repurposing for Alzheimer's disease: a systematic review",
317
- "authors": "Smith, John; Jones, Alice; Brown, Bob",
318
- "date": "2024-01-15",
319
- "category": "neurology",
320
- "abstract": "Background: Metformin has shown neuroprotective effects. "
321
- "We conducted a systematic review of metformin's potential "
322
- "for Alzheimer's disease treatment."
323
- },
324
- {
325
- "doi": "10.1101/2024.01.10.24301111",
326
- "title": "COVID-19 vaccine efficacy study",
327
- "authors": "Wilson, C",
328
- "date": "2024-01-10",
329
- "category": "infectious diseases",
330
- "abstract": "This study evaluates COVID-19 vaccine efficacy."
331
- }
332
- ],
333
- "messages": [{"status": "ok", "count": 2}]
334
- }
335
-
336
-
337
- class TestBioRxivTool:
338
- """Tests for BioRxivTool."""
339
-
340
- def test_tool_name(self):
341
- """Tool should have correct name."""
342
- tool = BioRxivTool()
343
- assert tool.name == "biorxiv"
344
-
345
- def test_default_server_is_medrxiv(self):
346
- """Default server should be medRxiv for medical relevance."""
347
- tool = BioRxivTool()
348
- assert tool.server == "medrxiv"
349
-
350
- @pytest.mark.asyncio
351
- @respx.mock
352
- async def test_search_returns_evidence(self, mock_biorxiv_response):
353
- """Search should return Evidence objects."""
354
- respx.get(url__startswith="https://api.biorxiv.org/details").mock(
355
- return_value=Response(200, json=mock_biorxiv_response)
356
- )
357
-
358
- tool = BioRxivTool()
359
- results = await tool.search("metformin alzheimer", max_results=5)
360
-
361
- assert len(results) == 1 # Only the matching paper
362
- assert isinstance(results[0], Evidence)
363
- assert results[0].citation.source == "biorxiv"
364
- assert "metformin" in results[0].citation.title.lower()
365
-
366
- @pytest.mark.asyncio
367
- @respx.mock
368
- async def test_search_filters_by_keywords(self, mock_biorxiv_response):
369
- """Search should filter papers by query keywords."""
370
- respx.get(url__startswith="https://api.biorxiv.org/details").mock(
371
- return_value=Response(200, json=mock_biorxiv_response)
372
- )
373
-
374
- tool = BioRxivTool()
375
-
376
- # Search for metformin - should match first paper
377
- results = await tool.search("metformin")
378
- assert len(results) == 1
379
- assert "metformin" in results[0].citation.title.lower()
380
-
381
- # Search for COVID - should match second paper
382
- results = await tool.search("covid vaccine")
383
- assert len(results) == 1
384
- assert "covid" in results[0].citation.title.lower()
385
-
386
- @pytest.mark.asyncio
387
- @respx.mock
388
- async def test_search_marks_as_preprint(self, mock_biorxiv_response):
389
- """Evidence content should note it's a preprint."""
390
- respx.get(url__startswith="https://api.biorxiv.org/details").mock(
391
- return_value=Response(200, json=mock_biorxiv_response)
392
- )
393
-
394
- tool = BioRxivTool()
395
- results = await tool.search("metformin")
396
-
397
- assert "PREPRINT" in results[0].content
398
- assert "Not peer-reviewed" in results[0].content
399
-
400
- @pytest.mark.asyncio
401
- @respx.mock
402
- async def test_search_empty_results(self):
403
- """Search should handle empty results gracefully."""
404
- respx.get(url__startswith="https://api.biorxiv.org/details").mock(
405
- return_value=Response(200, json={"collection": [], "messages": []})
406
- )
407
-
408
- tool = BioRxivTool()
409
- results = await tool.search("xyznonexistent")
410
-
411
- assert results == []
412
-
413
- @pytest.mark.asyncio
414
- @respx.mock
415
- async def test_search_api_error(self):
416
- """Search should raise SearchError on API failure."""
417
- from src.utils.exceptions import SearchError
418
-
419
- respx.get(url__startswith="https://api.biorxiv.org/details").mock(
420
- return_value=Response(500, text="Internal Server Error")
421
- )
422
-
423
- tool = BioRxivTool()
424
-
425
- with pytest.raises(SearchError):
426
- await tool.search("metformin")
427
-
428
- def test_extract_terms(self):
429
- """Should extract meaningful search terms."""
430
- tool = BioRxivTool()
431
-
432
- terms = tool._extract_terms("metformin for Alzheimer's disease")
433
-
434
- assert "metformin" in terms
435
- assert "alzheimer" in terms
436
- assert "disease" in terms
437
- assert "for" not in terms # Stop word
438
- assert "the" not in terms # Stop word
439
-
440
-
441
- class TestBioRxivIntegration:
442
- """Integration tests (marked for separate run)."""
443
-
444
- @pytest.mark.integration
445
- @pytest.mark.asyncio
446
- async def test_real_api_call(self):
447
- """Test actual API call (requires network)."""
448
- tool = BioRxivTool(days=30) # Last 30 days
449
- results = await tool.search("diabetes", max_results=3)
450
-
451
- # May or may not find results depending on recent papers
452
- assert isinstance(results, list)
453
- for r in results:
454
- assert isinstance(r, Evidence)
455
- assert r.citation.source == "biorxiv"
456
- ```
457
-
458
- ---
459
-
460
- ## 7. Integration with SearchHandler
461
-
462
- ### 7.1 Final SearchHandler Configuration
463
-
464
- ```python
465
- # examples/search_demo/run_search.py
466
- from src.tools.biorxiv import BioRxivTool
467
- from src.tools.clinicaltrials import ClinicalTrialsTool
468
- from src.tools.pubmed import PubMedTool
469
- from src.tools.search_handler import SearchHandler
470
-
471
- search_handler = SearchHandler(
472
- tools=[
473
- PubMedTool(), # Peer-reviewed papers
474
- ClinicalTrialsTool(), # Clinical trials
475
- BioRxivTool(), # Preprints (cutting edge)
476
- ],
477
- timeout=30.0
478
- )
479
- ```
480
-
481
- ### 7.2 Final Type Definition
482
-
483
- ```python
484
- # src/utils/models.py
485
- sources_searched: list[Literal["pubmed", "clinicaltrials", "biorxiv"]]
486
- ```
487
-
488
- ---
489
-
490
- ## 8. Definition of Done
491
-
492
- Phase 11 is **COMPLETE** when:
493
-
494
- - [ ] `src/tools/biorxiv.py` implemented
495
- - [ ] Unit tests in `tests/unit/tools/test_biorxiv.py`
496
- - [ ] Integration test marked with `@pytest.mark.integration`
497
- - [ ] SearchHandler updated to include BioRxivTool
498
- - [ ] Type definitions updated in models.py
499
- - [ ] Example files updated
500
- - [ ] All unit tests pass
501
- - [ ] Lints pass
502
- - [ ] Manual verification with real API
503
-
504
- ---
505
-
506
- ## 9. Verification Commands
507
-
508
- ```bash
509
- # 1. Run unit tests
510
- uv run pytest tests/unit/tools/test_biorxiv.py -v
511
-
512
- # 2. Run integration test (requires network)
513
- uv run pytest tests/unit/tools/test_biorxiv.py -v -m integration
514
-
515
- # 3. Run full test suite
516
- uv run pytest tests/unit/ -v
517
-
518
- # 4. Run example with all three sources
519
- source .env && uv run python examples/search_demo/run_search.py "metformin diabetes"
520
- # Should show results from PubMed, ClinicalTrials.gov, AND bioRxiv/medRxiv
521
- ```
522
-
523
- ---
524
-
525
- ## 10. Value Delivered
526
-
527
- | Before | After |
528
- |--------|-------|
529
- | Only published papers | Published + Preprints |
530
- | 6-18 month lag | Near real-time research |
531
- | Miss cutting-edge | Catch breakthroughs early |
532
-
533
- **Demo pitch (final)**:
534
- > "DeepBoner searches PubMed for peer-reviewed evidence, ClinicalTrials.gov for 400,000+ clinical trials, and bioRxiv/medRxiv for cutting-edge preprints - then uses LLMs to generate mechanistic hypotheses and synthesize findings into publication-quality reports."
535
-
536
- ---
537
-
538
- ## 11. Complete Source Architecture (After Phase 11)
539
-
540
- ```
541
- User Query: "Can metformin treat Alzheimer's?"
542
- |
543
- v
544
- SearchHandler
545
- |
546
- ┌───────────────┼───────────────┐
547
- | | |
548
- v v v
549
- PubMedTool ClinicalTrials BioRxivTool
550
- | Tool |
551
- | | |
552
- v v v
553
- "15 peer- "3 Phase II "2 preprints
554
- reviewed trials from last
555
- papers" recruiting" 90 days"
556
- | | |
557
- └───────────────┼───────────────┘
558
- |
559
- v
560
- Evidence Pool
561
- |
562
- v
563
- EmbeddingService.deduplicate()
564
- |
565
- v
566
- HypothesisAgent → JudgeAgent → ReportAgent
567
- |
568
- v
569
- Structured Research Report
570
- ```
571
-
572
- **This is the Gucci Banger stack.**
 
docs/implementation/11_phase_europepmc.md ADDED
@@ -0,0 +1,181 @@
1
+ # Phase 11 Implementation Spec: Europe PMC Integration
2
+
3
+ > **Status**: ✅ COMPLETE
4
+ > **Implemented**: `src/tools/europepmc.py`
5
+ > **Tests**: `tests/unit/tools/test_europepmc.py`
6
+
7
+ ## Overview
8
+
9
+ Europe PMC provides access to preprints and peer-reviewed literature through a single, well-designed REST API. This replaces the originally planned bioRxiv integration due to bioRxiv's API limitations (no keyword search).
10
+
11
+ ## Why Europe PMC Over bioRxiv?
12
+
13
+ ### bioRxiv API Limitations (Why We Abandoned It)
14
+ - bioRxiv API does NOT support keyword search
15
+ - Only supports date-range queries returning all papers
16
+ - Would require downloading entire date ranges and filtering client-side
17
+ - Inefficient and impractical for our use case
18
+
19
+ ### Europe PMC Advantages
20
+ 1. **Full keyword search** - Query by any term
21
+ 2. **Aggregates preprints** - Includes bioRxiv, medRxiv, ChemRxiv content
22
+ 3. **No authentication required** - Free, open API
23
+ 4. **34+ preprint servers indexed** - Not just bioRxiv
24
+ 5. **REST API with JSON** - Easy integration
25
+
26
+ ## API Reference
27
+
28
+ **Base URL**: `https://www.ebi.ac.uk/europepmc/webservices/rest/search`
29
+
30
+ **Documentation**: https://europepmc.org/RestfulWebService
31
+
32
+ ### Parameters
33
+
34
+ | Parameter | Value | Description |
35
+ |-----------|-------|-------------|
36
+ | `query` | string | Search keywords |
37
+ | `resultType` | `core` | Full metadata including abstracts |
38
+ | `pageSize` | 1-100 | Results per page |
39
+ | `format` | `json` | Response format |
40
+
41
+ ### Example Request
42
+
43
+ ```
44
+ GET https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=metformin+alzheimer&resultType=core&pageSize=10&format=json
45
+ ```
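+
+ The same request can be issued from Python. A minimal sketch using `httpx` (the async HTTP client the repo's other search tools already use); the parameter values mirror the example above, and the `resultList.result` path is where the public REST API nests hits:
+
+ ```python
+ import asyncio
+
+ import httpx
+
+ BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"
+
+ async def demo() -> None:
+     params = {
+         "query": "metformin alzheimer",
+         "resultType": "core",
+         "pageSize": 10,
+         "format": "json",
+     }
+     async with httpx.AsyncClient(timeout=30.0) as client:
+         resp = await client.get(BASE_URL, params=params)
+         resp.raise_for_status()
+         # Hits are nested under resultList.result in the JSON payload
+         for hit in resp.json().get("resultList", {}).get("result", []):
+             print(hit.get("title"))
+
+ asyncio.run(demo())
+ ```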
46
+
47
+ ## Implementation
48
+
49
+ ### EuropePMCTool (`src/tools/europepmc.py`)
50
+
51
+ ```python
52
+ class EuropePMCTool:
53
+ """
54
+ Search Europe PMC for papers and preprints.
55
+
56
+ Europe PMC indexes:
57
+ - PubMed/MEDLINE articles
58
+ - PMC full-text articles
59
+ - Preprints from bioRxiv, medRxiv, ChemRxiv, etc.
60
+ - Patents and clinical guidelines
61
+ """
62
+
63
+ BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"
64
+
65
+ @property
66
+ def name(self) -> str:
67
+ return "europepmc"
68
+
69
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
70
+ """Search Europe PMC for papers matching query."""
71
+ # Implementation with retry logic, error handling
72
+ ```
73
+
74
+ ### Key Features
75
+
76
+ 1. **Preprint Detection**: Automatically identifies preprints via `pubTypeList`
77
+ 2. **Preprint Marking**: Adds `[PREPRINT - Not peer-reviewed]` prefix to content
78
+ 3. **Relevance Scoring**: Preprints get 0.75, peer-reviewed get 0.9
79
+ 4. **URL Resolution**: DOI → PubMed → Europe PMC fallback chain
80
+ 5. **Retry Logic**: 3 attempts with exponential backoff via tenacity
81
+
82
+ ### Response Mapping
83
+
84
+ | Europe PMC Field | Evidence Field |
85
+ |------------------|----------------|
86
+ | `title` | `citation.title` |
87
+ | `abstractText` | `content` |
88
+ | `doi` | Used for URL |
89
+ | `pubYear` | `citation.date` |
90
+ | `authorList.author` | `citation.authors` |
91
+ | `pubTypeList.pubType` | Determines `citation.source` ("preprint" or "europepmc") |
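+
+ Taken together with the key features above, the per-record conversion might look roughly like the sketch below (field names follow the table; the helper name and the exact fallback URLs are illustrative, not the actual `europepmc.py` code):
+
+ ```python
+ from src.utils.models import Citation, Evidence
+
+ def _result_to_evidence(result: dict) -> Evidence:
+     """Illustrative mapping of one Europe PMC record onto Evidence."""
+     pub_types = result.get("pubTypeList", {}).get("pubType", [])
+     is_preprint = any("preprint" in p.lower() for p in pub_types)
+
+     # URL resolution: DOI -> PubMed -> Europe PMC fallback chain
+     if result.get("doi"):
+         url = f"https://doi.org/{result['doi']}"
+     elif result.get("pmid"):
+         url = f"https://pubmed.ncbi.nlm.nih.gov/{result['pmid']}/"
+     else:
+         url = f"https://europepmc.org/abstract/{result.get('source', 'MED')}/{result.get('id', '')}"
+
+     content = result.get("abstractText", "No abstract available.")
+     if is_preprint:
+         content = f"[PREPRINT - Not peer-reviewed] {content}"
+
+     authors = [a.get("fullName", "") for a in result.get("authorList", {}).get("author", [])][:5]
+
+     return Evidence(
+         content=content[:2000],
+         citation=Citation(
+             source="preprint" if is_preprint else "europepmc",
+             title=result.get("title", "Untitled")[:500],
+             url=url,
+             date=str(result.get("pubYear", "Unknown")),
+             authors=authors,
+         ),
+         relevance=0.75 if is_preprint else 0.9,  # scoring rule from Key Features
+     )
+ ```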
92
+
93
+ ## Unit Tests
94
+
95
+ ### Test Coverage (`tests/unit/tools/test_europepmc.py`)
96
+
97
+ | Test | Description |
98
+ |------|-------------|
99
+ | `test_tool_name` | Verifies tool name is "europepmc" |
100
+ | `test_search_returns_evidence` | Basic search returns Evidence objects |
101
+ | `test_search_marks_preprints` | Preprints have [PREPRINT] marker and source="preprint" |
102
+ | `test_search_empty_results` | Handles empty results gracefully |
103
+
104
+ ### Integration Test
105
+
106
+ ```python
107
+ @pytest.mark.integration
108
+ async def test_real_api_call():
109
+ """Test actual API returns relevant results."""
110
+ tool = EuropePMCTool()
111
+ results = await tool.search("long covid treatment", max_results=3)
112
+ assert len(results) > 0
113
+ ```
114
+
115
+ ## SearchHandler Integration
116
+
117
+ Europe PMC is included in `src/tools/search_handler.py` alongside PubMed and ClinicalTrials:
118
+
119
+ ```python
120
+ from src.tools.europepmc import EuropePMCTool
121
+
122
+ class SearchHandler:
123
+     def __init__(self):
124
+         self.tools = [
125
+             PubMedTool(),
126
+             ClinicalTrialsTool(),
127
+             EuropePMCTool(),  # Preprints + peer-reviewed
128
+         ]
129
+ ```
130
+
131
+ ## MCP Tools Integration
132
+
133
+ Europe PMC is exposed via MCP in `src/mcp_tools.py`:
134
+
135
+ ```python
136
+ async def search_europepmc(query: str, max_results: int = 10) -> str:
137
+ """Search Europe PMC for preprints and papers."""
138
+ results = await _europepmc.search(query, max_results)
139
+ # Format and return
140
+ ```
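+
+ The elided formatting step presumably mirrors the `_format_results(results, source_name, query)` helper used by the agent tools in `05_phase_magentic.md`; a hedged sketch of such a helper (the output layout is illustrative, not the actual `src/mcp_tools.py` code):
+
+ ```python
+ def _format_results(results: list, source_name: str, query: str) -> str:
+     """Illustrative formatter: one line per hit with title and URL."""
+     if not results:
+         return f"No {source_name} results found for '{query}'."
+     lines = [f"{source_name} results for '{query}':"]
+     for ev in results:
+         lines.append(f"- {ev.citation.title} ({ev.citation.url})")
+     return "\n".join(lines)
+ ```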
141
+
142
+ ## Verification
143
+
144
+ ```bash
145
+ # Run unit tests
146
+ uv run pytest tests/unit/tools/test_europepmc.py -v
147
+
148
+ # Run integration test (real API)
149
+ uv run pytest tests/unit/tools/test_europepmc.py -v -m integration
150
+ ```
151
+
152
+ ## Completion Checklist
153
+
154
+ - [x] `src/tools/europepmc.py` implemented
155
+ - [x] Unit tests in `tests/unit/tools/test_europepmc.py`
156
+ - [x] Integration test with real API
157
+ - [x] SearchHandler includes EuropePMCTool
158
+ - [x] MCP wrapper in `src/mcp_tools.py`
159
+ - [x] Preprint detection and marking
160
+ - [x] Retry logic with exponential backoff
161
+
162
+ ## Architecture Diagram
163
+
164
+ ```
165
+ ┌────────────────────────────────────────────────────────┐
166
+ │                     SearchHandler                      │
167
+ ├────────────────────────────────────────────────────────┤
168
+ │  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐  │
169
+ │  │ PubMedTool  │  │ClinicalTrials│  │ EuropePMCTool │  │
170
+ │  │             │  │     Tool     │  │               │  │
171
+ │  │ Peer-review │  │    Trials    │  │  Preprints +  │  │
172
+ │  │  articles   │  │     data     │  │  peer-review  │  │
173
+ │  └──────┬──────┘  └──────┬───────┘  └───────┬───────┘  │
174
+ │         │                │                  │          │
175
+ │         ▼                ▼                  ▼          │
176
+ │  ┌─────────────────────────────────────────────────┐   │
177
+ │  │                  Evidence List                  │   │
178
+ │  │     (deduplicated, scored, with citations)      │   │
179
+ │  └─────────────────────────────────────────────────┘   │
180
+ └────────────────────────────────────────────────────────┘
181
+ ```
docs/implementation/roadmap.md CHANGED
@@ -42,7 +42,7 @@ src/
42
 │ ├── __init__.py
43
 │ ├── pubmed.py # PubMed E-utilities tool
44
 │ ├── clinicaltrials.py # ClinicalTrials.gov API
45
- │ ├── biorxiv.py # bioRxiv/medRxiv preprints
46
 │ ├── code_execution.py # Modal sandbox execution
47
 │ └── search_handler.py # Orchestrates multiple tools
48
 ├── prompts/ # Prompt templates
@@ -64,7 +64,7 @@ tests/
64
 │ ├── tools/
65
 │ │ ├── test_pubmed.py
66
 │ │ ├── test_clinicaltrials.py
67
- │ │ ├── test_biorxiv.py
68
 │ │ └── test_search_handler.py
69
 │ ├── agent_factory/
70
 │ │ └── test_judges.py
@@ -201,7 +201,7 @@ Structured Research Report
201
 
202
 9. **[Phase 9 Spec: Remove DuckDuckGo](09_phase_source_cleanup.md)** ✅
203
 10. **[Phase 10 Spec: ClinicalTrials.gov](10_phase_clinicaltrials.md)** ✅
204
- 11. **[Phase 11 Spec: bioRxiv Preprints](11_phase_biorxiv.md)** ✅
205
 
206
  ### Hackathon Integration (Phases 12-14)
207
 
@@ -225,7 +225,7 @@ Structured Research Report
225
 | Phase 8: Report | ✅ COMPLETE | Structured scientific reports |
226
 | Phase 9: Source Cleanup | ✅ COMPLETE | Remove DuckDuckGo |
227
 | Phase 10: ClinicalTrials | ✅ COMPLETE | ClinicalTrials.gov API |
228
- | Phase 11: bioRxiv | ✅ COMPLETE | Preprint search |
229
 | Phase 12: MCP Server | ✅ COMPLETE | MCP protocol integration |
230
 | Phase 13: Modal Pipeline | 📝 SPEC READY | Sandboxed code execution |
231
 | Phase 14: Demo & Submit | 📝 SPEC READY | Hackathon submission |
 
42
 │ ├── __init__.py
43
 │ ├── pubmed.py # PubMed E-utilities tool
44
 │ ├── clinicaltrials.py # ClinicalTrials.gov API
45
+ │ ├── europepmc.py # Europe PMC (preprints + papers)
46
 │ ├── code_execution.py # Modal sandbox execution
47
 │ └── search_handler.py # Orchestrates multiple tools
48
 ├── prompts/ # Prompt templates
 
64
 │ ├── tools/
65
 │ │ ├── test_pubmed.py
66
 │ │ ├── test_clinicaltrials.py
67
+ │ │ ├── test_europepmc.py
68
 │ │ └── test_search_handler.py
69
 │ ├── agent_factory/
70
 │ │ └── test_judges.py
 
201
 
202
 9. **[Phase 9 Spec: Remove DuckDuckGo](09_phase_source_cleanup.md)** ✅
203
 10. **[Phase 10 Spec: ClinicalTrials.gov](10_phase_clinicaltrials.md)** ✅
204
+ 11. **[Phase 11 Spec: Europe PMC](11_phase_europepmc.md)** ✅
205
 
206
  ### Hackathon Integration (Phases 12-14)
207
 
 
225
 | Phase 8: Report | ✅ COMPLETE | Structured scientific reports |
226
 | Phase 9: Source Cleanup | ✅ COMPLETE | Remove DuckDuckGo |
227
 | Phase 10: ClinicalTrials | ✅ COMPLETE | ClinicalTrials.gov API |
228
+ | Phase 11: Europe PMC | ✅ COMPLETE | Preprint search |
229
 | Phase 12: MCP Server | ✅ COMPLETE | MCP protocol integration |
230
 | Phase 13: Modal Pipeline | 📝 SPEC READY | Sandboxed code execution |
231
 | Phase 14: Demo & Submit | 📝 SPEC READY | Hackathon submission |
docs/index.md CHANGED
@@ -25,7 +25,7 @@ AI-powered deep research system for sexual wellness, reproductive health, and ho
25
 - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
26
 - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
27
 - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
28
- - **[Phase 11: Europe PMC](implementation/11_phase_biorxiv.md)** ✅ - Preprint search
29
 - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
30
 - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
31
 - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission
 
25
 - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
26
 - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
27
 - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
28
+ - **[Phase 11: Europe PMC](implementation/11_phase_europepmc.md)** ✅ - Preprint search
29
 - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
30
 - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
31
 - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission
docs/workflow-diagrams.md CHANGED
@@ -85,7 +85,7 @@ graph TB
85
  end
86
 
87
  subgraph "MCP Tools"
88
- WebSearch[Web Search<br/>PubMed • arXiv • bioRxiv]
89
  CodeExec[Code Execution<br/>Sandboxed Python]
90
  RAG[RAG Retrieval<br/>Vector DB β€’ Embeddings]
91
  Viz[Visualization<br/>Charts β€’ Graphs]
@@ -229,12 +229,12 @@ flowchart TD
229
  Strategy --> Multi[Multi-Source Search]
230
 
231
  Multi --> PubMed[PubMed Search<br/>via MCP]
232
- Multi --> ArXiv[arXiv Search<br/>via MCP]
233
- Multi --> BioRxiv[bioRxiv Search<br/>via MCP]
234
 
235
  PubMed --> Aggregate[Aggregate Results]
236
- ArXiv --> Aggregate
237
- BioRxiv --> Aggregate
238
 
239
  Aggregate --> Filter[Filter & Rank<br/>by Relevance]
240
  Filter --> Dedup[Deduplicate<br/>Cross-Reference]
@@ -388,7 +388,7 @@ graph TB
388
  end
389
 
390
  subgraph "MCP Servers"
391
- Server1[Web Search Server<br/>localhost:8001<br/>• PubMed<br/>• arXiv<br/>• bioRxiv]
392
  Server2[Code Execution Server<br/>localhost:8002<br/>β€’ Sandboxed Python<br/>β€’ Package management]
393
  Server3[RAG Server<br/>localhost:8003<br/>β€’ Vector embeddings<br/>β€’ Similarity search]
394
  Server4[Visualization Server<br/>localhost:8004<br/>β€’ Chart generation<br/>β€’ Plot rendering]
@@ -396,8 +396,8 @@ graph TB
396
 
397
  subgraph "External Services"
398
  PubMed[PubMed API]
399
- ArXiv[arXiv API]
400
- BioRxiv[bioRxiv API]
401
  Modal[Modal Sandbox]
402
  ChromaDB[(ChromaDB)]
403
  end
@@ -412,8 +412,8 @@ graph TB
412
  Registry --> Server4
413
 
414
  Server1 --> PubMed
415
- Server1 --> ArXiv
416
- Server1 --> BioRxiv
417
  Server2 --> Modal
418
  Server3 --> ChromaDB
419
 
@@ -517,8 +517,8 @@ graph LR
517
 User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepBoner<br/>Magentic Workflow]
518
 
519
  DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
- DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
521
- DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]
522
  DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]
523
  DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]
524
  DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]
@@ -526,8 +526,8 @@ graph LR
526
  DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]
527
 
528
  PubMed -->|Results| DC
529
- ArXiv -->|Results| DC
530
- BioRxiv -->|Results| DC
531
  Claude -->|Responses| DC
532
  Modal -->|Output| DC
533
  Chroma -->|Context| DC
@@ -537,8 +537,8 @@ graph LR
537
  style User fill:#e1f5e1
538
  style DC fill:#ffe6e6
539
  style PubMed fill:#e6f3ff
540
- style ArXiv fill:#e6f3ff
541
- style BioRxiv fill:#e6f3ff
542
  style Claude fill:#ffd6d6
543
  style Modal fill:#f0f0f0
544
  style Chroma fill:#ffe6f0
 
85
  end
86
 
87
  subgraph "MCP Tools"
88
+ WebSearch[Web Search<br/>PubMed • ClinicalTrials • Europe PMC]
89
  CodeExec[Code Execution<br/>Sandboxed Python]
90
  RAG[RAG Retrieval<br/>Vector DB β€’ Embeddings]
91
  Viz[Visualization<br/>Charts β€’ Graphs]
 
229
  Strategy --> Multi[Multi-Source Search]
230
 
231
  Multi --> PubMed[PubMed Search<br/>via MCP]
232
+ Multi --> Trials[ClinicalTrials Search<br/>via MCP]
233
+ Multi --> EuropePMC[Europe PMC Search<br/>via MCP]
234
 
235
  PubMed --> Aggregate[Aggregate Results]
236
+ Trials --> Aggregate
237
+ EuropePMC --> Aggregate
238
 
239
  Aggregate --> Filter[Filter & Rank<br/>by Relevance]
240
  Filter --> Dedup[Deduplicate<br/>Cross-Reference]
 
388
  end
389
 
390
  subgraph "MCP Servers"
391
+ Server1[Web Search Server<br/>localhost:8001<br/>• PubMed<br/>• ClinicalTrials<br/>• Europe PMC]
392
  Server2[Code Execution Server<br/>localhost:8002<br/>β€’ Sandboxed Python<br/>β€’ Package management]
393
  Server3[RAG Server<br/>localhost:8003<br/>β€’ Vector embeddings<br/>β€’ Similarity search]
394
  Server4[Visualization Server<br/>localhost:8004<br/>β€’ Chart generation<br/>β€’ Plot rendering]
 
396
 
397
  subgraph "External Services"
398
  PubMed[PubMed API]
399
+ Trials[ClinicalTrials.gov API]
400
+ EuropePMC[Europe PMC API]
401
  Modal[Modal Sandbox]
402
  ChromaDB[(ChromaDB)]
403
  end
 
412
  Registry --> Server4
413
 
414
  Server1 --> PubMed
415
+ Server1 --> Trials
416
+ Server1 --> EuropePMC
417
  Server2 --> Modal
418
  Server3 --> ChromaDB
419
 
 
517
 User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepBoner<br/>Magentic Workflow]
518
 
519
  DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
+ DC -->|Clinical trials| Trials[ClinicalTrials.gov<br/>Trial data]
521
+ DC -->|Preprints| EuropePMC[Europe PMC API<br/>Preprints & papers]
522
  DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]
523
  DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]
524
  DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]
 
526
  DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]
527
 
528
  PubMed -->|Results| DC
529
+ Trials -->|Results| DC
530
+ EuropePMC -->|Results| DC
531
  Claude -->|Responses| DC
532
  Modal -->|Output| DC
533
  Chroma -->|Context| DC
 
537
  style User fill:#e1f5e1
538
  style DC fill:#ffe6e6
539
  style PubMed fill:#e6f3ff
540
+ style Trials fill:#e6f3ff
541
+ style EuropePMC fill:#e6f3ff
542
  style Claude fill:#ffd6d6
543
  style Modal fill:#f0f0f0
544
  style Chroma fill:#ffe6f0
src/utils/exceptions.py CHANGED
@@ -29,7 +29,3 @@ class RateLimitError(SearchError):
29
  """Raised when we hit API rate limits."""
30
 
31
  pass
32
-
33
-
34
- # Backwards compatibility alias
35
- DeepCriticalError = DeepBonerError
 
29
  """Raised when we hit API rate limits."""
30
 
31
  pass
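
With the `DeepCriticalError` alias removed, any remaining call sites have to catch the base exception directly. A minimal migration sketch (`run_research` is a hypothetical caller; only the exception names come from `src/utils/exceptions.py`):

```python
# Before (relied on the removed backwards-compatibility alias):
# from src.utils.exceptions import DeepCriticalError

# After: import the base exception the alias pointed to
from src.utils.exceptions import DeepBonerError

async def safe_run(query: str) -> None:
    try:
        await run_research(query)  # hypothetical entry point
    except DeepBonerError as exc:
        print(f"Research run failed: {exc}")
```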