danielrosehill Claude committed on
Commit 59c7f4b · 1 Parent(s): d4a7435

Complete Context Cruncher deployment for Hugging Face Spaces


Major updates for HF Spaces compatibility:
- Added proper YAML frontmatter to README.md with Space metadata
- Pinned dependency versions for stability (gradio 5.9.1, google-generativeai 0.8.3)
- Added .gitignore to exclude common Python/dev files
- Added .env.example template for API key configuration
- Enhanced app.py with environment variable support and HF Spaces settings
- Configured server settings for Spaces compatibility (0.0.0.0:7860)
- Added auto-loading of GEMINI_API from environment
- Configured Git LFS for audio files (*.opus)
- Included complete application code, demo files, and examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

.env.example ADDED
@@ -0,0 +1,4 @@
1
+ # Gemini API Configuration
2
+ # Get your API key from https://ai.google.dev/
3
+
4
+ GEMINI_API=your_api_key_here
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ *.opus filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,51 @@
1
+ # Environment variables
2
+ .env
3
+
4
+ # Python
5
+ __pycache__/
6
+ *.py[cod]
7
+ *$py.class
8
+ *.so
9
+ .Python
10
+ build/
11
+ develop-eggs/
12
+ dist/
13
+ downloads/
14
+ eggs/
15
+ .eggs/
16
+ lib/
17
+ lib64/
18
+ parts/
19
+ sdist/
20
+ var/
21
+ wheels/
22
+ *.egg-info/
23
+ .installed.cfg
24
+ *.egg
25
+ MANIFEST
26
+
27
+ # Virtual environments
28
+ .venv/
29
+ venv/
30
+ ENV/
31
+ env/
32
+
33
+ # IDEs
34
+ .vscode/
35
+ .idea/
36
+ *.swp
37
+ *.swo
38
+ *~
39
+
40
+ # OS
41
+ .DS_Store
42
+ Thumbs.db
43
+
44
+ # Temporary files
45
+ *.tmp
46
+ *.log
47
+ temp/
48
+ tmp/
49
+
50
+ # Gradio
51
+ flagged/
CLAUDE.md ADDED
@@ -0,0 +1,107 @@
1
+ This repository contains a utility that I have found very useful in various AI workflows: it extracts "context data" from user-supplied voice notes.
2
+
3
+ Here's the context:
4
+
5
+ Voice is a great way to capture information quickly, and I believe it lends itself perfectly to a workflow that captures user-specific context data proactively rather than passively (e.g. by building up memory over time).
6
+
7
+ I use this approach when I have a lot of data to provide for a specific project. As I'm open-sourcing this repo, I've chosen a movie-recommendations context recording (in example-data) that provides some information about the type of video content I enjoy.
8
+
9
+ The task in this repo: create a Hugging Face Space. This will be for public use, and I will clone it for my personal implementation.
10
+
11
+ LLM method: BYOK (bring your own key) with Gemini 2.5 Pro, which has audio understanding.
12
+
13
+ API documentation is here: https://ai.google.dev/gemini-api/docs/audio
14
+
15
+ ## Record Audio
16
+
17
+ A panel for the user to either:
18
+
19
+ 1) Record audio from the browser (with controls for record, pause, stop, abort).
20
+ 2) Upload an audio file (accepting opus, mp3, wav)
21
+
22
+ A button to extract content.
23
+
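+ A minimal Gradio sketch of this panel (the full implementation is in app.py later in this commit; variable names are illustrative):
+
+ ```python
+ import gradio as gr
+
+ with gr.Blocks() as demo:
+     # In-browser recording; "filepath" output lets us hand the file straight to Gemini
+     recording = gr.Audio(sources=["microphone"], type="filepath", label="Record Audio")
+     # Or an uploaded file (opus, mp3, wav)
+     upload = gr.File(label="Upload Audio File", file_types=["audio"], type="filepath")
+     extract_btn = gr.Button("Extract Context", variant="primary")
+ ```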
24
+ ## Context Extraction Logic
25
+
26
+ "Context data," in this context, refers to specific information about the user that can be used to ground AI inference to produce more personalised results.
27
+
28
+ Context data can be derived from a typical STT transcription by:
29
+
30
+ - Omitting irrelevant information
31
+ - Removing duplicates
32
+ - Reformatting from the first person to the third, referring to "the user."
33
+
34
+ Here is a short example to model the desired type of transformation, with the expectation that a raw STT transcript would contain roughly this level of noise:
35
+
36
+ "Okay so ... let's document my health problems and the meds I take for this AI project ... ehm.. where do i start ... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh .. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah .. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
37
+
38
+ This text would be transformed to (approximately):
39
+
40
+ {START EXAMPLE}
41
+
42
+ ## Medical Conditions
43
+
44
+ - User has had asthma since childhood
45
+ - User has adult ADHD
46
+
47
+ ## Medication List
48
+
49
+ - User takes Relvar, daily, for asthma
50
+ - User takes Vyvanse 70mg, daily, for ADHD
51
+
52
+ {END EXAMPLE}
53
+
54
+ The above model is written with the idea that if the user were to later provide more detailed context, an agent could easily slot it into the right places. Hence, follow a careful hierarchical structure when formatting the context notes.
55
+
56
+ Most of my context-parsing utilities to date have followed a two-part approach: STT, then LLM cleanup (which is why I'm creating this updated interface: Gemini's multimodal support means the two steps can be combined!).
57
+
58
+ In this updated interface the audio binary should be sent to Gemini alongside the system prompt (ensuring to include the example).
59
+
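+ A minimal sketch of that call with the `google-generativeai` SDK, mirroring gemini_processor.py in this commit (the model name is the one used there; the audio path is illustrative, and `api_key` and `system_prompt` are assumed to be in scope):
+
+ ```python
+ import google.generativeai as genai
+
+ genai.configure(api_key=api_key)
+ model = genai.GenerativeModel("gemini-2.0-flash-exp")
+
+ # Upload the audio binary, then send it together with the system prompt (which includes the example above)
+ audio_file = genai.upload_file("recording.opus")
+ response = model.generate_content([system_prompt, audio_file])
+ context_markdown = response.text
+ ```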
60
+ The extracted context data can be provided to the user as a markdown file that populates a right-hand pane called Extracted Context.
61
+
62
+ ## User Identification
63
+
64
+ I would like to provide one user-editable parameter: how they would like to be identified in the context data.
65
+
66
+ That is to say: should the user be referred to by name ("Daniel has asthma") or as "the user"?
67
+
68
+ If by name, I would like the user to be able to provide their name.
69
+
70
+ This information should be carried from the frontend into the system prompt constructed in the API call to Gemini.
71
+
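+ A short sketch of how that parameter can flow into the prompt (this mirrors get_system_prompt() in gemini_processor.py; the prompt text is abbreviated here):
+
+ ```python
+ def get_system_prompt(user_name: str = None) -> str:
+     # Fall back to the generic "the user" when no name was supplied in the frontend
+     user_reference = user_name if user_name else "the user"
+     return f"Rewrite the transcript in the third person, referring to {user_reference}. ..."
+ ```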
72
+ ## Context Download Options
73
+
74
+ The intended use is that after the context data is parsed, the user can download it. I would like to offer the ability to download it as a markdown doc or as JSON - we may as well generate both and provide buttons. We should also provide a copy-to-clipboard button on the markdown body text, as this gives an easy way to grab the output without requiring any download.
75
+
76
+ I would like the context data to have a unique name. The logical place to implement this is, again, via the Gemini API.
77
+
78
+ So we should ask it for a JSON output that provides:
79
+
80
+ - Human readable name
81
+ - Snake case filename
82
+ - Context data (as markdown)
83
+
84
+ To this we can add a timestamp.
85
+
86
+ ## Template for markdown file
87
+
88
+ For the markdown version, the filename can be the snake_case filename with a .md extension.
89
+
90
+ The actual contents can follow this template:
91
+
92
+ {Start Example}
93
+ ## Readable Context Title
94
+
95
+ {Context Data}
96
+
97
+ ---
98
+
99
+ Captured on: {timestamp}
100
+
101
+ {End Example}
102
+
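+ Filled in with Python, this template is essentially the following sketch (as implemented in create_markdown_file() in this commit, assuming the fields returned from the Gemini step are in scope):
+
+ ```python
+ from datetime import datetime
+
+ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+ filename = f"{snake_case_filename}.md"
+ content = f"""## {human_readable_name}
+
+ {context_markdown}
+
+ ---
+
+ Captured on: {timestamp}
+ """
+ ```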
103
+ ## JSON Template
104
+
105
+ For the JSON version we should construct an object with all these parameters.
106
+
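+ A matching sketch for the JSON object, using the same field names as create_json_file() and the demo output in demo-results/ (the name and markdown variables are assumed to be in scope):
+
+ ```python
+ import json
+ from datetime import datetime
+
+ data = {
+     "human_readable_name": human_readable_name,
+     "snake_case_filename": snake_case_filename,
+     "context_data": context_markdown,           # the markdown body
+     "captured_on": datetime.now().isoformat(),  # the timestamp we add ourselves
+ }
+ json_content = json.dumps(data, indent=2)
+ ```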
107
+ For both, after generation, they should be presented to the user with download buttons.
README.md CHANGED
@@ -1,13 +1,179 @@
1
  ---
2
  title: Context Cruncher
3
- emoji: πŸƒ
4
- colorFrom: green
5
- colorTo: red
6
  sdk: gradio
7
- sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
- short_description: Voice note context extraction utility
 
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
  title: Context Cruncher
3
+ emoji: 🎙️
4
+ colorFrom: blue
5
+ colorTo: purple
6
  sdk: gradio
7
+ sdk_version: 5.9.1
8
  app_file: app.py
9
  pinned: false
10
+ license: mit
11
+ short_description: Transform voice recordings into structured AI context data
12
  ---
13
 
14
+ # Context Cruncher 🎙️
15
+
16
+ Transform casual voice recordings into clean, structured context data for AI applications.
17
+
18
+ [![Demo](https://img.shields.io/badge/Demo-Live-brightgreen)](demo.html)
19
+ [![HuggingFace](https://img.shields.io/badge/🤗-Space-yellow)](https://huggingface.co/spaces/danielrosehill/Context-Cruncher)
20
+
21
+ ## What is Context Cruncher?
22
+
23
+ Context Cruncher extracts structured context data from voice recordings using Gemini AI's multimodal capabilities. It processes audio directly, cleaning up natural speech patterns and organizing information into useful context data that AI systems can use for personalization.
24
+
25
+ **Context data** refers to specific information about users that grounds AI inference for more personalized results. This tool achieves that by:
26
+
27
+ - Removing irrelevant information and tangents
28
+ - Eliminating duplicates and redundancy
29
+ - Reformatting from first person to third person
30
+ - Organizing information hierarchically
31
+ - Outputting both Markdown and JSON formats
32
+
33
+ ## See it in Action
34
+
35
+ Check out the [demo page](demo.html) to see real results from processing example audio about movie preferences.
36
+
37
+ ## Features
38
+
39
+ - **🎤 Flexible Audio Input**: Record directly in your browser or upload audio files (MP3, WAV, OPUS)
40
+ - **🤖 AI-Powered Extraction**: Uses Gemini 2.0 Flash for intelligent audio understanding and context extraction
41
+ - **📝 Dual Output Formats**: Get both human-readable Markdown and machine-readable JSON
42
+ - **👤 Customizable Identification**: Choose how you're referred to in the context data (by name or as "the user")
43
+ - **📋 Easy Export**: Download files or copy directly to clipboard
44
+
45
+ ## Quick Start
46
+
47
+ ### Prerequisites
48
+
49
+ - Python 3.12+
50
+ - A [Gemini API key](https://ai.google.dev/)
51
+
52
+ ### Installation
53
+
54
+ 1. Clone the repository:
55
+ ```bash
56
+ git clone https://github.com/danielrosehill/Context-Cruncher.git
57
+ cd Context-Cruncher
58
+ ```
59
+
60
+ 2. Create a virtual environment and install dependencies:
61
+ ```bash
62
+ # Using uv (recommended)
63
+ uv venv
64
+ source .venv/bin/activate
65
+ uv pip install -r requirements.txt
66
+
67
+ # Or using standard venv
68
+ python -m venv .venv
69
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
70
+ pip install -r requirements.txt
71
+ ```
72
+
73
+ 3. Create a `.env` file with your Gemini API key:
74
+ ```bash
75
+ cp .env.example .env
76
+ # Edit .env and add your API key:
77
+ # GEMINI_API="your_api_key_here"
78
+ ```
79
+
80
+ 4. Run the application:
81
+
82
+ **Option A: Using the launch script (easiest)**
83
+ ```bash
84
+ ./run.sh
85
+ ```
86
+
87
+ **Option B: Manual launch**
88
+ ```bash
89
+ source .venv/bin/activate
90
+ python app.py
91
+ ```
92
+
93
+ The app will launch in your browser at `http://localhost:7860`
94
+
95
+ ## Usage
96
+
97
+ 1. **Configure**: Enter your Gemini API key (or load from `.env`)
98
+ 2. **Choose Identification**: Select whether to be referred to by name or as "the user"
99
+ 3. **Provide Audio**: Either:
100
+ - Record directly in the browser using your microphone
101
+ - Upload an audio file (MP3, WAV, or OPUS)
102
+ 4. **Extract**: Click "Extract Context" to process your audio
103
+ 5. **Download**: Get your structured context data as Markdown or JSON
104
+
105
+ ## Example Transformation
106
+
107
+ **Raw Audio Input:**
108
+ > "Okay so... let's document my health problems and the meds I take for this AI project... ehm.. where do I start... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh.. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah.. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
109
+
110
+ **Structured Output:**
111
+ ```markdown
112
+ ## Medical Conditions
113
+
114
+ - the user has had asthma since childhood
115
+ - the user has adult ADHD
116
+
117
+ ## Medication List
118
+
119
+ - the user takes Relvar, daily, for asthma
120
+ - the user takes Vyvanse 70mg, daily, for ADHD
121
+ ```
122
+
123
+ ## Generating Demo Results
124
+
125
+ To regenerate the demo results with the example audio:
126
+
127
+ ```bash
128
+ python generate_demo.py
129
+ ```
130
+
131
+ This will process the `example-data/movie-prefs.opus` file and save results to `demo-results/`.
132
+
133
+ ## Privacy Note
134
+
135
+ Your audio is processed using the Gemini API. Review [Google's privacy policies](https://policies.google.com/) before using this tool with sensitive information.
136
+
137
+ ## Use Cases
138
+
139
+ - **AI Assistant Personalization**: Provide context to chatbots and AI assistants
140
+ - **Knowledge Management**: Convert verbal notes into structured information
141
+ - **Preference Mapping**: Document likes, dislikes, and preferences
142
+ - **Medical History**: Organize health information (note privacy considerations)
143
+ - **Project Context**: Capture project requirements and preferences
144
+
145
+ ## Technical Details
146
+
147
+ - **Frontend**: Gradio web interface
148
+ - **AI Model**: Gemini 2.0 Flash (with multimodal audio understanding)
149
+ - **Audio Processing**: Direct audio file upload to Gemini API
150
+ - **Output Formats**: Markdown and JSON
151
+
152
+ ## Repository Structure
153
+
154
+ ```
155
+ Context-Cruncher/
156
+ ├── app.py # Main Gradio application
157
+ ├── gemini_processor.py # Gemini API integration
158
+ ├── generate_demo.py # Demo generation script
159
+ ├── run.sh # Launch script
160
+ ├── requirements.txt # Python dependencies
161
+ ├── .env.example # Environment variable template
162
+ ├── demo.html # Demo results page
163
+ ├── example-data/ # Example audio files
164
+ └── demo-results/ # Generated demo outputs
165
+ ```
166
+
167
+ ## Contributing
168
+
169
+ Contributions welcome! Please feel free to submit issues or pull requests.
170
+
171
+ ## License
172
+
173
+ MIT License - See LICENSE file for details
174
+
175
+ ## Author
176
+
177
+ Daniel Rosehill
178
+ - Website: [danielrosehill.com](https://danielrosehill.com)
179
+ - GitHub: [@danielrosehill](https://github.com/danielrosehill)
app.py ADDED
@@ -0,0 +1,334 @@
1
+ """
2
+ Context Cruncher - Gradio Application
3
+ Extract structured context data from voice recordings using Gemini AI.
4
+ """
5
+ import gradio as gr
6
+ import os
7
+ from pathlib import Path
8
+ import tempfile
9
+ from dotenv import load_dotenv
10
+ from gemini_processor import (
11
+ process_audio_with_gemini,
12
+ create_markdown_file,
13
+ create_json_file
14
+ )
15
+
16
+ # Load environment variables
17
+ load_dotenv()
18
+
19
+
20
+ def process_audio(
21
+ audio_input,
22
+ uploaded_file,
23
+ api_key: str,
24
+ user_identification: str,
25
+ user_name: str = ""
26
+ ) -> tuple:
27
+ """
28
+ Process audio from either recording or upload.
29
+
30
+ Args:
31
+ audio_input: Audio from microphone recording
32
+ uploaded_file: Uploaded audio file
33
+ api_key: Gemini API key
34
+ user_identification: "name" or "user"
35
+ user_name: User's name if using name identification
36
+
37
+ Returns:
38
+ Tuple of (markdown_content, markdown_file, json_file, status_message)
39
+ """
40
+ try:
41
+ # Validate API key
42
+ if not api_key or api_key.strip() == "":
43
+ return (
44
+ "",
45
+ None,
46
+ None,
47
+ "Error: Please provide a Gemini API key"
48
+ )
49
+
50
+ # Determine which audio source to use
51
+ audio_path = None
52
+ if audio_input is not None:
53
+ audio_path = audio_input
54
+ elif uploaded_file is not None:
55
+ # NOTE: gr.File(type="filepath") may return a plain path string rather than a file object
+ audio_path = uploaded_file if isinstance(uploaded_file, str) else uploaded_file.name
56
+
57
+ if audio_path is None:
58
+ return (
59
+ "",
60
+ None,
61
+ None,
62
+ "Error: Please record audio or upload an audio file"
63
+ )
64
+
65
+ # Determine user reference
66
+ user_ref = None
67
+ if user_identification == "name":
68
+ if not user_name or user_name.strip() == "":
69
+ return (
70
+ "",
71
+ None,
72
+ None,
73
+ "Error: Please provide your name when using name identification"
74
+ )
75
+ user_ref = user_name.strip()
76
+
77
+ # Process with Gemini
78
+ status_msg = "Processing audio with Gemini API..."
79
+ context_markdown, human_readable_name, snake_case_filename = process_audio_with_gemini(
80
+ audio_path,
81
+ api_key,
82
+ user_ref
83
+ )
84
+
85
+ # Create output files
86
+ md_filename, md_content = create_markdown_file(
87
+ context_markdown,
88
+ human_readable_name,
89
+ snake_case_filename
90
+ )
91
+
92
+ json_filename, json_content = create_json_file(
93
+ context_markdown,
94
+ human_readable_name,
95
+ snake_case_filename
96
+ )
97
+
98
+ # Write files to temp directory for download
99
+ temp_dir = tempfile.mkdtemp()
100
+ md_path = Path(temp_dir) / md_filename
101
+ json_path = Path(temp_dir) / json_filename
102
+
103
+ with open(md_path, 'w') as f:
104
+ f.write(md_content)
105
+
106
+ with open(json_path, 'w') as f:
107
+ f.write(json_content)
108
+
109
+ return (
110
+ md_content,
111
+ str(md_path),
112
+ str(json_path),
113
+ f"Success! Context extracted: {human_readable_name}"
114
+ )
115
+
116
+ except Exception as e:
117
+ return (
118
+ "",
119
+ None,
120
+ None,
121
+ f"Error: {str(e)}"
122
+ )
123
+
124
+
125
+ # Custom CSS for better styling
126
+ custom_css = """
127
+ .gradio-container {
128
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
129
+ }
130
+ .main-header {
131
+ text-align: center;
132
+ margin-bottom: 1.5rem;
133
+ padding-bottom: 1rem;
134
+ border-bottom: 2px solid #e5e7eb;
135
+ }
136
+ .main-header h1 {
137
+ font-size: 2rem;
138
+ font-weight: 600;
139
+ color: #1f2937;
140
+ margin-bottom: 0.5rem;
141
+ }
142
+ .main-header p {
143
+ color: #6b7280;
144
+ font-size: 1rem;
145
+ }
146
+ .section-header {
147
+ font-weight: 600;
148
+ color: #374151;
149
+ margin-bottom: 1rem;
150
+ }
151
+ """
152
+
153
+ # Create Gradio interface
154
+ with gr.Blocks(css=custom_css, title="Context Cruncher") as demo:
155
+ gr.Markdown(
156
+ """
157
+ # Context Cruncher
158
+
159
+ Extract structured context data from voice recordings using AI
160
+ """,
161
+ elem_classes="main-header"
162
+ )
163
+
164
+ with gr.Tabs():
165
+ with gr.Tab("Extract"):
166
+ with gr.Row():
167
+ with gr.Column(scale=1):
168
+ with gr.Accordion("Configuration", open=True):
169
+ api_key_input = gr.Textbox(
170
+ label="Gemini API Key",
171
+ placeholder="Enter your Gemini API key",
172
+ type="password",
173
+ value=os.getenv("GEMINI_API", ""),
174
+ info="Get your API key from https://ai.google.dev/"
175
+ )
176
+
177
+ user_identification = gr.Radio(
178
+ choices=["user", "name"],
179
+ value="user",
180
+ label="User Identification",
181
+ info="How should you be referred to in the context data?"
182
+ )
183
+
184
+ user_name_input = gr.Textbox(
185
+ label="Your Name",
186
+ placeholder="Enter your name",
187
+ visible=False,
188
+ info="Used when 'name' is selected above"
189
+ )
190
+
191
+ gr.Markdown("### Audio Input", elem_classes="section-header")
192
+
193
+ audio_recording = gr.Audio(
194
+ sources=["microphone"],
195
+ type="filepath",
196
+ label="Record Audio"
197
+ )
198
+
199
+ gr.Markdown("**OR**")
200
+
201
+ audio_upload = gr.File(
202
+ label="Upload Audio File",
203
+ file_types=["audio"],
204
+ type="filepath"
205
+ )
206
+
207
+ process_btn = gr.Button("Extract Context", variant="primary", size="lg")
208
+
209
+ with gr.Column(scale=1):
210
+ gr.Markdown("### Results", elem_classes="section-header")
211
+
212
+ status_output = gr.Textbox(
213
+ label="Status",
214
+ interactive=False,
215
+ show_label=True
216
+ )
217
+
218
+ context_display = gr.Textbox(
219
+ label="Context Data (Markdown)",
220
+ lines=18,
221
+ interactive=False,
222
+ show_copy_button=True
223
+ )
224
+
225
+ with gr.Row():
226
+ markdown_download = gr.File(label="Download Markdown")
227
+ json_download = gr.File(label="Download JSON")
228
+
229
+ with gr.Tab("About"):
230
+ gr.Markdown(
231
+ """
232
+ ## What is Context Cruncher?
233
+
234
+ Context Cruncher transforms casual voice recordings into clean, structured context data
235
+ that AI systems can use for personalization.
236
+
237
+ **Context data** refers to specific information about users that grounds AI inference
238
+ for more personalized results.
239
+
240
+ ## How It Works
241
+
242
+ 1. **Configure** - Enter your Gemini API key and choose how you want to be identified
243
+ 2. **Input Audio** - Either record directly in your browser or upload an audio file (MP3, WAV, OPUS)
244
+ 3. **Extract** - Click the button and let AI clean up your recording into structured context data
245
+ 4. **Download** - Get your context data as Markdown or JSON, or copy directly from the text area
246
+
247
+ ## What Gets Extracted
248
+
249
+ This tool processes your audio by:
250
+
251
+ - Removing irrelevant information and tangents
252
+ - Eliminating duplicates and redundancy
253
+ - Reformatting from first person to third person
254
+ - Organizing information hierarchically
255
+ - Outputting both Markdown and JSON formats
256
+
257
+ ## Example Transformation
258
+
259
+ **Raw Audio:**
260
+ > "Okay so... let's document my health problems... I've had asthma since I was a kid.
261
+ > I take a daily inhaler called Relvar for that. Oh hey Jay! What's up!
262
+ > Okay, where was I... I also take Vyvanse for ADHD."
263
+
264
+ **Structured Output:**
265
+ ```markdown
266
+ ## Medical Conditions
267
+
268
+ - the user has had asthma since childhood
269
+ - the user has adult ADHD
270
+
271
+ ## Medication List
272
+
273
+ - the user takes Relvar, daily, for asthma
274
+ - the user takes Vyvanse for ADHD
275
+ ```
276
+
277
+ ## Privacy Notice
278
+
279
+ Your audio is processed using the Gemini API. Review Google's privacy policies
280
+ before using this tool with sensitive information.
281
+
282
+ ## Technical Details
283
+
284
+ - **AI Model**: Gemini 2.0 Flash (multimodal audio understanding)
285
+ - **Processing**: Direct audio file upload to Gemini API
286
+ - **Output Formats**: Markdown and JSON
287
+
288
+ ## Use Cases
289
+
290
+ - AI assistant personalization
291
+ - Knowledge management
292
+ - Preference mapping
293
+ - Medical history documentation (note privacy considerations)
294
+ - Project context capture
295
+ """
296
+ )
297
+
298
+ # Show/hide name input based on identification method
299
+ def toggle_name_input(identification_choice):
300
+ return gr.update(visible=identification_choice == "name")
301
+
302
+ user_identification.change(
303
+ fn=toggle_name_input,
304
+ inputs=[user_identification],
305
+ outputs=[user_name_input]
306
+ )
307
+
308
+ # Process button click
309
+ process_btn.click(
310
+ fn=process_audio,
311
+ inputs=[
312
+ audio_recording,
313
+ audio_upload,
314
+ api_key_input,
315
+ user_identification,
316
+ user_name_input
317
+ ],
318
+ outputs=[
319
+ context_display,
320
+ markdown_download,
321
+ json_download,
322
+ status_output
323
+ ]
324
+ )
325
+
326
+
327
+ if __name__ == "__main__":
328
+ # For Hugging Face Spaces, share should be False
329
+ # Set server_name to 0.0.0.0 for Spaces compatibility
330
+ demo.launch(
331
+ server_name="0.0.0.0",
332
+ server_port=7860,
333
+ share=False
334
+ )
demo-results/user_entertainment_preferences_israel_tired_parent.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "human_readable_name": "User's Entertainment Preferences - Israel-Based, Tired Parent",
3
+ "snake_case_filename": "user_entertainment_preferences_israel_tired_parent",
4
+ "context_data": "## Entertainment Preferences\n\n### General\n\n- the user prefers to watch content that is either based on a true story or is credible\n- the user is not a fan of science fiction\n- the user enjoys content with intriguing stories\n- the user occasionally enjoys horror movies\n- the user likes thoughtful content\n- the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy\n- the user prefers movies that take their time rather than overload with special effects and violence\n- the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests\n- the user likes comedy movies, but not rom-coms\n- the user likes obscure travel content\n- the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence\n\n### Formats and Platforms\n\n- the user enjoys Netflix documentary series, especially offbeat documentaries\n- the user appreciates the content produced by Vice\n\n### Specific Movies and Genres\n\n- the user enjoys the genre of the absurd (e.g., Waiting for Godot)\n- the user liked the movies \"Get Out\", \"The Matrix\", \"That Guy\" (potentially \"Vanilla Sky\"), and \"Limitless\"\n- the user is interested in movies that explore the question of reality, such as \"Inception\"\n\n### Content Preferences Related to Israel\n\n- the user lives in Israel and follows geopolitical content\n- the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides\n- the user finds shows like \"Fauda\" too real and conflict-heavy\n- the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel\n\n### Recommendation Preferences\n\n- the user is interested in recommendations of recently released content\n- the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there\n",
5
+ "captured_on": "2025-10-26T21:25:43.200350"
6
+ }
demo-results/user_entertainment_preferences_israel_tired_parent.md ADDED
@@ -0,0 +1,45 @@
1
+ ## User's Entertainment Preferences - Israel-Based, Tired Parent
2
+
3
+ ## Entertainment Preferences
4
+
5
+ ### General
6
+
7
+ - the user prefers to watch content that is either based on a true story or is credible
8
+ - the user is not a fan of science fiction
9
+ - the user enjoys content with intriguing stories
10
+ - the user occasionally enjoys horror movies
11
+ - the user likes thoughtful content
12
+ - the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy
13
+ - the user prefers movies that take their time rather than overload with special effects and violence
14
+ - the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests
15
+ - the user likes comedy movies, but not rom-coms
16
+ - the user likes obscure travel content
17
+ - the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence
18
+
19
+ ### Formats and Platforms
20
+
21
+ - the user enjoys Netflix documentary series, especially offbeat documentaries
22
+ - the user appreciates the content produced by Vice
23
+
24
+ ### Specific Movies and Genres
25
+
26
+ - the user enjoys the genre of the absurd (e.g., Waiting for Godot)
27
+ - the user liked the movies "Get Out", "The Matrix", "That Guy" (potentially "Vanilla Sky"), and "Limitless"
28
+ - the user is interested in movies that explore the question of reality, such as "Inception"
29
+
30
+ ### Content Preferences Related to Israel
31
+
32
+ - the user lives in Israel and follows geopolitical content
33
+ - the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides
34
+ - the user finds shows like "Fauda" too real and conflict-heavy
35
+ - the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel
36
+
37
+ ### Recommendation Preferences
38
+
39
+ - the user is interested in recommendations of recently released content
40
+ - the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there
41
+
42
+
43
+ ---
44
+
45
+ Captured on: 2025-10-26 21:25:43
demo.html ADDED
@@ -0,0 +1,413 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Context Cruncher - Demo</title>
7
+ <style>
8
+ * {
9
+ margin: 0;
10
+ padding: 0;
11
+ box-sizing: border-box;
12
+ }
13
+
14
+ body {
15
+ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
16
+ line-height: 1.6;
17
+ color: #333;
18
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
19
+ min-height: 100vh;
20
+ padding: 2rem;
21
+ }
22
+
23
+ .container {
24
+ max-width: 1200px;
25
+ margin: 0 auto;
26
+ background: white;
27
+ border-radius: 12px;
28
+ box-shadow: 0 10px 40px rgba(0, 0, 0, 0.2);
29
+ overflow: hidden;
30
+ }
31
+
32
+ .header {
33
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
34
+ color: white;
35
+ padding: 3rem 2rem;
36
+ text-align: center;
37
+ }
38
+
39
+ .header h1 {
40
+ font-size: 2.5rem;
41
+ margin-bottom: 0.5rem;
42
+ }
43
+
44
+ .header p {
45
+ font-size: 1.2rem;
46
+ opacity: 0.9;
47
+ }
48
+
49
+ .content {
50
+ padding: 2rem;
51
+ }
52
+
53
+ .intro {
54
+ background: #f8f9fa;
55
+ padding: 1.5rem;
56
+ border-radius: 8px;
57
+ margin-bottom: 2rem;
58
+ border-left: 4px solid #667eea;
59
+ }
60
+
61
+ .intro h2 {
62
+ color: #667eea;
63
+ margin-bottom: 1rem;
64
+ }
65
+
66
+ .demo-section {
67
+ margin-bottom: 3rem;
68
+ }
69
+
70
+ .demo-section h2 {
71
+ color: #333;
72
+ margin-bottom: 1.5rem;
73
+ padding-bottom: 0.5rem;
74
+ border-bottom: 2px solid #667eea;
75
+ }
76
+
77
+ .columns {
78
+ display: grid;
79
+ grid-template-columns: 1fr 1fr;
80
+ gap: 2rem;
81
+ margin-top: 2rem;
82
+ }
83
+
84
+ .column {
85
+ background: #f8f9fa;
86
+ padding: 1.5rem;
87
+ border-radius: 8px;
88
+ border: 1px solid #e0e0e0;
89
+ }
90
+
91
+ .column h3 {
92
+ color: #667eea;
93
+ margin-bottom: 1rem;
94
+ font-size: 1.2rem;
95
+ }
96
+
97
+ .output-box {
98
+ background: white;
99
+ padding: 1.5rem;
100
+ border-radius: 6px;
101
+ border: 1px solid #ddd;
102
+ max-height: 600px;
103
+ overflow-y: auto;
104
+ font-family: 'Courier New', monospace;
105
+ font-size: 0.9rem;
106
+ white-space: pre-wrap;
107
+ word-wrap: break-word;
108
+ }
109
+
110
+ .markdown-output {
111
+ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
112
+ line-height: 1.8;
113
+ }
114
+
115
+ .markdown-output h2 {
116
+ color: #333;
117
+ margin-top: 1.5rem;
118
+ margin-bottom: 1rem;
119
+ font-size: 1.4rem;
120
+ }
121
+
122
+ .markdown-output h3 {
123
+ color: #555;
124
+ margin-top: 1rem;
125
+ margin-bottom: 0.5rem;
126
+ font-size: 1.1rem;
127
+ }
128
+
129
+ .markdown-output ul {
130
+ margin-left: 2rem;
131
+ margin-bottom: 1rem;
132
+ }
133
+
134
+ .markdown-output li {
135
+ margin-bottom: 0.5rem;
136
+ }
137
+
138
+ .markdown-output hr {
139
+ margin: 2rem 0;
140
+ border: none;
141
+ border-top: 1px solid #ddd;
142
+ }
143
+
144
+ .badge {
145
+ display: inline-block;
146
+ background: #667eea;
147
+ color: white;
148
+ padding: 0.25rem 0.75rem;
149
+ border-radius: 4px;
150
+ font-size: 0.85rem;
151
+ margin-bottom: 1rem;
152
+ }
153
+
154
+ .cta {
155
+ text-align: center;
156
+ padding: 2rem;
157
+ background: #f8f9fa;
158
+ border-radius: 8px;
159
+ margin-top: 2rem;
160
+ }
161
+
162
+ .cta h3 {
163
+ color: #333;
164
+ margin-bottom: 1rem;
165
+ }
166
+
167
+ .cta a {
168
+ display: inline-block;
169
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
170
+ color: white;
171
+ padding: 1rem 2rem;
172
+ text-decoration: none;
173
+ border-radius: 6px;
174
+ font-weight: bold;
175
+ transition: transform 0.2s;
176
+ }
177
+
178
+ .cta a:hover {
179
+ transform: translateY(-2px);
180
+ box-shadow: 0 4px 12px rgba(102, 126, 234, 0.4);
181
+ }
182
+
183
+ .tabs {
184
+ display: flex;
185
+ border-bottom: 2px solid #e0e0e0;
186
+ background: #f8f9fa;
187
+ margin: 0;
188
+ padding: 0 2rem;
189
+ }
190
+
191
+ .tab {
192
+ padding: 1rem 2rem;
193
+ cursor: pointer;
194
+ border: none;
195
+ background: none;
196
+ font-size: 1rem;
197
+ font-weight: 500;
198
+ color: #666;
199
+ border-bottom: 3px solid transparent;
200
+ transition: all 0.3s;
201
+ }
202
+
203
+ .tab:hover {
204
+ color: #667eea;
205
+ background: rgba(102, 126, 234, 0.05);
206
+ }
207
+
208
+ .tab.active {
209
+ color: #667eea;
210
+ border-bottom-color: #667eea;
211
+ background: white;
212
+ }
213
+
214
+ .tab-content {
215
+ display: none;
216
+ }
217
+
218
+ .tab-content.active {
219
+ display: block;
220
+ }
221
+
222
+ @media (max-width: 768px) {
223
+ .columns {
224
+ grid-template-columns: 1fr;
225
+ }
226
+
227
+ .header h1 {
228
+ font-size: 2rem;
229
+ }
230
+
231
+ .content {
232
+ padding: 1rem;
233
+ }
234
+
235
+ .tabs {
236
+ flex-direction: column;
237
+ padding: 0;
238
+ }
239
+
240
+ .tab {
241
+ padding: 0.75rem 1rem;
242
+ text-align: left;
243
+ }
244
+ }
245
+ </style>
246
+ </head>
247
+ <body>
248
+ <div class="container">
249
+ <div class="header">
250
+ <h1>Context Cruncher Demo</h1>
251
+ <p>See how audio becomes structured context data</p>
252
+ </div>
253
+
254
+ <div class="tabs">
255
+ <button class="tab active" onclick="switchTab(event, 'overview')">Overview</button>
256
+ <button class="tab" onclick="switchTab(event, 'demo')">Demo Results</button>
257
+ <button class="tab" onclick="switchTab(event, 'features')">Features</button>
258
+ </div>
259
+
260
+ <div class="content">
261
+ <div id="overview" class="tab-content active">
262
+ <div class="intro">
263
+ <h2>What is Context Cruncher?</h2>
264
+ <p>
265
+ Context Cruncher transforms casual voice recordings into clean, structured context data
266
+ that AI systems can use for more personalized results. Using Gemini AI's multimodal capabilities,
267
+ it processes audio directly - understanding, cleaning, and organizing your spoken words into
268
+ useful context data.
269
+ </p>
270
+ <p style="margin-top: 1rem;">
271
+ Below is a real example using a voice recording about movie preferences. Notice how the raw,
272
+ conversational audio has been transformed into organized, third-person context data ready
273
+ for AI applications.
274
+ </p>
275
+ </div>
276
+
277
+ <div class="cta">
278
+ <h3>Ready to try it yourself?</h3>
279
+ <p style="margin-bottom: 1.5rem;">Process your own audio and create structured context data</p>
280
+ <a href="https://huggingface.co/spaces/danielrosehill/Context-Cruncher" target="_blank">Launch Context Cruncher</a>
281
+ </div>
282
+ </div>
283
+
284
+ <div id="demo" class="tab-content">
285
+ <h2>Demo Results</h2>
286
+ <p>
287
+ This demo processes the example audio file (<code>movie-prefs.opus</code>) included in the repository.
288
+ The audio contains casual thoughts about entertainment preferences, and Context Cruncher has
289
+ extracted structured context data from it.
290
+ </p>
291
+
292
+ <div class="columns">
293
+ <div class="column">
294
+ <h3>Markdown Output</h3>
295
+ <span class="badge">Human-Readable</span>
296
+ <div class="output-box markdown-output">
297
+ <h2>User's Entertainment Preferences - Israel-Based, Tired Parent</h2>
298
+
299
+ <h3>Entertainment Preferences</h3>
300
+
301
+ <h4>General</h4>
302
+
303
+ <ul>
304
+ <li>the user prefers to watch content that is either based on a true story or is credible</li>
305
+ <li>the user is not a fan of science fiction</li>
306
+ <li>the user enjoys content with intriguing stories</li>
307
+ <li>the user occasionally enjoys horror movies</li>
308
+ <li>the user likes thoughtful content</li>
309
+ <li>the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy</li>
310
+ <li>the user prefers movies that take their time rather than overload with special effects and violence</li>
311
+ <li>the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests</li>
312
+ <li>the user likes comedy movies, but not rom-coms</li>
313
+ <li>the user likes obscure travel content</li>
314
+ <li>the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence</li>
315
+ </ul>
316
+
317
+ <h4>Formats and Platforms</h4>
318
+
319
+ <ul>
320
+ <li>the user enjoys Netflix documentary series, especially offbeat documentaries</li>
321
+ <li>the user appreciates the content produced by Vice</li>
322
+ </ul>
323
+
324
+ <h4>Specific Movies and Genres</h4>
325
+
326
+ <ul>
327
+ <li>the user enjoys the genre of the absurd (e.g., Waiting for Godot)</li>
328
+ <li>the user liked the movies "Get Out", "The Matrix", "That Guy" (potentially "Vanilla Sky"), and "Limitless"</li>
329
+ <li>the user is interested in movies that explore the question of reality, such as "Inception"</li>
330
+ </ul>
331
+
332
+ <h4>Content Preferences Related to Israel</h4>
333
+
334
+ <ul>
335
+ <li>the user lives in Israel and follows geopolitical content</li>
336
+ <li>the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides</li>
337
+ <li>the user finds shows like "Fauda" too real and conflict-heavy</li>
338
+ <li>the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel</li>
339
+ </ul>
340
+
341
+ <h4>Recommendation Preferences</h4>
342
+
343
+ <ul>
344
+ <li>the user is interested in recommendations of recently released content</li>
345
+ <li>the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there</li>
346
+ </ul>
347
+
348
+ <hr>
349
+
350
+ <p><em>Captured on: 2025-10-26 21:25:43</em></p>
351
+ </div>
352
+ </div>
353
+
354
+ <div class="column">
355
+ <h3>JSON Output</h3>
356
+ <span class="badge">Machine-Readable</span>
357
+ <div class="output-box">{
358
+ "human_readable_name": "User's Entertainment Preferences - Israel-Based, Tired Parent",
359
+ "snake_case_filename": "user_entertainment_preferences_israel_tired_parent",
360
+ "context_data": "## Entertainment Preferences\n\n### General\n\n- the user prefers to watch content that is either based on a true story or is credible\n- the user is not a fan of science fiction\n- the user enjoys content with intriguing stories\n- the user occasionally enjoys horror movies\n- the user likes thoughtful content\n- the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy\n- the user prefers movies that take their time rather than overload with special effects and violence\n- the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests\n- the user likes comedy movies, but not rom-coms\n- the user likes obscure travel content\n- the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence\n\n### Formats and Platforms\n\n- the user enjoys Netflix documentary series, especially offbeat documentaries\n- the user appreciates the content produced by Vice\n\n### Specific Movies and Genres\n\n- the user enjoys the genre of the absurd (e.g., Waiting for Godot)\n- the user liked the movies \"Get Out\", \"The Matrix\", \"That Guy\" (potentially \"Vanilla Sky\"), and \"Limitless\"\n- the user is interested in movies that explore the question of reality, such as \"Inception\"\n\n### Content Preferences Related to Israel\n\n- the user lives in Israel and follows geopolitical content\n- the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides\n- the user finds shows like \"Fauda\" too real and conflict-heavy\n- the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel\n\n### Recommendation Preferences\n\n- the user is interested in recommendations of recently released content\n- the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there\n",
361
+ "captured_on": "2025-10-26T21:25:43.200350"
362
+ }</div>
363
+ </div>
364
+ </div>
365
+ </div>
366
+
367
+ <div id="features" class="tab-content">
368
+ <h2>Key Features Demonstrated</h2>
369
+ <div class="columns">
370
+ <div class="column">
371
+ <h3>Intelligent Cleaning</h3>
372
+ <p>Removes filler words, tangents, and irrelevant information while preserving all meaningful context.</p>
373
+ </div>
374
+ <div class="column">
375
+ <h3>Structured Organization</h3>
376
+ <p>Automatically organizes information into logical categories and hierarchies.</p>
377
+ </div>
378
+ <div class="column">
379
+ <h3>Third-Person Conversion</h3>
380
+ <p>Transforms first-person narratives into third-person context data about "the user".</p>
381
+ </div>
382
+ <div class="column">
383
+ <h3>Multiple Formats</h3>
384
+ <p>Outputs both human-readable Markdown and machine-readable JSON formats.</p>
385
+ </div>
386
+ </div>
387
+ </div>
388
+ </div>
389
+ </div>
390
+
391
+ <script>
392
+ function switchTab(event, tabName) {
393
+ // Hide all tab contents
394
+ const tabContents = document.getElementsByClassName('tab-content');
395
+ for (let content of tabContents) {
396
+ content.classList.remove('active');
397
+ }
398
+
399
+ // Remove active class from all tabs
400
+ const tabs = document.getElementsByClassName('tab');
401
+ for (let tab of tabs) {
402
+ tab.classList.remove('active');
403
+ }
404
+
405
+ // Show the selected tab content
406
+ document.getElementById(tabName).classList.add('active');
407
+
408
+ // Add active class to the clicked tab
409
+ event.currentTarget.classList.add('active');
410
+ }
411
+ </script>
412
+ </body>
413
+ </html>
example-data/movie-prefs.opus ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:93e065151228d8b4386403cca36ecc4e120af22a2d54e6b7823baea72a033bd6
3
+ size 2514076
gemini_processor.py ADDED
@@ -0,0 +1,189 @@
1
+ """
2
+ Gemini API integration for processing audio and extracting context data.
3
+ """
4
+ import google.generativeai as genai
5
+ import json
6
+ from datetime import datetime
7
+ from typing import Dict, Tuple
8
+
9
+
10
+ def get_system_prompt(user_name: str = None) -> str:
11
+ """
12
+ Generate the system prompt for context extraction.
13
+
14
+ Args:
15
+ user_name: Optional name to use instead of "the user"
16
+
17
+ Returns:
18
+ System prompt string
19
+ """
20
+ user_reference = user_name if user_name else "the user"
21
+
22
+ return f"""You are a context extraction assistant. Your task is to analyze audio recordings where users provide personal context information and extract it in a clean, structured format.
23
+
24
+ ## Your Task
25
+
26
+ Extract context data from the user's audio recording. Context data refers to specific information about the user that can be used to ground AI inference for more personalized results.
27
+
28
+ ## Transformation Guidelines
29
+
30
+ 1. Remove irrelevant information (e.g., tangential conversations, notes to self)
31
+ 2. Remove duplicates and redundancy
32
+ 3. Reformat from first person to third person, referring to "{user_reference}"
33
+ 4. Organize information hierarchically with clear sections
34
+ 5. Present information in a clean, structured markdown format
35
+
36
+ ## Example Transformation
37
+
38
+ INPUT (raw audio transcript):
39
+ "Okay so ... let's document my health problems and the meds I take for this AI project ... ehm.. where do i start ... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh .. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah .. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
40
+
41
+ OUTPUT (cleaned context data):
42
+
43
+ ## Medical Conditions
44
+
45
+ - {user_reference} has had asthma since childhood
46
+ - {user_reference} has adult ADHD
47
+
48
+ ## Medication List
49
+
50
+ - {user_reference} takes Relvar, daily, for asthma
51
+ - {user_reference} takes Vyvanse 70mg, daily, for ADHD
52
+
53
+ ## Important Notes
54
+
55
+ Follow a careful hierarchical structure that allows additional context to be easily integrated later. Use clear section headers and bullet points for organization.
56
+
57
+ Now process the provided audio recording and extract the context data following these guidelines."""
58
+
59
+
60
+ def get_naming_prompt() -> str:
61
+ """Get the prompt for generating context data names."""
62
+ return """Based on the context data you just extracted, provide a JSON object with:
63
+ 1. human_readable_name: A clear, descriptive title for this context (e.g., "Medical History and Medications", "Movie Preferences")
64
+ 2. snake_case_filename: A snake_case version suitable for a filename (e.g., "medical_history_medications", "movie_preferences")
65
+
66
+ Respond ONLY with a valid JSON object in this exact format:
67
+ {
68
+ "human_readable_name": "Your Title Here",
69
+ "snake_case_filename": "your_filename_here"
70
+ }"""
71
+
72
+
73
+ def process_audio_with_gemini(
74
+ audio_file_path: str,
75
+ api_key: str,
76
+ user_name: str = None
77
+ ) -> Tuple[str, str, str]:
78
+ """
79
+ Process audio file with Gemini API to extract context data.
80
+
81
+ Args:
82
+ audio_file_path: Path to the audio file
83
+ api_key: Gemini API key
84
+ user_name: Optional user name for personalization
85
+
86
+ Returns:
87
+ Tuple of (context_markdown, human_readable_name, snake_case_filename)
88
+
89
+ Raises:
90
+ Exception: If API call fails
91
+ """
92
+ genai.configure(api_key=api_key)
93
+
94
+ # Use Gemini Pro 2.5 with audio understanding
95
+ model = genai.GenerativeModel('gemini-2.0-flash-exp')
96
+
97
+ # Upload the audio file
98
+ audio_file = genai.upload_file(audio_file_path)
99
+
100
+ # Generate context data
101
+ system_prompt = get_system_prompt(user_name)
102
+ response = model.generate_content([system_prompt, audio_file])
103
+ context_markdown = response.text
104
+
105
+ # Generate naming information
106
+ naming_response = model.generate_content([
107
+ context_markdown,
108
+ get_naming_prompt()
109
+ ])
110
+
111
+ # Parse the JSON response
112
+ try:
113
+ # Extract JSON from response (handle potential markdown code blocks)
114
+ naming_text = naming_response.text.strip()
115
+ if naming_text.startswith('```'):
116
+ # Remove markdown code block markers
117
+ lines = naming_text.split('\n')
118
+ naming_text = '\n'.join(lines[1:-1])
119
+
120
+ naming_data = json.loads(naming_text)
121
+ human_readable_name = naming_data['human_readable_name']
122
+ snake_case_filename = naming_data['snake_case_filename']
123
+ except (json.JSONDecodeError, KeyError) as e:
124
+ # Fallback to generic naming if parsing fails
125
+ human_readable_name = "Context Data"
126
+ snake_case_filename = "context_data"
127
+
128
+ return context_markdown, human_readable_name, snake_case_filename
129
+
130
+
131
+ def create_markdown_file(
132
+ context_markdown: str,
133
+ human_readable_name: str,
134
+ snake_case_filename: str
135
+ ) -> Tuple[str, str]:
136
+ """
137
+ Create a formatted markdown file content.
138
+
139
+ Args:
140
+ context_markdown: The extracted context data
141
+ human_readable_name: Human readable title
142
+ snake_case_filename: Filename
143
+
144
+ Returns:
145
+ Tuple of (filename, content)
146
+ """
147
+ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
148
+
149
+ content = f"""## {human_readable_name}
150
+
151
+ {context_markdown}
152
+
153
+ ---
154
+
155
+ Captured on: {timestamp}
156
+ """
157
+
158
+ filename = f"{snake_case_filename}.md"
159
+ return filename, content
160
+
161
+
162
+ def create_json_file(
163
+ context_markdown: str,
164
+ human_readable_name: str,
165
+ snake_case_filename: str
166
+ ) -> Tuple[str, str]:
167
+ """
168
+ Create a JSON file content.
169
+
170
+ Args:
171
+ context_markdown: The extracted context data
172
+ human_readable_name: Human readable title
173
+ snake_case_filename: Filename
174
+
175
+ Returns:
176
+ Tuple of (filename, json_content)
177
+ """
178
+ timestamp = datetime.now().isoformat()
179
+
180
+ data = {
181
+ "human_readable_name": human_readable_name,
182
+ "snake_case_filename": snake_case_filename,
183
+ "context_data": context_markdown,
184
+ "captured_on": timestamp
185
+ }
186
+
187
+ filename = f"{snake_case_filename}.json"
188
+ json_content = json.dumps(data, indent=2)
189
+ return filename, json_content
generate_demo.py ADDED
@@ -0,0 +1,68 @@
1
+ """
2
+ Generate demo results by processing the example audio file.
3
+ """
4
+ import os
5
+ from pathlib import Path
6
+ from gemini_processor import (
7
+ process_audio_with_gemini,
8
+ create_markdown_file,
9
+ create_json_file
10
+ )
11
+ from dotenv import load_dotenv
12
+
13
+ # Load environment variables
14
+ load_dotenv()
15
+
16
+ def main():
17
+ # Get API key from environment
18
+ api_key = os.getenv('GEMINI_API')
19
+ if not api_key:
20
+ raise ValueError("GEMINI_API not found in .env file")
21
+
22
+ # Path to example audio
23
+ audio_path = "example-data/movie-prefs.opus"
24
+
25
+ print(f"Processing {audio_path}...")
26
+
27
+ # Process with Gemini (using "user" identification)
28
+ context_markdown, human_readable_name, snake_case_filename = process_audio_with_gemini(
29
+ audio_path,
30
+ api_key,
31
+ user_name=None # Use "the user" format
32
+ )
33
+
34
+ print(f"Extracted context: {human_readable_name}")
35
+
36
+ # Create output files
37
+ md_filename, md_content = create_markdown_file(
38
+ context_markdown,
39
+ human_readable_name,
40
+ snake_case_filename
41
+ )
42
+
43
+ json_filename, json_content = create_json_file(
44
+ context_markdown,
45
+ human_readable_name,
46
+ snake_case_filename
47
+ )
48
+
49
+ # Create demo-results directory
50
+ demo_dir = Path("demo-results")
51
+ demo_dir.mkdir(exist_ok=True)
52
+
53
+ # Write files
54
+ md_path = demo_dir / md_filename
55
+ json_path = demo_dir / json_filename
56
+
57
+ with open(md_path, 'w') as f:
58
+ f.write(md_content)
59
+ print(f"Saved: {md_path}")
60
+
61
+ with open(json_path, 'w') as f:
62
+ f.write(json_content)
63
+ print(f"Saved: {json_path}")
64
+
65
+ print("\nDemo results generated successfully!")
66
+
67
+ if __name__ == "__main__":
68
+ main()
requirements.txt ADDED
@@ -0,0 +1,3 @@
1
+ gradio==5.9.1
2
+ google-generativeai==0.8.3
3
+ python-dotenv==1.0.1
run.sh ADDED
@@ -0,0 +1,55 @@
1
+ #!/bin/bash
2
+ # Context Cruncher - Launch Script
3
+ # Wrapper script to easily launch the Context Cruncher application
4
+
5
+ set -e # Exit on error
6
+
7
+ echo "🎙️ Context Cruncher - Launch Script"
8
+ echo "===================================="
9
+ echo ""
10
+
11
+ # Check if virtual environment exists
12
+ if [ ! -d ".venv" ]; then
13
+ echo "❌ Virtual environment not found!"
14
+ echo "Creating virtual environment with uv..."
15
+ uv venv
16
+ echo "✅ Virtual environment created"
17
+ echo ""
18
+ fi
19
+
20
+ # Activate virtual environment
21
+ echo "📦 Activating virtual environment..."
22
+ source .venv/bin/activate
23
+
24
+ # Check if dependencies are installed
25
+ if ! python -c "import gradio" 2>/dev/null; then
26
+ echo "📥 Installing dependencies..."
27
+ uv pip install -r requirements.txt
28
+ echo "✅ Dependencies installed"
29
+ echo ""
30
+ fi
31
+
32
+ # Check if .env file exists
33
+ if [ ! -f ".env" ]; then
34
+ echo "⚠️ Warning: .env file not found!"
35
+ echo "Please create a .env file with your Gemini API key."
36
+ echo "You can copy .env.example and add your key:"
37
+ echo ""
38
+ echo " cp .env.example .env"
39
+ echo " # Then edit .env and add your GEMINI_API key"
40
+ echo ""
41
+ read -p "Do you want to continue anyway? (y/n) " -n 1 -r
42
+ echo ""
43
+ if [[ ! $REPLY =~ ^[Yy]$ ]]; then
44
+ exit 1
45
+ fi
46
+ fi
47
+
48
+ # Launch the application
49
+ echo "🚀 Launching Context Cruncher..."
50
+ echo "The app will open in your browser at http://localhost:7860"
51
+ echo ""
52
+ echo "Press Ctrl+C to stop the server"
53
+ echo ""
54
+
55
+ python app.py