danielrosehill Claude committed on
Commit 59c7f4b · 1 Parent(s): d4a7435

Complete Context Cruncher deployment for Hugging Face Spaces


Major updates for HF Spaces compatibility:
- Added proper YAML frontmatter to README.md with Space metadata
- Pinned dependency versions for stability (gradio 5.9.1, google-generativeai 0.8.3)
- Added .gitignore to exclude common Python/dev files
- Added .env.example template for API key configuration
- Enhanced app.py with environment variable support and HF Spaces settings
- Configured server settings for Spaces compatibility (0.0.0.0:7860)
- Added auto-loading of GEMINI_API from environment
- Configured Git LFS for audio files (*.opus)
- Included complete application code, demo files, and examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

.env.example ADDED
@@ -0,0 +1,4 @@
1
+ # Gemini API Configuration
2
+ # Get your API key from https://ai.google.dev/
3
+
4
+ GEMINI_API=your_api_key_here
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ *.opus filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,51 @@
1
+ # Environment variables
2
+ .env
3
+
4
+ # Python
5
+ __pycache__/
6
+ *.py[cod]
7
+ *$py.class
8
+ *.so
9
+ .Python
10
+ build/
11
+ develop-eggs/
12
+ dist/
13
+ downloads/
14
+ eggs/
15
+ .eggs/
16
+ lib/
17
+ lib64/
18
+ parts/
19
+ sdist/
20
+ var/
21
+ wheels/
22
+ *.egg-info/
23
+ .installed.cfg
24
+ *.egg
25
+ MANIFEST
26
+
27
+ # Virtual environments
28
+ .venv/
29
+ venv/
30
+ ENV/
31
+ env/
32
+
33
+ # IDEs
34
+ .vscode/
35
+ .idea/
36
+ *.swp
37
+ *.swo
38
+ *~
39
+
40
+ # OS
41
+ .DS_Store
42
+ Thumbs.db
43
+
44
+ # Temporary files
45
+ *.tmp
46
+ *.log
47
+ temp/
48
+ tmp/
49
+
50
+ # Gradio
51
+ flagged/
CLAUDE.md ADDED
@@ -0,0 +1,107 @@
1
+ This repository contains a utility that I have found very useful in various AI workflows: it extracts "context data" from user-supplied voice notes.
2
+
3
+ Here's the context:
4
+
5
+ Voice is a great way to capture information quickly, and I believe it lends itself perfectly to a workflow that captures user-specific context data proactively rather than passively (e.g. by building up memory over time).
6
+
7
+ I use this approach when I have a lot of data to provide for a specific project. As I'm open-sourcing this repo, I've chosen a movie-recommendations context recording (in example-data) that provides some information about the type of video content I enjoy.
8
+
9
+ The task in this repo: create a Hugging Face Space. This will be for public use, and I will clone it for my personal implementation.
10
+
11
+ LLM method: BYOK (bring your own key) with Gemini 2.5 Pro, which has audio understanding.
12
+
13
+ API documentation is here: https://ai.google.dev/gemini-api/docs/audio
14
+
15
+ ## Record Audio
16
+
17
+ A panel for the user to either:
18
+
19
+ 1) Record audio from the browser (with controls for record, pause, stop, abort).
20
+ 2) Upload an audio file (accepting opus, mp3, wav)
21
+
22
+ A button to extract content.
23
+
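+ A minimal Gradio sketch of this panel (the full implementation is in app.py later in this commit; variable names are illustrative):
+
+ ```python
+ import gradio as gr
+
+ with gr.Blocks() as demo:
+     # In-browser recording; "filepath" output lets us hand the file straight to Gemini
+     recording = gr.Audio(sources=["microphone"], type="filepath", label="Record Audio")
+     # Or an uploaded file (opus, mp3, wav)
+     upload = gr.File(label="Upload Audio File", file_types=["audio"], type="filepath")
+     extract_btn = gr.Button("Extract Context", variant="primary")
+ ```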
24
+ ## Context Extraction Logic
25
+
26
+ "Context data," in this context, refers to specific information about the user that can be used to ground AI inference to produce more personalised results.
27
+
28
+ Context data can be derived from a typical STT transcription by:
29
+
30
+ - Omitting irrelevant information
31
+ - Removing duplicates
32
+ - Reformatting from the first person to the third, referring to "the user."
33
+
34
+ Here is a short example to model the desired type of transformation, with the expectation that a raw STT transcript would contain roughly this level of noise:
35
+
36
+ "Okay so ... let's document my health problems and the meds I take for this AI project ... ehm.. where do i start ... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh .. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah .. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
37
+
38
+ This text would be transformed to (approximately):
39
+
40
+ {START EXAMPLE}
41
+
42
+ ## Medical Conditions
43
+
44
+ - User has had asthma since childhood
45
+ - User has adult ADHD
46
+
47
+ ## Medication List
48
+
49
+ - User takes Relvar, daily, for asthma
50
+ - User takes Vyvanse 70mg, daily, for ADHD
51
+
52
+ {END EXAMPLE}
53
+
54
+ The above model is written with the idea that if the user were to later provide more detailed context, an agent could easily slot it into the right places. Hence, follow a careful hierarchical structure when formatting the context notes.
55
+
56
+ Most of my context-parsing utilities to date have followed a two-part approach: STT, then LLM cleanup (which is why I'm creating this updated interface: Gemini's multimodal support means the two steps can be combined!).
57
+
58
+ In this updated interface the audio binary should be sent to Gemini alongside the system prompt (ensuring to include the example).
59
+
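+ A minimal sketch of that call with the `google-generativeai` SDK, mirroring gemini_processor.py in this commit (the model name is the one used there; the audio path is illustrative, and `api_key` and `system_prompt` are assumed to be in scope):
+
+ ```python
+ import google.generativeai as genai
+
+ genai.configure(api_key=api_key)
+ model = genai.GenerativeModel("gemini-2.0-flash-exp")
+
+ # Upload the audio binary, then send it together with the system prompt (which includes the example above)
+ audio_file = genai.upload_file("recording.opus")
+ response = model.generate_content([system_prompt, audio_file])
+ context_markdown = response.text
+ ```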
60
+ The extracted context data can be provided to the user as a markdown file that populates a right-hand pane called Extracted Context.
61
+
62
+ ## User Identification
63
+
64
+ I would like to provide one user-editable parameter: how they would like to be identified in the context data.
65
+
66
+ That is to say: should the user be referred to by name ("Daniel has asthma") or as "the user"?
67
+
68
+ If by name, I would like the user to be able to provide their name.
69
+
70
+ This information should be carried from the frontend into the system prompt constructed in the API call to Gemini.
71
+
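+ A short sketch of how that parameter can flow into the prompt (this mirrors get_system_prompt() in gemini_processor.py; the prompt text is abbreviated here):
+
+ ```python
+ def get_system_prompt(user_name: str = None) -> str:
+     # Fall back to the generic "the user" when no name was supplied in the frontend
+     user_reference = user_name if user_name else "the user"
+     return f"Rewrite the transcript in the third person, referring to {user_reference}. ..."
+ ```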
72
+ ## Context Download Options
73
+
74
+ The intended use is that after the context data is parsed, the user can download it. I would like to offer the ability to download it as a markdown doc or as JSON - we may as well generate both and provide buttons. We should also provide a copy-to-clipboard button on the markdown body text, as this gives an easy way to grab the output without requiring any download.
75
+
76
+ I would like the context data to have a unique name. The logical place to implement this is, again, via the Gemini API.
77
+
78
+ So we should ask it for a JSON output that provides:
79
+
80
+ - Human readable name
81
+ - Snake case filename
82
+ - Context data (as markdown)
83
+
84
+ To this we can add a timestamp.
85
+
86
+ ## Template for markdown file
87
+
88
+ For the markdown version, the filename can be the snake_case filename with a .md extension.
89
+
90
+ The actual contents can follow this template:
91
+
92
+ {Start Example}
93
+ ## Readable Context Title
94
+
95
+ {Context Data}
96
+
97
+ ---
98
+
99
+ Captured on: {timestamp}
100
+
101
+ {End Example}
102
+
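+ Filled in with Python, this template is essentially the following sketch (as implemented in create_markdown_file() in this commit, assuming the fields returned from the Gemini step are in scope):
+
+ ```python
+ from datetime import datetime
+
+ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+ filename = f"{snake_case_filename}.md"
+ content = f"""## {human_readable_name}
+
+ {context_markdown}
+
+ ---
+
+ Captured on: {timestamp}
+ """
+ ```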
103
+ ## JSON Template
104
+
105
+ For the JSON version we should construct an object with all these parameters.
106
+
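+ A matching sketch for the JSON object, using the same field names as create_json_file() and the demo output in demo-results/ (the name and markdown variables are assumed to be in scope):
+
+ ```python
+ import json
+ from datetime import datetime
+
+ data = {
+     "human_readable_name": human_readable_name,
+     "snake_case_filename": snake_case_filename,
+     "context_data": context_markdown,           # the markdown body
+     "captured_on": datetime.now().isoformat(),  # the timestamp we add ourselves
+ }
+ json_content = json.dumps(data, indent=2)
+ ```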
107
+ For both, after generation, they should be presented to the user with download buttons.
README.md CHANGED
@@ -1,13 +1,179 @@
1
  ---
2
  title: Context Cruncher
3
- emoji: πŸƒ
4
- colorFrom: green
5
- colorTo: red
6
  sdk: gradio
7
- sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
- short_description: Voice note context extraction utility
 
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
  title: Context Cruncher
3
+ emoji: 🎙️
4
+ colorFrom: blue
5
+ colorTo: purple
6
  sdk: gradio
7
+ sdk_version: 5.9.1
8
  app_file: app.py
9
  pinned: false
10
+ license: mit
11
+ short_description: Transform voice recordings into structured AI context data
12
  ---
13
 
14
+ # Context Cruncher 🎙️
15
+
16
+ Transform casual voice recordings into clean, structured context data for AI applications.
17
+
18
+ [![Demo](https://img.shields.io/badge/Demo-Live-brightgreen)](demo.html)
19
+ [![HuggingFace](https://img.shields.io/badge/🤗-Space-yellow)](https://huggingface.co/spaces/danielrosehill/Context-Cruncher)
20
+
21
+ ## What is Context Cruncher?
22
+
23
+ Context Cruncher extracts structured context data from voice recordings using Gemini AI's multimodal capabilities. It processes audio directly, cleaning up natural speech patterns and organizing information into useful context data that AI systems can use for personalization.
24
+
25
+ **Context data** refers to specific information about users that grounds AI inference for more personalized results. This tool achieves that by:
26
+
27
+ - Removing irrelevant information and tangents
28
+ - Eliminating duplicates and redundancy
29
+ - Reformatting from first person to third person
30
+ - Organizing information hierarchically
31
+ - Outputting both Markdown and JSON formats
32
+
33
+ ## See it in Action
34
+
35
+ Check out the [demo page](demo.html) to see real results from processing example audio about movie preferences.
36
+
37
+ ## Features
38
+
39
+ - **🎤 Flexible Audio Input**: Record directly in your browser or upload audio files (MP3, WAV, OPUS)
40
+ - **🤖 AI-Powered Extraction**: Uses Gemini 2.0 Flash for intelligent audio understanding and context extraction
41
+ - **📝 Dual Output Formats**: Get both human-readable Markdown and machine-readable JSON
42
+ - **👤 Customizable Identification**: Choose how you're referred to in the context data (by name or as "the user")
43
+ - **📋 Easy Export**: Download files or copy directly to clipboard
44
+
45
+ ## Quick Start
46
+
47
+ ### Prerequisites
48
+
49
+ - Python 3.12+
50
+ - A [Gemini API key](https://ai.google.dev/)
51
+
52
+ ### Installation
53
+
54
+ 1. Clone the repository:
55
+ ```bash
56
+ git clone https://github.com/danielrosehill/Context-Cruncher.git
57
+ cd Context-Cruncher
58
+ ```
59
+
60
+ 2. Create a virtual environment and install dependencies:
61
+ ```bash
62
+ # Using uv (recommended)
63
+ uv venv
64
+ source .venv/bin/activate
65
+ uv pip install -r requirements.txt
66
+
67
+ # Or using standard venv
68
+ python -m venv .venv
69
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
70
+ pip install -r requirements.txt
71
+ ```
72
+
73
+ 3. Create a `.env` file with your Gemini API key:
74
+ ```bash
75
+ cp .env.example .env
76
+ # Edit .env and add your API key:
77
+ # GEMINI_API="your_api_key_here"
78
+ ```
79
+
80
+ 4. Run the application:
81
+
82
+ **Option A: Using the launch script (easiest)**
83
+ ```bash
84
+ ./run.sh
85
+ ```
86
+
87
+ **Option B: Manual launch**
88
+ ```bash
89
+ source .venv/bin/activate
90
+ python app.py
91
+ ```
92
+
93
+ The app will launch in your browser at `http://localhost:7860`
94
+
95
+ ## Usage
96
+
97
+ 1. **Configure**: Enter your Gemini API key (or load from `.env`)
98
+ 2. **Choose Identification**: Select whether to be referred to by name or as "the user"
99
+ 3. **Provide Audio**: Either:
100
+ - Record directly in the browser using your microphone
101
+ - Upload an audio file (MP3, WAV, or OPUS)
102
+ 4. **Extract**: Click "Extract Context" to process your audio
103
+ 5. **Download**: Get your structured context data as Markdown or JSON
104
+
105
+ ## Example Transformation
106
+
107
+ **Raw Audio Input:**
108
+ > "Okay so... let's document my health problems and the meds I take for this AI project... ehm.. where do I start... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh.. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah.. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
109
+
110
+ **Structured Output:**
111
+ ```markdown
112
+ ## Medical Conditions
113
+
114
+ - the user has had asthma since childhood
115
+ - the user has adult ADHD
116
+
117
+ ## Medication List
118
+
119
+ - the user takes Relvar, daily, for asthma
120
+ - the user takes Vyvanse 70mg, daily, for ADHD
121
+ ```
122
+
123
+ ## Generating Demo Results
124
+
125
+ To regenerate the demo results with the example audio:
126
+
127
+ ```bash
128
+ python generate_demo.py
129
+ ```
130
+
131
+ This will process the `example-data/movie-prefs.opus` file and save results to `demo-results/`.
132
+
133
+ ## Privacy Note
134
+
135
+ Your audio is processed using the Gemini API. Review [Google's privacy policies](https://policies.google.com/) before using this tool with sensitive information.
136
+
137
+ ## Use Cases
138
+
139
+ - **AI Assistant Personalization**: Provide context to chatbots and AI assistants
140
+ - **Knowledge Management**: Convert verbal notes into structured information
141
+ - **Preference Mapping**: Document likes, dislikes, and preferences
142
+ - **Medical History**: Organize health information (note privacy considerations)
143
+ - **Project Context**: Capture project requirements and preferences
144
+
145
+ ## Technical Details
146
+
147
+ - **Frontend**: Gradio web interface
148
+ - **AI Model**: Gemini 2.0 Flash (with multimodal audio understanding)
149
+ - **Audio Processing**: Direct audio file upload to Gemini API
150
+ - **Output Formats**: Markdown and JSON
151
+
152
+ ## Repository Structure
153
+
154
+ ```
155
+ Context-Cruncher/
156
+ ├── app.py # Main Gradio application
157
+ ├── gemini_processor.py # Gemini API integration
158
+ ├── generate_demo.py # Demo generation script
159
+ ├── run.sh # Launch script
160
+ ├── requirements.txt # Python dependencies
161
+ ├── .env.example # Environment variable template
162
+ ├── demo.html # Demo results page
163
+ ├── example-data/ # Example audio files
164
+ └── demo-results/ # Generated demo outputs
165
+ ```
166
+
167
+ ## Contributing
168
+
169
+ Contributions welcome! Please feel free to submit issues or pull requests.
170
+
171
+ ## License
172
+
173
+ MIT License - See LICENSE file for details
174
+
175
+ ## Author
176
+
177
+ Daniel Rosehill
178
+ - Website: [danielrosehill.com](https://danielrosehill.com)
179
+ - GitHub: [@danielrosehill](https://github.com/danielrosehill)
app.py ADDED
@@ -0,0 +1,334 @@
1
+ """
2
+ Context Cruncher - Gradio Application
3
+ Extract structured context data from voice recordings using Gemini AI.
4
+ """
5
+ import gradio as gr
6
+ import os
7
+ from pathlib import Path
8
+ import tempfile
9
+ from dotenv import load_dotenv
10
+ from gemini_processor import (
11
+ process_audio_with_gemini,
12
+ create_markdown_file,
13
+ create_json_file
14
+ )
15
+
16
+ # Load environment variables
17
+ load_dotenv()
18
+
19
+
20
+ def process_audio(
21
+ audio_input,
22
+ uploaded_file,
23
+ api_key: str,
24
+ user_identification: str,
25
+ user_name: str = ""
26
+ ) -> tuple:
27
+ """
28
+ Process audio from either recording or upload.
29
+
30
+ Args:
31
+ audio_input: Audio from microphone recording
32
+ uploaded_file: Uploaded audio file
33
+ api_key: Gemini API key
34
+ user_identification: "name" or "user"
35
+ user_name: User's name if using name identification
36
+
37
+ Returns:
38
+ Tuple of (markdown_content, markdown_file, json_file, status_message)
39
+ """
40
+ try:
41
+ # Validate API key
42
+ if not api_key or api_key.strip() == "":
43
+ return (
44
+ "",
45
+ None,
46
+ None,
47
+ "Error: Please provide a Gemini API key"
48
+ )
49
+
50
+ # Determine which audio source to use
51
+ audio_path = None
52
+ if audio_input is not None:
53
+ audio_path = audio_input
54
+ elif uploaded_file is not None:
55
+ # NOTE: gr.File(type="filepath") may return a plain path string rather than a file object
+ audio_path = uploaded_file if isinstance(uploaded_file, str) else uploaded_file.name
56
+
57
+ if audio_path is None:
58
+ return (
59
+ "",
60
+ None,
61
+ None,
62
+ "Error: Please record audio or upload an audio file"
63
+ )
64
+
65
+ # Determine user reference
66
+ user_ref = None
67
+ if user_identification == "name":
68
+ if not user_name or user_name.strip() == "":
69
+ return (
70
+ "",
71
+ None,
72
+ None,
73
+ "Error: Please provide your name when using name identification"
74
+ )
75
+ user_ref = user_name.strip()
76
+
77
+ # Process with Gemini
78
+ status_msg = "Processing audio with Gemini API..."
79
+ context_markdown, human_readable_name, snake_case_filename = process_audio_with_gemini(
80
+ audio_path,
81
+ api_key,
82
+ user_ref
83
+ )
84
+
85
+ # Create output files
86
+ md_filename, md_content = create_markdown_file(
87
+ context_markdown,
88
+ human_readable_name,
89
+ snake_case_filename
90
+ )
91
+
92
+ json_filename, json_content = create_json_file(
93
+ context_markdown,
94
+ human_readable_name,
95
+ snake_case_filename
96
+ )
97
+
98
+ # Write files to temp directory for download
99
+ temp_dir = tempfile.mkdtemp()
100
+ md_path = Path(temp_dir) / md_filename
101
+ json_path = Path(temp_dir) / json_filename
102
+
103
+ with open(md_path, 'w') as f:
104
+ f.write(md_content)
105
+
106
+ with open(json_path, 'w') as f:
107
+ f.write(json_content)
108
+
109
+ return (
110
+ md_content,
111
+ str(md_path),
112
+ str(json_path),
113
+ f"Success! Context extracted: {human_readable_name}"
114
+ )
115
+
116
+ except Exception as e:
117
+ return (
118
+ "",
119
+ None,
120
+ None,
121
+ f"Error: {str(e)}"
122
+ )
123
+
124
+
125
+ # Custom CSS for better styling
126
+ custom_css = """
127
+ .gradio-container {
128
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
129
+ }
130
+ .main-header {
131
+ text-align: center;
132
+ margin-bottom: 1.5rem;
133
+ padding-bottom: 1rem;
134
+ border-bottom: 2px solid #e5e7eb;
135
+ }
136
+ .main-header h1 {
137
+ font-size: 2rem;
138
+ font-weight: 600;
139
+ color: #1f2937;
140
+ margin-bottom: 0.5rem;
141
+ }
142
+ .main-header p {
143
+ color: #6b7280;
144
+ font-size: 1rem;
145
+ }
146
+ .section-header {
147
+ font-weight: 600;
148
+ color: #374151;
149
+ margin-bottom: 1rem;
150
+ }
151
+ """
152
+
153
+ # Create Gradio interface
154
+ with gr.Blocks(css=custom_css, title="Context Cruncher") as demo:
155
+ gr.Markdown(
156
+ """
157
+ # Context Cruncher
158
+
159
+ Extract structured context data from voice recordings using AI
160
+ """,
161
+ elem_classes="main-header"
162
+ )
163
+
164
+ with gr.Tabs():
165
+ with gr.Tab("Extract"):
166
+ with gr.Row():
167
+ with gr.Column(scale=1):
168
+ with gr.Accordion("Configuration", open=True):
169
+ api_key_input = gr.Textbox(
170
+ label="Gemini API Key",
171
+ placeholder="Enter your Gemini API key",
172
+ type="password",
173
+ value=os.getenv("GEMINI_API", ""),
174
+ info="Get your API key from https://ai.google.dev/"
175
+ )
176
+
177
+ user_identification = gr.Radio(
178
+ choices=["user", "name"],
179
+ value="user",
180
+ label="User Identification",
181
+ info="How should you be referred to in the context data?"
182
+ )
183
+
184
+ user_name_input = gr.Textbox(
185
+ label="Your Name",
186
+ placeholder="Enter your name",
187
+ visible=False,
188
+ info="Used when 'name' is selected above"
189
+ )
190
+
191
+ gr.Markdown("### Audio Input", elem_classes="section-header")
192
+
193
+ audio_recording = gr.Audio(
194
+ sources=["microphone"],
195
+ type="filepath",
196
+ label="Record Audio"
197
+ )
198
+
199
+ gr.Markdown("**OR**")
200
+
201
+ audio_upload = gr.File(
202
+ label="Upload Audio File",
203
+ file_types=["audio"],
204
+ type="filepath"
205
+ )
206
+
207
+ process_btn = gr.Button("Extract Context", variant="primary", size="lg")
208
+
209
+ with gr.Column(scale=1):
210
+ gr.Markdown("### Results", elem_classes="section-header")
211
+
212
+ status_output = gr.Textbox(
213
+ label="Status",
214
+ interactive=False,
215
+ show_label=True
216
+ )
217
+
218
+ context_display = gr.Textbox(
219
+ label="Context Data (Markdown)",
220
+ lines=18,
221
+ interactive=False,
222
+ show_copy_button=True
223
+ )
224
+
225
+ with gr.Row():
226
+ markdown_download = gr.File(label="Download Markdown")
227
+ json_download = gr.File(label="Download JSON")
228
+
229
+ with gr.Tab("About"):
230
+ gr.Markdown(
231
+ """
232
+ ## What is Context Cruncher?
233
+
234
+ Context Cruncher transforms casual voice recordings into clean, structured context data
235
+ that AI systems can use for personalization.
236
+
237
+ **Context data** refers to specific information about users that grounds AI inference
238
+ for more personalized results.
239
+
240
+ ## How It Works
241
+
242
+ 1. **Configure** - Enter your Gemini API key and choose how you want to be identified
243
+ 2. **Input Audio** - Either record directly in your browser or upload an audio file (MP3, WAV, OPUS)
244
+ 3. **Extract** - Click the button and let AI clean up your recording into structured context data
245
+ 4. **Download** - Get your context data as Markdown or JSON, or copy directly from the text area
246
+
247
+ ## What Gets Extracted
248
+
249
+ This tool processes your audio by:
250
+
251
+ - Removing irrelevant information and tangents
252
+ - Eliminating duplicates and redundancy
253
+ - Reformatting from first person to third person
254
+ - Organizing information hierarchically
255
+ - Outputting both Markdown and JSON formats
256
+
257
+ ## Example Transformation
258
+
259
+ **Raw Audio:**
260
+ > "Okay so... let's document my health problems... I've had asthma since I was a kid.
261
+ > I take a daily inhaler called Relvar for that. Oh hey Jay! What's up!
262
+ > Okay, where was I... I also take Vyvanse for ADHD."
263
+
264
+ **Structured Output:**
265
+ ```markdown
266
+ ## Medical Conditions
267
+
268
+ - the user has had asthma since childhood
269
+ - the user has adult ADHD
270
+
271
+ ## Medication List
272
+
273
+ - the user takes Relvar, daily, for asthma
274
+ - the user takes Vyvanse for ADHD
275
+ ```
276
+
277
+ ## Privacy Notice
278
+
279
+ Your audio is processed using the Gemini API. Review Google's privacy policies
280
+ before using this tool with sensitive information.
281
+
282
+ ## Technical Details
283
+
284
+ - **AI Model**: Gemini 2.0 Flash (multimodal audio understanding)
285
+ - **Processing**: Direct audio file upload to Gemini API
286
+ - **Output Formats**: Markdown and JSON
287
+
288
+ ## Use Cases
289
+
290
+ - AI assistant personalization
291
+ - Knowledge management
292
+ - Preference mapping
293
+ - Medical history documentation (note privacy considerations)
294
+ - Project context capture
295
+ """
296
+ )
297
+
298
+ # Show/hide name input based on identification method
299
+ def toggle_name_input(identification_choice):
300
+ return gr.update(visible=identification_choice == "name")
301
+
302
+ user_identification.change(
303
+ fn=toggle_name_input,
304
+ inputs=[user_identification],
305
+ outputs=[user_name_input]
306
+ )
307
+
308
+ # Process button click
309
+ process_btn.click(
310
+ fn=process_audio,
311
+ inputs=[
312
+ audio_recording,
313
+ audio_upload,
314
+ api_key_input,
315
+ user_identification,
316
+ user_name_input
317
+ ],
318
+ outputs=[
319
+ context_display,
320
+ markdown_download,
321
+ json_download,
322
+ status_output
323
+ ]
324
+ )
325
+
326
+
327
+ if __name__ == "__main__":
328
+ # For Hugging Face Spaces, share should be False
329
+ # Set server_name to 0.0.0.0 for Spaces compatibility
330
+ demo.launch(
331
+ server_name="0.0.0.0",
332
+ server_port=7860,
333
+ share=False
334
+ )
demo-results/user_entertainment_preferences_israel_tired_parent.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "human_readable_name": "User's Entertainment Preferences - Israel-Based, Tired Parent",
3
+ "snake_case_filename": "user_entertainment_preferences_israel_tired_parent",
4
+ "context_data": "## Entertainment Preferences\n\n### General\n\n- the user prefers to watch content that is either based on a true story or is credible\n- the user is not a fan of science fiction\n- the user enjoys content with intriguing stories\n- the user occasionally enjoys horror movies\n- the user likes thoughtful content\n- the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy\n- the user prefers movies that take their time rather than overload with special effects and violence\n- the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests\n- the user likes comedy movies, but not rom-coms\n- the user likes obscure travel content\n- the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence\n\n### Formats and Platforms\n\n- the user enjoys Netflix documentary series, especially offbeat documentaries\n- the user appreciates the content produced by Vice\n\n### Specific Movies and Genres\n\n- the user enjoys the genre of the absurd (e.g., Waiting for Godot)\n- the user liked the movies \"Get Out\", \"The Matrix\", \"That Guy\" (potentially \"Vanilla Sky\"), and \"Limitless\"\n- the user is interested in movies that explore the question of reality, such as \"Inception\"\n\n### Content Preferences Related to Israel\n\n- the user lives in Israel and follows geopolitical content\n- the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides\n- the user finds shows like \"Fauda\" too real and conflict-heavy\n- the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel\n\n### Recommendation Preferences\n\n- the user is interested in recommendations of recently released content\n- the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there\n",
5
+ "captured_on": "2025-10-26T21:25:43.200350"
6
+ }
demo-results/user_entertainment_preferences_israel_tired_parent.md ADDED
@@ -0,0 +1,45 @@
1
+ ## User's Entertainment Preferences - Israel-Based, Tired Parent
2
+
3
+ ## Entertainment Preferences
4
+
5
+ ### General
6
+
7
+ - the user prefers to watch content that is either based on a true story or is credible
8
+ - the user is not a fan of science fiction
9
+ - the user enjoys content with intriguing stories
10
+ - the user occasionally enjoys horror movies
11
+ - the user likes thoughtful content
12
+ - the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy
13
+ - the user prefers movies that take their time rather than overload with special effects and violence
14
+ - the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests
15
+ - the user likes comedy movies, but not rom-coms
16
+ - the user likes obscure travel content
17
+ - the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence
18
+
19
+ ### Formats and Platforms
20
+
21
+ - the user enjoys Netflix documentary series, especially offbeat documentaries
22
+ - the user appreciates the content produced by Vice
23
+
24
+ ### Specific Movies and Genres
25
+
26
+ - the user enjoys the genre of the absurd (e.g., Waiting for Godot)
27
+ - the user liked the movies "Get Out", "The Matrix", "That Guy" (potentially "Vanilla Sky"), and "Limitless"
28
+ - the user is interested in movies that explore the question of reality, such as "Inception"
29
+
30
+ ### Content Preferences Related to Israel
31
+
32
+ - the user lives in Israel and follows geopolitical content
33
+ - the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides
34
+ - the user finds shows like "Fauda" too real and conflict-heavy
35
+ - the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel
36
+
37
+ ### Recommendation Preferences
38
+
39
+ - the user is interested in recommendations of recently released content
40
+ - the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there
41
+
42
+
43
+ ---
44
+
45
+ Captured on: 2025-10-26 21:25:43
demo.html ADDED
@@ -0,0 +1,413 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Context Cruncher - Demo</title>
7
+ <style>
8
+ * {
9
+ margin: 0;
10
+ padding: 0;
11
+ box-sizing: border-box;
12
+ }
13
+
14
+ body {
15
+ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
16
+ line-height: 1.6;
17
+ color: #333;
18
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
19
+ min-height: 100vh;
20
+ padding: 2rem;
21
+ }
22
+
23
+ .container {
24
+ max-width: 1200px;
25
+ margin: 0 auto;
26
+ background: white;
27
+ border-radius: 12px;
28
+ box-shadow: 0 10px 40px rgba(0, 0, 0, 0.2);
29
+ overflow: hidden;
30
+ }
31
+
32
+ .header {
33
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
34
+ color: white;
35
+ padding: 3rem 2rem;
36
+ text-align: center;
37
+ }
38
+
39
+ .header h1 {
40
+ font-size: 2.5rem;
41
+ margin-bottom: 0.5rem;
42
+ }
43
+
44
+ .header p {
45
+ font-size: 1.2rem;
46
+ opacity: 0.9;
47
+ }
48
+
49
+ .content {
50
+ padding: 2rem;
51
+ }
52
+
53
+ .intro {
54
+ background: #f8f9fa;
55
+ padding: 1.5rem;
56
+ border-radius: 8px;
57
+ margin-bottom: 2rem;
58
+ border-left: 4px solid #667eea;
59
+ }
60
+
61
+ .intro h2 {
62
+ color: #667eea;
63
+ margin-bottom: 1rem;
64
+ }
65
+
66
+ .demo-section {
67
+ margin-bottom: 3rem;
68
+ }
69
+
70
+ .demo-section h2 {
71
+ color: #333;
72
+ margin-bottom: 1.5rem;
73
+ padding-bottom: 0.5rem;
74
+ border-bottom: 2px solid #667eea;
75
+ }
76
+
77
+ .columns {
78
+ display: grid;
79
+ grid-template-columns: 1fr 1fr;
80
+ gap: 2rem;
81
+ margin-top: 2rem;
82
+ }
83
+
84
+ .column {
85
+ background: #f8f9fa;
86
+ padding: 1.5rem;
87
+ border-radius: 8px;
88
+ border: 1px solid #e0e0e0;
89
+ }
90
+
91
+ .column h3 {
92
+ color: #667eea;
93
+ margin-bottom: 1rem;
94
+ font-size: 1.2rem;
95
+ }
96
+
97
+ .output-box {
98
+ background: white;
99
+ padding: 1.5rem;
100
+ border-radius: 6px;
101
+ border: 1px solid #ddd;
102
+ max-height: 600px;
103
+ overflow-y: auto;
104
+ font-family: 'Courier New', monospace;
105
+ font-size: 0.9rem;
106
+ white-space: pre-wrap;
107
+ word-wrap: break-word;
108
+ }
109
+
110
+ .markdown-output {
111
+ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
112
+ line-height: 1.8;
113
+ }
114
+
115
+ .markdown-output h2 {
116
+ color: #333;
117
+ margin-top: 1.5rem;
118
+ margin-bottom: 1rem;
119
+ font-size: 1.4rem;
120
+ }
121
+
122
+ .markdown-output h3 {
123
+ color: #555;
124
+ margin-top: 1rem;
125
+ margin-bottom: 0.5rem;
126
+ font-size: 1.1rem;
127
+ }
128
+
129
+ .markdown-output ul {
130
+ margin-left: 2rem;
131
+ margin-bottom: 1rem;
132
+ }
133
+
134
+ .markdown-output li {
135
+ margin-bottom: 0.5rem;
136
+ }
137
+
138
+ .markdown-output hr {
139
+ margin: 2rem 0;
140
+ border: none;
141
+ border-top: 1px solid #ddd;
142
+ }
143
+
144
+ .badge {
145
+ display: inline-block;
146
+ background: #667eea;
147
+ color: white;
148
+ padding: 0.25rem 0.75rem;
149
+ border-radius: 4px;
150
+ font-size: 0.85rem;
151
+ margin-bottom: 1rem;
152
+ }
153
+
154
+ .cta {
155
+ text-align: center;
156
+ padding: 2rem;
157
+ background: #f8f9fa;
158
+ border-radius: 8px;
159
+ margin-top: 2rem;
160
+ }
161
+
162
+ .cta h3 {
163
+ color: #333;
164
+ margin-bottom: 1rem;
165
+ }
166
+
167
+ .cta a {
168
+ display: inline-block;
169
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
170
+ color: white;
171
+ padding: 1rem 2rem;
172
+ text-decoration: none;
173
+ border-radius: 6px;
174
+ font-weight: bold;
175
+ transition: transform 0.2s;
176
+ }
177
+
178
+ .cta a:hover {
179
+ transform: translateY(-2px);
180
+ box-shadow: 0 4px 12px rgba(102, 126, 234, 0.4);
181
+ }
182
+
183
+ .tabs {
184
+ display: flex;
185
+ border-bottom: 2px solid #e0e0e0;
186
+ background: #f8f9fa;
187
+ margin: 0;
188
+ padding: 0 2rem;
189
+ }
190
+
191
+ .tab {
192
+ padding: 1rem 2rem;
193
+ cursor: pointer;
194
+ border: none;
195
+ background: none;
196
+ font-size: 1rem;
197
+ font-weight: 500;
198
+ color: #666;
199
+ border-bottom: 3px solid transparent;
200
+ transition: all 0.3s;
201
+ }
202
+
203
+ .tab:hover {
204
+ color: #667eea;
205
+ background: rgba(102, 126, 234, 0.05);
206
+ }
207
+
208
+ .tab.active {
209
+ color: #667eea;
210
+ border-bottom-color: #667eea;
211
+ background: white;
212
+ }
213
+
214
+ .tab-content {
215
+ display: none;
216
+ }
217
+
218
+ .tab-content.active {
219
+ display: block;
220
+ }
221
+
222
+ @media (max-width: 768px) {
223
+ .columns {
224
+ grid-template-columns: 1fr;
225
+ }
226
+
227
+ .header h1 {
228
+ font-size: 2rem;
229
+ }
230
+
231
+ .content {
232
+ padding: 1rem;
233
+ }
234
+
235
+ .tabs {
236
+ flex-direction: column;
237
+ padding: 0;
238
+ }
239
+
240
+ .tab {
241
+ padding: 0.75rem 1rem;
242
+ text-align: left;
243
+ }
244
+ }
245
+ </style>
246
+ </head>
247
+ <body>
248
+ <div class="container">
249
+ <div class="header">
250
+ <h1>Context Cruncher Demo</h1>
251
+ <p>See how audio becomes structured context data</p>
252
+ </div>
253
+
254
+ <div class="tabs">
255
+ <button class="tab active" onclick="switchTab(event, 'overview')">Overview</button>
256
+ <button class="tab" onclick="switchTab(event, 'demo')">Demo Results</button>
257
+ <button class="tab" onclick="switchTab(event, 'features')">Features</button>
258
+ </div>
259
+
260
+ <div class="content">
261
+ <div id="overview" class="tab-content active">
262
+ <div class="intro">
263
+ <h2>What is Context Cruncher?</h2>
264
+ <p>
265
+ Context Cruncher transforms casual voice recordings into clean, structured context data
266
+ that AI systems can use for more personalized results. Using Gemini AI's multimodal capabilities,
267
+ it processes audio directly - understanding, cleaning, and organizing your spoken words into
268
+ useful context data.
269
+ </p>
270
+ <p style="margin-top: 1rem;">
271
+ Below is a real example using a voice recording about movie preferences. Notice how the raw,
272
+ conversational audio has been transformed into organized, third-person context data ready
273
+ for AI applications.
274
+ </p>
275
+ </div>
276
+
277
+ <div class="cta">
278
+ <h3>Ready to try it yourself?</h3>
279
+ <p style="margin-bottom: 1.5rem;">Process your own audio and create structured context data</p>
280
+ <a href="https://huggingface.co/spaces/danielrosehill/Context-Cruncher" target="_blank">Launch Context Cruncher</a>
281
+ </div>
282
+ </div>
283
+
284
+ <div id="demo" class="tab-content">
285
+ <h2>Demo Results</h2>
286
+ <p>
287
+ This demo processes the example audio file (<code>movie-prefs.opus</code>) included in the repository.
288
+ The audio contains casual thoughts about entertainment preferences, and Context Cruncher has
289
+ extracted structured context data from it.
290
+ </p>
291
+
292
+ <div class="columns">
293
+ <div class="column">
294
+ <h3>Markdown Output</h3>
295
+ <span class="badge">Human-Readable</span>
296
+ <div class="output-box markdown-output">
297
+ <h2>User's Entertainment Preferences - Israel-Based, Tired Parent</h2>
298
+
299
+ <h3>Entertainment Preferences</h3>
300
+
301
+ <h4>General</h4>
302
+
303
+ <ul>
304
+ <li>the user prefers to watch content that is either based on a true story or is credible</li>
305
+ <li>the user is not a fan of science fiction</li>
306
+ <li>the user enjoys content with intriguing stories</li>
307
+ <li>the user occasionally enjoys horror movies</li>
308
+ <li>the user likes thoughtful content</li>
309
+ <li>the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy</li>
310
+ <li>the user prefers movies that take their time rather than overload with special effects and violence</li>
311
+ <li>the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests</li>
312
+ <li>the user likes comedy movies, but not rom-coms</li>
313
+ <li>the user likes obscure travel content</li>
314
+ <li>the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence</li>
315
+ </ul>
316
+
317
+ <h4>Formats and Platforms</h4>
318
+
319
+ <ul>
320
+ <li>the user enjoys Netflix documentary series, especially offbeat documentaries</li>
321
+ <li>the user appreciates the content produced by Vice</li>
322
+ </ul>
323
+
324
+ <h4>Specific Movies and Genres</h4>
325
+
326
+ <ul>
327
+ <li>the user enjoys the genre of the absurd (e.g., Waiting for Godot)</li>
328
+ <li>the user liked the movies "Get Out", "The Matrix", "That Guy" (potentially "Vanilla Sky"), and "Limitless"</li>
329
+ <li>the user is interested in movies that explore the question of reality, such as "Inception"</li>
330
+ </ul>
331
+
332
+ <h4>Content Preferences Related to Israel</h4>
333
+
334
+ <ul>
335
+ <li>the user lives in Israel and follows geopolitical content</li>
336
+ <li>the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides</li>
337
+ <li>the user finds shows like "Fauda" too real and conflict-heavy</li>
338
+ <li>the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel</li>
339
+ </ul>
340
+
341
+ <h4>Recommendation Preferences</h4>
342
+
343
+ <ul>
344
+ <li>the user is interested in recommendations of recently released content</li>
345
+ <li>the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there</li>
346
+ </ul>
347
+
348
+ <hr>
349
+
350
+ <p><em>Captured on: 2025-10-26 21:25:43</em></p>
351
+ </div>
352
+ </div>
353
+
354
+ <div class="column">
355
+ <h3>JSON Output</h3>
356
+ <span class="badge">Machine-Readable</span>
357
+ <div class="output-box">{
358
+ "human_readable_name": "User's Entertainment Preferences - Israel-Based, Tired Parent",
359
+ "snake_case_filename": "user_entertainment_preferences_israel_tired_parent",
360
+ "context_data": "## Entertainment Preferences\n\n### General\n\n- the user prefers to watch content that is either based on a true story or is credible\n- the user is not a fan of science fiction\n- the user enjoys content with intriguing stories\n- the user occasionally enjoys horror movies\n- the user likes thoughtful content\n- the user is not a fan of alarmist commentary about technology, but is interested in themes surrounding the parameters of privacy\n- the user prefers movies that take their time rather than overload with special effects and violence\n- the user does not enjoy World War movies or historical movies that depict wars, battles, and conquests\n- the user likes comedy movies, but not rom-coms\n- the user likes obscure travel content\n- the user is interested in technology, artificial intelligence, cybersecurity, politics, hacking, surveillance, and intelligence\n\n### Formats and Platforms\n\n- the user enjoys Netflix documentary series, especially offbeat documentaries\n- the user appreciates the content produced by Vice\n\n### Specific Movies and Genres\n\n- the user enjoys the genre of the absurd (e.g., Waiting for Godot)\n- the user liked the movies \"Get Out\", \"The Matrix\", \"That Guy\" (potentially \"Vanilla Sky\"), and \"Limitless\"\n- the user is interested in movies that explore the question of reality, such as \"Inception\"\n\n### Content Preferences Related to Israel\n\n- the user lives in Israel and follows geopolitical content\n- the user prefers content that explores conflict with less emphasis on the military aspect, focusing instead on personal narratives and interpersonal connections across divides\n- the user finds shows like \"Fauda\" too real and conflict-heavy\n- the user is a new parent and tired, making it difficult to navigate the geo-specific content restrictions imposed by streaming providers in Israel\n\n### Recommendation Preferences\n\n- the user is interested in recommendations of recently released content\n- the user wants recommendations that take into account the user's geographic location in Israel, and exclude content that is difficult to access there\n",
361
+ "captured_on": "2025-10-26T21:25:43.200350"
362
+ }</div>
363
+ </div>
364
+ </div>
365
+ </div>
366
+
367
+ <div id="features" class="tab-content">
368
+ <h2>Key Features Demonstrated</h2>
369
+ <div class="columns">
370
+ <div class="column">
371
+ <h3>Intelligent Cleaning</h3>
372
+ <p>Removes filler words, tangents, and irrelevant information while preserving all meaningful context.</p>
373
+ </div>
374
+ <div class="column">
375
+ <h3>Structured Organization</h3>
376
+ <p>Automatically organizes information into logical categories and hierarchies.</p>
377
+ </div>
378
+ <div class="column">
379
+ <h3>Third-Person Conversion</h3>
380
+ <p>Transforms first-person narratives into third-person context data about "the user".</p>
381
+ </div>
382
+ <div class="column">
383
+ <h3>Multiple Formats</h3>
384
+ <p>Outputs both human-readable Markdown and machine-readable JSON formats.</p>
385
+ </div>
386
+ </div>
387
+ </div>
388
+ </div>
389
+ </div>
390
+
391
+ <script>
392
+ function switchTab(event, tabName) {
393
+ // Hide all tab contents
394
+ const tabContents = document.getElementsByClassName('tab-content');
395
+ for (let content of tabContents) {
396
+ content.classList.remove('active');
397
+ }
398
+
399
+ // Remove active class from all tabs
400
+ const tabs = document.getElementsByClassName('tab');
401
+ for (let tab of tabs) {
402
+ tab.classList.remove('active');
403
+ }
404
+
405
+ // Show the selected tab content
406
+ document.getElementById(tabName).classList.add('active');
407
+
408
+ // Add active class to the clicked tab
409
+ event.currentTarget.classList.add('active');
410
+ }
411
+ </script>
412
+ </body>
413
+ </html>
example-data/movie-prefs.opus ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:93e065151228d8b4386403cca36ecc4e120af22a2d54e6b7823baea72a033bd6
3
+ size 2514076
gemini_processor.py ADDED
@@ -0,0 +1,189 @@
1
+ """
2
+ Gemini API integration for processing audio and extracting context data.
3
+ """
4
+ import google.generativeai as genai
5
+ import json
6
+ from datetime import datetime
7
+ from typing import Dict, Tuple
8
+
9
+
10
+ def get_system_prompt(user_name: str = None) -> str:
11
+ """
12
+ Generate the system prompt for context extraction.
13
+
14
+ Args:
15
+ user_name: Optional name to use instead of "the user"
16
+
17
+ Returns:
18
+ System prompt string
19
+ """
20
+ user_reference = user_name if user_name else "the user"
21
+
22
+ return f"""You are a context extraction assistant. Your task is to analyze audio recordings where users provide personal context information and extract it in a clean, structured format.
23
+
24
+ ## Your Task
25
+
26
+ Extract context data from the user's audio recording. Context data refers to specific information about the user that can be used to ground AI inference for more personalized results.
27
+
28
+ ## Transformation Guidelines
29
+
30
+ 1. Remove irrelevant information (e.g., tangential conversations, notes to self)
31
+ 2. Remove duplicates and redundancy
32
+ 3. Reformat from first person to third person, referring to "{user_reference}"
33
+ 4. Organize information hierarchically with clear sections
34
+ 5. Present information in a clean, structured markdown format
35
+
36
+ ## Example Transformation
37
+
38
+ INPUT (raw audio transcript):
39
+ "Okay so ... let's document my health problems and the meds I take for this AI project ... ehm.. where do i start ... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh .. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah .. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
40
+
41
+ OUTPUT (cleaned context data):
42
+
43
+ ## Medical Conditions
44
+
45
+ - {user_reference} has had asthma since childhood
46
+ - {user_reference} has adult ADHD
47
+
48
+ ## Medication List
49
+
50
+ - {user_reference} takes Relvar, daily, for asthma
51
+ - {user_reference} takes Vyvanse 70mg, daily, for ADHD
52
+
53
+ ## Important Notes
54
+
55
+ Follow a careful hierarchical structure that allows additional context to be easily integrated later. Use clear section headers and bullet points for organization.
56
+
57
+ Now process the provided audio recording and extract the context data following these guidelines."""
58
+
59
+
60
+ def get_naming_prompt() -> str:
61
+ """Get the prompt for generating context data names."""
62
+ return """Based on the context data you just extracted, provide a JSON object with:
63
+ 1. human_readable_name: A clear, descriptive title for this context (e.g., "Medical History and Medications", "Movie Preferences")
64
+ 2. snake_case_filename: A snake_case version suitable for a filename (e.g., "medical_history_medications", "movie_preferences")
65
+
66
+ Respond ONLY with a valid JSON object in this exact format:
67
+ {
68
+ "human_readable_name": "Your Title Here",
69
+ "snake_case_filename": "your_filename_here"
70
+ }"""
71
+
72
+
73
+ def process_audio_with_gemini(
74
+ audio_file_path: str,
75
+ api_key: str,
76
+ user_name: str = None
77
+ ) -> Tuple[str, str, str]:
78
+ """
79
+ Process audio file with Gemini API to extract context data.
80
+
81
+ Args:
82
+ audio_file_path: Path to the audio file
83
+ api_key: Gemini API key
84
+ user_name: Optional user name for personalization
85
+
86
+ Returns:
87
+ Tuple of (context_markdown, human_readable_name, snake_case_filename)
88
+
89
+ Raises:
90
+ Exception: If API call fails
91
+ """
92
+ genai.configure(api_key=api_key)
93
+
94
+ # Use Gemini Pro 2.5 with audio understanding
95
+ model = genai.GenerativeModel('gemini-2.0-flash-exp')
96
+
97
+ # Upload the audio file
98
+ audio_file = genai.upload_file(audio_file_path)
99
+
100
+ # Generate context data
101
+ system_prompt = get_system_prompt(user_name)
102
+ response = model.generate_content([system_prompt, audio_file])
103
+ context_markdown = response.text
104
+
105
+ # Generate naming information
106
+ naming_response = model.generate_content([
107
+ context_markdown,
108
+ get_naming_prompt()
109
+ ])
110
+
111
+ # Parse the JSON response
112
+ try:
113
+ # Extract JSON from response (handle potential markdown code blocks)
114
+ naming_text = naming_response.text.strip()
115
+ if naming_text.startswith('```'):
116
+ # Remove markdown code block markers
117
+ lines = naming_text.split('\n')
118
+ naming_text = '\n'.join(lines[1:-1])
119
+
120
+ naming_data = json.loads(naming_text)
121
+ human_readable_name = naming_data['human_readable_name']
122
+ snake_case_filename = naming_data['snake_case_filename']
123
+ except (json.JSONDecodeError, KeyError) as e:
124
+ # Fallback to generic naming if parsing fails
125
+ human_readable_name = "Context Data"
126
+ snake_case_filename = "context_data"
127
+
128
+ return context_markdown, human_readable_name, snake_case_filename
129
+
130
+
131
+ def create_markdown_file(
132
+ context_markdown: str,
133
+ human_readable_name: str,
134
+ snake_case_filename: str
135
+ ) -> Tuple[str, str]:
136
+ """
137
+ Create a formatted markdown file content.
138
+
139
+ Args:
140
+ context_markdown: The extracted context data
141
+ human_readable_name: Human readable title
142
+ snake_case_filename: Filename
143
+
144
+ Returns:
145
+ Tuple of (filename, content)
146
+ """
147
+ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
148
+
149
+ content = f"""## {human_readable_name}
150
+
151
+ {context_markdown}
152
+
153
+ ---
154
+
155
+ Captured on: {timestamp}
156
+ """
157
+
158
+ filename = f"{snake_case_filename}.md"
159
+ return filename, content
160
+
161
+
162
+ def create_json_file(
163
+ context_markdown: str,
164
+ human_readable_name: str,
165
+ snake_case_filename: str
166
+ ) -> Tuple[str, str]:
167
+ """
168
+ Create a JSON file content.
169
+
170
+ Args:
171
+ context_markdown: The extracted context data
172
+ human_readable_name: Human readable title
173
+ snake_case_filename: Filename
174
+
175
+ Returns:
176
+ Tuple of (filename, json_content)
177
+ """
178
+ timestamp = datetime.now().isoformat()
179
+
180
+ data = {
181
+ "human_readable_name": human_readable_name,
182
+ "snake_case_filename": snake_case_filename,
183
+ "context_data": context_markdown,
184
+ "captured_on": timestamp
185
+ }
186
+
187
+ filename = f"{snake_case_filename}.json"
188
+ json_content = json.dumps(data, indent=2)
189
+ return filename, json_content
generate_demo.py ADDED
@@ -0,0 +1,68 @@
1
+ """
2
+ Generate demo results by processing the example audio file.
3
+ """
4
+ import os
5
+ from pathlib import Path
6
+ from gemini_processor import (
7
+ process_audio_with_gemini,
8
+ create_markdown_file,
9
+ create_json_file
10
+ )
11
+ from dotenv import load_dotenv
12
+
13
+ # Load environment variables
14
+ load_dotenv()
15
+
16
+ def main():
17
+ # Get API key from environment
18
+ api_key = os.getenv('GEMINI_API')
19
+ if not api_key:
20
+ raise ValueError("GEMINI_API not found in .env file")
21
+
22
+ # Path to example audio
23
+ audio_path = "example-data/movie-prefs.opus"
24
+
25
+ print(f"Processing {audio_path}...")
26
+
27
+ # Process with Gemini (using "user" identification)
28
+ context_markdown, human_readable_name, snake_case_filename = process_audio_with_gemini(
29
+ audio_path,
30
+ api_key,
31
+ user_name=None # Use "the user" format
32
+ )
33
+
34
+ print(f"Extracted context: {human_readable_name}")
35
+
36
+ # Create output files
37
+ md_filename, md_content = create_markdown_file(
38
+ context_markdown,
39
+ human_readable_name,
40
+ snake_case_filename
41
+ )
42
+
43
+ json_filename, json_content = create_json_file(
44
+ context_markdown,
45
+ human_readable_name,
46
+ snake_case_filename
47
+ )
48
+
49
+ # Create demo-results directory
50
+ demo_dir = Path("demo-results")
51
+ demo_dir.mkdir(exist_ok=True)
52
+
53
+ # Write files
54
+ md_path = demo_dir / md_filename
55
+ json_path = demo_dir / json_filename
56
+
57
+ with open(md_path, 'w') as f:
58
+ f.write(md_content)
59
+ print(f"Saved: {md_path}")
60
+
61
+ with open(json_path, 'w') as f:
62
+ f.write(json_content)
63
+ print(f"Saved: {json_path}")
64
+
65
+ print("\nDemo results generated successfully!")
66
+
67
+ if __name__ == "__main__":
68
+ main()
requirements.txt ADDED
@@ -0,0 +1,3 @@
1
+ gradio==5.9.1
2
+ google-generativeai==0.8.3
3
+ python-dotenv==1.0.1
run.sh ADDED
@@ -0,0 +1,55 @@
1
+ #!/bin/bash
2
+ # Context Cruncher - Launch Script
3
+ # Wrapper script to easily launch the Context Cruncher application
4
+
5
+ set -e # Exit on error
6
+
7
+ echo "🎙️ Context Cruncher - Launch Script"
8
+ echo "===================================="
9
+ echo ""
10
+
11
+ # Check if virtual environment exists
12
+ if [ ! -d ".venv" ]; then
13
+ echo "❌ Virtual environment not found!"
14
+ echo "Creating virtual environment with uv..."
15
+ uv venv
16
+ echo "✅ Virtual environment created"
17
+ echo ""
18
+ fi
19
+
20
+ # Activate virtual environment
21
+ echo "📦 Activating virtual environment..."
22
+ source .venv/bin/activate
23
+
24
+ # Check if dependencies are installed
25
+ if ! python -c "import gradio" 2>/dev/null; then
26
+ echo "📥 Installing dependencies..."
27
+ uv pip install -r requirements.txt
28
+ echo "✅ Dependencies installed"
29
+ echo ""
30
+ fi
31
+
32
+ # Check if .env file exists
33
+ if [ ! -f ".env" ]; then
34
+ echo "⚠️ Warning: .env file not found!"
35
+ echo "Please create a .env file with your Gemini API key."
36
+ echo "You can copy .env.example and add your key:"
37
+ echo ""
38
+ echo " cp .env.example .env"
39
+ echo " # Then edit .env and add your GEMINI_API key"
40
+ echo ""
41
+ read -p "Do you want to continue anyway? (y/n) " -n 1 -r
42
+ echo ""
43
+ if [[ ! $REPLY =~ ^[Yy]$ ]]; then
44
+ exit 1
45
+ fi
46
+ fi
47
+
48
+ # Launch the application
49
+ echo "🚀 Launching Context Cruncher..."
50
+ echo "The app will open in your browser at http://localhost:7860"
51
+ echo ""
52
+ echo "Press Ctrl+C to stop the server"
53
+ echo ""
54
+
55
+ python app.py