# ⚡ Caching System - Instant Playback for Examples
## Overview
The Science Storyteller now includes an intelligent caching system that provides instant playback for pre-generated example topics. This dramatically improves user experience while maintaining full transparency.
## How It Works
### For Users
- ⚡ Click any example topic → instant playback (<1 second)
- ✏️ Enter a custom topic → fresh generation (~1.5 minutes)
- 🔍 Transparency → the status clearly shows whether a result is cached or freshly generated
### For Developers
#### Cache Architecture

```
cache/
├── metadata.json       # topic → files mapping
├── {hash}_summary.txt  # cached summaries
├── {hash}_script.txt   # cached scripts
└── {hash}_paper.txt    # cached paper info
```

(Audio files are stored in `assets/audio/`.)
#### Cache Key Generation

- Topics are normalized (lowercased, stripped of surrounding whitespace)
- An MD5 hash of the normalized topic gives a consistent lookup key
- Example: `"AlphaFold 3"` → `a1b2c3d4e5f6...`
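The normalization and hashing described above can be sketched as follows (the helper name `cache_key` is hypothetical; the real logic lives in `utils/cache_manager.py`):

```python
import hashlib

def cache_key(topic: str) -> str:
    """Normalize a topic, then hash it for a stable cache lookup key."""
    normalized = topic.strip().lower()  # lowercased, stripped, as described above
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Differently formatted spellings of the same topic share one key:
print(cache_key("AlphaFold 3") == cache_key("  alphafold 3  "))  # True
```

Because the key depends only on the normalized string, repeat lookups for the same topic always hit the same cache entry.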
## Implementation Details

### Files Added/Modified
**`utils/cache_manager.py`** (NEW)
- `CacheManager` class handles all caching logic
- Methods: `get()`, `set()`, `has()`, `list_cached()`
- Stores metadata in `cache/metadata.json`

**`app.py`** (MODIFIED)
- Cache check at the start of `process_topic()`
- Instant return if the topic is cached
- Auto-caches new generations for future use
- Status messages show a cached/fresh indicator

**`generate_cache.py`** (NEW)
- Pre-generates all 8 example topics
- Runs once to populate the cache
- Takes ~15-20 minutes for all examples

**`test_cache.py`** (NEW)
- Tests cache hit/miss scenarios
- Measures the speedup (typically ~90 s → <1 s)
- Validates the transparency indicators
## UI Changes

**Header:**

> ⚡ Click examples below for instant playback (pre-generated) | Custom topics take ~1.5 min to generate

**Status messages:**

- Cached: `⚡ Instant result from cache! ... [Pre-generated example]`
- Fresh: `✅ Success! Generated ... [Freshly generated]`
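The cached/fresh indicator could be produced by a small helper along these lines (hypothetical; the exact wording in `app.py` may differ):

```python
def status_message(cached: bool, detail: str) -> str:
    """Build the status line with an explicit transparency label."""
    if cached:
        return f"⚡ Instant result from cache! {detail} [Pre-generated example]"
    return f"✅ Success! Generated {detail} [Freshly generated]"

print(status_message(True, "AlphaFold 3 podcast"))
```

Keeping the label in one helper ensures every code path that returns a result also reports its provenance.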
## Usage

### Generate Pre-cached Examples

```shell
# Run once to pre-generate all 8 examples
uv run --env-file .env python generate_cache.py
```
This will:
- Generate podcasts for all example topics
- Store them in the `cache/` directory
- Enable instant playback for examples
### Test Cache Performance

```shell
# Test cache functionality
uv run --env-file .env python test_cache.py
```
Expected results:
- Fresh generation: ~90-120 seconds
- Cached retrieval: <1 second
- Speedup: ~100x faster
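A speedup figure like the one `test_cache.py` reports can be taken with `time.perf_counter`; the sketch below is a generic timing helper under that assumption, not the script's actual code:

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical comparison -- first call generates, second call hits the cache:
# _, fresh_s = timed(process_topic, "quantum entanglement")   # ~90-120 s
# _, cached_s = timed(process_topic, "quantum entanglement")  # <1 s
# print(f"Speedup: ~{fresh_s / cached_s:.0f}x")
```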
### Check Cached Topics

```python
from utils.cache_manager import CacheManager

cache = CacheManager()
print(cache.list_cached())
# Output: ['AlphaFold 3', 'transformer neural network', ...]
```
## Performance Impact

| Scenario | Without Cache | With Cache | Improvement |
|---|---|---|---|
| Example topics | ~90 seconds | <1 second | ~100x faster ⚡ |
| Custom topics | ~90 seconds | ~90 seconds | Same (first time) |
| Repeat custom | ~90 seconds | <1 second | ~100x faster ⚡ |
## Transparency Features

### User Visibility

- ✅ Clear indicators in the status message
- ✅ Different emojis (⚡ vs ✅)
- ✅ Explicit labels (`[Pre-generated example]` vs `[Freshly generated]`)
### Why This Matters
- Users understand what they're getting
- Meets hackathon requirement for transparency
- Builds trust with clear communication
- No "hidden magic"
## Technical Benefits
- Instant UX - Examples load immediately
- Cost Savings - Reduces repeated API calls
- Reliability - Examples always work
- Scalability - Easy to add more cached topics
- Transparency - Users know the difference
## Example Topics (Pre-cached)
- AlphaFold 3
- transformer neural network
- mRNA vaccine technology
- tuberculosis vaccine
- CRISPR/Cas9 gene editing applications
- 3I/ATLAS
- quantum entanglement
- Ocean acidification
## Maintenance

### Adding More Cached Topics

1. Edit `generate_cache.py`:

   ```python
   EXAMPLE_TOPICS = [
       # ... existing topics
       "your new topic here",
   ]
   ```

2. Run the generator:

   ```shell
   uv run --env-file .env python generate_cache.py
   ```

3. Update the UI examples in `app.py` to match
### Clearing Cache

```shell
# Remove all cached content
rm -rf cache/
```

The cache will be regenerated on the next `generate_cache.py` run.
## Cache Storage

- Metadata: `cache/metadata.json` (~1 KB per topic)
- Text files: ~5-10 KB per topic
- Audio files: ~5-15 MB per topic (WAV format)
- Total per topic: ~5-15 MB
- 8 examples: ~40-120 MB total
## Future Enhancements
Possible improvements (post-hackathon):
- LRU Cache - Auto-remove old entries
- Compression - Convert WAV to MP3 (~10x smaller)
- CDN Integration - Serve from edge locations
- Smart Preloading - Predict likely user queries
- Cache Analytics - Track hit/miss rates
## Troubleshooting

### Cache not working?

```shell
# Check if metadata exists
cat cache/metadata.json

# Verify files exist
ls -lh cache/

# Test cache manager
uv run --env-file .env python test_cache.py
```
### Slow cache retrieval?
- Check file system performance
- Verify audio file exists at path
- Review logs for errors
### Inconsistent results?
- Clear cache and regenerate
- Check topic normalization (case, whitespace)
- Verify hash generation
## Summary

- ✅ **Implemented:** Full caching system with transparency
- ✅ **Performance:** ~100x faster for cached topics
- ✅ **UX:** Clear indicators for cached vs fresh content
- ✅ **Hackathon Ready:** Meets all requirements
The caching system provides the best of both worlds:
- Speed for common queries (examples)
- Freshness for custom topics
- Transparency for all users