MinerU OCR
A data extraction tool to convert PDF to Markdown and JSON
OpenDataLab provides high-quality open datasets and tools for large models. It is the designated open-source data service platform of the China Large Model Corpus Data Alliance.
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
AICC: Parse HTML Finer, Make Models Better -- A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser
Demo for TRivia
Evaluate formula recognition accuracy
Demo for DocLayout-YOLO
Recognize math equations from images