LingoEDU-4B

πŸ“œ Paper | πŸ’» GitHub Repo

πŸ“– Model Introduction

LingoEDU-4B is a specialization of Qwen/Qwen3-4B for document structure analysis.

With this model, we transform a linear discourse sequence into a condensed hierarchical tree in which every node is strictly anchored to the source text via coordinate pointers.
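To make the idea concrete, here is a purely hypothetical sketch; the actual input/output schema is defined by the paper and repository, not by this example:

```python
# Hypothetical illustration only; the real format comes from the
# DeepLangAI/LingoEDU repository, not from this sketch.

# A sentence-segmented article: each sentence's index is its coordinate.
sentences = [
    "Deep learning has transformed NLP.",    # 0
    "Transformers are a key ingredient.",    # 1
    "Attention lets models weigh context.",  # 2
]

# A condensed hierarchical tree: nodes carry coordinate pointers back into
# `sentences` instead of copied (and potentially hallucinated) text.
tree = {"node": [0], "children": [{"node": [1, 2]}]}
```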

πŸ“Š Performance on StructBench

| Method | Type | TED (Structure) ↓ | DLA (Accuracy) ↑ | Cost ($/doc) ↓ | Latency (s) ↓ |
| --- | --- | --- | --- | --- | --- |
| GPT-4o | General LLM* | 8.53 | 36.29% | 0.0210 | - |
| GPT-4.1 | General LLM* | 9.14 | 37.90% | 0.0168 | - |
| OpenAI o3 | General LLM* | 8.01 | 35.48% | 0.0168 | - |
| OpenAI o4-mini | General LLM* | 8.45 | 36.29% | 0.0092 | - |
| Claude-3.7-Sonnet | General LLM* | 9.98 | 35.48% | 0.0286 | - |
| Claude-4 | General LLM* | 7.98 | 41.53% | 0.0286 | - |
| Gemini-2.5-flash | General LLM* | 8.12 | 33.74% | 0.0040 | - |
| Gemini-2.5-pro | General LLM* | 8.15 | 35.89% | 0.0162 | - |
| DeepSeek-V3 | General LLM* | 9.12 | 34.68% | 0.0012 | - |
| DeepSeek-R1 | General LLM* | 8.44 | 35.08% | 0.0046 | - |
| Qwen3-32B | General LLM* | 8.55 | 34.01% | 0.0012 | 10.17† |
| Qwen3-235B | General LLM* | 9.81 | 27.02% | 0.0012 | - |
| Jina-Reader | Parser API | 17.04 | - | 0.0004 | - |
| Firecrawl | Parser API | 16.81 | - | 0.0007 | - |
| **Our Method (LingoEDU)** | Specialized | **5.67** | **46.77%** | 0.0007 | 1.20† |

πŸš€ Quickstart

Get the system prompt, article input, and guidance grammar

  • System prompt: a fixed system prompt
  • Article input: an input string built from the sentence-segmented article
  • Guidance grammar: a Lark grammar built from the sentence-segmented article (a toy sketch follows below)

See our GitHub repository DeepLangAI/LingoEDU for the code that builds all three.
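For intuition only, here is a minimal, invented sketch of what such a grammar could look like; it is not the grammar shipped in the repository. The real grammar is generated per article, but the principle is the same: the grammar only admits tree nodes paired with sentence indices that actually exist in the input.

```python
# Toy Lark-style grammar for a hypothetical 3-sentence article (indices 0-2).
# NOT the grammar from DeepLangAI/LingoEDU -- illustration only.
guidance_grammar = r"""
start: line+
line: LEVEL " " INDEX "\n"
LEVEL: "#" | "##" | "###"   // heading depth of the tree node
INDEX: "0" | "1" | "2"      // coordinate pointer into the source sentences
"""
```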

Generate with vLLM

```python
from vllm import LLM
from vllm.config import StructuredOutputsConfig
from vllm.sampling_params import SamplingParams, StructuredOutputsParams

if __name__ == "__main__":

    model_name = "deeplang-ai/LingoEDU-4B"

    # Use the "guidance" backend so generation can be constrained
    # by a Lark grammar.
    vllm_llm = LLM(
        model=model_name,
        structured_outputs_config=StructuredOutputsConfig(backend="guidance"),
    )
    tokenizer = vllm_llm.get_tokenizer()

    # Build these three inputs with the tooling in DeepLangAI/LingoEDU.
    system_prompt = ...
    article_input = ...
    guidance_grammar = ...

    # Render the chat messages into a single prompt string;
    # thinking mode is disabled for this task.
    prompt_str = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": article_input},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

    # Greedy, grammar-constrained decoding: the grammar restricts the
    # output to well-formed structures over valid sentence coordinates.
    output = vllm_llm.generate(
        prompts=[prompt_str],
        sampling_params=SamplingParams(
            temperature=0.0,
            top_k=-1,
            top_p=1.0,
            max_tokens=8 * 1024,
            skip_special_tokens=False,
            n=1,
            structured_outputs=StructuredOutputsParams(grammar=guidance_grammar),
        ),
    )[0].outputs[0].text
```
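Two details are worth noting: temperature=0.0 makes decoding greedy and reproducible, and because the grammar is built from the sentence-segmented article, the raw output is constrained to reference only coordinates present in the input, so no repair or re-prompting step should be needed before parsing it back into a tree.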

πŸ“Œ Limitations

  • Not fine-tuned for general chat.
  • Handles only text-based documents; no multimodal input.

πŸ“ Citation

If you find our work helpful, please consider citing us.

@misc{zhou2025contextedusfaithfulstructured,
      title={From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition}, 
      author={Yiqing Zhou and Yu Lei and Shuzheng Si and Qingyan Sun and Wei Wang and Yifei Wu and Hao Wen and Gang Chen and Fanchao Qi and Maosong Sun},
      year={2025},
      eprint={2512.14244},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.14244}, 
}