LingoEDU-4B

πŸ“œ Paper | πŸ’» GitHub Repo

πŸ“– Model Introduction

LingoEDU-4B is a specialization of Qwen/Qwen3-4B for document structure analysis.

With this model, we transform a linear discourse sequence into a condensed hierarchical tree in which every node is strictly anchored to the source text via coordinate pointers.
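To make the idea concrete, here is a purely hypothetical sketch; the actual input/output schema is defined by the paper and repository, not by this example:

```python
# Hypothetical illustration only; the real format comes from the
# DeepLangAI/LingoEDU repository, not from this sketch.

# A sentence-segmented article: each sentence's index is its coordinate.
sentences = [
    "Deep learning has transformed NLP.",    # 0
    "Transformers are a key ingredient.",    # 1
    "Attention lets models weigh context.",  # 2
]

# A condensed hierarchical tree: nodes carry coordinate pointers back into
# `sentences` instead of copied (and potentially hallucinated) text.
tree = {"node": [0], "children": [{"node": [1, 2]}]}
```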

πŸ“Š Performance on StructBench

| Method | Type | TED (Structure) ↓ | DLA (Accuracy) ↑ | Cost ($/doc) ↓ | Latency (s) ↓ |
| --- | --- | --- | --- | --- | --- |
| GPT-4o | General LLM* | 8.53 | 36.29% | 0.0210 | - |
| GPT-4.1 | General LLM* | 9.14 | 37.90% | 0.0168 | - |
| OpenAI o3 | General LLM* | 8.01 | 35.48% | 0.0168 | - |
| OpenAI o4-mini | General LLM* | 8.45 | 36.29% | 0.0092 | - |
| Claude-3.7-Sonnet | General LLM* | 9.98 | 35.48% | 0.0286 | - |
| Claude-4 | General LLM* | 7.98 | 41.53% | 0.0286 | - |
| Gemini-2.5-flash | General LLM* | 8.12 | 33.74% | 0.0040 | - |
| Gemini-2.5-pro | General LLM* | 8.15 | 35.89% | 0.0162 | - |
| DeepSeek-V3 | General LLM* | 9.12 | 34.68% | 0.0012 | - |
| DeepSeek-R1 | General LLM* | 8.44 | 35.08% | 0.0046 | - |
| Qwen3-32B | General LLM* | 8.55 | 34.01% | 0.0012 | 10.17† |
| Qwen3-235B | General LLM* | 9.81 | 27.02% | 0.0012 | - |
| Jina-Reader | Parser API | 17.04 | - | 0.0004 | - |
| Firecrawl | Parser API | 16.81 | - | 0.0007 | - |
| **Our Method (LingoEDU)** | Specialized | **5.67** | **46.77%** | 0.0007 | 1.20† |

πŸš€ Quickstart

Get the system prompt, article input, and guidance grammar

  • System prompt: a fixed system prompt
  • Article input: an input string built from the sentence-segmented article
  • Guidance grammar: a Lark grammar built from the sentence-segmented article (a toy sketch follows below)

See our GitHub repository DeepLangAI/LingoEDU for the code that builds all three.
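For intuition only, here is a minimal, invented sketch of what such a grammar could look like; it is not the grammar shipped in the repository. The real grammar is generated per article, but the principle is the same: the grammar only admits tree nodes paired with sentence indices that actually exist in the input.

```python
# Toy Lark-style grammar for a hypothetical 3-sentence article (indices 0-2).
# NOT the grammar from DeepLangAI/LingoEDU -- illustration only.
guidance_grammar = r"""
start: line+
line: LEVEL " " INDEX "\n"
LEVEL: "#" | "##" | "###"   // heading depth of the tree node
INDEX: "0" | "1" | "2"      // coordinate pointer into the source sentences
"""
```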

Generate with vLLM

```python
from vllm import LLM
from vllm.config import StructuredOutputsConfig
from vllm.sampling_params import SamplingParams, StructuredOutputsParams

if __name__ == "__main__":

    model_name = "deeplang-ai/LingoEDU-4B"

    # Use the "guidance" backend so generation can be constrained
    # by a Lark grammar.
    vllm_llm = LLM(
        model=model_name,
        structured_outputs_config=StructuredOutputsConfig(backend="guidance"),
    )
    tokenizer = vllm_llm.get_tokenizer()

    # Build these three inputs with the tooling in DeepLangAI/LingoEDU.
    system_prompt = ...
    article_input = ...
    guidance_grammar = ...

    # Render the chat messages into a single prompt string;
    # thinking mode is disabled for this task.
    prompt_str = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": article_input},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

    # Greedy, grammar-constrained decoding: the grammar restricts the
    # output to well-formed structures over valid sentence coordinates.
    output = vllm_llm.generate(
        prompts=[prompt_str],
        sampling_params=SamplingParams(
            temperature=0.0,
            top_k=-1,
            top_p=1.0,
            max_tokens=8 * 1024,
            skip_special_tokens=False,
            n=1,
            structured_outputs=StructuredOutputsParams(grammar=guidance_grammar),
        ),
    )[0].outputs[0].text
```
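Two details are worth noting: temperature=0.0 makes decoding greedy and reproducible, and because the grammar is built from the sentence-segmented article, the raw output is constrained to reference only coordinates present in the input, so no repair or re-prompting step should be needed before parsing it back into a tree.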

πŸ“Œ Limitations

  • Not fine-tuned for general chat.
  • Handles only text-based documents; no multimodal input.

πŸ“ Citation

If you find our work helpful, please consider citing us.

@misc{zhou2025contextedusfaithfulstructured,
      title={From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition}, 
      author={Yiqing Zhou and Yu Lei and Shuzheng Si and Qingyan Sun and Wei Wang and Yifei Wu and Hao Wen and Gang Chen and Fanchao Qi and Maosong Sun},
      year={2025},
      eprint={2512.14244},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.14244}, 
}