arxiv:2602.14492

Query as Anchor: Scenario-Adaptive User Representation via Large Language Model

Published on Feb 16 · Submitted by Jiahao Yuan on Feb 17
Abstract

A novel framework called Query-as-Anchor is introduced that transforms user modeling from static encoding to dynamic, query-aware synthesis using large language models with specialized architectures and training methods.

AI-generated summary

Industrial-scale user representation learning requires balancing robust universality with acute task-sensitivity. However, existing paradigms primarily yield static, task-agnostic embeddings that struggle to reconcile the divergent requirements of downstream scenarios within unified vector spaces. Furthermore, heterogeneous multi-source data introduces inherent noise and modality conflicts, degrading representation quality. We propose Query-as-Anchor, a framework shifting user modeling from static encoding to dynamic, query-aware synthesis. To empower Large Language Models (LLMs) with deep user understanding, we first construct UserU, an industrial-scale pre-training dataset that aligns multi-modal behavioral sequences with user understanding semantics, and our Q-Anchor Embedding architecture integrates hierarchical coarse-to-fine encoders into dual-tower LLMs via joint contrastive-autoregressive optimization for query-aware user representation. To bridge the gap between general pre-training and specialized business logic, we further introduce Cluster-based Soft Prompt Tuning to enforce discriminative latent structures, effectively aligning model attention with scenario-specific modalities. For deployment, anchoring queries at sequence termini enables KV-cache-accelerated inference with negligible incremental latency. Evaluations on 10 Alipay industrial benchmarks show consistent SOTA performance, strong scalability, and efficient deployment. Large-scale online A/B testing in Alipay's production system across two real-world scenarios further validates its practical effectiveness. Our code is prepared for public release and will be available at: https://github.com/JhCircle/Q-Anchor.

Community


Q-Anchor is a query-conditioned user representation framework that transforms static user embeddings into dynamic, scenario-adaptive representations using Large Language Models (LLMs).

Instead of producing fixed task-agnostic embeddings, Q-Anchor introduces Query-as-Anchor, a mechanism that re-anchors the same user behavior profile under different downstream objectives via natural language queries. This enables a single model to serve multiple business scenarios without retraining.
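The re-anchoring idea can be illustrated with a minimal sketch (not the paper's released code; `toy_embed` and `anchor` are illustrative stand-ins for the LLM encoder): the same behavior sequence, with a different natural-language query appended at the sequence terminus, yields a different scenario-specific vector.

```python
# Toy sketch of Query-as-Anchor: the query is appended at the end of the
# user behavior sequence, so the pooled representation is query-conditioned.
import hashlib

DIM = 8

def toy_embed(token: str) -> list[float]:
    """Deterministic stand-in for an LLM token embedding."""
    digest = hashlib.sha256(token.encode()).digest()
    return [b / 255.0 for b in digest[:DIM]]

def anchor(user_events: list[str], query: str) -> list[float]:
    """Mean-pool embeddings of the behavior sequence plus the query anchor."""
    tokens = user_events + [query]  # query sits at the sequence terminus
    vecs = [toy_embed(t) for t in tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

events = ["pay:coffee", "search:flights", "app:maps"]
risk_vec = anchor(events, "query: is this user a credit risk?")
mkt_vec = anchor(events, "query: will this user redeem a travel coupon?")
assert risk_vec != mkt_vec  # same user, different scenario-specific vectors
```

A real model would replace `toy_embed` with the LLM's token embeddings and mean pooling with the dual-tower encoder described in the paper; the point is only that the query changes the representation while the user prefix stays fixed.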

🔑 Key Features

  • Dynamic Query-Aware Embeddings
    Generates scenario-specific user representations conditioned on natural language queries.

  • Hierarchical Multi-Modal Encoder
    Integrates heterogeneous behavioral logs (transactions, app usage, search, navigation, tabular features) into a coarse-to-fine structure aligned with LLM latent space.

  • UserU Pretraining Dataset (100M+ samples)
    Combines future-behavior-prediction supervision with reflection-verified, LLM-synthesized user QA pairs to inject both temporal dynamics and semantic understanding.

  • Joint Contrastive + Generative Training
    Aligns user embeddings with semantic targets while preserving token-level grounding.

  • Lightweight Soft Prompt Tuning
    Enables efficient scenario specialization without modifying backbone weights.

  • KV-Cache Optimized Inference
    User prefixes are encoded once and reused across multiple queries, enabling low-latency multi-scenario deployment.
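The KV-cache deployment trick above can be sketched as follows (an assumed toy interface, not the paper's serving code; `PrefixCache` and its methods are hypothetical): the user-behavior prefix is encoded once and cached, so each additional query only pays for its own few suffix tokens.

```python
# Toy sketch of KV-cache-style prefix reuse: encode the long user prefix
# once, then serve many short query suffixes against the cached state.
class PrefixCache:
    def __init__(self) -> None:
        self.kv: dict[str, tuple] = {}  # user_id -> stand-in for cached K/V tensors
        self.prefix_encodes = 0         # counts expensive prefix passes

    def encode_prefix(self, user_id: str, events: list[str]) -> tuple:
        """Run the expensive prefix pass only on a cache miss."""
        if user_id not in self.kv:
            self.prefix_encodes += 1
            self.kv[user_id] = tuple(events)
        return self.kv[user_id]

    def represent(self, user_id: str, events: list[str], query: str) -> tuple:
        """Per-query cost is just the short query suffix."""
        prefix_state = self.encode_prefix(user_id, events)
        return (prefix_state, query)

cache = PrefixCache()
events = ["pay:coffee", "search:flights"]
cache.represent("u1", events, "query: engagement propensity?")
cache.represent("u1", events, "query: credit risk?")
assert cache.prefix_encodes == 1  # prefix encoded once, reused across queries
```

In an actual LLM server the cached value would be the transformer's key/value tensors for the prefix tokens; the dictionary here just makes the once-per-user cost visible.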

📊 Performance

Evaluated on 10 large-scale industrial benchmarks (Engagement, Risk, Marketing):

  • SOTA AUC & KS across all domains
  • +9.8% AUC improvement over strong general embedding baselines
  • Consistent gains validated via large-scale online A/B testing

Q-Anchor bridges the gap between sparse behavioral logs and LLM-level semantic understanding, enabling scalable, interpretable, and transferable user embeddings for industrial applications.


