How to Take a Memorable Picture? Empowering Users with Actionable Feedback
Abstract
Memorability Feedback enables users to improve photo memorability through natural language guidance by leveraging multimodal large language models and a teacher-student steering strategy.
Image memorability, i.e., how likely an image is to be remembered, has traditionally been studied in computer vision either as a passive prediction task, with models regressing a scalar score, or with generative methods that alter the visual input to boost an image's likelihood of being remembered. Yet neither paradigm supports users at capture time, when the crucial question is how to improve a photo's memorability. We introduce the task of Memorability Feedback (MemFeed), in which an automated model provides actionable, human-interpretable guidance to users with the goal of enhancing an image's future recall. We also present MemCoach, the first approach designed to provide concrete suggestions in natural language for memorability improvement (e.g., "emphasize facial expression," "bring the subject forward"). Our method, based on Multimodal Large Language Models (MLLMs), is training-free and employs a teacher-student steering strategy, aligning the model's internal activations toward more memorable patterns learned from a teacher model progressing along least-to-most memorable samples. To enable systematic evaluation on this novel task, we further introduce MemBench, a new benchmark featuring sequence-aligned photoshoots with annotated memorability scores. Our experiments across multiple MLLMs demonstrate the effectiveness of MemCoach, showing consistent improvements over several zero-shot models. The results indicate that memorability can not only be predicted but also taught and instructed, shifting the focus from mere prediction to actionable feedback for human creators.
Community
[CVPR 2026] MemCoach: Actionable Memorability Feedback via MLLM Steering
MemCoach is a framework designed to bridge the gap between image memorability scoring and practical image improvement. Rather than providing a simple numerical score, it leverages MLLM steering to generate human-interpretable feedback on how to make an image more memorable.
Core Components
- MemFeed Formalization: The first work to define Memorability Feedback (MemFeed): generating actionable, natural-language guidance to improve image memorability at capture time.
- Contrastive Activation Steering: A training-free framework that injects memorability-aware behavior into MLLMs (e.g., Qwen2.5-VL, InternVL3.5) by distilling directions from teacher-aware vs. student-neutral activations.
- MemBench & Evaluation: Release of a dedicated benchmark and a new evaluation protocol that combines editing-based metrics with perplexity-based feedback alignment.
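The contrastive activation steering idea above can be sketched as a difference of mean hidden activations between "teacher" (memorability-aware) and "student" (neutral) prompts, added back during inference. This is a minimal toy illustration of the general technique, not MemCoach's actual implementation: the function names, toy dimensions, and `alpha` scale are assumptions for demonstration.

```python
import numpy as np

def steering_vector(teacher_acts: np.ndarray, student_acts: np.ndarray) -> np.ndarray:
    """Contrastive direction: mean of memorability-aware (teacher) activations
    minus mean of neutral (student) activations, per hidden dimension."""
    return teacher_acts.mean(axis=0) - student_acts.mean(axis=0)

def apply_steering(hidden: np.ndarray, v: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Shift a hidden state along the steering direction at inference time.
    In an MLLM this would typically be done with a forward hook on a chosen layer."""
    return hidden + alpha * v

# Toy activations (samples x hidden_dim). In the real setting these would be
# intermediate-layer activations collected on least-to-most memorable samples.
rng = np.random.default_rng(0)
teacher = rng.normal(1.0, 0.1, size=(8, 4))   # memorability-aware runs
student = rng.normal(0.0, 0.1, size=(8, 4))   # neutral runs

v = steering_vector(teacher, student)          # direction toward "memorable" behavior
h = np.zeros(4)                                # stand-in for a hidden state
h_steered = apply_steering(h, v, alpha=0.5)    # nudged generation state
```

Because the approach only adds a precomputed vector to activations, it needs no gradient updates, which is consistent with the training-free framing described above.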
Summary
This research project provides the tools to move beyond "black-box" memorability scores toward a more transparent, instructional approach for creative and photographic assistance.
GitHub: https://github.com/laitifranz/MemCoach
Project Page: https://laitifranz.github.io/MemCoach/
Paper: https://arxiv.org/abs/2602.21877