One Model for Two Tasks: IRSum Combines Summarization and Retrieval

By: Prof. Dr. Kai Eckert | Thu, 10 Jul 2025

New research demonstrates an efficient, unified approach to document summarization and retrieval in large-scale systems

We are pleased to announce the publication of “IRSum: One Model to Rule Summarization and Retrieval” at the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), presented by Sotaro Takeshita, Simone Paolo Ponzetto, and Kai Eckert.

Addressing System Redundancy

Many modern applications that manage large document collections provide both summarization and retrieval functionalities to help users efficiently digest vast amounts of information. Such systems currently run two separate task-specific models over the same set of documents, which duplicates computation and increases resource requirements.

The IRSum Solution

The research team developed IRSum, an approach that reuses the hidden representations produced during summary generation for retrieval. Because a single model serves both tasks, there is no need to run two separate systems, which reduces computation and simplifies the system architecture.
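To make the idea of reusing hidden representations concrete, here is a minimal sketch in plain Python. It is not the IRSum implementation: the mean-pooling step, the cosine-similarity ranking, the function names, and the toy two-dimensional vectors are all assumptions chosen for illustration. It only shows how per-token hidden states that a summarizer has already computed could be turned into one embedding per document and then used for retrieval without a second model.

```python
import math

def mean_pool(hidden_states):
    """Average per-token hidden vectors into one fixed-size embedding.
    Mean pooling is an assumption here, not necessarily what IRSum uses."""
    dim = len(hidden_states[0])
    count = len(hidden_states)
    return [sum(vec[i] for vec in hidden_states) / count for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_documents(query_embedding, doc_embeddings):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine(query_embedding, emb)
              for doc_id, emb in doc_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical hidden states, imagined as cached while each summary was generated.
cached_hidden_states = {
    "doc_a": [[1.0, 0.0], [0.8, 0.2]],
    "doc_b": [[0.0, 1.0], [0.2, 0.8]],
}

# One pass over the documents yields the retrieval embeddings as a by-product.
doc_embeddings = {doc_id: mean_pool(states)
                  for doc_id, states in cached_hidden_states.items()}

# Hypothetical query embedding; in practice it would come from the same model.
ranking = rank_documents([1.0, 0.1], doc_embeddings)
print(ranking)
```

In this toy setup the query points mostly along the first dimension, so `doc_a` is ranked ahead of `doc_b`.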

Overcoming Technical Challenges

A key challenge addressed in this research is that existing models, including recent large language models, do not naturally produce retrieval-friendly embeddings during summarization due to the lack of contrastive objectives in their training. The IRSum approach specifically tackles this limitation through specialized training techniques.
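For readers unfamiliar with the term, the following sketch shows what a contrastive objective looks like in general, using the widely used InfoNCE formulation. It is not taken from the paper and should not be read as IRSum's training method; the function name, the temperature value, and the use of raw dot products as similarity scores are assumptions made for illustration. The loss is small when a query embedding is closer to its matching document than to non-matching ones, which is the property that makes embeddings useful for retrieval.

```python
import math

def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query: the negative log-probability of the
    positive document under a softmax over similarity scores."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Position 0 holds the positive; the rest are negatives.
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]

    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[0]

# Toy embeddings: the query matches its positive in the first case only.
aligned_loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned_loss = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned_loss < misaligned_loss)
```

A model trained only to generate summaries never sees a signal of this kind, which is consistent with the observation above that its hidden states are not retrieval-friendly by default.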

Impressive Performance Results

In empirical evaluation, the researchers showed that IRSum performs on par with, and in some cases outperforms, the combination of two task-specific models. The unified model achieves these results while improving throughput by up to 17% and reducing FLOPs (floating-point operations) by up to 20%.

This work has important implications for the design of large-scale document management systems, offering a path toward more efficient and streamlined architectures that maintain or exceed current performance standards.

Citation: Sotaro Takeshita, Simone Paolo Ponzetto, Kai Eckert (2025): IRSum: One Model to Rule Summarization and Retrieval. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 262–275, Vienna, Austria.