IJCAI 2025 Tutorial

Beyond Text: Advanced Retrieval-Augmented Generation for Complex and Multimodal Data

Abstract

Retrieval-Augmented Generation (RAG) is a framework that combines retrieval-based methods with generative models: it retrieves relevant information from a knowledge base before generating an answer, which improves the accuracy and relevance of responses. It is well suited to complex, knowledge-intensive tasks such as question answering, document summarization, and conversational AI, with applications in healthcare, finance, education, and other fields. As RAG rapidly evolves, it is being applied to increasingly diverse domains and must handle a broader range of data types, including text, images, tables, graphs, and time series. This expansion introduces challenges such as cross-modal retrieval, unified representation learning, data fusion, scalability, noise handling, and evaluation. Addressing these challenges is urgent if RAG is to be effective in real-world applications, where data is often heterogeneous, dynamic, and imperfect, and if it is to reach its full potential across a wide range of industries and use cases.

This tutorial will cover a broad range of topics in recent progress on retrieval-augmented generation. It will review the fundamental concepts and algorithms of RAG, introduce new research frontiers and technical advances in RAG for complex data, and discuss the corresponding applications and evaluations. In addition, rich tutorial materials will be introduced to help the audience gain a systematic understanding that goes beyond our recently published survey paper and open-source repositories of state-of-the-art RAG algorithms.

Liang Zhao, Chao Huang

Program

TBA

Resources

TBA

About

TBA