International Journal of Advance Computational Engineering and Networking (IJACEN)
Volume-12,Issue-9  ( Sep, 2024 )
  Journal Paper


Paper Title :
Enhancing Conversational Artificial Intelligence with Multimodal Retrieval Augmented Generation

Author : Sanjay Bhatt, Srushti Gajbhiye, Praveen Thenraj Gunasekaran, Selvakuberan Karuppasamy, Anshuman A. Mahapatra

Article Citation : Sanjay Bhatt, Srushti Gajbhiye, Praveen Thenraj Gunasekaran, Selvakuberan Karuppasamy, Anshuman A. Mahapatra (2024) "Enhancing Conversational Artificial Intelligence with Multimodal Retrieval Augmented Generation", International Journal of Advance Computational Engineering and Networking (IJACEN), pp. 25-30, Volume-12, Issue-9

Abstract : Recently, retrieval-augmented generation (RAG) based solutions have improved language generation by leveraging an external nonparametric index, showing impressive performance despite constrained model sizes. However, these models are limited to retrieving only textual knowledge. Multimodal RAG (MuRAG) was introduced to address this limitation: it leverages image data along with text to utilize the full power of retrieval models. This paper proposes a multimodal retrieval-augmented framework that combines Simple RAG and MuRAG to enhance chatbot experiences by utilizing domain-specific repositories containing both text and image data. After explaining Simple RAG and reviewing existing RAG and MuRAG research, the paper presents a case study in which a company struggles to develop an efficient Multimodal RAG framework for handling image and text data. The proposed comprehensive framework uses the latest libraries and models, uniquely leveraging image data as a knowledge source to improve LLM-generated responses. Recommendations are provided to enhance performance, efficiency, and reliability, ensuring accurate, contextually appropriate, and standards-aligned responses.

Keywords : Artificial intelligence (AI), Large Language Model (LLM), Retrieval Augmented Generation (RAG), Multimodal Model, Generative Pre-Trained Transformers (GPT), LangChain, Facebook AI Similarity Search (FAISS)

Type : Research paper

Published : Volume-12,Issue-9


DOIONLINE NO - IJACEN-IRAJ-DOIONLINE-21243

Copyright: © Institute of Research and Journals

Published on : 2025-01-08
   
   