IJACEN Visual Question Answering: An Analysis of Various AI Models and Datasets

Journal Paper

Paper Title : Visual Question Answering: An Analysis of Various AI Models and Datasets

Author : Neeraj Sanish S Joseph

Article Citation : Neeraj Sanish S Joseph (2019), "Visual Question Answering: An Analysis of Various AI Models and Datasets", International Journal of Advance Computational Engineering and Networking (IJACEN), pp. 26-30, Volume-7, Issue-4

Abstract : Visual Question Answering (VQA) is considered one of the latest advances in the field of Artificial Intelligence (AI). It is a distinctive task that combines the three most important areas of AI, namely Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation and Reasoning (KR), each of which is being researched extensively. Given an image and an open-ended natural language question about the image, a VQA model must provide an open-ended natural language answer. To achieve this, the model needs to develop an understanding of the different entities in the image and in the language, and of their dependencies. This is regarded as a true AI task. In this review, we detail the various algorithms proposed for building a VQA model, classifying them by the mechanisms used to extract the input visual and natural language features and map them to a common feature vector space. Finally, we analyze the correctness of these models and propose some alternatives using Capsule Networks (CapsNet) as future directions.
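As a rough illustration of what mapping visual and language features to a common feature vector space can look like, the toy Python sketch below projects an image feature vector and a question feature vector into one shared space, fuses them, and scores a small set of candidate answers. It is not the paper's method: the dimensions, the random weights, the element-wise product fusion, the fixed answer vocabulary, and every name in the code are assumptions made only for this example, and choosing among fixed candidates is a simplification of open-ended answering.

```python
# Toy sketch of a joint-embedding VQA pipeline. All weights, dimensions, and
# names are hypothetical placeholders, not values taken from the paper.
import random


def linear(weights, vector):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    """Stand-in for parameters that a real model would learn."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def answer_scores(image_feat, question_feat, w_img, w_txt, w_out):
    # Project each modality into the same joint space.
    img_emb = linear(w_img, image_feat)
    txt_emb = linear(w_txt, question_feat)
    # Fuse with an element-wise product (one possible choice, assumed here).
    fused = [a * b for a, b in zip(img_emb, txt_emb)]
    # Score every candidate answer with a linear layer.
    return linear(w_out, fused)


rng = random.Random(0)
ANSWERS = ["yes", "no", "two", "red"]  # hypothetical answer vocabulary
IMG_DIM, TXT_DIM, JOINT_DIM = 6, 5, 4

w_img = random_matrix(JOINT_DIM, IMG_DIM, rng)
w_txt = random_matrix(JOINT_DIM, TXT_DIM, rng)
w_out = random_matrix(len(ANSWERS), JOINT_DIM, rng)

# Random stand-ins for the outputs of an image encoder and a question encoder.
image_feat = [rng.random() for _ in range(IMG_DIM)]
question_feat = [rng.random() for _ in range(TXT_DIM)]

scores = answer_scores(image_feat, question_feat, w_img, w_txt, w_out)
predicted = ANSWERS[scores.index(max(scores))]
print(predicted)
```

With untrained random weights the printed answer carries no meaning; the point is only the data flow from two separate feature vectors to a single fused representation and then to an answer.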

Type : Research paper

Published : Volume-7, Issue-4


Published on 2019-06-24
