Track Chair: Asst. Prof. Dr. Maleerat Maliyaem, King Mongkut's University of Technology North Bangkok, Thailand

The rapid advancement of AI has made the integration of natural language and visual understanding a critical research frontier. This track explores cutting-edge research at the intersection of natural language processing (NLP) and computer vision (CV). As multimodal AI systems become increasingly pivotal in real-world applications, from generative AI and robotics to healthcare and human-computer interaction, this track will foster discussion of innovative methodologies, challenges, and future directions for unifying linguistic and visual intelligence.

Topics

Multimodal Learning and Fusion
• Joint representation learning and alignment of text and images
• Applications and improvements of multimodal pre-trained models (e.g., CLIP, BLIP)
• Cross-modal retrieval and matching techniques

Image Captioning and Text-to-Image Generation
• Deep learning-based image-to-text generation techniques
• Semantic consistency and detail control in text-to-image generation
• Quality evaluation and optimization of generative models

Visual Question Answering and Cross-Modal Reasoning
• Design of integrated vision-language question answering systems
• Cross-modal reasoning and logical inference
• Visual question answering techniques for complex scenarios

Multimodal Applications and Datasets
• Multimodal applications in healthcare, education, autonomous driving, and other fields
• Construction, evaluation, and open sharing of multimodal datasets
• Fairness, robustness, and interpretability of multimodal models

Efficiency and Optimization of Multimodal Models
• Lightweighting and acceleration of multimodal models
• Optimization of multimodal models for edge computing
• Compression and quantization techniques for multimodal models

IMPORTANT DATES

* Papers may be submitted to NLPAI 2025 through the Electronic Submission System or by email to nlpai@cbees.net. (For publication, a full paper must be submitted; for presentation only, without publication, an abstract is sufficient.)

* You are welcome to attend NLPAI 2025 as a listener if you do not wish to publish a paper or present at the conference. Registration must be completed through the Online Registration System before the registration deadline.