Exploring Audio-Visual Correlation for Real-Time Texture Analysis

Shyam Maheshwari, Dr. Hemant Makwana, Dr. Devesh Kumar Lal

Abstract

This study explores the correlation between surface images and the audio generated by linear movement over those surfaces, using three similarity metrics: Euclidean Distance (ED), Cosine Similarity, and Pearson Correlation. A key contribution is the application of these metrics to the LMT TUM Texture dataset, which allows their effectiveness for multisensory fusion to be compared directly. The results show that Cosine Similarity and Pearson Correlation remain highly stable across varying surface textures, which makes them well suited to real-time applications in augmented reality (AR), human-robot interaction (HRI), and industrial monitoring. In contrast, Euclidean Distance is more sensitive to texture changes, which makes it useful for detecting subtle variations in surface properties, particularly in fault detection and other industrial applications. The study also finds that increasing the Sigma value increases the measured similarity: Euclidean Distance decreases, while Cosine Similarity and Pearson Correlation approach near-perfect correlation. The research is limited by the narrow range of surface textures examined and by the absence of a real-time implementation. Future research will expand the dataset and integrate machine learning techniques to improve real-time performance. This work contributes a framework for understanding and optimizing multisensory fusion systems across practical applications.
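
To make the comparison of the three metrics concrete, the following is a minimal sketch of how each could be computed for a pair of feature vectors. It is not the authors' implementation: the function names and the `image_features` and `audio_features` vectors are hypothetical, and the abstract does not state how features are extracted from the images or the audio. The sketch only assumes that both modalities have already been reduced to numeric vectors of equal length.

```python
import numpy as np


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0 means identical)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def pearson_correlation(a, b):
    """Cosine similarity of the mean-centered vectors (1 means perfectly linear)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    denom = np.linalg.norm(a_centered) * np.linalg.norm(b_centered)
    return float(np.dot(a_centered, b_centered) / denom)


# Hypothetical feature vectors, for illustration only.
image_features = np.array([1.0, 2.0, 3.0, 4.0])
audio_features = 2.0 * image_features + 1.0  # same pattern, different scale and offset

print("Euclidean distance: ", euclidean_distance(image_features, audio_features))
print("Cosine similarity:  ", cosine_similarity(image_features, audio_features))
print("Pearson correlation:", pearson_correlation(image_features, audio_features))
```

In this toy case the two vectors follow the same pattern but differ in scale and offset. Pearson Correlation ignores both differences, Cosine Similarity ignores the scale, and Euclidean Distance reflects the full magnitude of the difference. This is consistent with the behavior described above, in which Euclidean Distance reacts more strongly to changes than the other two metrics.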
