A Comparative Analysis of Machine Learning Algorithms for Sentiment Analysis in Indian Social Media
Abstract
This paper presents a comparative analysis of machine learning algorithms for sentiment analysis in the diverse, multilingual context of Indian social media. The primary objective was to evaluate and compare the effectiveness of different machine learning models, with a specific focus on the Support Vector Machine (SVM) algorithm, in classifying sentiment expressed in Indian languages on social media platforms, predominantly Twitter. The methodology followed a structured approach to data collection and analysis: approximately 100,000 tweets in multiple Indian languages were collected through the Twitter API, the SVM algorithm was applied to this data, and its performance was assessed on accuracy, efficiency, and adaptability in handling multilingual content.
Key findings revealed that while SVM achieved reasonable accuracy in sentiment analysis, its performance varied across languages, with English tweets yielding the highest accuracy. The algorithm also struggled to process mixed-language tweets, indicating a need for more advanced, linguistically versatile models. Comparison with other machine learning and deep learning models suggested that models such as BERT could improve performance, especially in linguistic and contextual understanding.
These findings highlight the need for more culturally and linguistically nuanced sentiment analysis tools in India's diverse digital landscape. This research contributes to the field by providing insights into the selection of appropriate algorithms for sentiment analysis, particularly in linguistically diverse settings, and underscores the need for continued innovation in machine learning applications for social media analysis.
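The per-language comparison described above can be sketched as follows. This is a minimal illustration rather than the study's evaluation code: the function name `accuracy_by_language`, the language tags, and the toy labels are all hypothetical, and the predictions could come from any classifier, SVM or otherwise.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: accuracy} from (language, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, y_true, y_pred in records:
        total[lang] += 1
        if y_true == y_pred:
            correct[lang] += 1
    # Accuracy is the share of correctly classified tweets within each language.
    return {lang: correct[lang] / total[lang] for lang in total}


# Made-up predictions for illustration only; these are not results from the paper.
records = [
    ("en", "pos", "pos"),
    ("en", "neg", "neg"),
    ("hi", "pos", "neg"),
    ("hi", "neg", "neg"),
    ("hi-en", "pos", "neg"),  # hypothetical tag for a mixed-language tweet
]

scores = accuracy_by_language(records)
for lang, acc in sorted(scores.items()):
    print(f"{lang}: {acc:.2f}")
```

Reporting accuracy per language, instead of a single pooled figure, is what makes gaps such as the one between English and mixed-language tweets visible.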
Keywords: