Dimensionality Reduction for Sentiment Classification: Evolving for the Most Prominent and Separable Features

by Aftab Anjum, et al.

In sentiment classification, the enormous amount of textual data, its immense dimensionality, and its inherent noise make it extremely difficult for machine learning classifiers to extract high-level and complex abstractions. Dimensionality reduction techniques are needed to make the data less sparse and more statistically significant. In existing dimensionality reduction techniques, however, the number of components must be set manually, which can discard the most prominent features and thus reduce classifier performance. Our prior techniques, Term Presence Count (TPC) and Term Presence Ratio (TPR), have proven effective because they reject the less separable features. Even so, the most prominent and separable features might still be removed from the initial feature set despite having higher distributions among positive- and negative-tagged documents. To overcome this problem, we propose a new framework that consists of two dimensionality reduction techniques: Sentiment Term Presence Count (SentiTPC) and Sentiment Term Presence Ratio (SentiTPR). These techniques reject features by considering the term presence difference (SentiTPC) and the ratio of the distribution distinction (SentiTPR). Both methods also analyze the total distribution information. Extensive experimental results show that the proposed framework substantially reduces the feature dimension and thereby significantly improves classification performance.
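The abstract does not state the exact SentiTPC and SentiTPR formulas, so the following is only a minimal sketch of the general idea under stated assumptions: a term's "presence" is the number of documents of a class that contain it, the count-style filter keeps terms whose positive/negative presence difference reaches a threshold, and the ratio-style filter keeps terms whose difference divided by total presence reaches a threshold. The function names, the parameters `min_diff`, `min_ratio`, and `min_total`, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def presence_counts(docs, labels):
    """Count, per term, how many positive and negative documents contain it."""
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        terms = set(text.lower().split())  # presence, not frequency
        (pos if label == 1 else neg).update(terms)
    return pos, neg


def select_features(docs, labels, min_diff=None, min_ratio=None, min_total=1):
    """Keep terms that separate the two classes (illustrative assumption only).

    min_diff  : count-style filter on |pos - neg|
    min_ratio : ratio-style filter on |pos - neg| / (pos + neg)
    min_total : minimum total presence, a stand-in for the
                "total distribution information" the abstract mentions
    """
    pos, neg = presence_counts(docs, labels)
    kept = []
    for term in sorted(set(pos) | set(neg)):
        p, n = pos[term], neg[term]
        total = p + n
        if total < min_total:
            continue
        diff = abs(p - n)
        if min_diff is not None and diff < min_diff:
            continue
        if min_ratio is not None and diff / total < min_ratio:
            continue
        kept.append(term)
    return kept


# Toy corpus: label 1 = positive, label 0 = negative.
docs = [
    "great movie great cast",
    "great plot",
    "boring movie",
    "boring and dull plot",
]
labels = [1, 1, 0, 0]

count_style = select_features(docs, labels, min_diff=2)
ratio_style = select_features(docs, labels, min_ratio=0.5)
print(count_style)
print(ratio_style)
```

In this toy corpus, "movie" and "plot" appear equally often in both classes, so both filters drop them. The count-style filter is stricter about rare terms, while the ratio-style filter retains any term that is concentrated in one class, which is why a total-presence floor such as `min_total` is a natural companion to it.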




Performance Analysis of Deep Autoencoder and NCA Dimensionality Reduction Techniques with KNN, ENN and SVM Classifiers

The central aim of this paper is to implement Deep Autoencoder and Neigh...

Feature Dimensionality Reduction for Video Affect Classification: A Comparative Study

Affective computing has become a very important research area in human-m...

Dimensionality Reduction for Wasserstein Barycenter

The Wasserstein barycenter is a geometric construct which captures the n...

Towards Exploratory Landscape Analysis for Large-scale Optimization: A Dimensionality Reduction Framework

Although exploratory landscape analysis (ELA) has shown its effectivenes...

A Fuzzy Approach for Feature Evaluation and Dimensionality Reduction to Improve the Quality of Web Usage Mining Results

Web Usage Mining is the application of data mining techniques to web usa...

A survey of dimensionality reduction techniques

Experimental life sciences like biology or chemistry have seen in the re...

AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction

High dimensionality, i.e. data having a large number of variables, tends...
