Semi-supervised Text Categorization Using Recursive K-means Clustering

06/24/2017
by Harsha S. Gowda, et al.

In this paper, we present a semi-supervised learning algorithm for the classification of text documents, together with a method for labeling unlabeled text documents. The method follows a divide-and-conquer strategy: it uses a recursive K-means algorithm to partition the combined collection of labeled and unlabeled documents. K-means is applied recursively to each partition until every partition contains labeled documents of a single class. Once these clusters are obtained, their centroids serve as cluster representatives, and the nearest neighbor rule is used to classify an unknown text document. A series of experiments on the 20Newsgroups dataset demonstrates the superiority of the proposed model over other recent state-of-the-art models.
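The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: documents are assumed to be already converted to numeric feature vectors, the split factor `k=2`, the `max_depth` guard, the majority-vote fallback, and the choice to ignore partitions that contain no labeled documents are all assumptions, and the function names (`kmeans`, `recursive_kmeans`, `classify`) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns a cluster index for every point."""
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = mean(members)
    return assign

def recursive_kmeans(points, labels, rng, k=2, depth=0, max_depth=10):
    """Split until each partition holds labeled documents of one class.

    labels[i] is a class label, or None for an unlabeled document.
    Returns a list of (centroid, class_label) pairs.
    """
    known = {l for l in labels if l is not None}
    # Assumption: also stop when the partition is too small to split or
    # the depth guard is hit; fall back to a majority vote in that case.
    if len(known) <= 1 or len(points) <= k or depth >= max_depth:
        label = max(sorted(known), key=labels.count) if known else None
        return [(mean(points), label)]
    assign = kmeans(points, k, rng)
    result = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if idx:
            result += recursive_kmeans([points[i] for i in idx],
                                       [labels[i] for i in idx],
                                       rng, k, depth + 1, max_depth)
    return result

def classify(doc, centroids):
    """Nearest neighbor rule over centroids that carry a class label."""
    labeled = [(c, l) for c, l in centroids if l is not None]
    return min(labeled, key=lambda cl: sq_dist(doc, cl[0]))[1]

# Toy 2-D "documents"; None marks an unlabeled document.
docs = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["sport", None, None, "sport", "tech", None, None, "tech"]

model = recursive_kmeans(docs, labels, random.Random(0))
print(classify((0.5, 0.5), model))
print(classify((9.0, 9.5), model))
```

In this sketch the unlabeled documents influence where the centroids fall, while the labeled ones decide when the recursion stops and which class each centroid represents. A real text pipeline would replace the toy 2-D points with document feature vectors such as term weights.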


