Unsupervised Online Feature Selection for Cost-Sensitive Medical Diagnosis

by Arun Verma, et al.

In medical diagnosis, physicians predict the state of a patient by checking measurements (features) obtained from a sequence of tests, e.g., a blood test, then a urine test, followed by invasive tests. As tests are often costly, one would like to obtain only those features (tests) that can conclusively establish the presence or absence of the state. Another aspect of medical diagnosis is that we often face unsupervised prediction tasks, as the true state of the patients may not be known. Motivated by such medical diagnosis problems, we consider a Cost-Sensitive Medical Diagnosis (CSMD) problem, where the true state of patients is unknown. We formulate the CSMD problem as a feature selection problem where each test gives a feature that can be used in a prediction model. Our objective is to learn strategies for selecting the features that give the best trade-off between accuracy and cost. We exploit the 'Weak Dominance' property of the problem to develop online algorithms that identify a set of features providing an 'optimal' trade-off between cost and prediction accuracy, without requiring knowledge of the true state of the medical condition. Our empirical results validate the performance of our algorithms on problem instances generated from real-world datasets.
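The cost-accuracy trade-off described above can be illustrated with a small offline sketch. It is not the authors' online algorithm, and the abstract does not specify the objective, so several pieces here are assumptions: the tests form a fixed cascade, the prediction of the last (most expensive) test stands in for the unknown true state, and a hypothetical weight `lam` converts cost into the same units as error. The function name `select_test_prefix` is likewise illustrative.

```python
from typing import Sequence


def select_test_prefix(
    predictions: Sequence[Sequence[int]],
    costs: Sequence[float],
    lam: float,
) -> int:
    """Return the 0-based index of the last test to run in the cascade.

    predictions[p][i] is the label predicted for patient p after test i.
    costs[i] is the cost of running test i (tests are run in order).
    lam is an assumed trade-off weight between proxy error and cost.
    """
    n_patients = len(predictions)
    best_idx, best_score = 0, float("inf")
    cumulative_cost = 0.0
    for i in range(len(costs)):
        cumulative_cost += costs[i]
        # Assumption: with no ground truth available, disagreement with
        # the final test's prediction is used as a proxy for the error.
        disagreements = sum(1 for row in predictions if row[i] != row[-1])
        proxy_error = disagreements / n_patients
        score = proxy_error + lam * cumulative_cost
        if score < best_score:  # strict '<' keeps the cheaper test on ties
            best_idx, best_score = i, score
    return best_idx


# Toy data: 4 patients, 3 tests in increasing order of cost.
toy_predictions = [[0, 1, 1], [1, 1, 1], [0, 0, 0], [1, 0, 0]]
toy_costs = [1.0, 2.0, 5.0]
chosen = select_test_prefix(toy_predictions, toy_costs, lam=0.05)
```

In this toy data the second test always agrees with the third, so for a small `lam` the sketch stops at the second test; a large `lam` pushes the choice toward the cheapest test.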


