A novel model to optimize multiple imputation algorithm for missing data using evolution methods.

Title: A novel model to optimize multiple imputation algorithm for missing data using evolution methods.
Publication Type: Journal Article
Year of Publication: 2022
Authors: Mohammed Y Salaheldin, Abdelkader H, Pławiak P, Hammad M
Journal: Biomedical Signal Processing and Control
Volume: 76
ISSN: 1746-8094
Abstract

Missing data is a significant concern when applying statistical methods to a dataset, and the quality of the analysis results depends on how completely and correctly the data are filled in. Improving missing-data imputation is therefore vital for producing more reliable data for the analysis phase. Here, we present a novel method that combines multiple imputation with a genetic algorithm to optimize multiple regression imputation and obtain the best fitness values for missing patient data. To train and assess the proposed method, we used 583 patient records from a publicly available database: 416 records of liver patients and 167 records of non-liver patients. According to the results, the proposed approach offers the largest improvement in imputing missing data. Instead of employing the normal equation in multiple imputations, which yielded 92.72 as the highest fitness value with Mean Absolute Error (MAE) 0.5877 from 1.1840 after our second optimization, we were able to achieve a fitness value of 233. The proposed approach might be tested using a large database and used in hepatocellular carcinoma (HCC) labs to help clinicians make accurate diagnoses.
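
The general idea of tuning a regression imputer with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the one-feature regression, the use of negative MAE as the fitness, and every function name (`mae`, `least_squares`, `evolve`) are assumptions made for illustration, and it performs a single regression imputation, not full multiple imputation. The closed-form least-squares fit stands in for the normal equation, and the genetic algorithm starts from that fit and keeps its best individual each generation.

```python
import random

def mae(params, rows):
    """Mean absolute error of y ~ slope * x + intercept over (x, y) rows."""
    slope, intercept = params
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)

def least_squares(rows):
    """Closed-form simple linear regression (the one-feature normal equation)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return (slope, mean_y - slope * mean_x)

def evolve(rows, seed_params, rng, generations=60, pop_size=30, sigma=0.2):
    """Toy genetic algorithm minimising MAE (assumed fitness, not the paper's)."""
    # Start from the least-squares solution plus random perturbations of it.
    population = [seed_params] + [
        (seed_params[0] + rng.gauss(0, 1), seed_params[1] + rng.gauss(0, 1))
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=lambda p: mae(p, rows))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0]]              # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover per coefficient, then Gaussian mutation.
            child = tuple(rng.choice(pair) + rng.gauss(0, sigma) for pair in zip(a, b))
            children.append(child)
        population = children
    return min(population, key=lambda p: mae(p, rows))

rng = random.Random(0)

# Synthetic records (hypothetical): y depends linearly on x, about 25% of y is missing.
data = []
for _ in range(80):
    x = rng.uniform(0, 10)
    y = 2.0 * x + 1.0 + rng.gauss(0, 1)
    data.append((x, y if rng.random() > 0.25 else None))

observed = [(x, y) for x, y in data if y is not None]
ols_params = least_squares(observed)
ga_params = evolve(observed, ols_params, rng)

# Fill each missing y with the prediction of the evolved regression.
imputed = [(x, y if y is not None else ga_params[0] * x + ga_params[1]) for x, y in data]

ols_mae = mae(ols_params, observed)
ga_mae = mae(ga_params, observed)
```

Because least squares minimises squared error and not absolute error, the evolved coefficients can reach a lower MAE on the observed rows than the closed-form fit, and elitism ensures they are never worse. This comparison is in-sample only; a realistic evaluation would mask known values and score the imputations against them.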

URL: https://www.sciencedirect.com/science/article/pii/S1746809422001835
DOI: 10.1016/j.bspc.2022.103661