Publication:
APPLICATION OF TRANSFORMERS TRANSFER LEARNING IN AUTOCODIFICATION OF INTERNATIONAL CLASSIFICATION OF DISEASES 10TH REVISION (ICD-10) CODING FOR MEDICAL DIAGNOSIS IN MALAYSIA

dc.contributor.author: MUHAMMAD NAUFAL BIN NORDIN
dc.date.accessioned: 2024-03-07T09:29:15Z
dc.date.available: 2024-03-07T09:29:15Z
dc.date.issued: 2023
dc.description.abstract: The process of converting unstructured medical diagnoses into structured data using International Classification of Diseases 10th Revision (ICD-10) codes presents a significant challenge to healthcare facilities in Malaysia. Reliance on manual codification leads to errors, backlogs, and delays in data availability for analysis and decision-making, which can negatively affect healthcare planning and resource allocation. To address these challenges, this study proposes the use of Artificial Intelligence (AI), specifically transfer learning and Natural Language Processing (NLP), to auto-codify free-text medical diagnoses into standardized ICD-10 codes. The primary aim is to demonstrate that a fine-tuned machine learning model can achieve over 85% prediction accuracy. The research objectives include identifying the best pre-trained model, determining the optimal model parameters, and investigating the impact of different training dataset sizes on prediction accuracy. Through these targeted strategies, this study seeks to provide a viable AI solution that enhances the accuracy, efficiency, and timeliness of medical data codification. The study identified the fine-tuned Generative Pretrained Transformers 2 (GPT2) Large model as the most accurate model for the ICD-10 classification task, with an optimal configuration that achieved a prediction F1 score of 86.27%, exceeding the initial target of 85%. However, it is worth noting that a Bidirectional Encoder Representations from Transformers (BERT) variant, 'BioClinicalBERT', which was pre-trained on healthcare domain-specific data, trained significantly more efficiently with fewer parameters than the GPT2 Large model. This finding underscores the potential of balancing domain-specific pre-training, selection of a pre-trained model based on parameter count, and training dataset size when building efficient models for complex healthcare tasks such as ICD-10 coding, suggesting an alternative route for future model development and improvement.
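For illustration, the following is a minimal sketch of the kind of fine-tuning pipeline the abstract describes, using the Hugging Face transformers and datasets libraries. The example diagnoses, the ICD-10 labels, and all hyperparameters below are illustrative placeholders, not the thesis's actual data or configuration; the study itself fine-tuned GPT2 Large and compared it against the BioClinicalBERT variant.

    # Sketch: fine-tune a pretrained transformer to classify free-text
    # diagnoses into ICD-10 codes. Placeholder data and hyperparameters.
    import numpy as np
    from datasets import Dataset
    from sklearn.metrics import f1_score
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    MODEL_NAME = "gpt2-large"  # the thesis's best model; a BERT variant could be swapped in

    # Hypothetical free-text diagnoses mapped to ICD-10 codes (not the study's data).
    texts = ["acute appendicitis", "type 2 diabetes mellitus", "essential hypertension"]
    codes = ["K35.8", "E11.9", "I10"]
    label2id = {c: i for i, c in enumerate(sorted(set(codes)))}

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    if tokenizer.pad_token is None:  # GPT-2 has no pad token by default
        tokenizer.pad_token = tokenizer.eos_token

    def preprocess(batch):
        enc = tokenizer(batch["text"], truncation=True,
                        padding="max_length", max_length=64)
        enc["labels"] = [label2id[c] for c in batch["code"]]
        return enc

    ds = Dataset.from_dict({"text": texts, "code": codes}).map(preprocess, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(label2id)
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        preds = np.argmax(logits, axis=-1)
        return {"f1": f1_score(labels, preds, average="weighted")}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="icd10-classifier",
                               num_train_epochs=3,
                               per_device_train_batch_size=8),
        train_dataset=ds,
        eval_dataset=ds,  # placeholder: a held-out split would be used in practice
        compute_metrics=compute_metrics,
    )
    trainer.train()
    print(trainer.evaluate())

Reporting a weighted F1 score, as above, matches the evaluation metric named in the abstract; the comparison between GPT2 Large and BioClinicalBERT amounts to changing MODEL_NAME and re-running the same pipeline.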
dc.identifier.uri: https://hdl.handle.net/20.500.14377/36059
dc.language.iso: en
dc.publisher: International Medical University
dc.subject: International Classification of Diseases
dc.subject: Diagnosis
dc.subject: Artificial Intelligence
dc.subject: Statistics
dc.title: APPLICATION OF TRANSFORMERS TRANSFER LEARNING IN AUTOCODIFICATION OF INTERNATIONAL CLASSIFICATION OF DISEASES 10TH REVISION (ICD-10) CODING FOR MEDICAL DIAGNOSIS IN MALAYSIA
dc.type: Thesis
dspace.entity.type: Publication
Files
Original bundle
Name: 2023_MuhammadNaufal.pdf
Size: 1.58 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission