Publication:
APPLICATION OF TRANSFORMERS TRANSFER LEARNING IN AUTOCODIFICATION OF INTERNATIONAL CLASSIFICATION OF DISEASES 10TH REVISION (ICD-10) CODING FOR MEDICAL DIAGNOSIS IN MALAYSIA

dc.contributor.author: MUHAMMAD NAUFAL BIN NORDIN
dc.date.accessioned: 2024-03-07T09:29:15Z
dc.date.available: 2024-03-07T09:29:15Z
dc.date.issued: 2023
dc.description.abstract: The process of converting unstructured medical diagnoses into structured data using International Classification of Diseases 10th Revision (ICD-10) codes presents a significant challenge to healthcare facilities in Malaysia. Reliance on manual codification leads to errors, backlogs, and delays in data availability for analysis and decision-making, which can negatively affect healthcare planning and resource allocation. To address these challenges, this study proposes the use of Artificial Intelligence (AI), specifically transfer learning and Natural Language Processing (NLP), to auto-codify free-text medical diagnoses into standardized ICD-10 codes. The primary aim is to demonstrate that a fine-tuned machine learning model can achieve over 85% prediction accuracy. The research objectives include identifying the best pre-trained model, determining the optimal model parameters, and investigating the impact of different training dataset sizes on prediction accuracy. Through these targeted strategies, this study seeks to provide a viable AI solution that enhances the accuracy, efficiency, and timeliness of medical data codification. The study identified the fine-tuned Generative Pretrained Transformers 2 (GPT2) Large model as the most accurate model for the ICD-10 classification task, with an optimal configuration that achieved a prediction F1 score of 86.27%, exceeding the initial target of 85%. However, it is worth noting that a Bidirectional Encoder Representations from Transformers (BERT) variant, 'BioClinicalBERT', which was pre-trained on healthcare domain-specific data, trained significantly more efficiently with fewer parameters than the GPT2 Large model. This finding underscores the potential of balancing domain-specific pre-training, selection of a pre-trained model based on parameter count, and training dataset size when building efficient models for complex healthcare tasks such as ICD-10 coding, suggesting an alternative route for future model development and improvement.
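For illustration, the following is a minimal sketch of the kind of fine-tuning pipeline the abstract describes, using the Hugging Face transformers and datasets libraries. The example diagnoses, the ICD-10 labels, and all hyperparameters below are illustrative placeholders, not the thesis's actual data or configuration; the study itself fine-tuned GPT2 Large and compared it against the BioClinicalBERT variant.

    # Sketch: fine-tune a pretrained transformer to classify free-text
    # diagnoses into ICD-10 codes. Placeholder data and hyperparameters.
    import numpy as np
    from datasets import Dataset
    from sklearn.metrics import f1_score
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    MODEL_NAME = "gpt2-large"  # the thesis's best model; a BERT variant could be swapped in

    # Hypothetical free-text diagnoses mapped to ICD-10 codes (not the study's data).
    texts = ["acute appendicitis", "type 2 diabetes mellitus", "essential hypertension"]
    codes = ["K35.8", "E11.9", "I10"]
    label2id = {c: i for i, c in enumerate(sorted(set(codes)))}

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    if tokenizer.pad_token is None:  # GPT-2 has no pad token by default
        tokenizer.pad_token = tokenizer.eos_token

    def preprocess(batch):
        enc = tokenizer(batch["text"], truncation=True,
                        padding="max_length", max_length=64)
        enc["labels"] = [label2id[c] for c in batch["code"]]
        return enc

    ds = Dataset.from_dict({"text": texts, "code": codes}).map(preprocess, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(label2id)
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        preds = np.argmax(logits, axis=-1)
        return {"f1": f1_score(labels, preds, average="weighted")}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="icd10-classifier",
                               num_train_epochs=3,
                               per_device_train_batch_size=8),
        train_dataset=ds,
        eval_dataset=ds,  # placeholder: a held-out split would be used in practice
        compute_metrics=compute_metrics,
    )
    trainer.train()
    print(trainer.evaluate())

Reporting a weighted F1 score, as above, matches the evaluation metric named in the abstract; the comparison between GPT2 Large and BioClinicalBERT amounts to changing MODEL_NAME and re-running the same pipeline.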
dc.identifier.uri: https://hdl.handle.net/20.500.14377/36059
dc.language.iso: en
dc.publisher: International Medical University
dc.subject: International Classification of Diseases
dc.subject: Diagnosis
dc.subject: Artificial Intelligence
dc.subject: Statistics
dc.title: APPLICATION OF TRANSFORMERS TRANSFER LEARNING IN AUTOCODIFICATION OF INTERNATIONAL CLASSIFICATION OF DISEASES 10TH REVISION (ICD-10) CODING FOR MEDICAL DIAGNOSIS IN MALAYSIA
dc.type: Thesis
dspace.entity.type: Publication
Files
Original bundle
Name: 2023_MuhammadNaufal.pdf
Size: 1.58 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission