Abstract
Artificial intelligence (AI), especially deep learning (DL), has been widely applied in medical domains. Medical image analysis has benefited the most, owing to the rapid development of advanced deep learning methods for computer vision tasks. Biomolecular research, by contrast, has attracted less attention from the deep learning community even though it also offers high-dimensional data, possibly because of the more complicated biological mechanisms underlying its tasks. Nevertheless, researchers have begun to apply deep learning to the complex patterns in biomolecular data, including protein, RNA, and DNA data. For DNA data, several machine learning (ML) methods, such as random forests, support vector machines (SVM), and gradient boosting, have recently been used to build classification and prediction models. However, these ML methods are outperformed by traditional statistical approaches because of their limitations in handling high-dimensional data. DL, as the most advanced ML method to date, offers a glimpse of hope given its success in uncovering latent information in other biomolecular data, especially protein data. Therefore, this proposed study aims to build a DL-based model that learns useful patterns from genomics data, including ancestry information and gene-disease associations.
Keywords
AI, genomics, deep learning, health informatics, DNA