Microsatellite Status Classification of Gastrointestinal Cancer Cells
Using the Convolutional Neural Networks Algorithm
Final Project Report
Submitted in Fulfillment of
the Requirements for the Bachelor's Degree
in Informatics, Universitas Muhammadiyah Malang
Muhammad Rifal Alfarizy
201710370311219
Area of Interest
Data Science
INFORMATICS STUDY PROGRAM
FACULTY OF ENGINEERING
UNIVERSITAS MUHAMMADIYAH MALANG
2021
APPROVAL SHEET
Microsatellite Status Classification of Gastrointestinal Cancer Cells
Using the Convolutional Neural Networks Algorithm
FINAL PROJECT
As a Requirement for the Bachelor's Degree (Strata Ⅰ)
in Informatics, Universitas Muhammadiyah Malang
Approved,
Malang, 26 June 2021
Supervisor Ⅰ
Agus Eko Minarno, S.Kom., M.Kom.
NIDN: 0729118203
Supervisor Ⅱ
Yufis Azhar, S.Kom., M.Kom.
NIDN: 0728088701
PREFACE
Praise and gratitude be to Allah SWT, by whose abundant grace and guidance
the researcher was able to complete this final project, entitled
“MICROSATELLITE STATUS CLASSIFICATION OF GASTROINTESTINAL
CANCER CELLS USING THE Convolutional Neural Networks
ALGORITHM”.
This report covers the effect of the proposed model, the proposed
augmentation technique, and changes to the placement and number of
dropout layers on the classification of microsatellite status data for
gastrointestinal cancer cells using the CNN algorithm.
The researcher is fully aware that this final project still has many
shortcomings and limitations. The researcher therefore welcomes constructive
suggestions so that this work may benefit the advancement of science.
Malang, 26 June 2021
Muhammad Rifal Alfarizy
TABLE OF CONTENTS
TITLE PAGE
APPROVAL SHEET
VALIDATION SHEET
DECLARATION SHEET
ABSTRACT (INDONESIAN)
ABSTRACT (ENGLISH)
DEDICATION SHEET
PREFACE
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER Ⅰ INTRODUCTION
1.1. Background
1.2. Problem Statement
1.3. Research Objectives
1.4. Scope and Limitations
CHAPTER Ⅱ LITERATURE REVIEW
2.1. Literature Study
2.2. Microsatellite Instability
2.3. Convolutional Neural Networks
2.3.1. Input Layer
2.3.2. Convolutional Layer
2.3.3. Batch Normalization Layer
2.3.4. Pooling Layer
2.3.5. Dropout Layer
2.3.6. Fully Connected Layer
2.4. VGG19
2.5. Classification Model Evaluation
CHAPTER Ⅲ RESEARCH METHODS
3.1. Research Stages
3.2. Work Environment
3.3. Dataset
3.3.1. Dataset Split
3.4. Preprocessing
3.4.1. Data Augmentation
3.5. Hyperparameter Tuning
3.6. Model Architecture
3.7. Test Scenarios
CHAPTER Ⅳ RESULTS AND DISCUSSION
4.1. Data Augmentation
4.2. Hyperparameter Tuning
4.3. Testing on Colon Cancer Cell Data
4.3.1. Scenario 1: Proposed Model
4.3.2. Scenario 2: Proposed Model + Augmentation
4.3.3. Scenario 3: Proposed Model + Augmentation + Dropout APL
4.3.4. Evaluation of Results
4.4. Testing on Gastric Cancer Cell Data
4.4.1. Evaluation of Results
4.5. Comparison of Results
CHAPTER Ⅴ CONCLUSION
5.1. Conclusion
5.2. Suggestions
REFERENCES
APPENDICES
LIST OF FIGURES
Figure 1. Convolution Process
Figure 2. Max Pooling
Figure 3. Average Pooling
Figure 4. Dropout Process
Figure 5. VGG19 Model Structure
Figure 6. AUC-ROC Curve
Figure 7. Research Flow Diagram
Figure 8. Sample Colon Cancer Cell Data
Figure 9. Sample Gastric Cancer Cell Data
Figure 10. Source Code for Data Augmentation
Figure 11. Data Augmentation Results
Figure 12. Source Code for Hyperparameter Tuning
Figure 13. Source Code for COAD Model Testing Parameters
Figure 14. Source Code for the Scenario 1 COAD Model Structure
Figure 15. Source Code for the Scenario 2 COAD Model Structure
Figure 16. Source Code for the Scenario 3 COAD Model Structure
Figure 17. Source Code for the COAD Accuracy and Loss Plots
Figure 18. Scenario 1 COAD Plots: (a) Accuracy and (b) Loss
Figure 19. Scenario 2 COAD Plots: (a) Accuracy and (b) Loss
Figure 20. Scenario 3 COAD Plots: (a) Accuracy and (b) Loss
Figure 21. Source Code for Model Evaluate on COAD
Figure 22. Source Code for the COAD Confusion Matrix
Figure 23. COAD Confusion Matrix Results
Figure 24. Source Code for the COAD Classification Report
Figure 25. Source Code for the COAD AUC-ROC Plot
Figure 26. AUC Plots for the COAD Dataset Scenarios: (a) Scenario 1, (b) Scenario 2, and (c) Scenario 3
Figure 27. Source Code for the Callbacks List
Figure 28. Source Code for the STAD Accuracy and Loss Plots
Figure 29. STAD Scenario Plots: (a) Accuracy and (b) Loss
Figure 30. STAD Confusion Matrix Results
Figure 31. STAD AUC-ROC Plot
LIST OF TABLES
Table 1. Similar Previous Studies
Table 2. Confusion Matrix
Table 3. Dataset Split Details
Table 4. Augmentation Technique Parameters
Table 5. Comparison Parameters for Hyperparameter Tuning
Table 6. Model Architecture Design
Table 7. Hyperparameter Tuning Results
Table 8. Summary of Classification Results on the COAD Dataset
Table 9. Summary of Classification Results on the STAD Dataset
Table 10. Comparison of Studies on COAD Data
Table 11. Comparison of Studies on STAD Data