ID3 example
-
1WIDODO
EXAMPLE OF THE ID3 ALGORITHM PROCESS
ID3 = Induction of Decision 3 / Iterative Dichotomiser 3. Created by Ross Quinlan in the late 1970s. Quinlan developed it further into C4.5 in 1993.
Day  Outlook   Temperature  Humidity  Wind    Play Tennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
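The target attribute (class label) is Play Tennis, which takes the value Yes or No. The other attributes are explanatory (predictor) attributes. The goal is a classification model, built with the ID3 algorithm, that decides whether or not to play tennis.

$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$

$$\mathrm{Entropy}(S) = -\sum_{i} p_i \log_2 p_i$$

Here S_v is the subset of S in which attribute A has the value v, and p_i is the proportion of examples in S that belong to class i.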
-
S is the set of the 14 examples above, with 9 positive and 5 negative, written [9+,5-]. The entropy of S is therefore:
Entropy([9+,5-]) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.94029
Notes:
Entropy(S) = 0 if all examples in S belong to the same class.
Entropy(S) = 1 if the numbers of positive and negative examples are equal.
0 < Entropy(S) < 1 if the numbers of positive and negative examples are unequal.
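The entropy calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `entropy` and its list-of-counts argument are illustrative choices, not part of the original slides.

```python
from math import log2

def entropy(counts):
    """Base-2 entropy of a list of class counts, e.g. [9, 5] for [9+,5-]."""
    total = sum(counts)
    # Zero counts are skipped: the term 0 * log2(0) is taken to be 0.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

print(round(entropy([9, 5]), 5))  # 0.94029
```

Calling `entropy([4, 0])` and `entropy([7, 7])` gives the two boundary cases from the notes: a pure set and an evenly split set.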
-
Values(Temperature) = Hot, Mild, Cool
S_hot = [2+,2-]  S_mild = [4+,2-]  S_cool = [3+,1-]
Gain(S,Temperature) = Entropy(S) - (4/14)Entropy(S_hot) - (6/14)Entropy(S_mild) - (4/14)Entropy(S_cool)
= 0.94029 - (4/14)1.00000 - (6/14)0.91830 - (4/14)0.81128 = 0.02922

Values(Outlook) = Sunny, Overcast, Rain
S_sunny = [2+,3-]  S_overcast = [4+,0-]  S_rain = [3+,2-]
Gain(S,Outlook) = Entropy(S) - (5/14)Entropy(S_sunny) - (4/14)Entropy(S_overcast) - (5/14)Entropy(S_rain)
= 0.94029 - (5/14)0.97095 - (4/14)0.00000 - (5/14)0.97095 = 0.24675

The information gain of the four attributes is therefore:
Gain(S, Wind) = 0.04813
Gain(S, Humidity) = 0.15184
Gain(S, Temperature) = 0.02922
Gain(S, Outlook) = 0.24675
So the best attribute for prediction is Outlook.
Outlook
  Sunny    -> ?
  Overcast -> YES
  Rain     -> ?
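The four gain values can be recomputed directly from the 14-row table. The sketch below assumes the rows are stored as plain tuples; the names `ROWS`, `ATTRIBUTES`, `entropy` and `gain` are illustrative and do not come from the slides.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Rows D1..D14: the four attributes followed by the Play Tennis label.
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def entropy(labels):
    """Base-2 entropy of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, column):
    """Information gain of splitting rows on the attribute at index column."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

gains = {name: gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
for name, value in gains.items():
    print(f"Gain(S, {name}) = {value:.5f}")
print("Best attribute:", best)  # Best attribute: Outlook
```

Picking the attribute with `max` over the gains is the same selection step the slide performs by hand.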
-
For the branch node Outlook = Sunny: S_sunny = [D1, D2, D8, D9, D11]

Day  Outlook  Temperature  Humidity  Wind    Play Tennis
D1   Sunny    Hot          High      Weak    No
D2   Sunny    Hot          High      Strong  No
D8   Sunny    Mild         High      Weak    No
D9   Sunny    Cool         Normal    Weak    Yes
D11  Sunny    Mild         Normal    Strong  Yes

Values(Temperature) = Hot, Mild, Cool
S_hot = [0+,2-]  S_mild = [1+,1-]  S_cool = [1+,0-]
Gain(S_sunny,Temperature) = Entropy(S_sunny) - (2/5)Entropy(S_hot) - (2/5)Entropy(S_mild) - (1/5)Entropy(S_cool)
= 0.97095 - (2/5)0.00000 - (2/5)1.00000 - (1/5)0.00000 = 0.57095

Values(Humidity) = High, Normal
S_high = [0+,3-]  S_normal = [2+,0-]
Gain(S_sunny,Humidity) = Entropy(S_sunny) - (3/5)Entropy(S_high) - (2/5)Entropy(S_normal)
= 0.97095 - (3/5)0.00000 - (2/5)0.00000 = 0.97095

Values(Wind) = Weak, Strong
S_weak = [1+,2-]  S_strong = [1+,1-]
Gain(S_sunny,Wind) = Entropy(S_sunny) - (3/5)Entropy(S_weak) - (2/5)Entropy(S_strong)
= 0.97095 - (3/5)0.91830 - (2/5)1.00000 = 0.01997

At this stage Humidity is the best attribute for prediction, so it becomes the test on the Sunny branch of the decision tree.
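The gains on the Sunny branch can also be checked from the [positive, negative] counts alone, without the full table. In this sketch each subset is written as a `(positives, negatives)` pair; the helper names `entropy` and `gain` are illustrative assumptions.

```python
from math import log2

def entropy(pos, neg):
    """Base-2 entropy of a two-class set with the given counts."""
    total = pos + neg
    return sum(-(n / total) * log2(n / total) for n in (pos, neg) if n > 0)

def gain(parent, children):
    """Information gain; parent and each child are (positives, negatives) pairs."""
    size = sum(parent)
    remainder = sum((p + n) / size * entropy(p, n) for p, n in children)
    return entropy(*parent) - remainder

sunny = (2, 3)  # S_sunny = [2+,3-]
splits = {
    "Temperature": [(0, 2), (1, 1), (1, 0)],  # Hot, Mild, Cool
    "Humidity": [(0, 3), (2, 0)],             # High, Normal
    "Wind": [(1, 2), (1, 1)],                 # Weak, Strong
}
sunny_gains = {name: round(gain(sunny, parts), 5) for name, parts in splits.items()}
print(sunny_gains)
```

Humidity splits the five Sunny examples into two pure subsets, which is why its gain equals the full entropy of S_sunny.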
-
For the branch node Outlook = Rain: S_rain = [D4, D5, D6, D10, D14]

Day  Outlook  Temperature  Humidity  Wind    Play Tennis
D4   Rain     Mild         High      Weak    Yes
D5   Rain     Cool         Normal    Weak    Yes
D6   Rain     Cool         Normal    Strong  No
D10  Rain     Mild         Normal    Weak    Yes
D14  Rain     Mild         High      Strong  No

Values(Temperature) = Hot, Mild, Cool (no Hot examples occur on this branch)
S_mild = [2+,1-]  S_cool = [1+,1-]
Gain(S_rain,Temperature) = Entropy(S_rain) - (3/5)Entropy(S_mild) - (2/5)Entropy(S_cool)
= 0.97095 - (3/5)0.91830 - (2/5)1.00000 = 0.01997
Decision tree after the Sunny branch has been resolved:
Outlook
  Sunny    -> Humidity
                High   -> NO
                Normal -> YES
  Overcast -> YES
  Rain     -> ?
-
Values(Humidity) = High, Normal
S_high = [1+,1-]  S_normal = [2+,1-]
Gain(S_rain,Humidity) = Entropy(S_rain) - (2/5)Entropy(S_high) - (3/5)Entropy(S_normal)
= 0.97095 - (2/5)1.00000 - (3/5)0.91830 = 0.01997

Values(Wind) = Weak, Strong
S_weak = [3+,0-]  S_strong = [0+,2-]
Gain(S_rain,Wind) = Entropy(S_rain) - (3/5)Entropy(S_weak) - (2/5)Entropy(S_strong)
= 0.97095 - (3/5)0.00000 - (2/5)0.00000 = 0.97095

At this stage Wind has the highest gain, so the decision tree becomes:
Outlook
  Sunny    -> Humidity
                High   -> NO
                Normal -> YES
  Overcast -> YES
  Rain     -> Wind
                Strong -> NO
                Weak   -> YES
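The whole procedure (pick the attribute with the highest gain, split, repeat on each branch) can be sketched as a short recursive function. This is an illustrative implementation under simple assumptions: rows are tuples, the tree is a nested dict, and the names `id3`, `ROWS` and `ATTRIBUTES` are hypothetical rather than taken from the slides.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Rows D1..D14: the four attributes followed by the Play Tennis label.
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def entropy(rows):
    """Base-2 entropy of the class labels (last field) of the rows."""
    counts = Counter(row[-1] for row in rows)
    return sum(-(n / len(rows)) * log2(n / len(rows)) for n in counts.values())

def gain(rows, column):
    """Information gain of splitting rows on the attribute at index column."""
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder

def id3(rows, columns):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    labels = {row[-1] for row in rows}
    if len(labels) == 1:   # pure subset: emit a leaf
        return labels.pop()
    if not columns:        # no attribute left: fall back to the majority class
        return Counter(row[-1] for row in rows).most_common(1)[0][0]
    best = max(columns, key=lambda c: gain(rows, c))
    remaining = [c for c in columns if c != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining)
    return {ATTRIBUTES[best]: branches}

tree = id3(ROWS, list(range(len(ATTRIBUTES))))
print(tree)
```

On this data the recursion stops after two levels, because every subset under Humidity and Wind is already pure.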
-
After learning from the data, the following rules are obtained:
IF Outlook = Sunny AND Humidity = High THEN PlayTennis = No
IF Outlook = Sunny AND Humidity = Normal THEN PlayTennis = Yes
IF Outlook = Overcast THEN PlayTennis = Yes
IF Outlook = Rain AND Wind = Strong THEN PlayTennis = No
IF Outlook = Rain AND Wind = Weak THEN PlayTennis = Yes
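The five rules translate directly into a small classifier. The function below is a sketch; the name `play_tennis` and its lowercase normalisation of the inputs are illustrative choices.

```python
def play_tennis(outlook, humidity, wind):
    """Apply the five learned rules; Temperature is never tested by the tree."""
    outlook, humidity, wind = outlook.lower(), humidity.lower(), wind.lower()
    if outlook == "overcast":
        return "Yes"
    if outlook == "sunny":
        return "No" if humidity == "high" else "Yes"
    if outlook == "rain":
        return "No" if wind == "strong" else "Yes"
    raise ValueError(f"unknown outlook: {outlook!r}")

print(play_tennis("Sunny", "High", "Weak"))   # No  (matches D1)
print(play_tennis("Rain", "Normal", "Weak"))  # Yes (matches D5)
```

Applied to all 14 training rows, these rules reproduce every Play Tennis label in the table.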