Urgench State University, Faculty of Physics and Mathematics. Speciality: «5111018 – Professional education: Informatics and information technologies». Group: 181-inf. Student: Babaev Saidmukhammadjon.




Number of classification rules produced by each method on each dataset:

| Dataset | DTNB | DT | C4.5 | PT | FR | RDR | CBA | SA | J&B |
|---|---|---|---|---|---|---|---|---|---|
| Breast.Can | 122 | 22 | 10 | 20 | 13 | 13 | 63 | 20 | 47 |
| Balance | 31 | 35 | 35 | 27 | 44 | 22 | 77 | 45 | 79 |
| Car.Evn | 144 | 432 | 123 | 62 | 100 | 119 | 72 | 160 | 41 |
| Vote | 270 | 24 | 11 | 8 | 17 | 7 | 22 | 30 | 13 |
| Tic-Tac-Toe | 258 | 121 | 88 | 37 | 21 | 13 | 23 | 60 | 14 |
| Nursery | 1240 | 804 | 301 | 172 | 288 | 141 | 141 | 175 | 109 |
| Hayes-root | 5 | 8 | 22 | 14 | 11 | 10 | 34 | 45 | 34 |
| Lymp | 129 | 19 | 20 | 10 | 17 | 11 | 23 | 60 | 29 |
| Spect.H | 145 | 2 | 9 | 13 | 17 | 12 | 4 | 50 | 11 |
| Adult | 737 | 1571 | 279 | 571 | 150 | 175 | 126 | 130 | 97 |
| Chess | 507 | 101 | 31 | 29 | 29 | 30 | 12 | 120 | 24 |
| Connect4 | 3826 | 4952 | 3973 | 3973 | 403 | 341 | 349 | 600 | 273 |
| Average | 618 | 674 | 409 | 411 | 93 | 75 | 79 | 125 | 64 |

The statistically significant win/loss counts of J&B against the other rule-based classification models on the number of classification rules are shown in Table 9.


Table 9. Statistically significant wins/losses/ties counts of the J&B method on the number of rules.

| | DTNB | DT | C4.5 | PT | FR | RDR | CBA | SA |
|---|---|---|---|---|---|---|---|---|
| Wins | 10 | 7 | 6 | 6 | 8 | 5 | 7 | 10 |
| Losses | 2 | 5 | 4 | 6 | 4 | 5 | 3 | 2 |
| Ties | 0 | 0 | 2 | 0 | 0 | 2 | 2 | 0 |

Table 9 shows that J&B produced a statistically smaller classifier than the DTNB and SA methods on 10 datasets, and than the DT, FR and CBA methods on at least 7 of the 12 datasets. Most importantly, J&B generated statistically smaller classifiers than all the other models on the bigger datasets, which was our main goal in this research. Experimental evaluations on the bigger datasets (over 10,000 samples) are shown in Figure 7.

Figure 7. Comparison of rule-based classification methods on the average number of rules.
Figure 7 demonstrates the advantage of the J&B method: it produced the smallest classifier among all rule-based classification models on the selected datasets.
Our experiments on relevance measures such as precision, recall and F-measure (average results) are summarized in Figure 8. Detailed results for each dataset can be found in Appendix A.

Figure 8. Comparison of the J&B classifier on Accuracy, Precision, Recall and F-measure.
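For reference, with TP, FP and FN denoting the true-positive, false-positive and false-negative counts for a given class, the measures reported in Figure 8 are the standard ones:

\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F\text{-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]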


Example 1. Let us assume that we have the class association rules shown in Table 3.7, generated from a dataset and satisfying the user-specified minimum support and confidence thresholds. We apply a minimum coverage threshold of 80%: when the intended classifier covers at least 80% of the training examples, we stop.

The learning (training) dataset is used to build the model.

In the first step, we sort the class association rules by confidence and support in descending order; the result is shown in Table 3.9.

In the next step, we form the classifier by selecting strong rules. We select those strong rules that improve the overall coverage, and we continue until the intended training-dataset coverage is reached. Table 3.10 illustrates our final classifier.
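This selection procedure can be summarized in a short sketch. It is a minimal illustration rather than the thesis implementation: the Rule type, the dict-based attribute encoding and the helper names are assumptions made for the example.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    antecedent: dict    # attribute -> value, e.g. {"a1": 1, "a2": 5}
    label: int          # class value the rule predicts
    support: float
    confidence: float

def matches(rule: Rule, example: dict) -> bool:
    """True when every attribute=value pair of the rule occurs in the example."""
    return all(example.get(a) == v for a, v in rule.antecedent.items())

def build_classifier(rules, train, min_coverage=0.80):
    """Greedily select strong rules until min_coverage of `train` is covered."""
    # Step 1: sort by confidence, then support, in descending order.
    ranked = sorted(rules, key=lambda r: (r.confidence, r.support), reverse=True)
    classifier, covered = [], set()
    for rule in ranked:
        newly = {i for i, ex in enumerate(train)
                 if i not in covered and matches(rule, ex)}
        if not newly:       # rule covers only already-classified examples:
            continue        # no coverage gain, so it is skipped
        classifier.append(rule)
        covered |= newly
        if len(covered) / len(train) >= min_coverage:
            break           # stopping criterion: intended coverage reached
    return classifier
```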

Our classifier includes 6 rules. In this example the intended coverage is 80%, and the 6 classification rules in the classifier cover 80% of the learning set. Since some examples in our training dataset contain missing values, the classifier in fact covers the whole usable training dataset (all examples without missing values). Other rules might cover further unclassified examples, but the user-defined training-dataset coverage threshold has already been reached; this is our stopping criterion, so no further rules are added to the classifier. We also do not include classification rules that cover only already-classified examples, since they do not contribute to improving the overall coverage. Now we classify the following unseen example:
{a1=1,a2=5,a3=5,a4=4,a5=5} ?
This example is matched by the third and fourth classification rules. Both rules predict class value 3, so our classifier predicts class 3 for the new example (the majority class among the matching rules).
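Continuing the sketch above (and reusing its Rule and matches helpers), the majority-vote prediction step can be written as follows; again, the names are illustrative only.

```python
from collections import Counter

def classify(classifier, example):
    """Predict the majority class among all rules that match the example."""
    votes = Counter(rule.label for rule in classifier if matches(rule, example))
    return votes.most_common(1)[0][0] if votes else None   # None: no rule fires

new_example = {"a1": 1, "a2": 5, "a3": 5, "a4": 4, "a5": 5}
# With two matching rules that both predict class 3, classify(...) returns 3.
```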


Example 2.





| # | Outlook | Temperature | Humidity | Windy | Play |
|---|---------|-------------|----------|-------|------|
| 1 | sunny | hot | high | FALSE | no |
| 2 | sunny | hot | high | TRUE | no |
| 3 | overcast | hot | high | FALSE | yes |
| 4 | rainy | mild | high | FALSE | yes |
| 5 | rainy | cool | normal | FALSE | yes |
| 6 | rainy | cool | normal | TRUE | no |
| 7 | overcast | cool | normal | TRUE | yes |
| 8 | sunny | mild | high | FALSE | no |
| 9 | sunny | cool | normal | FALSE | yes |
| 10 | rainy | mild | normal | FALSE | yes |
| 11 | sunny | mild | normal | TRUE | yes |
| 12 | overcast | mild | high | TRUE | yes |
| 13 | overcast | hot | normal | FALSE | yes |
| 14 | rainy | mild | high | TRUE | no |

We use the Apriori algorithm to find the class association rules, with the following parameters:
Min support: 10% (0.1)
Min confidence: 80% (0.8)
car: true (mine class association rules only)
Lowering the confidence threshold yields more rules; raising it yields fewer. This is one of the most important parameters: if it is set too low, the rule count grows and unnecessary rules are created, while if it is set too high, good rules can be lost, so it must be chosen through careful analysis. With minimum support 0.1 and confidence 0.8 we obtain 21 rules; lowering the support to 0.05 yields 72 rules. A small mining sketch is given below, followed by the rule lists for both settings (the 72-rule output first, then the 21-rule output).
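To illustrate how these thresholds shape the output, here is a small, self-contained sketch of class-association-rule mining on the weather data above. It mimics the spirit of an Apriori run with car=true; the function and parameter names are our own assumptions, and exact rule counts may differ slightly from a tool's output depending on how minimum support is rounded.

```python
from itertools import combinations

# The 14 weather examples from the table above: (outlook, temperature, humidity, windy, play).
data = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]
ATTRS = ("outlook", "temperature", "humidity", "windy")

def mine_cars(rows, min_sup, min_conf):
    """Enumerate antecedents over attribute=value pairs with the class as consequent,
    keeping rules whose support and confidence reach the given thresholds."""
    n, rules = len(rows), []
    for k in range(1, len(ATTRS) + 1):                             # antecedent length 1..4
        for idx in combinations(range(len(ATTRS)), k):             # which attributes
            for vals in {tuple(r[i] for i in idx) for r in rows}:  # occurring value combos
                cover = [r for r in rows
                         if all(r[i] == v for i, v in zip(idx, vals))]
                for cls in ("yes", "no"):
                    hits = sum(1 for r in cover if r[4] == cls)
                    sup, conf = hits / n, hits / len(cover)
                    if sup >= min_sup and conf >= min_conf:
                        ante = " ".join(f"{ATTRS[i]}={v}" for i, v in zip(idx, vals))
                        rules.append((f"{ante} ==> play={cls}", round(conf, 2)))
    return rules

print(len(mine_cars(data, 0.10, 0.8)))   # fewer rules at the higher support
print(len(mine_cars(data, 0.05, 0.8)))   # more rules at the lower support
```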




Rules mined with min support 0.05 and min confidence 0.8 (72 rules; confidence in parentheses):

1. outlook=overcast ==> play=yes (conf: 1)
2. humidity=normal windy=FALSE ==> play=yes (conf: 1)
3. outlook=sunny humidity=high ==> play=no (conf: 1)
4. outlook=rainy windy=FALSE ==> play=yes (conf: 1)
5. outlook=sunny humidity=normal ==> play=yes (conf: 1)
6. outlook=sunny temperature=hot ==> play=no (conf: 1)
7. outlook=overcast temperature=hot ==> play=yes (conf: 1)
8. outlook=overcast humidity=high ==> play=yes (conf: 1)
9. outlook=overcast humidity=normal ==> play=yes (conf: 1)
10. outlook=overcast windy=TRUE ==> play=yes (conf: 1)
11. outlook=overcast windy=FALSE ==> play=yes (conf: 1)
12. outlook=rainy windy=TRUE ==> play=no (conf: 1)
13. temperature=mild humidity=normal ==> play=yes (conf: 1)
14. temperature=cool windy=FALSE ==> play=yes (conf: 1)
15. outlook=sunny temperature=hot humidity=high ==> play=no (conf: 1)
16. outlook=sunny humidity=high windy=FALSE ==> play=no (conf: 1)
17. outlook=overcast temperature=hot windy=FALSE ==> play=yes (conf: 1)
18. outlook=rainy temperature=mild windy=FALSE ==> play=yes (conf: 1)
19. outlook=rainy humidity=normal windy=FALSE ==> play=yes (conf: 1)
20. temperature=cool humidity=normal windy=FALSE ==> play=yes (conf: 1)
21. outlook=sunny temperature=cool ==> play=yes (conf: 1)
22. outlook=overcast temperature=mild ==> play=yes (conf: 1)
23. outlook=overcast temperature=cool ==> play=yes (conf: 1)
24. temperature=hot humidity=normal ==> play=yes (conf: 1)
25. temperature=hot windy=TRUE ==> play=no (conf: 1)
26. outlook=sunny temperature=mild humidity=normal ==> play=yes (conf: 1)
27. outlook=sunny temperature=mild windy=TRUE ==> play=yes (conf: 1)
28. outlook=sunny temperature=cool humidity=normal ==> play=yes (conf: 1)
29. outlook=sunny temperature=cool windy=FALSE ==> play=yes (conf: 1)
30. outlook=sunny humidity=normal windy=TRUE ==> play=yes (conf: 1)
31. outlook=sunny humidity=normal windy=FALSE ==> play=yes (conf: 1)
32. outlook=sunny temperature=hot windy=TRUE ==> play=no (conf: 1)
33. outlook=sunny temperature=hot windy=FALSE ==> play=no (conf: 1)
34. outlook=sunny temperature=mild humidity=high ==> play=no (conf: 1)
35. outlook=sunny temperature=mild windy=FALSE ==> play=no (conf: 1)
36. outlook=sunny humidity=high windy=TRUE ==> play=no (conf: 1)
37. outlook=overcast temperature=hot humidity=high ==> play=yes (conf: 1)
38. outlook=overcast temperature=hot humidity=normal ==> play=yes (conf: 1)
39. outlook=overcast temperature=mild humidity=high ==> play=yes (conf: 1)
40. outlook=overcast temperature=mild windy=TRUE ==> play=yes (conf: 1)
41. outlook=overcast temperature=cool humidity=normal ==> play=yes (conf: 1)
42. outlook=overcast temperature=cool windy=TRUE ==> play=yes (conf: 1)
43. outlook=overcast humidity=high windy=TRUE ==> play=yes (conf: 1)
44. outlook=overcast humidity=high windy=FALSE ==> play=yes (conf: 1)
45. outlook=overcast humidity=normal windy=TRUE ==> play=yes (conf: 1)
46. outlook=overcast humidity=normal windy=FALSE ==> play=yes (conf: 1)
47. outlook=rainy temperature=mild humidity=normal ==> play=yes (conf: 1)
48. outlook=rainy temperature=cool windy=FALSE ==> play=yes (conf: 1)
49. outlook=rainy humidity=high windy=FALSE ==> play=yes (conf: 1)
50. outlook=rainy temperature=mild windy=TRUE ==> play=no (conf: 1)
51. outlook=rainy temperature=cool windy=TRUE ==> play=no (conf: 1)
52. outlook=rainy humidity=high windy=TRUE ==> play=no (conf: 1)
53. outlook=rainy humidity=normal windy=TRUE ==> play=no (conf: 1)
54. temperature=hot humidity=normal windy=FALSE ==> play=yes (conf: 1)
55. temperature=hot humidity=high windy=TRUE ==> play=no (conf: 1)
56. temperature=mild humidity=normal windy=TRUE ==> play=yes (conf: 1)
57. temperature=mild humidity=normal windy=FALSE ==> play=yes (conf: 1)
58. outlook=sunny temperature=mild humidity=normal windy=TRUE ==> play=yes (conf: 1)
59. outlook=sunny temperature=cool humidity=normal windy=FALSE ==> play=yes (conf: 1)
60. outlook=sunny temperature=hot humidity=high windy=TRUE ==> play=no (conf: 1)
61. outlook=sunny temperature=hot humidity=high windy=FALSE ==> play=no (conf: 1)
62. outlook=sunny temperature=mild humidity=high windy=FALSE ==> play=no (conf: 1)
63. outlook=overcast temperature=hot humidity=high windy=FALSE ==> play=yes (conf: 1)
64. outlook=overcast temperature=hot humidity=normal windy=FALSE ==> play=yes (conf: 1)
65. outlook=overcast temperature=mild humidity=high windy=TRUE ==> play=yes (conf: 1)
66. outlook=overcast temperature=cool humidity=normal windy=TRUE ==> play=yes (conf: 1)
67. outlook=rainy temperature=mild humidity=high windy=FALSE ==> play=yes (conf: 1)
68. outlook=rainy temperature=mild humidity=normal windy=FALSE ==> play=yes (conf: 1)
69. outlook=rainy temperature=cool humidity=normal windy=FALSE ==> play=yes (conf: 1)
70. outlook=rainy temperature=mild humidity=high windy=TRUE ==> play=no (conf: 1)
71. outlook=rainy temperature=cool humidity=normal windy=TRUE ==> play=no (conf: 1)
72. humidity=normal ==> play=yes (conf: 0.86)





Rules mined with min support 0.1 and min confidence 0.8 (21 rules; confidence in parentheses):

1. outlook=overcast ==> play=yes (conf: 1)
2. humidity=normal windy=FALSE ==> play=yes (conf: 1)
3. outlook=sunny humidity=high ==> play=no (conf: 1)
4. outlook=rainy windy=FALSE ==> play=yes (conf: 1)
5. outlook=sunny humidity=normal ==> play=yes (conf: 1)
6. outlook=sunny temperature=hot ==> play=no (conf: 1)
7. outlook=overcast temperature=hot ==> play=yes (conf: 1)
8. outlook=overcast humidity=high ==> play=yes (conf: 1)
9. outlook=overcast humidity=normal ==> play=yes (conf: 1)
10. outlook=overcast windy=TRUE ==> play=yes (conf: 1)
11. outlook=overcast windy=FALSE ==> play=yes (conf: 1)
12. outlook=rainy windy=TRUE ==> play=no (conf: 1)
13. temperature=mild humidity=normal ==> play=yes (conf: 1)
14. temperature=cool windy=FALSE ==> play=yes (conf: 1)
15. outlook=sunny temperature=hot humidity=high ==> play=no (conf: 1)
16. outlook=sunny humidity=high windy=FALSE ==> play=no (conf: 1)
17. outlook=overcast temperature=hot windy=FALSE ==> play=yes (conf: 1)
18. outlook=rainy temperature=mild windy=FALSE ==> play=yes (conf: 1)
19. outlook=rainy humidity=normal windy=FALSE ==> play=yes (conf: 1)
20. temperature=cool humidity=normal windy=FALSE ==> play=yes (conf: 1)
21. humidity=normal ==> play=yes (conf: 0.86)

Conclusion
Our experiments on accuracy and number of rules show that our method is compact, accurate and comparable with the 8 other well-known classification methods. Although it did not achieve the best average classification accuracy, it produced significantly smaller rule sets on the bigger datasets than the other classification algorithms, and it achieved reasonably high average coverage.
Statistical significance testing shows that our method was statistically better than or equal to the other classification methods on some datasets, while obtaining worse results on others. The most important achievement of this research is that J&B obtained significantly better results than all other classification methods in terms of the average number of classification rules, while remaining comparable to them in accuracy.
This research is the first and main step toward our future goal, in which we plan to cluster class association rules by their similarity and thus further reduce their number while increasing the accuracy and understandability of the classifier.


References



1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB '94), pp. 487–499, Chile (1994).

2. Ali, K., Manganaris, S., Srikant, R.: Partial classification using association rules. In: Proceedings of KDD-97, pp. 115–118, USA (1997).

3. Baralis, E., Cagliero, L., Garza, P.: A novel pattern-based Bayesian classifier. IEEE Transactions on Knowledge and Data Engineering 25(12), pp. 2780–2795 (2013).

4. Bayardo, R.J.: Brute-force mining of high-confidence classification rules. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pp. 123–126, USA (1997).

5. Breiman, L.: Random forests. Machine Learning 45(1), pp. 5–32 (2001).

6. Cendrowska, J.: PRISM: An algorithm for inducing modular rules. International Journal of Man-Machine Studies 27(4), pp. 349–370 (1987).

7. Chen, G., Liu, H., Yu, L., Wei, Q., Zhang, X.: A new approach to classification based on association rule mining. Decision Support Systems 42(2), pp. 674–689 (2006).

8. Clark, P., Niblett, T.: The CN2 induction algorithm. Machine Learning 3(4), pp. 261–283 (1989).

9. Cohen, W.W.: Fast effective rule induction. In: Proceedings of the Twelfth International Conference on Machine Learning (ICML '95), pp. 115–123, Tahoe City, California (1995).

10. Dua, D., Graff, C.: UCI Machine Learning Repository. University of California, Irvine, CA (2019).

11. Frank, E., Witten, I.: Generating accurate rule sets without global optimization. In: Proceedings of the Fifteenth International Conference on Machine Learning, pp. 144–151, USA (1998).

12. Holte, R.: Very simple classification rules perform well on most commonly used datasets. Machine Learning 11(1), pp. 63–91 (1993).

13. Kohavi, R.: The power of decision tables. In: Proceedings of the 8th European Conference on Machine Learning, pp. 174–189, Heraklion, Crete, Greece (1995).

14. Lent, B., Swami, A., Widom, J.: Clustering association rules. In: Proceedings of the Thirteenth International Conference on Data Engineering (ICDE '97), pp. 220–231, England (1997).

15. Li, W., Han, J., Pei, J.: CMAR: Accurate and efficient classification based on multiple class-association rules. In: Proceedings of the 1st IEEE International Conference on Data Mining (ICDM '01), pp. 369–376, San Jose, California, USA (2001).

16. Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rule mining. In: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 80–86, New York, USA (1998).

17. Quinlan, J.: C4.5: Programs for Machine Learning. Machine Learning 16(3), pp. 235–240 (1993).

18. Yin, X., Han, J.: CPAR: Classification based on predictive association rules. In: Proceedings of the SIAM International Conference on Data Mining, pp. 331–335, San Francisco, USA (2003).

19. Zhang, M., Zhou, Z.: A k-nearest neighbor based algorithm for multi-label classification. In: Proceedings of the 1st IEEE International Conference on Granular Computing (GrC '05), vol. 2, pp. 718–721, Beijing, China (2005).

20. Zhou, Z., Liu, X.: Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Transactions on Knowledge and Data Engineering 18(1), pp. 63–77 (2006).
