Split the data into training and test sets:
TrainD ← Split[D, size = 0.8], TrainL ← Split[L, size = 0.8]
TestD ← Split[D, size = 0.2], TestL ← Split[L, size = 0.2]
TF-IDF feature extraction:
for each ngram ∈ {unigram, bigram, trigram} do
  TrainF ← TfidfVectorizer(analyzer = 'word', ngram, TrainD)
  TestF ← TfidfVectorizer(analyzer = 'word', ngram, TestD)
  Train and evaluate the classifiers:
  for c ∈ {Linear SVC, RBF SVM, DTC, RF, LR, MNB} do
    Initialize the hyperparameters of c
    Train c with TrainF and TrainL
    Test c with TestF and TestL
    Compute the scores
    Compute the confusion matrix
  end for
for each ngram ∈ {bigram, trigram, fourgram} do
  TrainF ← TfidfVectorizer(analyzer = 'char', ngram, TrainD)
  TestF ← TfidfVectorizer(analyzer = 'char', ngram, TestD)
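
The steps above can be sketched with scikit-learn, whose TfidfVectorizer and LinearSVC classes match the names in the listing. This is a minimal illustration under several assumptions, not the original implementation: the toy corpus, the function names `make_classifiers` and `run_experiment`, the default hyperparameters, and accuracy as the score are all hypothetical. Each n-gram setting is read as a single order (for example, bigram as `ngram_range=(2, 2)`), which the listing does not specify. The sketch also fits the vectorizer on TrainD only and reuses it for TestD, so that both feature matrices share one vocabulary.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus, for illustration only (not the source's data).
D = [
    "the movie was really great and fun",
    "what a wonderful and moving film",
    "great acting and a wonderful story",
    "really enjoyed this great film",
    "fun story with wonderful acting",
    "the movie was boring and bad",
    "what a terrible and dull film",
    "bad acting and a boring story",
    "really hated this terrible film",
    "dull story with bad acting",
]
L = ["pos"] * 5 + ["neg"] * 5

# Assumption: each setting uses n-grams of exactly one order.
WORD_NGRAMS = {"unigram": (1, 1), "bigram": (2, 2), "trigram": (3, 3)}
CHAR_NGRAMS = {"bigram": (2, 2), "trigram": (3, 3), "fourgram": (4, 4)}


def make_classifiers():
    """Return fresh classifiers; the hyperparameters are placeholder defaults."""
    return {
        "Linear SVC": LinearSVC(),
        "RBF SVM": SVC(kernel="rbf"),
        "DTC": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(n_estimators=50, random_state=0),
        "LR": LogisticRegression(max_iter=1000),
        "MNB": MultinomialNB(),
    }


def run_experiment(docs, labels, test_size=0.2, seed=0):
    """Run every (analyzer, n-gram, classifier) combination once."""
    # 80/20 split; stratify keeps both classes in the small test set.
    train_d, test_d, train_l, test_l = train_test_split(
        docs, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    class_names = sorted(set(labels))
    results = {}
    for analyzer, ngrams in (("word", WORD_NGRAMS), ("char", CHAR_NGRAMS)):
        for ngram_name, ngram_range in ngrams.items():
            vectorizer = TfidfVectorizer(analyzer=analyzer, ngram_range=ngram_range)
            train_f = vectorizer.fit_transform(train_d)  # learn vocabulary and IDF
            test_f = vectorizer.transform(test_d)        # reuse them for the test set
            for clf_name, clf in make_classifiers().items():
                clf.fit(train_f, train_l)
                predicted = clf.predict(test_f)
                results[(analyzer, ngram_name, clf_name)] = {
                    "accuracy": accuracy_score(test_l, predicted),
                    "confusion": confusion_matrix(
                        test_l, predicted, labels=class_names
                    ),
                }
    return results


if __name__ == "__main__":
    for key, scores in run_experiment(D, L).items():
        print(key, round(scores["accuracy"], 2))
```

On a corpus this small the scores carry no meaning; the sketch only shows how the loops nest. Classifiers are re-created for each feature setting so that no fitted state leaks from one run to the next.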