MINISTRY OF HIGHER AND SECONDARY SPECIALIZED EDUCATION OF THE REPUBLIC OF UZBEKISTAN
NATIONAL UNIVERSITY OF UZBEKISTAN NAMED AFTER MIRZO ULUGBEK
FACULTY OF APPLIED MATHEMATICS AND INTELLIGENT TECHNOLOGIES
DEPARTMENT OF ARTIFICIAL INTELLIGENCE
EVENING PROGRAMME
FIELD OF STUDY: APPLIED MATHEMATICS
COURSE: ARTIFICIAL INTELLIGENCE AND NEURAL NETWORK TECHNOLOGIES
PRACTICAL ASSIGNMENT 1
TOPIC: KNN
Completed by:_________________________________
Accepted by:______________________________
Tashkent, 2023
K-Nearest Neighbors (KNN)
The k nearest neighbors algorithm
Idea: assign to a test object the label held by the majority of its k nearest objects in the training sample.
In practice k is usually chosen to be odd, so that a vote between two classes cannot end in a tie.
k = 1 gives the nearest neighbor algorithm.
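The nearest neighbor algorithm
Let a training sample be given:
$$X^{\ell} = \{(x_i, y_i)\}_{i=1}^{\ell}, \quad x_i \in X, \; y_i \in Y.$$
Let a distance function be defined on the set of objects:
$$\rho : X \times X \to [0, \infty).$$
For an arbitrary object u, arrange the training objects in order of increasing distance from u:
$$\rho(u, x_u^{(1)}) \le \rho(u, x_u^{(2)}) \le \dots \le \rho(u, x_u^{(\ell)}),$$
where $x_u^{(i)}$ is the i-th nearest neighbor of u and $y_u^{(i)}$ is its label.
General form of the nearest neighbor algorithm:
$$a(u) = \arg\max_{y \in Y} \sum_{i=1}^{\ell} \left[ y_u^{(i)} = y \right] w(i, u),$$
where $w(i, u)$ is a weight function that expresses the importance of the i-th neighbor of u. The k nearest neighbors algorithm corresponds to
$$w(i, u) = [i \le k].$$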
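The general form above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names `knn_predict` and `k_nearest_weight` are hypothetical, the Euclidean distance is assumed for ρ, and the weight function is assumed to receive the neighbor's rank i and its distance to u.

```python
import numpy as np

def knn_predict(u, X_train, y_train, weight):
    """Return the class with the largest summed neighbor weight w(i, u)."""
    # Distances from u to every training object (Euclidean, by assumption).
    dists = np.linalg.norm(X_train - u, axis=1)
    # Training indices ordered by increasing distance from u.
    order = np.argsort(dists, kind="stable")
    votes = {}
    for rank, idx in enumerate(order, start=1):
        label = int(y_train[idx])
        votes[label] = votes.get(label, 0.0) + weight(rank, dists[idx])
    return max(votes, key=votes.get)

def k_nearest_weight(k):
    """w(i, u) = [i <= k]: only the k closest objects vote, each with weight 1."""
    return lambda rank, dist: 1.0 if rank <= k else 0.0

# Hypothetical toy data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(np.array([0.5, 0.5]), X_train, y_train, k_nearest_weight(3)))
print(knn_predict(np.array([5.5, 5.5]), X_train, y_train, k_nearest_weight(3)))
```

Passing a different `weight` callable, for example one that decreases with distance, changes the voting rule without touching the rest of the function.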
Example of the k nearest neighbors algorithm at work
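As one possible illustration, the sketch below classifies a single query point with scikit-learn's `KNeighborsClassifier` for several values of k. The data set is hypothetical: class 0 is clustered near the origin, class 1 near (4, 4), and one point labelled 1 is deliberately placed inside the class 0 cluster to play the role of noise.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data; the last point is a class 1 "noise" object
# sitting inside the class 0 cluster.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [4.0, 4.0], [4.0, 5.0], [5.0, 4.0],
              [0.6, 0.6]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

query = np.array([[0.7, 0.7]])

predictions = {}
for k in (1, 3, 5):
    model = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    # kneighbors returns the distances to, and indices of, the k nearest objects.
    _, idx = model.kneighbors(query)
    predictions[k] = int(model.predict(query)[0])
    print(f"k={k}: neighbor labels {y[idx[0]].tolist()} -> class {predictions[k]}")
```

With k = 1 the query takes the label of the noise point (class 1), while with k = 3 and k = 5 the surrounding class 0 objects outvote it.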
Issues with the k nearest neighbors algorithm
The value of k: how many nearest neighbors should be taken?
If k is too small, the classifier becomes sensitive to noise points.
If k is too large, the neighborhood is more likely to include members of other classes.
How should the distance between objects be computed?
Computational complexity, which grows with:
the size of the training sample
the dimensionality of the data
"Lazy" learning: the model simply stores the training sample, and all computation is deferred to prediction time.
Metrics ρ(x, y)
Euclidean distance:
$$\rho(x, y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2}$$
Manhattan distance (also called city block distance or taxicab geometry):
$$\rho(x, y) = \sum_{j=1}^{n} |x_j - y_j|$$
Chebyshev distance:
$$\rho(x, y) = \max_{1 \le j \le n} |x_j - y_j|$$
Minkowski distance:
$$\rho(x, y) = \left( \sum_{j=1}^{n} |x_j - y_j|^{p} \right)^{1/p}$$
p = 1: Manhattan distance
p = 2: Euclidean distance
p = ∞: Chebyshev distance
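The relationship between the four metrics can be checked numerically. The helper `minkowski` below is a hypothetical name introduced for illustration; it treats p = ∞ as the Chebyshev limit, and the two vectors are arbitrary sample values.

```python
import numpy as np

def minkowski(x, y, p):
    """Minkowski distance of order p; p = np.inf gives the Chebyshev distance."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if np.isinf(p):
        return float(diff.max())
    return float((diff ** p).sum() ** (1.0 / p))

x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

print(minkowski(x, y, 1))       # Manhattan: 3 + 4 + 0 = 7.0
print(minkowski(x, y, 2))       # Euclidean: sqrt(9 + 16 + 0) = 5.0
print(minkowski(x, y, np.inf))  # Chebyshev: max(3, 4, 0) = 4.0
```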
Program:
# Select the Tk GUI backend; this must happen before pyplot is imported.
import matplotlib
matplotlib.use('TkAgg')
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, classification_report
from matplotlib import pyplot as plt
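# Load the Iris data set and put its four features into a DataFrame.
iris = load_iris()
print(iris.target_names)
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Add the class label (0 = setosa, 1 = versicolor, 2 = virginica) as a column.
df['target'] = iris.target
print(df.shape)
print(df.head())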
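# The Iris rows are ordered by class, 50 per class, so slicing separates the species.
df0 = df[:50]
df1 = df[50:100]
df2 = df[100:]
# Sepal length against sepal width; the labels give plt.legend() something to show.
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.scatter(df0['sepal length (cm)'], df0['sepal width (cm)'], color='green', marker='*', label=iris.target_names[0])
plt.scatter(df1['sepal length (cm)'], df1['sepal width (cm)'], color='red', marker='^', label=iris.target_names[1])
plt.scatter(df2['sepal length (cm)'], df2['sepal width (cm)'], color='black', marker='D', label=iris.target_names[2])
plt.legend()
plt.show()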
# Petal length against petal width, using the same marker per class as the sepal plot.
plt.xlabel('petal length (cm)')
plt.ylabel('petal width (cm)')
plt.scatter(df0['petal length (cm)'], df0['petal width (cm)'], color='green', marker='*', label=iris.target_names[0])
plt.scatter(df1['petal length (cm)'], df1['petal width (cm)'], color='red', marker='^', label=iris.target_names[1])
plt.scatter(df2['petal length (cm)'], df2['petal width (cm)'], color='black', marker='D', label=iris.target_names[2])
plt.legend()
plt.show()
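# Features are every column except the label.
x = df.drop(['target'], axis='columns')
y = df.target
# Hold out 30% of the rows for testing; a fixed random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)
print(len(x_train))
print(len(x_test))
# Fit a 5-nearest-neighbor classifier and report its accuracy on the test set.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(x_train, y_train)
baxo = knn.score(x_test, y_test)
print(baxo)
# Confusion matrix and per-class precision, recall and F1 on the test set.
y_pred = knn.predict(x_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))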
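As a possible follow-up, the same experiment can be repeated with the distance metrics discussed earlier. The sketch below assumes the same 70/30 split and k = 5, and relies on the `metric` and `p` parameters of scikit-learn's `KNeighborsClassifier`; the dictionary names are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# One classifier configuration per metric.
settings = {
    "manhattan (p=1)": {"metric": "minkowski", "p": 1},
    "euclidean (p=2)": {"metric": "minkowski", "p": 2},
    "chebyshev (p=inf)": {"metric": "chebyshev"},
}

accuracy = {}
for name, params in settings.items():
    model = KNeighborsClassifier(n_neighbors=5, **params)
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(f"{name:18s} accuracy = {accuracy[name]:.3f}")
```

On a data set as small and well separated as Iris the differences are likely to be minor; the choice of metric tends to matter more when features have very different scales.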