Machine Learning: 2 Books in 1: Machine Learning for Beginners, Machine Learning Mathematics. An Introduction Guide to Understand Data Science Through the Business Application




K Nearest Neighbors
K-nearest neighbors (KNN) is one of the most straightforward and widely used
methods of data classification. It’s a form of supervised learning used for
both classification and regression. Although it groups points by proximity,
it is not a clustering algorithm: clustering is unsupervised, while KNN needs
labeled examples. Simply put, it’s about taking a new data point and assigning
it to the most common class among its nearest neighbors on the scatterplot.
In KNN classification, a new data point is assigned the majority class of its
K nearest neighbors; in regression, it is assigned the average of their
values. The nearest neighbors to a new data point 'vote' for which class it
falls into. K is the number of nearest neighbors that vote in the model. You
set k to a number, and this is how many of the closest data points the model
consults when deciding where the new data point fits. The closeness of data
points is most commonly measured using Euclidean distance.
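As a minimal sketch of the distance measure mentioned above, the following
Python function computes the Euclidean distance between two points given as
tuples of coordinates. The function name and the sample points are
illustrative assumptions, not taken from the book.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points with the same dimensions."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    # Square the difference along each axis, sum, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical points: a 3-4-5 right triangle, so the distance is 5.0.
print(euclidean_distance((0, 0), (3, 4)))
```

The same function works for any number of dimensions, as long as both points
have the same length.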


Take the following two images as an example. We have our data split into
two classes: the white dots and the black dots. A new data point, the
triangle, is introduced, and we’d like to predict which class it falls
into.
In this model, we have chosen k=4, so the four closest data points are
analyzed. Whichever class is most prevalent among those neighboring data
points is the class that the new data point will be placed in. In this case,
you can see in the image on the right that the four nearest data points are
all white dots. Therefore, the new data point is classified as white.
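The white-dot and black-dot example can be sketched in a few lines of Python.
The coordinates below are made up for illustration (the book's figure gives no
numbers), and `knn_classify` is a hypothetical helper name rather than a
library function.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of ((x, y), label) pairs.
    """
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most common label among the k nearest neighbors wins the vote.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical coordinates standing in for the dots in the figure.
points = [
    ((1.0, 1.0), "white"),
    ((1.5, 2.0), "white"),
    ((2.0, 1.5), "white"),
    ((2.5, 2.5), "white"),
    ((6.0, 6.0), "black"),
    ((6.5, 7.0), "black"),
    ((7.0, 6.5), "black"),
]
triangle = (2.0, 2.0)  # the new data point

print(knn_classify(points, triangle, k=4))  # white
```

With these sample coordinates the four nearest neighbors of the triangle are
all white, so the vote is unanimous.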
There are a few factors to consider when you are choosing the value for k.
A higher k averages over more neighbors, which smooths out the noise in
individual points, but it does not keep improving accuracy indefinitely. Past
an optimal point, the vote starts to include points from other classes and
the model underfits; below that point, the model tends to overfit.
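One common way to look for a reasonable k is to try several values and compare
how well each predicts points it has not seen. The sketch below uses
leave-one-out evaluation on a small made-up dataset that contains one
deliberately mislabeled-looking point; the data, the candidate values of k, and
the helper names are all assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Predict each point from all the others and return the share correct."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out the point being predicted
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)

# Hypothetical data: two clusters plus one noisy black point among the whites.
data = [
    ((1.0, 1.0), "white"), ((1.0, 2.0), "white"),
    ((2.0, 1.0), "white"), ((2.0, 2.0), "white"),
    ((5.0, 5.0), "black"), ((5.0, 6.0), "black"),
    ((6.0, 5.0), "black"), ((6.0, 6.0), "black"),
    ((1.5, 1.5), "black"),  # noise
]

accuracy_by_k = {k: leave_one_out_accuracy(data, k) for k in (1, 3, 7)}
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
print(accuracy_by_k, best_k)
```

On this toy data the middle value, k=3, scores highest: k=1 is misled by the
single noisy point, while k=7 lets the other cluster outvote the local
neighbors. Real datasets will behave differently, so the comparison has to be
rerun for each problem.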
If you choose a number for K that is too low, your model is likely to suffer
from high variance: it becomes sensitive to noise and outliers and tends to
overfit. If you use a number that is too high, the model suffers from high
bias instead, because distant points outvote the local ones, and each
prediction has more neighbors to tally. You might consider choosing an odd
number for K rather than an even number. With two classes, an odd number
means the vote on a data point cannot end in a tie; with more than two
classes, ties are still possible. Data scientists often choose the number 5
as a default setting for k.
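To see why an odd k helps with two classes, the short sketch below counts
votes from a list of neighbor labels that is already sorted from nearest to
farthest. The label list and the `vote` helper are hypothetical, chosen only
to produce a tie at k=4.

```python
from collections import Counter

def vote(labels_by_distance, k):
    """Return the winning label among the first k neighbors, or None on a tie."""
    counts = Counter(labels_by_distance[:k]).most_common()
    # A tie occurs when the top two labels have the same number of votes.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Hypothetical neighbor labels, nearest first.
neighbors = ["white", "black", "black", "white", "white"]

print(vote(neighbors, 4))  # None: two white votes against two black votes
print(vote(neighbors, 5))  # white: three votes against two
```

How a tie is resolved in practice depends on the implementation; returning
`None` here is just a way to make the tie visible.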
Large data sets are tough to use with KNN machine learning models. The model
stores every training point, and to classify a new point by brute force it
must calculate the distance to each of them, which can mean thousands or
even millions of calculations per prediction. A larger K adds comparatively
little to this cost. KNN also tends to perform poorly on data with many
dimensions. Each distance takes longer to calculate and, more importantly,
distances between points become less informative as the number of dimensions
grows, a problem known as the curse of dimensionality.
Support Vector
A support vector machine (SVM) is another type of classifier. It classifies
using a hyperplane, a boundary chosen to separate the classes with the widest
possible margin. Generally, we would use a support vector model with smaller
datasets, where it performs quite well.
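As a rough illustration of what classifying with a hyperplane means, the
sketch below labels a point by which side of a line it falls on. The weights
`w` and offset `b` are hypothetical values picked by hand; a real support
vector model learns them from the training data, and that training step is not
shown here.

```python
def hyperplane_side(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0
    the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical hyperplane in two dimensions: the line x + y = 7.
w = (1.0, 1.0)
b = -7.0

print(hyperplane_side(w, b, (2.0, 2.0)))  # -1
print(hyperplane_side(w, b, (6.0, 6.0)))  # 1
```

In two dimensions the hyperplane is a line, in three a plane, and the same
sign test applies in any number of dimensions.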
