Machine Learning: 2 Books in 1: Machine Learning for Beginners, Machine Learning Mathematics. An Introduction Guide to Understand Data Science Through the Business Application



Date: 22.06.2022

Clustering
Clustering is a subfield of unsupervised learning. It is the task of
grouping similar things together. When we use clustering, we can identify
characteristics and sort our data based on these characteristics. If we are
using machine learning for marketing, clustering can help us identify
similarities among groups of customers or potential clients. Unsupervised
learning can help us sort customers into categories that we might not have
created without the help of machine learning. It can also help us sort our
data when we are working with a large number of variables.
K-Means clustering
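K-means clustering resembles K-nearest neighbors in that both depend on a
number k and on distances between data points, although K-nearest neighbors
is a supervised method and K-means is unsupervised. You pick a number for k
to decide how many groups you want to see. You then assign points to
clusters and repeat until the clusters stop changing.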


Your data is grouped around centroids, which are the points on your graph
that serve as the centers of your clusters. You choose their starting
positions at random, and you have k of them. Once you introduce your data
to the model, each data point is placed in the category of its closest
centroid, as measured by Euclidean distance. Then you take the average of
the data points assigned to each centroid and move the centroid to that
average. Keep repeating this process until your results stay the same and
you have consistent clusters. Each data point is assigned to only one
cluster.
In each repetition, you find the average values for x and y within each
cluster. These averages give the new position of each centroid, which
summarizes the typical data point in its cluster. K-means clustering can
help you identify previously unknown or overlooked patterns in the data.
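The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the book's own code: the function name kmeans, the sample points, the fixed random seed, and the choice to start from k randomly picked data points are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its closest centroid
        # (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean x and y of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Hypothetical sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Each point receives exactly one label, so it belongs to exactly one cluster. The loop ends when an assignment pass leaves every label unchanged, which is the "results stay the same" condition in the text.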
Choose the value for k that best matches the number of categories you
want to create. Ideally, you should have more than 3. However, the
advantage of adding more clusters diminishes the higher the number of
clusters you have. The higher the value for k that you choose, the smaller
and more specific the clusters are. You wouldn’t want to use a value for k
that is the same as the number of data points, because each data point
would end up in its own cluster.
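One way to see the diminishing benefit of extra clusters is to measure how tightly points sit around their centroids for each value of k. The sketch below uses the within-cluster sum of squared distances as that measure. The measure itself, the helper names kmeans_labels and within_cluster_ss, the sample points, and the use of five random starts per k are assumptions for illustration, not something the text prescribes.

```python
import math
import random


def kmeans_labels(points, k, seed=0, max_iters=100):
    """Minimal K-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random data points as starting centroids
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def within_cluster_ss(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2
               for p, lab in zip(points, labels))


# Hypothetical sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

scores = {}
for k in range(1, len(points) + 1):
    # Keep the best of a few random starts, since one start can get stuck.
    scores[k] = min(
        within_cluster_ss(points, *kmeans_labels(points, k, seed=s))
        for s in range(5)
    )
    print(k, round(scores[k], 3))
```

On this data the score drops sharply from k = 1 to k = 2 and changes little afterwards. When k equals the number of data points the score reaches zero, because every point is its own cluster, which is why a perfect score at that extreme says nothing useful about the data.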
You will have to know your dataset well and use your intuition to guess
how many clusters are appropriate and what sort of differences will be
present. However, our intuition and knowledge of the data are less helpful
once we have more than just a few potential groups.
