Machine Learning: 2 Books in 1: Machine Learning for Beginners, Machine Learning Mathematics. An Introduction Guide to Understand Data Science Through the Business Application




Dimensionality Reduction


When you are using dimensionality reduction, you are trimming down data
to remove unwanted features. Simply put, you're scaling down the number
of variables in a dataset.
When a model has a lot of variables, we run the risk of dimensionality
problems. These problems arise in datasets with many variables (often
called the curse of dimensionality) and can hurt prediction accuracy. The
more variables we have, the larger the sample we need to build the model,
because the number of possible combinations of variable values grows
quickly with each variable we add. With that many variables, it's hard to
collect enough data to cover those combinations and fit the model well.
If we use too many variables, we can also encounter overfitting, where
the model fits noise in the training data and then predicts poorly on new
data. Overfitting is the main problem that leads a data scientist to
consider dimensionality reduction.
We must identify the variables that we don't need, or that are irrelevant.
If we have a model predicting someone's income, do we need a variable that
tells us what their favorite color is? Probably not. We can drop it from
our dataset.
Usually, it's not that easy to tell when a variable should be dropped. There
are some tools we can use to determine which variables aren’t as important.
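One simple tool of this kind is to check how strongly each candidate
variable is correlated with the value we want to predict. The sketch below
is only an illustration: the column names, the numbers, and the 0.3 cutoff
are all made up, and correlation only picks up straight-line relationships,
so a low value is a hint rather than proof that a variable is useless.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: income in thousands, plus two candidate variables.
income = [30, 42, 50, 61, 75, 88]
candidates = {
    "years_experience": [1, 3, 5, 7, 10, 13],
    "arbitrary_id": [4, 1, 5, 2, 6, 3],
}

CUTOFF = 0.3  # assumed threshold, chosen only for this illustration

correlations = {name: pearson(values, income) for name, values in candidates.items()}
weak = [name for name, r in correlations.items() if abs(r) < CUTOFF]

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
print("Candidates to consider dropping:", weak)
```

Variables flagged this way are candidates for a closer look, not automatic
removals.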
Principal Component Analysis (PCA) is a method of dimensionality
reduction. It takes the original set of variables and converts them into a
new, smaller set, where each new variable is a weighted combination of the
original ones. These new variables are called principal components.
There is a tradeoff between reducing the number of variables and
maintaining the accuracy of your model: the more components you drop, the
more information you may lose.
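To make the idea of principal components and the tradeoff more concrete,
here is a minimal sketch that finds the first principal component of a
small two-variable dataset and reports how much of the total variance it
keeps. It uses only the Python standard library so that each step is
visible; the data and the function names are made up for illustration, and
in practice a library routine would normally be used instead.

```python
import math

def center(rows):
    """Subtract each column's mean so every variable has mean 0."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Power iteration: approximates the direction of largest variance."""
    d = len(cov)
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the data along v (the matching eigenvalue).
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance

# Hypothetical data: two strongly correlated variables.
rows = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]

centered = center(rows)
cov = covariance_matrix(centered)
component, variance_kept = first_principal_component(cov)

total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = variance_kept / total_variance

# Each row is reduced from two values to one score along the component.
scores = [sum(r[j] * component[j] for j in range(len(component))) for r in centered]

print("First principal component:", [round(x, 3) for x in component])
print("Share of variance kept:", round(explained_ratio, 3))
```

The share of variance kept is one way to judge the tradeoff: if a single
component keeps most of the variance, replacing the two original variables
with it loses relatively little information.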


We can also standardize the values of our variables. Make sure they are all
measured on the same relative scale so that you don't inflate the
importance of a variable. For example, if some variables are measured as a
probability between 0 and 1 and others as whole numbers above 100, the
larger numbers can dominate the result unless the variables are rescaled
first.
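One common way to standardize is to convert each variable to z-scores:
subtract the variable's mean and divide by its standard deviation. The
sketch below shows this under simple assumptions; the two columns and their
values are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical data: a probability-like column and a large whole-number column.
probability = [0.10, 0.40, 0.35, 0.80, 0.95]
purchases = [120, 340, 150, 900, 410]

z_probability = standardize(probability)
z_purchases = standardize(purchases)

print([round(z, 2) for z in z_probability])
print([round(z, 2) for z in z_purchases])
```

After this step both columns are on the same scale, so neither one
outweighs the other simply because of the units it was recorded in.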
