machine learning. Machine learning is closely related to computational
statistics, which focuses on making predictions using computers. The study
of “mathematical optimization” supplies the field of machine learning with
methods, theory, and application domains. When it is applied to business
problems, machine learning is also referred to as “predictive analytics”. In
ML, the “target” is known as the “label”, while in statistics it is called
the “dependent variable”. A “variable” in statistics is known as a “feature”
in ML, and “feature creation” in ML is known as “transformation” in
statistics.
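The terminology above can be made concrete with a small sketch. The records, the column names, and the helper functions below are hypothetical and chosen only for illustration; they show one possible way to separate “features” from the “label” and to derive a new feature.

```python
# Hypothetical toy records: each row describes one house.
rows = [
    {"area_m2": 50.0, "rooms": 2, "price": 150_000},
    {"area_m2": 80.0, "rooms": 3, "price": 240_000},
    {"area_m2": 120.0, "rooms": 4, "price": 330_000},
]

# ML calls this column the label (or target); statistics calls it the
# dependent variable.
LABEL = "price"


def split_features_label(row):
    """Separate one record into its features (variables) and its label."""
    features = {name: value for name, value in row.items() if name != LABEL}
    return features, row[LABEL]


def add_area_per_room(features):
    """Feature creation (ML) or transformation (statistics): derive a new column."""
    out = dict(features)
    out["area_per_room"] = features["area_m2"] / features["rooms"]
    return out


X, y = [], []
for row in rows:
    features, label = split_features_label(row)
    X.append(add_area_per_room(features))
    y.append(label)

print(X[0])
print(y)
```

Here X holds the features of every record, including the derived one, and y holds the labels that a model would be trained to predict.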
ML technology is also closely related to data mining and optimization. ML
and data mining often utilize the same techniques with considerable
overlap. ML focuses on generating predictions based on known properties
learned from the training data. Data mining, on the other hand, pertains to
the discovery of previously unknown properties in a large volume of data.
Data mining utilizes many techniques of ML, but with distinct objectives;
similarly, machine learning utilizes techniques of data mining, either as
“unsupervised learning algorithms” or as a pre-processing phase to enhance
the prediction accuracy of the model. Much of the confusion between these
two research areas stems from the fundamental assumptions with which they
operate.
In machine learning, performance is generally assessed by the capacity of
the model to reproduce known knowledge, while in “knowledge discovery and
data mining (KDD)” the main job is to discover previously unknown
information. An “uninformed or unsupervised” technique, evaluated in terms
of known information, will be easily outperformed by “supervised
techniques”. Conversely, “supervised techniques” cannot be used in a typical
“KDD” task owing to the lack of training data.
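A deliberately contrived sketch can illustrate why an unsupervised technique tends to lose when it is scored against known labels. It assumes a hypothetical data set in which the most visible grouping (the first coordinate) has nothing to do with the labels (which depend on the second coordinate). A simple 2-means clustering is compared with a nearest-centroid classifier that is allowed to see the labels; both are stand-ins chosen for brevity, not methods prescribed by the text.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def two_means(points, iterations=10):
    """Plain 2-means clustering, seeded with the first and last point."""
    centroids = [points[0], points[-1]]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean_point(g) if g else c for g, c in zip(groups, centroids)]
    return [nearest(p, centroids) for p in points]


def accuracy(predicted, true):
    """Fraction of positions where the prediction equals the known label."""
    return sum(p == t for p, t in zip(predicted, true)) / len(true)


# Hypothetical data: the first coordinate forms two obvious clusters,
# but the known labels depend only on the second coordinate.
points = [(0.0, 0.0), (0.0, 0.2), (10.0, 0.0), (10.0, 0.2),
          (0.0, 1.0), (0.0, 1.2), (10.0, 1.0), (10.0, 1.2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Unsupervised: cluster without labels, then score against the labels
# under the more favourable of the two cluster-to-label mappings.
clusters = two_means(points)
unsup_acc = max(accuracy(clusters, labels),
                accuracy([1 - c for c in clusters], labels))

# Supervised: one centroid per known label, then nearest-centroid prediction.
class_centroids = [
    mean_point([p for p, lab in zip(points, labels) if lab == k]) for k in (0, 1)
]
sup_acc = accuracy([nearest(p, class_centroids) for p in points], labels)

print("unsupervised accuracy:", unsup_acc)
print("supervised accuracy:", sup_acc)
```

The clustering recovers a real structure in the data, just not the one the labels describe, which is closer to the “KDD” goal of finding something previously unknown.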
Optimization is another area that machine learning is closely linked
with. Many learning problems can be formulated as the minimization of a
“loss function” on a training data set. “Loss functions” express the
discrepancy between the predictions generated by the model being trained
and the actual values of the training examples. The distinction between the
two areas stems from the objective of “generalization”. Optimization
algorithms are designed to decrease the loss on the training data set,
whereas the objective of machine learning is to minimize the loss on unseen
data from the real world.
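The difference between minimizing training loss and generalizing can be sketched with a minimal example. The data points, the mean squared error loss, and the two toy models below (a lookup table that memorizes the training set and a least-squares straight line) are assumptions made purely for illustration.

```python
# Hypothetical data: y is roughly 2*x + 1 plus a little noise.
train = [(0.0, 1.2), (1.0, 2.9), (2.0, 5.1), (3.0, 6.8), (4.0, 9.1)]
unseen = [(0.5, 2.1), (1.5, 3.9), (2.5, 6.0), (3.5, 8.1)]


def mse(predict, data):
    """Mean squared error: the average squared gap between prediction and actual value."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def fit_memorizer(data):
    """Drive the training loss to zero by storing every training pair.

    For inputs it has never seen, it falls back to the mean training label.
    """
    table = dict(data)
    fallback = sum(y for _, y in data) / len(data)
    return lambda x: table.get(x, fallback)


def fit_line(data):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


memorizer = fit_memorizer(train)
line = fit_line(train)

for name, model in [("memorizer", memorizer), ("line", line)]:
    print(f"{name}: train={mse(model, train):.3f} unseen={mse(model, unseen):.3f}")
```

The memorizer is the better optimizer of the training loss, yet the straight line has the lower loss on the unseen points, which is the quantity machine learning actually cares about.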
Machine learning has become such a “hot” topic that its definition varies
across academia, industry, and the scientific community. Here are some of
the commonly accepted definitions from select well-known sources:
“Machine learning is based on algorithms that can learn from
data without relying on rules-based programming.” – McKinsey.
“Machine Learning, at its most basic, is the practice of using
algorithms to parse data, learn from it, and then make a
determination or prediction about something in the world.” –
Nvidia
“The field of Machine Learning seeks to answer the question,
how can we build computer systems that automatically improve
with experience, and what are the fundamental laws that govern
all learning processes?” – Carnegie Mellon University
“Machine learning is the science of getting computers to act
without being explicitly programmed.” – Stanford University