algorithm" and can map new input to the appropriate category. In the field
of statistics, the classification of data is often carried out with "logistic
regression", wherein the characteristics of the observations
are referred to as
"explanatory variables" or "independent variables" or "regressors" and the
categories used to generate predictions are known as "outcomes". These
"outcomes" are regarded as the probable values of the dependent variable.
In the context of machine learning, "observations are often referred to as
instances, the explanatory variables are referred to like features (grouped
into a feature vector) and the possible categories to be predicted are referred
to as classes".
The "logistic regression" technique is borrowed by machine learning from
the world of statistical analysis. "Logistic regression" is regarded as the
simplest algorithm for classification; although the term sounds like a
"regression" technique, that is not the case. "Logistic regression"
produces estimates of the likelihood of an event occurring based on one or
more input values. For example, a "logistic regression" model might
use a patient's symptoms, blood glucose level, and family history as inputs
to generate the likelihood of the patient developing diabetes. The model
generates a prediction in the form of a probability ranging from '0' to '1',
where '1' means full certainty. For the patient, if the projected probability
exceeds 0.5, the prediction would be that they will suffer from diabetes. If the
predicted probability is less than 0.5, it would be predicted that the patient
will not develop diabetes. Logistic regression allows a line to be
drawn that represents the "decision boundary".
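The thresholding step described above can be sketched in a few lines of Python; the probability values and the 0.5 cut-off are illustrative, not taken from a real model:

```python
# Sketch: turning a predicted probability into a binary class decision.
# The 0.5 threshold and the example probabilities are hypothetical.
def classify(probability, threshold=0.5):
    """Return 1 (predicted to develop diabetes) if the probability
    exceeds the threshold, otherwise 0."""
    return 1 if probability > threshold else 0

print(classify(0.73))  # above the threshold -> class 1
print(classify(0.21))  # below the threshold -> class 0
```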
It is widely used for binary classification tasks that involve two different
class values. Logistic regression is so named owing to the fundamental
statistical function at the root of this technique, the "logistic function".
Statisticians created the "logistic function", also called the "sigmoid
function", to describe population growth in ecosystems, which grows rapidly
and then levels off as it nears the maximum carrying capacity of
the environment. The logistic function is an S-shaped curve capable of
taking any real-valued number and mapping it to a value between '0' and '1',
but never precisely at those boundaries. Here 'e' is the base of the natural
logarithm (Euler's number), and the numerical value that you are
actually going to transform is called the 'value':
“1 / (1 + e^-value)”
Here is a graph of values ranging from '-5' to '5' that have been
transformed by the logistic function into the range between 0 and 1.
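A minimal Python sketch of this transformation, evaluating the logistic function over the same -5 to 5 range:

```python
import math

def logistic(value):
    """Sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

# Transform integers from -5 to 5, mirroring the range described above.
for v in range(-5, 6):
    print(f"{v:+d} -> {logistic(v):.4f}")
```

Note that the output approaches 0 and 1 at the ends of the range but never reaches them exactly, as stated above.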
Similar to the "linear regression" technique, "logistic regression" uses an
equation to represent the data.
Input values (X) are combined linearly to forecast an output value (Y),
using weights or coefficient values (represented by the symbol "Beta"). The
main difference from "linear regression" is that the modeled output
value is binary (0 or 1) instead of a continuous range of values.
Below is an example of the "logistic regression" equation, where 'b1' is the
coefficient of the single input value (X), 'b0' is the intercept or bias
term, and 'Y' is the expected result. Every column in
the input data set has an associated coefficient 'b', which must be
learned from the training data set. The actual model
representation, stored in a file or in system memory, consists of
the coefficients in the equation (the beta values).
"y=e^(b0 + b1*x)/(1 +e^(b0 + b1*x))"
The "logistic regression" algorithm's coefficients (the beta values) must be
estimated on the basis of the training data. This can be accomplished using
another statistical technique called "maximum-likelihood estimation",
which is a popular ML algorithm utilized using a multitude of other
ML algorithms. "Maximum-likelihood estimation" works by making certain
assumptions about the distribution of the input data set.
An ML model that predicts a value near '0' for the "other class"
and a value near '1' for the "default class" can be obtained by
finding the best coefficients of the model. Intuitively, maximum
likelihood for the "logistic regression" technique means that a search
procedure attempts to find values for the coefficients that reduce the
error in the probabilities estimated by the model on the input data
set (e.g. a probability of '0' if the input is not the default class).
Without going into mathematical details, it is sufficient to state that a
minimization algorithm is used to optimize the coefficient
values on your training data set. In practice, this can be
achieved with an efficient "numerical optimization algorithm",
for example, a "quasi-Newton" method.
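To make the estimation step concrete, here is a toy sketch that fits 'b0' and 'b1' by gradient descent on the negative log-likelihood (the minimization described above). The data points and learning rate are made up for illustration; a quasi-Newton optimizer would typically be used in practice instead of this plain gradient descent:

```python
import math

# Toy training data: (input value, class label), invented for illustration.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Gradient descent on the negative log-likelihood of the training data.
b0, b1, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    g0 = g1 = 0.0
    for x, y in data:
        err = sigmoid(b0 + b1 * x) - y  # gradient term per example
        g0 += err
        g1 += err * x
    b0 -= lr * g0
    b1 -= lr * g1

# The fitted model should now predict a probability well below 0.5
# for the first group of inputs and well above 0.5 for the second.
print(sigmoid(b0 + b1 * 0.5), sigmoid(b0 + b1 * 4.0))
```

The key point matches the text: the coefficients are chosen so the predicted probabilities move toward '0' for one class and toward '1' for the other.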