output”. A linear connection between the output and input variables
is also assumed. Transforming the input variables may therefore lead
to a more accurate model that better reveals the linear relationships
in the data set. For instance, single-variable transformations such as
"log", "root" and "Box-Cox" can be used to expose these relationships,
as sketched below.
“Remove Correlated Inputs”: If you have several highly correlated
inputs, the model could potentially be “over-fit”, just as with the
"linear regression" technique. To address this issue, you can
“calculate the pairwise correlations between all input data points and
remove the highly correlated inputs”.
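As a rough illustration of that procedure, the following sketch computes pairwise correlations with pandas and drops one column from each highly correlated pair; the 0.9 threshold and the toy data are assumptions.

```python
# A minimal sketch of removing highly correlated inputs.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": a * 2 + rng.normal(scale=0.01, size=200),  # nearly a copy of "a"
    "c": rng.normal(size=200),
})

corr = df.corr().abs()                                    # pairwise correlations
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
reduced = df.drop(columns=to_drop)                        # keeps "a" and "c"
print(to_drop)
```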
“Failure to converge”: The "expected likelihood estimation" process
that learns the coefficients may fail to converge. This can occur if
the data set contains several highly correlated inputs or the data is
very sparse (e.g. lots of "0" values in the input data).
“Naïve Bayes classifier algorithm” is another “classification” learning
algorithm with a wide variety of applications. It is a method of
classification derived from the "Bayes theorem", which assumes predictors
are independent of one another. Simply put, a "Naïve Bayes
classifier" assumes that “all the features in
a class are unrelated to the
existence of any other feature in that class”. For instance, if the input data
contains an image of a fruit that is green, round, and about 10 inches in
diameter, the model can consider the input to be a watermelon. Even if these
attributes depend on one another or on the presence of some other feature,
each characteristic contributes independently to the probability that the
image of the fruit is that of a watermelon, which is why the method is
referred to as "Naive". A “Naïve Bayes model” is relatively simple to
construct and extremely effective on large volumes of data.
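The independence assumption from the watermelon example can be written out as a product of per-feature likelihoods; the probabilities in the short sketch below are made-up illustrative numbers, not values from the text.

```python
# A minimal sketch of the "naive" independence assumption.
p_green_given_watermelon = 0.80
p_round_given_watermelon = 0.90
p_large_given_watermelon = 0.70   # "about 10 inches in diameter"

# Naive Bayes treats the features as independent within the class, so the
# joint likelihood is just the product of the per-feature likelihoods.
likelihood = (p_green_given_watermelon
              * p_round_given_watermelon
              * p_large_given_watermelon)
print(likelihood)  # 0.504
```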
In addition to being simple to develop, “Naïve Bayes” has reportedly
outperformed even highly advanced classification techniques.
The “Bayes theorem” provides the means to calculate the posterior
probability "P(c|x)" from "P(c)", "P(x)" and "P(x|c)". The equation, shown
in the picture below, is "P(c|x) = P(x|c) * P(c) / P(x)"; it gives the
probability of “c” given that “x” has already occurred.
“P(c|x)" is the posterior probability of "class (c, target)" provided by the
"predictor (x, attributes)". “P(c)" is the class's previous probability. “P(x|c)”
is the probability of the class provided by the predictor. “P(x)” is the
predictor's prior probability.
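Putting these four quantities together, a small worked computation of the posterior might look as follows; the numeric values are illustrative assumptions.

```python
# A minimal sketch of the posterior computation P(c|x) = P(x|c) * P(c) / P(x).
p_c = 0.3          # prior probability of the class
p_x_given_c = 0.6  # likelihood of the predictor given the class
p_x = 0.4          # prior probability of the predictor

p_c_given_x = p_x_given_c * p_c / p_x
print(p_c_given_x)  # 0.45
```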
Here is an example to better explain the application of the "Bayes
Theorem". The picture below represents the data set on the problem of
identifying suitable weather days to play golf.
The columns depict the
weather features of the day and the rows contain individual entries.
Considering the first row of the data set, it can be concluded that the
day is too hot, humid and rainy, so it is not suitable for playing golf.
Now, the primary assumption here is that all of these features or
predictors are independent of one another. The other assumption made here
is that all the predictors have an equal effect on the result; that is,
the fact that the day is windy matters as much to the decision to play
golf as the fact that it is rainy. In this example, the variable (c) is
the class (playing golf), representing the decision of whether the weather
is suitable for golf, and the variable (x) represents the features or
predictors.
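Because the data set picture is not reproduced here, the following sketch uses a small hypothetical stand-in table, with only an "Outlook" predictor, to show how the counts turn into the posterior P(c|x) for each class.

```python
# A minimal sketch of the golf example; the (Outlook, Play) rows below are a
# hypothetical stand-in for the data set shown in the picture.
from collections import Counter

rows = [
    ("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
    ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "No"),
    ("Sunny", "Yes"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Overcast", "Yes"), ("Rainy", "No"),
]

outlook = "Sunny"
n = len(rows)
play_counts = Counter(play for _, play in rows)           # class counts for P(c)
p_x = sum(1 for o, _ in rows if o == outlook) / n         # P(x): predictor prior

for c in ("Yes", "No"):
    p_c = play_counts[c] / n                                              # P(c)
    p_x_given_c = sum(1 for o, p in rows
                      if o == outlook and p == c) / play_counts[c]        # P(x|c)
    print(c, p_x_given_c * p_c / p_x)                                     # P(c|x)
```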