present
such that all “confidence vectors” $\alpha_i$ in the sample are 0 on the coordinate $j$, i.e. $\alpha_i[j] = 0$ for all $i = 1, \dots, m$. Let $e_j \in H$ be the “standard basis vector corresponding to this coordinate”. Then, in the equation shown in the picture below, $F_S^{(3)}(\cdot)$ represents the empirical risk with respect to the function $f^{(3)}(\cdot)$.
On the other hand, if $F^{(3)}(\cdot)$ (without the subscript $S$) denotes the actual risk for the function $f^{(3)}(\cdot)$, the equation shown in the picture below is obtained.
Thus, for any sample size $m$, a convex, Lipschitz-continuous objective can be constructed in a sufficiently high dimension such that, with probability at least 0.63 over the sample, $\sup_h |F^{(3)}(h) - F_S^{(3)}(h)| \geq \tfrac{1}{2}$. In addition, since $f(\cdot\,;\cdot)$ is non-negative, $e_j$ is an “empirical minimizer”, even though its expected value $F^{(3)}(e_j) = \tfrac{1}{2}$ is not at all close to the optimal expected value $\min_h F^{(3)}(h) = F^{(3)}(0) = 0$.
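The 0.63 figure can be checked with a short calculation. The sketch below is only an illustration: it assumes that every entry $\alpha_i[j]$ is an independent fair coin over $\{0, 1\}$ and that the dimension is $d = 2^m$ (the dimension is an assumption made here, not a value stated in this passage). The function name `prob_all_zero_coordinate` is hypothetical.

```python
import math


def prob_all_zero_coordinate(m: int, d: int) -> float:
    """Probability that at least one of d coordinates is 0 in all m samples.

    Assumes every entry alpha_i[j] is an independent fair coin over {0, 1}.
    """
    p_single = 0.5 ** m  # a fixed coordinate is 0 in all m samples
    return 1.0 - (1.0 - p_single) ** d


if __name__ == "__main__":
    for m in (1, 2, 5, 10, 20):
        d = 2 ** m  # assumed dimension, chosen for illustration
        print(m, d, round(prob_all_zero_coordinate(m, d), 4))
    # The values decrease towards 1 - 1/e as m grows.
    print("1 - 1/e =", round(1.0 - math.exp(-1.0), 4))
```

Under these assumptions the probability stays above $1 - 1/e \approx 0.632$ for every $m$, and it only increases if the dimension is made larger.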
To give a counter-example that does not depend on the sample size, let $H$ be the unit ball (the set of $h$ with $\|h\| \leq 1$) of an infinite-dimensional Hilbert space with orthonormal basis $e_1, e_2, \dots$, where for $v \in H$ we refer to its coordinates $v[j] = \langle v, e_j \rangle$ with respect to this basis. The “confidences” $\alpha$ now map every coordinate to $[0, 1]$; that is, $\alpha$ is an “infinite sequence of reals in $[0, 1]$”. The element-wise product $\alpha * v$ is defined with respect to this mapping, and the objective function $f^{(3)}(\cdot)$ of the equation shown in the first picture of this example is well defined in this infinite-dimensional space.
Let us now reconsider the distribution over $z = (x, \alpha)$ where $x = 0$ and $\alpha$ is an infinite independent and identically distributed sequence of “uniform Bernoulli random variables” (that is, a “Bernoulli process with each $\alpha[i]$ uniform over $\{0, 1\}$ and independent of all other $\alpha[j]$”). For any finite sample there is then, almost surely, a coordinate $j$ with $\alpha_i[j] = 0$ for all $i$, because each coordinate is zero in all $m$ samples with probability $2^{-m} > 0$ and there are infinitely many independent coordinates. Therefore, an empirical minimizer with $F_S^{(3)}(e_j) = 0$ and $F^{(3)}(e_j) = 1/2 > 0 = F^{(3)}(0)$ can be obtained.
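The gap between the empirical and the expected risk of $e_j$ can be illustrated with a small simulation. This is a sketch under stated assumptions, not the construction itself: the infinite sequence is truncated to a finite dimension `d`, and $f^{(3)}$ is assumed to have the form $\|\alpha * (h - x)\|$, which for $x = 0$ and $h = e_j$ reduces to $|\alpha[j]|$. The names `f3_at_basis_vector` and `simulate` are hypothetical.

```python
import random


def f3_at_basis_vector(alpha, j):
    # Assumed form f3(h; (x, alpha)) = ||alpha * (h - x)|| with x = 0, h = e_j,
    # which reduces to |alpha[j]|.
    return abs(alpha[j])


def simulate(m=5, d=320, n_fresh=20000, seed=0):
    """Return (j, empirical risk of e_j, estimated expected risk of e_j)."""
    rng = random.Random(seed)
    # m confidence vectors, each truncated to d coordinates (an assumption).
    sample = [[rng.randint(0, 1) for _ in range(d)] for _ in range(m)]
    # Coordinates that are 0 in every sampled confidence vector.
    zero_coords = [j for j in range(d) if all(a[j] == 0 for a in sample)]
    if not zero_coords:
        return None  # can only happen because d is finite
    j = zero_coords[0]
    empirical = sum(f3_at_basis_vector(a, j) for a in sample) / m
    # Estimate the expected risk with fresh, independent draws of alpha[j].
    expected = sum(rng.randint(0, 1) for _ in range(n_fresh)) / n_fresh
    return j, empirical, expected


if __name__ == "__main__":
    print(simulate())
```

When a zero coordinate is found, the empirical risk of $e_j$ is exactly 0 while the estimate of its expected risk is close to 1/2, matching the values quoted above.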
Consequently, the empirical values $F_S^{(3)}(h)$ do not converge uniformly to their expectations, and empirical minimization is not guaranteed to solve the learning problem. Furthermore, one can construct a sharper counter-example, in which the unique empirical minimizer $\hat{h}_S$ is nowhere close to the optimal expected value. To accomplish this, $f^{(3)}(\cdot)$ is augmented with “a small term which ensures its empirical minimizer is unique, and not too close to the origin”. Consider the objective below, where $\varepsilon = 0.01$:
$$f^{(4)}(h; (x, \alpha)) = f^{(3)}(h; (x, \alpha)) + \varepsilon \sum_i 2^{-i} (h[i] - 1)^2$$
The objective is still convex and $(1 + \varepsilon)$-Lipschitz. In addition, since the added term is strictly convex, $f^{(4)}(h; z)$ is also strictly convex with respect to $h$, which is why the empirical minimizer is unique.
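A numerical sanity check of the strict convexity claim is sketched below. It is an illustration only: vectors are truncated to a finite number of coordinates, $f^{(3)}$ is assumed to be the Euclidean norm of $\alpha * (h - x)$, and coordinates are assumed to be indexed from 1 so that coordinate $i$ carries the weight $2^{-i}$. The helper names `f3`, `f4` and `midpoint_gap` are hypothetical.

```python
import math
import random

EPS = 0.01  # value of epsilon given in the text


def f3(h, x, alpha):
    # Assumed form: Euclidean norm of the element-wise product alpha * (h - x).
    return math.sqrt(sum((a * (hi - xi)) ** 2 for a, hi, xi in zip(alpha, h, x)))


def f4(h, x, alpha, eps=EPS):
    # Coordinates are indexed from 1, so list position i has weight 2**-(i + 1).
    penalty = sum(2.0 ** -(i + 1) * (hi - 1.0) ** 2 for i, hi in enumerate(h))
    return f3(h, x, alpha) + eps * penalty


def midpoint_gap(d=8, seed=0):
    """Average of f4 at two random points minus f4 at their midpoint."""
    rng = random.Random(seed)
    x = [0.0] * d
    alpha = [rng.randint(0, 1) for _ in range(d)]
    a = [rng.uniform(-1, 1) for _ in range(d)]
    b = [rng.uniform(-1, 1) for _ in range(d)]
    mid = [(u + v) / 2 for u, v in zip(a, b)]
    # Non-negative for a convex function; positive here because the added
    # quadratic term is strictly convex and a differs from b.
    return (f4(a, x, alpha) + f4(b, x, alpha)) / 2 - f4(mid, x, alpha)


if __name__ == "__main__":
    print([round(midpoint_gap(seed=s), 6) for s in range(5)])
```

A positive gap at every pair of distinct points is what strict convexity requires; a random check of this kind can only support the claim, not prove it.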
Consider the same distribution over $z$: $x = 0$, while the $\alpha[i]$ are independent and identically distributed, uniform over $\{0, 1\}$. The empirical minimizer is the minimizer of $F_S^{(4)}(h)$ subject to the constraint $\|h\| \leq 1$. Identifying the solution of this constrained optimization problem is complicated, but fortunately it is not necessary. It is sufficient to show that “the optimum of the unconstrained optimization problem $h^*_{UC} = \arg\min_h F_S^{(4)}(h)$ (with no constraining $h \in H$) has norm $\|h^*_{UC}\| \geq 1$”.
It should be noted that “in the unconstrained problem, whenever $\alpha_i[j] = 0$ for all $i = 1, \dots, m$, only the second term of $f^{(4)}$ depends on $h[j]$ and we have $h^*_{UC}[j] = 1$”. Since this happens almost surely for some coordinate $j$, it can be concluded that “the solution to the constrained optimization problem lies on the boundary of $H$, that is $\|\hat{h}_S\| = 1$”. For such a solution, the equation shown in the picture below holds, while $F^* \leq F^{(4)}(0) = \varepsilon$.
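Two of the steps above can be written out explicitly. The following is a sketch that assumes $f^{(3)}(h; (x, \alpha)) = \|\alpha * (h - x)\|$ and that the sum in $f^{(4)}$ runs over $i = 1, 2, \dots$; neither detail is restated in this passage.

```latex
% Value of the expected objective at the origin (x = 0):
F^{(4)}(0)
  = \mathbb{E}\big[\|\alpha * 0\|\big]
    + \varepsilon \sum_{i \geq 1} 2^{-i} (0 - 1)^2
  = 0 + \varepsilon \cdot 1
  = \varepsilon .

% Unconstrained minimizer on a coordinate j with \alpha_i[j] = 0 for all i:
% the only part of F_S^{(4)} that depends on h[j] is
g(h[j]) = \varepsilon \, 2^{-j} (h[j] - 1)^2 ,
\qquad
g'(h[j]) = 2 \varepsilon \, 2^{-j} (h[j] - 1) = 0
\;\Longrightarrow\;
h^*_{UC}[j] = 1 ,
\qquad
\|h^*_{UC}\| \geq |h^*_{UC}[j]| = 1 .
```

The first line uses $\sum_{i \geq 1} 2^{-i} = 1$; the second shows why a single all-zero coordinate is enough to push the unconstrained optimum to norm at least 1.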