TABLE 6.10  Advertising Expenditure and Total Expenditure (in £ millions) for 29 Product Categories in the U.K.
Source: http://www.economicswebinstitute.org/ecdata.htm.

obs    ADEXP       CONEXP      RATIO
 1     87957.00    13599.00    0.006468
 2     23578.00     4699.000   0.005018
 3     16345.00     5473.000   0.002986
 4      6550.000    6119.000   0.001070
 5     10230.00     8811.000   0.001161
 6      9127.000    1142.000   0.007992
 7      1675.000     143.0000  0.011713
 8      1110.000     138.0000  0.008043
 9      3351.000      85.00000 0.039424
10      1140.000     108.0000  0.010556
11      6376.000     307.0000  0.020769
12      4500.000    1545.000   0.002913
13      1899.000     943.0000  0.002014
14     10101.00      369.0000  0.027374
15      3831.000     285.0000  0.013442
16     99528.00     1052.000   0.094608
17     15855.00      862.0000  0.018393
18      8827.000      84.00000 0.105083
19     54517.00     1174.000   0.046437
20     49593.00     2531.000   0.019594
21     39664.00      408.0000  0.097216
22       327.0000    295.0000  0.001108
23     22549.00      488.0000  0.046207
24    416422.0     19200.00    0.021689
25     14212.00       94.00000 0.151191
26     54174.00     5320.000   0.010183
27     20218.00      357.0000  0.056633
28     11041.00      159.0000  0.069440
29     22542.00      244.0000  0.092385

Note: ADEXP  = Advertising expenditure (£, millions)
      CONEXP = Total consumer expenditure (£, millions)
guj75772_ch06.qxd 07/08/2008 07:00 PM Page 181
182
Part One
Single-Equation Regression Models
Appendix 6A

6A.1  Derivation of Least-Squares Estimators for Regression through the Origin
We want to minimize

$\sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_2 X_i)^2$   (1)

with respect to $\hat{\beta}_2$.
Differentiating (1) with respect to $\hat{\beta}_2$, we obtain

$\dfrac{d \sum \hat{u}_i^2}{d \hat{\beta}_2} = 2 \sum (Y_i - \hat{\beta}_2 X_i)(-X_i)$   (2)

Setting Eq. (2) equal to zero and simplifying, we get

$\hat{\beta}_2 = \dfrac{\sum X_i Y_i}{\sum X_i^2}$   (6.1.6) = (3)
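Eq. (3) is easy to verify numerically. A minimal sketch in Python with NumPy (the data below are made up for illustration, not taken from Table 6.10) computes the through-the-origin slope directly and checks it against a general least-squares solver:

```python
import numpy as np

# Hypothetical illustrative data (not from Table 6.10)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Eq. (3): slope of the regression through the origin
beta2_hat = np.sum(X * Y) / np.sum(X**2)

# Cross-check: least squares with a single regressor and no intercept column
beta2_lstsq = np.linalg.lstsq(X.reshape(-1, 1), Y, rcond=None)[0][0]

print(beta2_hat)  # ≈ 1.998, and it matches the solver's answer
```
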
Now substituting the PRF, $Y_i = \beta_2 X_i + u_i$, into this equation, we obtain

$\hat{\beta}_2 = \dfrac{\sum X_i (\beta_2 X_i + u_i)}{\sum X_i^2} = \beta_2 + \dfrac{\sum X_i u_i}{\sum X_i^2}$   (4)
[Note: $E(\hat{\beta}_2) = \beta_2$.] Therefore,

$E(\hat{\beta}_2 - \beta_2)^2 = E\left(\dfrac{\sum X_i u_i}{\sum X_i^2}\right)^2$   (5)
Expanding the right-hand side of Eq. (5) and noting that the $X_i$ are nonstochastic and the $u_i$ are homoscedastic and uncorrelated, we obtain

$\operatorname{var}(\hat{\beta}_2) = E(\hat{\beta}_2 - \beta_2)^2 = \dfrac{\sigma^2}{\sum X_i^2}$   (6.1.7) = (6)
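Eq. (6) can also be checked by simulation. A minimal Monte Carlo sketch (Python/NumPy; the true slope, error standard deviation, and $X$ values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
beta2, sigma = 2.0, 1.5        # assumed true slope and error s.d.
reps = 20_000

# Repeatedly draw homoscedastic, uncorrelated errors, form
# Y = beta2*X + u, and re-estimate beta2 via Eq. (3).
draws = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, sigma, size=X.size)
    Y = beta2 * X + u
    draws[r] = np.sum(X * Y) / np.sum(X**2)

theoretical = sigma**2 / np.sum(X**2)   # Eq. (6): sigma^2 / sum(X_i^2)
print(draws.var(), theoretical)         # the two should be close
```
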
Incidentally, note that from Eq. (2), after equating it to zero, we get

$\sum \hat{u}_i X_i = 0$   (7)
From Appendix 3A, Section 3A.1, we see that when the intercept term is present in the model, we get, in addition to Eq. (7), the condition $\sum \hat{u}_i = 0$. From the mathematics just given it should be clear why the regression-through-the-origin model may not have the error sum, $\sum \hat{u}_i$, equal to zero.
Suppose we want to impose the condition that $\sum \hat{u}_i = 0$. In that case we have

$\sum Y_i = \hat{\beta}_2 \sum X_i + \sum \hat{u}_i = \hat{\beta}_2 \sum X_i$, since $\sum \hat{u}_i = 0$ by construction   (8)

This expression then gives

$\hat{\beta}_2 = \dfrac{\sum Y_i}{\sum X_i} = \dfrac{\bar{Y}}{\bar{X}} = \dfrac{\text{mean value of } Y}{\text{mean value of } X}$   (9)
But this estimator is not the same as Eq. (3) above or Eq. (6.1.6). And since the $\hat{\beta}_2$ of Eq. (3) is unbiased (why?), the $\hat{\beta}_2$ of Eq. (9) cannot be unbiased.
The upshot is that, in regression through the origin, we cannot have both $\sum \hat{u}_i X_i$ and $\sum \hat{u}_i$ equal to zero, as in the conventional model. The only condition that is satisfied is that $\sum \hat{u}_i X_i$ is zero.
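This asymmetry is easy to see numerically. In the sketch below (Python/NumPy, made-up data), the residuals of a through-the-origin fit satisfy Eq. (7) but do not sum to zero:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([3.0, 3.5, 7.0, 7.5])   # hypothetical data

beta2_hat = np.sum(X * Y) / np.sum(X**2)   # Eq. (3)
resid = Y - beta2_hat * X

print(np.sum(resid * X))   # ≈ 0: Eq. (7) always holds
print(np.sum(resid))       # ≈ 0.667 here: generally nonzero
```
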
Chapter 6
Extensions of the Two-Variable Linear Regression Model
183
Recall that

$Y_i = \hat{Y}_i + \hat{u}_i$   (2.6.3)

Summing this equation on both sides and dividing by $N$, the sample size, we obtain

$\bar{Y} = \bar{\hat{Y}} + \bar{\hat{u}}$   (10)

Since for the zero-intercept model $\sum \hat{u}_i$, and therefore $\bar{\hat{u}}$, need not be zero, it then follows that

$\bar{Y} \neq \bar{\hat{Y}}$   (11)

that is, the mean of the actual $Y$ values need not equal the mean of the estimated $Y$ values; the two mean values are identical for the intercept-present model, as can be seen from Eq. (3.1.10).
It was noted that, for the zero-intercept model, $r^2$ can be negative, whereas for the conventional model it can never be negative. This can be shown as follows.
Using Eq. (3.5.5a), we can write

$r^2 = 1 - \dfrac{\text{RSS}}{\text{TSS}} = 1 - \dfrac{\sum \hat{u}_i^2}{\sum y_i^2}$   (12)
Now for the conventional, or intercept-present, model, Eq. (3.3.6) shows that

$\text{RSS} = \sum \hat{u}_i^2 = \sum y_i^2 - \hat{\beta}_2^2 \sum x_i^2 \le \sum y_i^2$   (13)

unless $\hat{\beta}_2$ is zero (i.e., $X$ has no influence on $Y$ whatsoever). That is, for the conventional model, RSS ≤ TSS, so $r^2$ can never be negative.
For the zero-intercept model it can be shown analogously that

$\text{RSS} = \sum \hat{u}_i^2 = \sum Y_i^2 - \hat{\beta}_2^2 \sum X_i^2$   (14)

(Note: The sums of squares of $Y$ and $X$ are not mean-adjusted.) Now there is no guarantee that this RSS will always be less than $\sum y_i^2 = \sum Y_i^2 - N\bar{Y}^2$ (the TSS), which suggests that RSS can be greater than TSS, implying that $r^2$, as conventionally defined, can be negative. Incidentally, notice that in this case RSS will be greater than TSS if $\hat{\beta}_2^2 \sum X_i^2 < N\bar{Y}^2$.
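A concrete (contrived) example of a negative $r^2$, sketched in Python/NumPy: the data are chosen so that $\hat{\beta}_2^2 \sum X_i^2 < N\bar{Y}^2$, making RSS exceed the mean-adjusted TSS.

```python
import numpy as np

# Contrived data: Y hovers near 10.5 while X straddles zero,
# so the through-the-origin fit is very poor.
X = np.array([-2.0, -1.0, 1.0, 2.0])
Y = np.array([10.0, 11.0, 10.0, 11.0])

beta2_hat = np.sum(X * Y) / np.sum(X**2)   # Eq. (3); here 0.1
resid = Y - beta2_hat * X

rss = np.sum(resid**2)              # residuals are not mean-adjusted
tss = np.sum((Y - Y.mean())**2)     # conventional, mean-adjusted TSS
r2 = 1.0 - rss / tss                # Eq. (12)
print(r2)                           # strongly negative for these data
```
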
6A.2  Proof that a Standardized Variable Has Zero Mean and Unit Variance
Consider the random variable (r.v.) $Y$ with (sample) mean $\bar{Y}$ and (sample) standard deviation $S_y$. Define

$Y_i^* = \dfrac{Y_i - \bar{Y}}{S_y}$   (15)

Hence $Y_i^*$ is a standardized variable. Notice that standardization involves a dual operation: (1) a change of origin, which is the numerator of Eq. (15), and (2) a change of scale, which is the denominator.
Now

$\bar{Y}^* = \dfrac{1}{S_y} \dfrac{\sum (Y_i - \bar{Y})}{n} = 0$   (16)

since the sum of deviations of a variable from its mean value is always zero. Hence the mean value of the standardized variable is zero. (Note: We could pull the $S_y$ term out of the summation sign because its value is known.)
Now

$S_{y^*}^2 = \dfrac{\sum (Y_i - \bar{Y})^2 / (n-1)}{S_y^2} = \dfrac{1}{(n-1) S_y^2} \sum (Y_i - \bar{Y})^2 = \dfrac{(n-1) S_y^2}{(n-1) S_y^2} = 1$   (17)
Note that

$S_y^2 = \dfrac{\sum (Y_i - \bar{Y})^2}{n-1}$

which is the sample variance of $Y$.
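The two results in Eqs. (16) and (17) can be confirmed on any sample. A minimal sketch (Python/NumPy; `ddof=1` gives the (n − 1)-divisor sample moments used above):

```python
import numpy as np

Y = np.array([4.0, 8.0, 15.0, 16.0, 23.0, 42.0])   # arbitrary sample

# Eq. (15): subtract the sample mean, divide by the sample s.d.
Y_star = (Y - Y.mean()) / Y.std(ddof=1)

print(Y_star.mean())        # ≈ 0, Eq. (16)
print(Y_star.var(ddof=1))   # ≈ 1, Eq. (17)
```
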
6A.3 Logarithms
Consider the numbers 5 and 25. We know that

$25 = 5^2$   (18)

We say that the exponent 2 is the logarithm of 25 to the base 5. More formally, the logarithm of a number (e.g., 25) to a given base (e.g., 5) is the power (2) to which the base (5) must be raised to obtain the given number (25).
More generally, if

$Y = b^X \quad (b > 0)$   (19)

then

$\log_b Y = X$   (20)

In mathematics the function (19) is called an exponential function and the function (20) is called the logarithmic function. As is clear from Eqs. (19) and (20), one function is the inverse of the other.
Although any (positive) base can be used, in practice the two commonly used bases are 10 and the mathematical number $e = 2.71828\ldots$

Logarithms to base 10 are called common logarithms. Thus,

$\log_{10} 100 = 2 \qquad \log_{10} 30 \approx 1.48$

That is, in the first case $100 = 10^2$, and in the latter case $30 \approx 10^{1.48}$.
Logarithms to the base $e$ are called natural logarithms. Thus,

$\log_e 100 \approx 4.6051 \qquad \log_e 30 \approx 3.4012$

All these calculations can be done routinely on a hand calculator.
By convention, the logarithm to base 10 is denoted by log and the logarithm to base $e$ by ln. Thus, in the preceding example, we can write log 100 or log 30 or ln 100 or ln 30.
There is a fixed relationship between the common log and natural log:

$\ln X \cong 2.3026 \log X$   (21)

That is, the natural log of the number $X$ is equal to 2.3026 times the log of $X$ to the base 10. Thus,

$\ln 30 \cong 2.3026 \log 30 \cong 2.3026 (1.48) \cong 3.4012$ (approx.)

as before. Therefore, it does not matter whether one uses common or natural logs. But in mathematics the base that is usually preferred is $e$, that is, the natural logarithm. Hence, in this book all logs are natural logs, unless stated explicitly otherwise. Of course, we can convert the log of a number from one base to the other using Eq. (21).
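Eq. (21) is straightforward to verify with the standard library (the factor 2.3026 is simply ln 10 rounded to four decimals):

```python
import math

x = 30.0
natural = math.log(x)               # ln 30
converted = 2.3026 * math.log10(x)  # Eq. (21): 2.3026 * log10(30)
print(natural, converted)           # both ≈ 3.4012
```
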
Keep in mind that logarithms of negative numbers are not defined. Thus, log(−5) or ln(−5) is not defined.
Some properties of logarithms are as follows. If $A$ and $B$ are any positive numbers, then it can be shown that:

1. $\ln(A \times B) = \ln A + \ln B$   (22)

That is, the log of the product of two (positive) numbers $A$ and $B$ is equal to the sum of their logs.
2. $\ln(A / B) = \ln A - \ln B$   (23)

That is, the log of the ratio of $A$ to $B$ is the difference in the logs of $A$ and $B$.
3. $\ln(A \pm B) \neq \ln A \pm \ln B$   (24)

That is, the log of the sum or difference of $A$ and $B$ is not equal to the sum or difference of their logs.
4. $\ln(A^k) = k \ln A$   (25)

That is, the log of $A$ raised to the power $k$ is $k$ times the log of $A$.
5. $\ln e = 1$
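All five properties can be spot-checked with the standard library ($A$, $B$, and $k$ below are arbitrary positive values chosen for illustration):

```python
import math

A, B, k = 12.0, 3.0, 4.0   # arbitrary positive numbers

assert math.isclose(math.log(A * B), math.log(A) + math.log(B))  # property 1
assert math.isclose(math.log(A / B), math.log(A) - math.log(B))  # property 2
# Property 3: no such identity holds for a sum or difference
assert not math.isclose(math.log(A + B), math.log(A) + math.log(B))
assert math.isclose(math.log(A**k), k * math.log(A))             # property 4
assert math.isclose(math.log(math.e), 1.0)                       # property 5
print("all log properties verified")
```
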