proc print data=raw;
run;
Breaking Strength Data from Table 4-8 pg. 150 1
OBS    STRENGTH    DIAMETER    MACHINE
  1          36          20          1
  2          41          25          1
  3          39          24          1
  4          42          25          1
  5          49          32          1
  6          40          22          2
  7          48          28          2
  8          39          22          2
  9          45          30          2
 10          44          28          2
 11          35          21          3
 12          37          23          3
 13          42          26          3
 14          34          21          3
 15          32          15          3
note: This is just a listing of the data from Table 4-8 on page
150 of the textbook. This is how you would enter it into
your .dat file.
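The data step that creates the data set "raw" is not shown in this handout. A minimal sketch of one way to read the file is given below. The file name strength.dat is hypothetical, and the sketch assumes the file holds three space-separated columns (strength, diameter, machine) without the OBS column, which PROC PRINT adds on its own.

```sas
* Hypothetical file name: replace strength.dat with your own file;
* Assumes three space-separated columns per line: strength diameter machine;
data raw;
   infile 'strength.dat';
   input strength diameter machine;
run;

* Title that appears at the top of each page of output;
title 'Breaking Strength Data from Table 4-8 pg. 150';
```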
-------------------------------------------------------------------------------
proc plot data=raw;
plot strength*diameter=machine;
run;
Plot of STRENGTH*DIAMETER.  Symbol is value of MACHINE.
STRENGTH  |
          |
       49 +                                                    1
          |
       48 +                                        2
          |
       47 +
          |
       46 +
          |
       45 +                                              2
          |
       44 +                                        2
          |
       43 +
          |
       42 +                               1  3
          |
       41 +                               1
          |
       40 +                      2
          |
       39 +                      2     1
          |
       38 +
          |
       37 +                         3
          |
       36 +                1
          |
       35 +                   3
          |
       34 +                   3
          |
       33 +
          |
       32 + 3
          |
          --+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--
           15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
                                 DIAMETER
note: The above plot is called an M-plot (or multiple plot). It
plots strength versus diameter for each of the three machines
(1, 2, and 3). This provides a little more information
than, say, Figure 4-2 on page 150. Clearly, breaking strength
increases with diameter for all three machines. Thus, analysis of
covariance is appropriate here.
note: Recall that one of the assumptions about the "slope" parameter was
that it was the same for all treatments. This plot can be used to
check this assumption informally by comparing the slopes within each
of the three groups. Roughly parallel trends support a common slope.
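The visual check of equal slopes can be backed up with a formal test. The sketch below is a common approach rather than something shown in this handout: it adds a MACHINE*DIAMETER interaction to the model, so each machine gets its own slope. A small F value for the interaction term would support the common-slope assumption.

```sas
* Sketch of an equal-slopes check (not part of the original analysis);
* The machine*diameter term lets the slope differ by machine;
proc glm data=raw;
   class machine;
   model strength = machine diameter machine*diameter;
run;
```

If the interaction is not significant, the model with a single DIAMETER slope used below is reasonable.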
-------------------------------------------------------------------------------
proc glm data=raw;
class machine;
model strength = machine diameter / p;
lsmeans machine / stderr tdiff pdiff;
output out=temp p=fit r=resid student=stdresid;
run;
General Linear Models Procedure
Class Level Information
Class Levels Values
MACHINE 3 1 2 3
Number of observations in data set = 15
General Linear Models Procedure
Dependent Variable: STRENGTH
Source              DF    Sum of Squares    F Value    Pr > F
Model                3      318.41411043      41.72    0.0001
Error               11       27.98588957
Corrected Total     14      346.40000000
R-Square        C.V.    STRENGTH Mean
0.919209    3.967776       40.2000000
note: In the above output, the "full" model contains both MACHINE
and DIAMETER. The "reduced" model contains only the overall
MEAN. Thus, the F value tests H0: MACHINE effect=0 and
DIAMETER effect=0 simultaneously.
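As a check, the overall F value can be reproduced by hand from the sums of squares and degrees of freedom in the table. The first line writes out the analysis of covariance model; the symbols used there (y, tau, beta, x) are notation assumed for this sketch and may differ from the textbook's.

```latex
\begin{align*}
y_{ij} &= \mu + \tau_i + \beta\,(x_{ij} - \bar{x}_{..}) + \varepsilon_{ij} \\
MS_{\text{Model}} &= \frac{318.41411043}{3} = 106.138 \\
MS_{\text{Error}} &= \frac{27.98588957}{11} = 2.544 \\
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}} = \frac{106.138}{2.544} = 41.72
\end{align*}
```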
Source              DF         Type I SS    F Value    Pr > F
MACHINE              2      140.40000000      27.59    0.0001
DIAMETER             1      178.01411043      69.97    0.0001
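note: Type I SS represent a "sequential" sum of squares. That is,
SS(MODEL) = SS(MACHINE) + SS(DIAMETER|MACHINE)
= 140.4 + 178.014
= 318.414
Although test statistics are given, the Type I F value for MACHINE
is not adjusted for DIAMETER, so it is not the correct F value to
use to test H0: MACHINE effect=0. Use the Type III tests for
H0: MACHINE effect=0 and H0: DIAMETER effect=0. (Because DIAMETER
is entered last, its Type I and Type III results are the same.)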
Source              DF       Type III SS    F Value    Pr > F
MACHINE              2       13.28385062       2.61    0.1181
DIAMETER             1      178.01411043      69.97    0.0001
note: In the above output the "full" model contains both MACHINE and
DIAMETER. Each Type III SS is the reduction in the model sum of squares
when that term is dropped from the full model. For instance, the MACHINE
Type III SS compares the full model with a reduced model containing only
MEAN and DIAMETER, while the DIAMETER Type III SS compares it with a
reduced model containing MEAN and MACHINE. These are the correct F values
to use to test H0: MACHINE effect=0 and H0: DIAMETER effect=0, respectively.
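The Type III F values can be verified by hand. Each one divides the Type III mean square by the error mean square of the full model (error SS 27.98588957 on 11 degrees of freedom, taken from the overall table for this model).

```latex
\begin{align*}
MS_{\text{Error}} &= \frac{27.98588957}{11} = 2.544 \\
F_{\text{MACHINE}} &= \frac{13.28385062 / 2}{2.544} = \frac{6.642}{2.544} = 2.61 \\
F_{\text{DIAMETER}} &= \frac{178.01411043 / 1}{2.544} = 69.97
\end{align*}
```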
Observation        Observed       Predicted        Residual
                      Value           Value
          1     36.00000000     36.43926380     -0.43926380
          2     41.00000000     41.20920245     -0.20920245
          3     39.00000000     40.25521472     -1.25521472
          4     42.00000000     41.20920245      0.79079755
          5     49.00000000     47.88711656      1.11288344
          6     40.00000000     39.38404908      0.61595092
          7     48.00000000     45.10797546      2.89202454
          8     39.00000000     39.38404908     -0.38404908
          9     45.00000000     47.01595092     -2.01595092
         10     44.00000000     45.10797546     -1.10797546
         11     35.00000000     35.80920245     -0.80920245
         12     37.00000000     37.71717791     -0.71717791
         13     42.00000000     40.57914110      1.42085890
         14     34.00000000     35.80920245     -1.80920245
         15     32.00000000     30.08527607      1.91472393
Sum of Residuals                           -0.00000000
Sum of Squared Residuals                   27.98588957
Sum of Squared Residuals - Error SS        -0.00000000
First Order Autocorrelation                -0.03469267
Durbin-Watson D                             1.93149012
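note: This is just a listing of the observed values, predicted values,
and residuals, produced by the P option on the MODEL statement. The
OUTPUT statement saves a copy of the data set "raw", with the predicted
values (fit), residuals (resid), and studentized residuals (stdresid)
appended, as the data set "temp".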
General Linear Models Procedure
Least Squares Means
MACHINE      STRENGTH       Std Err       Pr > |T|     LSMEAN
               LSMEAN        LSMEAN    H0:LSMEAN=0     Number
1          40.3824131     0.7236252         0.0001          1
2          41.4192229     0.7444169         0.0001          2
3          38.7983640     0.7878785         0.0001          3
note: The above information can be used to do tests of hypotheses or
confidence intervals for a single population mean based on the
adjusted sample means.
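The adjusted means (LSMEANS) can be reproduced by hand, which shows what "adjusted" means here: each machine's raw mean strength is corrected for how far that machine's mean diameter lies from the overall mean diameter. The slope estimate is not printed in this output, so the sketch below recovers it from the fitted values for observations 1 and 2 (both from machine 1). The machine means are computed from the data listing.

```latex
\begin{align*}
\hat{\beta} &= \frac{41.20920245 - 36.43926380}{25 - 20} = 0.9540 \\
\bar{y}_{i.}^{\,\text{adj}} &= \bar{y}_{i.} - \hat{\beta}\,(\bar{x}_{i.} - \bar{x}_{..}),
  \qquad \bar{x}_{..} = 24.1333 \\
\text{Machine 1:}\quad & 41.4 - 0.9540\,(25.2 - 24.1333) = 40.38 \\
\text{Machine 2:}\quad & 43.2 - 0.9540\,(26.0 - 24.1333) = 41.42 \\
\text{Machine 3:}\quad & 36.0 - 0.9540\,(21.2 - 24.1333) = 38.80
\end{align*}
```

These agree with the STRENGTH LSMEAN column above.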
T for H0: LSMEAN(i)=LSMEAN(j) / Pr > |T|
i/j           1           2           3
  1           .    -1.02359    1.430745
                     0.3280      0.1803
  2    1.023592           .    2.283458
         0.3280                  0.0433
  3    -1.43074    -2.28346           .
         0.1803      0.0433
NOTE: To ensure overall protection level, only probabilities
associated with pre-planned comparisons should be used.
note: The above information essentially yields the Stage II results.
It gives the T statistics discussed in class and the
corresponding p-values. You can also use this information to
get confidence intervals for differences in means by solving
(by hand) for the appropriate standard errors.
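As an illustration of solving for a standard error by hand, take machines 1 and 2. The T statistic is the difference in adjusted means divided by its standard error, so the standard error is the difference divided by T. The interval below is a sketch of an unadjusted 95% interval using the t critical value for 11 error degrees of freedom (2.201), with no multiple-comparison correction.

```latex
\begin{align*}
\hat{d}_{12} &= 40.3824131 - 41.4192229 = -1.0368 \\
SE(\hat{d}_{12}) &= \frac{\hat{d}_{12}}{T} = \frac{-1.0368}{-1.02359} = 1.0129 \\
\hat{d}_{12} \pm t_{0.025,\,11}\, SE(\hat{d}_{12}) &= -1.0368 \pm 2.201\,(1.0129)
  = (-3.27,\ 1.19)
\end{align*}
```

The interval contains zero, which agrees with the p-value of 0.3280 for this comparison.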
note: Assuming the Stage I test had been rejected, the final results could
be summarized as follows (means joined by a common line are not
significantly different).
MACHINE3   MACHINE1   MACHINE2
-------------------
           -------------------
We would not actually conclude this here since the Stage I test
above (Type III F for MACHINE = 2.61, p = 0.1181) was not rejected.
Thus, the correct conclusion is that there is no significant
difference between the treatments.
-------------------------------------------------------------------------------
proc glm data=raw;
class machine;
model diameter = machine;
run;
General Linear Models Procedure
Class Level Information
Class Levels Values
MACHINE 3 1 2 3
Number of observations in data set = 15
General Linear Models Procedure
Dependent Variable: DIAMETER
Source              DF    Sum of Squares    F Value    Pr > F
Model                2       66.13333333       2.03    0.1742
Error               12      195.60000000
Corrected Total     14      261.73333333
R-Square        C.V.    DIAMETER Mean
0.252674    16.72925       24.1333333
Source              DF         Type I SS    F Value    Pr > F
MACHINE              2       66.13333333       2.03    0.1742
Source              DF       Type III SS    F Value    Pr > F
MACHINE              2       66.13333333       2.03    0.1742
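note: Recall that analysis of covariance also assumes that the
covariate is not affected by the treatment variable. To check
this assumption, consider running an ANOVA with the covariate
variable (DIAMETER) as the response. Since the F value is 2.03
(p = 0.1742), we fail to reject H0 and conclude that there is no
evidence that the covariate depends on the treatment. This is also
portrayed in the M-plot above, where the diameters for the three
machines overlap. Note that this is a separate check from the
equal-slopes assumption, which concerns the relationship between
STRENGTH and DIAMETER within each machine.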
-------------------------------------------------------------------------------
proc rank data=temp out=checkass normal=vw;
var resid;
ranks expected;
run;
proc plot data=checkass vpct=50 hpct=50;
plot stdresid*fit = '*' / vref=-2 2 vrefchar='-' box;
plot stdresid*diameter = '*' / vref=-2 2 vrefchar='-' box;
plot stdresid*machine = '*' / vref=-2 2 vrefchar='-' box;
plot expected*resid = '*' / box;
run;
[Four boxed plots, two per row; point positions are not reproduced here.]
Plot of STDRESID*FIT='*'.       STDRESID (-2 to 2, reference lines at -2 and 2) versus FIT (30 to 50)
Plot of STDRESID*DIAMETER='*'.  STDRESID (-2 to 2, reference lines at -2 and 2) versus DIAMETER (10 to 40)
Plot of STDRESID*MACHINE='*'.   STDRESID (-2 to 2, reference lines at -2 and 2) versus MACHINE (1, 2, 3)
Plot of EXPECTED*RESID='*'.     EXPECTED (-2 to 2) versus RESID (-2.5 to 5.0)
NOTE: 2 obs hidden.
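note: These plots suggest that the assumptions of the model are
adequately satisfied. In the first three plots one looks for
standardized residuals scattered without pattern between the reference
lines. In the last plot (normal scores versus residuals) one looks
for a roughly straight line.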
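The normal scores plot can be supplemented with a formal normality test on the residuals. The sketch below is not part of the original analysis. It assumes the data set "temp" created by the OUTPUT statement above and uses the NORMAL option of PROC UNIVARIATE, which prints tests of normality.

```sas
* Sketch: formal normality check on the ANCOVA residuals;
* Assumes temp was created by the earlier OUTPUT statement;
proc univariate data=temp normal;
   var resid;
run;
```

With only 15 observations such a test has little power, so the plots remain the main diagnostic.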
-------------------------------------------------------------------------------
proc glm data=raw;
class machine;
model strength = machine;
run;
General Linear Models Procedure
Class Level Information
Class Levels Values
MACHINE 3 1 2 3
Number of observations in data set = 15
General Linear Models Procedure
Dependent Variable: STRENGTH
Source              DF    Sum of Squares    F Value    Pr > F
Model                2      140.40000000       4.09    0.0442
Error               12      206.00000000
Corrected Total     14      346.40000000
R-Square        C.V.    STRENGTH Mean
0.405312    10.30664       40.2000000
Source              DF         Type I SS    F Value    Pr > F
MACHINE              2      140.40000000       4.09    0.0442
Source              DF       Type III SS    F Value    Pr > F
MACHINE              2      140.40000000       4.09    0.0442
note: This is the output for a regular one-way ANOVA, that is,
an analysis without the covariate term. Note that the conclusion
here is that there is a treatment effect (F = 4.09, p = 0.0442),
which is the opposite of the conclusion reached by the analysis of
covariance. Hence, failing to include the covariate would result
in the wrong conclusion.
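The arithmetic behind the two conclusions can be laid side by side. In the one-way ANOVA the MACHINE sum of squares (140.4) is not adjusted for DIAMETER. In the analysis of covariance the adjusted MACHINE sum of squares is only 13.28, with an error mean square of 27.98588957/11 = 2.544.

```latex
\begin{align*}
\text{ANOVA:}\quad F &= \frac{140.4 / 2}{206.0 / 12} = \frac{70.2}{17.167} = 4.09 \\
\text{ANCOVA:}\quad F &= \frac{13.28385062 / 2}{27.98588957 / 11} = \frac{6.642}{2.544} = 2.61
\end{align*}
```

Most of the apparent difference between machines in the one-way ANOVA is accounted for by differences in diameter.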