Diagnostic in Design of Experiments

2.1 Diagnostics for Residuals

We shall consider the use of residuals for examining some important types of departures from the simple linear regression model with normal errors:

(1) The regression function is not linear.
(2) The error terms do not have constant variance.
(3) The error terms are not independent.
(4) The model fits all but one or a few outlier observations.
(5) The error terms are not normally distributed.

The following plots of residuals are generally utilized for this purpose:

(1) Plot of residuals against predictor variable.
(2) Plot of absolute or squared residuals against predictor variable.
(3) Plot of residuals against fitted values.
(4) Plot of residuals against time or other sequence.
(5) Plot of residuals against omitted predictor variables.
(6) Normal probability plot of residuals.

2.1.1 Non-linearity of Regression Function

Whether a linear function is appropriate for the data being analyzed can be studied from a residual plot against the predictor variable or, equivalently, from a residual plot against the fitted values. If the function is linear, the residuals fall within a horizontal band centered around 0, displaying no systematic tendency to be positive or negative.

2.1.2 Non-constancy of Error Variance

Plots of the residuals against the predictor variable or against the fitted values are helpful not only for studying whether a linear regression function is appropriate but also for examining whether the variance of the error terms is constant.

2.1.3 Non-normality of Error Terms

(i) Comparison of frequencies. When the number of cases is reasonably large, one possibility is to compare the actual frequencies of the residuals against the expected frequencies under normality. For example, one can determine whether about 68 percent of the residuals $e_i$ fall between $\pm\sqrt{MSE}$, or about 90 percent fall between $\pm 1.645\sqrt{MSE}$.

(ii) Normal probability plot. Still another possibility is to prepare a normal probability plot of the residuals. Here each residual is plotted against its expected value under normality. A plot that is nearly linear suggests agreement with normality, whereas a plot that departs substantially from linearity suggests that the error distribution is not normal. The expected value of the k-th smallest observation in a random sample of n is approximately

$$\sqrt{MSE}\left[z\left(\frac{k - 0.375}{n + 0.25}\right)\right],$$

where $z(A)$ denotes the $A \times 100$ percentile of the standard normal distribution.
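Both checks in 2.1.3 can be sketched in a few lines of Python. This is only an illustration: the function names, the residuals and the MSE value below are hypothetical, and the standard normal percentile z(A) is taken from the standard library's `NormalDist`.

```python
from statistics import NormalDist


def expected_normal_scores(n, mse):
    """Approximate expected values of the n ordered residuals under normality."""
    z = NormalDist().inv_cdf          # standard normal percentile function
    root_mse = mse ** 0.5
    return [root_mse * z((k - 0.375) / (n + 0.25)) for k in range(1, n + 1)]


def share_within(residuals, mse, multiple=1.0):
    """Fraction of residuals lying within +/- multiple * sqrt(MSE)."""
    limit = multiple * mse ** 0.5
    return sum(abs(e) <= limit for e in residuals) / len(residuals)


# Hypothetical residuals and error mean square, for illustration only
residuals = [-2.1, -0.4, 0.3, 0.9, 1.3]
mse = 2.0

scores = expected_normal_scores(len(residuals), mse)
# Points for the normal probability plot: (expected value, ordered residual)
plot_points = list(zip(scores, sorted(residuals)))
print(plot_points)
print(share_within(residuals, mse))          # compare with about 0.68
print(share_within(residuals, mse, 1.645))   # compare with about 0.90
```

With so few residuals the frequency comparison is of little practical value; the text recommends it only when the number of cases is reasonably large.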


2.1.4 Presence of Outliers

Frequently in regression analysis applications, the data set contains some cases that are outlying or extreme; that is, the observations for these cases are well separated from the remainder of the data. These outlying cases may involve large residuals and often have dramatic effects on the fitted least squares regression function.

Tests for outliers. A simple test for identifying an outlier involves fitting a new regression line to the other n-1 observations. The suspect observation, which was not used in fitting the new line, can now be regarded as a new observation. One can calculate the probability that in n observations, a deviation from the fitted line as great as that of the outlier will be obtained by chance. If this probability is sufficiently small, the outlier can be rejected as not having come from the same population as the other n-1 observations. Otherwise, the outlier is retained. There are a number of other test statistics available for detecting outliers in regression.

3. Diagnostics in Design of Experiments

In Section 2 we dealt with some diagnostics in regression analysis. These techniques can also be applied to design of experiments. In this section, however, we discuss the problem of outliers in detail, since the test statistics for dealing with outliers in regression cannot be applied directly to design of experiments.

3.1 Detection of outliers

The fact that an observation is an outlier, that is, it gives a large residual when the chosen model is fitted to the data, does not necessarily mean that the observation is an influential one with respect to the fitted equation. When an outlier is omitted from the analysis, the fitted equation may change hardly at all. An example given by Andrews and Pregibon (1978), using the data from Mickey, Dunn, and Clark (1967), illustrates the point well.

Agricultural field experiments are always laid out using standard experimental designs. There are a number of standard designs available in the literature, a few of which are generally used under field conditions. The interest of scientists nowadays is in the development of optimal designs, but one of the most important aspects of experiments, the presence of outliers, is being ignored. The consequences of the presence of outliers are well known: even a single outlier may alter the inference to be drawn from the experiment. Agricultural field data definitely contain outliers for a variety of reasons, yet no attention is paid to this important problem.

Cook (1977) introduced a statistic to indicate the influence of an observation with respect to a particular model. This statistic is used extensively in linear regression diagnostics and is also very useful for assessing the degree of influence on a subset of parameters. Although the general set-up of an experimental design is that of a linear model, the Cook-statistic cannot be applied as such for testing outliers in this field. In experimental designs, particularly in varietal designs, the design matrix X does not have full column rank; thus unique estimation of the parameters is not possible.


Moreover, in experimental designs the interest of the experimenter is only in a subset of parameters rather than the whole set of parameters. For example, in block designs, we are interested in the estimation of treatment contrasts only, other parameters such as block effects and general mean are considered as nuisance parameters. One may, therefore, be interested to see the effect of an outlying observation on the estimation of treatment contrasts. Therefore, there is a need to develop this statistic appropriately for this field.

3.1.1 Development of test-statistics

Cook-statistic in general model for experimental designs. Consider the general linear model

$$y = X\theta + e; \quad E(e) = 0; \quad D(e) = \sigma^2 I_n, \quad \sigma^2 > 0,$$

where y is an n × 1 vector of observations, X is an n × p full rank matrix of known constants, $\theta$ is a p × 1 vector of unknown parameters, and e is an n × 1 vector of independent random variables each with zero mean and variance $\sigma^2 > 0$. To determine the degree of influence the i-th data point has on the estimate $\hat{\theta}$, a natural first step would be to compute the least squares estimate of $\theta$ with the point deleted. Accordingly, let $\hat{\theta}_{(i)}$ denote the least squares estimate of $\theta$ with the i-th point deleted. An easily interpretable measure of the distance of $\hat{\theta}_{(i)}$ from $\hat{\theta}$ is given by Cook (1977) as

$$D_i = \frac{(\hat{\theta} - \hat{\theta}_{(i)})'\,[D(\hat{\theta})]^{-1}\,(\hat{\theta} - \hat{\theta}_{(i)})}{\mathrm{Rank}[D(\hat{\theta})]}.$$
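As a minimal sketch of the deletion idea, the following Python code computes the Cook-statistic for a simple linear regression by refitting the line with each point removed in turn. It assumes the full rank case, where the dispersion matrix of the estimate is σ²(X′X)⁻¹ with rank p = 2, and it replaces σ² by the residual mean square. The function names and the data are hypothetical and serve only as an illustration.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b1 * xbar, b1


def cook_distances(x, y):
    """Cook-statistic for each point, obtained by deleting it and refitting."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - p)                      # estimate of sigma^2
    # Elements of X'X for the design matrix with columns (1, x)
    sum_x, sum_xx = sum(x), sum(xi * xi for xi in x)
    distances = []
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        d0, d1 = b0 - c0, b1 - c1           # change in the estimates
        quad = n * d0 * d0 + 2 * sum_x * d0 * d1 + sum_xx * d1 * d1
        distances.append(quad / (p * s2))
    return distances


# Hypothetical data: the last point is remote in x and off the trend
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 7.0]
for i, d in enumerate(cook_distances(x, y), start=1):
    print(i, round(d, 3))
```

Refitting n times is wasteful for large data sets, but it mirrors the definition directly and makes the meaning of the statistic easy to see.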

The statistic provides a measure of the distance between $\hat{\theta}$ and $\hat{\theta}_{(i)}$ in terms of descriptive levels of significance, because under normal theory the (1-α) × 100% confidence ellipsoid for the parameter vector is the set of values satisfying $D \le F(p, n-p, 1-\alpha)$. The extension of $D_i$ to more than one outlier is straightforward. For the usual interpretation of the Cook-statistic see Cook (1977, 1979). Now we consider the general linear model for an experimental design d (say). The model is the same except that the rank of X is now m(


the set of all (v-1) orthonormalized contrasts for the parameters $\theta_1$ be given by $P\theta_1$. Let t observations be suspected of being outliers in the sense that their expected values are shifted from the expected values of the other observations. We keep t known. We also assume that the residual design obtained after deleting the t observations from the original design d remains connected, i.e., $\mathrm{Rank}(C_{\theta_1(t)}) = v-1$, and that the best linear unbiased estimator of the set of all orthonormalized contrasts of $\theta_1$ is given by $P\hat{\theta}_{1(t)}$. We give the Cook-statistic for the contrasts $P\theta_1$ of $\theta_1$ in experimental designs as follows.

Definition: The Cook-statistic for the contrasts $P\theta_1$ is given by

$$D_t = \frac{(P\hat{\theta}_1 - P\hat{\theta}_{1(t)})'\,[D(P\hat{\theta}_1)]^{-1}\,(P\hat{\theta}_1 - P\hat{\theta}_{1(t)})}{\mathrm{Rank}[D(P\hat{\theta}_1)]},$$

where $\hat{\sigma}^2$ is substituted for $\sigma^2$. As stated earlier, this provides a measure of the distance between $\hat{\theta}_{1(t)}$ and $\hat{\theta}_1$ in terms of descriptive levels of significance. Suppose, for example,

$D_i \approx F(p, n-p, 0.5)$; then the removal of the i-th data point moves the least squares estimate to the edge of the 50% confidence region for the parameters based on the full-data estimate. Such a situation may be cause for concern. For an uncomplicated analysis one would like each $\hat{\theta}_{1(t)}$ to stay well within a 10%, say, confidence region. If we denote by $r_i^*$ and $t_i$ the ordinary and Studentized residuals, respectively, for the outlying observation, then

$$r_i^* = y_i - \hat{y}_i \quad \text{and} \quad t_i = \frac{r_i^*}{\sigma\sqrt{v_{ii}}},$$

where $v_{ii}$ is the i-th diagonal element of the matrix $V = I - X(X'X)^{-}X'$. Then $D_i$ can be written as

$$D_i = \frac{s_{ii}}{v_{ii}}\,\frac{t_i^2}{p-1}.$$

Remark: The square of the Studentized residual, $t_i^2$, which is a monotonic function of the corresponding residual, is an outlier measure. A very high value of $t_i^2$ arises from a very extreme observation. Also note that $s_{ii}$ is the variance of the i-th treatment contrast of the set $P\theta_1$ and $v_{ii}$ is the variance of $r_i^*$. Thus $s_{ii}/v_{ii}$ measures the relative sensitivity of the estimate of that contrast to potential outlying values at each data point. A large value of $s_{ii}/v_{ii}$ indicates a large effect on the estimation of treatment contrasts. Thus the two measures combine to produce a measure of the overall impact any single point has on the least squares solution (see Cook, 1977).

Example: Table 1 contains the grain yield of rice variety IR8 with six different rates of seeding, from a randomized complete block design (RCBD) experiment with four replications.

Table 1
                    Treatment (rate of seeding)
Replication      1      2      3      4      5      6
     1         5113   5346   5272   5164   4804   5254
     2         5398   5952   5713   4831   4848   4542
     3         5307   4719   5483   4986   4432   4919
     4         4678   4264   4749   4410   4748   4098
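The analysis of this example can be retraced with a short Python sketch. It assumes that each row of Table 1 is a replication (block) and each column a seeding rate, and it uses two quantities that are assumptions of this sketch rather than statements from the example itself: for a randomized block design with b blocks and v treatments, the residual variance factor is taken as v_ii = (b-1)(v-1)/(bv) and the contrast variance factor as s_ii = 1/b - 1/(bv). The variable names are hypothetical.

```python
# Table 1: rows are replications (blocks), columns are the six seeding rates
yields = [
    [5113, 5346, 5272, 5164, 4804, 5254],
    [5398, 5952, 5713, 4831, 4848, 4542],
    [5307, 4719, 5483, 4986, 4432, 4919],
    [4678, 4264, 4749, 4410, 4748, 4098],
]
b, v = len(yields), len(yields[0])
n = b * v

grand = sum(map(sum, yields)) / n
block_mean = [sum(row) / v for row in yields]
treat_mean = [sum(yields[j][i] for j in range(b)) / b for i in range(v)]

# Analysis of variance for the randomized block layout
ss_treat = b * sum((m - grand) ** 2 for m in treat_mean)
resid = [[yields[j][i] - treat_mean[i] - block_mean[j] + grand
          for i in range(v)] for j in range(b)]
ss_error = sum(e * e for row in resid for e in row)
mse = ss_error / ((b - 1) * (v - 1))
f_value = (ss_treat / (v - 1)) / mse

# Assumed variance factors for a randomized block design (see lead-in)
v_ii = (b - 1) * (v - 1) / n
s_ii = 1 / b - 1 / n

# Observations numbered treatment by treatment, blocks within treatment
cook = []
for i in range(v):
    for j in range(b):
        t2 = resid[j][i] ** 2 / (mse * v_ii)      # squared Studentized residual
        cook.append((s_ii / v_ii) * t2 / (v - 1))

print(f"F = {f_value:.2f}")
for obs in (6, 20):
    print(f"D_{obs} = {cook[obs - 1]:.3f}")
```

Under these assumptions the sketch should give an F value of about 2.17 and Cook-statistics of roughly 0.38 and 0.25 for observations 6 and 20, in line with the values reported for this example.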


Through the analysis of variance of these data, the calculated F value is 2.17, which is not significant at either the 5% or the 1% level of significance. That is, there is no significant difference among the treatment effects. We now compute the Cook-statistic for each of these observations. The observations are numbered in ascending order of treatments and blocks. Table 2 presents the Cook-statistic for these observations (up to 3 decimals).

Table 2
Obs. No.    Di       Obs. No.    Di       Obs. No.    Di
    1      .042          9      .051         17      .010
    2      .000         10      .022         18      .012
    3      .027         11      .025         19      .081
    4      .000         12      .007         20      .249
    5      .005         13      .013         21      .119
    6      .379         14      .070         22      .166
    7      .129         15      .014         23      .038
    8      .110         16      .000         24      .018

From Table 2, it is clear that observation number 6, pertaining to the second treatment in the second block, and observation number 20, pertaining to the fifth treatment in the fourth block, stand out. Removal of the first of these, observation number 6, will move the least squares estimate to the edge of the 50% confidence region around $P\hat{\tau}$, which is a matter of concern. Similarly, removal of the second point moves the least squares estimate to the edge of the 40% confidence region around $P\hat{\tau}$, which is again a matter of concern. Thus these two observations have a great influence on the estimation of treatment contrasts, the prime interest of an experimenter. These two observations alone probably influenced the whole analysis of variance. The natural next question is what should be done with these two observations. The data have been reanalyzed after deleting these two observations, one at a time and then both together. After deletion the data are non-orthogonal and were analyzed accordingly. The calculated F value after deleting observation number 6 is 3.19, which is significant at the 5% level. Thus the treatments are significantly different from each other.
The calculated CD (critical difference) values are 411.14 and 444.08 for comparisons between two unaffected treatments and between one affected and one unaffected treatment, respectively, which means that observation number 6 is an influential observation. Similarly, the calculated F value after deleting observation number 20 is 3.44, which is again significant at the 5% level; the CD values for this case are 452.269 and 480.507, which means that observation number 20 is also an influential observation.

3.1.2 AP-statistic in General Linear Model for Experimental Designs

In this section we present another useful statistic, the AP-statistic, given by Andrews and Pregibon (1978) for the regression model. This statistic cannot be applied as such to experimental designs, because in experimental designs the design matrix is singular. The statistic is therefore developed appropriately for this field as follows.

Definition: The AP-statistic in experimental designs is given as

$$AP_t = \lvert U'MU \rvert \left(1 - \frac{r_2'\,(U'MU)^{-1}\,r_2}{RSS}\right),$$

where the matrices U and M depend on the design concerned, $r_2$ is the residual vector for the outlying observations and RSS is the residual sum of squares.

Following Draper and John (1981), the first factor involves only the independent variables and provides a measure of the remoteness of the set of observations in the factor space, a smaller value indicating a more remote point. The second factor will be small if $r_2$ is large and so identifies a set of outliers (see also John and Draper, 1978).

3.2 Outliers in Some Specific Designs

In the last section we derived some test-statistics for outlier detection in general linear models. In the present section we apply these test-statistics to designs for both one-way and two-way elimination of heterogeneity.

3.2.1 Outliers in Designs for One-Way Elimination of Heterogeneity

3.2.2 Some Preliminaries

Consider the usual intra-block model of n observations

$$y = \mu 1 + \Delta'\tau + D'\beta + e, \quad E(e) = 0, \quad D(e) = \sigma^2 I_n.$$

Here ∆′ is an n × v (0-1) design matrix for treatments, D′ is an n × b (0-1) design matrix for blocks, µ is the general mean, τ is a v-component vector of treatment effects and β is a b-component vector of block effects. Also, ∆′1 = 1, D′1 = 1, ∆1 = r, D1 = k, where r = (r1,..., rv)′ and k = (k1,...,kb)′ are the vectors of replications and block sizes, respectively. We also define the matrix N = ∆D′ = ((nij)), where the non-negative integer nij denotes the number of times the i-th treatment appears in the j-th block.

Single Outlier. Cook-Statistic. Without any loss of generality we assume that the observation pertaining to the first treatment in the first block is an outlier. Then the incidence matrix N can be written as

$$N = \begin{bmatrix} 1 & \varepsilon' \\ f & N_0 \end{bmatrix},$$

where f is a (v-1)-component (0-1) vector of incidence of the remaining (v-1) treatments in the first block, $\varepsilon$ is a (b-1)-component (0-1) vector of incidence of the first treatment in the remaining (b-1) blocks, and $N_0$ is the incidence matrix of the remaining (v-1) treatments in the remaining (b-1) blocks. Now, from the Definition, the Cook-statistic for the set of treatment contrasts $P\tau$ in a block design is given by


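$$D_1 = \frac{\hat{\delta}_1 u_1' S u_1 \hat{\delta}_1}{(v-1)\sigma^2} = \frac{s_{11}\,\hat{\delta}_1^2}{(v-1)\sigma^2} = \frac{s_{11}}{v_{11}}\,\frac{t_1^2}{v-1},$$

where

$$s_{11} = \frac{k_1 - 1}{k_1}\,U_0' C_\tau^{+} U_0 \quad \text{and} \quad U_0 = [k_1(k_1 - 1)]^{-1/2} \begin{pmatrix} k_1 - 1 \\ -f \end{pmatrix}.$$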
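As a quick check on the scaling of $U_0$, the following short calculation takes $U_0 = [k_1(k_1-1)]^{-1/2}\,(k_1 - 1,\; -f')'$ and assumes the design is binary, so that the vector f has exactly $k_1 - 1$ unit entries (the other treatments in the first block). Both are assumptions made here for illustration. Under them, $U_0$ has unit length:

$$U_0' U_0 = \frac{(k_1 - 1)^2 + f'f}{k_1(k_1 - 1)} = \frac{(k_1 - 1)^2 + (k_1 - 1)}{k_1(k_1 - 1)} = \frac{(k_1 - 1)\,k_1}{k_1(k_1 - 1)} = 1.$$

With $U_0$ normalized in this way, $U_0' C_\tau^{+} U_0$ cannot exceed the largest eigenvalue of $C_\tau^{+}$, which is one way to see why the eigenvalues of the C-matrix enter the sensitivity of a design to a single outlier.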

Remark: The square of the Studentized residual, $t_1^2$, which is a monotonic function of the corresponding residual, is an outlier measure. A very high value of $t_1^2$ arises from a very extreme observation. Also note that $s_{11}$ is the variance of the first treatment contrast of the set of contrasts and $v_{11}$ is the variance of $r_1^*$. Thus $s_{11}/v_{11}$ measures the relative sensitivity of the estimate of the first treatment contrast to potential outlying values at each data point. A large value of $s_{11}/v_{11}$ indicates a large effect on the estimation of treatment contrasts. Thus the two measures combine to produce a measure of the overall impact any single point has on the least squares solution (see Cook, 1977).

From the expressions for the Cook-statistic given in this section and the Remark, it is clear that for a design to be less sensitive, in terms of the degree of influence of an outlier on the estimation of treatment contrasts, it must give a small value of the Cook-statistic. Thus, given a choice among a class of competing designs, we should choose a design that results in a small value of the Cook-statistic, i.e., one which automatically takes care of the presence of a single outlier. Let D(v, b, r, k, N) denote the class of all binary block designs; then we have the following result.

Lemma: In the class of all binary block designs D(v, b, r, k, N), the E-optimal designs are least sensitive to the presence of a single outlier when one is interested in estimating the treatment differences.

AP-statistic. For a single outlier in block designs, the AP-statistic is related to the Cook-statistic by

$$D_1 = \frac{s_{11}}{v_{11}}\,\frac{RSS\,(v_{11} - AP_1)}{(v-1)\hat{\sigma}^2}.$$

Remark: We have seen that a very small value of the AP-statistic corresponds to outlying observations. Note that $v_{11}$ is a decreasing function of $U_0' C_\tau^{+} U_0$. Thus it is clear that minimization of $U_0' C_\tau^{+} U_0$ implies maximization of $AP_1$. Thus it is established again that the E-optimal designs are least affected by the presence of a single outlier as far as the AP-statistic is concerned.

3.2.3 Multiple Outliers

In section 3.1.2 we studied the effect of a single outlier in experimental designs for one-way elimination of heterogeneity. However, in some data sets, subsets of cases can be jointly influential although individually uninfluential. In the present section we study the case of multiple outliers.

For testing t outliers in experimental designs for one-way elimination of heterogeneity, the expressions for the Cook-statistic and the AP-statistic given earlier can be used directly by replacing the matrices B and $C_{\theta_1}$ by the corresponding matrices in designs for one-way elimination of heterogeneity. However, more insight into $D_t$ in block designs can be obtained once we assume some specific pattern of occurrence of outliers. We consider the following patterns.

Case 1. Suppose that all the observations in a block of a binary block design are suspected to be outliers. Without loss of generality we assume that the observations belonging to the 1st block are outliers. Then the denominator of $D_t$ is undefined, and in such a situation the Cook-statistic is not defined.

Remark: The present case is an example of a situation where we cannot compute the Cook-statistic. Thus, apart from the problem of rank deficiency in the design matrix, the application of the Cook-statistic in experimental designs involves the further problem just described. However, this type of problem also arises in the full rank regression model; Cook and Weisberg (1980) suggest setting $D_I = \alpha$ in that case. In such a situation the AP-statistic is also not defined.

Case 2. Suppose any t observations pertaining to the same treatment of a proper binary block design are suspected to be outliers. Without loss of generality we assume that the t outlying observations belong to the first t blocks. Then an upper bound for $D_t$ is given by

$$D_t \le \frac{k\lambda_t}{(k-1)(1-\lambda_t)(v-1)\hat{\sigma}^2}\sum r_i^2,$$

where $\lambda_t$ is the largest eigenvalue of $U_0' C_\tau^{+} U_0$. Thus, for testing t outliers which belong to the same treatment, we need to calculate the sum of squares of the residuals of the corresponding observations, other things being constant. If this quantity is significant we go for the exact test; otherwise we abandon the calculation. Moreover, we see that an E-optimal design is again least affected.

3.2.3 Outliers in Designs for Two-Way Elimination of Heterogeneity

In this section we study various test-statistics for testing outliers in designs for two-way elimination of heterogeneity.

Single Outlier. Let the i-th observation be an outlier. Then, in the same manner as for designs for one-way elimination of heterogeneity, we get

Cook-statistic:

$$D_i = \frac{U_i^{*\prime} C\, U_i^{*}}{g_2\left(g_1 - U_i^{*\prime} C\, U_i^{*}\right)}\,\frac{t_i^2}{v-1},$$

where g1, g2 are some constants depending on the design considered.

Remark 3.3. Following the same argument as in Lemma 3.1, we conclude that E-optimal designs are least sensitive to the presence of a single outlier.

3.3. Robustness of Experimental Designs Against Outliers

In the last section we presented some appropriate test-statistics for experimental designs along with their application to some specific designs. Before analyzing the data we have


to apply these diagnostic procedures to detect outliers, if any. We have also seen that an outlier may not be an influential observation; in that case we need not be concerned about its presence. A number of standard designs are available in the literature, each with its own merits and demerits. So, instead of checking whether each observation is influential or not, one may prefer to adopt a design which is insensitive to, or robust against, the presence of an outlier. Extensive studies of this kind for missing observations are available in the literature. However, robustness studies with respect to outliers are very limited. Box and Draper (1975) first introduced the study of robustness against the presence of a single outlier in designed experiments. However, the designs they considered were essentially response surface designs. Gopalan and Dey (1976) extended this study to other block designs in which the design matrix is deficient in rank. Further, Singh et al. (1987) extended this study to designs eliminating heterogeneity in two directions. These authors considered the unbiased estimation of error variance in the presence of a single outlier as the robustness criterion. In the present chapter we have taken up the study of robustness of experimental designs with a completely different robustness criterion. We have seen that the Cook-statistic is very useful for detecting an influential observation. We therefore take the average Cook-statistic as the robustness criterion. However, we have confined our study to the single outlier case. Detailed descriptions of the different designs considered in this section are available in many textbooks and published papers.

3.3.1 Robustness Against the Presence of a Single Outlier in Designs for One-Way Elimination of Heterogeneity

In this section we study the robustness of designs for one-way elimination of heterogeneity against the presence of a single outlier. To begin with, we develop a suitable robustness criterion and then find designs which are robust according to this criterion.

Robustness criterion. We recall the Cook-statistic for testing a single outlier in designs for one-way elimination of heterogeneity,

$$D_i = \frac{s_{ii}}{v_{ii}}\,\frac{t_i^2}{v-1}.$$

Here we have assumed that the i-th observation is an outlier and the statistic $D_i$ is rewritten accordingly. Observe that $D_i$ is an increasing function of $s_{ii}$, and that $t_i^2$, being a function of the residual, is an outlier measure. A very high value of $t_i^2$ corresponds to an outlying observation and, combined with $s_{ii}$, measures the influence of an outlier. An outlier may occur in any of the n observations. A design for which the average of the Cook-statistic over all possible outliers is minimum may be termed a robust design against the presence of a single outlier.

Average of Cook-statistic. The average Cook-statistic $\bar{D}$ is

$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n}\frac{s_{ii}}{v_{ii}}\,\frac{t_i^2}{v-1}.$$
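As a small illustration of the averaging step, the Cook-statistics printed in Table 2 can be averaged directly. The sketch below only restates those 24 tabulated values; the names `table2` and `average_cook` are hypothetical.

```python
# Cook-statistics D_i as printed in Table 2, observations 1 to 24
table2 = [
    0.042, 0.000, 0.027, 0.000, 0.005, 0.379, 0.129, 0.110,
    0.051, 0.022, 0.025, 0.007, 0.013, 0.070, 0.014, 0.000,
    0.010, 0.012, 0.081, 0.249, 0.119, 0.166, 0.038, 0.018,
]


def average_cook(d):
    """Average Cook-statistic over all possible single outliers."""
    return sum(d) / len(d)


d_bar = average_cook(table2)
# Observation number (1-based) with the largest individual Cook-statistic
most_influential = max(range(len(table2)), key=table2.__getitem__) + 1
print(round(d_bar, 4), most_influential)
```

The average summarizes the design and data as a whole, whereas the individual values point to particular observations; the two are complementary.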


Clearly, $\bar{D}$ is a weighted sum of squares of the $t_i$. $\bar{D}$ will be minimum when the weights $s_{ii}/v_{ii}$ are all equal or, in other words, when the $s_{ii}$ are all equal. Thus the study of robustness of a design requires the computation of the elements $s_{ii}$, i = 1,...,n, the diagonal elements of the matrix S, which is the variance-covariance matrix of the set of treatment contrasts. A design will be robust if all the components of the set of treatment contrasts are estimated with the same variance. The p-th component of the set is

$$\tau_i - \sum_{i'} n_{i'j}\,\frac{\tau_{i'}}{n_{.j}},$$

where the p-th observation pertains to the i-th treatment in the j-th block and $n_{ij}$ is the (i, j)-th element of N. If the least squares estimator of this component is $p_{ij}$, then

$$p_{ij} = \hat{\tau}_i - \sum_{i'} n_{i'j}\,\frac{\hat{\tau}_{i'}}{n_{.j}}.$$

Thus any design will be robust if and only if Var($p_{ij}$) is a constant

independent of i and j. These ideas are used in characterizing robust designs.

(i) Designs with proportional frequencies: Designs with proportional frequencies are characterized by the property $n_{ij} = n_{i.}\,n_{.j}/n$. Using this fact and doing some algebra, it can easily be shown that

$$\mathrm{Var}(p_{ij})/\sigma^2 = \frac{1}{n_{i.}} - \frac{1}{n}.$$

Thus a design with proportional frequencies cannot be robust except in the special case where all $n_{i.}$ are equal. The robustness of randomized block designs (RBD) follows from this.

(ii) Balanced binary designs: For a binary design $n_{ij}$ = 0 or 1. A balanced design is one which permits the estimation of all elementary contrasts among the treatment effects with equal variance. It is known that a necessary and sufficient condition for a design to be balanced is that its C-matrix has all the diagonal elements equal and all the off-diagonal elements equal. Let the off-diagonal element of the $C_\tau$-matrix of a balanced binary block design be -α. Then it can be shown that

$$\mathrm{Var}(p_{ij})/\sigma^2 = \frac{n_{.j} - 1}{n_{.j}\,v\alpha},$$

which is a constant if and only if $n_{.j}$ is a constant for all j. Since the design is balanced binary, this implies that $n_{i.}$ must also be a constant for all i. Hence a balanced binary design is robust if and only if it is equi-block sized and equi-replicate, i.e., if and only if it is a BIB design.

(iii) Partially balanced incomplete block (PBIB) designs with two-associate classes: Consider a two-associate class PBIB design with the usual parameters v, b, r, k, $\lambda_i$, $n_i$, $p^i_{jk}$ (i, j, k = 1, 2). Let D be the class of all two-associate class PBIB designs satisfying the following block structure:

(A): For any treatment i appearing in the j-th block, the number of first associates of i occurring in the same block is a constant (say g) independent of i and j.

After some algebra it can be shown easily that, for a two-associate class PBIB design belonging to D, Var($p_{ij}$)/σ² = {(k-1)B1 + gA2}/∆, which is a constant, where A and B are some constants depending on the design parameters. Thus all two-associate class PBIB designs satisfying the block structure in (A) are robust. We now present some examples of two-associate class PBIB designs having the block structure in (A).

(a) All non-group-divisible two-associate class PBIB designs with λ2 = 0: Since λ2 = 0, any two treatments which are mutually second associates do not occur together in any block. Thus all the treatments appearing in any block are first associates, which gives g = k-1, since the block size is fixed. Thus any two-associate class PBIB design with λ2 = 0 satisfies the block structure in (A).

(b) All semi-regular group-divisible (GD) designs: It is well known that for a semi-regular GD design, k = cm, where c is an integer and every block contains c treatments from each group. We know that treatments belonging to the same group are first associates. Thus we get g = c - 1. Hence such designs satisfy the block structure in (A).

(c) All triangular PBIB designs satisfying r + (n-4)λ1 - (n-3)λ2 = 0: It is well known that if, in a triangular design, one of the eigenvalues of the NN′ matrix is θ1 = r + (n-4)λ1 - (n-3)λ2 = 0, then 2k is divisible by n and every block of the design contains 2k/n treatments from each of the n rows of the association scheme. Thus such designs satisfy the block structure in (A), and clearly for such designs g = (4k/n) - 2.

(d) All Latin-square type designs with two constraints (L2) satisfying r + (s-2)λ1 - (s-1)λ2 = 0: An L2 type PBIB design is based on the Latin-square association scheme with v = s² treatments. Further, if one of the eigenvalues of the NN′ matrix is θ1 = r + (s-2)λ1 - (s-1)λ2 = 0, then k is divisible by s and in such a case every block of the design contains k/s treatments from each of the rows (columns) of the association scheme. Thus such designs satisfy the block structure in (A), and clearly g = (2k/s) - 2.

3.3.2 Robustness Against the Presence of a Single Outlier in Designs for Two-Way Elimination of Heterogeneity

In this section we study the robustness of designs for two-way elimination of heterogeneity against the presence of a single outlier. We use the same criterion as developed in section (4.1) for studying the robustness of such designs. Consider the linear model of designs for two-way elimination of heterogeneity. The average Cook-statistic for such designs is

$$\bar{D} = \frac{1}{n}\sum \frac{s_{ii}}{v_{ii}}\,\frac{t_i^2}{v-1}.$$

Thus we get the same robustness criterion as developed for designs for one-way elimination of heterogeneity, i.e., all $s_{ii}$ should be equal. Here

$$p_{ijl} = \hat{\tau}_i - \sum_{i'} n_{i'j}\,\frac{\hat{\tau}_{i'}}{n_{.j}} - \sum_{i'} n_{i'l}\,\frac{\hat{\tau}_{i'}}{n_{.l}},$$

where the p-th observation pertains to the i-th treatment in the j-th row and l-th column, and N1 = ((nij)), N2 = ((nil)). Thus any design will be robust if and only if Var($p_{ijl}$) is a constant independent of i, j and l. These ideas are used in characterizing robust designs.


(i) Designs with proportional frequencies: Designs with proportional frequencies are characterized by the property $n_{ij} = n_{i.}\,n_{.j}/n$ and $n_{il} = n_{i.}\,n_{.l}/n$. Using this fact and doing some algebra, it can be shown that

$$\mathrm{Var}(p_{ijl})/\sigma^2 = \frac{1}{n_{i.}} - \frac{1}{n}.$$

Thus a design with proportional frequencies cannot be robust except in the special case where the $n_{i.}$ are all equal. The robustness of Latin square designs (LSD) follows from this.

(ii) Balanced binary row-column designs: For a binary design the elements of the matrices N1 and N2 are 0 or 1, the row sizes are q each and the column sizes are p each. As mentioned earlier, a balanced design is one which permits the estimation of all elementary contrasts among the treatment effects with equal variance. It is known that a necessary and sufficient condition for a design to be balanced is that all the diagonal elements of the $C_\tau$-matrix are equal and all the off-diagonal elements of the $C_\tau$-matrix are equal. Let the off-diagonal element of the $C_\tau$-matrix of a balanced binary design be -α, and assume that the number of treatments common to a row and a column is λ, fixed for all j and l; then

$$\mathrm{Var}(p_{ijl})/\sigma^2 = \frac{1}{v\alpha}\left\{\left(1 - \frac{n_{.j} + n_{.l}}{n_{.j}\,n_{.l}}\right)^2 + \lambda\left(\frac{1}{n_{.j}} + \frac{1}{n_{.l}}\right)^2 + \frac{n_{.j} - \lambda - 1}{n_{.j}^2} + \frac{n_{.l} - \lambda - 1}{n_{.l}^2}\right\},$$

which is a constant if $n_{.j}$ is a constant for all j and $n_{.l}$ is a constant for all l. Since the design is binary, this implies that $n_{i.}$ must also be a constant for all i. Thus a balanced binary design is robust if and only if it is equi-row sized, equi-column sized and equi-replicate, with a constant number of treatments common to any row and column.

(iii) Some other useful row-column designs: If the treatments vs. rows classification, or the treatments vs. columns classification, or both, are orthogonal, then it can be easily shown that Var($p_{ijl}$)/σ² = constant; hence such designs are robust. Thus balanced row-column designs in which the treatments vs.
rows classification is orthogonal are robust. Robustness of Youden square designs follows from this.

4. Remedial Measures

4.1 Non-constant error variances

This problem is tackled by variance-stabilizing transformations. If the distribution of y is Poisson, we could regress $y' = \sqrt{y}$ against x, since the variance of the square root of a Poisson random variable is independent of the mean. As another example, if the response variable is a proportion ($0 \le y_i \le 1$), then the arcsine transformation $y' = \sin^{-1}(\sqrt{y})$ is appropriate. Several commonly used variance-stabilizing transformations are summarized below:

Relationship of $\sigma^2$ to E(y)        Transformation
$\sigma^2 \propto$ constant               $y' = y$ (no transformation)
$\sigma^2 \propto E(y)$                   $y' = \sqrt{y}$ (square root; Poisson data)
$\sigma^2 \propto E(y)[1 - E(y)]$         $y' = \sin^{-1}(\sqrt{y})$ (arcsine; binomial proportions $0 \le y_i \le 1$)
$\sigma^2 \propto [E(y)]^2$               $y' = \ln(y)$ (log)
$\sigma^2 \propto [E(y)]^3$               $y' = y^{-1/2}$ (reciprocal square root)
$\sigma^2 \propto [E(y)]^4$               $y' = y^{-1}$ (reciprocal)
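The square-root entry of the table can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the means (10, 50, 200), the sample size, the seed and the function name `variance_before_after` are all illustrative assumptions. It draws Poisson samples at several means and compares the sample variance of y with that of $\sqrt{y}$; for moderately large means the latter should stay close to 1/4 while the former grows with the mean.

```python
import numpy as np


def variance_before_after(means, n=200_000, seed=0):
    """Return (mean, Var(y), Var(sqrt(y))) for simulated Poisson samples.

    The means, sample size and seed are illustrative choices only.
    """
    rng = np.random.default_rng(seed)
    rows = []
    for mu in means:
        y = rng.poisson(mu, size=n)          # raw counts: variance equals the mean
        var_raw = y.var(ddof=1)              # sample variance on the original scale
        var_sqrt = np.sqrt(y).var(ddof=1)    # sample variance after y' = sqrt(y)
        rows.append((mu, var_raw, var_sqrt))
    return rows


results = variance_before_after([10, 50, 200])
for mu, var_raw, var_sqrt in results:
    print(f"mean={mu:>4}  Var(y)={var_raw:8.2f}  Var(sqrt y)={var_sqrt:.3f}")
```

The stabilization is only approximate: for small means the variance of $\sqrt{y}$ drifts away from 1/4, so the simulation is best read as an illustration of the table rather than a proof.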

Transformations also make the distribution of the transformed variable closer to the normal distribution. It is important to detect and correct a non-constant error variance. If this problem is not eliminated, the least squares estimators will still be unbiased, but they will no longer have the minimum variance property. This means that the regression coefficients will have larger standard errors than necessary. The effect of the transformation is usually to give more precise estimates of the model parameters and increased sensitivity for the statistical tests.

When the response variable has been re-expressed, the predicted values are in the transformed scale. It is often necessary to convert the predicted values back to the original units. Unfortunately, applying the inverse transformation directly to the predicted values gives an estimate of the median of the distribution of the response instead of the mean. It is usually possible to devise a method for obtaining unbiased predictions in the original units. The problem of non-constant error variances can also be tackled using weighted least squares.

4.2 Influential Observations

The reason for our concern with outlying cases is that the method of least squares is particularly susceptible to these cases, resulting sometimes in a seriously distorted fitted model for the remaining cases. A crucial question that arises now is how to handle highly influential cases.

(1) A first step is to examine whether an outlying case is the result of a recording error, breakdown of a measurement instrument, or the like. If erroneous data can be corrected, this should be done. Often, however, erroneous data cannot be corrected later on and should be discarded. Many times, unfortunately, it is not possible after the observations have been obtained to tell whether the observations for an outlying case are erroneous. Such cases should usually not be discarded.

If an outlying influential case is not erroneous, the next step should be to examine the adequacy of the model. Scientists frequently have primary interest in the outlying cases because they deviate from the currently accepted model. Examination of these outlying cases may provide important clues as to how the model needs to be modified. Outlying cases may also lead to the finding of other types of model inadequacies, such as the omission of an important variable or the choice of an incorrect functional form. The analysis of outlying influential cases can frequently lead to valuable insights for strengthening the model, such that the outlying case is no longer an outlier but is accounted for by the model. Discarding outlying influential cases that are not clearly erroneous and that cannot be accounted for by model improvement should be done rarely, such as when the model is not intended to cover the special circumstances related to the outlying cases.

(2) Robust regression: Robust regression procedures dampen the influence of outlying cases, as compared to ordinary least squares estimation, in an effort to provide a better fit for the majority of cases. They are useful when a known, smooth regression function is to be fitted to data that are "noisy," with a number of outlying cases, so that the assumption of a normal distribution for the error term is not appropriate. Robust regression procedures are also useful when automated regression analysis is required. Numerous robust regression procedures have been developed. However, these techniques cannot be applied to designed experiments, since in this case the design matrix is singular.

(a) LAR regression - Least absolute residuals (LAR) regression, also called minimum L1-norm regression, is an early robust regression procedure. It is insensitive to both outlying data values and inadequacies of the model employed. The method of least absolute residuals estimates the regression coefficients by minimizing the sum of the absolute deviations of the y observations from their fitted means.

(b) IRLS robust regression - Iteratively reweighted least squares (IRLS) robust regression uses the weighted least squares procedure to dampen the influence of outlying observations.

(c) LMS regression - Least median of squares (LMS) regression replaces the sum of squared deviations in ordinary least squares by the median of the squared deviations; the median is a robust estimator of location.

References

Andrews, D. F. and Pregibon, D. (1978). Finding the outliers that matter. J. Roy. Statist. Soc., Ser. B, 40, 87-93.
Barnett, V. and Lewis, T. (1984). Outliers in statistical data. New York: John Wiley.
Beckman, R. J. and Cook, R. D. (1983). Outlier.......s (with discussion). Technometrics, 25, 119-163.
Box, G. E. P. and Draper, N. R. (1975). Robust designs. Biometrika, 62, 347-352.
Cook, R. D. (1977). Detection of influential observation in linear regression. Technometrics, 19, 15-18.
Cook, R. D. (1979). Influential observations in linear regression. J. Amer. Statist. Assoc., 74, 169-174.
Cook, R. D. and Weisberg, S. (1980). Characterisation of empirical influence functions for detecting influential cases in regression. Technometrics, 22, 495-508.
Dey, A. (1986). Theory of block designs. Wiley Eastern Ltd., New Delhi.
Dey, A. (1993). Robustness of block designs against missing data. Statistica Sinica, 3, 219-231.
Draper, N. R. and John, J. A. (1981). Influential observations and outliers in regression. Technometrics, 23, 21-26.
Gopalan, R. and Dey, A. (1976). On robust experimental designs. Sankhya, Ser. B, 38, 297-299.
Marshall, A. W. and Olkin, I. (1979). Inequalities: theory of majorization and its applications. p. 247, New York: John Wiley.
Mickey, M. R., Dunn, O. J. and Clark, V. (1967). Note on the use of stepwise regression in detecting outliers. Computers & Biomedical Research, 1, 105-111.
Singh, G., Gupta, S. and Singh, M. (1987). Robustness of row-column designs. Statistics & Probability Letters, 5, 421-424.
Srikantan, K. S. (1961). Testing for a single outlier in a regression model. Sankhya, Ser. A, 23, 251-260.
