CHAPTER 3 HYPOTHESIS TESTING


Expected Outcomes
• Able to test a population mean when the population variance is known or unknown.
• Able to test the difference between two population means when the population variances are known or unknown.
• Able to test paired data using the z-test and t-test.
• Able to test a population proportion using the z-test.
• Able to test the difference between two population proportions using the z-test.
• Able to test a population variance and the ratio of two population variances.
• Able to determine the relationship between hypothesis testing and confidence intervals.
• Able to solve hypothesis testing problems using Microsoft Excel.

PREPARED BY: DR SITI ZANARIAH SATARI & SITI ROSLINDAR YAZIZ

CONTENT
3.1 Introduction to Hypothesis Testing
3.2 Test Hypotheses for Population Mean with Known and Unknown Population Variance
3.3 Test Hypotheses for the Difference between Two Population Means with Known and Unknown Population Variance
3.4 Test Hypotheses for Paired Data
3.5 Test Hypotheses for Population Proportion
3.6 Test Hypotheses for the Difference between Two Population Proportions
3.7 Test Hypotheses for Population Variance
3.8 Test Hypotheses for the Ratio of Two Population Variances
3.9 P-Values in Hypothesis Testing
3.10 Relationship between Hypothesis Tests and Confidence Interval

3.1 INTRODUCTION TO HYPOTHESIS TESTING
• A statistical hypothesis is a statement, conjecture or assertion concerning a parameter or parameters of one or more populations.
• Many problems in science and engineering require a decision to accept or reject a statement about some parameter. This is a decision-making process for evaluating claims or statements about the population(s).
• The decision-making procedure about the hypothesis is called hypothesis testing.

Three Methods of Hypothesis Testing
• The traditional method
• The P-value method
• The confidence interval method

3.1.1 TERMS AND DEFINITION
Definition 1a: A null hypothesis, denoted by $H_0$, is a statistical hypothesis that states an assertion about one or more population parameters.

Definition 1b: The alternative hypothesis, denoted by $H_1$, is a statistical hypothesis that states the assertion of all situations that are not covered by the null hypothesis.

Types of Hypothesis (here $\theta$ is the population parameter and $\theta_0$ is a specified value)

Two-tailed test: $H_0: \theta = \theta_0$, $H_1: \theta \neq \theta_0$
One-tailed test, right-tailed: $H_0: \theta \le \theta_0$, $H_1: \theta > \theta_0$
One-tailed test, left-tailed: $H_0: \theta \ge \theta_0$, $H_1: \theta < \theta_0$

Note:
(i) $H_0$ should contain the 'equals' sign and $H_1$ should not contain the 'equals' sign.
(ii) $H_0$ is on trial and is always initially assumed to be true.
(iii) Accept (do not reject) $H_0$ if the sample data are consistent with the null hypothesis.
(iv) Reject $H_0$ if the sample data are inconsistent with the null hypothesis, and accept the alternative hypothesis.

Definition 2: A test statistic is a sample statistic computed from the data obtained by random sampling, for example $z_{test}$, $t_{test}$, $\chi^2_{test}$ and $f_{test}$.

Definition 3: The rejection (critical) region is the set of values of the test statistic that leads to rejection of the null hypothesis. Under $H_0$ it has probability $\alpha$.
Definition 4: The acceptance region is the set of values of the test statistic that leads to acceptance of the null hypothesis. Under $H_0$ it has probability $1 - \alpha$.
Definition 5: The critical value(s) is the boundary value(s) that separates the rejection and acceptance regions.
Definition 6: The decision rule of a statistical hypothesis test is a rule that specifies the conditions under which the null hypothesis may be rejected. For example, in a right-tailed test: reject $H_0$ if test statistic > critical value.

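Rejection Region by Type of Test (illustrated for the test statistic $z$ with $\alpha = 0.05$)

Two-tailed test: $H_0: \theta = \theta_0$, $H_1: \theta \neq \theta_0$. Rejection region: both sides (both tails).
Right-tailed test: $H_0: \theta \le \theta_0$, $H_1: \theta > \theta_0$. Rejection region: right side (upper tail).
Left-tailed test: $H_0: \theta \ge \theta_0$, $H_1: \theta < \theta_0$. Rejection region: left side (lower tail).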

Definition 7: Rejecting the null hypothesis when it is true is defined as a Type I error.
$P(\text{Type I error}) = \alpha$ (significance level)

Definition 8: Failing to reject the null hypothesis when it is false in the state of nature is defined as a Type II error.
$P(\text{Type II error}) = \beta$

Possible Outcomes:
Statistical conclusion/decision | State of nature: $H_0$ is true | State of nature: $H_0$ is false
Reject $H_0$ | Type I error | Correct decision
Do not reject $H_0$ | Correct decision | Type II error

Example 2
The additive might not significantly increase the lifetimes of automobile batteries in the population, yet it might increase the lifetimes of the batteries in the sample. In this case, $H_0$ would be rejected when it is really true, which commits a Type I error. Conversely, the additive might not work on the batteries selected for the sample, even though it would significantly increase battery lifetimes if it were used on the general population of batteries. Based on the information obtained from the sample, $H_0$ would then not be rejected, which commits a Type II error.
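The meaning of $P(\text{Type I error}) = \alpha$ can be illustrated by simulation. The following is a minimal sketch, not part of the original notes: it repeatedly draws samples from a population in which $H_0$ is true and counts how often a two-tailed z-test rejects. The values of `mu0`, `sigma`, `n` and `trials` are arbitrary illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

# Illustrative values only; H0: mu = mu0 is true by construction.
mu0, sigma, n, alpha = 8.5, 0.16, 30, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

trials = 5000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    z_test = (xbar - mu0) / (sigma / sqrt(n))
    if abs(z_test) > z_crit:  # statistic falls in the rejection region
        rejections += 1

type1_rate = rejections / trials
print(type1_rate)  # proportion of wrong rejections, expected to be near alpha
```

Every rejection counted here is a Type I error, because the samples come from a population that satisfies $H_0$.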

Common Phrases in Hypothesis Testing

$>$ (appears in $H_1$): is greater than, is above, is higher than, is longer than, is bigger than, is increased
$<$ (appears in $H_1$): is less than, is below, is lower than, is shorter than, is smaller than, is decreased or reduced from
$\ge$ (appears in $H_0$): is greater than or equal, is at least, is not less than
$\le$ (appears in $H_0$): is less than or equal, is at most, is not more than
$=$ (appears in $H_0$): is equal to, is exactly the same as, has not changed from, is the same as
$\neq$ (appears in $H_1$): is not equal to, is different from, has changed from, is not the same as

3.1.2 PROCEDURES OF HYPOTHESIS TESTING
Step 1: Formulate the hypotheses and state the claim.
Two-tailed test: $H_0: \theta = \theta_0$, $H_1: \theta \neq \theta_0$
OR right-tailed test: $H_0: \theta \le \theta_0$, $H_1: \theta > \theta_0$
OR left-tailed test: $H_0: \theta \ge \theta_0$, $H_1: \theta < \theta_0$

Step 2: Choose the appropriate test statistic ($z_{test}$, $t_{test}$, $\chi^2_{test}$ or $f_{test}$) and calculate the sample test statistic value.

Step 3: Establish the test criterion by determining the critical value (point) and critical region. These depend on:
• the significance level, $\alpha$
• the inequality (≠, >, <) used in $H_1$

Step 4: Make a decision to reject or not to reject $H_0$.

Step 5: Draw a conclusion to reject or to accept the claim or statement.

Hypothesis Testing: Step by Step
• Extract all given information
• Define the parameter
• Define $H_0$ and $H_1$
• Choose the appropriate test statistic
• Find the critical value
• Test the hypothesis (rejection region)
• Make a conclusion: there is enough evidence to reject/accept the claim at $\alpha$

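3.2 TEST HYPOTHESES FOR POPULATION MEAN, $\mu$, WITH KNOWN AND UNKNOWN POPULATION VARIANCE

Two-tailed test: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Right-tailed test: $H_0: \mu \le \mu_0$, $H_1: \mu > \mu_0$
Left-tailed test: $H_0: \mu \ge \mu_0$, $H_1: \mu < \mu_0$

Test Statistics of Hypothesis Testing for Mean $\mu$

$z_{test} = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$ (population variance known)

$z_{test} = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$ (population variance unknown, large sample)

$t_{test} = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$ (population variance unknown, small sample)

Where: $\mu_0$ = population mean (the value stated in the hypothesis)

NOTE: $z_{test}$ and $t_{test}$ are test statistics.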

The Rejection Criteria (1)
i. If the population variance, $\sigma^2$, is known, the test statistic to be used is

$z_{test} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \sim z_\alpha$.

Therefore the rejection procedure for each type of hypothesis can be summarised as in Table 3.4.

Table 3.4: Hypothesis testing for $\mu$ with known $\sigma^2$
Statistical test: $z_{test} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
$H_0$ | $H_1$ | Reject $H_0$ if
$\mu = \mu_0$ | $\mu \neq \mu_0$ | $z_{test} > z_{\alpha/2}$ or $z_{test} < -z_{\alpha/2}$
$\mu \le \mu_0$ | $\mu > \mu_0$ | $z_{test} > z_\alpha$
$\mu \ge \mu_0$ | $\mu < \mu_0$ | $z_{test} < -z_\alpha$

The Rejection Criteria (2)
ii. If the population variance, $\sigma^2$, is unknown and the sample size is large, i.e. $n \ge 30$, then the test statistic to be used is

$z_{test} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim z_\alpha$.

Therefore the rejection procedure for each type of hypothesis can be summarised as in Table 3.5.

Table 3.5: Hypothesis testing for $\mu$ with unknown $\sigma^2$ and $n \ge 30$
Statistical test: $z_{test} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
$H_0$ | $H_1$ | Reject $H_0$ if
$\mu = \mu_0$ | $\mu \neq \mu_0$ | $z_{test} > z_{\alpha/2}$ or $z_{test} < -z_{\alpha/2}$
$\mu \le \mu_0$ | $\mu > \mu_0$ | $z_{test} > z_\alpha$
$\mu \ge \mu_0$ | $\mu < \mu_0$ | $z_{test} < -z_\alpha$

The Rejection Criteria (3)
iii. If the population variance, $\sigma^2$, is unknown and the sample size is small, i.e. $n < 30$, then the test statistic to be used is

$t_{test} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{\alpha, v}$, where $v = n - 1$.

Therefore the rejection procedure for each type of hypothesis can be summarised as in Table 3.6.

Table 3.6: Hypothesis testing for $\mu$ with unknown $\sigma^2$ and $n < 30$
Statistical test: $t_{test} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
$H_0$ | $H_1$ | Reject $H_0$ if
$\mu = \mu_0$ | $\mu \neq \mu_0$ | $t_{test} > t_{\alpha/2, n-1}$ or $t_{test} < -t_{\alpha/2, n-1}$
$\mu \le \mu_0$ | $\mu > \mu_0$ | $t_{test} > t_{\alpha, n-1}$
$\mu \ge \mu_0$ | $\mu < \mu_0$ | $t_{test} < -t_{\alpha, n-1}$

Example 3
Most water-treatment facilities monitor the quality of their drinking water on an hourly basis. One variable monitored is pH, which measures the degree of alkalinity or acidity in the water. A pH below 7.0 is acidic, above 7.0 is alkaline and 7.0 is neutral. One water-treatment plant has a target pH of 8.5 (most try to maintain a slightly alkaline level). The mean and standard deviation of 1 hour's test results based on 31 water samples at this plant are 8.42 and 0.16 respectively. Does this sample provide sufficient evidence that the mean pH level in the water differs from 8.5? Use a 0.05 level of significance. Assume that the population is approximately normally distributed.

Example 3: solution
Step 1: Formulate a hypothesis and state the claim.
X: pH level in the water
$H_0: \mu = 8.5$
$H_1: \mu \neq 8.5$ (claim)

Step 2: Choose the appropriate test statistic and calculate the sample test statistic value.
Since $\sigma^2$ is unknown (only $s^2 = 0.16^2$ is available) and $n = 31 \ge 30$, the test statistic is

$z_{test} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} = \dfrac{8.42 - 8.5}{0.16 / \sqrt{31}} = -2.7839$.

Step 3: Establish the test criterion by determining the critical value and rejection region.
From Table 3.5, for $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$, reject $H_0$ if $z_{test} > z_{\alpha/2}$ or $z_{test} < -z_{\alpha/2}$.
Given $\alpha = 0.05$ and the test is a two-tailed test, the critical values are $z_{0.025} = 1.9600$ and $-z_{0.025} = -1.9600$.

Step 4: Make a decision to reject or fail to reject $H_0$.
Since $z_{test} = -2.7839 < -z_{0.025} = -1.96$, we reject $H_0$.

Step 5: Draw a conclusion to reject or to accept the claim or statement.
At $\alpha = 0.05$, the sample provides sufficient evidence that the mean pH level in the water differs from 8.5.
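The five steps of Example 3 can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `z_test_mean` and its `tail` argument are illustrative choices and are not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, s, n, alpha, tail="two"):
    """Return (z_test, critical value, reject H0?) for a large-sample z-test."""
    z = (xbar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()  # standard normal distribution
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    else:  # left-tailed test
        crit = -std_normal.inv_cdf(1 - alpha)
        reject = z < crit
    return z, crit, reject


# Example 3: n = 31, sample mean 8.42, s = 0.16, H0: mu = 8.5, alpha = 0.05
z, crit, reject = z_test_mean(8.42, 8.5, 0.16, 31, 0.05, tail="two")
print(round(z, 4), round(crit, 4), reject)
```

The same function covers the right-tailed and left-tailed rows of Table 3.5 by changing `tail`.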

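3.3 TEST HYPOTHESES FOR THE DIFFERENCE BETWEEN TWO POPULATION MEANS

Two-tailed test: $H_0: \mu_1 - \mu_2 = \mu_0$, $H_1: \mu_1 - \mu_2 \neq \mu_0$
Right-tailed test: $H_0: \mu_1 - \mu_2 \le \mu_0$, $H_1: \mu_1 - \mu_2 > \mu_0$
Left-tailed test: $H_0: \mu_1 - \mu_2 \ge \mu_0$, $H_1: \mu_1 - \mu_2 < \mu_0$

Test Statistics for the Difference between Means

(i) $\sigma_1^2$ and $\sigma_2^2$ known:
$z_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim z_\alpha$

(ii) $\sigma_1^2$ and $\sigma_2^2$ unknown but equal, large samples:
$z_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim z_\alpha$

(iii) $\sigma_1^2$ and $\sigma_2^2$ unknown but equal, small samples:
$t_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{\alpha, n_1 + n_2 - 2}$
where $s_p = \sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$

(iv) $\sigma_1^2$ and $\sigma_2^2$ unknown and unequal, large samples:
$z_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim z_\alpha$

(v) $\sigma_1^2$ and $\sigma_2^2$ unknown and unequal, small samples:
$t_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t_{\alpha, \nu}$
where $\nu = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}$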

Example 4
The overall distance travelled by a golf ball is tested by hitting the ball with a golf club. Ten balls selected randomly from each of two different brands are tested, and the overall distance is measured. The data are given as follows.

Overall distance travelled by golf ball (in meters)
Brand 1: 251, 262, 263, 248, 259, 248, 255, 251, 240, 244
Brand 2: 236, 223, 238, 242, 250, 257, 248, 247, 240, 245

By assuming that the population variances are unequal, can we say that both brands of ball have similar average overall distance? Use α = 0.05.

Example 4: solution
Step 1:
$X_1$: Overall distance travelled by a golf ball from brand 1
$X_2$: Overall distance travelled by a golf ball from brand 2

The hypothesis is
$H_0: \mu_1 - \mu_2 = 0$ (claim)
$H_1: \mu_1 - \mu_2 \neq 0$

Step 2:
Statistic | Brand 1 | Brand 2
$n$ | 10 | 10
$\bar{x}$ | 252.1 | 242.6
$s$ | 7.6077 | 9.2640

Since $\sigma_1^2$ and $\sigma_2^2$ are unknown, $\sigma_1^2 \neq \sigma_2^2$, and $n_1 < 30$, $n_2 < 30$, the test statistic is

$t_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(252.1 - 242.6) - 0}{\sqrt{\dfrac{7.6077^2}{10} + \dfrac{9.2640^2}{10}}} = 2.5061$

Step 3: Given $\alpha = 0.05$ and the test is a two-tailed test, the critical value is $t_{\alpha/2, \nu} = t_{0.025, 17} = 2.1098$, where

$\nu = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}} = \dfrac{\left( \dfrac{7.6077^2}{10} + \dfrac{9.2640^2}{10} \right)^2}{\dfrac{(7.6077^2 / 10)^2}{9} + \dfrac{(9.2640^2 / 10)^2}{9}} = 17.3442 \approx 17$

Step 4: Since $t_{test} = 2.5061 > t_{0.025, 17} = 2.1098$, $H_0$ is rejected.
Step 5: At $\alpha = 0.05$, there is sufficient evidence to reject the claim that both brands of ball have similar average overall distance.
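The summary statistics, test statistic and degrees of freedom in Example 4 can be recomputed from the raw data. This is a minimal sketch with Python's standard library; the variable names are illustrative, and the critical value is simply copied from the t-table value quoted in Step 3 rather than computed.

```python
from math import sqrt
from statistics import mean, variance

brand1 = [251, 262, 263, 248, 259, 248, 255, 251, 240, 244]
brand2 = [236, 223, 238, 242, 250, 257, 248, 247, 240, 245]

n1, n2 = len(brand1), len(brand2)
v1 = variance(brand1) / n1  # s1^2 / n1 (sample variance uses n - 1)
v2 = variance(brand2) / n2  # s2^2 / n2

# Unequal-variance t statistic with hypothesised difference 0
t_test = (mean(brand1) - mean(brand2) - 0) / sqrt(v1 + v2)

# Approximate degrees of freedom, as in Step 3
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

t_crit = 2.1098  # t_{0.025,17}, taken from the t-table
reject = abs(t_test) > t_crit
print(round(t_test, 4), round(nu, 2), reject)
```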

3.4 TEST HYPOTHESES FOR PAIRED DATA

Two-tailed test: $H_0: \mu_D = \mu_0$, $H_1: \mu_D \neq \mu_0$
Right-tailed test: $H_0: \mu_D \le \mu_0$, $H_1: \mu_D > \mu_0$
Left-tailed test: $H_0: \mu_D \ge \mu_0$, $H_1: \mu_D < \mu_0$

Test Statistics

$t_{test} = \dfrac{\bar{x}_D - \mu_0}{s_D / \sqrt{n}} \sim t_{\alpha, v}$

where
• $D = X_1 - X_2$ are the differences between the paired samples
• $n$ is the number of paired samples
• $\bar{x}_D$ and $s_D$ are the mean and standard deviation of the differences of the paired samples, respectively
• $v = n - 1$ is the degrees of freedom
• $\mu_D = \mu_1 - \mu_2$ is the population mean difference

Example 5
A new gadget is installed in the air conditioner unit(s) in a factory to minimize the number of bacteria floating in the air. The number of bacteria floating in the air before and after the installation, recorded for a week in the factory, is as follows.

Before: 10.1, 11.6, 12.1, 9.1, 10.3, 15.3, 13.0
After: 11.2, 8.5, 8.4, 8.4, 8.0, 7.6, 7.2

Is it wise for the factory management to install the new gadget? By assuming the data is approximately normally distributed, test the hypothesis at the 5% level of significance.

Example 5: solution
$H_0: \mu_D \le 0$
$H_1: \mu_D > 0$ (wise to install the new gadget)
where $\mu_D$ is the mean of Before − After.

Before, $X_1$: 10.1, 11.6, 12.1, 9.1, 10.3, 15.3, 13.0
After, $X_2$: 11.2, 8.5, 8.4, 8.4, 8.0, 7.6, 7.2
$D = X_1 - X_2$: -1.1, 3.1, 3.7, 0.7, 2.3, 7.7, 5.8

$\bar{x}_D = 3.1714$, $s_D = 2.9669$

$t_{test} = \dfrac{3.1714 - 0}{2.9669 / \sqrt{7}} = 2.8281$

The critical value is $t_{0.05, 6} = 1.943$.

Since $t_{test} = 2.8281 > t_{0.05, 6} = 1.943$, $H_0$ is rejected.

It is wise for the factory management to install the new gadget, at the 5% level of significance.
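The paired-data calculation in Example 5 reduces to a one-sample t-test on the differences. A minimal sketch, assuming the Before/After pairing by position shown in the example; the variable names are illustrative and the critical value is the table value quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

before = [10.1, 11.6, 12.1, 9.1, 10.3, 15.3, 13.0]
after = [11.2, 8.5, 8.4, 8.4, 8.0, 7.6, 7.2]

# Differences D = X1 - X2 for each pair
d = [b - a for b, a in zip(before, after)]
n = len(d)

xbar_d = mean(d)
s_d = stdev(d)  # sample standard deviation (divides by n - 1)
t_test = (xbar_d - 0) / (s_d / sqrt(n))

t_crit = 1.943  # t_{0.05,6}, taken from the t-table
reject = t_test > t_crit  # right-tailed test
print(round(xbar_d, 4), round(s_d, 4), round(t_test, 4), reject)
```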

3.5 TEST HYPOTHESES FOR POPULATION PROPORTION

The hypothesis:
Two-tailed test: $H_0: \pi = \pi_0$, $H_1: \pi \neq \pi_0$
OR right-tailed test: $H_0: \pi \le \pi_0$, $H_1: \pi > \pi_0$
OR left-tailed test: $H_0: \pi \ge \pi_0$, $H_1: \pi < \pi_0$

The Test Statistics:

$z_{test} = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0 (1 - \pi_0)}{n}}} \sim z_\alpha$

where
$p = \dfrac{x}{n}$ is the sample proportion
$\pi_0$ is the given population proportion

Example 6
An attorney claims that at least 25% of all lawyers advertise. A sample of 200 lawyers in a certain city showed that 63 had used some form of advertising. At α = 0.05, is there enough evidence to support the attorney's claim?

Example 6: solution
Step 1: X is the number of lawyers who advertise.
$H_0: \pi \ge 0.25$ (claim)
$H_1: \pi < 0.25$

Step 2: Since $n = 200$ and $x = 63$, then $p = \dfrac{63}{200} = 0.315$.
The test statistic is

$z_{test} = \dfrac{0.315 - 0.25}{\sqrt{\dfrac{0.25 (0.75)}{200}}} = 2.1229$.

Step 3: Given $\alpha = 0.05$ and the test is a left-tailed test, the critical value is $-z_{0.05} = -1.6449$.

Step 4: Since $z_{test} = 2.1229 > -z_{0.05} = -1.6449$, the test statistic is not in the rejection region, so we fail to reject $H_0$.
Step 5: At $\alpha = 0.05$, there is not enough evidence to reject the attorney's claim that at least 25% of all lawyers advertise.
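The proportion test of Example 6 can be reproduced in a few lines. A minimal sketch with the standard library; variable names are illustrative, and the left-tailed critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

n, x = 200, 63      # sample size and number of lawyers who advertise
pi0 = 0.25          # hypothesised population proportion
alpha = 0.05

p = x / n  # sample proportion
z_test = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)

# Left-tailed test: reject H0 only if z_test < -z_alpha
z_crit = -NormalDist().inv_cdf(1 - alpha)
reject = z_test < z_crit
print(round(p, 3), round(z_test, 4), round(z_crit, 4), reject)
```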

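3.6 TEST HYPOTHESES FOR THE DIFFERENCE BETWEEN TWO POPULATION PROPORTIONS

The hypothesis:
Type of test | Hypothesis | Decision on rejection
Two-tailed test | $H_0: \pi_1 - \pi_2 = \pi_0$, $H_1: \pi_1 - \pi_2 \neq \pi_0$ | Reject $H_0$ if $z_{test} < -z_{\alpha/2}$ or $z_{test} > z_{\alpha/2}$
Right-tailed test | $H_0: \pi_1 - \pi_2 \le \pi_0$, $H_1: \pi_1 - \pi_2 > \pi_0$ | Reject $H_0$ if $z_{test} > z_\alpha$
Left-tailed test | $H_0: \pi_1 - \pi_2 \ge \pi_0$, $H_1: \pi_1 - \pi_2 < \pi_0$ | Reject $H_0$ if $z_{test} < -z_\alpha$

The Test Statistics:

If $\pi_0 \neq 0$:
$z_{test} = \dfrac{(p_1 - p_2) - \pi_0}{\sqrt{\dfrac{\pi_1 (1 - \pi_1)}{n_1} + \dfrac{\pi_2 (1 - \pi_2)}{n_2}}} \sim z_\alpha$
(in practice $\pi_1$ and $\pi_2$ in the denominator are estimated by $p_1$ and $p_2$)

If $\pi_0 = 0$:
$z_{test} = \dfrac{p_1 - p_2}{\sqrt{p_p (1 - p_p) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \sim z_\alpha$, where $p_p = \dfrac{X_1 + X_2}{n_1 + n_2}$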

Example 7
An experiment was conducted to determine whether increased levels of carbon dioxide (CO2) will kill leaf-eating insects. Two containers, labeled X and Y, were filled with two levels of CO2. Container Y had double the CO2 level of container X. Assume that 80 insect larvae were placed at random in each container. After two days, the percentages of larvae that died in containers X and Y were five percent and ten percent, respectively. Do these experimental results demonstrate that an increased level of CO2 is effective in killing leaf-eating insects' larvae? Test at the 1% significance level.

Example 7: solution
Step 1:
X: the number of larvae that died in container X
Y: the number of larvae that died in container Y
$H_0: \pi_Y - \pi_X \le 0$
$H_1: \pi_Y - \pi_X > 0$ (claim)

Step 2:
Statistic | Y | X
$n$ | 80 | 80
$p$ | 0.1 | 0.05
$x$ | 8 | 4

The test statistic is

$z_{test} = \dfrac{(p_Y - p_X) - \pi_0}{\sqrt{p_p (1 - p_p) \left( \dfrac{1}{n_Y} + \dfrac{1}{n_X} \right)}} = \dfrac{(0.1 - 0.05) - 0}{\sqrt{0.075 (1 - 0.075) \left( \dfrac{1}{80} + \dfrac{1}{80} \right)}} = 1.2006$

where $p_p = \dfrac{x_Y + x_X}{n_Y + n_X} = \dfrac{8 + 4}{80 + 80} = 0.075$.

Step 3: Given $\alpha = 0.01$ and the test is a right-tailed test, the critical value is $z_{0.01} = 2.3263$.

Step 4: Since $z_{test} = 1.2006 < z_{0.01} = 2.3263$, we fail to reject $H_0$.
Step 5: At $\alpha = 0.01$, there is no significant evidence to support that an increased level of carbon dioxide is effective in killing a higher percentage of leaf-eating insects' larvae.
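The pooled two-proportion calculation of Example 7 is sketched below with Python's standard library. The counts 8 and 4 are the numbers of dead larvae implied by 10% and 5% of 80; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_y, x_y = 80, 8   # container Y: 10% of 80 larvae died
n_x, x_x = 80, 4   # container X: 5% of 80 larvae died
alpha = 0.01

p_y, p_x = x_y / n_y, x_x / n_x
p_pooled = (x_y + x_x) / (n_y + n_x)  # pooled proportion, used when pi_0 = 0

z_test = (p_y - p_x) / sqrt(p_pooled * (1 - p_pooled) * (1 / n_y + 1 / n_x))

# Right-tailed test: reject H0 if z_test > z_alpha
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z_test > z_crit
print(round(p_pooled, 3), round(z_test, 4), round(z_crit, 4), reject)
```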

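3.7 TEST HYPOTHESES FOR A POPULATION VARIANCE

The hypothesis:
Type of test | Hypothesis | Decision on rejection
Two-tailed test | $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$ | Reject $H_0$ if $\chi^2_{test} < \chi^2_{1 - \alpha/2, n-1}$ or $\chi^2_{test} > \chi^2_{\alpha/2, n-1}$
Right-tailed test | $H_0: \sigma^2 \le \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$ | Reject $H_0$ if $\chi^2_{test} > \chi^2_{\alpha, n-1}$
Left-tailed test | $H_0: \sigma^2 \ge \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$ | Reject $H_0$ if $\chi^2_{test} < \chi^2_{1 - \alpha, n-1}$

The Test Statistics:

$\chi^2_{test} = \dfrac{(n - 1) s^2}{\sigma_0^2} \sim \chi^2_{\alpha, v = n - 1}$

where $s^2$ is the sample variance and $\sigma_0^2$ is the given variance.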

Example 8
Listed below are waiting times (in minutes) of customers at a bank.

6.5, 6.8, 7.1, 7.3, 7.4, 7.7

The management will open more teller windows if the standard deviation of waiting times is at least 0.9 minutes. Is there enough evidence to open more teller windows at α = 0.01?

Example 8: solution
Step 1: X is the waiting time (in minutes) of customers at a bank.
$H_0: \sigma^2 \ge 0.9^2$ (open more teller windows)
$H_1: \sigma^2 < 0.9^2$

Step 2: $n = 6$ customers, $\bar{x} = 7.13$ minutes, $s = 0.43$ minutes.
The test statistic is

$\chi^2_{test} = \dfrac{(n - 1) s^2}{\sigma_0^2} = \dfrac{(6 - 1)\, 0.43^2}{0.9^2} = 1.1414$

Step 3: Given $\alpha = 0.01$ and the test is a left-tailed test, the critical value is $\chi^2_{0.99, 5} = 0.554$.

Step 4: Since $\chi^2_{test} = 1.1414 > \chi^2_{0.99, 5} = 0.554$, we fail to reject $H_0$.

Step 5: At $\alpha = 0.01$, there is not enough evidence to reject the statement that the standard deviation of waiting times is at least 0.9 minutes, so the sample gives no reason against opening more teller windows.
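The chi-square statistic of Example 8 can be recomputed from the six waiting times. In this minimal sketch the sample standard deviation is rounded to two decimals to match the value $s = 0.43$ used in the worked solution (an unrounded $s$ gives a slightly larger statistic), and the critical value is copied from the chi-square table value quoted in Step 3. Variable names are illustrative.

```python
from statistics import mean, stdev

times = [6.5, 6.8, 7.1, 7.3, 7.4, 7.7]  # waiting times in minutes
sigma0 = 0.9                            # hypothesised standard deviation

n = len(times)
xbar = mean(times)
s = round(stdev(times), 2)  # rounded to match the worked solution

chi2_test = (n - 1) * s ** 2 / sigma0 ** 2

chi2_crit = 0.554  # chi-square_{0.99,5}, taken from the table
reject = chi2_test < chi2_crit  # left-tailed test
print(round(xbar, 2), s, round(chi2_test, 4), reject)
```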

3.8 TEST HYPOTHESES FOR THE RATIO OF TWO POPULATION VARIANCES

The hypothesis:
Type of test | Hypothesis | Decision on rejection
Two-tailed test | $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$ | Reject $H_0$ if $f_{test} < f_{1 - \alpha/2, n_1 - 1, n_2 - 1}$ or $f_{test} > f_{\alpha/2, n_1 - 1, n_2 - 1}$, where $f_{1 - \alpha/2, n_1 - 1, n_2 - 1} = \dfrac{1}{f_{\alpha/2, n_2 - 1, n_1 - 1}}$
Right-tailed test | $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$ | Reject $H_0$ if $f_{test} > f_{\alpha, n_1 - 1, n_2 - 1}$
Left-tailed test | $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$ | Reject $H_0$ if $f_{test} < f_{1 - \alpha, n_1 - 1, n_2 - 1}$, where $f_{1 - \alpha, n_1 - 1, n_2 - 1} = \dfrac{1}{f_{\alpha, n_2 - 1, n_1 - 1}}$

The Test Statistics:

$f_{test} = \dfrac{s_1^2}{s_2^2} \sim f_{v_1, v_2}$, where $v_1 = n_1 - 1$, $v_2 = n_2 - 1$

Example 9
A manager of computer operations of a large company wants to study the computer usage of two departments within the company. The departments are the Human Resource Department and the Research Department. The processing time (in seconds) for each job is recorded as follows:

Human Resource: 9, 3, 8, 7, 12
Research: 4, 13, 10, 9, 9, 6

Is there any difference in the variability of processing times for the two departments at α = 0.05?

Example 9: solution
Step 1:
$X_1$: processing time (in seconds) for each job from the Human Resource Department
$X_2$: processing time (in seconds) for each job from the Research Department

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$ (claim)

Step 2:
$n_1 = 5$, $\bar{x}_1 = 7.8$, $s_1 = 3.3$
$n_2 = 6$, $\bar{x}_2 = 8.5$, $s_2 = 3.1$

The test statistic is

$f_{test} = \dfrac{s_1^2}{s_2^2} = \dfrac{3.3^2}{3.1^2} = 1.1332$

Step 3: Given $\alpha = 0.05$ and the test is a two-tailed test, the critical values are

$f_{\alpha/2, n_1 - 1, n_2 - 1} = f_{0.025, 4, 5} = 7.3879$

$f_{1 - \alpha/2, n_1 - 1, n_2 - 1} = f_{0.975, 4, 5} = \dfrac{1}{f_{\alpha/2, n_2 - 1, n_1 - 1}} = \dfrac{1}{f_{0.025, 5, 4}} = \dfrac{1}{9.3645} = 0.1068$

Step 4: Since $f_{0.975, 4, 5} = 0.1068 < f_{test} = 1.1332 < f_{0.025, 4, 5} = 7.3879$, we fail to reject $H_0$.
Step 5: At $\alpha = 0.05$, there is not enough evidence of a difference in the variability of processing times for the two departments.
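The variance-ratio test of Example 9 is sketched below. The standard deviations are rounded to one decimal to match the values 3.3 and 3.1 used in the worked solution, and both critical values are copied from the F-table figures quoted in Step 3 (the standard library has no F distribution). Variable names are illustrative.

```python
from statistics import mean, stdev

human_resource = [9, 3, 8, 7, 12]
research = [4, 13, 10, 9, 9, 6]

n1, n2 = len(human_resource), len(research)
s1 = round(stdev(human_resource), 1)  # rounded as in the worked solution
s2 = round(stdev(research), 1)

f_test = s1 ** 2 / s2 ** 2

# Table values quoted in Step 3
f_upper = 7.3879      # f_{0.025,4,5}
f_lower = 1 / 9.3645  # f_{0.975,4,5} = 1 / f_{0.025,5,4}

reject = f_test < f_lower or f_test > f_upper  # two-tailed test
print(n1, n2, s1, s2, round(f_test, 4), round(f_lower, 4), reject)
```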

3.9 P-VALUES IN HYPOTHESIS TESTING
The P-value (probability value) is the smallest level of significance that would lead to rejection of the null hypothesis with the given data.

• Finding the P-value

Using a statistical table:
Step 1: Find the area under the standard normal distribution curve corresponding to the $z_{test}$ value.
Step 2: Subtract the area from 0.5 to get the P-value for a right-tailed or left-tailed test. To get the P-value for a two-tailed test, double the area after subtracting.

Using a calculator (Casio fx-570 MS):
Step 1: Find the area under the standard normal distribution curve corresponding to the $z_{test}$ value.
Step 2: The area obtained is the P-value for a right-tailed or left-tailed test. To get the P-value for a two-tailed test, double the area.
For example, P-value $= P(Z > 1.6449) = R(1.6449) = 0.05$.

Procedures of Hypothesis Testing using P-Value Approach
Step 1: Formulate the hypotheses and state the claim.
Two-tailed test: $H_0: \theta = \theta_0$, $H_1: \theta \neq \theta_0$
Right-tailed test: $H_0: \theta \le \theta_0$, $H_1: \theta > \theta_0$
Left-tailed test: $H_0: \theta \ge \theta_0$, $H_1: \theta < \theta_0$

Step 2: Choose the appropriate test statistic, and calculate the sample test statistic value.
Step 3: Find the P-value.
Step 4: Make a decision to reject or not to reject $H_0$.
If P-value $\le \alpha$: reject $H_0$.
If P-value $> \alpha$: do not reject $H_0$.
Step 5: Draw a conclusion to reject or to accept the claim or statement.

Example 10
Most water-treatment facilities monitor the quality of their drinking water on an hourly basis. One variable monitored is pH, which measures the degree of alkalinity or acidity in the water. A pH below 7.0 is acidic, above 7.0 is alkaline and 7.0 is neutral. One water-treatment plant has a target pH of 8.5 (most try to maintain a slightly alkaline level). The mean and standard deviation of 1 hour's test results based on 31 water samples at this plant are 8.42 and 0.16 respectively. Does this sample provide sufficient evidence that the mean pH level in the water differs from 8.5? Use a 0.05 level of significance. Assume that the population is approximately normally distributed. [Example 3]

Solve this problem using P-value approach.
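One way to carry out the P-value approach for this example is sketched below, using the standard normal distribution from Python's standard library. The test statistic is the same $z_{test}$ as in Example 3; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, s, n, alpha = 8.42, 8.5, 0.16, 31, 0.05

z_test = (xbar - mu0) / (s / sqrt(n))

# Two-tailed test: double the tail area beyond |z_test|
tail_area = 1 - NormalDist().cdf(abs(z_test))
p_value = 2 * tail_area

reject = p_value <= alpha
print(round(z_test, 4), p_value, reject)
```

Because the P-value is below $\alpha = 0.05$, the decision agrees with the traditional method in Example 3: reject $H_0$.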


P-value Using Excel – Test For Mean
Step 1: Click Menu Data → Data Analysis → Descriptive Statistics → click OK.

Step 2: The commands for the t-test are
(i) t-test = (Mean − $\mu_0$)/Standard Error
(ii) P-value for a two-tailed test = T.DIST.2T(ABS(t-test), degrees of freedom)
P-value for a right-tailed test = T.DIST.RT(t-test, degrees of freedom)
P-value for a left-tailed test = T.DIST(t-test, degrees of freedom, 1)
Note: For the one-tailed tests, use the signed value of t-test. Standard Error is the standard deviation divided by the square root of the number of data, which can be written as s.e. $= \dfrac{s}{\sqrt{n}}$.

Example 11
A petroleum company is considering buying an additive for improving the distilled product. The company estimates the cost of the additive at RM1 million for 5 tonnes. Ten consultant companies submitted their tenders with the following estimates (in million RM):

0.97 0.95 1.10 1.30 1.10 0.96 0.97 1.20 1.50 1.70

Do you think the petroleum company overestimates the cost of the additive? Give your reason. Use the P-value method.

Example 11: solution
Step 1: Formulate the hypothesis.
$H_0: \mu_C \ge 1$
$H_1: \mu_C < 1$ (claim: the company overestimates the cost)

Step 2: Key in the data, select Data → Data Analysis → Descriptive Statistics → click OK.

Output from Excel (Column1):
Mean | 1.175
Standard Error | 0.080942366
Median | 1.1
Mode | 0.97
Standard Deviation | 0.255962237
Sample Variance | 0.065516667
Kurtosis | 0.524938867
Skewness | 1.172718741
Range | 0.75
Minimum | 0.95
Maximum | 1.7
Sum | 11.75
Count | 10
Confidence Level(95.0%) | 0.183104353

t-test | 2.162032171
P-value | 0.970563811

The Mean and Standard Error values are used to calculate t-test. The t-test and P-value are calculated using Excel commands as follows:
t-test = (1.175-1)/0.080942366
Since the case is a t-test and a left-tailed test, P-value = T.DIST(2.162032171,9,1)

Step 3: P-value $= 0.9706$
Step 4: Since P-value $= 0.9706 > \alpha = 0.05$, we do not reject $H_0$.
Step 5: At $\alpha = 0.05$, there is not enough evidence to support the claim that the petroleum company overestimates the cost of the additive.
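The Excel calculation in Example 11 can be cross-checked without a spreadsheet. The sketch below recomputes the standard error and t-test from the ten tenders, and approximates the left-tail area of the t distribution by numerically integrating its density with Simpson's rule. The helper `t_cdf` is a hypothetical stand-in for Excel's T.DIST(x, df, 1), written here only because Python's standard library has no t distribution.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t by Simpson's rule on [0, t]."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3  # the density is symmetric about 0


tenders = [0.97, 0.95, 1.10, 1.30, 1.10, 0.96, 0.97, 1.20, 1.50, 1.70]
n = len(tenders)

std_error = stdev(tenders) / sqrt(n)
t_test = (mean(tenders) - 1) / std_error

p_value = t_cdf(t_test, n - 1)  # left-tailed test
print(round(std_error, 6), round(t_test, 4), round(p_value, 4))
```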

P-value Using Excel – Test For Difference Mean
Step 1: Test the difference in variability --> F.TEST(data set 1, data set 2)
Step 2: Click Menu Data--> Data Analysis--> Choose the appropriate test (i.e.: t-Test: Two-Sample Assuming Unequal Variances)--> click ok
Step 3: Variable 1 range--> select the data set 1
Variable 2 range--> select the data set 2
Hypothesized mean difference--> value of μ0
Alpha--> value of significance level, α
Step 4: P-value for a two-tailed test = P(T<=t) two-tail (depends on distribution used)
P-value for a one-tailed test = P(T<=t) one-tail when the sign of t Stat agrees with the direction of H1 (positive for a right-tailed test, negative for a left-tailed test); otherwise it is 1 - P(T<=t) one-tail (depends on distribution used)

Example 12
A company is considering installing a new machine to assemble its product. The company is considering two types of machine, Machine A and Machine B, but it will buy only one machine. The company will install Machine B if the mean time taken to assemble a unit of the product is less than that of Machine A. The table below shows the time taken (in minutes) to assemble one unit of the product on each type of machine.

Machine A: 23, 26, 19, 24, 27, 22, 20, 18
Machine B: 21, 24, 23, 25, 24, 28, 24, 23

At the 10% significance level, test the difference in variability between the two types of machines. Which machine should be installed by the company to assemble its product?

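Example 12: solution
Step 1: Formulate the hypothesis.
For the means: $H_0: \mu_A - \mu_B \le 0$, $H_1: \mu_A - \mu_B > 0$ (claim)
and for the variances: $H_0: \sigma_A^2 = \sigma_B^2$, $H_1: \sigma_A^2 \neq \sigma_B^2$

For the test of variances, P-value = 0.2239 > 0.1. Thus, we fail to reject $H_0$. There is no significant difference in the variability, so the equal-variances t-test is used.

Step 2: Key in the data in Excel and choose the t-Test: Two-Sample Assuming Equal Variances.

t-Test: Two-Sample Assuming Equal Variances
 | Machine A | Machine B
Mean | 22.375 | 24
Variance | 10.55357 | 4
Observations | 8 | 8
Pooled Variance | 7.276786
Hypothesized Mean Difference | 0
df | 14
t Stat | -1.2048
P(T<=t) one-tail | 0.124127
t Critical one-tail | 1.34503
P(T<=t) two-tail | 0.248254
t Critical two-tail | 1.76131

Step 3: The test is a one-tailed test and Excel reports P(T<=t) one-tail = 0.1241. Because t Stat is negative while $H_1$ is right-tailed, the P-value for this test is $1 - 0.1241 = 0.8759$.
Step 4: Since the P-value exceeds $\alpha = 0.1$ (this holds for either value, 0.1241 or 0.8759), we do not reject $H_0$.
Step 5: At the 10% significance level, there is not enough evidence that Machine B assembles the product faster, so machine A should be installed.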
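The pooled variance and t Stat in the Excel output of Example 12 can be verified directly from the data. A minimal sketch with Python's standard library; variable names are illustrative, and the critical value is the "t Critical one-tail" figure from the Excel output.

```python
from math import sqrt
from statistics import mean, variance

machine_a = [23, 26, 19, 24, 27, 22, 20, 18]
machine_b = [21, 24, 23, 25, 24, 28, 24, 23]

n_a, n_b = len(machine_a), len(machine_b)
var_a, var_b = variance(machine_a), variance(machine_b)

# Pooled variance for the equal-variances t-test
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
df = n_a + n_b - 2

t_stat = (mean(machine_a) - mean(machine_b) - 0) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_crit = 1.34503  # t Critical one-tail from the Excel output (alpha = 0.1, df = 14)
reject = t_stat > t_crit  # right-tailed test, H1: mu_A - mu_B > 0
print(round(pooled_var, 6), df, round(t_stat, 4), reject)
```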

P-value Using Excel – Test For Paired Data
Step 1: Click Menu Data--> Data Analysis--> Choose the appropriate test (i.e.: t-Test: Paired Two Sample for Means)--> click ok
Step 2: Variable 1 range--> select the data set 1
Variable 2 range--> select the data set 2
Hypothesized mean difference--> value of μ0
Alpha--> value of significance level, α
Step 3: P-value for a two-tailed test = P(T<=t) two-tail (depends on distribution used)
P-value for a one-tailed test = P(T<=t) one-tail when the sign of t Stat agrees with the direction of H1 (positive for a right-tailed test, negative for a left-tailed test); otherwise it is 1 - P(T<=t) one-tail (depends on distribution used)

Example 13: refer to the data of Example 5
$H_0: \mu_D \le 0$
$H_1: \mu_D > 0$ (wise to install the new gadget)

P-value = 0.0150 < 0.05. Thus, reject $H_0$. At the 5% significance level, it is wise to install the new gadget.

3.10 RELATIONSHIP BETWEEN HYPOTHESIS TEST & CONFIDENCE INTERVAL
There is a relationship between the confidence interval and the hypothesis test about the parameter, $\theta$. Let $(a, b)$ be a $(1 - \alpha)100\%$ confidence interval for $\theta$. The test of size $\alpha$ of the hypothesis

$H_0: \theta = \theta_0$
$H_1: \theta \neq \theta_0$

will lead to rejection of $H_0$ if and only if $\theta_0$ is not in the $(1 - \alpha)100\%$ confidence interval $(a, b)$.

Notes: This relationship should be checked for two-tailed tests only.

Example 14
Considering Example 3 again, the 95% confidence interval for $\mu$ is

$8.42 \pm z_{0.025} \left( \dfrac{0.16}{\sqrt{31}} \right) = 8.42 \pm 1.9600 (0.0287) = 8.42 \pm 0.0563 = (8.3637, 8.4763)$

Since $\mu = 8.5$ is not included in this interval, $H_0$ is rejected. So the decision and conclusion are the same as in Example 3 and Example 10.
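The link between the confidence interval and the two-tailed test can be checked for the pH data. A minimal sketch with Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n = 8.42, 0.16, 31
mu0, alpha = 8.5, 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.025}
margin = z * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

# Reject H0: mu = mu0 exactly when mu0 lies outside the interval
reject = not (lower <= mu0 <= upper)
print(round(lower, 4), round(upper, 4), reject)
```

The hypothesised value 8.5 lies above the upper limit, which matches the rejection of $H_0$ obtained with the test statistic.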

