Assignment:
1. Suppose $Y_{1i} \overset{iid}{\sim} N(\mu_1, \sigma_1^2)$ and $Y_{2j} \overset{iid}{\sim} N(\mu_2, \sigma_2^2)$, $i = 1, \ldots, n_1$, $j = 1, \ldots, n_2$, where the $Y_{1i}$'s
and $Y_{2j}$'s are all independent. We would like to conduct an F-test to see if the variances
are equal (i.e., $\sigma_1^2 = \sigma_2^2$). We would also like to construct a 100(1 − α)% confidence
interval for $\sigma_1^2/\sigma_2^2$. Recall the test and confidence interval from your introductory
statistics course (e.g., STA 3032), which were also reviewed in this course. Define your
notation clearly and answer the following questions.
(a) State the null and alternative hypotheses. Note that there are three possible alternatives.
(b) Write the test statistic and its null distribution.
(c) For each alternative, write the p-value and explain how to draw a conclusion based on it at the significance level α.
(d) Construct a 100(1 − α)% confidence interval for $\sigma_1^2/\sigma_2^2$. You do not need to consider a one-sided confidence bound.
(e) Write a function to conduct the tests and construct the interval above. No credit will be given if any built-in function related to the variance test is used. You may use the four Base R functions for a given distribution (e.g., d/p/q/rf()). Your function should perform as follows.
i. The function takes the arguments:
y1, y2, alt = "two-sided", lev = 0.95, where the equality indicates the default value.
• The arguments y1 and y2 are the two samples.
• The argument alt is the alternative hypothesis whose two other possible values are "less" and "greater".
• The argument lev is the confidence level 1 − α.
ii. The function returns an R list containing the test statistic, p-value, and confidence interval.
iii. Inside the function, two Shapiro-Wilk tests of normality are conducted separately for the two samples (note the normality assumption at the beginning of the problem). If one or both p-values are less than 0.05, a warning message is printed out explaining the situation. Regardless, the function performs part (ii).
(f) Use your function above to solve the following problem.
Problem: The following data represent the running times of films produced by two motion-picture companies.
Company Time (minutes)
I 139 131 147 108 122 129 140 144 86 104
II 169 239 120 124 99 113 96 125
Test the hypothesis that the variances for the running times of films produced by Company I and Company II are equal against the alternative that they are not equal. Draw a conclusion based on a p-value at the 0.01 significance level. Construct and interpret a 99% confidence interval for the ratio of the variances.
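The behavior specified in Problem 1(e) can be sketched roughly as follows. This is one possible reading of the specification, not a reference solution: the function name `var_ratio_test` and the list element names are hypothetical, the two-sided p-value is taken as twice the smaller tail probability, and `var()` is assumed to count as a general-purpose function rather than a "function related to the variance test".

```r
# Sketch only: `var_ratio_test` and the returned element names are hypothetical.
var_ratio_test <- function(y1, y2, alt = "two-sided", lev = 0.95) {
  # Part iii: Shapiro-Wilk check on each sample; warn but keep going.
  sw <- c(shapiro.test(y1)$p.value, shapiro.test(y2)$p.value)
  if (any(sw < 0.05)) {
    warning("Shapiro-Wilk p-value below 0.05 for sample(s) ",
            paste(which(sw < 0.05), collapse = " and "),
            "; the normality assumption may not hold.")
  }

  df1 <- length(y1) - 1
  df2 <- length(y2) - 1
  # Assumption: var() is allowed; only variance-test helpers are forbidden.
  f <- var(y1) / var(y2)  # F = S1^2 / S2^2 ~ F(df1, df2) under H0

  lower <- pf(f, df1, df2)                      # P(F <= f) under H0
  upper <- pf(f, df1, df2, lower.tail = FALSE)  # P(F >= f) under H0
  p <- switch(alt,
              "less"      = lower,
              "greater"   = upper,
              "two-sided" = 2 * min(lower, upper),
              stop("alt must be \"two-sided\", \"less\", or \"greater\""))

  # Two-sided interval for sigma1^2 / sigma2^2 at confidence level `lev`.
  a  <- 1 - lev
  ci <- c(f / qf(1 - a / 2, df1, df2), f / qf(a / 2, df1, df2))

  list(statistic = f, p.value = p, conf.int = ci)
}
```

Under these assumptions, part (f) would amount to a call such as `var_ratio_test(c(139, 131, 147, 108, 122, 129, 140, 144, 86, 104), c(169, 239, 120, 124, 99, 113, 96, 125), lev = 0.99)`, with the p-value then compared against 0.01.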
2. Suppose $Y_{1i} \overset{iid}{\sim} N(\mu_1, \sigma_1^2)$ and $Y_{2j} \overset{iid}{\sim} N(\mu_2, \sigma_2^2)$, $i = 1, \ldots, n_1$, $j = 1, \ldots, n_2$, where the $Y_{1i}$'s
and $Y_{2j}$'s are all independent. In Homework 6, we wrote a function to test $H_0: \mu_1 = \mu_2$
and construct a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ assuming the variances $\sigma_1^2$ and $\sigma_2^2$ were unknown but equal. In this problem, we modify the function to incorporate the case where the variances are unknown and unequal. Define your notation clearly
and answer the following questions. For parts (a) and (b), assume $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal.
(a) Write the test statistic and its null distribution. Specify the null distribution degrees of freedom using the Satterthwaite approximation discussed in class.
(b) Construct a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$. You do not need to consider a one-sided confidence bound.
(c) Modify the two-sample t-test function you wrote in Homework 6 so that the function works whether or not the unknown variances are equal. No credit will be given if any built-in function related to the t-test is used. You may use the four Base R functions for a given distribution (e.g., d/p/q/rt()). Your function should perform as follows.
i. The function takes the arguments:
y1, y2, alt = "two-sided", lev = 0.95, where the equality indicates the default value.
• The arguments y1 and y2 are the two samples.
• The argument alt is the alternative hypothesis whose two other possible values are "less" and "greater".
• The argument lev is the confidence level 1 − α.
ii. The function first conducts a variance test using the function you wrote in Problem 1. If the variance test p-value is greater than 0.05, the function
assumes $\sigma_1^2 = \sigma_2^2$; otherwise, it assumes $\sigma_1^2 \neq \sigma_2^2$. Based on that assumption,
the function tests $H_0: \mu_1 = \mu_2$ and constructs a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$.
iii. The function returns an R list containing the test name, test statistic, p-value, and confidence interval, where the test name takes one of the following two values
A. "two-sample t-test with unknown but equal variances";
B. "two-sample t-test with unknown and unequal variances".
iv. The function does not return the variance test result.
(d) Use your new t-test function above to repeat part (g) of Problem 6 in Homework 6. Are the results the same as before? Why or why not?
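For orientation on Problem 2(a) and (b), a common textbook form of the unequal-variance statistic, the Satterthwaite degrees of freedom, and the two-sided interval is sketched below. The symbols $\bar{Y}_k$ and $S_k^2$ (sample mean and sample variance of sample $k$), $\nu$, and $t_{\nu,\alpha/2}$ (the upper $\alpha/2$ quantile of the $t_\nu$ distribution) are notation assumed here; the version discussed in class may be written differently.

```latex
\[
T = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}
\;\overset{\text{approx.}}{\sim}\; t_{\nu}
\quad \text{under } H_0: \mu_1 = \mu_2,
\qquad
\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}
           {\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}},
\]
\[
\bar{Y}_1 - \bar{Y}_2 \;\pm\; t_{\nu,\,\alpha/2}\,\sqrt{S_1^2/n_1 + S_2^2/n_2}.
\]
```

Note that $\nu$ is generally not an integer; the Base R functions `pt()` and `qt()` accept non-integer degrees of freedom.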
3. Recall the chi-squared test of independence for two categorical variables from your introductory statistics course (e.g., STA 3032), which was also reviewed in this course. Let X and Y be the two categorical variables, and let $O = [o_{ij}]$, $i = 1, \ldots, r$, $j = 1, \ldots, c$, be the r × c contingency table that cross-tabulates the $n_{\cdot\cdot} = \sum_{i=1}^{r} \sum_{j=1}^{c} o_{ij}$
observations by X and Y. That is, X and Y have r and c levels, respectively, and $o_{ij}$ is
the observed count for the (i, j)th cell. Finally, let $n_{i\cdot} = \sum_{j=1}^{c} o_{ij}$ and $n_{\cdot j} = \sum_{i=1}^{r} o_{ij}$ be
the ith row total and the jth column total, respectively. Define your notation clearly
and answer the following questions.
(a) State the null and alternative hypotheses.
(b) Explain how to calculate the expected count eij for the (i, j)th cell under the null.
(c) Write the test statistic and its null distribution.
(d) Explain why a large value of the test statistic provides evidence against the null.
(e) Write the p-value and explain how to draw a conclusion based on it at the significance level α.
(f) Write a function to conduct the chi-squared test of independence. No credit will be given if any built-in function related to the chi-squared test is used. Also, you are not allowed to use the outer() function or the %o% operator. You may use the four Base R functions for a given distribution (e.g., d/p/q/rchisq()). Your function should perform as follows.
i. The function takes the arguments:
dat, res.type = "pearson",
where the equality indicates the default value.
• The argument dat is an R matrix of the r × c contingency table.
• The argument res.type specifies the type of the residuals whose other possible value is "std". Here, the type "pearson" indicates the Pearson residual
$$r_{ij} = \frac{o_{ij} - e_{ij}}{\sqrt{e_{ij}}}$$
and the type "std" indicates the standardized residual
$$r_{ij}^{*} = \frac{o_{ij} - e_{ij}}{\sqrt{e_{ij}\,(1 - n_{i\cdot}/n_{\cdot\cdot})(1 - n_{\cdot j}/n_{\cdot\cdot})}}.$$
ii. The function does not use continuity correction.
iii. The function returns an R list containing the test statistic, p-value, expected counts, residual type, and the residuals of that type, where the expected counts and the residuals are stored in two r × c matrices.
iv. Recall that the chi-squared approximation requires each expected count to be at least 5. If any expected count is less than 5, the function prints out a warning message informing the user of the potential inaccuracy of the approximation. Regardless, the function performs part (iii).
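The specification in Problem 3(f) can be sketched roughly as below. This is an illustrative reading, not a reference solution: the function name `chisq_indep` and the list element names are hypothetical, and the expected counts are built with an ordinary matrix product (`%*%`) as one way of avoiding `outer()` and `%o%`.

```r
# Sketch only: `chisq_indep` and the returned element names are hypothetical.
chisq_indep <- function(dat, res.type = "pearson") {
  if (!res.type %in% c("pearson", "std")) {
    stop("res.type must be \"pearson\" or \"std\"")
  }

  n  <- sum(dat)                                # grand total n..
  pr <- matrix(rowSums(dat) / n, ncol = 1)      # r x 1: n_i. / n..
  pc <- matrix(colSums(dat) / n, nrow = 1)      # 1 x c: n_.j / n..

  # Expected counts e_ij = n_i. * n_.j / n.. via a matrix product.
  e <- n * (pr %*% pc)
  dimnames(e) <- dimnames(dat)

  # Part iv: warn about small expected counts, then continue.
  if (any(e < 5)) {
    warning("Some expected counts are below 5; ",
            "the chi-squared approximation may be inaccurate.")
  }

  x2 <- sum((dat - e)^2 / e)                    # no continuity correction
  df <- (nrow(dat) - 1) * (ncol(dat) - 1)
  p  <- pchisq(x2, df, lower.tail = FALSE)      # large values count against H0

  res <- (dat - e) / sqrt(e)                    # Pearson residuals
  if (res.type == "std") {
    # Standardized: divide further by sqrt((1 - n_i./n..)(1 - n_.j/n..)).
    res <- res / sqrt((1 - pr) %*% (1 - pc))
  }

  list(statistic = x2, p.value = p, expected = e,
       res.type = res.type, residuals = res)
}
```

A call such as `chisq_indep(matrix(c(20, 30, 25, 15, 35, 25), nrow = 2), res.type = "std")` would return the statistic, p-value, and two 2 × 3 matrices of expected counts and standardized residuals.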