Assignment
1. In Section 2.1, we discussed properties of the LSE $\hat\beta$ and the residual $e = (I_n - X(X^T X)^{-1} X^T) y$.
Based on the notation defined in Section 2.1, please show that
(a) the estimator $\hat\beta$ is an unbiased estimator of $\beta$ with $\mathrm{var}(\hat\beta \mid X) = \sigma^2 (X^T X)^{-1}$. Moreover,
$\hat\beta$ is the Best Linear Unbiased Estimator (BLUE).
(b) $E(e \mid X) = 0$, $\mathrm{var}(e \mid X) = \sigma^2 (I_n - H)$ with $H = X(X^T X)^{-1} X^T$, and $\mathrm{cov}(e, \hat\beta \mid X) = 0$.
(c) Furthermore, find an unbiased estimator of $\sigma^2$.
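As a numerical sanity check of the algebra behind (b), the following Python sketch builds the hat matrix for a small simulated design and verifies that it is idempotent and that the residuals are orthogonal to the columns of X. The design, the coefficient values, the seed, and the helper name `ols_fit` are illustrative assumptions and not part of Section 2.1. The check complements the requested proofs and does not replace them.

```python
import numpy as np

def ols_fit(X, y):
    """Return (beta_hat, residuals, hat_matrix) for the least-squares fit of y on X."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y          # (X^T X)^{-1} X^T y
    H = X @ XtX_inv @ X.T                 # hat matrix
    e = (np.eye(len(y)) - H) @ y          # residual e = (I_n - H) y
    return beta_hat, e, H

# Illustrative design (assumption): intercept plus two standard normal covariates.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])         # hypothetical true coefficients
y = X @ beta + rng.normal(size=n)

beta_hat, e, H = ols_fit(X, y)
print("H idempotent:       ", np.allclose(H @ H, H))
print("X^T e equals zero:  ", np.allclose(X.T @ e, 0.0))
print("trace(H) equals p:  ", np.isclose(np.trace(H), p))
```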
2. Suppose that $\{(Y_i, X_i) : i = 1, \ldots, n\}$ is a sequence of independently and identically distributed (i.i.d.) random variables with $Y_i, X_i \in \mathbb{R}$. Assume that $\mathrm{var}(X_i) = \sigma_X^2$. We consider
the simple linear regression model
$Y_i = \beta_0 + X_i \beta_x + \epsilon_i$, (1)
where $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma_\epsilon^2)$ and $\epsilon_i$ is independent of $X_i$.
(a) In some applications (e.g., when collecting data), we cannot measure $X_i$ precisely and can only observe $X_i^*$. This is called mismeasurement. As a result, in this situation we usually work with model (1) with $X_i$ replaced by $X_i^*$.
Please find the estimators of $\beta_x$ based on $X_i$ and $X_i^*$, and denote them by $\hat\beta_x$ and $\hat\beta_x^*$, respectively.
(b) The relationship between $X_i$ and $X_i^*$ is usually described by the following model:
$X_i^* = X_i + \delta_i$, (2)
(c) Find the variances of $\hat\beta_x$ and $\hat\beta_x^*$, i.e., $\mathrm{var}(\hat\beta_x)$ and $\mathrm{var}(\hat\beta_x^*)$. Moreover, compare these two variances.
(d) (Do a simple simulation study.) Consider the sample size $n = 1000$. Let $\beta_0 = \beta_x = 1$ and let the error variance be $\sigma_\epsilon^2 = 1$. Let $X_i$ be generated from $N(4, 1)$. Then the response $Y_i$ can be generated by model (1). In addition, consider the measurement-error variance $\sigma_\delta^2 = \mathrm{var}(\delta_i) = 0.15$, $0.55$, and $0.75$, and then generate $X_i^*$ by
(2). Suppose that we run 1000 repetitions. Based on your "artificial" data, calculate numerical results for $\hat\beta_x$, $\hat\beta_x^*$, $\mathrm{var}(\hat\beta_x)$, and $\mathrm{var}(\hat\beta_x^*)$. Summarize your numerical results in the following table and compare with (a), (b), and (c).
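One possible way to organize the simulation in (d) is sketched below in Python. It rests on assumptions that the text above does not state explicitly: the measurement error $\delta_i$ is taken to be $N(0, \sigma_\delta^2)$ and independent of $X_i$ and $\epsilon_i$, the error variance is taken to be 1, and the function names `slope` and `simulate` and the seed are hypothetical. The reported variances are Monte Carlo variances of the slope estimates across repetitions.

```python
import numpy as np

def slope(x, y):
    """Least-squares slope of y on x in a model with an intercept."""
    xc = x - x.mean()
    return np.dot(xc, y - y.mean()) / np.dot(xc, xc)

def simulate(sigma2_delta, n=1000, reps=1000, beta0=1.0, beta_x=1.0,
             sigma2_eps=1.0, seed=0):
    """Return (mean slope using X, mean slope using X*, and their Monte Carlo variances)."""
    rng = np.random.default_rng(seed)
    b_true = np.empty(reps)
    b_star = np.empty(reps)
    for r in range(reps):
        x = rng.normal(4.0, 1.0, size=n)                      # X_i ~ N(4, 1)
        eps = rng.normal(0.0, np.sqrt(sigma2_eps), size=n)
        y = beta0 + beta_x * x + eps                          # model (1)
        # Assumption: delta_i ~ N(0, sigma2_delta), independent of X_i and eps_i.
        x_star = x + rng.normal(0.0, np.sqrt(sigma2_delta), size=n)  # model (2)
        b_true[r] = slope(x, y)
        b_star[r] = slope(x_star, y)
    return b_true.mean(), b_star.mean(), b_true.var(ddof=1), b_star.var(ddof=1)

results = {s2: simulate(s2) for s2 in (0.15, 0.55, 0.75)}
print("sigma2_delta  mean(b)  mean(b*)  var(b)    var(b*)")
for s2, (m, m_star, v, v_star) in results.items():
    print(f"{s2:12.2f}  {m:7.4f}  {m_star:8.4f}  {v:.6f}  {v_star:.6f}")
```

Each printed row corresponds to one value of the measurement-error variance and can be copied into the summary table.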
3. Suppose $f(y)$ is a probability density function (pdf). Let
$R(f^{(r)}) = \int_{-\infty}^{\infty} \{f^{(r)}(y)\}^2 \, dy$,
where $f^{(r)}(y)$ is the $r$th derivative of $f(y)$.
(a) When $f$ is the pdf of $N(\mu, \sigma^2)$, please find $R(f^{(2)})$ so that we can obtain the bandwidth based on the normal scale rule.
(b) Show that, under some conditions,
$R(f^{(r)}) = (-1)^r \int_{-\infty}^{\infty} f^{(2r)}(y) f(y) \, dy$.
What conditions do we need here? Does the standard normal distribution satisfy these conditions?
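Before attempting the proof in (b), it can help to confirm the identity numerically in one case. The Python sketch below does this for the standard normal density with $r = 2$, using the closed-form derivatives $\phi''(y) = (y^2 - 1)\phi(y)$ and $\phi^{(4)}(y) = (y^4 - 6y^2 + 3)\phi(y)$ and a trapezoidal rule on a truncated grid. The grid limits, the grid size, and the helper names are assumptions made for illustration, and agreement of the two numbers is a check, not a proof.

```python
import numpy as np

def phi(y):
    """Standard normal density."""
    return np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)

def phi_d2(y):
    """Second derivative of the standard normal density."""
    return (y**2 - 1.0) * phi(y)

def phi_d4(y):
    """Fourth derivative of the standard normal density."""
    return (y**4 - 6.0 * y**2 + 3.0) * phi(y)

def integrate(values, grid):
    """Trapezoidal rule for values sampled on grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

# Truncating to [-10, 10] is an assumption; the integrands are negligible beyond it.
y = np.linspace(-10.0, 10.0, 200_001)
r = 2
lhs = integrate(phi_d2(y) ** 2, y)                   # R(f^(2)) by definition
rhs = (-1) ** r * integrate(phi_d4(y) * phi(y), y)   # right-hand side of the identity in (b)
print(lhs, rhs)
```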
4. Consider the wool prices data set (wool.txt), which reports wool prices at weekly markets. The response of interest is the log price difference between the price of a particular wool, 19 µm (cents per kilogram clean), and the floor wool price (cents per kilogram clean) at markets:
$y_t = \log(\text{19 µm price} / \text{floor price})$, and the covariate $x_t$ is the time in weeks since January 1, 1976.
(a) Fit the data with a simple linear regression model and a polynomial model of order 10.
Give a scatterplot of the data and add the two fitted curves, one for the simple linear model and one for the polynomial model. Include a clear and proper legend.
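A minimal Python sketch of the two fits in (a) follows. Because the layout of wool.txt is not described here, the sketch runs on a synthetic stand-in series, and the names `fit_polynomials` and `plot_fits` are hypothetical. It uses `numpy.polynomial.Polynomial.fit`, which rescales the covariate internally, a choice that keeps an order-10 fit numerically stable when $x_t$ is measured in weeks.

```python
import numpy as np
from numpy.polynomial import Polynomial

def fit_polynomials(x, y, degrees=(1, 10)):
    """Least-squares polynomial fits; returns a dict {degree: Polynomial}."""
    return {d: Polynomial.fit(x, y, d) for d in degrees}

def plot_fits(x, y, fits):
    """Scatterplot of the data with one labelled curve per fitted polynomial."""
    import matplotlib.pyplot as plt  # imported here so the fits run without matplotlib
    order = np.argsort(x)
    fig, ax = plt.subplots()
    ax.scatter(x, y, s=8, color="gray", label="observations")
    for d, p in fits.items():
        ax.plot(x[order], p(x[order]), label=f"polynomial fit, order {d}")
    ax.set_xlabel("weeks since January 1, 1976")
    ax.set_ylabel("log(19 µm price / floor price)")
    ax.legend()
    return fig

# Synthetic stand-in for the wool data (assumption); replace with the real x_t and y_t.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 300.0, size=200))
y = 0.3 + 0.2 * np.sin(x / 40.0) + rng.normal(0.0, 0.05, size=200)

fits = fit_polynomials(x, y)
rss = {d: float(np.sum((y - p(x)) ** 2)) for d, p in fits.items()}
print(rss)  # residual sum of squares for each order
```

Calling `plot_fits(x, y, fits)` and saving the returned figure produces the scatterplot with both curves and a legend.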
(b) Fit the data with the local constant kernel estimator and the local linear kernel estimator. Choose the bandwidths of these two estimators by the cross-validation (CV) method. Give a scatterplot of the data and add the two fitted curves. Include a clear and proper legend.
(c) Fit the data with the local linear kernel estimator. Choose the bandwidth by the CV method and by the direct plug-in method. Give a scatterplot of the data and add the two fitted curves. Include a clear and proper legend.
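The local constant and local linear estimators with leave-one-out CV can be sketched in Python as follows. This is one possible implementation under stated assumptions: a Gaussian kernel, a hand-picked bandwidth grid, and a synthetic stand-in for wool.txt, with `smoother_matrix`, `cv_score`, and `cv_bandwidth` as hypothetical helper names. The direct plug-in bandwidth asked for in (c) is not covered by this sketch.

```python
import numpy as np

def smoother_matrix(x_eval, x, h, degree, leave_one_out=False):
    """Rows hold the weights that map y to the fitted value at each point of x_eval.

    degree=0 gives the local constant (Nadaraya-Watson) estimator and
    degree=1 the local linear estimator, both with a Gaussian kernel.
    """
    D = x[None, :] - x_eval[:, None]            # D[i, j] = x_j - x_eval_i
    W = np.exp(-0.5 * (D / h) ** 2)
    if leave_one_out:                           # only meaningful when x_eval is x itself
        np.fill_diagonal(W, 0.0)
    if degree == 1:
        s1 = (W * D).sum(axis=1, keepdims=True)
        s2 = (W * D**2).sum(axis=1, keepdims=True)
        W = W * (s2 - D * s1)                   # local linear equivalent-kernel weights
    return W / W.sum(axis=1, keepdims=True)

def cv_score(x, y, h, degree):
    """Leave-one-out cross-validation mean squared prediction error."""
    S = smoother_matrix(x, x, h, degree, leave_one_out=True)
    return float(np.mean((y - S @ y) ** 2))

def cv_bandwidth(x, y, grid, degree):
    """Bandwidth on the grid that minimizes the leave-one-out CV score."""
    scores = [cv_score(x, y, h, degree) for h in grid]
    return float(grid[int(np.argmin(scores))])

# Synthetic stand-in for the wool data (assumption); replace with the real x_t and y_t.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 300.0, size=200))
m_true = 0.3 + 0.2 * np.sin(x / 40.0)
y = m_true + rng.normal(0.0, 0.05, size=200)

grid = np.linspace(3.0, 60.0, 20)               # candidate bandwidths (assumption)
h_lc = cv_bandwidth(x, y, grid, degree=0)
h_ll = cv_bandwidth(x, y, grid, degree=1)
fit_lc = smoother_matrix(x, x, h_lc, 0) @ y
fit_ll = smoother_matrix(x, x, h_ll, 1) @ y
print("CV bandwidths:", h_lc, h_ll)
```

The vectors `fit_lc` and `fit_ll` can then be drawn over a scatterplot of the data, each with its own legend entry.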