R Programming

Given X1, ..., Xn iid from some (univariate, for now) distribution, in this exercise we consider a seemingly trivial goal

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

General Instructions

I expect you to upload your solutions on Moodle as a single running R Markdown file (.rmd) + its .html output, named with your surnames.

R Markdown Test

To be sure that everything is working fine, start RStudio and create an empty project called HW3. Now open a new R Markdown file (File > New File > R Markdown...); set the output to HTML mode, press OK, and then click on Knit HTML. This should produce a web page with the knitting procedure executing the default code blocks. You can now start editing this file to produce your homework submission.

Please Notice

• For more info on R Markdown, check the support webpage that explains the main steps and ingredients: R Markdown from RStudio.

• For more info on how to write math formulas in LaTeX: Wikibooks.

• Remember our policy on collaboration: Collaboration on homework assignments with fellow students is encouraged. However, such collaboration should be clearly acknowledged, by listing the names of the students with whom you have had discussions concerning your solution. You may not, however, share written work or code after discussing a problem with others. The solutions should be written by you.

Exercise: Estimating a population mean. . . in 2020. . .

1. Introduction

Given {X1, . . . , Xn} iid from some (univariate for now) distribution, in this exercise we consider a seemingly trivial goal:

estimate the population mean µ = E(X)

An obvious choice would be the plug-in estimator, the empirical mean that you all know and love: $\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$.

This estimator is computationally attractive, requires no prior knowledge, and automatically scales with the population variance $\sigma^2$. In addition, tweaking the Central Limit Theorem a bit, we also know that, with probability at least $1 - \alpha$, the error $|\hat{\mu}_n - \mu|$ of the empirical mean $\hat{\mu}_n$ is of order $\sigma \sqrt{\log(1/\alpha)/n}$, a result that also holds non-asymptotically under some suitable technical conditions. If these conditions are not met, we still have Chebyshev’s inequality, which says that, with probability at least $1 - \alpha$, $|\hat{\mu}_n - \mu| \le \sigma / \sqrt{n\alpha}$: an exponentially weaker bound (in its dependence on $\alpha$) that will especially hurt in modern applications where many means have to be estimated simultaneously (e.g. empirical risk minimization methods).
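To get a feel for the gap between the two regimes, the R sketch below tabulates a sub-Gaussian-type half-width, $\sigma \sqrt{2 \log(2/\alpha)/n}$, next to the Chebyshev half-width, $\sigma / \sqrt{n\alpha}$, for a few values of $\alpha$. The constants in the first expression are an assumption (one common form of the bound), and the function names are hypothetical.

```r
# Half-width of a sub-Gaussian-type interval for the mean.
# ASSUMPTION: the constant 2 * log(2 / alpha) is one common form of the bound.
subgauss_width <- function(n, alpha, sigma = 1) {
  sigma * sqrt(2 * log(2 / alpha) / n)
}

# Half-width from Chebyshev's inequality:
# P(|mean - mu| >= t) <= sigma^2 / (n * t^2), solved for t at level alpha.
chebyshev_width <- function(n, alpha, sigma = 1) {
  sigma / sqrt(n * alpha)
}

alphas <- c(0.1, 0.01, 0.001, 1e-6)
n <- 1000

widths <- data.frame(
  alpha     = alphas,
  subgauss  = subgauss_width(n, alphas),
  chebyshev = chebyshev_width(n, alphas)
)
# How many times wider the Chebyshev interval is at each level.
widths$ratio <- widths$chebyshev / widths$subgauss
print(widths)
```

The ratio column grows as $\alpha$ shrinks, which is the regime that matters when many means are estimated at once and each one needs a very small $\alpha$.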

One may in fact try to extend the median-of-means (MoM) estimator to the multivariate case. There is only one problem: what is a median in the multivariate case? Given n points x1, . . . , xn in $\mathbb{R}^d$, the center of the smallest ball that contains at least half of the points may be considered as a notion of a multivariate median. Computing such a median is a highly nontrivial problem! The multivariate MoM estimator may be defined as the geometric median of the sample means of the k blocks defined before. As in the univariate case, the theoretically optimal number of blocks is $k^\star = \lceil 8 \log(1/\alpha) \rceil$. As a quick final remark, let me note that, for the purpose of experimenting a bit, we could replace this last summary with any robust multivariate location estimator.
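A minimal R sketch of both versions of the estimator may help fix ideas. It assumes random, nearly equal-sized blocks, rounds the block number 8 log(1/α) up to an integer, and approximates the geometric median with Weiszfeld's iteration; these choices are assumptions, not part of the assignment. All function names are hypothetical.

```r
# Number of blocks, rounded up to an integer and capped at the sample size n.
n_blocks <- function(alpha, n) {
  min(n, ceiling(8 * log(1 / alpha)))
}

# Means of k random blocks (sizes differ by at most 1); X is n x d, k <= n.
block_means <- function(X, k) {
  X <- as.matrix(X)
  n <- nrow(X)
  block_id <- sample(rep_len(seq_len(k), n))   # random block labels 1..k
  # rowsum() adds rows by block; dividing by block sizes gives block means.
  rowsum(X, block_id) / as.vector(table(block_id))
}

# Geometric median of the rows of M via Weiszfeld's iteration
# (ASSUMPTION: one standard approximation scheme among several).
geometric_median <- function(M, max_iter = 200, tol = 1e-8) {
  M <- as.matrix(M)
  m <- colMeans(M)                              # starting point
  for (iter in seq_len(max_iter)) {
    dists <- sqrt(rowSums(sweep(M, 2, m)^2))    # distance of each row to m
    w <- 1 / pmax(dists, 1e-12)                 # guard against division by 0
    m_new <- colSums(M * w) / sum(w)            # re-weighted average
    if (sqrt(sum((m_new - m)^2)) < tol) break
    m <- m_new
  }
  m_new
}

# Univariate MoM: ordinary median of the block means.
mom_univariate <- function(x, k) median(block_means(x, k))

# Multivariate MoM: geometric median of the block means.
mom_multivariate <- function(X, k) geometric_median(block_means(X, k))

set.seed(1)
alpha <- 0.05
x <- rt(500, df = 3)                        # heavy-tailed sample, true mean 0
k <- n_blocks(alpha, length(x))             # 24 blocks for alpha = 0.05
mom_univariate(x, k)

X <- matrix(rt(500 * 2, df = 3), ncol = 2)  # bivariate, true mean (0, 0)
mom_multivariate(X, k)
```

Swapping `geometric_median()` for another robust location estimator only requires changing the last line of `mom_multivariate()`.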

1. On robust procedures and heavy-tailed distributions in Data Science

As mentioned, MoM estimators should have an edge over the beloved sample average in multivariate, heavy-tailed cases. By now, heavy-tailed distributions have been accepted as realistic models for various phenomena:

• www-session characteristics (e.g. sizes and durations of sub-sessions; sizes of responses; inter-response time intervals)

• on/off-periods of packet traffic

• file sizes

• service times in queueing models

• flood levels of rivers

• major insurance claims

• extreme levels of ozone concentration

• high wind-speed values

• wave heights during a storm

• low and high temperatures

But there’s more. As you probably know, recent technological developments have allowed companies and state organizations to collect and store huge datasets. Big datasets have also challenged scientists in statistics and computer science to develop new methods. In fact, because of the very “unstructured” way in which these datasets are collected, oftentimes they tend to be corrupted by nasty outliers and/or exhibit heavy tails.
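One way to probe these claims is a small simulation. The R sketch below draws repeated heavy-tailed samples and records the absolute error of the sample mean and of a univariate median-of-means estimate. The Pareto distribution, the shape parameter, the sample size, and the number of blocks are all illustrative assumptions, and the helper `mom()` is hypothetical. Results depend on the seed and on these choices, so treat it as an experiment to run rather than a statement about the outcome.

```r
set.seed(42)

# Univariate median-of-means: median of the means of k random blocks.
mom <- function(x, k) {
  block_id <- sample(rep_len(seq_len(k), length(x)))
  median(tapply(x, block_id, mean))
}

n     <- 1000   # sample size (illustrative)
k     <- 24     # ceiling(8 * log(1 / 0.05))
n_rep <- 500    # number of simulated datasets

# Pareto(scale = 1, shape) draws by inverse transform: X = U^(-1 / shape).
# For shape > 1 the mean is shape / (shape - 1); the variance is finite
# only for shape > 2, so shape = 2.1 gives a rather heavy tail.
shape     <- 2.1
true_mean <- shape / (shape - 1)

errors <- replicate(n_rep, {
  x <- runif(n)^(-1 / shape)
  c(mean = abs(mean(x) - true_mean),
    mom  = abs(mom(x, k) - true_mean))
})

# Compare typical and extreme errors of the two estimators.
apply(errors, 1, quantile, probs = c(0.5, 0.9, 0.99))
```

Replacing a small fraction of each sample with gross outliers before estimating is a natural variation for studying the corrupted-data setting.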
