1. Produce a small data set (under 35 points) from the internet and briefly
(a) describe a (tentative) goal for the project it was (to be) used for;
>>
(b) describe an ontology to guide the organization/management/
understanding of the data; and
>>
(c) give two visualizations of the data you might use to understand it.
[The data can be downloaded from the internet, synthetically produced,
or collected by you, but you need to indicate the source. The term
‘visualization’ is technical, as used in class.]
>>
2. Visualize/plot specific examples of two probability distributions (one discrete, one continuous), showing the mean, variance, standard deviation and the four quartiles of each. Generate 100 data points in your computing environment according to these distributions, calculate the same statistics from the generated data, and compare/contrast them with the original (theoretical) values.
[You need to cut & paste the output verbatim, including any commands used.]
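As one possible starting point for the simulation step, here is a minimal sketch using only the Python standard library. The two distributions (Binomial(10, 0.3) and Exponential with rate 0.5), the seed, and the helper name `summarize` are illustrative assumptions, not choices prescribed by the problem; plotting is left out. Note that `statistics.quantiles(..., n=4)` returns the three cut points Q1, Q2, Q3.

```python
import math
import random
import statistics

random.seed(340)  # arbitrary seed, used only so the run is repeatable
N = 100           # number of generated data points, as the problem asks

# Discrete example (assumed): Binomial(n=10, p=0.3), built from Bernoulli trials.
n_trials, p = 10, 0.3
binom_sample = [sum(random.random() < p for _ in range(n_trials)) for _ in range(N)]
binom_theory = {"mean": n_trials * p, "variance": n_trials * p * (1 - p)}

# Continuous example (assumed): Exponential distribution with rate 0.5.
rate = 0.5
expo_sample = [random.expovariate(rate) for _ in range(N)]
expo_theory = {
    "mean": 1 / rate,
    "variance": 1 / rate ** 2,
    # Quantile function of the exponential: -ln(1 - q) / rate
    "quartiles": [-math.log(1 - q) / rate for q in (0.25, 0.5, 0.75)],
}

def summarize(sample):
    """Sample mean, variance, standard deviation and quartile cut points."""
    return {
        "mean": statistics.mean(sample),
        "variance": statistics.variance(sample),
        "std": statistics.stdev(sample),
        "quartiles": statistics.quantiles(sample, n=4),
    }

binom_stats = summarize(binom_sample)
expo_stats = summarize(expo_sample)

print("binomial    theory:", binom_theory)
print("binomial    sample:", binom_stats)
print("exponential theory:", expo_theory)
print("exponential sample:", expo_stats)
```

With only 100 points, the sample statistics should land near the theoretical values but not on them; the size of that gap is what the compare/contrast step is meant to discuss.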
>>
3. Choose one of the built-in data sets in your computing environment (R or Python) and take it as a sample. Give
(a) a characterization of the full population under consideration in the sample;
(b) a characterization of the data corpus/set;
(c) three visualizations of the data to help understand it.
[You need to actually provide one typical data point, its dimensionality, and the size of the data corpus (do not include the corpus itself), and give visualizations consistent with what has been discussed in class, with outputs properly labeled and organized. For example, you need to give appropriate margins of error for at least 2 estimates of population parameters and at least 2 scatter plots of the variables of most interest.]
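The margin-of-error requirement can be illustrated with a short sketch. It assumes a 95% confidence level and the normal approximation, margin = z * s / sqrt(n); the list `values` is made-up placeholder data standing in for one numeric column of whichever built-in data set is chosen.

```python
import math
import statistics

# Placeholder measurements (made up for illustration, not from a real data set).
values = [5.1, 4.9, 6.3, 5.8, 7.1, 6.5, 5.0, 5.7, 6.1, 6.9]

n = len(values)
sample_mean = statistics.mean(values)
sample_sd = statistics.stdev(values)        # sample standard deviation (divides by n - 1)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence

# Margin of error for the estimate of the population mean.
margin = z * sample_sd / math.sqrt(n)
print(f"mean = {sample_mean:.3f} +/- {margin:.3f} (95%, normal approximation)")
```

For a sample this small, a t-based multiplier would usually be preferred over z; the normal quantile is used here only because it is available in the standard library.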
>>
4. Select one random variable X for your data set in problem 3 and
(a) Give a description of a probability space for it;
(b) Give a full technical description of X as a RV;
(c) Compute its probability distribution based on the chosen data.
[You need to actually describe in full the sample space and the probability measure used to define X.]
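One way to make the three parts concrete is sketched below, under the assumption that the sample space is the set of row indices of the data set, that the probability measure is uniform on those rows, and that X maps a row to the value of one column. The list `column` holds made-up placeholder values, not data from a real built-in set.

```python
from collections import Counter
from fractions import Fraction

# Placeholder column (made up for illustration): one value per row of the data set.
column = [4, 6, 8, 4, 4, 6, 8, 8, 8, 4]

# Sample space: the row indices. Probability measure: uniform, P({row}) = 1/n.
omega = range(len(column))
p_row = Fraction(1, len(column))

def X(row):
    """The random variable: maps a row (an outcome) to its column value."""
    return column[row]

# Distribution of X: P(X = x) is the total measure of the rows mapped to x.
counts = Counter(X(row) for row in omega)
pmf = {x: c * p_row for x, c in sorted(counts.items())}

for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")
```

Using `Fraction` keeps the probabilities exact, so it is easy to check that they sum to 1.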
5. Design a randomized experiment consisting of a script and the corresponding
outcomes/outputs of a series of commands in your programming environment, as a test to provide evidence for the Central Limit Theorem OR the Weak Law of Large Numbers (your choice of only one).
[You can generate samples from a given distribution whose population mean and variance are fixed by you but treated as “unknown” when analyzing the samples. Are those values retrievable just from the samples? How exactly? What is the outcome of the test?]
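A minimal sketch of such an experiment for the Weak Law of Large Numbers follows. The population (exponential with mean 2), the tolerance `EPSILON`, the sample sizes, the number of trials, and the seed are all illustrative assumptions. The script estimates, for each sample size n, how often the sample mean misses the population mean by more than `EPSILON`; the WLLN predicts that this miss rate shrinks toward 0 as n grows.

```python
import random
import statistics

random.seed(2024)  # arbitrary seed, used only so the run is repeatable

TRUE_MEAN = 2.0    # population mean, treated as "unknown" when reading the samples
EPSILON = 0.25     # tolerance around the population mean
TRIALS = 200       # independent samples drawn per sample size

def sample_mean(n):
    """Mean of n independent draws from the (assumed) exponential population."""
    return statistics.fmean([random.expovariate(1 / TRUE_MEAN) for _ in range(n)])

miss_rate = {}
for n in (10, 100, 1000):
    # Count the trials whose sample mean falls outside the tolerance band.
    misses = sum(abs(sample_mean(n) - TRUE_MEAN) > EPSILON for _ in range(TRIALS))
    miss_rate[n] = misses / TRIALS
    print(f"n = {n:5d}: fraction with |mean - mu| > {EPSILON} is {miss_rate[n]:.3f}")
```

A decreasing miss rate across the three sample sizes is evidence consistent with the law, not a proof of it; repeating the run with other seeds or distributions would strengthen the case.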