OBJECTIVES
By now you will have observed that the selection of an appropriate statistical test depends on the objective of the analysis, the number of variables involved, and the type(s) of data.
In this module we will continue with bivariate statistics: we will explore the relationship between two quantitative or numerical variables, and determine the extent to which one variable (y) can be predicted from the other (x). Specifically, we will address the following:
Simple Linear Correlation (Pearson’s) [1]
Simple Linear Regression
Scatter Plots
[1] Another commonly used correlation analysis is Spearman’s (a non-parametric test), which is appropriate for ranked (or ordinal) data, and when a normal distribution cannot be assumed.
RECOMMENDED READING
Correlation, Regression, and Causation: http://www.bmj.com/cgi/content/full/315/7105/422
INTRODUCTION
When data are collected in a paired fashion, in other words, when we have a data set with two quantitative or numerical values (X and Y) for each subject, we can perform simple linear correlation analysis to determine if the two variables are significantly correlated, related or associated. This is a very useful test to determine the magnitude (STRENGTH) and DIRECTION of a relationship between two variables.
Simple linear correlation and regression methods can generally be applied to the same type of data, and complement each other in providing us with a more comprehensive understanding of the pattern(s) underlying our data. Both variables must be truly numerical [2].
CAUTION: Linear correlation tests for an underlying LINEAR RELATIONSHIP only. The absence of a linear relationship therefore does not mean the absence of any relationship: there could be an underlying non-linear (e.g. curvilinear) relationship, which requires non-linear statistical methods to detect and describe.
Correlation does not equal causation. Potential confounding factors must always be considered.
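The caution about non-linear relationships can be illustrated with a minimal Python sketch. The helper name pearson_r and the data are illustrative assumptions, not part of the course material. Here y is completely determined by x (y = x squared), yet the linear correlation is zero because the relationship is curvilinear.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends on x exactly, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curved = pearson_r(x, y)
print(f"r for y = x^2: {r_curved:.3f}")
```

A value of r near zero here says only that no straight line describes the data; a scatter plot would show the curved pattern immediately.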
[2] There are other types of correlation and regression that are not covered in this course.
In correlation the emphasis is on the degree to which a linear model can describe the relationship between two variables, while in regression, the emphasis is on predicting one variable from the other.
In regression the interest is directional, as in a regression of y (the dependent variable) on x (the independent variable). One variable (y) is predicted and the other (x) is the predictor.
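The directional nature of regression can be sketched in Python as follows. This assumes the line is fitted by ordinary least squares, the usual method for simple linear regression, which this section has not yet introduced. The function name fit_line and the data are hypothetical. Swapping the roles of the two variables gives a different line.

```python
def fit_line(x, y):
    """Least-squares line predicting y from x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical paired data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_yx, b_yx = fit_line(x, y)  # regression of y on x
a_xy, b_xy = fit_line(y, x)  # regression of x on y

print(f"y on x: y = {a_yx:.2f} + {b_yx:.2f} * x")
print(f"x on y: x = {a_xy:.2f} + {b_xy:.2f} * y")
```

For these data the regression of y on x has slope 0.6, while the regression of x on y has slope 1.0, so the choice of dependent variable matters.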
In correlation the focus is on the relationship. Correlation is said to be symmetric, that is, the variables correlate with each other in both directions. In other words, the correlation of X with Y is the same as the correlation of Y with X.
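The symmetry of correlation can be checked with a short Python sketch. The helper pearson_r and the data are illustrative assumptions. Swapping the two variables leaves the coefficient unchanged.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = pearson_r(x, y)  # correlation of X with Y
r_yx = pearson_r(y, x)  # correlation of Y with X

print(round(r_xy, 4), round(r_yx, 4))
```

Both calls return the same value (about 0.77 for these data), because the calculation treats the two variables identically.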
The correlation between two variables reflects the degree to which the variables are related. The most common measure of correlation is Pearson’s product-moment correlation (called Pearson's correlation for short).
For a population, the Pearson’s product moment correlation is represented by the Greek letter rho (ρ), and for a sample, it is designated by the letter "r" (called "Pearson's r"), which is the correlation coefficient.
Pearson’s correlation coefficient is a measure of the strength and direction of a linear relationship. It ranges from -1 to +1: the sign gives the direction (positive or negative), and the absolute value, between 0 and 1, gives the strength.
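For reference, the standard textbook definition of the sample coefficient can be written as below. The notation is an assumption, since the text has not yet introduced symbols for the means: x-bar and y-bar denote the sample means, and n is the number of paired observations.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when above-average x values tend to occur with above-average y values, and negative when they tend to occur with below-average y values, which is what gives r its sign.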
A correlation of +1 means that there is a perfect positive linear relationship between the variables. The scatter-plot below depicts such a relationship. It is a positive relationship (positive slope or gradient) because as one variable increases so does the other. Higher values on one variable are associated with higher values on the other.