For my project I examined the relationship between the number of caffeinated drinks a person had per day and the number of hours of sleep they got each day. My question was: how does a person's caffeine consumption affect their sleeping habits? My explanatory variable was the number of caffeinated drinks a person consumed per day, and my response variable was the number of hours of sleep. I collected my data online from a data website called statcrunch.com, from a study of thirty caffeine and non-caffeine drinkers.
My data was a little scattered because the values varied quite a bit from person to person. I found the stem-and-leaf plot the easiest to create because all my numbers were single digits. The histogram was also easy to make because my numbers were mostly within the same range. I did, however, have more trouble with the scatterplot and with finding the least-squares regression line because my data was quite scattered. The box plot and the histogram were the easiest to interpret and were the two graphs that worked best for this kind of data set.
The scatterplot was the hardest to create and does not work as well for this kind of data. The points are widely scattered, with an overall downward slope, and they do not cluster tightly around any single line. The scatterplot therefore shows a weak negative linear correlation. However, there are no obvious outliers, so no single point is distorting the overall pattern.
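To show how the direction and strength of a pattern like this can be checked with numbers, here is a minimal Python sketch of the correlation coefficient and the least-squares regression line. The function names (`pearson_r`, `least_squares_line`) and the six data points are made up for illustration only. They are not the StatCrunch data and do not reproduce my results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical example values (NOT the data from this project)
drinks = [0, 1, 2, 3, 4, 5]   # caffeinated drinks per day
sleep = [8, 7, 8, 6, 7, 5]    # hours of sleep per night

r = pearson_r(drinks, sleep)
slope, intercept = least_squares_line(drinks, sleep)
print("r =", round(r, 4))
print("r squared =", round(r ** 2, 4))
print("slope =", round(slope, 4), "intercept =", round(intercept, 4))
```

A negative r and a negative slope both mean that sleep tends to go down as the number of drinks goes up, and r squared gives the share of the variation in sleep that the line accounts for.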
The r-value for this data was -0.4146, which indicates a weak, negative association. The r-squared value was 0.1719, which means that about 17% of the variation in hours of sleep is accounted for by the least-squares regression line. The correlation between these two variables is not very strong, which suggests that the number of caffeinated drinks explains only a small part of how much a person sleeps and that the two variables may not directly affect each other. I think because the hours...