Chapter 2. Measures of location.
Measures of Dispersion
Frequency distributions and graphs enable us to visualize data so that its principal features can be more easily observed.
What do you observe about the employees’ salary?
What do you observe about the number of children?
Objective of this chapter: describe and summarize data using a set of numerical measures (such as the mean, median, mode, percentiles, variance, …).
2.1 Measures of central tendency
- Locate the “center” of the data (of the frequency distribution)
- This is a single value around which all the other values tend to concentrate or “locate”
- Three different measures of central tendency: the mean, the median, the mode.
2.1.1 The Mean
Suppose we have a sample (or a population) containing n elementary units.
X is a quantitative variable.
Let $x_i$ denote the value (the observation) of variable X for the ith elementary unit;
$x_1, x_2, \ldots, x_n$ denote the values of the n observations.
The arithmetic mean
- Usually just “the mean”, or “average” of data.
- Interpretation: the arithmetic mean answers the question, "if all the observations had the same value, what would that value have to be in order to achieve the same total?"
a) The mean of ungrouped data
Sum of all the values divided by the number of values:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
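Example. The selling prices of six houses are 50000, 50000, 65000, 67500, 67500 and 75000 Euro. The mean (average) selling price is
$$\bar{x} = \frac{50000 + 50000 + 65000 + 67500 + 67500 + 75000}{6} = \frac{375000}{6} = 62500 \text{ Euro}.$$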
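The house price example can be checked with a few lines of Python. This is only an illustrative sketch: the variable names (`prices`, `mean_by_formula`, `mean_by_library`) are chosen here and are not part of the course notes. It also checks the interpretation given earlier, namely that replacing every observation by the mean preserves the total.

```python
from statistics import mean

# Selling prices (Euro) of the six houses from the example
prices = [50000, 50000, 65000, 67500, 67500, 75000]

# Mean of ungrouped data: sum of all values divided by the number of values
mean_by_formula = sum(prices) / len(prices)

# The standard library function should agree with the formula
mean_by_library = mean(prices)

# Interpretation check: n copies of the mean give the same total
same_total = mean_by_formula * len(prices) == sum(prices)

print(mean_by_formula)  # 62500.0
print(same_total)       # True
```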
b) The mean of grouped data (the mean from a frequency distribution)
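b1) For a discrete variable:
$$\bar{x} = \frac{\sum_{i=1}^{k} x_i n_i}{\sum_{i=1}^{k} n_i}$$
where k is the number of classes, $x_i$ is the value of the ith class and $n_i$ denotes the frequency of the ith class.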
It is a weighted mean. Each value is multiplied by its weight (in this case the frequency) and summed. This sum is divided by the total of the weights.
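As a rough illustration of the weighted mean, the six house prices from the earlier example can be grouped into a frequency distribution (each distinct price with its frequency) and averaged with the frequencies as weights. The names `freq`, `weighted_sum` and `total_weight` are chosen for this sketch only, and grouping via `collections.Counter` is one possible way to build the table.

```python
from collections import Counter

# Selling prices (Euro) of the six houses from the earlier example
prices = [50000, 50000, 65000, 67500, 67500, 75000]

# Frequency distribution: distinct value -> frequency
# (50000 occurs twice, 65000 once, 67500 twice, 75000 once)
freq = Counter(prices)

# Each value is multiplied by its weight (the frequency) and summed
weighted_sum = sum(value * count for value, count in freq.items())

# The total of the weights equals the number of observations
total_weight = sum(freq.values())

grouped_mean = weighted_sum / total_weight

print(grouped_mean)  # 62500.0
```

The result matches the mean computed from the ungrouped data, as expected: grouping identical values only changes how the sum is written, not its value.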
b2) For a continuous variable: