Methods for Summarising Data
Epidemiological studies often generate a large volume of data. Summarising these data helps to draw out patterns and key results.
Methods used to summarise epidemiological data are addressed in more detail in the 1b Statistical Methods module of the website. This page provides a brief overview of the options available and weblinks to further coverage elsewhere on the site.
There are four principal methods used to summarise data:
- Simple frequencies - the number of times a value occurs in a data set. This makes it possible to determine whether the value we are interested in is unusual or common.
- Measures of location
  - Mean - the arithmetic average: the sum of the observations divided by the number of observations
  - Median - the middle value of the ranked observations, preferred when the data are skewed (asymmetric)
  - Mode - the value that occurs most frequently in the sample
- Measures of spread
  - Range - the difference between the largest and smallest observations
  - Interquartile range - the spread of the middle half of the data, between the 25th and 75th centiles; it is less affected by outlying observations than the range
  - Variance and standard deviation - describe how far observations lie from the mean; the standard deviation is the square root of the variance
- Graphical methods
  - Box plots - visual presentation of the quartiles of the data
  - Histograms - graphical presentation of the shape of a data distribution
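As a rough illustration of the methods listed above, the following sketch computes each summary using only the Python standard library. The data set (lengths of hospital stay in days) and all variable names are invented for this example and are not taken from any study. Note that several conventions exist for calculating quartiles; this sketch uses the default of `statistics.quantiles`, and other software may give slightly different values.

```python
from collections import Counter
import statistics

# Hypothetical sample (assumption): length of hospital stay in days
# for 11 patients, including one outlying observation (30 days).
stays = [2, 3, 3, 4, 4, 4, 5, 6, 7, 9, 30]

# Simple frequencies: how many times each value occurs.
frequencies = Counter(stays)          # e.g. frequencies[4] == 3

# Measures of location.
mean = statistics.mean(stays)         # 7, pulled upwards by the outlier
median = statistics.median(stays)     # 4, the middle ranked value
mode = statistics.mode(stays)         # 4, the most frequent value

# Measures of spread.
data_range = max(stays) - min(stays)  # 28
q1, q2, q3 = statistics.quantiles(stays, n=4)  # quartiles (default method)
iqr = q3 - q1                         # 4.0, unaffected by the outlier
variance = statistics.variance(stays) # sample variance (n - 1 denominator)
std_dev = statistics.stdev(stays)     # square root of the sample variance

# Graphical methods, reduced to the numbers they display.
# A box plot is drawn from the five-number summary.
five_number_summary = (min(stays), q1, median, q3, max(stays))

# A histogram shows counts per bin; here each bin is 5 days wide
# and is labelled by its lower bound.
bin_width = 5
histogram = Counter((x // bin_width) * bin_width for x in stays)

print("Frequencies:", dict(frequencies))
print("Mean:", mean, "Median:", median, "Mode:", mode)
print("Range:", data_range, "IQR:", iqr)
print("Variance:", variance, "Standard deviation:", round(std_dev, 2))
print("Five-number summary:", five_number_summary)
for lower in sorted(histogram):
    print(f"{lower:>2}-{lower + bin_width - 1:<2} days: {'#' * histogram[lower]}")
```

In this made-up sample the single 30-day stay raises the mean to 7 days, while the median stays at 4 days; likewise the range is 28 days but the interquartile range is only 4 days. This is why the median and interquartile range are often preferred for asymmetric data.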
© Helen Barratt 2009

