The UK Faculty of Public Health has recently taken ownership of the Health Knowledge resource. This new, advert-free website is still under development and there may be some issues accessing content. Additionally, the content has not been audited or verified by the Faculty of Public Health as part of an ongoing quality assurance process, and as such certain material may be out of date. If you have any concerns regarding content, you should seek to verify it independently.

Role of chance, bias and confounding in epidemiological studies


Learning objectives: You will learn how to understand and differentiate commonly used terminologies in epidemiology, such as chance, bias and confounding, and suggest measures to mitigate them.

The interpretation of findings from studies or surveys is open to debate because errors in measurement may influence the results. This section introduces you to various errors of measurement in epidemiological studies.

Read the resource text below.

Resource text

While the results of an epidemiological study may reflect the true effect of one or more exposures on the development of the outcome under investigation, it should always be considered that the findings may in fact be due to an alternative explanation [1].

Such alternative explanations may be due to the effects of chance (random error), bias or confounding, which may produce spurious results, leading us to conclude the existence of a valid statistical association when one does not exist, or alternatively the absence of an association when one is truly present.

Observational studies are particularly susceptible to the effects of chance, bias and confounding, and these need to be considered at both the design and analysis stage of an epidemiological study so that their effects can be minimized.

Confounding, interaction and effect modification

Confounding involves the possibility that an observed association is due, totally or in part, to the effects of differences between the study groups (other than the exposure under investigation) that could affect their risk of developing the outcome being studied.

Confounding occurs when the effects of two associated exposures have not been separated, resulting in the interpretation that the effect is due to one variable rather than the other. The consequence of confounding is that the estimated association is not the same as the true effect.

In order for a variable to be considered as a confounder:

1. The variable must be independently associated with the outcome (i.e. be a risk factor).
2. The variable must be associated with the exposure under study in the source population.
3. It should not lie on the causal pathway between exposure and disease.

Examples of confounding

A study found alcohol consumption to be associated with the risk of coronary heart disease (CHD). However, smoking may have confounded the association between alcohol and CHD: smoking is independently associated with CHD (it is a risk factor) and is also associated with alcohol consumption (smokers tend to drink more than non-smokers).

Controlling for the potential confounding effect of smoking may in fact show no association between alcohol consumption and CHD.
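The alcohol, smoking and CHD example can be made concrete with a small stratified analysis. The sketch below uses hypothetical counts, invented purely for illustration and not taken from any study, and compares the crude odds ratio with a Mantel-Haenszel odds ratio adjusted for smoking. The function and variable names are assumptions of this sketch.

```python
# Hypothetical counts, invented for illustration only.
# Each table is (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases),
# where "exposed" means alcohol drinker and "case" means CHD.
strata = {
    "smokers": (60, 240, 20, 80),
    "non_smokers": (5, 95, 15, 285),
}


def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Collapse the strata into one table, ignoring smoking.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))

crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude OR (smoking ignored): {crude_or:.2f}")
for name, table in strata.items():
    print(f"OR among {name}: {odds_ratio(*table):.2f}")
print(f"Mantel-Haenszel OR adjusted for smoking: {adjusted_or:.2f}")
```

In these invented data, most drinkers smoke and smokers have a higher risk of CHD, so the crude odds ratio is about 2 even though the odds ratio is 1 within each smoking stratum and after adjustment.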

Effects of confounding

Confounding factors, if not controlled for, cause bias in the estimate of the impact of the exposure being studied.

The effects of confounding can result in:

* An observed difference between study populations when no real difference exists.
* No observed difference between study populations when a true association does exist.
* An underestimate of an effect.
* An overestimate of an effect.

Residual confounding

Residual confounding occurs when a confounder has not been adequately adjusted for in the analysis, for example because the age groups used for adjustment are too broad.
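How broad age groups can leave residual confounding may be easier to see with numbers. In the hypothetical cohort below (counts invented for illustration), both exposure prevalence and disease risk rise with age, while within each ten-year band the exposure has no effect. The Mantel-Haenszel estimator is one possible adjustment method, chosen here as an assumption of the sketch.

```python
# Hypothetical counts, invented for illustration only.
# Each table is (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
ten_year_bands = [
    (2, 98, 8, 392),     # age 40-49: risk 2% in both exposure groups
    (10, 190, 15, 285),  # age 50-59: risk 5%
    (30, 270, 20, 180),  # age 60-69: risk 10%
    (80, 320, 20, 80),   # age 70-79: risk 20%
]


def merge(tables):
    """Add several 2x2 tables cell by cell."""
    return tuple(sum(cells) for cells in zip(*tables))


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Two broad bands: 40-59 and 60-79.
twenty_year_bands = [merge(ten_year_bands[:2]), merge(ten_year_bands[2:])]

or_crude = mantel_haenszel_or([merge(ten_year_bands)])  # one stratum = crude OR
or_broad = mantel_haenszel_or(twenty_year_bands)
or_fine = mantel_haenszel_or(ten_year_bands)

print(f"No adjustment for age:          OR = {or_crude:.2f}")
print(f"Adjusted, twenty-year age bands: OR = {or_broad:.2f}")
print(f"Adjusted, ten-year age bands:    OR = {or_fine:.2f}")
```

With these invented counts the crude odds ratio is about 2.07, adjustment with twenty-year bands reduces it only to about 1.22, and adjustment with ten-year bands gives 1.00. The remaining excess under the broad grouping is residual confounding by age.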


Bias

Bias may be defined as any systematic error in an epidemiological study that results in an incorrect estimate of the association between exposure and risk of disease.

  • Bias results from systematic errors in the research methodology.
  • Limited scope exists for the adjustment of most forms of bias at the analysis stage. As a result, the ways in which bias may be introduced must be carefully considered and controlled during the design and conduct of the study in order to limit their effects on the validity of the study results.
Common types of bias in epidemiological studies

    Information bias

    Information bias results from systematic differences in the way data on exposure or outcome are obtained from the various study groups [1]. Types of information bias include:

    Observer bias

    Observer bias occurs when there are systematic differences in the way information is collected for the groups being studied. Observer bias may occur as a result of the investigator's prior knowledge of the hypothesis under investigation or knowledge of an individual's exposure or disease status. Such knowledge may result in differences in the way information is collected, measured or interpreted by the investigator for each of the study groups.

    Minimizing observer bias

    • Where possible, blind observers to the exposure and disease status of the individual.
    • Blind observers to the hypothesis under investigation.
    • In a randomised controlled trial, blind investigators and participants to allocation to the treatment or control group (double-blind randomised controlled trial).
    • Develop a protocol for the collection, measurement and interpretation of information.
    • Use standardized questionnaires.
    • Train interviewers.

    Losses to follow-up

    Loss to follow-up is a particular problem associated with cohort studies. Bias may be introduced if the individuals lost to follow-up differ with respect to the exposure and outcome from those persons who remain in the study.
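A small numerical sketch may help show why it matters whether losses are related to both exposure and outcome. The cohort below is hypothetical (all counts and retention fractions are invented for illustration), and the names used are assumptions of this sketch.

```python
# Hypothetical closed cohort, counts invented for illustration only.
full_cohort = {
    ("exposed", "disease"): 200,
    ("exposed", "no disease"): 800,
    ("unexposed", "disease"): 100,
    ("unexposed", "no disease"): 900,
}


def risk_ratio(counts):
    """Risk in the exposed divided by risk in the unexposed."""
    risk = {}
    for group in ("exposed", "unexposed"):
        cases = counts[(group, "disease")]
        total = cases + counts[(group, "no disease")]
        risk[group] = cases / total
    return risk["exposed"] / risk["unexposed"]


def apply_follow_up(counts, retained):
    """Keep only the fraction of each cell that completes follow-up."""
    return {cell: n * retained[cell] for cell, n in counts.items()}


# Scenario 1: 10% of every cell is lost (loss unrelated to exposure or outcome).
uniform_loss = {cell: 0.9 for cell in full_cohort}

# Scenario 2: exposed participants who develop disease are lost more often (40%).
differential_loss = dict(uniform_loss)
differential_loss[("exposed", "disease")] = 0.6

true_rr = risk_ratio(full_cohort)
rr_uniform = risk_ratio(apply_follow_up(full_cohort, uniform_loss))
rr_differential = risk_ratio(apply_follow_up(full_cohort, differential_loss))

print(f"Risk ratio with complete follow-up: {true_rr:.2f}")
print(f"Risk ratio with uniform loss:       {rr_uniform:.2f}")
print(f"Risk ratio with differential loss:  {rr_differential:.2f}")
```

In this invented cohort, losing the same fraction from every cell leaves the risk ratio at 2.0, whereas losing more of the exposed participants who develop disease pulls it down to about 1.43.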

    Recall bias

    In a case-control study, data on exposure are collected retrospectively. The quality of the data, therefore, is determined to a large extent by the participant's ability to accurately recall past exposure(s). Recall bias may occur when the information provided on exposure differs systematically between cases and controls. For example, an individual with the outcome under investigation (a case) may report their exposure experience differently from an individual without the outcome (a control). That is, cases may tend to have better recall of past exposures than controls.

    Recall bias may result in either an underestimate or overestimate of the association between exposure and outcome.
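Both directions of recall bias can be illustrated with a short calculation. The sketch below uses hypothetical case-control counts and recall fractions, all invented for illustration, and makes the simplifying assumption that unexposed people never report an exposure.

```python
def odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Exposure odds ratio for a case-control table."""
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)


def reported_counts(exposed, unexposed, recall_sensitivity):
    """Counts as reported when only a fraction of truly exposed people recall it.

    Assumes nobody who was unexposed reports an exposure (perfect specificity).
    """
    reported_exposed = exposed * recall_sensitivity
    reported_unexposed = (exposed + unexposed) - reported_exposed
    return reported_exposed, reported_unexposed


# True exposure status, hypothetical counts invented for illustration only.
cases = (80, 120)     # (exposed, unexposed)
controls = (50, 150)  # (exposed, unexposed)

true_or = odds_ratio(*cases, *controls)

# Scenario A: cases recall every exposure, controls recall only 80% of theirs.
or_controls_forget = odds_ratio(*cases, *reported_counts(*controls, 0.8))

# Scenario B: cases under-report the exposure, recalling only 70% of theirs.
or_cases_underreport = odds_ratio(*reported_counts(*cases, 0.7), *controls)

print(f"True OR:                        {true_or:.2f}")
print(f"OR when controls recall less:   {or_controls_forget:.2f}")
print(f"OR when cases under-report:     {or_cases_underreport:.2f}")
```

With these invented numbers the true odds ratio of 2.0 is overestimated (about 2.67) when controls recall less completely than cases, and underestimated (about 1.17) when cases under-report.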

    Methods to minimize recall bias include collecting exposure data from work or medical records and blinding the study participants to the hypothesis under investigation.

    Selection bias

    Selection bias occurs when the two groups being compared differ systematically. That is, those who are selected for a study differ in their characteristics from those who are not selected, and those characteristics are related to either the exposure or the outcome under investigation.
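One way to see how selection can distort an association is to apply different selection probabilities to each combination of exposure and outcome. The sketch below is a hypothetical illustration: the source population counts and the selection probabilities are invented, and the names are assumptions of this sketch.

```python
# Source population, hypothetical counts invented for illustration only.
source = {
    ("case", "exposed"): 100,
    ("case", "unexposed"): 300,
    ("non-case", "exposed"): 1000,
    ("non-case", "unexposed"): 3000,
}

# Assumed probability that a person in each cell ends up in the study.
# Exposed cases are more likely to be selected than unexposed cases.
selection_probability = {
    ("case", "exposed"): 0.9,
    ("case", "unexposed"): 0.6,
    ("non-case", "exposed"): 0.1,
    ("non-case", "unexposed"): 0.1,
}


def odds_ratio(table):
    """Cross-product ratio of a 2x2 table keyed by (outcome, exposure)."""
    return (table[("case", "exposed")] * table[("non-case", "unexposed")]) / (
        table[("case", "unexposed")] * table[("non-case", "exposed")]
    )


# Expected numbers selected into the study from each cell.
study = {cell: n * selection_probability[cell] for cell, n in source.items()}

source_or = odds_ratio(source)
study_or = odds_ratio(study)
# The same cross-product ratio applied to the selection probabilities
# gives the factor by which the odds ratio is distorted.
bias_factor = odds_ratio(selection_probability)

print(f"OR in the source population: {source_or:.2f}")
print(f"OR among those selected:     {study_or:.2f}")
print(f"Selection bias factor:       {bias_factor:.2f}")
```

Here there is no association in the source population (odds ratio 1.0), yet the study sample shows an odds ratio of 1.5 because selection depended on both exposure and outcome.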


    References

    1. Hennekens CH, Buring JE. Epidemiology in Medicine. Lippincott Williams & Wilkins, 1987.
    2. Breslow NE, Day NE. Statistical Methods in Cancer Research. Vol. 1: The Analysis of Case-Control Studies. IARC, 1980.