1. Sources of routine data
Routine data describes non-targeted information that is obtained in a standardised and consistent manner. Various sources of routinely collected data are available for use in epidemiological studies. These include:
- Demographic data from census and population registers
- Death certificates
- Cancer registrations
- Birth registrations
- Congenital malformations registrations
- Infectious disease notifications
- Hospital episode data
- Health surveys
- Royal College of General Practitioners weekly returns
Some of these data are collected routinely, on a national or regional basis, such as deaths, cancer registrations, births, fertility and infectious disease notifications. However, the availability and quality of routine health statistics that can be utilised for epidemiological investigations may vary considerably:
- Data may be collected on a systematic and ongoing basis or only intermittently (such as in health surveys)
- They may be collected for the entire population or for a sample of the population
- They may be available nationally, regionally or locally
- They may be cohort or cross-sectional in concept
- They may be person or episode based
- Much of the mortality and morbidity data in the UK is collected separately for England, Wales, Scotland and Northern Ireland, making comparisons between countries more problematic, particularly as data collection may not be standardised between countries
- International comparisons of mortality or morbidity, for example of cancer rates or cancer survival, may prove problematic because coverage, methods of data collection and the classification of certain cancers may vary significantly between countries.
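The distinction between person-based and episode-based data in the list above can be illustrated with a minimal sketch. The records, identifiers and function names below are hypothetical and invented purely for illustration; they are not drawn from any real dataset.

```python
# Hypothetical hospital records, one row per episode of care:
# (patient_id, diagnosis). All values are invented for illustration.
EPISODES = [
    ("P1", "asthma"),
    ("P1", "asthma"),
    ("P2", "asthma"),
    ("P3", "fracture"),
    ("P1", "asthma"),
]


def episode_count(records, diagnosis):
    """Episode-based count: every recorded episode with the diagnosis."""
    return sum(1 for _, dx in records if dx == diagnosis)


def person_count(records, diagnosis):
    """Person-based count: distinct people with at least one such episode."""
    return len({patient_id for patient_id, dx in records if dx == diagnosis})


asthma_episodes = episode_count(EPISODES, "asthma")
asthma_people = person_count(EPISODES, "asthma")
print(asthma_episodes, asthma_people)
```

In this toy example one patient accounts for three of the four asthma episodes, so an episode-based count would overstate the number of people affected if it were read as a person-based count.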
In England most health statistics are published by the Office for National Statistics (ONS) and a large amount is now available electronically.
2. Demographic data
Demography is the scientific study of populations, including their size, structure, dispersal and development. Basic demographic data are an essential part of any epidemiological investigation as they form the baseline count of the total population being studied. Reliable denominators are necessary for the calculation of the various measures of disease frequency, including incidence and prevalence.
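To show how the denominator enters these measures, the following is a minimal sketch that expresses incidence and prevalence per 100,000 population. The case counts, the population figure and the function name are hypothetical, and the sketch assumes a simple count-based denominator rather than person-time at risk.

```python
def rate_per_100k(events, population):
    """Express a count of events per 100,000 people in the denominator."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * 100_000 / population


# Hypothetical figures for one area in one year (illustration only).
new_cases = 150            # cases first diagnosed during the year
existing_cases = 1_200     # people living with the disease at mid-year
mid_year_population = 250_000

incidence = rate_per_100k(new_cases, mid_year_population)
prevalence = rate_per_100k(existing_cases, mid_year_population)
print(incidence, prevalence)
```

If the denominator were out of date, for example 10% too small, both figures would be overstated by roughly 11%, which is why reliable population counts matter.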
The primary source of denominator data in the UK is the national census, which is carried out every 10 years. The last UK census was held in 2011 and the next one is scheduled for 2021. In addition to population counts by age and sex, the UK census also collects data on a number of socio-economic variables including ethnicity (since 1991), country of birth, accommodation, education, employment and long-term illness (since 1991).
There is no central agency that collates statistics for all four home nations of the UK, so separate censuses are undertaken simultaneously by each of the three government census agencies, covering England and Wales, Scotland and Northern Ireland. Census data has become increasingly available in electronic format, much of which can be downloaded from the relevant census office (see below).
- England and Wales:
The Office for National Statistics (ONS) https://www.ons.gov.uk/census
- Scotland:
The National Records of Scotland (NRS) http://www.scotlandscensus.gov.uk/
- Northern Ireland:
The Northern Ireland Statistics and Research Agency (NISRA) https://www.nisra.gov.uk/statistics/census
Issues using census data
In the UK, the census is conducted every 10 years. As a result population data can become out of date quickly, leading to spurious measures of disease frequency and effect. In addition, data may be incomplete for some population sub-groups, for example those in hard-to-reach communities.
However, the Office for National Statistics produces annual mid-year population estimates, which update the most recent census by taking into account births, deaths, net migration and ageing of the population. Mid-year population estimates are also available for Scotland and Northern Ireland. In some settings, particularly urban areas, high migration rates can lead to inaccuracies in some local population estimates.
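The logic of rolling a census count forward by one year can be sketched as follows. This is a simplified illustration and not the method actually used by the ONS: it assumes single-year age groups with an open-ended oldest group, and applies deaths and net migration to each age after ageing. All figures and names are hypothetical.

```python
def roll_forward(pop_by_age, births, deaths_by_age, net_migration_by_age):
    """Roll a population forward one year.

    Assumes ages run 0..max_age with no gaps, and that the oldest age
    is an open-ended group that people remain in once they reach it.
    """
    max_age = max(pop_by_age)
    aged = {0: births}  # this year's births form the new age-0 group
    for age, count in pop_by_age.items():
        new_age = min(age + 1, max_age)  # everyone ages by one year
        aged[new_age] = aged.get(new_age, 0) + count
    return {
        age: aged.get(age, 0)
        - deaths_by_age.get(age, 0)
        + net_migration_by_age.get(age, 0)
        for age in range(max_age + 1)
    }


# Toy population with ages 0, 1, 2 and an open-ended group "3 and over".
previous = {0: 100, 1: 110, 2: 120, 3: 300}
estimate = roll_forward(
    previous,
    births=95,
    deaths_by_age={0: 1, 3: 10},
    net_migration_by_age={1: 5, 2: -2},
)
print(estimate, sum(estimate.values()))
```

The total changes by births minus deaths plus net migration, so any error in the migration component carries straight through to the estimate.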
3. Mortality data
Sources of routine mortality data in the United Kingdom
Complete and continuous mortality data are readily available in the UK. In England and Wales the information is derived from the registration of deaths certified by an attending medical practitioner or coroner (Procurator Fiscal in Scotland). Death certificates include information on the immediate and underlying causes of death, age at death, sex, address and occupation.
Death certificates are sent to the Office for National Statistics (ONS) where the underlying causes of death are classified according to the Tenth Revision of the International Classification of Diseases (ICD-10).
Annually published mortality statistics for England and Wales, compiled by the Office for National Statistics (ONS) include:
- Mortality Statistics: Deaths Registered in England and Wales (Series DR)
Presents statistics on deaths occurring annually in England and Wales. Data includes death counts and rates by sex, age-group, and underlying cause.
- Child Mortality Statistics: Childhood, Infant and Perinatal
Presents detailed analyses of all stillbirths, infant and perinatal deaths and data on deaths of children < 16 years by cause of death, sex and age-group.
Mortality data for Scotland and Northern Ireland are available from the relevant national statistical offices.
Mortality data are more complete and reliable than many routine sources of morbidity data, but are still subject to some level of misclassification. Such data may be affected by a number of factors including:
- Diagnostic accuracy
- The certifying experience of the attending medical practitioner
- Correct classification and coding of the death certificate
4. Morbidity data
Sources of routine morbidity data
Routine data on morbidity are less readily available than data relating to mortality. However, a range of morbidity data is available for use in epidemiological studies, including:
- Cancer registrations
- National congenital anomaly and rare disease registrations
- Statutory notifications of infectious disease
- Laboratory reporting of microbiological data
- General practitioner clinical codes (SNOMED CT)
- Hospital episode statistics
- Data from health surveys (for example, Health Survey for England)
- Royal College of General Practitioners Research and Surveillance Centre weekly reports
Cancer Statistics - England
Cancer registrations in England are conducted by eight independent regional registries that collect data on cancer cases in their regions. Regional registries supply a standard dataset (Cancer Outcomes and Services Dataset) monthly to the National Cancer Registration Service run by Public Health England for the provision of cancer statistics.
These data are published annually by the ONS (2 years after the year in which the cancer was diagnosed) in:
- Cancer Statistics: Registration Series MB1: Available here.
National and regional information on cancer incidence. Statistical tables are presented on numbers and rates of newly diagnosed cases of cancer by site, age-group and sex. Age standardised rates (using the European standard population) for the latest ten year period by cancer site and sex are also included.
- Cancer Registrations Wales: http://www.wcisu.wales.nhs.uk/home
Cancer registrations in Wales are coordinated by the Welsh Cancer Intelligence and Surveillance Unit (WCISU).
- Cancer Registrations Scotland: http://www.isdscotland.org
The cancer registration system in Scotland is coordinated by the Information and Statistics Division (ISD) of the NHS in Scotland.
- International Agency for Research on Cancer (IARC): Available here.
International cancer incidence data including IARC Scientific Publication No. 164: Cancer Incidence in Five Continents Vol. X
Care is needed when examining trends over time or between different regions. For example, cancer registries may differ with respect to methods of data collection, completeness of registrations or recording of data items. Submission of data to the registries is voluntary. In addition, misclassification of cancer cases or changes in coding systems over time may affect the reliability of the data particularly when examining trends over time.
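The age-standardised rates mentioned above for the registration series are obtained by direct standardisation, which can be sketched as follows. The age bands, case counts, populations and standard weights are all hypothetical; in particular, the weights are toy values and not the actual European standard population.

```python
def directly_standardised_rate(cases, population, standard_weights, per=100_000):
    """Weighted average of age-specific rates, using standard-population weights."""
    total_weight = sum(standard_weights.values())
    weighted_sum = sum(
        standard_weights[band] * cases[band] / population[band]
        for band in standard_weights
    )
    return weighted_sum / total_weight * per


# Hypothetical registry data for one cancer site (illustration only).
cases = {"0-39": 10, "40-64": 60, "65+": 200}
population = {"0-39": 100_000, "40-64": 60_000, "65+": 40_000}
# Toy weights standing in for a standard population; not the real values.
toy_standard = {"0-39": 60, "40-64": 30, "65+": 10}

crude_rate = sum(cases.values()) * 100_000 / sum(population.values())
standardised_rate = directly_standardised_rate(cases, population, toy_standard)
print(round(crude_rate, 1), round(standardised_rate, 1))
```

Here the study population is older than the toy standard, so the standardised rate is lower than the crude rate. Applying the same weights to every region or year removes differences that are due only to age structure.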
Hospital Episode Statistics (HES)
Hospital Episode Statistics (HES) provide a wide range of information about all in-patient admissions, outpatient appointments and accident and emergency attendances at NHS hospitals in England. HES data have been collected since 1989 and now include private patients treated in NHS hospitals, patients resident outside of England and care delivered by treatment centres that are funded by the NHS.
More than 125 million new records are generated each year. Each contains a range of information about an individual patient’s hospital admission including:
- Clinical information about diagnoses and operations
- Patient data, such as age group, sex and ethnicity
- Geographical information on where the patient was treated and the area in which they live
- Administrative information, such as time waited, and dates and methods of admission and discharge
Prior to the introduction of HES, the Hospital In-Patient Enquiry (HIPE) service collected data on discharges and deaths, although these data are not directly comparable with HES data.
Detailed information regarding the format and analysis of HES data is available online at:
Statutory Notifiable Diseases
In England and Wales cases of certain infectious diseases must be notified to the proper officer of the relevant local authority. All diagnostic laboratories in England must notify Public Health England (PHE) when a notifiable organism is confirmed. Reports of notifications of infectious diseases (NOIDs) are published weekly and annually by PHE.
PHE reports of NOIDs can be found here.
Vital Statistics for England and Wales
Vital statistics tables for England and Wales are produced by the ONS and include data on live birth and stillbirth rates, fertility rates, maternity statistics, death registrations and causes of death. Data are available by local authority, health authority and ward, with raw data being held by NHS Digital, formerly the Vital Statistics Service of the Health and Social Care Information Centre (HSCIC).
The UK General Household Survey
The General Household Survey (GHS) ran from 1971 to 2007. It was carried out annually by the Social Survey Division of the ONS and comprised a random sample of 13,000 households in the UK. Data were collected for a number of socio-economic factors including health, use of health services, the prevalence of smoking and alcohol consumption, fertility history, employment, education and income. Archived data sets are available here.
The Health Survey for England
The Health Survey for England (HSE) was established in 1991 by the Department of Health and Social Care, and is now carried out in conjunction with NatCen Social Research (an independent social research agency). It comprises a series of annual surveys about the nation's health. The HSE is designed to be representative of the general population in England and aims to provide a measure of the health status of the population. Annual surveys cover the adult population aged 16 and older living in private households. Children have been included in the survey since 1995.
Each year the survey focuses on different demographic groups or diseases and their risk factors and looks at health indicators including cardiovascular disease, physical activity and eating habits. In addition to completing a health questionnaire, those surveyed receive a follow-up nurse visit, during which various physical measurements and samples are taken, including blood pressure, lung function tests, and blood and saliva samples [1].
The Health Survey for England is available online from the UK Data Service at:
5. Uses of routine health statistics
Routine health statistics may be utilised in a variety of epidemiological investigations including descriptive studies (cross sectional surveys or comparative mortality or morbidity studies) and disease surveillance. For example, cancer registration data may be used to examine trends in cancer incidence over time or variations in the incidence of specific cancers by age, sex, socio-economic status or geographic area. Cancer registration data may also be used to evaluate the effectiveness of cancer screening programmes or cancer care.
Routine or vital health statistics may also be used in analytical studies. However, their usefulness will depend on their completeness, accuracy, coverage and comparability between different regions or countries.
6. Strengths and weaknesses of routine data
Routinely collected data may be subject to incompleteness, bias and measurement error.
|Strengths|Weaknesses|
|---|---|
|Readily available|Lack of completeness, with potential bias|
|Low cost|Often poorly presented and analysed|
|Useful for establishing baseline characteristics|Where there are small numbers of cases, it may be possible to identify individuals, threatening confidentiality|
|Identifying cases in a case-control study|Data may not be collected in a uniform way across the entire population|
|Generating aetiological hypotheses|Techniques of data collection may vary geographically, e.g. recording data, coding|
|Deriving expected numbers in a cohort study, or as a source for ascertaining outcome in a cohort study|Equivalent data not always available for all countries|
|Useful for examining trends of disease over time and by place|Delay between collection and publication|
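One of the strengths listed above, deriving expected numbers in a cohort study, can be illustrated with a minimal sketch of a standardised mortality ratio (SMR). Routine mortality rates for a reference population are applied to the cohort's person-years to give the expected number of deaths. The age bands, person-years, reference rates and observed count are all hypothetical.

```python
def expected_deaths(person_years, reference_rates_per_1000):
    """Apply reference death rates (per 1,000 person-years) to cohort person-years."""
    return sum(
        person_years[band] * reference_rates_per_1000[band] / 1000
        for band in person_years
    )


def standardised_mortality_ratio(observed, expected):
    """SMR expressed as a percentage; 100 means deaths are as expected."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed * 100 / expected


# Hypothetical occupational cohort (illustration only).
cohort_person_years = {"40-49": 5_000, "50-59": 4_000, "60-69": 2_000}
# Hypothetical national death rates per 1,000 person-years, by age band.
national_rates = {"40-49": 2, "50-59": 5, "60-69": 12}
observed_deaths = 65

expected = expected_deaths(cohort_person_years, national_rates)
smr = standardised_mortality_ratio(observed_deaths, expected)
print(expected, round(smr, 1))
```

An SMR above 100 in this sketch would suggest more deaths in the cohort than the reference rates predict, although the weaknesses in the table, such as incomplete or inconsistently coded data, apply to the reference rates too.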
Descriptive studies exploring patterns of disease distribution in populations often make use of routinely available health data. The results can be used for a variety of purposes, including generating hypotheses about the possible determinants of disease and for service planning.
Data derived from routine statistics are usually analysed with respect to three main categories:
- Time – when?
- Place – where?
- Person – who?
There are three broad patterns of variation in disease incidence with time:
- Secular (long-term) trends – changes in disease incidence over a number of years that do not conform to an identifiable cyclical pattern. For example, the secular trend in mortality from TB in England shows a steady decline over many years. However, this does not give any indication of the cause of the decline.
- Periodic changes including seasonality – regular or cyclical changes in incidence, for example in infectious diseases. Cases of influenza typically reach a peak in the winter months.
- Epidemics – strictly speaking, an epidemic is a temporary increase in the incidence of a disease in a population.
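One simple way to separate a secular trend from seasonal variation is a 12-month moving average, sketched below. The monthly counts are synthetic: they are built from an assumed steady decline of two cases per month plus an invented seasonal pattern with a winter peak, so the example only demonstrates the arithmetic.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Invented seasonal effect for January to December (winter peak).
SEASONAL = [30, 25, 15, 5, 0, -5, -10, -10, -5, 0, 10, 20]
# Three years of synthetic monthly counts: steady decline plus seasonality.
monthly_cases = [200 - 2 * month + SEASONAL[month % 12] for month in range(36)]

trend = moving_average(monthly_cases, 12)
print(monthly_cases[:12])
print(trend[:3])
```

Because every 12-month window contains each calendar month exactly once, the seasonal component contributes the same amount to every average and the underlying decline becomes visible. As noted above, the trend alone does not indicate its cause.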
Variations in disease incidence by place also fall under three main headings:
- Broad geographical differences – sometimes related to factors such as climate, or social and cultural habits. Some cancers show marked geographical differences in incidence.
- Local differences – distribution of a disease may be limited by the localisation of the cases, for example a contaminated water supply.
- Variations within a single institution – variations in attack rates by hospital ward, for example, may help identify possible sources or routes of spread of a gastrointestinal infection.
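The comparison of attack rates by hospital ward mentioned above can be sketched as follows. The ward names and counts are hypothetical, and the attack rate is taken here as cases divided by the number of people at risk, expressed as a percentage.

```python
def attack_rate(cases, at_risk):
    """Percentage of the at-risk group who became cases."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    return cases * 100 / at_risk


# Hypothetical outbreak data: ward -> (cases, people at risk).
WARDS = {
    "Ward A": (12, 30),
    "Ward B": (2, 28),
    "Ward C": (1, 25),
}

rates = {ward: attack_rate(cases, at_risk) for ward, (cases, at_risk) in WARDS.items()}
highest = max(rates, key=rates.get)
for ward, rate in rates.items():
    print(f"{ward}: {rate:.1f}%")
print("Highest attack rate:", highest)
```

A markedly higher rate on one ward, as in this toy example, would point the investigation towards exposures specific to that ward.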
The chances of an individual developing a disease may be affected by their personal characteristics, so exploring this area may provide useful clues about possible causes. Personal characteristics can be classified in several ways, but important features include:
- Intrinsic factors such as age, gender and ethnic group – these are not amenable to change
- Occupation, marital status, behavioural habits and lifestyle
References
1. Leadbeter D. 2000. Harnessing Official Statistics, Radcliffe.
2. Ben-Shlomo Y, Brookes S, Hickman M. 2013. Lecture Notes: Epidemiology, Evidence-based Medicine and Public Health (6th ed.), Wiley-Blackwell, Oxford.
© Helen Barratt, Maria Kirwan 2009, Saran Shantikumar 2018