Sources of routine mortality and morbidity data, including primary care data, and how they are collected and published at international, national, regional and local levels




Key issues and questions that should be considered for any data set that provides information on population health:

  • Accuracy - to what extent are the data that are present, correct?
  • Precision - have appropriate measures of uncertainty been included (e.g. 95% confidence intervals)?
  • Relevance - to what extent do the data answer the question posed?
  • Completeness - how much of the data is missing?
  • Timeliness - what period do the data refer to, and how relevant is that to the current position?
  • Coverage - is the whole population of interest represented, and if not, what fraction makes up the sample?
  • Accessibility - who has access to the data, and is it controlled (e.g. via password-restricted access, public domain)?
  • Confidentiality/suppression/disclosure control - there are strict regulations preventing the publication of datasets that might, when used in combination with other available sources, enable individuals to be identified. Are these regulations followed?
  • Original purpose of collection/collation - under Data Protection legislation, personal data may only be used for the purposes for which it was collected. NHS data registrations generally include improvement of the health of the population and management of the NHS among their purposes, but non-NHS data may not. In addition, change of purpose may be a source of bias in the data.
  • Who undertook the collection/collation? - this may not be available.
  • How were the data collected? - this information may not be available.
  • Whether what is included in the data set is the actual requirement, or whether it will have to act as a proxy for the real item.
  • Is the data set comparative, what are the comparators, and are they appropriate?
  • If the dataset presents rates or ratios, have appropriate techniques been used to control for differing population structures? - for example, has direct or indirect standardisation (q.v.) been applied?
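
To make the points on precision and standardisation concrete, the following is a minimal sketch of a directly standardised rate with an approximate 95% confidence interval. The age bands, counts and standard population are made up for illustration, the function name is hypothetical, and the interval uses a simple normal approximation; published statistics often use other interval methods.

```python
from math import sqrt


def directly_standardised_rate(deaths, population, standard_population, per=100_000):
    """Return (rate, lower, upper) per `per` people.

    Each argument is a list with one entry per age band. The interval is a
    normal approximation that treats the death counts as Poisson variables.
    """
    total_standard = sum(standard_population)
    bands = list(zip(deaths, population, standard_population))
    # Weight each age-specific rate by the standard population in that band.
    rate = sum(w * d / n for d, n, w in bands) / total_standard
    variance = sum(w ** 2 * d / n ** 2 for d, n, w in bands) / total_standard ** 2
    half_width = 1.96 * sqrt(variance)
    return rate * per, (rate - half_width) * per, (rate + half_width) * per


# Illustrative (invented) figures for three age bands.
deaths = [10, 50, 200]
population = [10_000, 10_000, 10_000]
standard_population = [50_000, 30_000, 20_000]

dsr, lower, upper = directly_standardised_rate(deaths, population, standard_population)
print(f"DSR per 100,000: {dsr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Applying the same standard population to two areas gives rates that can be compared without distortion from their differing age structures.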

2.1  Sources and publication of Mortality data in the UK

Mortality data and statistical publications are routinely available from the Office for National Statistics (ONS) [accessed 14/08/2018].

As of August 2018, there are 46 data sets related to deaths - see:

[Accessed 14/08/2018]

Among them are:

Deaths registered by area of usual residence, UK

[Accessed 14/08/2018]

Life expectancy at birth and selected older ages: [Accessed 14/08/2018]

Deaths registered by age, sex and selected underlying cause of death. Tables also provide both mortality rates and numbers of deaths over time.

Analysis of provisional 2015 death registrations data by sex, cause, age, region and local authority.

Further information on mortality data is available at  the ONS website, for example:

Statistical bulletin: Deaths registered in England and Wales: 2017, which has annual data on death registrations, contains death rates, cause of death data by sex and age, and death registrations by area of residence and single year of age.
[Accessed 14/08/2018]

Article: Mortality in the United Kingdom: 1983-2013. This summary report presents trends in age-standardised mortality rates in the UK and constituent countries over the period 1983 to 2013. Age-standardised rates allow valid comparisons to be made between different countries and over time. Trends in the overall number of deaths, infant deaths, causes of death for 3 broad disease groups (cancers, circulatory diseases and respiratory diseases) and age groups are included for both males and females.
[Accessed 14/08/2018]


Public Health Uses of Births and Deaths data

  • Health service planning
  • Epidemiology
  • Monitoring and evaluation
  • Audit
  • Screening programmes (breast and ovarian cancer, immunisation take up)
  • Confidential enquiries and register checking
  • Inequalities analysis (postcode enables precise geographical analysis of population and patients).
  • Assessing progress against targets (e.g. infant mortality, life expectancy).

Both births and deaths data are very complete and accurate for the UK.

Deaths data can provide very important information on health of populations.

Ethnicity is not collected for either deaths or births (though where these take place in NHS hospitals it may be derived from HES records).

Defining the socio-economic status from the occupation recorded on birth registration records is very difficult and possibly even more tenuous for single mothers. 

Deaths are not reliable as a picture of burden of morbidity of chronic illness; more people are living longer with illness than in the past; quality of recording cause of death varies considerably.

2.2   Sources and publication of morbidity data 

Self-reported health
Self-reported health is by its nature subjective and different people will have different thresholds for what is considered 'serious' or 'very painful'.  Definitions can change over time as well.  Conversely people can forget illnesses that they have had, or choose to withhold information.  However, only a proportion of people experiencing ill-health make contact with health services. Self-reported ill-health can therefore help to show more of the burden of ill health in the population that is otherwise 'under the line'.

Figure 1: The iceberg concept

Source:  Donaldson and Donaldson 'Essential Public Health' 2nd edition.


Sources of information on self-reported health
Census: questions on self-reported general health and limiting long-term illness.

Health Survey for England
The Health Survey for England is a series of annual national surveys, carried out since 1991, about the health and related behaviours of people in England, involving about 16,000 adults. It combines a questionnaire with physical measurements and blood samples. Questions on specific topics, such as cardiovascular disease and accidents, are also asked at certain intervals. However, its sample size is currently not large enough to provide local authority level data (individual local authorities sometimes conduct their own surveys using identical questions in order to better understand the local situation).

More details can be found at [accessed 15/08/2018].


National Psychiatric Morbidity Survey

2.3  Primary care data
In the UK, 90% of reported ill health is captured at GP practice level, so primary care data are potentially very important, especially when the iceberg effect is considered (see Figure 1). Primary care data could offer a huge amount of information on the health of the population, but they have historically been very difficult to access. Most GP practices are now computerised, and in some areas they may allow public health staff direct access to the data. Changes in the NHS have resulted in improved primary care data collection.

Quality and Outcomes Framework in UK General Practice
This is a system for performance management and payment for GPs [accessed 15/08/2018].

Data are collected on a number of clinical domains, which vary over time.  They currently include:

  • CHD
  • Stroke
  • Hypothyroidism
  • Diabetes
  • Hypertension
  • Mental health
  • COPD
  • Asthma
  • Epilepsy

QOF data in theory enable local prevalence to be estimated for the conditions covered. In practice, however, routine data from general practices such as the Quality and Outcomes Framework (QOF) are not quality controlled, and their epidemiological validity is sometimes very poor. For example, a 2013 study of QOF depression registers found very large variation between general practices in the recorded prevalence of depression.

Figure: Ratio of ‘observed’ number of patients with depression diagnosed within the preceding year to the ‘expected’ number with depression by general practice, Northamptonshire, 2012




Source: Lawrence D Report to Northamptonshire County Council 2013: Observed-QOF reports. Expected-National Psychiatric Morbidity Survey

Analysis of QOF data and MH prevalence estimates,[1] based on QOF data from Nene and Corby CCGs, 2012-3 and GP list populations 2012[2]
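
As an illustration of the kind of comparison shown in the figure, the sketch below computes an observed-to-expected ratio for each practice, taking the expected number as a survey-based prevalence multiplied by the practice list size. The practice names, counts and prevalence value are invented for illustration, and the exact method used in the report cited above is not given in this chapter, so this is an assumed approach rather than a reproduction of it.

```python
def observed_expected_ratios(practices, expected_prevalence):
    """Return {practice: observed / expected} for each practice.

    `practices` maps a practice name to (observed cases, list size).
    `expected_prevalence` is the proportion expected from a population survey.
    """
    ratios = {}
    for name, (observed, list_size) in practices.items():
        expected = expected_prevalence * list_size
        ratios[name] = observed / expected
    return ratios


# Invented figures: (patients on the register, practice list size).
practices = {
    "Practice A": (120, 8_000),
    "Practice B": (30, 6_000),
    "Practice C": (400, 10_000),
}

ratios = observed_expected_ratios(practices, expected_prevalence=0.025)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: observed/expected = {ratio:.2f}")
```

A ratio well below 1 suggests under-recording relative to the survey estimate, while wide variation between practices points to inconsistent recording rather than true differences in prevalence.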








Clinical Practice Research Datalink (CPRD)
Formerly the General Practice Research Database [accessed 15/08/2018]

The Clinical Practice Research Datalink (CPRD) is a governmental, not-for-profit research service, jointly funded by the NHS National Institute for Health Research (NIHR) and the Medicines and Healthcare products Regulatory Agency (MHRA), a part of the Department of Health and Social Care.


Secondary care data

Hospital Episode Statistics
In the NHS, the main source of data on patients' interactions with secondary (i.e. hospital) care services is the Hospital Episode Statistics (HES) database.
[accessed 15/08/2018]

These data are recorded by providers and exchanged with commissioners of care via electronic clearing houses. The data collected vary according to the type of interaction - outpatient attendance, admitted care, waiting for elective admission, A&E. In each case the data to be collected are set out in the NHS Data Dictionary. [accessed 15/08/2018]

Information on data is known as ‘metadata’.

They are called ‘episode’ statistics because hospital inpatient admissions are subdivided into episodes of care under hospital consultants. That is, if a patient was under the care of two different consultants consecutively during one admission, the hospital stay would comprise two episodes [accessed 28/11/2007].
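
The distinction between episodes and hospital stays matters when counting admissions. The sketch below groups episode records into stays; the field names (`patient`, `admitted`, `consultant`) and the records themselves are hypothetical and are not the NHS Data Dictionary definitions.

```python
from collections import defaultdict

# Hypothetical episode records: one row per consultant episode.
episodes = [
    {"patient": "P1", "admitted": "2018-01-03", "consultant": "C10"},
    {"patient": "P1", "admitted": "2018-01-03", "consultant": "C22"},
    {"patient": "P2", "admitted": "2018-02-11", "consultant": "C10"},
]


def episodes_per_stay(records):
    """Group episodes by (patient, admission date), i.e. by hospital stay."""
    stays = defaultdict(list)
    for record in records:
        stays[(record["patient"], record["admitted"])].append(record["consultant"])
    return dict(stays)


stays = episodes_per_stay(episodes)
print(f"{len(episodes)} episodes in {len(stays)} hospital stays")
for (patient, admitted), consultants in stays.items():
    print(patient, admitted, "episodes:", len(consultants))
```

Counting episodes rather than stays would overstate the number of admissions, here three rather than two.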

The data are available under the Secondary Uses Service (SUS):

Clinical Commissioning Groups (CCGs) receive data on a monthly basis. Mostly these are measures of activity, but data on admitted patients also include clinically coded diagnoses, operative procedures and ethnicity, which enable met need to be determined and assist in the analysis of health need and inequalities.

SUS data can help to identify health needs of the local population. It can be used to determine patient flows for treatment and contribute to the analysis of health outcomes.

The most direct use of HES data is for monitoring contracts between CCGs and hospital providers. Under ‘Payment by Results’, hospitals are paid for the activity they undertake. [accessed 15/08/2018]

Payment by Results is underpinned by Healthcare Resource Group (HRG) codes, developed initially by Dr Hugh Sanderson, a Consultant in Public Health Medicine in Wessex. HRGs, based on the American ‘Diagnosis Related Groups’, are groupings of conditions and procedures that are clinically similar and consume similar levels of resource. A national tariff is applied to each HRG code and updated each year.

HES is a nationwide dataset of all hospital admissions, recorded using computerised Patient Administration Systems. Each record is a subset of the record submitted by the provider to SUS, and is therefore defined in the NHS Data Dictionary. HES is generally issued on an annual basis, though provisional data are now issued quarterly. HES data often take longer to produce than SUS data but have greater accuracy. Each record represents a period of time under the responsibility of a specific consultant (a 'finished consultant episode'). A patient could have several episodes within one spell in hospital. There are hundreds of fields, including data on maternities and on augmented care for patients whose care involves the use of intensive and high dependency care facilities. Data include:

  • NHS Number
  • Full postcode
  • Date of birth
  • Sex
  • Ethnicity (more complete with each year)
  • Diagnosis fields
  • Procedures fields
  • Registered GP
  • PCT of residence
  • Date of admission
  • Date of discharge
  • Method of admission
  • Method of discharge

HES Uses
At a national level, health service planning, monitoring activity, and assessing quality of care.

At a local level (e.g. PCT) NCWS/SUS data is more timely, and HES is mostly used to enable national comparisons and benchmarking. Comparisons can be made between areas, for example, by calculating standardised rates.  

Monitoring inequalities across geographical areas, including ward level.

An example of output can be found at: [accessed 15/08/2018]

Completeness of the data is increasing. For example, a recent study by N. Dattani and MacFarlane concluded: “The linkage rate for birth registration and notification records to HES delivery records steadily improved from 2005 to 2014 due to improvement in the quality and completeness of patient identifiers in both HES and birth notification data.”

Standard coding systems are used: ICD-10 for diagnoses and OPCS-4 for procedures.

  • For serious morbidity where secondary care is necessary, HES can give the prevalence of the condition across the country.
  • HES has recently become linked with ONS mortality and births data.
  • HES provides a time series going back many years.


HES and SUS data only tell us about those who have a disease and then use health care facilities.  They do not give a full picture of morbidity. Specifically they do not tell us about those people who have a disease or disability but do not seek care - see iceberg effect and Figure 1.  Some fields are very incomplete - for example, whilst completeness of the ethnicity field has been increasing with each year, a significant proportion (10 - 15%) is filled with 'not stated'.

Diagnostic and procedure coding is often poor. For instance, HES coding for the procedure cataract removal is complicated and has often been miscoded.

Outpatient data became available via HES in 2007 but comes with limitations. 

Diagnostic data are often incomplete. For instance, the 2016-17 outpatient primary diagnosis dataset, hosp-epis-stat-outp-prim-diag-2016-17-tab.xls [accessed 15/08/2018], notes that “primary diagnosis is not a mandated field in the outpatients dataset, therefore coverage within this field is poor.” In fact, in this dataset, out of 93,944,301 ‘all attendances’, 89,700,366 were categorised as “Unknown and unspecified causes of morbidity”.
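
The scale of this gap can be checked with simple arithmetic on the two figures quoted above. The variable names below are illustrative only.

```python
# Figures quoted from the 2016-17 outpatient primary diagnosis dataset.
all_attendances = 93_944_301
unknown_diagnosis = 89_700_366

# Attendances that carry a usable primary diagnosis.
coded_attendances = all_attendances - unknown_diagnosis
share_unknown = unknown_diagnosis / all_attendances

print(f"Attendances with a coded diagnosis: {coded_attendances:,}")
print(f"Share with unknown or unspecified diagnosis: {share_unknown:.1%}")
```

Roughly 95% of outpatient attendances in that year therefore carried no usable primary diagnosis, which severely limits the use of this field for estimating morbidity.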

Other limitations can be found in the data quality notes [accessed 15/08/2018], in the publication DQ Notes M3 2018-19 (Publication).

Clinical registers
A clinical register is a register of all cases in a defined area with a specific disease or condition. The clinical criteria for inclusion should be defined. [accessed 15/08/2018]

There are many such registers in the UK and in the world.

Cancer registry [accessed 15/08/2018]

The National Cancer Registry is overseen by the Office for National Statistics.

Cancer registration statistics are available from the ONS website. See: [accessed 15/08/2018]

Cancer registries were set up to collate new cases of cancer and use this information to produce statistics about cancer incidence, prevalence, survival and mortality.  In recent years the work of cancer registries has expanded from the monitoring of cancer occurrence to include the analysis of different aspects of cancer prevention, treatment outcomes and care.



  • Patient follow up.
  • Auditing treatment, comparing with other treatment outcomes.
  • Evaluation of services.
  • Studies of causation.
  • Health service planning.

Quality and methodology 

"The Cancer Registration Statistics Quality and Methodology Information report  contains important information on: 

  • the strengths and limitations of the data and how it compares with related data
  • uses and users of the data
  • how the output was created
  • the quality of the output including the accuracy of the data" [accessed 15/08/2018]

Very rich, detailed source of information - patient identifiable information which is longitudinal i.e. updated over time. 


  • Expensive to run - updating the registry is laborious.
  • Confidentiality issues
  • Assessing completeness/under-coverage is not straightforward [Brenner et al]

Other examples of registries:

Issues to consider when establishing a registry:

  • Why (users and purposes)
  • What disease
  • Clear case definition
  • System for reporting new events
  • What is to be stored
  • What is to be reported
  • Who will handle the analyses
  • Who is taking enquiries
  • Who is producing reports
  • Financial implications
  • Confidentiality and ethical issues
  • Maintaining quality


Other examples of routine morbidity data

Notifications of communicable diseases

Notifications of foetal anomalies

Abortion statistics

Körner data, or Central Returns, are available at aggregated level only and are the NHS's only source of community data. Significant datasets for public health are:

  • KC60 - GUM clinic activity
  • KC50 - immunisation data
  • KC53, 63 - adult screening, cervical and breast
  • Mental health minimum dataset

The Mental Health Needs Index (MINI) provided an estimate of the need for inpatient mental health services for adults (ages 16-59) by ward and borough. It is calculated using a number of population variables likely to indicate need for access to services, such as deprivation; proportion of economically active adults unemployed; proportion of adults living in households not self-contained, etc. The MINI provided both predicted admission rates and a ratio of need compared to the England average. The MINI was developed by the Centre for Public Mental Health which has produced an online tool for accessing information on a ward / borough level.  [accessed 16/08/2018]

Values and limitations of routine data

Value of routine data

  • Readily available
  • Limited costs
  • Up to date
  • Useful for identifying hypotheses
  • Useful for initial assessment
  • Provides baseline data on expected levels of health/disease

Limitations of routine data

  • Lack of completeness
  • Potential for bias
  • Limited details of determinants such as income and ethnicity
  • Often poorly presented and analysed
  • Occasionally subject to political influences and manipulation



The most valuable feature of routine data is their availability at little cost to the researcher. They may be especially helpful in establishing baseline characteristics regarding the health status of the community, in generating hypotheses as a result of sex, age, cohort or geographic variation, and in identifying potential areas requiring further research. However, no single data set can provide the whole picture of a population's health and its needs.

Ad hoc data
Local health surveys may be carried out by CCGs when national surveys do not give sufficient information at local level.

Social and economic data
These data can be obtained from the Office for National Statistics, which has data and bulletins on:

People, population and community

“People living in the UK, changes in the population, how we spend our money, and data on crime, relationships, health and religion. These statistics help us build a detailed picture of how we live.” [accessed 16/08/2018]

Employment and labour market

People in and out of work covering employment, unemployment, types of work, earnings, working patterns and workplace disputes. [accessed 16/08/2018]



© M Goodyear & N Malhotra 2007, D Lawrence 2018


[1] Northamptonshire QOF-12-13-Obs Exp v2 G prac-depr v 1-1.xlsx

[2] GP Practice Population 2012.xlsx