Sources of routine mortality and morbidity data, including primary care data, and how they are collected and published at international, national, regional and district levels

Populations: Collection of routine and ad hoc data

There are four main types of health information

1) Demographic data

This covers factors such as age, sex, migration patterns, ethnicity and marital status in populations, and how they influence health.

  • 1.1 Census
  • 1.2 Exeter data 

2) Health event data

This covers recording of health events affecting individuals or populations.

  • 2.1 Births
  • 2.2 Deaths
  • 2.3 Self reported health
  • 2.4 Primary care interactions
  • 2.5 Secondary care interactions
  • 2.6 Health hazards

3) Circumstantial data

This covers aspects of individuals' and populations' circumstances that  may affect the wider determinants of health, including socio-economic, lifestyle, and environmental  data.

  • 3.1 Education data
  • 3.2 Employment data
  • 3.3 Housing data
  • 3.4 Environment data

4) National reference data

This covers data not issued purely for health purposes, but which are used in connection with health data to improve understanding of health issues. Examples are:

  • 4.1 Postcode look-up files that link postcodes to administrative and geographical units of which they are components, and  include map reference data
  • 4.2 Deprivation data
  • 4.3 ICD 10,  OPCS4, Clinical terms/Read/SNOMED CT coding systems for diagnoses and operative interventions

Key issues and questions that should be considered for any data set that provides information on population health:

  • Accuracy - to what extent  is  the data that is present correct?

  • Precision - have appropriate measures of uncertainty been included (e.g. 95% confidence intervals)?

  • Completeness - how much of the data is missing?

  • Timeliness - what period does the data refer to, and how relevant  is that to the current position?

  • Coverage - is the whole population of interest represented, and if not, what fraction  makes up the sample?

  • Accessibility - who has access to the data, and is  it controlled (e.g. via password-restricted access, public domain)?

  • Confidentiality/suppression/disclosure control - there are strict regulations preventing the publication of datasets that might, when used in combination with other available sources, enable individuals to be identified. Are these
    regulations followed?

  • Original purpose of collection/collation - under Data Protection legislation, personal data may only be used for the purposes for which it was collected. NHS data registrations generally include improvement of the health of the population and
    management of the NHS among their purposes, but non-NHS data may not. In addition, change of purpose may be a source of bias in the data.

  • Who undertook the collection/collation - this may not be available.

  • How the data have been collected - this may not be available.

  • Whether what is included in the data set is the actual requirement, or whether  it will have to act as a proxy for the real item.

  • Is the data set comparative, what are the comparators, and are they appropriate?

  • If the dataset presents rates or ratios, have appropriate techniques been used to control for differing population structures? - for example, has direct or indirect standardisation been applied?
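
As a minimal sketch of direct standardisation: the age-specific rates of the study population are applied to a reference (standard) population, giving the rate that would be expected if the study population had the standard age structure. The age bands, counts and standard population below are invented for illustration and are not taken from any published dataset.

```python
def directly_standardised_rate(events, population, standard_population, per=100_000):
    """Return the directly age-standardised rate per `per` population.

    All three arguments are lists ordered by the same age bands.
    """
    if not (len(events) == len(population) == len(standard_population)):
        raise ValueError("all inputs must cover the same age bands")
    # Events expected if the standard population experienced the local age-specific rates.
    expected = sum(
        (e / p) * s for e, p, s in zip(events, population, standard_population)
    )
    return per * expected / sum(standard_population)


# Hypothetical figures for three age bands (0-44, 45-74, 75+).
deaths = [10, 50, 200]
local_population = [10_000, 5_000, 2_000]
standard = [60_000, 30_000, 10_000]

crude_rate = 100_000 * sum(deaths) / sum(local_population)
dsr = directly_standardised_rate(deaths, local_population, standard)
print(f"crude: {crude_rate:.1f}, standardised: {dsr:.1f} per 100,000")
```

In this made-up example the crude rate is higher than the standardised rate because the local population is older than the standard one, which is the kind of distortion standardisation is meant to remove.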

 

  • Demographic data

    1.1 Census data

    The most important source of demographic data at the population  level for the UK is the Census.

    1.2 Exeter data

    Within England, another important source of demographic data is Exeter data, managed by the National Strategic Tracing Service http://www.connectingforhealth.nhs.uk/nsts [accessed
    28/11/2007].

    Description

    The Exeter database stores information at individual patient level, on patient registration with general practitioners.  It contains information on:

    • NHS Number

    • Name

    • Address

    • Postcode

    • Sex

    • Date of birth

    • Place of birth

    • GP and GP Practice  patient is registered with

    • PCT where the patient is registered

    Uses

    • The main purpose of the Exeter system was to pay GPs, on the basis of list capitation.  

    • For tracing people as they move and register with a new GP.

    • For providing GPs with a register.

    • Deprivation of registered patients at ward level is also factored in when calculating primary care resource allocation.

    • For recording national adult cancer screening programmes data.

    • For understanding local populations and to inform practice based commissioning.

    Strengths

    Crucial for practice profiling by practice clusters, PCTs and public health observatories.

    Postcodes enable determination of local authority of residence. Local authorities do not have equivalent databases of their residents, and in collaborative work between NHS and LAs the picture of the population Exeter makes available can
    be enormously useful.

    Weaknesses

    GP lists are inflated on average by 5.7%, due to mobility among young adults and delays in removing list members on death or emigration (http://www.primary-care-db.org.uk/datasets_help.cfm)
    [accessed 28/11/2007].

    Vulnerable populations such as homeless people, asylum seekers, and some migrant workers tend not to be registered with GPs so are missing from the Exeter system.

    Place of birth, which might  be useful in ethnic analyses, is a free text field, and may vary from  'home' to  country to detailed address.
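
    List inflation of the kind quoted above can be expressed as the percentage by which the registered list exceeds an independent estimate of the resident population. The sketch below assumes that definition, which may differ from the one used by the cited source; the figures are hypothetical and chosen only to reproduce an inflation of 5.7%.

```python
def list_inflation_percent(registered, resident_estimate):
    """Percentage by which the GP-registered count exceeds the resident estimate."""
    if resident_estimate <= 0:
        raise ValueError("resident_estimate must be positive")
    return 100 * (registered - resident_estimate) / resident_estimate


# Hypothetical counts: 105,700 registered patients against 100,000 estimated residents.
inflation = list_inflation_percent(registered=105_700, resident_estimate=100_000)
print(f"list inflation: {inflation:.1f}%")
```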

     

  • Health event data

    Births and death registration.

    In the UK, registration of birth and death events is a legal requirement. Information on how registration occurs can be found at http://www.gro.gov.uk/gro/content/ [accessed 28/11/2007]. 

    2.1 Births data

    http://www.familyrecords.gov.uk/topics/bmd_2.htm#birth [accessed 28/11/2007]. 

    • Child's forenames

    • Sex

    • Date of birth

    • Place of birth

    • Mother's full name and maiden name

    • Father's full name and occupation if married to the mother

    • Name, address and relationship to child of the person who registered the birth

    • Information on marital status and living arrangements of parents

    • Parents' occupation

    • Postcode of mother's normal place of residence

    In Scotland the following additional information would be given:

    • Time of birth

    • Date and place of parents' marriage

    Publicly accessible data on births can be found at:

    http://www.statistics.gov.uk/statbase/Product.asp?vlnk=5768 [accessed 28/11/2007].

    Summary data includes information on stillbirths:

    http://www.lho.org.uk/DATAANDMETHODS/Datasources_Description/Births_Data.aspx [accessed 28/11/2007].

    2.2 Deaths data

    Description

    In England and Wales, deaths need to be reported to the local registrar within 5 days of the death occurring. http://www.gro.gov.uk/gro/content/deaths/ [accessed 28/11/2007].

    Deaths data are passed to ONS on a weekly basis.  ONS provide a monthly dataset on deaths to directors of public health, including cause of death and contributing factors.  Information on what is captured on the death certificate can
    be found at http://www.familyrecords.gov.uk/topics/bmd_2.htm#death [accessed 28/11/2007]. 

    Data on the certificate and in the data files available to Public Health departments include:

    • Full name of deceased

    • Date of death

    • Address and postcode of normal place of residence

    • Place of death

    • Given age

    • Cause of death, underlying and contributory

    • Occupation (or name and occupation of husband if the deceased was a married or widowed woman)

    • Name, address and family relationship (if any) of the person who reported the death.

    In Scotland the following additional information would be given:

    • Marital status

    • Spouse's name

    • Sex

    • Father's name and rank or profession

    • Mother's name and maiden name

    Mortality  statistics publications are routinely available from the ONS http://www.statistics.gov.uk/onlineproducts/default.asp#health [accessed
    28/11/2007]. Among them are:

    DH1 - Annual review of the Registrar General on deaths in England and Wales

    DH2 - Deaths by cause

    DH3 - Childhood, infant and perinatal mortality

    DH4 - Injury and poisoning

    Further information on data about deaths is available at  http://www.lho.org.uk/DATAANDMETHODS/Datasources_Description/Deaths_Data.aspx [accessed
    28/11/2007].

    Public Health Mortality Files contain information about deaths within different health authority boundaries. They also include details of people who died outside the health authority region in which they were normally resident. These files are available only to health authorities, and at a cost.

    National births and deaths data are made available to regional Public Health Observatories (PHOs).  Mortality data have been linked to Hospital Episode Statistics and are in the process of being made available to PHOs. 2005/6 and historic
    years' extracts are available with date of death added. Discussions with ONS are ongoing to secure a regular quarterly feed of deaths data also including cause and place of death.

    Data on stillbirths can be found on the Compendium website http://www.nchod.nhs.uk/  [accessed 28/11/2007].

    Public Health Uses of Births and Deaths data

     

    • Health service planning
    • Epidemiology
    • Monitoring and evaluation
    • Audit
    • Screening programmes (breast and ovarian cancer, immunisation take up)
    • Confidential enquiries and register checking
    • Inequalities analysis (postcode enables precise geographical analysis of population and patients).
    • Assessing progress against targets (e.g. infant mortality, life expectancy).

    Strengths



    Both births and deaths data are very complete and accurate for the UK.

    Deaths data can provide very important information on health of populations.

    Weaknesses



    Ethnicity is not collected for either deaths or births (though where these take place in NHS hospitals it may be derived from HES records).

    Defining the socio-economic status from the occupation recorded on birth registration records is very difficult and possibly even more tenuous for single mothers. 

    Deaths data do not give a reliable picture of the burden of chronic illness, because more people are living longer with illness than in the past. The quality of recording of cause of death also varies considerably.

    2.3       Self reported health

    Self reported health is by its nature subjective, and different people will have different thresholds for what is considered 'serious' or 'very painful'. Definitions can change over time as well. In addition, people can forget illnesses that they have had, or choose to withhold information. However, only a proportion of people experiencing ill-health make contact with health services. Self reported ill-health can therefore help to show more of the burden of ill health in the population that would otherwise remain hidden 'under the line' of the iceberg (see Figure 1).

    Figure 1: The iceberg concept



    Source:  Donaldson and Donaldson 'Essential Public Health' 2nd edition.

    Sources of information on self-reported health

    Census: question on self reported general health and long term limiting illness.

    Health Survey for England

    Carried out since 1991, the Health Survey for England is a series of annual national surveys of the health and health-related behaviours of people in England, involving about 16,000 adults. It comprises a questionnaire together with physical measurements and blood samples. Questions on specific topics, such as cardiovascular disease and accidents, are also asked at certain intervals. However, its sample size is currently not large enough to provide PCT/local authority level data. Organisations with an interest can sponsor a boost to the Health Survey for England sample. The London Health Observatory did so in 2006 to provide an increased sample of London residents on specific aspects of health-related behaviours. Collection of the data began in February 2006 and takes 12 months to complete. Data from the survey are expected in Summer - Autumn of 2007. More details can be found at http://www.lho.org.uk/OurWork/LondonHealthSurvey.aspx [accessed 28/11/2007].

    General Household survey

    http://www.statistics.gov.uk/ssd/surveys/general_household_survey.asp [accessed 28/11/2007].

    The General Household Survey is conducted every year and involves approximately 13,000 adults in Great Britain. It provides government departments with information on a wide range of topics. On health, questions cover acute illness in the last 2 weeks; illness over the last year; presence of chronic illness; consultations with a doctor; visits to hospital; and smoking and alcohol consumption. The General Household Survey has indicated an increase in self-reported long-standing illness.

    2.4 Primary care interactions

    In the UK, 90% of reported ill health is captured at GP practice level, so primary care data are very important, especially when the iceberg effect is considered (see Figure 1). They could offer a huge amount of information on the health of the population, but have historically been very difficult to access. Most GP practices are now computerised, and in some areas may allow public health staff direct access to the data. Changes in the NHS have resulted in improved primary care data collection.

    QMAS:

    http://www.connectingforhealth.nhs.uk/delivery/programmes/qmas [accessed 28/11/2007].

    In operation since 2004, the Quality Management and Analysis System (QMAS) supports a new GP contract under which payment depends partly on quality of care as well as on numbers of patients registered. Almost all practices are submitting data. Its primary function is financial, and it awards payment under the Quality and Outcomes Framework (QOF). The database is replicated for other purposes, including public health. The replica is managed by the Prescribing Support Unit, based at the Information Centre for Health and Social Care http://www.ic.nhs.uk/psu/services/QOF [accessed 28/11/2007].

    Data are collected on a number of clinical domains, which vary over time.  They currently include:

    • CHD

    • Stroke

    • Hypothyroidism

    • Diabetes

    • Hypertension

    • Mental health

    • COPD

    • Asthma

    • Epilepsy

    Uses

    QOF data enable local prevalence to be estimated for the conditions they cover. These estimates may be compared with other prevalence studies such as the Health Survey for England, as exemplified in the LHO report http://www.lho.org.uk/viewResource.aspx?id=10070 [accessed 28/11/2007].
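
    A crude prevalence estimate of this kind is the number of patients on a practice disease register divided by the practice list size. The sketch below adds a 95% confidence interval using the Wilson score method, which is one common choice and not necessarily the method used in the LHO report. The register and list sizes are hypothetical.

```python
import math


def prevalence_with_ci(register_count, list_size, z=1.96):
    """Return (prevalence, lower, upper) as proportions, using the Wilson score interval."""
    if list_size <= 0 or not 0 <= register_count <= list_size:
        raise ValueError("need list_size > 0 and 0 <= register_count <= list_size")
    p = register_count / list_size
    denom = 1 + z**2 / list_size
    centre = (p + z**2 / (2 * list_size)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / list_size + z**2 / (4 * list_size**2)
    )
    return p, centre - half_width, centre + half_width


# Hypothetical practice: 350 patients on a disease register, list size 8,000.
p, lower, upper = prevalence_with_ci(register_count=350, list_size=8_000)
print(f"prevalence {100 * p:.2f}% (95% CI {100 * lower:.2f}% to {100 * upper:.2f}%)")
```

    Because the register has no age/sex breakdown, an estimate like this is unadjusted and should be compared between practices with caution.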

    Strengths

    The new GP contract gives GPs an incentive to improve completeness of data. It encourages the establishment of disease registers. There are incentives to identify more registered patients who need to be on disease registers and to receive treatment.

    Weaknesses

    Raw data are not available to PCTs or public health departments; only some pre-analysed data are available. These are limited: there is no age/sex/ethnicity breakdown; comparative analyses may be inappropriate; there is no information on co-morbidity; and there is significant under-recording of some indicators.

    General Practice Research Database (GPRD)

    http://www.gprd.com/home/ [accessed 28/11/2007]

    This is a proprietary product, only accessible to public health departments on a fee-paying basis. It is a longitudinal anonymised database that claims to be the largest source of computerised information on morbidity and prescription activity
    in GP practices, holding data from 1987 to present.  Participating practices agree guidelines on recording clinical data.

    Strengths

    Quality continually assessed.

    Available for research questions.

    Standards for recording allow collation.

    Weaknesses

    Incomplete - only a small proportion of practices across the country (450 or so) take part, and these are self-selecting.

    2.5 Secondary care interactions

    In the NHS, data on patients' interactions with the secondary (i.e. hospital) care services is recorded on statutorily defined datasets. These are recorded by providers and exchanged with commissioners of care via electronic clearing houses.
    The data collected varies according  to the type of  interaction - outpatient attendance, admitted care, waiting for elective admission, A&E. In each case the data to be collected is set out in the NHS Data Dictionary http://www.connectingforhealth.nhs.uk/systemsandservices/data/datamodeldictionary/index_html
    [accessed 28/11/2007].

    The mechanism of exchange has varied over time, the former NHS Wide Clearing Service (NWCS) being replaced by the Secondary Uses Service (SUS), though at present the data flows remain similar.

    Primary Care Trusts receive data on a monthly basis. Mostly these are measures of activity, but data on admitted patients also include clinically coded diagnoses, operative procedures and ethnicity, which enable determination of met need and assist in the analysis of health need and inequalities.

    Uses

    NWCS/SUS  data can help to identify health needs of the local population. It can be used to  determine patient flows for treatment and  contribute to the analysis of health outcomes.

    The most direct use of NWCS data is for monitoring contracts between primary care trusts and hospital providers. Under Payment by Results, hospitals are paid for the activity they undertake. Payment by Results is underpinned by Healthcare Resource Group (HRG) codes. HRGs are groupings of conditions and procedures that are clinically similar and consume similar levels of resource. A national tariff is applied to each HRG code and updated each year. http://www.kingsfund.org.uk/resources/briefings/payment_by.html [accessed 28/11/2007].
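
    In outline, the payment due for a period is the sum, over HRG codes, of the activity recorded multiplied by the tariff for that code. The sketch below shows that arithmetic only. The HRG codes and prices are placeholders rather than real tariff figures, and any adjustments applied to the tariff in practice are not modelled.

```python
def payment_due(activity_by_hrg, tariff_by_hrg):
    """Sum activity x tariff over HRG codes; raise if a code has no tariff."""
    total = 0
    for hrg, episodes in activity_by_hrg.items():
        if hrg not in tariff_by_hrg:
            raise KeyError(f"no tariff for HRG code {hrg!r}")
        total += episodes * tariff_by_hrg[hrg]
    return total


# Placeholder codes and prices in whole pounds (not real HRGs or tariffs).
tariff = {"HRG_A": 1_200, "HRG_B": 450, "HRG_C": 3_000}
activity = {"HRG_A": 10, "HRG_B": 40, "HRG_C": 2}

total = payment_due(activity, tariff)
print(f"payment due: £{total:,}")
```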

    NWCS/SUS data is cleaned and collated on a national basis to create HES data.

    Strengths

    Timely data; components of NWCS data are made available to PCTs by hospital trusts on a monthly basis.

    The system by which it is delivered allows questioning and challenging of the data before a certain date each month.

    Weaknesses

    Clinical coding may be of variable quality. Recording of ethnicity is sometimes a problem.

    Outpatient datasets do not contain diagnostic information.

    Exchange of A&E data, although mandatory, often does not happen because the defined dataset is not considered particularly useful, so only a partial picture of A&E activity is available.

    Hospital Episode Statistics (HES)

    http://www.hesonline.nhs.uk/Ease/ [accessed 28/11/2007]

    Description

    HES is a nationwide dataset of all hospital admissions, recorded using computerised Patient Administration Systems. Each record is a subset of the record submitted by the provider to NWCS/SUS, and so is defined in the NHS Data Dictionary. It is generally issued on an annual basis, though provisional data are now issued quarterly. Each record represents a period of care under the responsibility of a specific consultant, known as a 'finished consultant episode'. A patient could have several episodes within one spell in hospital. There are hundreds of fields, including data on maternities and on augmented care for patients whose care involves the use of intensive and high dependency care facilities. Data include:

    • NHS Number

    • Full postcode

    • Date of birth

    • Sex

    • Ethnicity (more complete with each year)

    • Diagnosis fields

    • Procedures fields

    • Registered GP

    • PCT of residence

    • Date of admission

    • Date of discharge

    • Method of admission

    • Method of discharge
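
    Because one hospital spell can contain several finished consultant episodes, counting admissions usually means collapsing episodes into spells first. The following sketch assumes, as a simplification, that episodes sharing the same patient identifier and admission date belong to one spell. The field names are hypothetical and are not actual HES field names.

```python
from collections import defaultdict


def group_into_spells(episodes):
    """Group episode records into spells keyed by (patient_id, admission_date)."""
    spells = defaultdict(list)
    for episode in episodes:
        key = (episode["patient_id"], episode["admission_date"])
        spells[key].append(episode)
    # Order the episodes within each spell by their sequence number.
    for records in spells.values():
        records.sort(key=lambda e: e["episode_order"])
    return dict(spells)


# Hypothetical episode records: patient P1 has two spells, the first with two episodes.
episodes = [
    {"patient_id": "P1", "admission_date": "2006-04-01", "episode_order": 2},
    {"patient_id": "P1", "admission_date": "2006-04-01", "episode_order": 1},
    {"patient_id": "P1", "admission_date": "2006-09-15", "episode_order": 1},
    {"patient_id": "P2", "admission_date": "2006-04-01", "episode_order": 1},
]

spells = group_into_spells(episodes)
print(f"{len(episodes)} episodes in {len(spells)} spells")
```

    Counting episodes instead of spells would overstate the number of admissions in this example (four episodes, three spells).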

    Public Health Observatories are considered  'HES safe havens' and get access to sensitive fields on a national basis.  Strict protocols are in place for how data can be disseminated including clear suppression rules to prevent
    disclosure. Various pre-analysed reports are available free online.

    Dr Foster intelligence [accessed 28/11/2007] has access to NWCS and HES data and provides data via web-based tools on hospital and PCT activity.

    Uses

    At a national level, health service planning, monitoring activity, and assessing quality of care.

    At a local level (e.g. PCT) NWCS/SUS data are more timely, and HES is mostly used to enable national comparisons and benchmarking. Comparisons can be made between areas, for example, by calculating standardised rates.

    Monitoring inequalities across geographical areas, including ward level.

    Examples of outputs can be found at http://www.lho.org.uk/DATAANDMETHODS/Local_Data/HES_Analyses.aspx [accessed 28/11/2007].
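
    One way of producing the standardised comparisons mentioned above is indirect standardisation: reference (for example national) age-specific admission rates are applied to the local population to give an expected number of admissions, and the observed count is divided by that expectation. A minimal sketch, with invented age bands, rates and counts:

```python
def standardised_ratio(observed, local_population, reference_rates):
    """Return (ratio x 100, expected count) by indirect standardisation."""
    if len(local_population) != len(reference_rates):
        raise ValueError("population and rates must cover the same age bands")
    # Admissions expected if the local population experienced the reference rates.
    expected = sum(n * r for n, r in zip(local_population, reference_rates))
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return 100 * observed / expected, expected


# Hypothetical local population by age band and reference admission rates per person.
local_population = [10_000, 5_000, 2_000]
reference_rates = [0.002, 0.01, 0.05]

ratio, expected = standardised_ratio(204, local_population, reference_rates)
print(f"expected {expected:.0f} admissions, standardised ratio {ratio:.0f}")
```

    A ratio above 100 means more admissions were observed than the reference rates would predict for a population of that age structure.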

    Strengths

    Completeness of data increasing.

    Standard codes (ICD10 and OPCS4) used for diagnoses and procedures.

    For serious morbidity, can give prevalence of the condition across the country.

    Recently becoming linked with ONS mortality data

    Weaknesses

    HES and NWCS/SUS data only tell us about those who have a disease and then use health care facilities. They do not give a full picture of morbidity. Specifically, they do not tell us about people who have a disease or disability but do not seek care (see the iceberg effect and Figure 1). Some fields are very incomplete: for example, whilst completeness of the ethnicity field has been increasing each year, a significant proportion (10 - 15%) is filled with 'not stated'.

    Outpatient data became available via HES in 2007 but comes with limitations.  Diagnostic data are not completed.  Other limitations can be found here:
    http://www.hesonline.org.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=805


    Timeliness - there used to be quite a delay between NWCS outputs and HES. The delay is now shorter because provisional data are made available on a quarterly basis, before ministerial sign-off.

    Cancer registries

    http://www.thames-cancer-reg.org.uk/ [accessed 28/11/2007]

    In the United Kingdom, there are 12 registries  (geographically defined) and each contributes to the National Cancer Registry overseen by the Office for National Statistics.  Cancer registries were set up to collate new cases of
    cancer and use this information to produce statistics about cancer incidence, prevalence, survival and mortality.  In recent years the work of cancer registries has expanded from the monitoring of cancer occurrence to include the analysis of
    different aspects of cancer prevention, treatment outcomes and care.

    Uses

    Patient follow up.

    Auditing treatment, comparing with other treatment outcomes.

    Evaluation of services.

    Studies of causation.

    Health service planning.

    Strengths

    Very rich, detailed source of information - patient identifiable information which is longitudinal i.e. updated over time. 

    Weaknesses

    Expensive to run - updating the registry is laborious.

    Confidentiality issues

    Assessing completeness/under-coverage is not straightforward [Brenner et al]

    More recently, more rigorous methods have been developed to assess completeness: http://www.thames-cancer-reg.org.uk/research/pubs/research_activities_2004.pdf
    [accessed 28/11/2007].

    Completeness of ethnicity could be improved: http://www.biomedcentral.com/1471-2458/6/281/abstract [accessed 28/11/2007].
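
    The two-source capture-recapture approach evaluated by Brenner et al. (see References) estimates the true number of cases from two overlapping sources, on the assumption that the sources are independent and that records can be matched reliably. The sketch below uses the simple Lincoln-Petersen form of the estimator, which may differ in detail from the estimator in that paper. The case counts are hypothetical and the function name is illustrative.

```python
def capture_recapture(n_registry, n_other, n_both):
    """Two-source (Lincoln-Petersen) estimate of total cases and registry completeness.

    n_registry: cases known to the registry
    n_other:    cases found in a second, independent source
    n_both:     cases found in both sources
    """
    if n_both <= 0:
        raise ValueError("need at least one case found in both sources")
    if n_both > min(n_registry, n_other):
        raise ValueError("overlap cannot exceed either source")
    estimated_total = n_registry * n_other / n_both
    completeness = n_registry / estimated_total
    return estimated_total, completeness


# Hypothetical counts: 900 registry cases, 500 in a second source, 450 in both.
estimated_total, completeness = capture_recapture(900, 500, 450)
print(f"estimated cases: {estimated_total:.0f}, completeness: {100 * completeness:.0f}%")
```

    If the two sources are positively dependent (cases in one are more likely to be in the other), this estimator will tend to overstate completeness, which is one reason the assessment is described as not straightforward.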

    Other examples of registries:

    Congenital anomalies

    Industrial diseases

    Diabetes

    Issues to consider when establishing a registry

    • Why
    • What disease
    • Clear case definition
    • System for reporting new events
    • What is to be stored
    • What is to be reported
    • Who will handle the analyses
    • Who is taking enquiries
    • Who is producing reports
    • Financial implications
    • Confidentiality and ethical issues
    • Maintaining quality

    Other examples of routine morbidity data

    Notifications of communicable diseases

    Notifications of foetal anomalies

    Abortion statistics

    Korner data, or Central Returns, are available at aggregated level only and are the NHS's only source of community data. Significant datasets for public health are:

    • KC60 - GUM clinic activity

    • KC50 - immunisation data

    • KC53, 63 - adult screening, cervical and breast

    • Mental health minimum dataset

    The Mental Health Needs Index (MINI) provides an estimate of the need for inpatient mental health services for adults (ages 16-59) by ward and borough. It is calculated using a number of population variables likely to indicate need for access
    to services, such as deprivation; proportion of economically active adults unemployed; proportion of adults living in households not self contained etc. The MINI provides both predicted admission rates and a ratio of need compared to the England
    average. The MINI was developed by the Centre for Public Mental Health which has produced an online tool for accessing information on a ward / borough level. http://www.dur.ac.uk/mental.health/index.php?l1=1&l2=27&s=27
    [Accessed 11/1/2008]

    Values and limitations of routine data

    Value of routine data

    • Readily available
    • Limited costs
    • Up to date
    • Useful for identifying hypotheses
    • Useful for initial assessment
    • Provides baseline data on expected levels of health/disease

    Limitations of routine data

    • Lack of completeness
    • Potential for bias
    • Limited details of determinants such as income and ethnicity
    • Often poorly presented and analysed
    • Occasionally subject to political influences and manipulation

     

    The most valuable feature of routine data is their availability at little cost to the researcher. They may be especially helpful in establishing baseline characteristics of the health status of the community, in generating hypotheses from variation by sex, age, cohort or geography, and in identifying potential areas requiring further research. However, no single data set can provide the whole picture of a population's health and its needs.

    Ad hoc data

    Local health surveys may be carried out by PCTs when national surveys do not give sufficient information at local level.

     

     

    References

    H Brenner, C Stegmaier and H Ziegler. Estimating completeness of cancer registration: an empirical evaluation of the two source capture-recapture approach in Germany. J Epidemiol Community Health 1995; 49: 426-430.

    © M Goodyear & N Malhotra 2007