Learning objectives: You will learn about key questions and issues to consider for population health data, accessing demographic data, and national references and associated data.
Studying health and sickness within a population requires data from a variety of sources. This section covers the types of issues that you need to be aware of, and potential national data sources for demographic and other related topics.
Key questions and issues to consider for population health data
Before using a dataset, appraise both its source and the quality of its data. Key questions to consider for any dataset that provides information on population health are as follows:
- Who undertook the collection/collation and how was it collected?
- Does the dataset measure the item actually required, or only a proxy for it?
- Is the dataset comparative, what are the comparators, and are they appropriate?
- If the dataset presents rates or ratios, have appropriate techniques been used to control for differing population structures?
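The last question above concerns standardisation. As a minimal sketch, assuming direct age standardisation is the technique of interest (the text does not name one), the Python example below weights each age-specific rate by a standard population. The function names, the three age bands and all figures are hypothetical and chosen only for illustration.

```python
def crude_rate(events, populations, per=100_000):
    """Crude rate: total events divided by total population."""
    return per * sum(events) / sum(populations)


def directly_standardised_rate(events, populations, standard_population,
                               per=100_000):
    """Directly age-standardised rate.

    Each age-specific rate (events / population) is weighted by the
    size of the same age band in a chosen standard population.
    """
    if not (len(events) == len(populations) == len(standard_population)):
        raise ValueError("inputs must have one entry per age band")
    weighted_events = sum(
        (d / n) * w
        for d, n, w in zip(events, populations, standard_population)
    )
    return per * weighted_events / sum(standard_population)


# Hypothetical data: three age bands (young, middle, old).
# Both areas have identical age-specific rates, but area B is much older.
area_a_events, area_a_pop = [10, 40, 150], [10_000, 10_000, 5_000]
area_b_events, area_b_pop = [5, 20, 300], [5_000, 5_000, 10_000]
standard = [50_000, 30_000, 20_000]  # hypothetical standard population

for name, ev, pop in [("A", area_a_events, area_a_pop),
                      ("B", area_b_events, area_b_pop)]:
    print(name,
          "crude:", crude_rate(ev, pop),
          "standardised:", directly_standardised_rate(ev, pop, standard))
```

In this made-up example the crude rate of area B is roughly double that of area A purely because its population is older. The standardised rates are identical, which is the comparison the question asks the reader to check for.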
In addition, there are eight issues to be addressed regarding the data:
- accuracy - to what extent are the present data correct?
- precision - have appropriate measures of uncertainty been included (e.g. 95% confidence intervals)?
- completeness - what proportion of the data are missing?
- timeliness - what period do the data refer to, and how relevant is that to the current position?
- coverage - is the whole population of interest represented, and if not, what fraction makes up the sample?
- accessibility - who has access to the data, and is it controlled (e.g. via password-restricted access, public domain)?
- confidentiality/suppression/disclosure control - there are strict regulations preventing the publication of datasets that might, when used in combination with other available sources, enable individuals to be identified. These regulations may impact on datasets where further breakdowns result in a low number of individuals being included in the category. Are the regulations followed?
- original purpose of collection/collation - under Data Protection legislation, personal data may only be used for the purposes for which it was collected. NHS data registrations generally include improvement of the health of the population and management of the NHS among their purposes, but non-NHS data may not. In addition, change of purpose may be a source of bias in the data.
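Two of the issues above, precision and disclosure control, lend themselves to a short illustration. The sketch below is not a prescribed method: it assumes the Wilson score interval as one reasonable way to attach a 95% confidence interval to a proportion, and it assumes a suppression threshold of 5, which is a hypothetical value (the applicable regulations or local policy set the real one). All names and figures are illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def suppress_small_counts(counts, threshold=5, marker="*"):
    """Replace non-zero counts below `threshold` with a marker.

    The threshold and the treatment of zeros are assumptions here.
    Secondary suppression (stopping a hidden cell being recovered
    from row or column totals) is not handled.
    """
    return {
        category: (marker if 0 < count < threshold else count)
        for category, count in counts.items()
    }


# Hypothetical example: 50 cases observed among 100 people surveyed.
low, high = wilson_interval(50, 100)
print(f"proportion 0.50, 95% CI {low:.3f} to {high:.3f}")

# Hypothetical ward-level counts, one of which is too small to publish.
print(suppress_small_counts({"ward_1": 120, "ward_2": 3, "ward_3": 0}))
```

A narrow interval indicates a precise estimate. A wide one warns that breakdowns into small categories, which are also the ones most likely to need suppression, should be interpreted cautiously.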
Accessing demographic data
Data on the general population and its characteristics can be found through a number of sources, including surveys. However, the most important source at national level is the census, which has been carried out every ten years since 1801, with the exception of 1941. Data are collected on individuals and on households, and the exact dataset varies from census to census. Individual characteristics gathered include age, gender, marital status, country of birth, education level and occupational status. Household data include size and structure, amenities and type of tenure.
In addition to census data, there are other national data collections that can assist in understanding population health and associated characteristics. Within England, the Exeter system stores information at individual patient level on registration with general practitioners (GPs). It is managed by the National Strategic Tracing Service (http://www.connectingforhealth.nhs.uk/nsts).
The dataset contains information on:
- NHS Number
- date of birth
- place of birth
- GP and GP practice with which the patient is registered
- primary care trust (PCT) in which the patient is registered
The Exeter system was mainly used to pay GPs on the basis of list capitation. Additional uses include studying migration across GP practices, providing GPs with a register, analysing the deprivation of registered patients to calculate primary care resource allocation, and understanding local populations.
Strengths: This dataset enables practice, PCT and health observatory population profiling, and by using the postcodes, it is possible to map the data to local authority of residence. Local authorities (LAs) do not have equivalent databases of their residents, and the picture of the population Exeter makes available for collaborative work between NHS and LAs can be enormously useful.
Weaknesses: GP lists are inflated on average by 5.7%, due to mobility among young adults and delays in removing list members after death or emigration. Vulnerable populations such as homeless people, asylum seekers, and some migrant workers tend not to be registered with GPs, so are missing from the Exeter system. Place of birth, which might be useful in ethnic analyses, is a free text field, and may vary from 'home' to country to detailed address.
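As a rough illustration of what list inflation means for population estimates, the sketch below deflates a registered list size by the 5.7% average quoted above. It assumes the figure means that lists are 5.7% larger than the true resident population, and that the national average applies locally. Neither assumption is confirmed by the text, and inflation is likely to vary by area and age group. The function name and the example list size are hypothetical.

```python
def deflate_list_size(registered, inflation=0.057):
    """Estimate the resident population behind an inflated GP list.

    Assumes registered = resident * (1 + inflation).
    """
    if registered < 0 or inflation <= -1:
        raise ValueError("registered must be >= 0 and inflation > -1")
    return registered / (1 + inflation)


# Hypothetical practice with 10,570 registered patients.
estimate = deflate_list_size(10_570)
print(round(estimate))
```

The adjustment corrects only for over-counting. It cannot add back the unregistered groups described above, who are missing from the list altogether.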
National Reference Data
Population health data can be complemented by analysis of factors that may be associated with health and may be classed as determinants of health status. A review of the social determinants was published by the World Health Organisation and outlined areas such as occupational class, work, stress, social exclusion, addiction, transport and food as potential determinants (http://www.euro.who.int/document/e81384.pdf). Although there may be an overwhelming number of data sources and potential areas for consideration, these may be usefully combined as a basket of indicators to assist with analysing population health. Three particular areas of interest covered in this section are education, employment, and housing data.
Information on educational attainment may be accessed for primary schools through Key Stage 2 tests in English, Mathematics and Science. GCSE results are used to compile achievement levels at secondary school level. The results are published as the school and college achievement and attainment tables, which are available at individual school, local education authority (LEA) and national level (www.dcsf.gov.uk).
Key statistics that may be of interest include:
- examination results
- community projects/initiatives, including Sure Start partnerships, City Academies and School Achievement Awards
- early education, including the number of 3- and 4-year-olds taking up free early education places
- absences and permanent exclusions
Employment data are available through the Labour Force Survey, which is carried out quarterly in England. It measures employment status in three internationally recognised categories: employed, unemployed and economically inactive (http://www.statistics.gov.uk/StatBase/Source.asp?vlnk=358&More=Y). A measure of unemployment is also available through the claimant count, a monthly data collection that offers precision at a geographical level as well as a measure of flow, because the data include the National Insurance number. It is also possible to analyse vacancy and earnings data at ward level through a service provided by the Office for National Statistics (https://www.nomisweb.co.uk/Default.asp).
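The three labour force categories combine into the headline rates in a specific way that is easy to get wrong: under the standard international definitions, the unemployment rate is taken over the economically active population (employed plus unemployed), not the whole population. The sketch below shows the arithmetic. The function name and the counts are hypothetical, and the counts are assumed to refer to a single reference age group.

```python
def labour_market_rates(employed, unemployed, inactive):
    """Headline rates from the three labour force categories."""
    active = employed + unemployed   # economically active population
    total = active + inactive        # whole reference population
    if active <= 0:
        raise ValueError("economically active population must be > 0")
    return {
        "employment_rate": employed / total,
        "unemployment_rate": unemployed / active,  # denominator is `active`
        "inactivity_rate": inactive / total,
    }


# Hypothetical area with 1,000 people in the reference age group.
rates = labour_market_rates(employed=600, unemployed=40, inactive=360)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Because the denominators differ, the three rates do not sum to 100%, which is worth remembering when comparing figures drawn from different sources.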
Information on housing and homelessness can be accessed through statistics from a number of sources. One main source is the research and statistics department of Communities and Local Government, which may provide data on housing tenure, homelessness and occupancy rating (http://www.communities.gov.uk/housing/housingresearch/).