Section 6: Analysing Qualitative Data

Principles of Qualitative Methods: Section 6: Analysing Qualitative Data

General considerations

No general consensus exists amongst qualitative researchers concerning the process of data analysis. Rather, what exists is a plurality of analyses and interpretations, which reflect the particular theoretical perspective or tradition within which the researcher is working. This is one of the many ways in which qualitative research methods differ significantly from quantitative approaches. In the latter, there is essentially only one route from data to conclusions, namely statistical analysis, and the options available to the researcher largely concern the appropriateness of the statistical test relative to the size and type of the sample. However, many of the qualitative methods textbooks do attempt to identify some general features that are common to the analytical phase of qualitative research; these include the following:

  • Some form of review of all the information to gain an initial sense of the data. These initial ideas might then be fed back to the informants for verification purposes.
  • The process of organising the data into some manageable form. This is often described as 'reducing the data', and usually involves developing codes or categories. However, as will be argued below, this process can be potentially problematic if the desire of the researcher is to maintain the unique richness of qualitative forms of data.
  • Interpreting the data.
  • Finally, presenting it in some form, be it as tables, text or diagrams.

Having identified these broad stages, it should nevertheless be stated that the process of qualitative analysis is not linear but rather continuous and iterative (Ritchie, Spencer and O'Connor:2003:219). That is, it is an emergent analytical process which moves backwards and forwards between the data and analytical concepts, refining and synthesising the latter as more data becomes available. As has been consistently asserted above, the theoretical approach that informs a piece of qualitative research will essentially determine the process by which the data is to be analysed. However, what is common to all qualitative research is the analytical process known as induction (as distinct from deduction, which works from an established hypothesis or theoretical model). Induction means interpreting the data in order to derive some theoretical framework or working hypothesis, proposition, or 'essence' of the social processes under investigation.

Managing Data: The process of indexing/coding/labelling the data

The process of coding, or more accurately (in the case of qualitative research) the indexing and linking of those elements of the data that are perceived as sharing some commonality, is an essential first step in managing the analytical process. However, this process of coding can all too often implicitly reflect the researcher's (pre-) conceptualisations of the social phenomena under investigation. Therefore the first step in avoiding such opaqueness in the process of data analysis must be '…not (to) confuse coding itself with the analytical work of developing conceptual schemes' (Coffey and Atkinson:1996:27). As the preceding comments suggest, there are significant differences in the approaches to coding that are adopted within qualitative research; two of these approaches are discussed below.

Codes can be used to simplify or reduce transcript data to manageable levels, the purpose being to achieve a simple conceptual schema. This process usually involves the exclusive index coding of segments of data text, so that segments sharing a common code can eventually be retrieved together. This 'code-and-retrieve' procedure becomes a method for treating the data in a 'quasi-quantitative' way. Alternatively, coding can be used as a method to open up the data, enabling the researcher to think or conceptualise beyond the data itself. In this sense, codes act as a heuristic device linking particular areas of the data together with the researcher's pre-existing sets of concepts in order to expand those conceptual frameworks. Here, codes serve to stimulate the expansion, transformation and re-conceptualisation of the data in thought.
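The 'code-and-retrieve' procedure described above can be illustrated with a minimal sketch. The segments, code names and function names below are hypothetical and are not taken from any study or software package; the sketch assumes the exclusive form of coding, in which each segment carries exactly one code.

```python
from collections import defaultdict

# Hypothetical transcript segments. Under exclusive coding,
# each segment is paired with exactly one code.
segments = [
    ("I stopped going to the clinic after the move.", "access"),
    ("The nurse always explained things clearly.", "communication"),
    ("There is no bus to the surgery on Sundays.", "access"),
]


def build_index(coded_segments):
    """Group segment texts under the single code assigned to each."""
    index = defaultdict(list)
    for text, code in coded_segments:
        index[code].append(text)
    return dict(index)


def retrieve(index, code):
    """Return every segment that shares the given code."""
    return index.get(code, [])


index = build_index(segments)
access_segments = retrieve(index, "access")

# Counting segments per code is what makes the procedure 'quasi-quantitative'.
code_counts = {code: len(texts) for code, texts in index.items()}
```

Because each segment can sit under only one code, anything else the segment might have said is lost at retrieval; this is the limitation that the non-exclusive indexing discussed later in this section is intended to address.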

As Coffey and Atkinson (1996) have argued, in practice coding is usually a mixture of data reduction and data complication; their point being that codes or labels are the creation of the particular researcher and should therefore be conceived of as nothing more than tools to generate ideas with. Nevertheless, all too often qualitative researchers do end up selectively analysing the data generated in their fieldwork. This outcome is a consequence of the use of exclusive coding procedures and, in addition, of employing de-contextualised quotations from interview subjects (Frankland and Bloor:1999). However, because data analysis is an essentially comparative exercise, researchers should always seek to derive propositions that can be applied across all of the data / transcript items, not just selected ones.

Two Approaches to Analysing Qualitative Data

Analytical Induction (AI)

In the analytical approach known as analytic induction (AI) or 'deviant case analysis', each section of the transcript (in the case of interviews or focus group discussions) or of the notes of an observation is not assigned a single code in some 'final and arbitrary interpretative act'; rather, indexing is merely the first stage in the process of analysis (Coffey and Atkinson:1996:76). Each segment is inductively assigned several non-exclusive index-codes or 'labels' which refer to the various analytic topics or 'themes' that emerge during an interview. Initially these labels will be general, but they become progressively more elaborate as more data is examined. Once indexing is completed, the next stage in the analytical process can occur: making systematic comparisons within and between the labelled transcript data. In AI, generalisability of the final analytical propositions is achieved by focusing on the 'deviant' or contradictory indexed items that emerge at this stage. Here the attempt is made to modify the initial analytical themes in order to embrace these deviant cases. This procedure is essential to guard against selective attention to data and to provide a more systematic means of extending analytic thinking.

There are clear merits to this deviant case approach applied to interview data analysis, particularly in its drawing attention to the importance of contradictions as being indicative of an important dynamic at work rather than some aberrant occurrence or utterance that cannot be fitted into a code.
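As a rough illustration of non-exclusive indexing and the search for deviant cases, the following sketch uses invented segment identifiers, labels and a made-up working proposition. It shows only the mechanical step of locating contradictory items; the analytical work of revising the themes to embrace them remains with the researcher.

```python
# Hypothetical indexed segments: each carries several non-exclusive labels.
segments = [
    {"id": "int01-s3", "labels": {"cost", "non-attendance"}},
    {"id": "int02-s1", "labels": {"cost", "non-attendance", "transport"}},
    {"id": "int03-s5", "labels": {"cost", "attendance", "family support"}},
    {"id": "int04-s2", "labels": {"transport", "attendance"}},
]


def with_label(indexed, label):
    """Return every segment indexed under the given label."""
    return [s for s in indexed if label in s["labels"]]


def deviant_cases(indexed, condition, expected):
    """Return segments that carry `condition` but lack `expected`.

    These contradict a working proposition of the form
    'where `condition` is discussed, `expected` is also reported'.
    """
    return [s for s in with_label(indexed, condition)
            if expected not in s["labels"]]


# Hypothetical working proposition: talk about cost goes with non-attendance.
deviants = deviant_cases(segments, "cost", "non-attendance")
deviant_ids = [s["id"] for s in deviants]

# The other labels on a deviant segment suggest how the proposition
# might need to be modified to embrace the case.
other_labels = sorted(deviants[0]["labels"] - {"cost"})
```

Here the single contradictory segment also carries the label 'family support', which in a real analysis might prompt the researcher to refine the proposition rather than discard the case.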

'Framework'

This approach to analysis has been developed over time by the National Centre for Social Research (Ritchie & Lewis:2003). The term 'framework' derives from the 'thematic framework' which is the central component of this approach to data management and interpretative analysis. The thematic framework is utilised to classify and organise data according to key themes, concepts and emergent categories.

Each research study requires its own distinctive thematic framework comprising a series of main themes, subdivided by a succession of related sub-themes or topics. These categories evolve and are refined (as an iterative process) through the researcher's familiarisation with the raw data and the subsequent cross-sectional labelling or 'tagging' (essentially the same process that was described in the discussion of AI above). Once the researcher judges that they have a comprehensive list of main and sub-themes, each main theme is then 'charted' or displayed in its own matrix. Each research subject is allocated a row, with each column representing a separate subtopic. The final stage of this data management component of 'framework' involves summarising or synthesising the original data from each case (subject) within the appropriate parts of the thematic framework.
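The charting step can be pictured as filling in a simple matrix. The sketch below is an illustration only, with an invented theme, subtopics, subject identifiers and summaries; it is not the National Centre for Social Research's own tooling. It assumes one chart per main theme, one row per subject and one column per subtopic.

```python
# Hypothetical main theme and its subtopics (the columns of the chart).
theme = "Experiences of primary care"
subtopics = ["access", "communication", "follow-up"]

# Hypothetical summaries as (subject, subtopic, summary) triples.
# Quoted words stand in for the subjects' own key terms, kept as far as possible.
summaries = [
    ("P01", "access", "Clinic is 'miles away'; relies on a neighbour for lifts."),
    ("P01", "communication", "Felt 'talked at' by the GP."),
    ("P02", "access", "Walks to the surgery; no difficulty reported."),
    ("P02", "follow-up", "Never received the promised call back."),
]


def build_chart(columns, entries):
    """Build one matrix: a row per subject, a cell per subtopic."""
    chart = {}
    for subject, subtopic, summary in entries:
        if subtopic not in columns:
            raise ValueError(f"Unknown subtopic: {subtopic}")
        # Every row starts with an empty cell for each subtopic.
        row = chart.setdefault(subject, {c: "" for c in columns})
        # Append rather than overwrite if the cell already holds a summary.
        row[subtopic] = (row[subtopic] + " " + summary).strip()
    return chart


def column(chart, subtopic):
    """Read down one column to compare all subjects on a subtopic."""
    return {subject: row[subtopic] for subject, row in chart.items()}


chart = build_chart(subtopics, summaries)
access_column = column(chart, "access")
```

Reading along a row keeps each case intact, while reading down a column supports comparison across cases; empty cells are left visible rather than dropped, so gaps in the data remain apparent.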

The authors (Ritchie, Spencer and O'Connor:2003) recognise that whilst it is necessary to distil the data to a manageable level in order to subsequently engage in the interpretative process of analysis leading to a description and explanation of the social phenomenon under investigation, the essence of the original qualitative data has to be retained. They argue that three requirements are essential in achieving this aim. First, that key terms, phrases or expressions used by the research subjects should be retained as much as possible. Second, interpretation should be kept to a minimum in the data management stage. Third, material should not be dismissed as irrelevant just because its inclusion in the analysis is not immediately obvious at this stage - 'It may well be that issues that make little sense at this early stage of analysis become vital clues in the later interpretative stage of analysis' (Ritchie, Spencer and O'Connor:2003:229).

Once the process of managing the data within the 'chart' or matrix has been completed, the next stage of descriptive analysis can occur. This involves three key steps, labelled 'detection', 'categorisation' and 'classification', which are described below (Ritchie, Spencer and O'Connor:2003:237). The process begins by identifying the substantive content and dimensions of the phenomenon under investigation. This involves looking across the full range of cases with the aim of distilling the key dimensions within the range of perceptions, experiences or behaviours that have been labelled within a theme and charted in the matrix. This process should result in more refined categories that incorporate and discriminate between the different manifestations of the data. The data is then assigned to these new categories. Descriptive research analyses can be based purely at this level of categorisation.

The researcher can however go on to refine these categories further and construct more abstract classifications by which to encapsulate and present the data. Typologies (multidimensional or multifactorial classifications) may be subsequently established in order that an even more refined characteristic / dimension of the phenomenon under investigation can emerge. More usually typologies relate to characteristics of groups of people but they can be used to describe any type of phenomenon such as social systems and structures or forms of events (Ritchie, Spencer and O'Connor:2003:248).

It should be noted however that because these more abstract categorisations and classifications are being utilised in order to describe the form or nature of a social phenomenon (events, attitudes, norms and systems, or the characteristics of different social groups within a population) they have to be conceptually coherent - 'It is a very poor sign of the interrogative powers of the analyst if the descriptive and classificatory analyses are ill defined, meaningless or tangled' (Ritchie, Spencer and O'Connor:2003:244).

A note on computer-assisted qualitative data analysis software (CAQDAS)

The use of computer data analysis programmes such as 'Ethnograph' (designed for ethnographic studies) and NUD.IST (designed for theory generation in grounded theory) is becoming increasingly widespread, and such programmes have clear advantages when analysing a large text database, i.e. 30+ interviews or extensive field notes. These programmes provide an organised file storage system, meaning that data can be quickly located, and enable a systematic search for 'themes' in the data. The purpose of the software is to attach code words to discrete strips of data. This then allows the researcher to retrieve all instances in the data that share a code. Nevertheless, there are some important concerns attached to the universal use of CAQDAS. Coffey and Atkinson have argued that 'the presuppositions and procedures that are inscribed in contemporary software for qualitative data analysis are implicitly driving a renewed orthodoxy' (1996:1.4). Additionally, whilst CAQDAS does offer a useful way of organising data in order to facilitate searching, such coding for use with computer programmes is not in itself analysis.

© ICrinson & MLeontowitsch 2006