
Evaluation methodological approach


Analysis


KEY ELEMENTS

An analysis is required to convert data into findings, which themselves call for a judgement in order to be converted into conclusions. The analysis is carried out on a question-by-question basis, in the framework of an overall design cutting across all questions.

Data, evidence and findings

Any piece of qualitative or quantitative information that has been collected by the evaluation team is called data, for instance:

  • Document X indicates that the number of pupils has grown faster than the number of teachers in poor rural areas (data)

A piece of information is qualified as evidence as soon as the evaluation team assesses it as reliable enough, for instance:

  • Document X, quoting Education Ministry data that are considered reliable, indicates that the number of pupils has grown faster than the number of teachers in poor rural areas (evidence)

Findings establish a fact derived from evidence through an analysis, for instance:

  • The quality of primary education has decreased in poor rural areas (finding)

Some findings are specific in that they include cause-and-effect statements, for instance:

  • The EC has not significantly contributed to preventing the quality of primary education from decreasing in poor rural areas (cause-and-effect finding)

Findings do not include value judgements, which are embedded in conclusions only, as shown below:

  • The EC has successfully contributed to boosting the capacity of the educational system to enrol pupils from disadvantaged groups, although this has been at the expense of quality (conclusion).

Strategy of analysis

Four strategies can be considered:

  • Change analysis, which compares measured / qualified indicators over time, and/or against targets
  • Meta-analysis, which extrapolates upon findings of other evaluations and studies, after having carefully checked their validity and transferability
  • Attribution analysis, which compares the observed changes with a "without intervention" scenario, also called counterfactual
  • Contribution analysis, which confirms or disconfirms cause-and-effect assumptions on the basis of a chain of reasoning.

The first strategy is the lightest one and may fit virtually all types of questions, for instance:

  • To what extent are the EC priorities still in line with the identified challenges?
  • To what extent has the support taken into account potential interactions and conflicts with other EC policies?
  • To what extent has the EC mainstreamed a given cross-cutting issue in the implementation of its interventions?

The last three strategies are better at answering cause-and-effect questions, for instance:

  • To what extent has the EC contributed to achieving effect X?
  • To what extent has the EC contributed to achieving effect X sustainably?
  • To what extent has the EC contributed to achieving effect X at a reasonable cost?

The choice of the analysis strategy is part of the methodological design. It depends on the extent to which the question raises feasibility problems. It is made explicit in the design table.
Once the strategy has been selected and the data collected, the analysis proceeds through all or part of the following four stages: data processing, exploration, explanation, confirmation.

Data processing

The first stage of analysis consists in processing information with a view to measuring or qualifying an indicator, or to answering a sub-question. Data are processed through operations such as cross-checking, comparison, clustering, listing, etc.

  • Cross-checking is the use of several sources or types of data for establishing a fact. Systematic cross-checking of at least two sources should be the rule, although triangulation (three sources) is preferable. The cross-checking should involve independent sources: a document that quotes another document is not an independent source, nor is an interviewee who has the same profile as another interviewee.
  • Comparison proceeds by building tables, graphs, maps and/or rankings. Data can be compared in one or several dimensions such as time, territories, administrative categories, socio-economic categories, beneficiaries and non-beneficiaries, etc. The evaluation team typically measures change by comparing quantitative indicators over time. Comparisons may also be qualitative, e.g. ranking a population's needs as they are perceived by interviewees (see the short sketch after this list).
  • Clustering proceeds by pooling data in accordance with predefined typologies, e.g. EC support per sector, beneficiaries per level of income.
  • Listing proceeds by identifying the various dimensions of something, for instance the various needs of the targeted group as expressed in a participatory meeting, the various effects of a project as perceived by field level stakeholders, the various strengths and weaknesses of the EC as perceived through interviews with other donors' staff.
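As an illustration of the comparison operation, the short sketch below (in Python) computes a pupil-teacher ratio over time from hypothetical Ministry of Education figures; all numbers are invented for the purpose of the example.

    # Hypothetical figures for poor rural areas (illustration only).
    pupils   = {2015: 120_000, 2020: 168_000}   # enrolled pupils
    teachers = {2015: 3_000,   2020: 3_600}     # teachers in post

    # Comparison over time: pupil-teacher ratio at the start and end of the period.
    ratio = {year: pupils[year] / teachers[year] for year in pupils}

    growth_pupils   = pupils[2020] / pupils[2015] - 1      # +40 %
    growth_teachers = teachers[2020] / teachers[2015] - 1  # +20 %

    print(f"Pupil-teacher ratio: {ratio[2015]:.1f} -> {ratio[2020]:.1f}")
    print(f"Pupil growth {growth_pupils:.0%} vs teacher growth {growth_teachers:.0%}")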

Provisional findings emerge at this stage of the analysis. Further stages aim to deepen and to strengthen the findings.

Exploration

The exploratory analysis aims to improve the understanding of all or part of the evaluated area, especially when knowledge is insufficient and expertise is weak, or when surprising evidence does not fit available explanations.

The exploratory analysis delves deeper and more systematically into the collected data in order to discover new plausible explanations such as:

  • New categories / typologies
  • Unforeseen explanatory factors
  • Factors favouring / constraining sustainability
  • Unintended effects
  • New cause-and-effect assumptions.

The exploratory stage may not be needed for all questions. When such an analysis is carried out, brainstorming techniques are appropriate. The idea is to develop new plausible explanations, not to assert them.

Explanation

This next stage ensures that a sufficient understanding has been reached in terms of:

  • Precisely defined concepts, categories and typologies
  • Plausible cause-and-effect explanations
  • Identification of key external factors and alternative explanations.

Depending on the context and the question, the explanation builds upon one or several of the following bases:

  • Diagram of expected effects
  • Expertise of the evaluation team
  • Exploratory analysis

A satisfactory explanation (also called explanatory model) is needed for finalising the analysis.

Confirmation

The last stage of the analysis is devoted to confirming the provisional findings through a valid and credible chain of arguments. This is the role of the confirmatory analysis.

To have a finding confirmed, the evaluation team undertakes a systematic self-criticism by all possible means, e.g. statistical tests, a search for biases in data and analyses, and checks for contradictions across sources and analyses.
External criticism from experts or stakeholders is also considered.

ANALYSIS STRATEGY
 
Cause-and-effect analysis

What does this mean?

Approach through which the evaluation team asserts the existence of a cause-and-effect link, and/or assesses the magnitude of an effect.

Attribution or contribution

- Attribution analysis

Attribution analysis aims to assess the proportion of observed change which can really be attributed to the evaluated intervention. It involves building a counterfactual scenario.

- Contribution analysis

Contribution analysis aims to demonstrate whether or not the evaluated intervention is one of the causes of observed change. It may also rank the evaluated intervention among the various causes explaining the observed change. Contribution analysis relies upon chains of logical arguments that are verified through a careful confirmatory analysis.
It comprises the following successive steps:

  • Refining the cause-and-effect chains which connect design and implementation on the one hand, and the evaluated effect on the other. This step builds upon available explanations pertaining to the evaluated area. Explanations derive from the diagram of expected effects drawn in the first phase of the evaluation, from the evaluation team's expertise, and from exploratory analyses.
  • Gathering evidence related to each link in the cause-and-effect chain, including findings of similar studies, causal statements by interviewees, and evidence from in-depth inquiries.
  • Gathering evidence related to other explanations (other interventions, external factors).
  • Developing a step-by-step chain of arguments asserting that the intervention has (or has not) made a contribution, and possibly ranking the intervention among other contributions.
  • Submitting the reasoning to systematic criticism until it is strong enough.
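The sketch below (Python) illustrates, with purely hypothetical links and evidence, how the chain of arguments can be kept explicit so that its weakest links remain visible during the confirmatory work; it shows the bookkeeping only, not a prescribed tool.

    # Each link of a hypothetical cause-and-effect chain is recorded together with
    # the evidence supporting it and the alternative explanations examined.
    contribution_chain = [
        {"link": "EC funds disbursed -> teacher training delivered",
         "evidence": ["monitoring reports", "interviews with training providers"],
         "alternatives_examined": ["training financed by another donor"]},
        {"link": "teacher training delivered -> teaching quality improved",
         "evidence": ["classroom observation study"],
         "alternatives_examined": ["new national curriculum", "smaller class sizes"]},
    ]

    # The reasoning is only as strong as its weakest link: each link should rest on
    # at least two independent sources and have its alternatives examined.
    weak_links = [step["link"] for step in contribution_chain
                  if len(step["evidence"]) < 2 or not step["alternatives_examined"]]
    print("Links needing further confirmation:", weak_links or "none")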

Analytical approaches

- Counterfactual

The approach is summarised in the diagram below:

[Diagram: an impact indicator plotted over time, with a "policy-on" line showing the observed change from the baseline to the date of the evaluation, a "policy-off" (counterfactual) line showing the estimated change without the intervention, and the gap between the two lines representing the impact.]

The "policy-on" line shows the observed change, measured with an impact indicator, between the beginning of the evaluated period (baseline) and the date of the evaluation. For instance: local employment has increased, as has literacy. The impact accounts for only the share of this change that is attributable to the intervention.
The "policy-off" line, also called the counterfactual, is an estimate of what would have happened without the intervention. It can be obtained with appropriate approaches like comparison groups or modelling techniques. Impact is assessed by subtracting the policy-off estimate from the observed policy-on indicator.
The assessed impact, derived from an estimate of the counterfactual, is itself an estimate. In other words, impacts cannot be directly measured; they can only be derived from an analysis of impact indicators.
Only a counterfactual allows for a quantitative impact estimate. When successful, this approach therefore has a high potential for learning and feedback. It is nevertheless relatively demanding in terms of data and human resources, which makes it somewhat unusual in evaluation practice in developing countries.
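The arithmetic behind the diagram can be made explicit with the short sketch below (Python); the literacy figures are hypothetical and serve only to show that the impact is the gap between the two lines, not the whole observed change.

    # Hypothetical literacy rates in the supported districts (illustration only).
    baseline            = 0.62   # rate at the start of the evaluated period
    policy_on_observed  = 0.74   # observed rate at the date of the evaluation
    policy_off_estimate = 0.69   # counterfactual estimate (e.g. from a comparison group)

    observed_change  = policy_on_observed - baseline             # 0.12 (12 points)
    estimated_impact = policy_on_observed - policy_off_estimate  # 0.05 (5 points)

    print(f"Observed change: {observed_change:+.2f}")
    print(f"Impact attributable to the intervention (estimate): {estimated_impact:+.2f}")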

- Case studies

Another analytical approach relies on case studies. It builds upon an in-depth inquiry into one or several real life cases selected in order to learn about the intervention as a whole. Each case study monograph describes observed changes in full detail. A good case study also describes the context in detail and all significant factors which may explain why the changes occurred or did not occur.
In a case study approach, the evaluation team analyses the whole set of collected facts and statements and checks whether they support assertions like "the change can be attributed to the intervention", "the change can be attributed to another cause", "the absence of change can be attributed to the intervention", etc.
Just one case study may fully demonstrate that the intervention does not work as intended and may provide a convincing explanation for that. However, it is worth confirming such a finding with one or two additional case studies.
On the other hand, it takes more case studies to demonstrate that the intervention works, because all alternative explanations should be carefully investigated and rejected.
If professionally implemented, case studies provide for a highly credible and conclusive contribution analysis. The approach is nevertheless fairly demanding in terms of time and skilled human resources.

- Causal statements

The approach builds upon documents, interviews, questionnaires and/or focus groups. It consists in collecting stakeholders' views about causes and effects. Statements by various categories of stakeholders are then cross-checked (triangulated) until a satisfactory interpretation is reached. A panel of experts may be called to help in this process.
A particular way of implementing this approach consists in collecting beneficiaries' statements about impacts or direct results. Typically, a sample of beneficiaries is asked questions like "How many jobs would you say have been created/lost in your firm as a result of the support received?" or "To what extent is your present situation/behaviour attributable to your participation in the intervention?" In this approach the interviewee is asked to apply the policy-off scenario on his/her own.
Evaluation teams tend to prefer this approach because it is far more feasible, but it should not be forgotten that the difficulty is simply transferred to the respondents. Most often, interviewees do not have a clear view of the policy-off scenario: they try to make up their minds in a few seconds during the interview and in doing so are subject to all kinds of biases.
When interviewing beneficiaries, the evaluation team typically faces a bias called deadweight: interviewees tend to exaggerate the effect of the evaluated intervention on their own behaviour or situation, that is, they underestimate the changes that would have occurred in the absence of the intervention.
In order to avoid this bias, the evaluation team should never rely upon a single naive question like "How many new jobs have been created as a result of the support received?" or "How much has your income increased as a result of the project?" By contrast, multiple triangulated questions may enable the evaluation team to assess and reduce the bias. Beneficiaries' statements are called "gross effects" (including the bias) whilst the evaluation team's estimate is called a "net effect" (corrected for the bias).
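The correction from gross to net effect can be sketched as follows (Python); the survey answers and the deadweight estimate are hypothetical, and in practice the deadweight share would itself be derived from several triangulated questions.

    # Hypothetical survey of supported firms: jobs each respondent attributes to the support.
    gross_effects = [4, 2, 6, 0, 3]      # "gross effects", as stated by beneficiaries

    # Cross-checked follow-up questions suggest that roughly a third of these jobs
    # would have been created anyway (deadweight) - a hypothetical estimate.
    deadweight_share = 0.35

    gross_total = sum(gross_effects)
    net_total   = gross_total * (1 - deadweight_share)   # evaluation team's "net effect"
    print(f"Gross effect: {gross_total} jobs, net effect (corrected): {net_total:.0f} jobs")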

- Meta-analysis

This approach builds upon available documents, for instance:

  • Previous works pertaining to the evaluation as a whole (monitoring, audit, review, etc.)
  • Recent reports related to a part of the intervention, e.g. a project, a sector, a cross-cutting issue (evaluation reports but also monitoring, audit, review, etc.)
  • Lessons learnt from other interventions and which can be used in answering the question.

In performing meta-analyses, the evaluation team needs to (1) assess the quality of information provided by the reviewed documents, and (2) assess the transferability to the context of the evaluation underway.

- Generalisation

The first two approaches (counterfactual and case studies) have the best potential for obtaining findings that can be generalised (see external validity), although in a different way. Findings can be said to be of general value when all major external factors are known and their role is understood. Counterfactual approaches build upon explanatory assumptions about major external factors, and strive to control such factors through statistical comparisons involving large samples. Case studies strive to control external factors through an in-depth understanding of cause-and-effect mechanisms.

External factors:
  • Counterfactual: identified in advance
  • Case studies: identified in advance or discovered during the study

Control of external factors:
  • Counterfactual: quantitative approach, large samples, statistical techniques
  • Case studies: in-depth understanding of cause-and-effect mechanisms

Recommendation

The evaluation team should be left free to choose its analysis strategy and analytical approach.

Cause-and-effect questions

What does this mean?

Cause-and-effect questions pertain to the effects of the evaluated intervention. They are written as follows:

  • To what extent has [the intervention] contributed to [the expected change]?
  • How far has [the intervention] helped to achieve [the expected change]?

These questions call for an observation of change, and then an attribution of observed change to the intervention, or an analysis of the intervention's contribution to observed changes.
Questions pertaining to direct and short-term effects can usually be answered without much difficulty, whereas questions about far-reaching effects (e.g. poverty reduction) raise feasibility problems.

Causality and evaluation criteria

Effectiveness and impact questions tend to be cause-and-effect questions in the sense that they link the evaluated intervention (the cause) to its effects.
Efficiency and sustainability questions are also cause-and-effect questions since actual effects have to be analysed first, before being qualified as cost-effective or sustainable.
Generally speaking, relevance and coherence questions are not cause-and-effect questions. Typical examples are:

  • How far are EC objectives in line with needs as perceived by the targeted population?
  • To what extent are the effects of the intervention and the effects of other EC policies likely to reinforce one another?

The latter example involves causes and effects, but only in a prospective and logical manner. The evaluation team is not expected to assert the existence of cause-and-effect links and/or to assess the magnitude of actual effects.
Exceptionally, some relevance questions may call for cause-and-effect statements, for instance:

  • How far did the involvement of non-state actors in the design of the EC strategy contribute to better adjustment of priorities to needs as perceived by the targeted population?

Questions pertaining to the EC value added may be cause-and-effect questions if the evaluation team attempts to assert the existence, or to assess the magnitude, of an additional impact arising from the fact that the intervention took place at European level.

Caution!

Questions which do not require a cause-and-effect analysis nevertheless call for a fully-fledged analysis covering all or part of data processing, exploration, explanation and confirmation.

Counterfactual

What does this mean?

The counterfactual, or counterfactual scenario, is an estimate of what would have occurred in the absence of the evaluated intervention.
The main approaches to constructing counterfactuals are:

  • Comparison group
  • Modelling

What is the purpose?

By subtracting the counterfactual from the observed change (factual), the evaluation team can assess the effect of the intervention, e.g. effect on literacy, effect on individual income, effect on economic growth, etc.

Comparison group

One of the main approaches to counterfactuals consists in identifying a comparison group which resembles beneficiaries in all respects, except for the fact that it is unaffected by the intervention. The quality of the counterfactual depends heavily on the comparability of beneficiaries and non-beneficiaries. Four approaches may be considered for that purpose.

Randomised control group

This approach, also called experimental design, consists in recruiting and surveying two statistically comparable groups. Several hundred potential participants are identified and then randomly assigned either to participate in the intervention or not. The approach is fairly demanding in terms of preconditions, time and human resources. When the approach is workable and properly implemented, most external factors (ideally all) are neutralised by the random assignment, and the only remaining difference between the groups is participation in the intervention.
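Under these conditions the analysis itself is straightforward, as sketched below (Python, using scipy); the outcome scores are hypothetical and the example assumes data have already been collected for both randomly assigned groups.

    from scipy import stats

    # Hypothetical outcome scores collected after the intervention.
    treatment_group = [68, 72, 75, 70, 74, 69, 77, 73]   # randomly assigned participants
    control_group   = [64, 66, 70, 63, 68, 65, 69, 67]   # randomly assigned non-participants

    # With random assignment, the difference in means estimates the effect.
    effect = (sum(treatment_group) / len(treatment_group)
              - sum(control_group) / len(control_group))
    t_stat, p_value = stats.ttest_ind(treatment_group, control_group)
    print(f"Estimated effect: {effect:.1f} points (p = {p_value:.3f})")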

Adjusted comparison group

In this approach a group of non-participants is recruited and surveyed, for instance people who have applied to participate but who have been rejected for one reason or another. This approach is also called quasi-experimental design. In order to allow for a proper comparison, the structure of the comparison group needs to be adjusted until it is similar enough to that of participants as regards key factors like age, income, or gender. Such factors are identified in advance in an explanatory model. The structure of the comparison group (e.g. per age, income and gender) is adjusted by over- or under-weighting appropriate members until both structures are similar.
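The adjustment can be sketched as follows (Python), assuming for simplicity a single key factor (gender) identified in the explanatory model; all shares and outcomes are hypothetical.

    # Hypothetical shares of women among participants and in the raw comparison group.
    participant_share = {"women": 0.60, "men": 0.40}
    comparison_share  = {"women": 0.30, "men": 0.70}

    # Hypothetical average outcomes observed in the comparison group, per category.
    comparison_outcome = {"women": 55.0, "men": 49.0}

    # Weight each category so that the comparison group mirrors the participants' structure.
    weights = {g: participant_share[g] / comparison_share[g] for g in participant_share}

    adjusted_mean = sum(comparison_outcome[g] * comparison_share[g] * weights[g]
                        for g in comparison_share)
    print(f"Adjusted comparison-group outcome: {adjusted_mean:.1f}")   # 52.6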

Matching pairs

In this approach a sample of non-participants is associated with a sample of beneficiaries on an individual basis. For each beneficiary (e.g. a supported farmer), a matching non-participant is found with a similar profile in terms of the key factors which need to be controlled (e.g. age, size of farm, type of farming). This approach often has the highest degree of feasibility and may be considered when other approaches are impractical.
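A minimal sketch of the pairing step (Python) is given below; the farm records are hypothetical and the distance measure is deliberately crude, whereas real applications typically use standardised covariates or propensity scores.

    # Hypothetical records: (age, farm size in hectares, observed change in income).
    beneficiaries    = [(45, 12, 800), (38, 5, 300), (52, 20, 1200)]
    non_participants = [(44, 11, 200), (40, 6, 150), (50, 22, 500), (30, 3, 100)]

    def distance(a, b):
        # Crude similarity on the controlled factors (age, farm size).
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    # Pair each beneficiary with the most similar non-participant and compare outcomes.
    pair_effects = []
    for person in beneficiaries:
        match = min(non_participants, key=lambda c: distance(person, c))
        pair_effects.append(person[2] - match[2])   # difference in income change

    print(f"Average effect across matched pairs: {sum(pair_effects) / len(pair_effects):.0f}")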

Generic comparison

The counterfactual may be constructed in abstracto by using statistical databases. The evaluation team starts with an observation of a group of participants. For each participant, the observed change is compared to what would have occurred for an "average" individual with the same profile, as derived from an analysis of statistical databases, most often at national level.
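A sketch of this generic comparison is given below (Python); the benchmark table and the participants' records are hypothetical, standing in for what would normally be extracted from national statistical databases.

    # Hypothetical national benchmark: average income change by (sector, size class).
    national_benchmark = {("dairy", "small"): 150, ("dairy", "large"): 400,
                          ("cereals", "small"): 100}

    # Hypothetical supported farmers: profile and observed income change.
    participants = [
        {"profile": ("dairy", "small"),   "observed_change": 420},
        {"profile": ("cereals", "small"), "observed_change": 260},
    ]

    # For each participant, subtract what an "average" individual with the same
    # profile experienced according to the statistical database.
    effects = [p["observed_change"] - national_benchmark[p["profile"]] for p in participants]
    print(f"Estimated effects per participant: {effects}")   # [270, 160]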

Comparative approaches

Different forms of comparison exist, each with pros and cons, and varying degrees of validity.

  • An "afterwards only" comparison involves analysis of the differences between both groups (participants and non-participants) after the participants have received the subsidy or service. This approach is easy to implement but neglects the differences that may have existed between the two groups at the outset.
  • A "before-after" comparison focuses on the evolution of both groups over time. It requires baseline data (e.g. through monitoring or in the form of statistics, or through an ex ante evaluation), something which does not always exist. Baseline data may have to be reconstructed retrospectively, which involves risks of unreliability.

Strengths and weaknesses in practice

A well-designed comparison group provides a convincing estimate of the counterfactual, and therefore a credible base for attributing a share of the observed changes to the intervention. A limitation with this approach stems from the need to identify key external factors to be controlled. The analysis may be totally flawed if an important external factor has been overlooked or ignored. Another shortcoming stems from the need to rely upon large enough samples in order to ensure statistical validity. It is not always easy to predict the sample size which will ensure validity, and it is not infrequent to arrive at no conclusion after several weeks of a costly survey.

Modelling

The principle is to run a model which correctly simulates what actually occurred (the observed change), and then to run the model again with a set of assumptions representing a "without intervention" scenario. In order to be used in an evaluation, a model must include all relevant causes and effects which are to be analysed, that is, at least the following:

  • Several causes including the intervention itself and other explanatory factors.
  • The effect to be evaluated.
  • A mathematical relation between the causes and the effect, including adjustable parameters.

Complex models (e.g. macro-economic ones) may include hundreds of causes, effects, mathematical relations and adjustable parameters, as well as complex cause-and-effect mechanisms such as causality loops. When using a model, the evaluation team proceeds in three steps:

  • A first simulation is undertaken with real life data. The parameters are adjusted until the model reflects all observed change correctly.
  • The evaluation team identifies the "primary impacts" of the intervention, e.g. increase in the Government's budgetary resources, reduction of public debt, reduction of interest rates, etc. A set of assumptions is elaborated in order to simulate the "without intervention" scenario, that is to say, a scenario without the "primary impacts".
  • The model is run once again in order to simulate the "without intervention" scenario (i.e. the counterfactual). The impact estimate is derived from a comparison between both simulations.
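A deliberately simplified sketch of these three steps is given below (Python), using a toy one-equation model; real macro-economic models are far richer, and every figure here is hypothetical.

    # Toy model: the change in an impact indicator depends on public education
    # spending and on an external factor, through one adjustable parameter.
    def model(spending, external_factor, sensitivity):
        return sensitivity * spending + external_factor

    # Step 1: calibrate the adjustable parameter so the model reproduces the observed change.
    observed_change = 0.12
    spending_with_intervention = 40.0     # includes the EC contribution (hypothetical)
    external_factor = 0.02
    sensitivity = (observed_change - external_factor) / spending_with_intervention

    # Step 2: build the "without intervention" scenario by removing the primary impact.
    spending_without_intervention = 30.0  # spending had the EC support not taken place

    # Step 3: run both simulations and compare them.
    policy_on  = model(spending_with_intervention, external_factor, sensitivity)
    policy_off = model(spending_without_intervention, external_factor, sensitivity)
    print(f"Estimated impact: {policy_on - policy_off:+.3f}")   # +0.025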

Modelling techniques are fairly demanding in terms of data and expertise. The workload required to build a model is generally out of proportion to the resources available to an evaluation. The consequence is that the modelling approach is workable only when an appropriate model, and the corresponding expertise, already exist.

EXTERNAL FACTORS
 
What are they?

Factors which are embedded in the context of the intervention and which hinder or amplify the intended changes while being independent of the intervention itself.

External factors are also called contextual, exogenous or confounding factors.

Why are they important?
  • To understand their influence in order to properly disentangle the effects of the intervention from those of other causes.
  • To identify the contextual factors which may prevent the transferability of lessons learned.
  • To adjust samples of participants and non-participants in order to make them truly comparable and to achieve internal validity.

Typical examples

Factors explaining participation in the intervention:

  • Potential applicants are (are not) familiar with the implementing agency.
  • They belong (do not belong) to a "club" of recurrent beneficiaries.
  • They have (do not have) a systematic grant-seeking behaviour.
  • They have (do not have) social or economic difficulties preventing them from applying for grants / loans.

Factors explaining the achievement of specific impacts:

  • Trainees had a high (low) level of education before entering the programme. In addition to the intervention this factor partly explains the skills they acquired.
  • The target public had a high (low) level of knowledge before being reached by an awareness-raising campaign. In addition to the intervention this factor partly explains their current level of awareness.
  • Beneficiary enterprises were large (small), young (old), and growing (declining) when they received support. In addition to the intervention these factors partly explain their current level of competitiveness.

Factors explaining global impact:

  • Local agriculture has benefited from exceptionally favourable weather during the programming period (or the opposite). In addition to the intervention this factor partly explains the change in farmers' income.
  • International economic trends were exceptionally favourable during the programming period (or the opposite). In addition to the intervention this factor partly explains the evolution of exports.

When dealing with such external factors, the evaluation team may usefully consult the contextual indicators that are available on the web.

How can they be identified?

In a given evaluation, external factors are potentially numerous and it is crucial to highlight the most important ones. The following approaches may help:

  • Making the intervention logic explicit and uncovering implicit cause-and-effect assumptions
  • Analysing previous research and evaluation work
  • Taking expert advice
  • Carrying out qualitative surveys

Identifying external factors is one of the main purposes of the exploratory analysis.

Recommendations
Do not try to identify all possible external factors when clarifying the intervention logic in the structuring phase of the evaluation. They are simply too numerous. This task should be undertaken only when working on a given evaluation question, and only if the question involves a cause-and-effect analysis.

EXPLORATORY AND CONFIRMATORY ANALYSIS
 
Exploratory analysis

What does this mean?

If necessary, the evaluation team delves into the collected data in order to discover new plausible explanations such as:

  • New categories / typologies
  • Explanatory factors
  • New cause-and-effect assumptions
  • Factors favouring / constraining sustainability

What is the purpose?

  • To improve the understanding of all or part of the evaluated area, especially when existing knowledge and expertise are inadequate
  • To develop and refine explanatory assumptions.

How to carry out the exploratory analysis

The analysis explores the full set of data (quantitative and qualitative) with a view to identifying structures, differences, contrasts, similarities and correlations. For example, the analysis may involve:

  • Cross-cutting analyses of several case studies
  • Statistical comparisons cutting across management databases, statistical databases, and/or the results of a questionnaire survey
  • Comparisons between interviews and documents.

The approach is systematic and open-minded. Brainstorming techniques are appropriate. Ideas emerge through the first documentary analyses, interviews, and meetings. The exploration may continue through the field phase.

Confirmatory analysis

What does this mean?

Provisional findings progressively emerge during the first phases of the evaluation team's work. They need to be confirmed by sound and credible controls. That is the role of the confirmatory analysis.
In the particular case of cause-and-effect questions, the analysis is meant to disentangle several causes (the intervention and the external factors) in order to demonstrate the existence and/or assess the magnitude of the effect.

What is the purpose?

  • To ensure that the findings are sound and able to withstand any criticism when the report is published
  • To ensure that the findings are credible from the intended users' viewpoint
  • In the particular case of cause-and-effect questions, to distinguish actual effects from observed change

How is a confirmatory analysis performed?

For a finding to be confirmed, it is systematically criticised by all possible means, e.g.:

  • If the finding derives from a statistical analysis, are the validity tests conclusive?
  • If the finding was suggested by a case study, is it contradicted by another case study?
  • If the finding derives from a survey, can it be explained by a bias in that survey?
  • If the finding is based on an information source, is it contradicted by another source?
  • Is the finding related to a change that can be explained by external factors that the evaluation team may have overlooked?
  • Does the finding contradict expert opinions or lessons learned elsewhere and, if so, can this be explained?
  • Do the members of the evaluation reference group have arguments to contradict the finding and, if so, are these arguments justified?

Recommendations

  • Allow enough time for discussing the final report so that a careful confirmatory analysis can take place. Ensure that the evaluation team has set aside sufficient resources for that purpose.
  • Not all findings require the same level of confirmation. Concentrate efforts on findings that support the most controversial conclusions, the lessons that are the most likely to be transferred, or the recommendations that are the most difficult to accept.
  • In order to enhance the evaluation's credibility, it is valuable to present, in an annex, the criticisms that the findings withstood during the confirmatory analysis.

VALIDITY
 
What does this mean?

Validity is achieved when:

  • Conclusions and lessons are derived from findings in a way which ensures transferability (external validity)
  • Findings are derived from data without any bias (internal validity)
  • Collected data reflect the changes or needs that are to be evaluated without bias (construct validity)

What is the purpose?

A lack of validity may expose the evaluation to severe criticism from those stakeholders who are dissatisfied with the conclusions and recommendations, and who will point out any weaknesses they may have found in the reasoning.

Validity is part of the quality criteria. It should be given an even higher level of attention when the intended users include external stakeholders with conflicting interests.

External validity

Quality of an evaluation method which makes it possible to obtain findings that can be generalised to other groups, areas, periods, etc. External validity is fully achieved when the evaluation team can make it clear that a similar intervention implemented in another context would have the same effects under given conditions.

Only strong external validity allows one to transfer lessons learned. External validity is also sought when the evaluation aims at identifying and validating good practice.
External validity is threatened when the analysis fails to identify key external factors which are influential in the context of the evaluated intervention but would have a different influence in another context. External factors should not only be identified; the magnitude of their consequences should also be assessed. For a survey, this calls for larger and more diverse samples; for case studies, it calls for a larger number of cases.

Internal validity

This is the quality of an evaluation method which, as far as possible, limits biases imputable to data collection and analysis. Internal validity is fully achieved when the evaluation team provides indisputable arguments showing that the findings derive from collected facts and statements.

Internal validity is a major issue in the particular case of cause-and-effect questions. When striving to demonstrate the existence and/or to assess the magnitude of an effect, the evaluation team is exposed to risks such as:

  • Overlooking cause-and-effect mechanisms which contradict initial assumptions.
  • Deriving impact estimates from a comparison of samples (e.g. participants and non-participants) that are not similar enough.
  • Having findings that do not withstand statistical tests because samples are not large enough.

Construct validity

This is the quality of an evaluation method which faithfully reflects the changes or needs that are to be evaluated. Construct validity is fully achieved when key concepts are clearly defined and when indicators reflect what they are meant to.

Construct validity is threatened if the evaluation team does not fully master the process of shifting from questions to indicators. Construct validity is also at risk when the evaluation team uses indirect evidence like proxies.

Recommendations
  • Never start analysing data without a thorough understanding of the context. Pay enough attention to identifying key external factors.
  • The valid analysis of an impact requires time and resources. For this reason, cause-and-effect questions may need more resources than others.