5. Showing if IMPACT evaluation is appropriate, feasible, affordable and credible


Academic references to outcomes theory: Duignan (2009d; 2008d)*

There are seven major types of impact evaluation designs listed below. Where controllable indicators (building-block two) do not reach to the top of an organization or program's outcomes model, there is only one way to prove that it is having a high-level impact. This is to assess the appropriateness, feasibility, affordability and credibility of the following set of possible impact evaluation designs.

Impact evaluation is any type of evaluation which attempts to make a claim about changes in high-level outcomes being attributable to (caused by) the program or organization. It tends to be more of a one-off activity than the routine data collection which is used for indicator measurement in building-blocks two and three.

The seven possible impact evaluation designs are set out here. Using Duignan's Impact Evaluation Feasibility Check you can work out which, if any, are appropriate, feasible, affordable and credible for a particular project.

1. True randomized experiments.

In randomized experimental designs, people, organizations, regions or other units are assigned randomly to either an intervention group or a control group. These designs are sometimes called Randomized Controlled Trials (RCTs). The intervention group receives the intervention and its outcomes are compared to those of the control group, which does not. Because of the random assignment, other confounding factors that might have produced the outcomes can be ruled out as unlikely to have created the observed improvement in high-level outcomes.
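The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the unit labels and the outcome scores are all made up for the example and do not come from any real evaluation.

```python
import random
from statistics import mean


def assign_randomly(units, seed=0):
    """Randomly split units into an intervention group and a control group."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimated_effect(intervention_outcomes, control_outcomes):
    """Difference between the mean outcome of the two groups."""
    return mean(intervention_outcomes) - mean(control_outcomes)


# Hypothetical units (these could be people, organizations or regions).
units = [f"unit_{i}" for i in range(8)]
intervention, control = assign_randomly(units)

# Made-up outcome scores measured after the intervention.
intervention_outcomes = [12, 15, 14, 13]
control_outcomes = [10, 11, 12, 11]

effect = estimated_effect(intervention_outcomes, control_outcomes)
print(f"Estimated effect: {effect}")
```

In a real evaluation the difference in means would also be tested statistically, since a small difference can arise by chance even with random assignment.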

2. Regression discontinuity designs.

In regression discontinuity designs, people, organizations, regions or other units are quantitatively ranked or distinguished. For example, they might be ranked from those with the lowest pre-intervention outcome scores to those with the highest. Then an intervention is implemented for the units above or below a cut-off point in that ranking. For our purposes here we can think of the units being graphed in order of the ranking. If the intervention is effective, the improvement in the intervention group should appear on the graph as a jump at the cut-off point between those units which received the intervention and those which did not. One advantage of this design over a true experiment is that it is often seen as more ethical, because the treatment is given to those most in need (i.e. those below the cut-off point).
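One simple way to look for the jump at the cut-off point is to fit a straight line to the units on each side of the cut-off and compare what the two lines predict at the cut-off itself. The sketch below assumes that units below the cut-off received the intervention; the function names and the data are hypothetical, chosen only to illustrate the idea.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jump_at_cutoff(scores, outcomes, cutoff):
    """Predicted outcome at the cut-off for treated units minus untreated units.

    Assumes units with a ranking score below the cut-off were treated.
    """
    treated = [(s, o) for s, o in zip(scores, outcomes) if s < cutoff]
    untreated = [(s, o) for s, o in zip(scores, outcomes) if s >= cutoff]
    slope_t, intercept_t = fit_line(*zip(*treated))
    slope_u, intercept_u = fit_line(*zip(*untreated))
    return (slope_t * cutoff + intercept_t) - (slope_u * cutoff + intercept_u)


# Made-up data: ten units ranked 1 to 10 by pre-intervention score.
# Units ranked 1 to 5 fall below the cut-off and receive the intervention.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
outcomes = [3, 4, 5, 6, 7, 6, 7, 8, 9, 10]

jump = jump_at_cutoff(scores, outcomes, cutoff=5.5)
print(f"Estimated jump at the cut-off: {jump:.2f}")
```

A positive jump suggests that treated units did better than the trend among untreated units would predict. Real analyses usually focus on units close to the cut-off and check that the straight-line assumption is reasonable.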

3. Constructed comparison group designs.

In constructed matched comparison group designs, the attempt is made to identify, or create, a comparison group which has not received the intervention. The outcomes of this group are then compared with those of the group which has received the intervention. For instance, one might find a community similar to the intervention community. In some cases, called propensity matching, statistical methods are used to work out what is likely to have happened to a particular type of case if it had not received the intervention. This is worked out on the basis of statistics from many cases which did not receive the intervention in the past.

4. Time series designs.

In time series designs, a number of measurements of an outcome are taken over a period of time. Then an intervention is introduced at a specific point in time. If it is successful, an improvement would be expected at the time when the intervention was introduced. Because a series of measurements has been taken over time, it is possible to look at the point when the intervention was introduced and ask whether an improvement is shown at that point.
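One possible way to ask whether an improvement shows up at the point of introduction is to project the pre-intervention trend forward and compare it with what was actually observed afterwards. The following is a minimal sketch under that assumption; the function names and the measurements are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def improvement_over_trend(measurements, intervention_index):
    """Average gap between observed values and the projected earlier trend.

    measurements: outcome values in time order.
    intervention_index: position of the first measurement taken after the
    intervention was introduced.
    """
    times = list(range(len(measurements)))
    slope, intercept = fit_line(
        times[:intervention_index], measurements[:intervention_index]
    )
    gaps = [
        measurements[t] - (slope * t + intercept)
        for t in times[intervention_index:]
    ]
    return sum(gaps) / len(gaps)


# Made-up series: four measurements before the intervention, three after.
measurements = [10, 11, 12, 13, 17, 18, 19]
improvement = improvement_over_trend(measurements, intervention_index=4)
print(f"Average improvement over the earlier trend: {improvement:.2f}")
```

An improvement close to zero would mean the later measurements simply continue the earlier trend. A shift that coincides with the intervention is more persuasive when no other event at the same time could explain it.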

5. Exhaustive causal identification and elimination designs.

In exhaustive alternative causal identification and elimination designs, there first needs to be a good way of measuring whether or not outcomes have occurred. Then all of the alternative explanations as to why outcomes might have occurred are detailed. Each alternative is assessed to see whether it is credible that it might have caused the observed improvements in outcomes, and alternatives are eliminated by logical analysis and by using any empirical data available. If this can be done successfully and all alternative explanations are eliminated, it leaves the intervention as the only credible explanation as to why outcomes have improved.

6. Intervention logic (program theory/theory of change) designs.

In intervention logic designs, the attempt is first made to establish a credible 'intervention logic' (program theory/theory of change) for the program or organization. This logic sets out the way in which it is believed that lower-level program activities will lead to higher-level outcomes (this can be done in the form of a DoView Outcomes Model). The logic is then endorsed either by showing that previous evidence suggests that it does work in cases similar to the one being evaluated, or by experts in the topic endorsing it as credible. It is then established that lower-level activities have actually occurred (this is relatively easy to do because they can usually be measured by controllable indicators). Finally, it is assumed, but not proven, that lower-level activities did in this particular instance cause higher-level outcomes to improve.

7. Expert and key informant judgment designs.

In expert judgment designs, an expert, or an expert panel, is asked to make a judgment as to whether, in their opinion and using whatever method they usually use in making such judgments, the program has had an effect on improving high-level outcomes. This type of evaluation design is sometimes called a 'connoisseur' evaluation design, drawing on an analogy with connoisseur judges such as wine tasters. In key informant judgment designs, key informants (people who are likely to be in a position to be knowledgeable about what is happening in a program and whether it impacted on high-level outcomes) are asked to make a judgment regarding whether they think that the program affected high-level outcomes, using whatever method they want in making such judgments. This type of impact evaluation design is often seen by some stakeholders as less robust and credible than the designs listed above.

* These particular references can be cited to refer to the material on this page.

The development of this typology was significantly influenced by a workshop on impact evaluation run by Michael Scriven and the thinking in Charles