It's often really hard to prove your impact on high-level outcomes
For most real-world programs and organizations, it's very hard to prove that you're having an impact on high-level outcomes. A common mistake is to assume that if indicators of high-level outcomes are improving (e.g. less violence, more tourism, better social outcomes), then you must be the one who has improved them.
For indicators that are controllable by your organization or program (i.e. you are the only thing influencing them), simply measuring them does establish that you've improved them. Controllable indicators tend to measure lower-level steps rather than higher-level outcomes. Higher-level outcomes are usually influenced by a number of factors in addition to the work of your organization or program, so simply measuring that such indicators have improved is not, in itself, enough to establish that it was your work that improved them: other factors may have done so. It is usually the case that the most interesting high-level indicators your work is focused on will be non-controllable indicators.
You need to turn to technical impact evaluation designs
The only way you can prove that you've made a difference to non-controllable high-level indicators is to complement your indicator measurement work (known as monitoring) with specific impact evaluation work. Impact evaluation consists of specific studies aimed at teasing out what is actually causing high-level outcomes to occur. It is how you prove that it's your program or organization, rather than other factors, which is causing high-level outcomes to improve.
The problem for program and organizational staff attempting to prove that their program is improving high-level indicators is that when they turn to impact evaluation, they quickly discover that in many cases it is technically complicated.
Avoiding pseudo-impact evaluation studies
There are two traditional responses to the problem of impact evaluation being technically hard. The first is to simply ignore the technicalities and insist that anyone can do impact evaluation, then suggest that program staff draw together whatever information they can find to show that their particular program 'works'. In effect this waters down the definition of the term impact evaluation. The danger is that the term becomes so diluted that it is almost meaningless. It produces what can be called pseudo-impact evaluations: impact evaluations you could drive a bulldozer through because of their methodological weaknesses.
The other traditional approach sits at the opposite extreme: insisting that everyone do technically 'robust' impact evaluations (e.g. randomized experiments) so that they can rigorously establish that their program has made a difference to high-level outcomes. Demanding this of all programs is unrealistic and unaffordable.
The Duignan approach deals with the impact evaluation problem differently from these two traditional solutions. First, it never assumes that impact evaluation is appropriate, feasible or affordable for any particular program, organization or intervention. Second, even when impact evaluation is appropriate, feasible and affordable, the approach does not insist that it always be done, because even a feasible impact evaluation may not be a good use of scarce evaluation resources. For instance, the same intervention might already have been evaluated in similar circumstances. Evaluation resources should not be allocated on a program-by-program basis but rather as part of a sector-wide assessment of the knowledge needs of a particular sector.
Assessing the feasibility of the seven major types of impact evaluation design
The Duignan approach works by going through the seven major impact evaluation design types and assessing each for its appropriateness, feasibility, affordability and credibility with stakeholders when applied to the particular program, organization or intervention being evaluated. If none of them is appropriate, feasible, affordable and credible to key stakeholders, then in the Duignan approach that is simply the reality of the situation you are dealing with. There's no reason for program staff to feel they have failed if they cannot come up with any appropriate, feasible, affordable and credible way of doing impact evaluation. The ease of impact evaluation varies widely amongst different types of programs, in different settings, with different levels of resources available for doing impact evaluation.
The Duignan approach does not dumb down the impact evaluation choices available to you; rather, it lists the seven major design types and suggests that you assess which of them are appropriate, feasible and affordable. Some readers may need assistance to do the impact evaluation feasibility analysis; others will be able to work through it at a high level and call in technical expertise for just parts of it. Best practice in using the Duignan approach is for funders to provide expert advice to providers on which of the seven major impact evaluation design types are likely to be appropriate, feasible, affordable and credible for particular types of programs, organizations or interventions.
Some designs are relatively easy to set up and run, and could be undertaken by a wide range of generic market research companies or those with similar experience. An example is Impact Evaluation Design Type 6: Key Informant Judgment Design, in which you simply ask people who are likely to know whether they think the program or organization made a difference to high-level outcomes. Others, like Time Series Designs or Regression Discontinuity Designs, are much more technical and will usually require someone with technical evaluation expertise to judge whether they are applicable to your program or organization's work.
The seven major impact evaluation design types are listed below with an explanation of each of them. The idea is that for any program or organization, you go through these design types and see if any of them are applicable in the case of your particular program or organization.
1. True Experiments
In randomized experimental designs, sometimes called Randomized Controlled Trials (RCTs), people, organizations, regions or other units are assigned randomly to an intervention group and a control group. The intervention group receives the intervention and its outcomes are compared with those of the control group, which does not. Because of the random assignment, confounding factors can be ruled out as unlikely to have created any observed improvement in high-level outcomes.
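To make the logic concrete, here is a minimal Python sketch of the basic RCT comparison, using entirely invented data and an assumed true effect of 5 points (the numpy and scipy libraries are assumed to be available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

# Hypothetical example: 200 communities randomly assigned to an
# intervention group (n=100) or a control group (n=100).
n = 100
# Invented outcome scores; the intervention adds an assumed true
# effect of 5 points on top of the same background variation.
control = rng.normal(loc=50, scale=10, size=n)
intervention = rng.normal(loc=55, scale=10, size=n)

# Because assignment was random, confounders are balanced in
# expectation, so a simple comparison of group means estimates
# the intervention's effect.
effect = intervention.mean() - control.mean()
t_stat, p_value = stats.ttest_ind(intervention, control)

print(f"Estimated effect: {effect:.2f}, p-value: {p_value:.4f}")
```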
2. Regression Discontinuity Design
In regression discontinuity designs, people, organizations, regions or units are quantitatively ranked, for example from those with the lowest pre-intervention outcome scores to those with the highest. An intervention is then implemented above or below a cut-off point in that ranking. For our purposes here, think of the units being graphed in ranked order. If the intervention is effective, an improvement should appear on the graph as a jump at the cut-off point between the units which received the intervention and those which did not. One advantage of this design over a true experiment is that it is often seen as more ethical, because the treatment is given to those most in need (i.e. those below the cut-off point).
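The following is a minimal Python sketch of the idea, using invented data with an assumed 8-point jump at the cut-off; the simple single-slope regression shown is one common way of estimating the discontinuity, not the only one:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=0)

# Hypothetical example: 300 units ranked by a pre-intervention score;
# units scoring below the cut-off (those most in need) receive the
# intervention.
n = 300
pre_score = rng.uniform(0, 100, size=n)
cutoff = 50.0
treated = (pre_score < cutoff).astype(float)

# Invented outcome: rises smoothly with the pre-score, plus an
# assumed true jump of 8 points for treated units.
outcome = 20 + 0.5 * pre_score + 8 * treated + rng.normal(0, 5, size=n)

# Regress the outcome on the centred running variable and a treatment
# indicator; the treatment coefficient estimates the size of the
# discontinuity at the cut-off.
X = sm.add_constant(np.column_stack([pre_score - cutoff, treated]))
model = sm.OLS(outcome, X).fit()
print(f"Estimated jump at cut-off: {model.params[2]:.2f}")
```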
3. Time Series Designs
In time series designs, an outcome is measured a number of times over a period, and an intervention is introduced at a specific point in time. If the intervention is successful, an improvement is expected at exactly that point. Because there is a series of measurements over time, it is possible to examine the point at which the intervention was introduced and ask whether an improvement appears there.
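Below is a minimal Python sketch of a simple interrupted time series (segmented regression) analysis on invented monthly data with an assumed 6-unit level shift; real analyses usually also need to deal with issues such as autocorrelation, which this sketch ignores:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=1)

# Hypothetical example: 48 monthly measurements of an outcome, with
# the intervention introduced at month 24.
n_months = 48
t = np.arange(n_months)
intervention_month = 24
after = (t >= intervention_month).astype(float)

# Invented series: an underlying trend plus an assumed true level
# shift of 6 units from the month the intervention starts.
outcome = 100 + 0.3 * t + 6 * after + rng.normal(0, 2, size=n_months)

# Segmented regression: pre-existing trend, a level change at the
# intervention point, and a post-intervention change in slope.
X = sm.add_constant(
    np.column_stack([t, after, after * (t - intervention_month)]))
model = sm.OLS(outcome, X).fit()
print(f"Estimated level change at intervention: {model.params[2]:.2f}")
```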
4. Constructed Comparison Group Designs
In constructed (matched) comparison group designs, an attempt is made to identify or create a comparison group which is not receiving the intervention; outcomes for this group are then compared with those of the group which received it. For instance, one might find a community similar to the intervention community. In one variant, propensity score matching, statistical methods are used to estimate what is likely to have happened to a particular type of case had it not received the intervention, on the basis of statistics from many cases which did not.
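As an illustration only, here is a minimal Python sketch of nearest-neighbour propensity score matching on invented data (statsmodels is assumed to be available); production matching work normally adds balance checks, calipers and sensitivity analyses:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=2)

# Hypothetical example: 500 cases with two background characteristics;
# cases with higher scores are more likely to receive the intervention
# (no random assignment), creating confounding.
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-(0.8 * x1 + 0.5 * x2)))
treated = rng.random(n) < p_treat
outcome = 2 * x1 + x2 + 4 * treated + rng.normal(size=n)  # assumed true effect = 4

# Step 1: estimate each case's propensity to receive the intervention
# from its background characteristics.
X = sm.add_constant(np.column_stack([x1, x2]))
ps = sm.Logit(treated.astype(float), X).fit(disp=0).predict(X)

# Step 2: match each intervention case to the comparison case with the
# nearest propensity score, then compare outcomes across the pairs.
t_idx = np.where(treated)[0]
c_idx = np.where(~treated)[0]
matches = c_idx[np.abs(ps[c_idx][None, :] - ps[t_idx][:, None]).argmin(axis=1)]
effect = (outcome[t_idx] - outcome[matches]).mean()
print(f"Matched estimate of the intervention effect: {effect:.2f}")
```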
5. Exhaustive Causal Identification and Elimination Designs
In exhaustive alternative causal identification and elimination designs, there first needs to be a good way of measuring whether or not outcomes have occurred. All of the alternative explanations as to why outcomes might have improved are then detailed. Each alternative explanation is assessed, using logical analysis and any available empirical data, to see whether it could credibly have caused the observed improvements. If all alternative explanations can be eliminated, the intervention is left as the only credible explanation for the improvement. These designs differ from Time Series Designs in that they do not require a large number of observations over time.
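This design is a logical procedure rather than a statistical one, but its bookkeeping can be sketched very simply in Python; the explanations and evidence notes below are purely hypothetical:

```python
# Hypothetical sketch: each alternative explanation is assessed against
# the available evidence; if all can credibly be eliminated, the
# intervention remains as the only credible explanation.
assessments = [
    ("Pre-existing trend", True,
     "outcome was flat for five years before the program"),
    ("Concurrent policy change", True,
     "policy took effect after the improvement began"),
    ("Seasonal variation", True,
     "improvement persists across all seasons"),
]

if all(eliminated for _, eliminated, _ in assessments):
    print("All alternative explanations eliminated; the intervention "
          "is the only credible remaining explanation.")
else:
    for cause, eliminated, evidence in assessments:
        if not eliminated:
            print(f"Cannot rule out: {cause} ({evidence})")
```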
6. Expert and Key Informant Judgment Designs
In expert judgment designs, an expert, or an expert panel, is simply asked to judge whether, in their opinion (using whatever method they usually use in making such judgments), the program has had an effect on improving high-level outcomes. This is sometimes called a 'connoisseur' evaluation design, by analogy with connoisseur judges such as wine tasters. In key informant judgment designs, key informants (people likely to be in a position to be knowledgeable about what is happening in a program and whether it impacted on high-level outcomes) are asked to judge whether they think the program affected high-level outcomes (using whatever method they want in making such judgments). This type of impact evaluation design is often seen by some stakeholders as less robust and credible than the designs listed above.
7. Intervention Logic (Program Theory/Theory of Change) Based Designs
In intervention logic designs, an attempt is first made to establish a credible 'intervention logic' (program theory/theory of change) for the program or organization. This logic sets out how lower-level program activities are believed to lead on to, and cause, higher-level outcomes (it can be drawn up in the form of a DoView Outcomes Model). The logic is then endorsed, either by showing that previous evidence supports it in cases similar to the one being evaluated, or by having experts in the topic endorse it as credible. It is then established that the lower-level activities have actually occurred (relatively easy to do, because they can usually be measured with controllable indicators), and it is assumed (but not proven) that in this particular instance they did, in fact, cause the higher-level outcomes to occur.
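As a rough illustration of the structure of the argument, here is a hypothetical Python sketch of an intervention logic chain in which only the lower-level steps are verified and the final causal link is assumed rather than proven (all step names are invented):

```python
# Hypothetical intervention logic chain, ordered from lower-level
# activities to the high-level outcome. Lower-level steps are verified
# with controllable indicators; the final causal link is assumed on
# the basis of prior evidence or expert endorsement, not proven.
logic_chain = [
    ("Training sessions delivered", True),       # measured, controllable
    ("Staff apply new practices", True),         # measured, controllable
    ("Service quality improves", True),          # measured, controllable
    ("High-level community outcome improves", None),  # assumed, not proven
]

labels = {True: "verified", False: "NOT verified",
          None: "assumed via endorsed logic"}
for step, verified in logic_chain:
    print(f"{step}: {labels[verified]}")
```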
Using Duignan's Framework
In practice, Duignan's Check is used by going through each of these seven major impact evaluation design types and assessing the appropriateness, feasibility, affordability and credibility of each.
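Purely as an illustration of the mechanics of this walkthrough, and not as the building regulation analysis referred to below, here is a hypothetical Python sketch in which each design type is scored against the four criteria (all assessments are invented):

```python
# Hypothetical feasibility check: each of the seven design types is
# scored against the four criteria; a design remains a candidate only
# if it passes all four. The True/False assessments are invented.
CRITERIA = ("appropriate", "feasible", "affordable", "credible")

designs = {
    "1. True experiment": (True, False, False, True),
    "2. Regression discontinuity": (True, False, False, True),
    "3. Time series": (True, True, True, True),
    "4. Constructed comparison group": (True, False, True, True),
    "5. Causal elimination": (True, True, True, True),
    "6. Expert/key informant judgment": (True, True, True, False),
    "7. Intervention logic": (True, True, True, True),
}

for name, scores in designs.items():
    failed = [c for c, ok in zip(CRITERIA, scores) if not ok]
    verdict = "candidate" if not failed else "ruled out (" + ", ".join(failed) + ")"
    print(f"{name}: {verdict}")
```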
Below is an example of part of an analysis of the seven major impact evaluation design types for a new national building regulation regime designed to improve the performance of new buildings in a country.
For a simple humorous article which goes through each of the impact evaluation designs and illustrates them see here.
Duignan's Impact Evaluation Feasibility Check for a New National Building Regulatory Regime
Additional examples of this type of analysis are available here.
The Duignan Impact Evaluation Feasibility Check can be used within DoView Visual Monitoring and Evaluation Plans. These are quickly built, accessible evaluation plans which are built around a visual outcomes model drawn in DoView Outcomes and Evaluation software. Information on how to build DoViews is here. Information on how to build DoView Visual Monitoring and Evaluation plans is here. Download the DoView free trial now.
Anyone can use the above material, with acknowledgment, when doing evaluation planning for their own organization or for for-profit or not-for-profit consulting work. However, you can't embed the approach into software or web-based systems without our permission. If you want to embed it in software or web-based systems, please contact firstname.lastname@example.org.
*Reference to cite in regard to this work: Duignan, P. (2009). A concise framework for thinking about the types of evidence provided by monitoring and evaluation. Australasian Evaluation Society International Conference, Canberra, Australia, 31 August – 4 September 2009.