Chapter 4: Designs for Assessing Program Implementation and Effectiveness
Page 47 GAO-12-208G
Table 5: Designs for Assessing Effectiveness of Different Types of Programs

Typical design: Process and outcome monitoring or evaluation
Comparison controlling for alternative explanations: Comparison of performance with preexisting goals or standards, such as
• R&D criteria of relevance, quality, and performance
• productivity, cost-effectiveness, and efficiency standards
• customer expectations or industry benchmarks
Best suited for: Research, enforcement, information and statistical programs, business-like enterprises, and mature, ongoing programs where
• coverage is national and complete
• few, if any, alternatives explain observed outcomes

Typical design: Quasi-experiments: single group
Comparison controlling for alternative explanations: Outcomes for program participants before and after the intervention:
• collects outcome data at multiple points in time
• statistical adjustments or modeling control for alternative causal explanations
Best suited for: Regulatory and other programs where
• clearly defined interventions have distinct starting times
• coverage is national and complete
• randomly assigning participants is NOT feasible, practical, or ethical

Typical design: Quasi-experiments: comparison groups
Comparison controlling for alternative explanations: Outcomes for program participants and a comparison group closely matched to them on key characteristics:
• key characteristics are plausible alternative explanations for a difference in outcomes
• measures outcomes before and after the intervention (pretest, posttest)
Best suited for: Service and other programs where
• clearly defined interventions can be standardized and controlled
• coverage is limited
• randomly assigning participants is NOT feasible, practical, or ethical

Typical design: Randomized experiments: control groups
Comparison controlling for alternative explanations: Outcomes for a randomly assigned treatment group and a nonparticipating control group:
• measures outcomes, preferably before and after the intervention (pretest, posttest)
Best suited for: Service and other programs where
• clearly defined interventions can be standardized and controlled
• coverage is limited
• randomly assigning participants is feasible and ethical

Source: Adapted from Bernholz et al. 2006.
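The comparison-group logic in Table 5 can be sketched numerically. The snippet below is a minimal, hypothetical illustration (all figures are invented, and the function name is ours, not from this guide) of how measuring outcomes before and after the intervention for both a treatment group and a matched comparison group controls for a shared trend: the estimated effect is the treatment group's change minus the comparison group's change, sometimes called a difference-in-differences.

```python
# Hypothetical sketch of the comparison-group (pretest, posttest) design in
# Table 5. All outcome values are invented for illustration only.

def diff_in_diff(treat_pre: float, treat_post: float,
                 comp_pre: float, comp_post: float) -> float:
    """Change in the treatment group minus change in the comparison group.

    Subtracting the comparison group's change removes outcome trends that
    would have occurred even without the intervention (a plausible
    alternative explanation for a pre-post difference).
    """
    return (treat_post - treat_pre) - (comp_post - comp_pre)

# Invented mean outcome scores before and after a hypothetical intervention
effect = diff_in_diff(treat_pre=50.0, treat_post=62.0,
                      comp_pre=51.0, comp_post=55.0)
print(effect)  # 8.0: of the 12-point gain, 4 points reflect the shared trend
```

In a randomized experiment, the same subtraction applies, but random assignment makes the control group's trend a more credible stand-in for what would have happened to participants without the program.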
Some types of federal programs, such as those funding basic research
projects or the development of statistical information, are not expected to
have readily measurable effects on their environment. Therefore,
research programs have been evaluated on the quality of their processes
and products and relevance to their customers’ needs, typically through
expert peer review of portfolios of completed research projects. For
example, the Department of Energy adopted criteria used or
recommended by OMB and the National Academy of Sciences to assess
research and development programs’ relevance, quality, and
performance (U.S. Department of Energy 2004).
Regulatory and law enforcement programs can be evaluated according to
the level of compliance with the pertinent rule or achievement of desired
health or safety conditions, obtained through ongoing outcome