<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Genuine Evaluation &#187; Causal inference strategies</title>
	<atom:link href="http://genuineevaluation.com/category/techniques/causal-inference-strategies/feed/" rel="self" type="application/rss+xml" />
	<link>http://genuineevaluation.com</link>
	<description>Patricia J Rogers and E Jane Davidson blog about real, genuine, authentic, practical evaluation</description>
	<lastBuildDate>Fri, 03 Feb 2012 19:49:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>The Rise and Risk of Evidence</title>
		<link>http://genuineevaluation.com/the-rise-and-risk-of-evidence/</link>
		<comments>http://genuineevaluation.com/the-rise-and-risk-of-evidence/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 06:50:54 +0000</pubDate>
		<dc:creator>Katherine Hay</dc:creator>
				<category><![CDATA[Appropriate inference]]></category>
		<category><![CDATA[Causal inference]]></category>
		<category><![CDATA[Causal inference strategies]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[evidence]]></category>

		<guid isPermaLink="false">http://genuineevaluation.com/?p=2934</guid>
		<description><![CDATA[Our guest blogger this week is Katherine Hay, a senior member of the Evaluation Unit of the International Development Research Centre (IDRC). Based in New Delhi, India, she is an expert on the role of evaluation in development in South Asia. &#8230; <a href="http://genuineevaluation.com/the-rise-and-risk-of-evidence/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[
<p><em><span style="color: #000000;"><a href="http://GenuineEvaluation.com/wp-content/uploads/IMG-20110928-00148.jpg"><img class="alignright size-medium wp-image-2943" title="Katherine Hay" src="http://GenuineEvaluation.com/wp-content/uploads/IMG-20110928-00148-300x225.jpg" alt="Katherine Hay" width="300" height="225" /></a>Our guest blogger this week is Katherine Hay, a senior member of the Evaluation Unit of the International Development Research Centre (IDRC). Based in New Delhi, India, she is an expert on the role of evaluation in development in South Asia. She promotes approaches that assess how women and other marginalized groups benefit from development in the region. </span><span style="color: #000000;"> <span style="color: #000000;">Katherine joined IDRC’s South Asia office in New Delhi in 2000 and has undertaken research in South Asia for more than 15 years. Her work with IDRC includes building evaluation curriculum in universities in the region, and supporting evaluation communities of practice spanning South Asia and Afghanistan. She has written on women’s empowerment, evaluation, and the policy research environment in South Asia. </span><span style="color: #000000;">Katherine holds a master’s degree in international affairs from Carleton University in Ottawa.</span></span></em> <em><span style="color: #000000;"><span style="color: #000000;">Katherine is sharing with us perspectives from her recent keynote address to the conference of the Sri Lankan Evaluation Association.</span></span></em></p>
<p>In reading the newspapers lately, I’ve noticed an increasing expectation that evidence can give us the answers that policy makers need.  I practice evaluation because I believe that evaluation can help distinguish what is working from what is not working, and for whom.  So I should be pleased to see these calls for “the evidence.”  I am… and yet, I am also somewhat alarmed by this faith in data.</p>
<p>Some people seem to suggest that if we would just get enough evidence we will be able to ‘fix’ poverty.  I think that is both naïve and dangerous.  In the New York Times, Nicholas Kristof had a piece called “<a href="http://www.nytimes.com/2011/05/19/opinion/19kristof.html?_r=1&amp;ref=nicholasdkristof">Getting Smart on Humanitarian Aid</a>,” in which he said: “How can we most effectively break cycles of poverty? For decades, we had answers that were mostly anecdotal or hot air. But, increasingly, economists provide answers that are rigorously field-tested.”  That sounds good, but do we really have answers, and to what?</p>
<p>The evidence that Kristof was pointing to drew on the excellent work of Duflo and Banerjee on randomized controlled trials.  Kristof, and a string of other journalists, came to the conclusion that “we now have the answers” based on 2-3 examples that included the cost-effectiveness of improving school attendance by deworming kids and providing them with school uniforms.  I’ve read the studies.  I’m pretty convinced that schools should deworm and that school uniforms in Africa are probably worth the money. But do education policy makers now have all the answers whereas before they just had ‘hot air’?  Not quite.</p>
<p>These are fairly simple interventions.  I don’t doubt that they are helpful. But the idea that we have all the evidence we need, or can get it through trials, is not helpful.  It dumbs down development problems by arguing that, until now, everyone working in development has been running around with no clue.  It suggests that governments, implementing agencies and funding agencies just need to run some experiments to find out what the policy should be.  It’s a simple idea.  But poverty and development are complex.</p>
<p>There is nothing wrong with experiments.  The right tool in any situation is the one that best answers the questions being asked.  My critique is of the idea that development is just about getting the data right, or that evidence is ‘neutral’ or has nothing to do with politics.</p>
<p>Why is this a dangerous idea? Kristof goes on to suggest that “For those who want to be sure, to get the most bang for your buck, there is also a &#8220;proven impact fund&#8221; that supports interventions like deworming…that have proved to be cost-effective in rigorous trials.” But what would happen if we only fund the proven, cost-effective things, the sure things?  It’s hard to be sure about many things that matter.</p>
<p>Funding only the sure things would certainly rule out a great many things that many of us think are important, including work to address climate change, violence against women, son preference, human rights, or conflict.  Much of this work takes generations to show results and is deeply contextual; in many of these areas we don’t have ‘sure things.’</p>
]]></content:encoded>
			<wfw:commentRss>http://genuineevaluation.com/the-rise-and-risk-of-evidence/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Causal inference for program theory evaluation</title>
		<link>http://genuineevaluation.com/causal-inference-for-program-theory-evaluation/</link>
		<comments>http://genuineevaluation.com/causal-inference-for-program-theory-evaluation/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 13:50:25 +0000</pubDate>
		<dc:creator>Patricia Rogers</dc:creator>
				<category><![CDATA[Appropriate inference]]></category>
		<category><![CDATA[Causal inference]]></category>
		<category><![CDATA[Causal inference strategies]]></category>
		<category><![CDATA[Evaluation Theory]]></category>

		<guid isPermaLink="false">http://genuineevaluation.com/?p=1298</guid>
		<description><![CDATA[How do we find out whether programs, projects and policies have really made a difference?  Given the complex array of other influences on the outcomes, is it all too hard?  Jane and I have been doing some separate thinking and writing about this.  Putting these together has produced a new map of the issues which might be very useful. <a href="http://genuineevaluation.com/causal-inference-for-program-theory-evaluation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[
<p>Debates about ways of investigating cause and effect have been a feature of many international evaluation meetings in recent years.  It&#8217;s something I&#8217;ve been wrestling with while finishing a forthcoming book on program theory with Sue Funnell (<strong>Purposeful Program Theory: Effective Use of  Theories of Change and  Logic Models</strong>, to be published by Jossey-Bass).</p>
<div class="wp-caption alignright" style="width: 510px"><img title="Pic by  ralphbijker http://www.flickr.com/photos/17258892@N05/2588347668/" src="http://farm4.static.flickr.com/3149/2588347668_a1006846fa.jpg" alt="" width="500" height="389" /><p class="wp-caption-text">Pic by  ralphbijker http://www.flickr.com/photos/17258892@N05/2588347668/</p></div>
<p>Jane raised the issue of causal inference in a post back in <a href="http://genuineevaluation.com/why-genuine-evaluation-must-include-causal-inference/">February</a>, in a recent <a href="http://realevaluation.co.nz/pres/causation-anzea09.pdf">presentation</a>, and in Chapter 5 (on causation) of her book <a href="http://books.google.co.nz/books?id=ePfuba9tDbEC&amp;lpg=PP1&amp;pg=PA67#v=onepage&amp;q=&amp;f=false" target="_blank">Evaluation Methodology Basics</a>. By causal inference we mean both causal attribution (working out what was THE cause) and causal contribution (identifying what was one or more of the causes that together produced the outcomes and impacts).</p>
<p>We both agree that it is an issue that needs to be tackled in evaluation, in ways that are commensurate with the available evaluation resources, and that don&#8217;t assume it&#8217;s simply a matter of using a particular research design (such as using Randomised Controlled Trials) or data collection method (such as stakeholder interviews).</p>
<p>Where we have taken a different tack is in strategies for causal inference.</p>
<h3>Jane&#8217;s take on causal inference for evaluation</h3>
<p>Jane recently (in her book, Evaluation Methodology Basics) shared a list of 8 strategies for causal inference that can be used for a &#8216;patchwork&#8217; or bricolage approach to causal inference:</p>
<blockquote>
<ol>
<li>Ask those who have observed or experienced the causation first-hand</li>
<li>Check if the content of the intervention (or, supposed cause) matches the nature of the outcome</li>
<li>Look for distinctive effect patterns (Scriven’s modus operandi method)</li>
<li>Check whether the timing of the outcomes makes sense</li>
<li>Look at the relationship between “dose” and “response”</li>
<li>Use a comparison or control (RCTs or quasi-experimental designs)</li>
<li>Control statistically for extraneous variables</li>
<li>Identify and check the causal mechanisms</li>
</ol>
</blockquote>
<h3>My take on causal inference for program theory evaluations</h3>
<p>Looking at causal inference for program theory evaluations, I&#8217;ve been thinking more broadly about three components of causal inference:</p>
<ol>
<li>Congruence with the program theory &#8211; do the results match the program theory?</li>
<li>Counterfactual comparisons &#8211; what would have happened without the intervention?</li>
<li>Critical review &#8211; what are plausible alternative explanations for the results?</li>
</ol>
<h3>Combining the two frameworks</h3>
<p>Combining these two framings of the issue has identified some additional techniques that should be added to the repertoire, such as comparing results with those predicted by statistical models of the theory or by experts. It also explains why experimental design by itself is inadequate for causal inference: it deals with the counterfactual but may not provide sufficient information on the factual (what happened?), and it needs to be complemented by critical review. Some techniques (such as asking participants) can provide all three types of evidence, although none by itself will be sufficient.</p>
<p>A bricolage approach, covering these three different components, with triangulation of sources, seems to be what is needed for really credible causal inference.</p>
<p>What do you think? (And, yes, the cogs in the picture show what happens when there is too much interaction between causal forces).</p>
<p><em>from: Purposeful Program Theory: Effective Use of  Theories of Change and Logic Models, by Patricia J. Rogers and Sue C. Funnell, ISBN: 9780470478578, John Wiley/Jossey-Bass, (in press).</em></p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="177" valign="top">
<p align="center"><strong>Congruence</strong></p>
<p align="center">Do the results match the program theory?</p>
</td>
<td width="202" valign="top">
<p align="center"><strong>Counterfactual comparison</strong></p>
<p align="center">What would have happened without the intervention?</p>
</td>
<td width="190" valign="top">
<p align="center"><strong>Critical review</strong></p>
<p align="center">Are there other plausible explanations of the results?</p>
</td>
</tr>
<tr>
<td width="190" valign="top">
<p>Comparing achievement of intermediate outcomes with achievement of final outcomes</p>
<p>Disaggregating results for complicated interventions</p>
<p>Statistically controlling for extraneous variables</p>
<p>Modus operandi</p>
<p>Comparing timing of outcomes with program theory</p>
<p>Comparing dose-response patterns with program theory</p>
<p>Comparing statistical model with actual results</p>
<p>Comparing expert predictions with actual results</p>
<p>Asking participants</p>
<p>Asking other key informants</p>
<p>Making comparisons across different cases</p>
</td>
<td width="190" valign="top">
<p>Control group or comparison group</p>
<p>Comparing the trajectory before and after the intervention</p>
<p>Thought experiments to develop plausible alternative scenarios</p>
<p>Asking participants</p>
<p>Asking other key informants</p>
<p>Making comparisons across different cases</p>
</td>
<td width="190" valign="top">
<p>Identifying alternative explanations and seeing if they can be ruled out</p>
<p>Identifying and explaining exceptions</p>
<p>Comparing expert predictions with actual results</p>
<p>Asking participants</p>
<p>Asking other key informants</p>
<p>Making comparisons across different cases</p>
</td>
</tr>
</tbody>
</table>
]]></content:encoded>
			<wfw:commentRss>http://genuineevaluation.com/causal-inference-for-program-theory-evaluation/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Intention To Treat and checking for implementation failure and differential effects &#8211; questions about vitamin A trials in Ghana</title>
		<link>http://genuineevaluation.com/intention-to-treat-and-checking-for-implementation-failure-and-differential-effects/</link>
		<comments>http://genuineevaluation.com/intention-to-treat-and-checking-for-implementation-failure-and-differential-effects/#comments</comments>
		<pubDate>Fri, 07 May 2010 09:30:48 +0000</pubDate>
		<dc:creator>Patricia Rogers</dc:creator>
				<category><![CDATA[Causal inference]]></category>
		<category><![CDATA[Causal inference strategies]]></category>
		<category><![CDATA[Health]]></category>
		<category><![CDATA[differential]]></category>
		<category><![CDATA[implementation]]></category>
		<category><![CDATA[ITT]]></category>
		<category><![CDATA[RCT]]></category>

		<guid isPermaLink="false">http://genuineevaluation.com/?p=983</guid>
		<description><![CDATA[Has a large RCT provided definitive proof that vitamin A supplementation is ineffective in reducing maternal mortality?  Or could there be another explanation?  And why hasn't the widespread reporting of these findings examined these questions?

 <a href="http://genuineevaluation.com/intention-to-treat-and-checking-for-implementation-failure-and-differential-effects/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[
<p>A recent RCT in Ghana found that Vitamin A supplementation did not lead to reduced maternal mortality (a finding very different from that of a study in Nepal, which found a 44% reduction). It has raised some important questions for me about how well clinical trial procedures match our understanding of using program theory in evaluation &#8211; in particular, whether implementation failure or differential impacts might explain the findings rather than the ineffectiveness of the program.</p>
<p><img title="Childbirth" src="http://i1024.photobucket.com/albums/y305/patriciajrogers/childbirth.jpg" alt="Childbirth" height="288" /></p>
<p><em>http://www.flickr.com/photos/dfid/4420202860 CC BY-NC-ND 2.0</em></p>
<p>Here are the main findings, according to the summary published in The Lancet, <a href="http://www.thelancet.com/journals/lancet/issue/vol375no9726/PIIS0140-6736%2810%29X6125-6"> Volume 375, Issue 9726</a>,  Pages 1640 &#8211;  1649, 8 May 2010</p>
<div>
<div>
<blockquote>
<div>
<h2>Effect of vitamin A supplementation in women of reproductive age on maternal survival in Ghana (ObaapaVitA): a cluster-randomised, placebo-controlled trial</h2>
<p><strong>Background</strong> &#8211; A previous trial in Nepal showed that supplementation with vitamin A or its precursor (betacarotene) in women of reproductive age reduced pregnancy-related mortality by 44% (95% CI 16&#8211;63). We assessed the effect of vitamin A supplementation in women in Ghana.</p></div>
<div>
<p><strong>Methods</strong> &#8211; ObaapaVitA was a cluster-randomised, double-blind, placebo-controlled trial undertaken in seven districts in Brong Ahafo Region in Ghana. The trial area was divided into 1086 small geographical clusters of compounds with fieldwork areas consisting of four contiguous clusters. All women of reproductive age (15&#8211;45 years) who gave informed consent and who planned to remain in the area for at least 3 months were recruited. Participants were randomly assigned by cluster of residence to receive a vitamin A supplement (25 000 IU retinol equivalents) or placebo capsule orally once every week. Randomisation was blocked and based on an independent, computer-generated list of numbers, with two clusters in each fieldwork area allocated to vitamin A supplementation and two to placebo. Capsules were distributed during home visits undertaken every 4 weeks, when data were gathered on pregnancies, births, and deaths. Primary outcomes were pregnancy-related mortality and all-cause female mortality. Cause of death was established by verbal post mortems. Analysis was by intention to treat (ITT) with random-effects regression to account for the cluster-randomised design. Adverse events were synonymous with the trial outcomes. This trial is registered with <a href="http://clinicaltrials.gov/" target="_blank">ClinicalTrials.gov</a>, number <a href="http://clinicaltrials.gov/ct2/show/NCT00211341" target="_blank">NCT00211341</a>.</p></div>
<div>
<p><strong>Findings</strong> &#8211; 544 clusters (104 484 women) were randomly assigned to vitamin A supplementation and 542 clusters (103 297 women) were assigned to placebo. The main reason for participant drop out was migration out of the study area. In the ITT analysis, there were 39 601 pregnancies and 138 pregnancy-related deaths in the vitamin A supplementation group (348 deaths per 100 000 pregnancies) compared with 39 234 pregnancies and 148 pregnancy-related deaths in the placebo group (377 per 100 000 pregnancies); adjusted odds ratio 0·92, 95% CI 0·73&#8211;1·17; p=0·51. 1326 women died in 292 560 woman-years in the vitamin A supplementation group (453 deaths per 100 000 years) compared with 1298 deaths in 289 310 woman-years in the placebo group (449 per 100 000 years); adjusted rate ratio 1·01, 0·93&#8211;1·09; p=0·85.</p></div>
<div>
<p><strong>Interpretation</strong> &#8211; The body of evidence, although limited, does not support inclusion of vitamin A supplementation for women in either safe motherhood or child survival strategies.</p></div>
<p><strong>Funding</strong> &#8211; UK Department for International Development, and USAID.</p></blockquote>
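<p>As a quick arithmetic check on the Findings quoted above, the sketch below recomputes the pregnancy-related death rates per 100 000 pregnancies and a crude odds ratio from the published counts. The function names are illustrative only, and the crude odds ratio ignores the clustering and adjustment used in the published analysis, so it should only be expected to land near the reported 0·92, not to reproduce it.</p>

```python
# Counts copied from the Findings paragraph of the Lancet summary quoted above.
ARMS = {
    "vitamin A": {"pregnancies": 39_601, "deaths": 138},
    "placebo": {"pregnancies": 39_234, "deaths": 148},
}


def deaths_per_100k(deaths, pregnancies):
    """Pregnancy-related deaths per 100 000 pregnancies."""
    return 100_000 * deaths / pregnancies


def crude_odds_ratio(treated, control):
    """Unadjusted odds ratio (treated vs control).

    Assumption: a plain 2x2 calculation that ignores the cluster design,
    unlike the adjusted estimate reported in the paper.
    """
    odds_treated = treated["deaths"] / (treated["pregnancies"] - treated["deaths"])
    odds_control = control["deaths"] / (control["pregnancies"] - control["deaths"])
    return odds_treated / odds_control


rates = {arm: deaths_per_100k(**counts) for arm, counts in ARMS.items()}
crude_or = crude_odds_ratio(ARMS["vitamin A"], ARMS["placebo"])

for arm, rate in rates.items():
    print(f"{arm}: {rate:.0f} deaths per 100 000 pregnancies")
print(f"crude odds ratio: {crude_or:.3f}")
```

<p>The recomputed rates agree with the 348 and 377 per 100 000 given in the summary, which is reassuring about the counts but says nothing about adherence or subgroups.</p>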
<p>The first author of the study, Professor Betty Kirkwood, was quoted by <a href="http://www.alertnet.org/thenews/newsdesk/LDE63T0BH.htm">Reuters</a>:</p>
<blockquote><p>Kirkwood noted that the Ghana findings contradicted previous results from a trial in Nepal which showed a 44 percent decrease in maternal death among women given vitamin A. She said the discrepancy showed why it is wise not to rush to change maternal health policies on the basis of one piece of research. &#8220;Research is as important to identify potentially good ideas that do not work, as it is in establishing those that do,&#8221; she wrote. &#8220;This avoids governments wasting resources on ineffective interventions.&#8221;</p></blockquote>
<p>Now it might be the case that the study has provided compelling evidence that the intervention is ineffective &#8211; but I have two lingering questions about it, and none of the available reports have provided enough information to answer these.  <em>(If there is additional information available, please send me the details).</em></p>
<h2>Is this a case of misinterpreting ITT &#8211; Intention To Treat &#8211; data?</h2>
<p>The study involved an ITT analysis &#8211; that is, analyzing the results for those assigned to the groups <strong>regardless</strong> of compliance with the treatment.  The results for the treatment group therefore include everyone who was assigned to it, including those who did not actually take the supplements.  So if there was low adherence, the intervention would appear to be ineffective, even if it was effective for those who took the supplements.  Was there low adherence?  This information is not provided in the available summary (my online journal subscriptions don&#8217;t include the most recent 2 months&#8217; issues) or in any of the reports on the study.</p>
<p>In fact some of the reporting referred to the effect of TAKING the supplements, rather than the effect of being SUPPLIED with the supplements:</p></div>
</div>
<p><a href="http://www.ghanaweb.com/GhanaHomePage/NewsArchive/artikel.php?ID=161205">GhanaWeb</a> reported it as follows:</p>
<blockquote><p>Accra, April 27, GNA &#8211; A study conducted in Kintampo by a team of  experts has shown beyond doubt that taking Vitamin A does not reduce the  risk of death in pregnancy or childbirth.</p></blockquote>
<p>By contrast, <a href="http://uk.reuters.com/article/idUKTRE6425R720100503">Reuters </a>correctly reported what had been tested:</p>
<blockquote><p><span id="articleText"><span>Giving vitamin A to  women aged between 15 and 45 in poor nations does not cut maternal death  rates, scientists said Tuesday in a study which contradicts earlier  research showing a dramatic drop in death rates.</span></span></p></blockquote>
<p><span><span>There is an important debate going on about the use of ITT analysis.  Clearly it is important to know the end result of a policy change &#8211; but it is also essential to understand whether a lack of effects is due to implementation failure or theory failure. </span></span></p>
<p><a href="http://www.jerrydallal.com/LHSP/itt.htm">Gerard Dallal</a> has set out an interesting analysis of this issue which focuses on the fact that ITT analysis is appropriate for answering the question &#8220;What happens once a treatment is started or recommended?&#8221; but it does not answer the question about whether it is effective for those who follow the treatment.</p>
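<p>To make the gap between those two questions concrete, here is a minimal sketch of how non-adherence dilutes an ITT estimate. It rests on strong simplifying assumptions that are mine, not the trial's: non-adherers have the same risk as the placebo group, and adherers are no different at baseline from anyone else. The adherence levels are hypothetical, and the 44% efficacy is borrowed from the Nepal result purely as an illustration, not as a claim about Ghana.</p>

```python
def itt_relative_reduction(adherence, efficacy):
    """Relative risk reduction visible in an ITT comparison.

    adherence: share of the treatment arm that actually takes the supplement.
    efficacy:  relative risk reduction among those who take it.
    Assumption: non-adherers experience the placebo-arm risk.
    """
    if not (0.0 <= adherence <= 1.0 and 0.0 <= efficacy <= 1.0):
        raise ValueError("adherence and efficacy must be between 0 and 1")
    return adherence * efficacy


def itt_rate(base_rate, adherence, efficacy):
    """Expected event rate in the treatment arm under the same assumption."""
    return base_rate * (1 - itt_relative_reduction(adherence, efficacy))


PLACEBO_RATE = 377          # deaths per 100 000 pregnancies, from the summary
ILLUSTRATIVE_EFFICACY = 0.44  # hypothetical: the Nepal figure, used only as an example

for adherence in (1.0, 0.75, 0.5, 0.25):  # hypothetical adherence levels
    reduction = itt_relative_reduction(adherence, ILLUSTRATIVE_EFFICACY)
    rate = itt_rate(PLACEBO_RATE, adherence, ILLUSTRATIVE_EFFICACY)
    print(f"adherence {adherence:.0%}: ITT reduction {reduction:.0%}, "
          f"treatment-arm rate {rate:.0f} per 100 000")
```

<p>Under these assumptions the ITT effect shrinks in direct proportion to adherence, which is why a summary that reports only the ITT result cannot, by itself, separate implementation failure from theory failure.</p>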
<p><span><span>For public policy issues, we need to know the latter &#8211; and if there is a big difference between the two we need to investigate ways to improve adherence.  This is what <a href="http://www.springerlink.com/content/t508k0uj27074305/fulltext.pdf">a group of researchers</a> have advocated for studies of the effectiveness of microbicides for HIV prevention, where adherence is an important issue (and very hard to measure). </span></span></p>
<p><span><span>At the very least we need to be clear in the reporting of findings whether or not there was an effect for those compliant with the treatment. But even better would be to go beyond a simple summative judgement (did the policy work?) and look at whether there is scope to improve it.</span></span></p>
<h2><span><span>Were those who were most likely to benefit from the treatment screened out of the study?</span></span></h2>
<p><span><span>Who would be most likely to benefit from vitamin A supplementation?  Women who had vitamin A deficiency of course.  Ideally the study would have been able to report on whether or not the results were different for those with vitamin A deficiencies as the effects might be lost in the noise from the total population.</span></span></p>
<p><span><span>The information about the trial recorded at <a href="http://clinicaltrials.gov/ct2/show/NCT00211341">http://clinicaltrials.gov</a> had this surprising statement:</span></span></p>
<blockquote><p>There will be no exclusions to participation, except for women who  have  nightblindness or other signs of VAD.  These, and any women who  develop  VAD in the course of the study will be treated according to  current  IVACG recommendations (IVACG, 1997). They will continue to be  followed,  but will be given vitamin A and considered separately in the  analysis.</p></blockquote>
<p>These women might have been considered separately in the analysis, but their participation (or exclusion) and outcomes have not been reported in the summaries of the project.  If they are being excluded from the study on ethical grounds, to avoid them being denied a proven treatment for Vitamin A deficiency, wouldn&#8217;t that reduce the likelihood of the study producing significant effects &#8211; not because Vitamin A is ineffective, but because those for whom it would have made a difference were not included in the study?</p>
<p>I&#8217;d be very interested to hear more about this study, and whether the full report addresses these issues.  And of course I&#8217;ll be following  how these findings are reported.</p>
]]></content:encoded>
			<wfw:commentRss>http://genuineevaluation.com/intention-to-treat-and-checking-for-implementation-failure-and-differential-effects/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>£6 million over 5 years &#8211; and STILL no genuine evaluation of Blueprint?</title>
		<link>http://genuineevaluation.com/6-million-over-5-years-and-still-no-genuine-evaluation-of-blueprint/</link>
		<comments>http://genuineevaluation.com/6-million-over-5-years-and-still-no-genuine-evaluation-of-blueprint/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 09:21:29 +0000</pubDate>
		<dc:creator>Jane Davidson</dc:creator>
				<category><![CDATA[Adequate scope]]></category>
		<category><![CDATA[Causal inference]]></category>
		<category><![CDATA[Causal inference strategies]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Evaluation team composition]]></category>
		<category><![CDATA[Evaluative questions & answers]]></category>
		<category><![CDATA[Government programs]]></category>
		<category><![CDATA[Learning from failure]]></category>
		<category><![CDATA[The client's role]]></category>
		<category><![CDATA[comparisons]]></category>
		<category><![CDATA[drug education]]></category>
		<category><![CDATA[government]]></category>
		<category><![CDATA[sample size]]></category>
		<category><![CDATA[statistical power]]></category>
		<category><![CDATA[UK]]></category>

		<guid isPermaLink="false">http://genuineevaluation.com/?p=797</guid>
		<description><![CDATA[When a large and expensive evaluation fails to produce useful results, it&#8217;s worth seeing if at least it can be useful as a cautionary tale. Blueprint is a UK Government-funded drugs education programme consisting of five components: drug education in &#8230; <a href="http://genuineevaluation.com/6-million-over-5-years-and-still-no-genuine-evaluation-of-blueprint/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[
<p>When a large and expensive evaluation fails to produce useful results, it&#8217;s worth seeing if at least it can be useful as a cautionary tale.</p>
<p><a href="http://drugs.homeoffice.gov.uk/young-people/blueprint/index.html" target="_blank">Blueprint</a> is a UK Government-funded drugs education programme consisting of five components:  drug education in schools (for 11 and 12-year-olds &#8211; this was the main emphasis), drug education for parents, media coverage of Blueprint, partnerships with other key health policy stakeholders, and a set of shared principles for drug education across prevention practitioners in the community. Blueprint was developed to support the UK Government target to ‘reduce the use of Class A drugs and the frequent use of any illicit drugs among all young people under the age of 25, especially by the most vulnerable people’.</p>
<p>According to <a href="http://www.bbc.co.uk/blogs/thereporters/markeaston/2009/09/project_blueprint.html" target="_blank">BBC home editor Mark Easton&#8217;s blog</a>, the evaluation budget for this 5-year longitudinal study was £6 million. Surely it&#8217;s reasonable to expect a genuine evaluation that actually answers important questions when you&#8217;ve got that kind of budget &#8211; right?</p>
<p>Right &#8230; Check out this quote from the <a href="http://www.ism.stir.ac.uk/pdf_docs/Blueprint/finalreport.pdf" target="_blank">Blueprint evaluation report</a> (p. 71):</p>
<blockquote><p><span style="text-decoration: underline;">Measuring Impact and Outcomes</span></p>
<p>It was originally intended that this report would describe the impacts and outcomes of the Blueprint programme. As described in the delivery report (Stead 2007), the six local schools were to act as a comparator to the 23 Blueprint schools, so that conclusions could be drawn on the efficacy of the programme. However, when analysis during the early stages of the evaluation concluded that to detect differences between the two samples would require a sample of at least 50 schools, it was decided that the implementation of the programme would become the main focus. Nonetheless, it was still planned that the local school data would be presented alongside the Blueprint school data, to enable some comparisons to be drawn between the two samples. However, a recent review concluded that to present the data in this way would be misleading, given that the two samples are not matched, and to make comparisons between the two samples or to draw conclusions on the efficacy of the programme, based on these findings, would be wrong. While it is disappointing that this stage of the evaluation has not met expectations, the evaluation has looked in detail at how the programme was received and provides valuable insights for future initiatives of this kind.</p></blockquote>
<p>The lack of an adequate comparison sample is a basic and foreseeable flaw in the research that was picked up in the EARLY stages of the evaluation. <strong>Why wasn&#8217;t this rectified at the time by adding more schools to the comparison sample?</strong> How could a <em>process evaluation</em> (looking only at &#8216;implementation&#8217;) possibly be worth £6 million of taxpayers&#8217; money as a knowledge- and insight-generating exercise?</p>
<p><strong>Where was the client</strong> (presumably the UK Home Office) when these discoveries were appearing? Why weren&#8217;t they insisting on a design that would actually answer what they needed to know? And if the evaluation team couldn&#8217;t deliver within such a substantial budget, what possessed them to allow that much money to continue to be spent on such a low-value evaluation?</p>
<p>Even if the size of the comparison sample was inadequate for the intended design to be the SOLE basis for inferring causation, <strong>what consideration was given to complementing the less-than-adequate comparison data with other causal inference methods?</strong> With a bit of creative thinking, it&#8217;s possible that a less-than-adequate comparison could have been strengthened with some quantitative, qualitative, and mixed-method strategies so that there might at least have been an approximate answer to the causal inference/outcome-related questions.</p>
<p>The acknowledgments page presents a long list of <strong>44 researchers</strong> (are we surprised they don&#8217;t list themselves as evaluators?) <strong>from seven universities and research organisations</strong> who worked on the Blueprint evaluation.</p>
<p>All this strikes me as a classic case of four themes I frequently see when evaluation has been a waste of time and money:</p>
<ol>
<li><strong>Hiring researchers to do an evaluator&#8217;s job</strong> &#8211; in my experience, those whose career focus is research usually lack the necessary expertise in evaluation to be able to scope the work adequately to get the most important questions answered and weed out the &#8220;wouldn&#8217;t it be nice to know&#8221; lines of inquiry. The fault lies on both sides &#8211; the client&#8217;s (it&#8217;s like hiring an electrician to do your plumbing, or vice versa &#8211; and saying &#8220;how was I to know there was a difference?&#8221; won&#8217;t cut it), and the researchers&#8217; (for presenting themselves as being sufficiently skilled in something they are not).</li>
<li><strong>Setting the evaluation plan in concrete way too early</strong> &#8211; an evaluation needs to retain a level of fluidity and flexibility in its design so that it can be changed and improved midstream if found to be lacking, or if the stakeholders&#8217; needs or questions change. This fluidity is often in conflict with the traditional research approach, particularly when the research team is hoping to get academic publications out of the work, because of the traditional emphasis on the exact same measures being used, and the timing of measures being identical for experimental and comparison groups. Of course a pure Randomised Controlled Trial design (with random assignment to the treatment group or the control group) does require this to be done up front, but in some cases a moderate loss of design purity can be an acceptable price to pay for better questions, measures, comparison opportunities, or other causal evidence.</li>
<li><strong>The &#8220;more the merrier&#8221; approach to evaluation team composition</strong> &#8211; too many people and organisations are involved, which often diffuses both responsibility and expertise. Most importantly, the people who might have the right expertise are frequently not involved at critical stages of evaluation design, so less qualified individuals make decisions that are flawed and expensive to reverse.</li>
<li><strong>Failing to pull the plug when the evaluation is clearly not going to deliver sufficient valuable insights</strong> &#8211; According to the UK Telegraph newspaper, <a href="http://www.telegraph.co.uk/news/newstopics/politics/labour/6228900/Bob-Ainsworth-wasted-6-million-on-pointless-research.html" target="_blank">academic advisers to the Home Office told one of the ministers overseeing the evaluation that it would be a waste of public funds</a>. An email sighted by the BBC from one of the advisers, Professor Sheila Bird, said: &#8220;I/we thought the decision-making so obvious = NOT to go ahead that we did not assiduously follow-up to ensure that the obvious decision was actually made!&#8221;</li>
</ol>
<p>What&#8217;s particularly irritating to me (especially as a taxpayer) is that these same mistakes are made over and over again, often by different parts of the same organisation. If client organisations &#8211; those who commission evaluation &#8211; don&#8217;t insist on value-for-money genuine evaluation, they can expect more of the same in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://genuineevaluation.com/6-million-over-5-years-and-still-no-genuine-evaluation-of-blueprint/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Advocating for RCTs &#8211; with a non-RCT example?</title>
		<link>http://genuineevaluation.com/advocating-for-rcts-with-a-non-rct-example/</link>
		<comments>http://genuineevaluation.com/advocating-for-rcts-with-a-non-rct-example/#comments</comments>
		<pubDate>Wed, 24 Mar 2010 00:02:48 +0000</pubDate>
		<dc:creator>Patricia Rogers</dc:creator>
				<category><![CDATA[Causal inference]]></category>
		<category><![CDATA[Causal inference strategies]]></category>
		<category><![CDATA[Government programs]]></category>
		<category><![CDATA[evidence-based policy]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[RCTs]]></category>

		<guid isPermaLink="false">http://genuineevaluation.com/?p=709</guid>
		<description><![CDATA[Curious post by Tim Harford in the Financial Times recently  "Political ideas need proper testing"  that slides from advocating for better empirical investigation of public policy by systematic experimentation to discussing this only in terms of RCTs - and then uses as the exemplar a brilliant example of using other types of evidence to inform policy. <a href="http://genuineevaluation.com/advocating-for-rcts-with-a-non-rct-example/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[
<p>Tim Harford recently published a curious post in the Financial Times, <a href="http://timharford.com/2010/03/ft-comment-political-ideas-need-proper-testing/">&#8220;Political ideas need proper testing&#8221;</a>. As is often the case, his argument slid from advocating better empirical investigation of public policy through systematic experimentation to discussing this only in terms of RCTs (Randomised Controlled Trials). I was reminded of <a href="http://survey.ate.wmich.edu/jmde/index.php/jmde_1/article/view/160/186">Michael Scriven&#8217;s critique</a> of the appropriation of the word &#8216;experimental&#8217; (in a recent issue of the Journal of MultiDisciplinary Evaluation):</p>
<blockquote><p>&#8230; the RCT campaign also involves the less-remarked parallel effort, going back further, to redefine the concept of an experiment. In standard scientific usage, experiments are just carefully constrained explorations, and the RCT is simply a special case of these. To call the RCT the only “true experiment” is part of an attempt at redefinition that distorts the original and continuing usage, and excludes experiments designed to test many simple hypotheses about—or simple efforts to find out—what happens if we do this.<br />
This effort at persuasive redefinition is allied with an implicit denigration of the so-called “quasi-experimental” designs, which are in fact perfectly respectable experiments, only ‘quasi’ with respect to the one respect in which they have less control over one possible way of excluding one type of alternative explanation.</p></blockquote>
<p>What makes the recent post particularly surprising is that he advocates for RCTs through an example that clearly shows the value of drawing on other types of evidence &#8211; the changing advice on infant sleeping positions to reduce the risk of Sudden Infant Death Syndrome. No RCTs were used to investigate the effectiveness of this advice, and yet adequate evidence was produced to make effective changes in policy.</p>
<p>I recently used the same example to show the value and feasibility of using a range of credible evidence about effectiveness, including non-experimental designs. It will be published shortly by the Productivity Commission in proceedings from a <a href="http://www.pc.gov.au/research/confproc/strengthening-evidence">Roundtable</a> on Strengthening Evidence-based Policy in the Australian Federation.</p>
<blockquote>
<h3>Non-RCT data can provide good quality evidence of effectiveness</h3>
<p>Good quality evidence of effectiveness can also come from quasi-experimental approaches, which compare program participants to a comparison group rather than to a randomly assigned control group, and from non-experimental approaches, when such approaches systematically and rigorously test causal conclusions and combine evidence thoughtfully.</p>
<p>Sudden Infant Death Syndrome (SIDS) is one of two exemplars in the National Health and Medical Research Council guide <a href="http://www.nhmrc.gov.au/_files_nhmrc/file/publications/synopses/cp71.pdf"><em>How to Put the Evidence into Practice: Implementation and Dissemination Strategies</em></a> (NHMRC 2000).</p>
<p>It shows both the value of drawing on a diverse set of evidence and how it is possible to develop effective policy even when the evidence is not definitive. Bringing together evidence from many studies, including retrospective and prospective epidemiological studies, pathological studies and case studies, a number of possible contributing factors were identified, and other possible causes (such as vaccinations) were ruled out.</p>
<p>On the basis of this incomplete evidence, recommendations were developed — to put babies to sleep on their backs, avoid overheating and avoid cigarette smoke. No RCTs were used to test the effectiveness of these recommendations. The recommendations were communicated directly to parents and to health professionals working with parents, resulting in widespread change in the sleeping positions they used for infants.</p>
<p>By 2005, the number of SIDS deaths had been reduced to fewer than 100, a decline of 83 per cent (ABS 2007).</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://genuineevaluation.com/advocating-for-rcts-with-a-non-rct-example/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

