Outcome Based Government

By Toby Eccles and Sarah Doyle

A few weeks back we were at a retreat for Social Impact Bond developers on an island on the Muskoka Lakes in Canada. It was a sensational venue (thank you to the Breuninger and BMW Foundations, and the MaRS Centre for Impact Investing) and a really interesting few days.

One of the topics that a small group of us spent some time thinking about was: what does Outcome Based Government look like? Social Impact Bonds are a really interesting model for a range of opportunities, but they are still a specialist subject that will only ever affect a modest amount of overall funding. There is, however, a need for a much greater proportion of government spending to use some of the elements that SIBs encourage. In particular:

  • Deciding what you are trying to achieve and how you are going to measure it before you start;
  • Benefiting from the combination of outcomes data and flexible contracting to use rapid iteration and adaptation models to test, learn and improve services on a continual basis;
  • Transparency on what has been achieved;
  • Explicitly analysing interactions between the outcomes you are trying to achieve and other parts of government expenditure, and measuring their impact on each other.

These would have some fairly profound implications. Examples might be:

  • The expectation that, wherever feasible, any policy idea would be treated as a hypothesis or series of hypotheses to be tested in an experimental way, informing future decisions on the potential to expand funding or commit to a wider roll out. This would potentially allow more policy ideas to be tested, with only a selection implemented on a wider scale;
  • Government would over time learn what it actually costs its own services and external service providers to produce certain outcomes, making them easier to compare;
  • There would be a much clearer understanding of what works and whether a given initiative had been successful;
  • Budgets would be set to include both funds and measurable outcomes (tied to specific funding streams), at a sufficient level of granularity to demonstrate success or failure. This could happen at multiple levels of government, both for departments and within departments;
  • Service providers would benefit from longer-term contracts, increased flexibility to adapt over the course of the contract with a view to delivering on outcomes, and a reduced reporting burden regarding day-to-day operations and interim outputs;
  • New partnerships would be incentivized within government, cutting across traditional silos, and outside of government, where a consortium of service providers (generally coordinated by a prime contractor) may have a better chance of delivering on target outcomes;
  • For it to work there would need to be, at the very least, an independent entity defining, assessing or auditing the outcomes that government is using, to ensure rigour and accountability, and to avoid politicization.

This is in part about better measurement, but it is also about understanding that we live in a complex world, that we can’t produce predictable outcomes simply by better planning, and that instead we need to create feedback loops, gather information, and adapt accordingly. In short, we need social services to have a rigorous model of learning that generates knowledge that can be built upon and improved. The path to getting there is not straightforward. It strikes me that there are at least four products that, alongside the SIB, can help move government along this path:

  • Support for bringing outcomes into the budgeting, financial planning and management processes;
  • Help with bringing outcome elements into contract renewals in such a way as to drive improved services;
  • Procurement and contracting models that build in feedback loops and an expectation that the contract will adapt over time, rather than stay the same;
  • A better model for outcome work at scale than the present large-scale national or state-wide procurement processes that we are seeing with the likes of the Work Programme or Transforming Justice. This would involve developing a model that allowed feedback and learning, starting experimentally in two or three areas. For example, one could create a framework of sought-after outcomes and the maximum values that can be paid for them, a referral methodology, and an initial community of providers. Thereafter providers can be added, say annually, and the outcome values can be changed according to what outcomes are generated for what value. There would be transparency about what providers are doing and the outcomes they are achieving. Toby will be writing more on this soon.

Would this work? What else is needed? All thoughts welcome!

PS This is intended as a starter for discussion. There are plenty of areas of government where an outcomes approach may be inappropriate. There are also plenty of poor ways of introducing outcomes that simply become target cultures with strange perverse incentives, or the creation of meaningless numbers outside of government control. Previous attempts have tended to create top down targets as a way of managing from the centre, rather than as a way of creating feedback loops from the outside. We hope to explore some of these challenges and issues in future blogs.

Thanks to Peter Barth @ Third Sector Capital, and Caitlin Reimers @ Social Finance US for a great conversation…



Further data on the Peterborough Social Impact Bond

The Office for National Statistics provided further data on Peterborough at the end of July, this time on the complete first cohort of 1,000 prisoners.

While this is largely confirmatory information, the Ministry of Justice has found a more closely matched baseline by focusing on local prisons rather than all prisons nationally. This responds to the concern that Peterborough may be hard to emulate, or unrepresentative, because it is a local prison and therefore returns more of its prisoners to the local area.

The updated data looks like this:

Peterborough (and national equivalent) interim re-conviction figures of cohort 1 with a 6 month re-conviction period

[Table: Peterborough and national local prisons, by discharge period (Sep06-Jun08, Sep07-Jun09, Sep08-Jun10, Sep10-Jun12), with cohort size, binary and frequency measures]

Binary: Reconviction rate over six months
Frequency: Frequency of reconviction events per 100 offenders within six months
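
To make the two measures concrete, here is a minimal sketch of how they could be computed from per-offender counts. The function name, the input format and the ten-person cohort are all assumptions made for illustration; this is not the Ministry of Justice's method or data.

```python
def reconviction_measures(reconvictions_per_offender):
    """Return (binary_rate_percent, frequency_per_100) for one cohort.

    reconvictions_per_offender: one integer per discharged offender, giving
    the number of reconviction events within the six-month window.
    """
    cohort_size = len(reconvictions_per_offender)
    if cohort_size == 0:
        raise ValueError("cohort must contain at least one offender")

    # Binary: share of offenders with at least one reconviction event.
    reconvicted = sum(1 for n in reconvictions_per_offender if n > 0)
    # Frequency: total reconviction events, scaled to 100 offenders.
    events = sum(reconvictions_per_offender)

    binary_rate = 100.0 * reconvicted / cohort_size
    frequency = 100.0 * events / cohort_size
    return binary_rate, frequency


# Hypothetical cohort of 10 offenders (illustrative only, not real data).
example_cohort = [0, 0, 3, 0, 1, 0, 0, 2, 0, 0]
binary, frequency = reconviction_measures(example_cohort)
print(f"Binary: {binary:.1f}%  Frequency: {frequency:.1f} per 100 offenders")
```

In this made-up cohort three of ten offenders are reconvicted, but between them they account for six events, which is why the frequency measure can move differently from the binary one.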

A few topics to cover:

– Is this a better baseline and therefore does it give us greater confidence in the effect that Peterborough is having?

– Is this data good, or mixed as some have reported?

1. Is this a better baseline?
It should be, as it better matches the Peterborough cohort. As an experiment, I thought I would put together similar graphs to the ones before and compare them.

[Figure: Rebased reoffending data (data to March, with national baseline)]

[Figure: Rebased reoffending data 2 (data to June, with national local baseline)]

And now the relative change graphs

[Figure: Peterborough relative to national (data to March, with national baseline)]

[Figure: Peterborough relative to national 2 (data to June, with national local baseline)]

What this shows visually is that the new baseline appears to be a better fit. Movements in the baseline prior to the intervention are closer to the movements in the Peterborough cohort, in other words the baseline appears to explain more of the movement in the Peterborough data. So it gives us greater confidence that we are seeing an intervention effect.
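
For anyone who wants to go beyond eyeballing the graphs, one rough way to put a number on "better fit" is sketched below: rebase each series to 100 at the first period, then measure the average gap between Peterborough and the baseline over the pre-intervention periods. The function names and every number here are hypothetical, chosen only to show the mechanics, and the gap measure is an assumption of this sketch rather than anything used in the official analysis.

```python
def rebase(series, base=100.0):
    """Rescale a series so that its first value equals `base`."""
    first = series[0]
    return [base * value / first for value in series]


def mean_abs_gap(target, baseline, n_pre):
    """Average absolute gap between two rebased series over the first
    n_pre (pre-intervention) periods. Smaller means a closer fit.
    The first period always contributes zero, since both start at `base`."""
    rebased_target = rebase(target)[:n_pre]
    rebased_baseline = rebase(baseline)[:n_pre]
    gaps = [abs(t - b) for t, b in zip(rebased_target, rebased_baseline)]
    return sum(gaps) / n_pre


# Hypothetical frequency figures for four cohorts; the first three are
# pre-intervention. These are NOT the published numbers.
peterborough = [40.0, 42.0, 41.0, 38.0]
national_all = [30.0, 30.6, 31.5, 34.0]
national_local = [50.0, 52.0, 51.5, 55.0]

print("Gap vs national baseline:      ", mean_abs_gap(peterborough, national_all, 3))
print("Gap vs national local baseline:", mean_abs_gap(peterborough, national_local, 3))
```

With these invented figures the local-prison baseline tracks Peterborough more closely before the intervention, which is the pattern the graphs suggest.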

It also gives us a greater degree of confidence that we will get paid. The previous data ended with Peterborough’s frequency number equalling the national average. This one ends with Peterborough improving upon it. The propensity score matching process should bring out a comparison cohort that is even more similar, but of course we still haven’t tried it.

So, is it time to pop open the champagne and celebrate? Not yet. This is good news, but it is still only on six month data. We will be measured on whether we reduce offending over twelve months. What we can say is that our intervention appears to at least delay reoffending behaviour.

We should also say, this is only the first cohort of the first Social Impact Bond. It is incredibly early days so drawing significant conclusions at this stage is premature. On the other hand, we are learning and developing all the time, so the fact that we see a significant impact on such an early group is clearly exciting.

2. So is this data good, or mixed as some have reported?

We are cautious, because this is early days and early data. It isn’t a randomised controlled trial, sure. Nor is it the formal comparison cohort that will be developed for payment purposes using propensity score matching. But this is very positive data, on the best available information.

In the first set of results, which were also good, one of the caveats people put forward was that the Peterborough frequency was now only the national average. On this closer baseline this is no longer the case.

Another concern was that the jump in re-offending frequency in the national data should be treated with caution. I understand that, and see the potential for regression to the mean, but a comparison with national data is more precise than a comparison with historical data. Thus the 20% relative decline is a more useful figure than the 8% decline against historical figures, particularly given the strong historical correlation between the local prison data and the Peterborough data.
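
To show how a historical decline and a relative decline can differ so much, here is a small sketch of the arithmetic. The index values are hypothetical, picked so the results come out the same size as the percentages quoted above; they are not the published figures, and the ratio-of-ratios calculation is an assumption about how a relative decline might be computed.

```python
def percent_change(before, after):
    """Simple percentage change of one series against its own history."""
    return 100.0 * (after - before) / before


def relative_change(target_before, target_after, base_before, base_after):
    """Percentage change in the target after dividing out the change seen
    in the baseline over the same period (a ratio of ratios)."""
    target_ratio = target_after / target_before
    base_ratio = base_after / base_before
    return 100.0 * (target_ratio / base_ratio - 1.0)


# Hypothetical indexed frequencies (previous period = 100), not real data.
peterborough_before, peterborough_after = 100.0, 92.0
baseline_before, baseline_after = 100.0, 115.0

historical = percent_change(peterborough_before, peterborough_after)
relative = relative_change(peterborough_before, peterborough_after,
                           baseline_before, baseline_after)
print(f"Against own history:   {historical:.1f}%")
print(f"Relative to baseline:  {relative:.1f}%")
```

The point of the sketch is simply that when the baseline rises while Peterborough falls, the relative figure captures both movements, whereas the historical figure sees only one of them.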

It is important to draw a distinction between responding with caution, on the basis of the caveats outlined above, and saying that results are “mixed” as we have seen in a few quarters. They’re not mixed, they’re surprisingly strong – but early and indicative at this stage.