Recommendation 20

20 Accepted

Government still faces challenges in robustly evaluating and sharing AI pilot learning.

Conclusion

We questioned DSIT on how it is evaluating and sharing learning from AI pilot activity across government to avoid reinventing the wheel and to support AI adoption at scale. It told us that it was taking a range of approaches including developing guidance and identifying good practice case studies, establishing communities of AI practitioners, and adopting an ‘open build’ approach to developing AI tools, publishing their results to make them available to others in the public and private sectors.47 A witness from the Department said that “you want to run pilots, work out what works and then scale them”.48 It also confirmed that the AI Knowledge Hub that it is piloting is intended to bring together advice and guidance in one place and make it accessible and user–friendly.49

On the issue of robust evaluation of AI pilots, the Cabinet Office told us that the Evaluation Taskforce had recently published guidance for evaluating the impact of AI tools. It also told us there is a lot of guidance on evaluating programmes from a financial, commercial and technological perspective, but the next challenge was to ensure that these techniques were understood and people knew where to go for support.50

41 C&AG’s Report, para 12
42 C&AG’s Report, para 12 and Figure 7
43 Qq 17, 24
44 Q 3
45 Q 3
46 C&AG’s Report, para 3.5
47 Q 21

Government response summary AI-generated

DSIT is establishing workstreams to gather and share insights from AI pilots, including the Prime Minister's AI Exemplars, to identify conditions for successful scaling. This includes assessing maturity, identifying core components for scaling, resolving bureaucratic blockers, and creating guidance with HMT for consistent and rigorous impact evaluation.

The text above is a summary of the government's response; read the verbatim text below to verify it.

Government Response Accepted

HM Government · verbatim extract Accepted

4. PAC conclusion: DSIT has no systematic mechanism for bringing together learning from pilots and there are few examples of successful at–scale adoption across government.

4a. PAC recommendation: To learn from AI pilots and support the scaling of the most promising use cases DSIT should:

• set up a mechanism for systematically gathering and disseminating intelligence on pilots and their evaluation; and

4.1 The government agrees with the Committee’s recommendation.

Target implementation date: May 2026

4.2 DSIT is establishing a number of workstreams to create a structured process for gathering and sharing insights from AI pilots (branded as the Prime Minister’s AI Exemplars), including those built by the Incubator for AI (i.AI). Across a variety of AI use cases and public services, these workstreams will collect insights from pilots of both commercial and DSIT own-build AI tooling, conducting rigorous evaluation to identify the conditions and common patterns that enable successful scaling.

4.3 Led by the Public Sector AI Adoption programme, these learnings will be synthesised into patterns for implementation, including best practice guides and case studies to help service teams build the government’s collective capacity to maximise the potential of AI for better outcomes and avoid duplication. Bringing a systematic approach to evaluating AI’s impact on delivering better public services for citizens, and productivity.

4b. PAC recommendation:

• set out how it will identify common and scalable AI products and support their development and roll–out at scale.

4.4 The government agrees with the Committee’s recommendation.

Target implementation date: May 2026

4.5 DSIT is well positioned, in terms of both its remit and its specialist capabilities, to enable more successful scaling within departments and across organisational boundaries, ultimately realising better outcomes for government and for citizens from these technology investments.

4.6 A proposal is currently being developed in the Public Sector AI Adoption programme with the aim to build maturity and capability within government as a whole to be able to effectively scale AI initiatives past pilot stage. This work would include creating the resources and capability to:

• Assess maturity and capability across government departments and synthesise into a government-wide picture, in order to validate understanding of blockers & prioritise work to resolve them.
• Effectively identify core components best suited to scale and roll out to a wider audience.
• Catalogue and create (or advocate for) targeted interventions to resolve common bureaucratic blockers to timely scaling, to include: procurement frameworks and skillsets; information governance processes; Digital, Data and Technology (DDaT) recruitment and funding.
• Guidance agreed with HM Treasury to evaluate impact consistently and rigorously - enabling departments to reliably access funding.

4.7 The first part of this proposal will be to decide scope, secure funding, and build a minimal viable product of cross-cutting technical components to be built and maintained by the digital centre of government.

