Nearly half a million dollars. That's what Australia's Department of Employment and Workplace Relations paid Deloitte for a report that turned out to be riddled with AI-generated nonsense: fabricated citations, fictional academic sources, and taxpayer money down the drain.
The consulting giant used Azure OpenAI GPT-4o to help produce their July 2025 analysis of the country's welfare compliance system. Big mistake.
The $440,000 report was supposed to be an independent review of Australia's Targeted Compliance Framework—the system that automatically penalizes jobseekers who don't comply with requirements. Instead, it became a masterclass in how not to use artificial intelligence for serious work.
What went wrong? Pretty much everything. The AI hallucinated citations to academic studies that simply don't exist. Imagine trying to fact-check a bibliography where the sources turn out to be figments of a computer's imagination.
Beyond the phantom citations, the report described a compliance system with real problems: rules that didn't connect cleanly to the actual legislation, system defects, and an IT design officials criticized as unnecessarily punitive toward welfare recipients. The incident highlights how black-box processes, whether an AI drafting a report or an automated system penalizing jobseekers, create opaque decision-making that undermines public trust in critical government assessments.
Deloitte eventually admitted they screwed up and issued a partial refund to the Australian government. They updated the report to fix the errors but insisted their main findings hadn't changed. Right.
Labor Senator Deborah O'Neill wasn't buying it. She called the whole mess a "human intelligence problem" and slammed Deloitte's apology as insufficient.
Government officials started questioning whether they could trust AI-supported consulting work on critical welfare policies. Fair point.
The episode sparked broader concerns about outsourcing sensitive government evaluations to firms using AI tools without proper oversight. Media coverage highlighted the obvious risks of letting algorithms run wild on public sector projects.
Deloitte's response was curious. While admitting fault, they simultaneously announced plans to roll out Anthropic's Claude AI to over 470,000 employees company-wide.
They also signaled intentions to lean harder into AI for consulting services over the next few years. Because doubling down always works so well.
Experts viewed Deloitte's problems as symptomatic of wider challenges facing professional services firms trying to integrate AI, with some predicting the technology could eliminate half of entry-level jobs within five years and fundamentally reshape how consulting firms operate. The incident underscored a simple truth: when AI hallucinates on government policy work, taxpayers foot the bill. And this wasn't rushed work, either. The project ran for roughly seven months before wrapping up earlier in 2025, which makes the errors all the more embarrassing given the extended timeline.

