When AI Lies to You: Hallucination, Data Quality, and the Old School Fix

AI doesn’t know it’s wrong. That’s the problem. Here’s how traditional discipline keeps the machine honest.

AI systems now influence drug prescriptions, financial decisions, and insurance outcomes. Most of the people on the receiving end have no idea that the model generating those outputs can — and sometimes does — fabricate them entirely, with perfect confidence and zero warning.

That phenomenon has a name: hallucination. And in the world of data quality, it is not a quirk. It is a defect. A serious one. One that traditional quality management disciplines have the tools to address — if we are willing to use them.

I am Faheem, a practitioner in the healthcare industry working at the intersection of AI deployment, data quality management, and reporting analytics. What I have observed over the past two years is that the organisations navigating AI most successfully are not the ones with the most sophisticated models. They are the ones that have wrapped those models in the most disciplined quality frameworks — frameworks that predate machine learning by decades.

The Scale of the Problem

The numbers are not reassuring. The AI market has scaled faster than its governance. The result is a growing gap between what organisations believe their AI systems are producing and what those systems are actually generating.

Stanford’s AI Index (2024) is particularly damning for high-stakes domains: hallucination rates are highest precisely where they matter most — medicine, law, and finance — because training data in these domains is sparser, and the model is more likely to fill gaps with invention rather than retrieval.

When AI Gets It Expensively Wrong

These are not theoretical risks. The following cases are documented, public, and costly. In each, a failure of data quality governance (training data integrity, output validation, or human oversight) turned AI confidence into organisational catastrophe.

The pattern is consistent: AI was trusted to produce consequential outputs without either sufficient upstream data validation or downstream rule-based checks. The machine was not controlled — it was believed (Obermeyer et al., 2019).

What Data Quality Actually Means — and Why AI Needs It

Data quality is not a software setting. It is a discipline. The DAMA International framework (DAMA-DMBOK, 2017) defines six dimensions against which data must be assessed before any consumer relies on it, whether that consumer is a human analyst or a large language model:

Completeness, Accuracy, Consistency, Timeliness, Uniqueness, and Validity. These dimensions predate today’s large language models by nearly thirty years. They remain the correct framework for assessing whether a dataset, including a training corpus, is fit for its intended purpose (Redman, 1996; Wang & Strong, 1996).

The traditional toolkit for achieving these dimensions includes data profiling, rule-based validation, Master Data Management (MDM), ETL quality gates, and root cause analysis with lineage tracking. Each of these is deterministic: the same input always produces the same quality verdict. That predictability is precisely what makes them a reliable control mechanism for AI outputs — which are anything but deterministic.
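To make the point about determinism concrete, here is a minimal sketch of rule-based validation in Python. The record layout (LabResult), the allowed test codes, and the four rules are assumptions invented for illustration; they are not drawn from DAMA-DMBOK or from any real system. Each rule maps loosely to one of the dimensions above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical record layout, for illustration only.
@dataclass(frozen=True)
class LabResult:
    patient_id: str
    test_code: str
    value: Optional[float]
    collected_on: date

# Assumed reference list of valid test codes (not a real code system).
ALLOWED_TEST_CODES = {"HBA1C", "LDL", "TSH"}


def validate(records: list[LabResult], today: date) -> list[str]:
    """Return rule violations; an empty list means the batch passes."""
    issues: list[str] = []
    seen: set[tuple[str, str, date]] = set()
    for i, r in enumerate(records):
        if r.value is None:                           # completeness
            issues.append(f"row {i}: missing value")
        if r.test_code not in ALLOWED_TEST_CODES:     # validity
            issues.append(f"row {i}: unknown test code {r.test_code!r}")
        if r.collected_on > today:                    # plausibility of the date
            issues.append(f"row {i}: collection date is in the future")
        key = (r.patient_id, r.test_code, r.collected_on)
        if key in seen:                               # uniqueness
            issues.append(f"row {i}: duplicate of an earlier row")
        seen.add(key)
    return issues
```

Because validate has no randomness and no hidden state, running it twice on the same batch with the same reference date returns the same list of violations. That repeatability is the property the paragraph above relies on.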

AI vs Traditional Rule-Based Approaches: The Honest Comparison

This is not a table that recommends choosing one over the other. It is a map of complementary strengths. Traditional methods provide the structural integrity regulated industries require. AI provides the scale and pattern recognition traditional methods cannot match. The intelligent question is not which to use, but how to combine them (Gartner, 2024).

Six Sigma: Treating Hallucination as a Defect

Six Sigma was designed to reduce process defects to no more than 3.4 per million opportunities, the level conventionally labelled 6σ (Harry & Schroeder, 2000). It does not care whether the process is a stamping machine or a language model. A defect is a defect. A hallucinated clinical data point is a defect. The DMAIC cycle applies.
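As a rough sketch of what the Measure and Control steps could look like for hallucinations, the Python below computes defects per million opportunities (DPMO), converts that to a sigma level, and derives three-sigma limits for a p-chart. The function names are my own. The 1.5-sigma shift is the usual Six Sigma convention that maps 3.4 DPMO to 6σ. What counts as an "opportunity" (for example, each fact-checked field in a generated summary) is an assumption you would have to define for your own process.

```python
from math import sqrt
from statistics import NormalDist


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo_value: float, shift: float = 1.5) -> float:
    """Sigma level with the conventional 1.5-sigma shift.

    Requires 0 < dpmo_value < 1,000,000; inv_cdf is undefined at 0 and 1.
    """
    yield_fraction = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift


def p_chart_limits(p_bar: float, sample_size: int) -> tuple[float, float]:
    """Three-sigma control limits for a defect proportion, clipped to [0, 1]."""
    spread = 3 * sqrt(p_bar * (1 - p_bar) / sample_size)
    return max(0.0, p_bar - spread), min(1.0, p_bar + spread)
```

With invented numbers, 12 hallucinated fields across 2,000 summaries of 5 checked fields each gives dpmo(12, 2000, 5) = 1,200, which sigma_level places at roughly 4.5σ, a long way short of six. For ongoing control, p_chart_limits(0.02, 500) gives the band within which a weekly sample of 500 outputs should fall if the process is stable at a 2% defect rate. A point outside the band is a signal to investigate, not noise to ignore.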

The Hybrid Stack: Where Old and New Meet

A hallucination-resistant pipeline does not choose between AI and traditional methods. It assigns each to the tasks it does best, and designs hard boundaries so the weaknesses of each are covered by the strengths of the other.

The critical principle embedded in this stack: AI is trusted for detection at scale; rules are trusted for final adjudication. A model’s confidence score is not a quality certification. Confidence does not correlate reliably with factual accuracy, particularly in domain-specific contexts (Ji et al., 2023). Every AI output that enters a clinical or regulatory decision must pass through a deterministic validation gate before use. No exceptions.
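A deterministic validation gate can be very small. The sketch below is one possible shape, assuming a model that proposes a drug and a daily dose. The ModelOutput type, the placeholder drug names, and the limits are all hypothetical and carry no clinical meaning. The design choice that matters is that the gate never reads the model’s confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOutput:
    drug: str
    daily_dose_mg: float
    confidence: float  # reported by the model; deliberately unused by the gate


# Hypothetical formulary limits, for illustration only. Not clinical guidance.
MAX_DAILY_DOSE_MG = {"drug_a": 2000.0, "drug_b": 40.0}


def gate(output: ModelOutput) -> tuple[bool, str]:
    """Deterministic adjudication: same output, same verdict, whatever the confidence."""
    limit = MAX_DAILY_DOSE_MG.get(output.drug.lower())
    if limit is None:
        return False, "drug not in formulary: route to human review"
    if not 0 < output.daily_dose_mg <= limit:
        return False, "dose outside permitted range: route to human review"
    return True, "passed rule checks"
```

Under these assumptions, an output of 400 mg of drug_b with 0.99 confidence is rejected, while 500 mg of drug_a with 0.51 confidence passes. The rule set decides; the confidence score is at most a hint about where to look first.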

Retrieval-Augmented Generation (RAG) is one of the most practically effective hybrid controls. By anchoring AI outputs to a curated, validated knowledge base rather than parametric memory alone, RAG narrows the model’s room to fabricate: it is steered toward citing what exists in the validated data store (Lewis et al., 2020). Combined with post-processing rules that verify entity existence in source records, it constitutes a genuinely robust control. Not perfect. Robust.
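The post-processing rule mentioned above, verifying that every entity in the answer exists in the source records, might be sketched as follows. It assumes entities can be recognised by a regular expression (here a made-up record-ID format, REC- followed by digits) and that the retrieved records are available as plain strings. Both are simplifications; free-text entities such as drug or patient names would need a proper matcher.

```python
import re


def unsupported_entities(answer: str, source_records: list[str], pattern: str) -> list[str]:
    """Return entities cited in the answer that appear in none of the source records.

    `pattern` is a regular expression describing what an entity looks like.
    Entities are compared as whole matches, so "REC-10" is not mistaken
    for "REC-101".
    """
    regex = re.compile(pattern)
    known = {m.group(0) for rec in source_records for m in regex.finditer(rec)}
    # dict.fromkeys removes repeats while keeping first-seen order.
    cited = dict.fromkeys(m.group(0) for m in regex.finditer(answer))
    return [entity for entity in cited if entity not in known]


# Hypothetical usage: the ID format and record texts are invented.
RECORD_ID = r"\bREC-\d+\b"
sources = ["REC-101: HbA1c result on file", "REC-102: lipid panel on file"]
answer = "The summary draws on REC-101 and REC-999."
flagged = unsupported_entities(answer, sources, RECORD_ID)
```

Here flagged contains only "REC-999", the one identifier the model cited that no retrieved record supports. An answer with a non-empty flagged list would be blocked or sent for review rather than released.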

Conclusion: Old School Wins

The most sophisticated AI system in the world is only as trustworthy as the traditional data quality disciplines surrounding it. This is not pessimism. It is a finding — supported by two years of accelerating AI deployment failures and thirty years of quality management science.

AI is extraordinary at pattern recognition and scale. It is poor at knowing what it does not know. Hallucination is not a version-upgrade problem. It is a structural property of probabilistic models. The answer is not to wait for a better model. The answer is to wrap the model you have in a quality framework that is recognisably, unapologetically classical: define the defect, measure the rate, find the cause, fix the process, control the output.

Six Sigma taught manufacturing that quality is not an inspection activity — it is a design activity. That lesson transfers exactly. The question to ask of any AI system in a regulated environment is not “how accurate is it?” but “how is its accuracy governed?”

And the answer, every time, must include data quality controls at ingestion, validation rules at output, control charts on ongoing performance, and human experts at the point of consequence. That combination — old-school thinking, new methodology — is what makes AI trustworthy. Without it, the model is just very confident guessing at scale.

Old school, deployed thoughtfully, always wins.

Selected References

Amnesty International (2021). Xenophobic machines: Discrimination through unregulated use of algorithms in the Dutch childcare benefits scandal.
Bender, E. M. et al. (2021). On the dangers of stochastic parrots. ACM FAccT Conference.
CBC News (2024). Air Canada chatbot gave wrong information; airline liable, tribunal rules.
DAMA International (2017). DAMA-DMBOK: Data management body of knowledge (2nd ed.). Technics Publications.
Gartner (2024). Top strategic technology trends for 2025.
Harry, M. J., & Schroeder, R. (2000). Six Sigma: The breakthrough management strategy. Doubleday.
Ji, Z. et al. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12).
Lewis, P. et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS 2020.
McKinsey & Company (2023). The state of AI in 2023: Generative AI’s breakout year.
Mehrotra, D. (2023, Feb 9). Google shares drop $100 billion after Bard gives wrong answer. The Guardian.
NIST (2023). Artificial intelligence risk management framework (AI RMF 1.0).
Obermeyer, Z. et al. (2019). Dissecting racial bias in an algorithm. Science, 366(6464).
Pande, P. S. et al. (2000). The Six Sigma way. McGraw-Hill.
Redman, T. C. (1996). Data quality for the information age. Artech House.
Ross, C., & Swetlitz, I. (2017). IBM pitched Watson as a revolution in cancer care. STAT News.
Stanford HAI (2024). Artificial intelligence index report 2024.
Topol, E. J. et al. (2024). Digital medical tools and AI in healthcare. npj Digital Medicine, 7(1).
Wang, R. Y., & Strong, D. M. (1996). Beyond accuracy. Journal of Management Information Systems, 12(4).

Originally published in Towards AI on Medium.