1 May 2026

Air Canada's Chatbot Hallucination Nightmare: Why Your Customer-Facing LLM Needs Custom Guardrails Today

Air Canada tried to argue its chatbot was a 'separate legal entity' after it hallucinated a refund policy. They lost. Here is why off-the-shelf LLMs are legal time bombs and how to fix them.

iReadCustomer Team

Imagine trying to argue in a court of law that a chatbot on your own corporate website is a "separate legal entity" responsible for its own actions. 

It sounds like a desperate punchline, but that is exactly the defense Air Canada mounted in a civil tribunal. Unsurprisingly, the tribunal didn't buy it. In a landmark decision that sent shockwaves through the C-suites of every Fortune 500 company, the adjudicator established a brutal new reality for the generative AI era: **Your chatbot is a part of your website, and you are entirely liable for its AI hallucinations.**

*Moffatt v. Air Canada* isn't just an aviation story. It is a massive, blinking red warning sign for every consumer-facing brand currently rushing to deploy large language models (LLMs) to their customer support frontlines.

## Why Off-the-Shelf LLMs Are Legal Time Bombs

The AI gold rush led to a predictable enterprise anti-pattern: companies wrapping an off-the-shelf ChatGPT API in a slick chat widget, adding a system prompt that says *"You are a helpful customer service agent,"* and unleashing it onto the public.

But LLMs like GPT-4, Claude, or Gemini are not databases of facts; they are probabilistic prediction engines. By their very nature, they want to be helpful. If they don't know the answer to a policy question, they don't naturally say, "I don't know." Instead, they predict what a helpful answer *should* look like. 

In the tech world, we call this an **AI hallucination**. In the legal world, it's called false advertising, breach of contract, or negligent misrepresentation.

Jake Moffatt asked the Air Canada chatbot about bereavement fares following the death of his grandmother. The bot confidently hallucinated a policy, assuring Moffatt he could book a full-fare flight immediately and apply for a retroactive refund within 90 days. The actual policy required bereavement approval to be completed *before* flying.

Moffatt booked the flight, applied for the refund, and was denied. He sued for the difference—about $650 CAD. Air Canada lost the case, but the financial penalty was pocket change. What they actually lost was global credibility, becoming the poster child for what happens when you deploy naked AI without a safety harness.

If your customer support team is running an ungoverned generative AI agent, you are exactly one viral hallucination away from a catastrophic PR and legal nightmare.

## The Technical Fix: What Custom Guardrails Actually Look Like

You cannot fix an AI's tendency to hallucinate by simply adding *"Please don't lie"* to its prompt. Enterprise-grade AI requires an architectural overhaul designed around determinism, compliance, and verifiability.

If you want to protect your enterprise from **AI hallucination liability**, you must move beyond off-the-shelf models and adopt a custom development pattern featuring strict, programmatic guardrails:

### 1. Strict Retrieval-Augmented Generation (RAG) Grounding
The first step to fixing hallucination is removing the LLM's ability to rely on its pre-trained weights for factual answers. 

Through **Retrieval-Augmented Generation (RAG)**, your chatbot is tethered to a secured vector database containing only your company's approved policies, FAQs, and product catalogs. When a user asks a question, the system queries this database first. The LLM is then strictly instructed to synthesize an answer *only* from the retrieved documents. If the search yields no relevant results, the system returns a grounded refusal instead of letting the model invent an answer.
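Here is a rough sketch of that control flow in Python. The keyword-overlap retriever, the 0.3 threshold, the `PolicyDoc` type, and the `call_llm` stub are all illustrative stand-ins for a real embedding index and model call; the guardrail that matters is the early return when retrieval comes back empty.

```python
from dataclasses import dataclass

RELEVANCE_THRESHOLD = 0.3  # illustrative value; tune against your own evaluation set

@dataclass
class PolicyDoc:
    doc_id: str
    url: str
    text: str

def retrieve(query: str, docs: list[PolicyDoc], top_k: int = 3) -> list[tuple[PolicyDoc, float]]:
    """Toy keyword-overlap retriever standing in for a real vector search."""
    q_terms = set(query.lower().split())
    scored = []
    for doc in docs:
        d_terms = set(doc.text.lower().split())
        overlap = len(q_terms & d_terms) / max(len(q_terms), 1)
        scored.append((doc, overlap))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def call_llm(prompt: str) -> str:
    """Placeholder: swap in your actual model call here."""
    raise NotImplementedError

def grounded_answer(query: str, docs: list[PolicyDoc]) -> str:
    hits = [(doc, score) for doc, score in retrieve(query, docs)
            if score >= RELEVANCE_THRESHOLD]

    if not hits:
        # Grounded refusal: no approved document covers this question,
        # so the generative model is never asked to improvise one.
        return ("I don't have an approved policy document that answers that. "
                "Let me connect you with a human agent.")

    context = "\n\n".join(f"[{doc.doc_id}] {doc.text}" for doc, _ in hits)
    prompt = (
        "Answer the customer using ONLY the policy excerpts below. "
        "If they do not contain the answer, say you don't know.\n\n"
        f"{context}\n\nCustomer question: {query}"
    )
    return call_llm(prompt)
```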

### 2. Engineering Verifiable Refusal Patterns
One of the hardest things to teach an eager LLM is how to say no. 

Custom AI solutions require explicit **refusal patterns** and intent routing. Before the LLM even generates a response, an intent classifier analyzes the user's prompt. If the prompt involves high-risk domains—such as refund requests, legal threats, pricing negotiations, or safety issues—the system bypasses the generative model entirely. 

Instead, the guardrails trigger a deterministic, pre-approved script: *"I cannot process refunds directly, but you can view our full policy at [Link], or I can connect you with a human agent right now."*
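A minimal sketch of that routing layer is below. The keyword patterns, canned scripts, and `route` function are illustrative stand-ins; a production system would use a trained intent classifier rather than regexes, but the principle is the same: high-risk intents never reach the generative model.

```python
import re

# Deterministic, pre-approved scripts for high-risk intents; the generative
# model is never consulted for these. "[Link]" is a placeholder for your policy URL.
CANNED_RESPONSES = {
    "refund": ("I cannot process refunds directly, but you can view our full "
               "policy at [Link], or I can connect you with a human agent right now."),
    "legal": "I'm routing this conversation to a human agent who can help with that.",
}

# A real deployment would use a trained intent classifier; keyword rules
# stand in for one here to keep the sketch self-contained.
HIGH_RISK_PATTERNS = {
    "refund": re.compile(r"\b(refund|money back|chargeback|bereavement)\b", re.I),
    "legal":  re.compile(r"\b(lawyer|sue|legal action|tribunal)\b", re.I),
}

def route(message: str, generate) -> str:
    """Route high-risk intents to canned scripts; everything else to `generate`."""
    for intent, pattern in HIGH_RISK_PATTERNS.items():
        if pattern.search(message):
            return CANNED_RESPONSES[intent]
    return generate(message)  # e.g. the RAG-grounded answer from the previous sketch
```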

### 3. Required Citations and Human-in-the-Loop (HITL)
Enterprise bots must show their work. Every policy claim generated by the AI should be accompanied by a verifiable citation, linking the user directly to the source document on the official website.

Furthermore, zero-touch automation is a myth for complex customer service. A robust **Human-in-the-Loop (HITL)** architecture ensures that when the conversational context shifts from a simple FAQ to a nuanced dispute, the AI seamlessly degrades its role from 'decision maker' to 'data gatherer,' handing the context over to a human agent without frustrating the user.
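One way to express that citation-plus-handoff contract, assuming retrieval returns `(doc, score)` pairs as in the earlier sketch: `BotTurn`, `build_turn`, and `is_dispute` are hypothetical names, but the idea is that every answer ships with its sources, and a detected dispute flips a handoff flag instead of generating more text.

```python
from dataclasses import dataclass, field

@dataclass
class BotTurn:
    answer: str
    citations: list[str] = field(default_factory=list)  # source URLs shown with the answer
    handoff: bool = False                                # escalate to a human agent

def is_dispute(message: str) -> bool:
    """Placeholder: a real system would use a classifier tuned on past escalations."""
    return any(w in message.lower() for w in ("dispute", "complaint", "unacceptable"))

def build_turn(query: str, hits, generate) -> BotTurn:
    """`hits` are (PolicyDoc, score) pairs; `generate` is the grounded LLM call."""
    citations = [doc.url for doc, _ in hits]

    if is_dispute(query):
        # Degrade from decision maker to data gatherer: package what we know
        # and hand the conversation to a human without losing context.
        return BotTurn(
            answer="I'm bringing in a human agent who can resolve this for you.",
            citations=citations,
            handoff=True,
        )

    return BotTurn(answer=generate(query), citations=citations)
```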

## The Custom AI Development Pattern: The New Enterprise Standard

The Air Canada debacle has crystallized a new reality: the "wrapper" era of AI is dead. 

Enterprises must adopt a custom AI development pattern. This means building domain-grounded models where guardrails sit outside the LLM as separate, deterministic software layers. This architecture doesn't just curb hallucinations; it makes compliance auditable and enforceable.

When you build custom guardrails, you gain:
*   **Traceability:** The ability to audit exactly which internal document triggered a specific AI response.
*   **Data Privacy:** Guardrails that automatically redact Personally Identifiable Information (PII) before it ever hits the LLM (see the sketch after this list).
*   **Brand Security:** Absolute control over the boundaries of the conversation, ensuring your bot doesn't get tricked into writing poems about your competitors.
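As an example of one such deterministic layer, here is a minimal PII redaction pass that runs on every message before it reaches the model. The regex patterns are deliberately simple illustrations; a production redactor would cover far more identifiers and edge cases.

```python
import re

# Redaction runs as a deterministic layer before any text reaches the model.
PII_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b(?:\d[ -]?){13,19}\b"), "[CARD_NUMBER]"),
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
]

def redact_pii(text: str) -> str:
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

# Example: the LLM only ever sees the redacted message.
safe_input = redact_pii("My card 4111 1111 1111 1111 was charged, email me at jane@example.com")
# -> "My card [CARD_NUMBER] was charged, email me at [EMAIL]"
```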

## Conclusion: Who is Auditing Your Chatbot Right Now?

The breathless excitement surrounding Generative AI has led countless brands to prioritize speed to market over safety. But the courts have spoken, and they do not care about your startup's "move fast and break things" philosophy. 

When your AI speaks to a customer, it speaks with the full legal authority of your brand. Your customers will hold you to the promises your AI makes, and the law will back them up. 

If you are currently running a public-facing LLM without strict retrieval grounding, custom refusal patterns, and robust intent routing, you are playing Russian roulette with your brand's reputation and legal budget. The most urgent question you should be asking your tech team today isn't *"How smart is our AI?"* but rather, *"What exactly is stopping our AI from making a promise we can't afford to keep?"*