At Ajust, an AI agent handles consumer complaints and negotiations on behalf of users. It contacts businesses, negotiates outcomes, and resolves cases autonomously.
First Telstra bill negotiation. The agent doesn't know the right contact channel, so it delegates to a browser agent to find one. It tries a web form. The form goes nowhere. It sends an email through the email agent. No response. Eventually it finds the live chat, gets through to a representative, and negotiates the bill down. Case closes successfully.
Next week, different user, same situation. Telstra bill too high. The agent starts from zero. It searches the web for contact information, finds the same useless web form, wastes the same time, and eventually rediscovers the live chat. Everything it learnt in the first case has evaporated.
The Problem
Agents are stateless between sessions. The conversation history from case #1 doesn't exist when case #2 starts. Every insight, every failed approach, every successful strategy disappears when the case closes.
The traditional fix is manual prompt engineering. You watch how cases unfold, notice patterns, and update the system prompt. "For Telstra, use live chat, not the web form." This works when you have ten businesses. It doesn't work when you have thousands, each with their own contact channels, policies, and quirks.
You can't hand-maintain instructions for every business the agent will encounter. And even if you could, those instructions go stale. The phone number that worked last month might route to a different department today.
The Insight
The full conversation log is right there. It contains what happened, which approaches failed, which ones worked, the contact channel that finally got through, the reference number the business assigned, the strategy that landed the outcome.
What if a specialised agent reviewed that log after each case? Not fine-tuning. Not RLHF. Just a structured post-mortem that extracts the useful parts, stores them, and makes them searchable for future cases.
The Data Flywheel
When a case completes, a background task kicks off the post-mortem agent. It reads the conversation history, generates a structured document, and stores it with an embedding for hybrid search. No human decides whether a case is "worth learning from." Every completed case, successful or not, generates a knowledge document. The feedback loop is automatic.
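As a minimal sketch in TypeScript, that background task might look like the following. Every name here (loadConversationHistory, runPostMortemAgent, embed, knowledgeBase) is a hypothetical stand-in for the pieces described above, not Ajust's actual code.

// Hypothetical stand-ins for the pieces described above.
declare function loadConversationHistory(caseId: string): Promise<string>;
declare function runPostMortemAgent(transcript: string): Promise<{
  business: string;
  issueType: string;
  issue: string;
  outcome: string;
}>;
declare function embed(text: string): Promise<number[]>;
declare const knowledgeBase: { insert(row: object): Promise<void> };

// Fired as a background task for every completed case, success or failure.
async function onCaseCompleted(caseId: string): Promise<void> {
  const transcript = await loadConversationHistory(caseId);
  const doc = await runPostMortemAgent(transcript); // structured post-mortem
  // Embed a compact summary so hybrid (vector + keyword) search can find it later.
  const embedding = await embed(`${doc.business} ${doc.issueType} ${doc.issue} ${doc.outcome}`);
  await knowledgeBase.insert({ ...doc, embedding, caseId, createdAt: new Date() });
}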
The next time an agent handles a similar case, it searches the knowledge base as its first research action and gets the learnings from every previous case.
The Post-Mortem
The post-mortem agent is deliberately simple. It uses a reasoning model with extended thinking, so it can work through the full conversation before committing to an output. The schema captures:
- Business name and industry
- Issue type, summary, and outcome
- Whether the outcome was a success or failure
- A step-by-step log of actions taken, including URLs and contact details used
- What information the agent needed from the user to proceed
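As a sketch, the schema could be expressed as a zod object, which doubles as a structured-output target for the post-mortem agent. zod is an assumption here; the post doesn't name a validation library.

import { z } from "zod";

// One knowledge document per completed case, successful or not.
export const knowledgeDocumentSchema = z.object({
  business: z.string(),                     // e.g. "Telstra"
  industry: z.string(),                     // e.g. "Telecommunications"
  issueType: z.string(),                    // e.g. "Bill Negotiation"
  issue: z.string(),                        // one-line summary of the problem
  outcome: z.string(),                      // what was actually achieved
  outcomeType: z.enum(["Success", "Failure"]),
  logs: z.array(z.string()),                // exact steps, URLs and contact details included
  requiredInformation: z.array(z.string()), // what the agent needed from the user
});

export type KnowledgeDocument = z.infer<typeof knowledgeDocumentSchema>;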
The instructions are strict about what the post-mortem agent can and can't do:
- Only use what's in the conversation log. Don't invent or infer anything.
- Log the steps exactly as they happened, including URLs and contact details used.
- Redact all PII.
No hallucination. No embellishment. No PII. The post-mortem agent is a disciplined transcriber, not a creative writer.
Here's what a stored knowledge document looks like after a successful Telstra bill negotiation:

{
  "business": "Telstra",
  "industry": "Telecommunications",
  "issueType": "Bill Negotiation",
  "issue": "Monthly plan charge increased by $15 after contract period ended",
  "outcome": "Negotiated $10/month discount for 12 months via live chat",
  "outcomeType": "Success",
  "logs": [
    "Searched web for Telstra contact options",
    "Attempted contact via web form at telstra.com.au/contact-us - no response after 48 hours",
    "Contacted Telstra via live chat at telstra.com.au/contact-us/chat",
    "Representative offered $5/month discount, declined",
    "Escalated to retention team, negotiated $10/month discount for 12 months",
    "Confirmation email received with reference number"
  ],
  "requiredInformation": [
    "Account number",
    "Account holder name",
    "Current plan name and monthly charge",
    "Date contract period ended"
  ]
}
Now when the next Telstra bill negotiation arrives, the agent's first research action is to search the knowledge base. Its instructions are explicit:
- Before doing anything else, search the knowledge base.
- Prioritise contact channels from the knowledge base.
The agent searches "telstra bill negotiation" and immediately learns that live chat works, web forms don't, the retention team is where the real discounts happen, and it needs the account number and plan details before making contact.
Case #1 spent hours discovering this. Case #10 knows it before the first message.
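Under the hood, the retrieval might look something like this sketch: semantic and keyword retrieval run in parallel, then merge with reciprocal rank fusion. The post says hybrid search but not the fusion method, so RRF and all the names below are assumptions.

declare function embed(text: string): Promise<number[]>;

interface Hit { id: string; doc: object }

declare const knowledgeBase: {
  vectorSearch(embedding: number[], limit: number): Promise<Hit[]>;
  keywordSearch(query: string, limit: number): Promise<Hit[]>;
};

// Hybrid search: blend semantic and lexical matches so "telstra bill negotiation"
// finds past Telstra cases even when the phrasing differs.
async function searchKnowledgeBase(query: string, limit = 5): Promise<Hit[]> {
  const [semantic, lexical] = await Promise.all([
    knowledgeBase.vectorSearch(await embed(query), limit * 2),
    knowledgeBase.keywordSearch(query, limit * 2),
  ]);
  const fused = new Map<string, { hit: Hit; score: number }>();
  const addRanked = (hits: Hit[]) =>
    hits.forEach((hit, rank) => {
      const entry = fused.get(hit.id) ?? { hit, score: 0 };
      entry.score += 1 / (60 + rank); // reciprocal rank fusion, standard k = 60
      fused.set(hit.id, entry);
    });
  addRanked(semantic);
  addRanked(lexical);
  return [...fused.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.hit);
}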
The Guardrails
Past patterns are not current reality. The knowledge base shows what happened before, not what's happening now. A past case might mention receiving an automated reply within 24 hours. That doesn't mean the current case has received one. The distinction is built into the agent's instructions:
- Past cases show what happened before, not what's happening now.
- Only tool outputs from the current conversation confirm current reality.
Without this guardrail, the agent would hallucinate outcomes based on historical patterns. "In the last case, Telstra replied within 24 hours, so they must have replied by now." That's confabulation, not reasoning.
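One lightweight way to keep that boundary visible is to stamp every retrieved document with its age and a reminder before it enters the agent's context. A sketch; the wording is illustrative:

// Wrap a retrieved knowledge document with provenance so the agent reads it
// as history, not as the current state of the world.
function formatForAgentContext(doc: { createdAt: Date }): string {
  const ageDays = Math.floor((Date.now() - doc.createdAt.getTime()) / 86_400_000);
  return [
    `[HISTORICAL CASE, recorded ${ageDays} days ago. This describes a past case,`,
    `not the current one. Only tool outputs from the current conversation`,
    `confirm current reality.]`,
    JSON.stringify(doc, null, 2),
  ].join("\n");
}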
PII redaction at extraction time. Names, account numbers, and contact details get stripped when the knowledge document is generated, with fallback checks for anything the redaction misses.
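Those fallback checks can be as simple as regex sweeps over the generated document for shapes the model should already have removed. A sketch; the patterns are illustrative, not exhaustive:

// Regex fallback for PII the post-mortem agent's own redaction misses.
const PII_PATTERNS: [RegExp, string][] = [
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "[REDACTED_EMAIL]"],
  [/\+?61[\s-]?\d([\s-]?\d){8}/g, "[REDACTED_PHONE]"], // Australian phone shapes
  [/\b\d{8,16}\b/g, "[REDACTED_NUMBER]"],              // long digit runs: accounts, cards
];

function redactFallback(text: string): string {
  return PII_PATTERNS.reduce((out, [pattern, label]) => out.replace(pattern, label), text);
}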
A consistent schema makes documents comparable. Every knowledge document conforms to the same schema, so you can search across them, diff them, and aggregate them.
Human in the Loop
The agent-generated post-mortems aren't the final word. Knowledge documents have a notes field where humans can add context, correct mistakes, or fill gaps the agent missed.
When a future agent retrieves a document, it gets both the auto-generated content and any human notes. They're indexed together and surfaced together.
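Indexed together can be as literal as concatenating the notes into the text that gets embedded, so a human correction is as retrievable as the content it corrects. A sketch with hypothetical names:

declare function embed(text: string): Promise<number[]>;

// Human notes ride along with the auto-generated content in one index entry.
async function buildIndexEmbedding(content: string, notes: string[]): Promise<number[]> {
  const indexText = [content, ...notes.map((note) => `Human note: ${note}`)].join("\n");
  return embed(indexText);
}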
Fully autonomous self-improvement sounds appealing, but the agent doesn't always get it right. It might hallucinate an outcome, marking a case as resolved when it wasn't. Something might happen outside the case, like the consumer calling the business directly or the business reaching out through a channel the agent doesn't monitor. Or something changes after the knowledge document was created, like a contact channel going offline. These are corrections the agent can't make on its own, because it can't see what it can't see.
The Tradeoffs
Garbage in, garbage out. Failed cases still generate knowledge documents. A strategy that wasted three days on a web form gets recorded alongside the strategy that worked. The outcomeType field helps future agents weigh the evidence ("Failure" means "don't do this"), but they still need to interpret it.
Cold start. The first N cases get zero benefit because the knowledge base is empty. You can bootstrap with a bulk import of known contact channels and strategies, but that's manual work upfront.
Lookalike matching helps. The agent handling its first Optus case can pull learnings from Telstra and Vodafone, because telco billing disputes follow similar patterns regardless of the provider. But the system only starts compounding after enough cases have flowed through it.
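A sketch of that lookalike fallback, assuming a hypothetical search API with metadata filters over the schema fields:

declare const knowledgeBase: {
  search(query: string, filter: Record<string, string>): Promise<object[]>;
};

// First Optus case? Nothing matches the business directly, so widen the filter:
// telco billing disputes look alike across Telstra, Vodafone, and Optus.
async function findLearnings(business: string, industry: string, issueType: string): Promise<object[]> {
  const exact = await knowledgeBase.search(`${business} ${issueType}`, { business });
  if (exact.length > 0) return exact;
  return knowledgeBase.search(issueType, { industry, issueType });
}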
Knowledge drift. Contact channels change. Phone numbers get disconnected. Policies update. A knowledge document from six months ago might recommend a live chat URL that no longer exists. Stale knowledge is worse than no knowledge if the agent trusts it blindly.
This is the hardest problem to manage over time. Search results need reranking to prioritise recent cases and deduplicate when ten documents say the same thing about the same business.
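As a sketch, that rerank could decay scores by age and keep only the freshest document per business and issue type. The 90-day half-life is an arbitrary illustrative choice:

interface ScoredDoc {
  score: number;
  doc: { business: string; issueType: string; createdAt: Date };
}

// Exponential recency decay plus dedupe: ten near-identical Telstra documents
// collapse into the single freshest one.
function rerank(hits: ScoredDoc[], halfLifeDays = 90): ScoredDoc[] {
  const now = Date.now();
  const seen = new Set<string>();
  return hits
    .map((h) => {
      const ageDays = (now - h.doc.createdAt.getTime()) / 86_400_000;
      // A document loses half its weight every halfLifeDays.
      return { ...h, score: h.score * Math.pow(0.5, ageDays / halfLifeDays) };
    })
    .sort((a, b) => b.score - a.score)
    .filter((h) => {
      const key = `${h.doc.business}:${h.doc.issueType}`;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
}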
At some point you probably need master summaries that aggregate learnings per business, collapsing dozens of individual post-mortems into a single authoritative document that gets updated as new cases flow in. We haven't built this yet, but the structured schema makes it possible. Every document has the same shape, so aggregation is a query, not a research project.
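The shape of that future piece is already visible, because the inputs are uniform. A hypothetical sketch, nothing more:

declare const knowledgeBase: {
  findAll(filter: { business: string }): Promise<object[]>;
  upsertSummary(business: string, summary: string): Promise<void>;
};
declare function runSummariserAgent(docs: object[]): Promise<string>;

// Collapse every post-mortem for one business into a single living document,
// refreshed as new cases flow in.
async function refreshMasterSummary(business: string): Promise<void> {
  const docs = await knowledgeBase.findAll({ business }); // aggregation is a query
  const summary = await runSummariserAgent(docs);
  await knowledgeBase.upsertSummary(business, summary);
}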
The Takeaway
The agent gets better at its job every time it completes a case. No fine-tuning, no prompt updates, no manual knowledge management. Just a structured knowledge base, a search index, and a human who can correct course when needed. The tenth Telstra case benefits from the nine that came before it. The hundredth benefits from ninety-nine.
This is the new AI moat. Everyone has access to the same models. The difference is the data flywheel you build on top. The knowledge compounds, and the agent compounds with it.
