Things I Built With AI That Completely Failed.
Nobody publishes these. I am.
Every AI newsletter I read is full of wins. Success stories. Workflows that saved 10 hours a week. Products that went from idea to launch in 72 hours. ROI that justified the investment three times over.
Those stories are real. They are also incomplete. For every AI workflow I have deployed successfully, there is at least one I deployed and then quietly dismantled. Here are the ones that failed, and why.
The Automated Client Update System
The idea: pull project status from our database every week, generate a personalized update email for each client, and send it automatically. I spent three days building this. It produced accurate, well-formatted emails that were technically correct and felt like they were written by someone who had never actually met the client.
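For context, the shape of the system was roughly this. A minimal sketch, assuming a simple status table; the schema, the prompt, and the helper stubs are illustrative stand-ins, not the code I actually ran:

```python
import sqlite3


def call_llm(prompt: str) -> str:
    """Stand-in for whatever model API you use; wire up your own client."""
    raise NotImplementedError


def send_email(to: str, subject: str, body: str) -> None:
    """Stand-in for your mailer (SMTP, a transactional email API, etc.)."""
    raise NotImplementedError


def fetch_weekly_status(db_path: str) -> list[dict]:
    """Pull each client's current project status from the database."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT client_name, client_email, project, milestone, blockers "
        "FROM project_status"
    ).fetchall()
    conn.close()
    return [dict(r) for r in rows]


def send_all_updates(db_path: str) -> None:
    for row in fetch_weekly_status(db_path):
        prompt = (
            f"Write a warm weekly update email to {row['client_name']} about "
            f"the {row['project']} project. Current milestone: {row['milestone']}. "
            f"Open blockers: {row['blockers'] or 'none'}."
        )
        draft = call_llm(prompt)
        # Sending automatically, with no human read-through, was the flaw.
        send_email(row["client_email"], "Weekly project update", draft)
```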
The problem was not the content. The problem was the register. Client relationships in agency work exist in a specific emotional context — there is tension around timelines, excitement about results, anxiety about costs, trust that has been built through specific interactions. The AI had none of this context. It produced emails that were professionally appropriate and relationally vacuous.
Two clients mentioned that my communications had started to feel ‘different.’ I shut it down before any damage was done. Lesson: relationships that exist in emotional context cannot be automated. You can use AI to draft, but you must rewrite enough that your voice and your awareness of the specific relationship come through.
The Automated Lead Qualification System
The idea: automatically process incoming leads, score them on fit criteria, and route high-quality leads to a priority follow-up queue. I built the scoring model based on industry, company size, stated problem, and budget signals in the initial message.
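In outline, the scoring looked something like this. The weights, thresholds, and field names below are representative stand-ins rather than the production values:

```python
TARGET_INDUSTRIES = {"saas", "ecommerce", "fintech"}
BUDGET_SIGNALS = ("budget", "retainer", "$")


def score_lead(lead: dict) -> int:
    """Score an inbound lead from its form fields and initial message."""
    score = 0
    if lead.get("industry", "").lower() in TARGET_INDUSTRIES:
        score += 25
    if lead.get("company_size", 0) >= 50:
        score += 25
    if lead.get("stated_problem"):
        score += 20
    message = lead.get("message", "").lower()
    if any(signal in message for signal in BUDGET_SIGNALS):
        score += 30
    return score


def route(lead: dict, threshold: int = 60) -> str:
    """Send high scorers to the priority queue, everything else to the backlog."""
    return "priority" if score_lead(lead) >= threshold else "low_priority"
```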
What I did not account for: the signals that make a lead genuinely interesting are often not in the initial message. Some of the best clients I have worked with sent brief, vague initial emails. Some of the most time-consuming inquiries sent detailed, well-specified briefs. The scoring model was measuring proxies for quality rather than quality itself.
Result: I missed two significant opportunities in the first month because the scoring system routed them to the low-priority queue. Both went with other agencies. I dismantled the scoring system and now review all leads personally, using AI to help me research the company before the call rather than to decide whether to take the call.
The AI Research Report That Went to a Client
This is the painful one.
I was preparing a market analysis for a client. I was behind on the deadline. I used AI to generate the core research synthesis and reviewed it at speed rather than with the care it required. The report went out with a statistic that was directionally correct but numerically wrong — the AI had conflated data from two different studies measuring slightly different things.
The client’s CFO caught it during a board presentation. It was not a relationship-ending error, but it was an embarrassing one, and it required a corrected version and an explanation that consumed more time than doing the research properly would have in the first place.
The rule I implemented afterward: every specific factual claim with a number in it gets verified against a primary source before it leaves my hands. Not spot-checked. Verified. Every one. This takes time. It is not optional.
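If you want to make that rule mechanical, a small pre-send pass that flags every sentence containing a number works as a checklist generator. This is a minimal sketch rather than the exact check I run, and it only finds the claims; a human still has to verify each one against a primary source:

```python
import re

# Matches a digit followed by optional separators and common numeric suffixes.
NUMBER = re.compile(r"\d[\d,.]*\s*(?:%|percent|million|billion|k|x)?", re.IGNORECASE)


def numeric_claims(draft: str) -> list[str]:
    """Return every sentence in the draft that contains a number."""
    sentences = re.split(r"(?<=[.!?])\s+", draft)
    return [s.strip() for s in sentences if NUMBER.search(s)]


if __name__ == "__main__":
    draft = (
        "The category grew 14% last year. Adoption is strongest in mid-market "
        "firms. Roughly 2.3 million seats were added across the segment."
    )
    for claim in numeric_claims(draft):
        print("[ ] verify against a primary source:", claim)
```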
The Voice Cloning Outreach Experiment
I will be brief about this one. I tried using an AI voice system to make initial outreach calls at scale. The quality of the voice was good. The conversations were not. People could tell. The experimental system also created awkward moments when people asked follow-up questions the AI was not equipped to handle. I ended the experiment after 48 hours and did not tell the people I had called what they had actually been talking to. That part I am still thinking about.
The Fully Automated Social Content Calendar
I built a system that would generate 30 days of social content, schedule it, and post it automatically. Quality on day one: good. Quality on day 15: noticeably generic. Quality on day 28: I had three posts go out that said things I would not have said, about topics I had moved on from, in a tone that was technically consistent with my style guide and subtly wrong.
The automation had no awareness that my thinking had evolved. It was faithfully producing content based on a snapshot of my perspective that was already three weeks out of date.
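In outline, the generator was something like this. The model call is a stand-in for whatever API you use; the point is that the style guide and topic list are fixed the moment the calendar is generated:

```python
from datetime import date, timedelta
from typing import Callable


def generate_calendar(
    draft_post: Callable[[str], str],  # wraps whatever model API you use
    style_guide: str,
    topics: list[str],
    days: int = 30,
) -> list[dict]:
    """Draft one post per day from a static style guide and topic list."""
    posts = []
    for offset in range(days):
        prompt = (
            f"Write a short social post about '{topics[offset % len(topics)]}'. "
            f"Follow this style guide exactly:\n{style_guide}"
        )
        posts.append({
            "publish_on": date.today() + timedelta(days=offset),
            "text": draft_post(prompt),
        })
    # Every post is drafted up front from the same frozen inputs; nothing
    # checks whether the author still agrees with them by day 28.
    return posts
```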
I use AI to draft social content now. I schedule it myself after reviewing it. The drafting saves me about 4 hours a month. The oversight costs me 2. Net saving: 2 hours. Acceptable. But not the frictionless system I had imagined.
Every system I shut down taught me something the success stories didn’t.
What the Failures Have in Common
Looking at these together, the pattern is clear: every failure involved deploying AI in a context where the relevant information was not in any document I could give it.
Relationship texture. Signal in what is not said. The evolution of my own thinking over time. The judgment calls that depend on reading a situation rather than analyzing it.
AI is excellent at working with explicit, structured information. It is poor at working with implicit, contextual, relational information. The failures happened every time I forgot this.
Publish your failures.
The AI field is drowning in optimism and short on honesty. The most useful thing you can contribute to the people around you is an accurate picture of what AI actually does, not just what it can do on a good day with a well-defined task and a patient human reviewing every output.
I will keep publishing mine.


