Source record

@ray_fu

2026-06-03

Most people don't realize the reason their AI agents and automations give inconsistent answers is because they're not testing them properly.

TikTok, excerpt only, en

Source Text

Most people don't realize the reason their AI agents and automations give inconsistent answers is because they're not testing them properly. You build something, it works once, you ship it, and then it gives a completely different answer the next day, even with the same input. Big companies like Notion and Stripe solve this problem with a tool called Braintrust, and I'm gonna show you exactly how to use it to test out my new customer support bot that I'm selling to restaurants.

Braintrust has a genius feature that I'm using to test this out. First I created a project and then a playground, and I pasted in my prompt for the support chat.

I then input my data set, which is just 50 real questions that customers can ask. Braintrust ran my prompt against every single test case at once and scored each output automatically. I found that 30% of them were failing, and I would never have found this out by just testing manually.
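The batch run described above can be sketched in plain Python. This is a minimal illustration of the idea, not Braintrust's API: `Case`, `keyword_score`, `run_eval`, and `stub_bot` are hypothetical names, the keyword check is only one possible scorer, and the stub stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    question: str
    must_mention: str  # keyword the answer has to contain to pass (assumed scoring rule)


def keyword_score(answer: str, case: Case) -> float:
    """Return 1.0 if the required keyword appears in the answer, else 0.0."""
    return 1.0 if case.must_mention.lower() in answer.lower() else 0.0


def run_eval(
    bot: Callable[[str], str],
    cases: list[Case],
    scorer: Callable[[str, Case], float] = keyword_score,
    pass_mark: float = 1.0,
) -> dict:
    """Run the bot on every case, score each answer, and report the failure rate."""
    results = []
    for case in cases:
        answer = bot(case.question)
        score = scorer(answer, case)
        results.append(
            {"question": case.question, "answer": answer,
             "score": score, "passed": score >= pass_mark}
        )
    failed = sum(1 for r in results if not r["passed"])
    failure_rate = failed / len(results) if results else 0.0
    return {"results": results, "failure_rate": failure_rate}


def stub_bot(question: str) -> str:
    """Hypothetical stand-in for the real chatbot call."""
    if "hours" in question.lower():
        return "We are open 11am to 10pm every day."
    return "Sorry, I am not sure."
```

Calling `run_eval(stub_bot, cases)` on a list of cases returns every scored answer plus an overall `failure_rate`, which is the kind of number the 30% figure refers to.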

But I don't want to just send it and hope that it keeps working. So I connected Braintrust to my chatbot by wrapping my API calls. Now every single conversation that my chatbot has with a real customer gets logged automatically.
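Wrapping the API call so that every conversation is logged can be sketched with a decorator. This is a generic illustration under stated assumptions, not Braintrust's SDK: `logged`, `conversation_log`, and `call_model` are hypothetical names, the log is an in-memory list rather than a hosted dashboard, and cost tracking is left out because it depends on token counts from the model provider.

```python
import functools
import time
from typing import Any, Callable


def logged(log: list[dict[str, Any]]) -> Callable:
    """Build a decorator that records input, output, and latency for each call."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        @functools.wraps(fn)
        def wrapper(question: str) -> str:
            start = time.perf_counter()
            answer = fn(question)
            log.append(
                {
                    "input": question,
                    "output": answer,
                    "seconds": time.perf_counter() - start,
                }
            )
            return answer
        return wrapper
    return decorator


conversation_log: list[dict[str, Any]] = []


@logged(conversation_log)
def call_model(question: str) -> str:
    """Hypothetical stand-in for the real model API call."""
    return f"Echo: {question}"
```

Because the wrapper returns the answer unchanged, the chatbot code that calls `call_model` does not need to know logging is happening.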

Every question and answer that comes in shows exactly how long it took and how much it cost, and I can see it all happening live on the dashboard. Then I set up a quality threshold so that if, say, the chatbot keeps getting a customer's peanut allergy question wrong, it would alert me as soon as it happens.
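A quality threshold like this can be sketched as a rolling average over recent scores. The class below is a hypothetical illustration, not how Braintrust implements alerts: `QualityMonitor`, its default values, and the `on_alert` callback are all assumptions.

```python
from collections import deque
from typing import Callable


class QualityMonitor:
    """Alert when the average of the most recent scores drops below a threshold."""

    def __init__(
        self,
        threshold: float = 0.8,   # assumed minimum acceptable average score
        window: int = 20,         # how many recent scores to average over
        min_samples: int = 5,     # do not alert before this many scores arrive
        on_alert: Callable[[str], None] = print,
    ) -> None:
        self.threshold = threshold
        self.min_samples = min_samples
        self.on_alert = on_alert
        self.scores: deque[float] = deque(maxlen=window)

    def record(self, score: float) -> bool:
        """Store one score; return True if this call triggered an alert."""
        self.scores.append(score)
        if len(self.scores) < self.min_samples:
            return False
        average = sum(self.scores) / len(self.scores)
        if average < self.threshold:
            self.on_alert(
                f"Quality alert: average {average:.2f} is below {self.threshold:.2f}"
            )
            return True
        return False
```

In practice `on_alert` would send an email or chat message rather than print, and the score fed to `record` would come from whatever scorer grades the live answers.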

And here's the coolest part. For that peanut allergy question that Braintrust caught, I was able to add that question directly into my data set so that every time I update my prompt, Braintrust automatically checks against the new edge cases. So it never fails twice the same way.
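Feeding a caught failure back into the test set can be sketched as appending to a dataset file that every later evaluation run reads. This is a minimal sketch under assumptions: the JSON Lines file format, the `question` and `expected` field names, and the helpers `load_cases` and `add_case` are hypothetical and are not Braintrust's dataset API.

```python
import json
from pathlib import Path


def load_cases(path: Path) -> list[dict]:
    """Read all test cases from a JSON Lines file; a missing file means no cases."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def add_case(path: Path, question: str, expected: str) -> bool:
    """Append a failing production question as a new test case.

    Returns False without writing if the same question is already in the dataset.
    """
    if any(case["question"] == question for case in load_cases(path)):
        return False
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"question": question, "expected": expected}) + "\n")
    return True
```

Once the peanut allergy question is in the file, any run that evaluates a new prompt against `load_cases(path)` checks it automatically, which is what keeps the same failure from slipping through twice.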

This is the mistake that people building AI tools and automations keep making.

How can you be confident charging and keeping customers if you're building out an AI automation? I have a detailed step-by-step guide on how to build this out just like in the video. Comment "email" and I'll send it to you.

Follow for more ways to make money and run your business with tech and AI.
