I've been integrating LLM features into several web products (support tools, workflow automation), and we've hit a major reliability issue. Once the AI starts triggering real actions, like processing refunds or updating accounts, it stops being just a UX question and becomes a correctness question. I've built a tool called Verifact that sits between the AI and the API, using deterministic checks and audit logs to verify actions before they're executed. I'm curious how other web development teams are handling this. Do you let the AI make API calls directly? Do you rely on hardcoded rules, or a human review step? Or do you avoid direct actions altogether? I'd love to hear what has worked (or not) for you.
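For reference, here's the shape of the pattern as a minimal sketch (not Verifact's actual code; the action types and checks below are placeholder examples): the model can only propose a typed action, and deterministic rules plus an audit log sit between that proposal and the real API.

```typescript
// Hypothetical sketch of a verification layer between an LLM and a real API.
// Names (ProposedAction, Order, verifyRefund, etc.) are illustrative.

type ProposedAction =
  | { kind: "refund"; orderId: string; amountCents: number }
  | { kind: "updateEmail"; accountId: string; newEmail: string };

interface Order {
  id: string;
  totalCents: number;
  refundedCents: number;
}

interface AuditEntry {
  timestamp: string;
  action: ProposedAction;
  verdict: "approved" | "rejected";
  reason: string;
}

const auditLog: AuditEntry[] = [];

function log(action: ProposedAction, verdict: AuditEntry["verdict"], reason: string): void {
  auditLog.push({ timestamp: new Date().toISOString(), action, verdict, reason });
}

// Deterministic checks only; the model has no say past this point.
function verifyRefund(
  action: Extract<ProposedAction, { kind: "refund" }>,
  order: Order | undefined
): string | null {
  if (!order) return "order not found";
  if (action.amountCents <= 0) return "refund amount must be positive";
  if (order.refundedCents + action.amountCents > order.totalCents) {
    return "refund would exceed order total";
  }
  return null; // null = passes every check
}

function execute(
  action: ProposedAction,
  lookupOrder: (id: string) => Order | undefined
): boolean {
  if (action.kind === "refund") {
    const failure = verifyRefund(action, lookupOrder(action.orderId));
    if (failure !== null) {
      log(action, "rejected", failure);
      return false;
    }
    log(action, "approved", "all refund checks passed");
    // ...only now call the real payments API...
    return true;
  }
  // Deny by default: any action kind without a registered verifier is rejected.
  log(action, "rejected", `no verifier for kind "${action.kind}"`);
  return false;
}
```

The design choice that matters most in practice is deny-by-default: an action kind with no verifier gets rejected and logged rather than passed through.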
5 Answers
Honestly, you can't fully ensure safety with AI; if you let it loose, it could wreak havoc.
It really varies. Off-the-shelf LLM APIs are nondeterministic, so the same prompt can produce different results from one call to the next. Fine-tuned models can be far more consistent, but that depends on high-quality training data. It's a balancing act!
You can't fully prevent errors; nondeterministic behavior means you need safeguards in place. Using AI to handle refunds? I'd only let it suggest actions and keep the real checks elsewhere, so a mishap can't actually cost you money.
Exactly. Treat AI like an unverified service—don’t let it call the shots.
Honestly, just don't let the AI touch production directly. That's the simplest solution. If you limit it to suggesting actions, you'll avoid a lot of headaches down the line!
I wish I could upvote this more than once!
I've faced this in smaller projects too. My solution was a hybrid approach where AI suggests actions, but everything is validated through deterministic logic first. Your Verifact model sounds solid—good idea to think of AI as untrusted.
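Roughly what my hybrid looks like, as a sketch with made-up names (parseSuggestion, handleModelOutput, and the 0.9 threshold are all placeholders): the model's raw output is parsed defensively, and anything malformed or low-confidence never reaches the deterministic validator.

```typescript
// Sketch of the "suggest, then validate" flow. The model only ever emits a
// JSON suggestion; deterministic code decides what happens next.

interface Suggestion {
  action: "refund" | "update_account";
  params: Record<string, unknown>;
  confidence: number; // model self-reported, advisory only
}

// Parse defensively: anything malformed counts as "no suggestion".
function parseSuggestion(raw: string): Suggestion | null {
  try {
    const data = JSON.parse(raw);
    if (
      (data.action === "refund" || data.action === "update_account") &&
      typeof data.params === "object" &&
      data.params !== null &&
      typeof data.confidence === "number"
    ) {
      return data as Suggestion;
    }
  } catch {
    // unparseable output falls through to the rejection below
  }
  return null;
}

function handleModelOutput(raw: string): void {
  const suggestion = parseSuggestion(raw);
  if (!suggestion) {
    console.log("rejected: output is not a valid structured suggestion");
    return;
  }
  // The confidence score never overrides business rules; it only routes
  // borderline cases to a human instead of the automatic validator.
  if (suggestion.confidence < 0.9) {
    console.log(`queued for human review: ${suggestion.action}`);
    return;
  }
  console.log(`passing to deterministic validator: ${suggestion.action}`);
}

// Example: a well-formed suggestion goes to the validator, free text is dropped.
handleModelOutput('{"action":"refund","params":{"orderId":"o-123"},"confidence":0.97}');
handleModelOutput("Sure! I will refund the customer now.");
```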
Yep, that’s the takeaway I got too!

Haha, right! Customer: 'You owe me $10,000!' AI: 'Oh, oops! I totally do!'