I'm curious how others are dealing with LLM drift in their development workflows. I usually set constraints early on, such as architecture and tool choices, and while the model agrees at first, it soon starts suggesting options that clash with those decisions. The suggestions aren't wrong in isolation, but consistency with the agreed constraints fades over time. What I've found useful is to avoid relying on conversation history for memory. Instead, I pull decisions into a structured layer and expose them via an API, injecting only the relevant decisions into each prompt. This approach has significantly stabilized the model's responses. I even created a library to automate this process, which you can check out on GitHub. I'd love to hear how others are tackling this issue.
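To make the idea concrete, here is a minimal sketch of a "decisions as a structured layer" approach, not the poster's actual library: decisions live in a small store outside the chat history, and only the ones tagged as relevant to the current task get injected into the prompt. All names (`DecisionStore`, `build_prompt`, the ADR IDs) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    id: str
    summary: str
    tags: set = field(default_factory=set)

class DecisionStore:
    """Structured layer holding project decisions outside the chat history."""
    def __init__(self):
        self._decisions = []

    def add(self, decision):
        self._decisions.append(decision)

    def relevant(self, topic_tags):
        # Return only decisions whose tags overlap the current task's topic.
        return [d for d in self._decisions if d.tags & set(topic_tags)]

def build_prompt(user_message, store, topic_tags):
    # Inject only the matching decisions, not the whole decision log.
    constraints = "\n".join(
        f"- [{d.id}] {d.summary}" for d in store.relevant(topic_tags)
    )
    return f"Project constraints:\n{constraints}\n\nTask:\n{user_message}"

store = DecisionStore()
store.add(Decision("ADR-001", "Use PostgreSQL; no other datastores.", {"database", "storage"}))
store.add(Decision("ADR-002", "Frontend is React with TypeScript.", {"frontend"}))

prompt = build_prompt("Add a caching layer.", store, ["database"])
# The prompt mentions ADR-001 but omits the irrelevant ADR-002.
```

The point of the filter is that each prompt carries only the constraints the model could plausibly violate for that task, keeping the injected context small and stable across a long session.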
2 Answers
I've integrated Architecture Decision Records (ADRs) into my workflow. When a critical choice is made, I create an ADR, and all of them get listed in an INDEX.md file. A subagent manages these and pulls in relevant ADRs as needed. So far it's working great, but I'm considering a vector database for handling a larger number of decisions more smoothly in the future. I want to look into your project too! I'm interested in what other alternatives you considered for your approach.
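The INDEX.md-plus-subagent workflow above could be sketched like this: parse the index into structured entries, then let the retrieval step pick ADRs whose tags intersect the current task. The index layout and tag convention here are assumptions for illustration, not the answerer's actual file format.

```python
import re

# Hypothetical INDEX.md layout: one bullet per ADR with inline tags.
INDEX_MD = """\
# ADR Index
- ADR-001: Use PostgreSQL as the only datastore (tags: database, storage)
- ADR-002: All services communicate over gRPC (tags: api, transport)
- ADR-003: Frontend built with React + TypeScript (tags: frontend)
"""

def parse_index(text):
    """Parse '- ADR-xxx: title (tags: a, b)' lines into (id, title, tags) tuples."""
    pattern = re.compile(r"- (ADR-\d+): (.+) \(tags: ([^)]+)\)")
    entries = []
    for m in pattern.finditer(text):
        adr_id, title, tags = m.groups()
        entries.append((adr_id, title, {t.strip() for t in tags.split(",")}))
    return entries

def select_adrs(entries, query_tags):
    """Return ADRs whose tags intersect the tags of the current task."""
    wanted = set(query_tags)
    return [e for e in entries if e[2] & wanted]

entries = parse_index(INDEX_MD)
hits = select_adrs(entries, {"api"})
# hits contains only the ADR-002 entry.
```

Tag-intersection retrieval like this stays workable at small scale; the vector-database idea mentioned above would replace `select_adrs` with embedding similarity once exact tags stop covering how tasks are phrased.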
I totally agree with the idea of injecting rather than remembering! Maintaining context over long sessions can get messy. A markdown decisions file can work when things are simple, but your API wrapper is what really shines when dealing with a large set of constraints. It helps filter out irrelevant details. I'm also playing with selective injection based on context rather than just dumping everything back. Looking forward to seeing how your library adapts to these challenges!
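The selective-injection idea mentioned here, choosing decisions by relevance to the current message rather than dumping everything, can be sketched with a crude word-overlap score. This is an illustrative baseline only; the scoring function, threshold, and decision texts are all made up for the example.

```python
def score(decision_text, message):
    """Crude relevance score: fraction of decision words appearing in the message."""
    d_words = set(decision_text.lower().split())
    m_words = set(message.lower().split())
    return len(d_words & m_words) / max(len(d_words), 1)

def select_top_k(decisions, message, k=2, threshold=0.1):
    """Inject at most k decisions, and only ones above a relevance floor."""
    ranked = sorted(decisions, key=lambda d: score(d, message), reverse=True)
    return [d for d in ranked[:k] if score(d, message) >= threshold]

decisions = [
    "All database access goes through the repository layer",
    "Authentication uses OAuth2 with short-lived tokens",
    "Logs are structured JSON shipped to a central collector",
]
picked = select_top_k(
    decisions, "How should I structure database queries in the repository?"
)
# Only the database/repository decision clears the threshold.
```

In practice, word overlap is fragile (it misses synonyms and paraphrases), which is why embedding-based similarity tends to replace it as the decision set grows, but the top-k-plus-threshold shape of the filter stays the same.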
