I'm developing a chatbot with AWS Bedrock (Claude) and storing conversations in OpenSearch and RDS. I'm particularly worried about accidentally storing or processing sensitive user information, like passwords. Right now, I have AWS Comprehend set up to detect PII in user inputs. If it identifies any PII, I block the message, notify the user, and don't save the conversation. I've recently discovered Bedrock Guardrails, which can enforce policies that block or mask sensitive content in model inputs and outputs. I'm wondering if I should just rely on Bedrock Guardrails instead of using Comprehend to filter inputs first. Should I use both in combination for better security? Are there any examples of both being used together effectively? Looking for insight from those who have experience with secure LLM pipelines and handling PII in AI. Thanks!
5 Answers
I don’t have a direct answer, but I’m following this thread because it’s very interesting!
Don't replace one with the other; they serve different roles! Comprehend acts as a preprocessing filter that catches PII before it ever enters your pipeline, while Guardrails enforce policies at runtime, on both the prompt the model receives and the response it generates. The best practice is a layered defense: filter on input and monitor on output. A common setup chains a pre-filter, the model, and a post-filter, which also gives you full auditability at each stage.
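To make the pre-filter stage concrete, here's a minimal sketch using Comprehend's `detect_pii_entities`. The blocked entity types and the 0.8 confidence threshold are assumptions you'd tune for your own risk tolerance:

```python
# Sketch of a Comprehend pre-filter; the blocked types and threshold
# below are assumptions, not a definitive policy.
BLOCKED_TYPES = {"PASSWORD", "SSN", "CREDIT_DEBIT_NUMBER", "EMAIL"}

def has_blocked_pii(entities, threshold=0.8):
    """Return True if any detected entity is a blocked type above the threshold."""
    return any(
        e["Type"] in BLOCKED_TYPES and e["Score"] >= threshold
        for e in entities
    )

def prefilter(text):
    """Run Comprehend PII detection on raw user input before it reaches Bedrock."""
    import boto3  # imported lazily so the helper above stays testable offline
    comprehend = boto3.client("comprehend")
    resp = comprehend.detect_pii_entities(Text=text, LanguageCode="en")
    return not has_blocked_pii(resp["Entities"])  # True = safe to forward
```

If `prefilter` returns False, you block the message and skip persistence, exactly as the original poster describes.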
I've addressed your original question. By the way, why did you choose RDS for conversation storage? I've seen people use OpenSearch or S3 more frequently.
Absolutely, you can configure Bedrock Guardrails to identify PII and block those requests from reaching the model entirely. They’ve recently improved guardrails with new reasoning capabilities, which might be worth exploring! Check out their [Automated Reasoning](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-automated-reasoning-checks.html) feature.
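For the input-blocking part specifically, the standalone `ApplyGuardrail` API lets you evaluate a prompt against a guardrail before the model ever sees it. A minimal sketch, assuming you've already created a guardrail (the ID and version here are placeholders):

```python
# Sketch of checking user input against a Bedrock Guardrail pre-model;
# the guardrail ID/version are placeholders for your own.
def guardrail_blocked(response):
    """True if the ApplyGuardrail response indicates the guardrail intervened."""
    return response.get("action") == "GUARDRAIL_INTERVENED"

def check_input(text, guardrail_id, guardrail_version="1"):
    import boto3  # lazy import keeps the helper above testable without AWS
    runtime = boto3.client("bedrock-runtime")
    resp = runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",                  # evaluate the user prompt itself
        content=[{"text": {"text": text}}],
    )
    return not guardrail_blocked(resp)   # True = safe to send to the model
```

Because `source="INPUT"` evaluates the prompt on its own, this can replace or back up a Comprehend pre-filter for PII that Guardrails is configured to catch.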
Thanks for the link! I looked it over, and it seems like the Automated Reasoning checks mainly validate model outputs against policies for compliance and accuracy, but I wonder if they really stop sensitive inputs from reaching the model in the first place. For that, I might still need Comprehend to filter out PII before it even reaches Bedrock.
I think combining both is definitely the way to go! Using Comprehend as a proactive filter for sensitive inputs is solid. That said, some models, like Claude Sonnet, ship with fairly strong built-in safety behavior, which can overlap with what a Guardrails policy adds. Everyone's experience differs, so I'm very curious about other setups.
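The combined pipeline can be sketched end to end: Comprehend screens the raw input, and the Converse API runs with a guardrail attached so intervention also covers the model's output. The model ID, guardrail ID, and threshold here are all assumptions:

```python
# Sketch of the combined pipeline: Comprehend pre-filter, then Converse
# with a guardrail attached. IDs and threshold are placeholders.
def decide(pii_found, stop_reason):
    """Map the two checks onto a single pipeline outcome."""
    if pii_found:
        return "blocked_at_input"        # Comprehend caught PII pre-model
    if stop_reason == "guardrail_intervened":
        return "blocked_by_guardrail"    # Guardrail intervened at runtime
    return "allowed"

def chat(text):
    import boto3  # lazy import keeps decide() testable offline
    comprehend = boto3.client("comprehend")
    bedrock = boto3.client("bedrock-runtime")

    pii = comprehend.detect_pii_entities(Text=text, LanguageCode="en")
    if any(e["Score"] >= 0.8 for e in pii["Entities"]):
        return decide(True, None)

    resp = bedrock.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        messages=[{"role": "user", "content": [{"text": text}]}],
        guardrailConfig={"guardrailIdentifier": "your-guardrail-id",
                         "guardrailVersion": "1"},
    )
    return decide(False, resp.get("stopReason"))
```

Logging which layer blocked a message gives you the auditability the layered-defense answer above mentions.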
Good question! We use OpenSearch for searching conversations, but have Lambda functions that process the data and export analytics results into RDS, which is then linked to QuickSight for reporting. It's cleaner than trying to connect QuickSight directly to OpenSearch.
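The OpenSearch-to-RDS hop can be sketched as a small Lambda; the index and field names are assumptions, and the client wiring and DB insert are stubbed out since they depend on your setup:

```python
# Sketch of the Lambda step that flattens an OpenSearch date_histogram
# aggregation into rows for RDS; names are assumptions, DB insert stubbed.
def buckets_to_rows(aggregation):
    """Turn OpenSearch histogram buckets into (day, conversation_count) rows."""
    return [
        (b["key_as_string"], b["doc_count"])
        for b in aggregation["buckets"]
    ]

def handler(event, context):
    # Hypothetical wiring: query OpenSearch, then bulk-insert into RDS.
    # from opensearchpy import OpenSearch   # client setup + query not shown
    # rows = buckets_to_rows(resp["aggregations"]["per_day"])
    # cursor.executemany("INSERT INTO daily_counts VALUES (%s, %s)", rows)
    pass
```

Keeping the flattening logic separate from the I/O makes the Lambda easy to unit-test without live clusters.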