Do I Need to Create Embeddings Manually for AWS OpenSearch Serverless Vector Search?

Asked By TechieExplorer42 On

Hey everyone! I'm currently looking into the semantic search features offered by AWS OpenSearch, specifically using the vectorsearch collection type. From what I gather in the documentation, I need to generate the embeddings for a field myself before I can ingest my documents. I was hoping there would be a way to have the embeddings generated automatically when I define a knn_vector field. I've also read about integrating with SageMaker or Bedrock, but I'm not seeing that option for the serverless collection. Any insights or guidance would be really helpful, thanks!

5 Answers

Answered By BudgetHunter99 On

Right? Pinecone definitely seems like a better deal when you look at AWS OpenSearch pricing!

Answered By DataInnovator56 On

You could also use Bedrock Knowledge Bases, which generate the embeddings automatically whenever you sync your data source (S3, for example). You have to map the fields properly, but it works with your existing OpenSearch setup.
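To make the field mapping part a bit more concrete, here is a rough sketch in Python. It only builds the storage configuration that points a knowledge base at an OpenSearch Serverless index and wraps the boto3 `bedrock-agent` call that starts a sync. The helper names, the default field names (`embedding`, `text`, `metadata`), and all IDs are made up for illustration, so treat it as a starting point under those assumptions rather than a finished setup.

```python
def build_storage_config(collection_arn, index_name,
                         vector_field="embedding",
                         text_field="text",
                         metadata_field="metadata"):
    """Build the storageConfiguration dict for a knowledge base.

    The field names are hypothetical defaults; they must match the
    fields that actually exist in your vector index.
    """
    return {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": collection_arn,
            "vectorIndexName": index_name,
            "fieldMapping": {
                "vectorField": vector_field,      # knn_vector field
                "textField": text_field,          # raw chunk text
                "metadataField": metadata_field,  # source metadata
            },
        },
    }


def start_sync(agent_client, knowledge_base_id, data_source_id):
    """Start an ingestion job (a "sync") and return its job ID.

    agent_client is expected to be boto3.client("bedrock-agent").
    """
    response = agent_client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    return response["ingestionJob"]["ingestionJobId"]


# Example (needs AWS credentials and real IDs, so it is left commented out):
# import boto3
# client = boto3.client("bedrock-agent")
# job_id = start_sync(client, "YOUR_KB_ID", "YOUR_DATA_SOURCE_ID")
```

The dict from `build_storage_config` is meant to be passed as `storageConfiguration` when creating the knowledge base; the embeddings themselves are produced by Bedrock during the sync, not by this code.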

Answered By MLGuru23 On

It doesn't look like automatic embedding generation is supported out of the box. However, you can set it up with the ML plugin together with an ingest pipeline. That applies to regular (managed) AWS OpenSearch, so I can't say how it would work in the serverless case. Check out this link for more details.
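For reference, the ML plugin plus ingest pipeline setup could look roughly like the sketch below, written as Python dicts that you would send as request bodies. It uses the `text_embedding` ingest processor and an index with `default_pipeline` set. The function names, field names, pipeline name, and model ID are placeholders, and as noted above this targets regular OpenSearch; whether it works on a serverless collection is not confirmed here.

```python
def build_embedding_pipeline(model_id, source_field, vector_field):
    """Body for PUT /_ingest/pipeline/<pipeline_name>.

    model_id is the ID of a model already registered and deployed
    through the ML plugin (placeholder here).
    """
    return {
        "description": "Generate embeddings at ingest time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    # Read text from source_field, write the vector to vector_field.
                    "field_map": {source_field: vector_field},
                }
            }
        ],
    }


def build_index_body(pipeline_name, source_field, vector_field, dimension):
    """Body for PUT /<index_name>.

    dimension must match the output size of the embedding model.
    """
    return {
        "settings": {
            "index.knn": True,
            # Every document indexed here runs through the pipeline first.
            "default_pipeline": pipeline_name,
        },
        "mappings": {
            "properties": {
                source_field: {"type": "text"},
                vector_field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


pipeline = build_embedding_pipeline("YOUR_MODEL_ID", "passage_text", "passage_embedding")
index_body = build_index_body("nlp-ingest-pipeline", "passage_text", "passage_embedding", 768)
```

With both in place, you index plain documents containing only `passage_text` and the pipeline fills in `passage_embedding`. The 768 is just an example value; use whatever dimension your model produces.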

Answered By CloudSavvy2023 On

Honestly, I'd recommend considering Pinecone instead. It's much more budget-friendly than AWS OpenSearch.

Answered By VectorWizard88 On

Yep, you'll need to create the embeddings yourself. You can use AWS Bedrock, particularly the Titan model, to help with this. Remember, embeddings are simply vectors that represent your text or other data in a certain space. OpenSearch has no way of knowing what you're trying to represent, whether that's a single document field, the whole document, or something like an image. Bedrock and Titan are worth checking out for this.
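As a minimal sketch of the do-it-yourself route, the Python below calls a Titan text embedding model through the boto3 `bedrock-runtime` client and attaches the vector to a document before you index it. The model ID, the helper names, and the `title`/`body`/`body_vector` field names are assumptions for illustration; pick the model and fields that fit your own index.

```python
import json

# Assumed model ID; check which Titan embedding models are enabled in your account.
TITAN_MODEL_ID = "amazon.titan-embed-text-v2:0"


def embed_text(runtime_client, text, model_id=TITAN_MODEL_ID):
    """Return the embedding vector for text as a list of floats.

    runtime_client is expected to be boto3.client("bedrock-runtime").
    """
    response = runtime_client.invoke_model(
        modelId=model_id,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def build_document(runtime_client, title, body_text):
    """Build a document whose body_vector field holds the embedding.

    Here the choice is to embed only the body field; you could just as
    well embed the title, or title and body concatenated.
    """
    return {
        "title": title,
        "body": body_text,
        "body_vector": embed_text(runtime_client, body_text),
    }


# Example (needs AWS credentials, so it is left commented out):
# import boto3
# runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# doc = build_document(runtime, "Intro", "OpenSearch supports k-NN search.")
```

The resulting document can then be indexed into the collection with whatever client you already use, as long as `body_vector` is mapped as a `knn_vector` with the same dimension the model returns. At query time you embed the search text the same way and run a k-NN query against that field.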

DocSearchNerd -

Got it, I'm actually looking to use the pre-trained models mentioned in the OpenSearch docs. Thanks for the tip!
