What are the best AWS services for faster text embedding generation?

Asked By TechieExplorer42 On

I'm working on a project where I use a Hugging Face transformer model to generate text embeddings. While the accuracy is great, I'm facing some performance and latency challenges, especially when handling large batches of data. Since I'm already using AWS for hosting, I'm curious if there are any AWS-native or managed services that can generate embeddings directly via API, like the APIs from OpenAI or Cohere. Ideally, I'd prefer a solution that doesn't require me to deploy any models myself. Any suggestions?

2 Answers

Answered By DataDrivenDude On

You might want to consider Amazon Bedrock. It offers Titan Embeddings and Cohere Embed models, all accessible via API (Anthropic models are on Bedrock too, but they are text-generation models, not embedding models). It's serverless, so you won’t have to manage infrastructure, though per-account request quotas still apply. If you'd rather go the SageMaker route, SageMaker JumpStart has 'text embedding' models that you can deploy to real-time endpoints; on suitably sized instances these can give lower latency than a self-hosted Hugging Face pipeline, but you do have to deploy and pay for the endpoint yourself. I’d recommend starting with Titan Embeddings on Bedrock since it's serverless and integrates well with other AWS services.
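
For a rough idea of what calling Titan Embeddings looks like, here is a minimal sketch using the `invoke_model` call on a boto3 `bedrock-runtime` client. The model ID `amazon.titan-embed-text-v2:0`, the 1024-dimension setting, and the helper names `build_titan_request` and `embed_text` are my assumptions for illustration, so check the model ID and request fields against the Bedrock docs for your region before relying on them.

```python
import json

# Assumed model ID for Titan Text Embeddings V2; verify it is enabled in your account/region.
TITAN_MODEL_ID = "amazon.titan-embed-text-v2:0"


def build_titan_request(text, dimensions=1024, normalize=True):
    """Serialize the JSON request body for a single input text."""
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": normalize}
    )


def embed_text(client, text, model_id=TITAN_MODEL_ID):
    """Return the embedding vector (list of floats) for one text.

    `client` is expected to be a boto3 "bedrock-runtime" client,
    or any object exposing a compatible invoke_model method.
    """
    response = client.invoke_model(
        modelId=model_id,
        body=build_titan_request(text),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing a JSON document.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

With AWS credentials configured, you would create the client with `boto3.client("bedrock-runtime", region_name="us-east-1")` (the region is just an example) and call `embed_text(client, "some text")`. Passing the client in as an argument keeps the function easy to test without network access.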

Answered By CloudNinja99 On

Have you checked out Amazon Bedrock? It provides embedding models from both AWS (Titan) and Cohere. Some users say Cohere performs better, but it might come at a higher cost. Here are a couple of links to get you started: [Cohere Embed v4 parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed-v4.html) and [Titan embedding models](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html). You could also explore Bedrock Knowledge Bases, which manage the embedding and retrieval steps for you and can make for a cheaper retrieval-augmented generation (RAG) setup.
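
Since the question mentions large batches, here is a hedged sketch of batching documents through a Cohere Embed model on Bedrock. It assumes the Cohere Embed v3 request shape (`texts` plus `input_type`), a limit of 96 texts per call, and the model ID `cohere.embed-english-v3`; the Embed v4 parameters linked above may differ, and the helper names `chunked` and `embed_documents` are made up for this example.

```python
import json

# Assumptions: Cohere Embed v3 on Bedrock, which accepts up to 96 texts per request.
COHERE_MODEL_ID = "cohere.embed-english-v3"
MAX_TEXTS_PER_CALL = 96


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_documents(client, texts, model_id=COHERE_MODEL_ID):
    """Embed a list of texts, sending one request per batch.

    `client` is expected to be a boto3 "bedrock-runtime" client,
    or any object exposing a compatible invoke_model method.
    Returns one vector per input text, in input order.
    """
    vectors = []
    for batch in chunked(texts, MAX_TEXTS_PER_CALL):
        body = json.dumps({"texts": batch, "input_type": "search_document"})
        response = client.invoke_model(
            modelId=model_id,
            body=body,
            contentType="application/json",
            accept="application/json",
        )
        payload = json.loads(response["body"].read())
        vectors.extend(payload["embeddings"])
    return vectors
```

Sending many texts per request cuts the number of round trips compared with one call per text, which is usually where most of the latency goes for large jobs. For queries (as opposed to documents being indexed), Cohere's v3 models expect `input_type` to be `search_query` instead.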
