I have two main questions regarding embedding models and AWS deployment. First, which embedding model is best for testing? I'm looking to create embeddings from some PDF forms. Second, which AWS service would be the right fit for this task? Any guidance on how to deploy the model would be appreciated.
To give you some background, I tried deploying the Qwen3 0.6B embedding model on SageMaker, but it didn't work out. I spent an entire evening on it, trying to run the quick-deployment code from the model's Hugging Face page. The deployment itself succeeded, but every inference request failed with a timeout: "Your invocation timed out while waiting for a response from container primary. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
2 Answers
Are you looking for a customized solution, or would something like Amazon Bedrock Knowledge Bases suffice? If you just need a ready-to-use embedding model, that might be a solid option, especially if you plan to connect it to S3 as a document source and store the vectors in a supported vector store.
I managed to get embedding models working on a Dockerized Lambda setup. Choosing the right model really depends on your specific needs, and it can evolve over time. If you're just looking for a quick and easy solution, consider using the embedding models directly from Bedrock. It's serverless, on-demand, and you only pay per token. I tried the Titan Embeddings model recently, and it worked well for me—I processed about a million tokens for under 10 cents!
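To illustrate the Bedrock route, here is a minimal sketch of calling Titan Text Embeddings V2 with boto3. The model ID `amazon.titan-embed-text-v2:0`, the region, and the `dimensions` value are assumptions you should adjust for your account; the live call itself needs AWS credentials and Bedrock model access, so it's shown commented out, with the request/response handling factored into plain helper functions:

```python
import json

# Hypothetical helpers around the Bedrock invoke_model request/response
# shape for Titan Text Embeddings V2 (model ID and fields assumed from
# the Bedrock docs; verify against your region and model version).

def build_titan_request(text, dimensions=1024):
    """Build the JSON body Bedrock expects for a Titan V2 embedding call."""
    return json.dumps({"inputText": text, "dimensions": dimensions})

def parse_titan_response(raw_body):
    """Extract the embedding vector from an invoke_model response body."""
    return json.loads(raw_body)["embedding"]

# The actual call requires credentials and model access in your account:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# resp = client.invoke_model(
#     modelId="amazon.titan-embed-text-v2:0",
#     body=build_titan_request("text extracted from a PDF form"),
# )
# vector = parse_titan_response(resp["body"].read())
```

Since you pay per input token, batching your PDF text into reasonably sized chunks before embedding keeps both cost and latency predictable.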