I've found the 64k context window quite limiting, and I'm wary of third-party providers quietly serving quantized weights. I'm looking for recommendations on the best provider for the R1 model that supports 160k context, ideally without any quantization. Any experiences or suggestions?
5 Answers
Honestly, I’d recommend using the direct API when possible. It can save you a lot of potential headaches compared to relying on providers.
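If it helps, the direct API is OpenAI-compatible, so a call looks something like the sketch below. This is just a minimal illustration of building the request body (the base URL and `deepseek-reasoner` model name are from DeepSeek's docs at the time of writing; double-check them before relying on this):

```python
import json

# OpenAI-compatible endpoint and model name per DeepSeek's docs
# (verify against the current documentation -- these may change).
BASE_URL = "https://api.deepseek.com"
MODEL = "deepseek-reasoner"  # the R1 reasoning model on the official API

def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

if __name__ == "__main__":
    body = build_chat_request("Summarize the attached document.")
    print(json.dumps(body, indent=2))
    # To actually send it, POST this body to {BASE_URL}/chat/completions
    # with an "Authorization: Bearer <your API key>" header, e.g. via
    # the `requests` library or the official openai client pointed at
    # base_url=BASE_URL.
```

Just keep in mind the context-length caveat others mention below: the official endpoint may cap you well short of 160k.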
I’m currently using the Nebius Playground, which has a version of DeepSeek R1-0528, and it’s super affordable. You might want to give it a look!
Wouldn't the official DeepSeek API be the best route for you? I suppose the advantage of a third-party provider like NanoGPT would be the lack of quantization, but I'm curious what other benefits they offer.
But doesn’t the DeepSeek API only go up to 64K context? That’s what the OP is trying to avoid.
Fireworks AI is another provider that could be worth considering!
Hey! I happen to work with NanoGPT, which offers the full DeepSeek R1 0528 model without any quantization and actually supports 164k context! If you're interested, I can send you an invite to try it out. It really might be worth checking out!
Count me in! I’m looking to test it on my new setup.
I’d love an invite too! Sounds amazing!
Sounds great, thanks! I’ll definitely check it out.