What’s the best GPU cluster/configuration $30k can buy?

Edit: The consensus I’m getting is that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal, and with which GPU configuration?

Hello,

I’m trying to figure out a realistic on-prem setup for a small team (approx. 20–30 developers) to use a local coding/agent model (I’m thinking of something like Kimi K2.5 or GLM 5.1).

I guess my constraints are:

  • everything has to stay on-prem
  • VRAM is important, but bandwidth and low latency are essential
  • decent UX is important (not expecting instant responses, obviously, but I also don’t want it to feel laggy or constantly queued)
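
To make the VRAM vs. bandwidth trade-off concrete, here is a rough back-of-envelope sketch. It assumes decode is memory-bandwidth bound and ignores KV cache, activations, and interconnect overhead, so it only gives an optimistic ceiling. All numbers (parameter counts, quantization, bandwidth) are placeholders rather than specs for any particular model or GPU; the real values should come from the model card and the hardware datasheet.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return params_billion * bits_per_weight / 8


def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         active_params_billion: float,
                         bits_per_weight: float) -> float:
    """Optimistic single-stream tokens/s if every token reads all active weights once."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    # Hypothetical MoE model: 1000B total params, 32B active per token, 4-bit weights.
    print("weights (GB):", weight_footprint_gb(1000, 4))
    # Hypothetical device with 1000 GB/s of memory bandwidth.
    print("decode ceiling (tok/s):", decode_ceiling_tok_s(1000, 32, 4))
```

The first number says whether the model fits at all; the second is an upper bound that real serving stacks will land well below.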

My initial pick was a cluster of 4 DGX Sparks connected with a switch, but I read a few articles about heat and latency issues which steered me away from it. A cluster of Mac Studios was my second option, but given how difficult it is to get your hands on a couple of 512GB Macs nowadays, I don't think it's a viable option either. Plus, that setup isn't tailored for batch processing (vllm-mlx is still rudimentary in that regard).

I rambled a lot, but I guess my question is: what’s the best hardware + model + serving setup that $30k can buy that actually feels “comfortable” for 20–30 devs using it in parallel?
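
One way to pin down “comfortable” is as an aggregate throughput target. The sketch below is a minimal estimate under stated assumptions: the fraction of devs with a request in flight at once and the per-stream tokens/s that feels responsive are both guesses to be replaced with measured values, and the function name is purely illustrative.

```python
import math


def required_throughput(num_devs: int,
                        active_fraction: float,
                        tok_s_per_stream: float) -> tuple[int, float]:
    """Return (concurrent streams, aggregate tokens/s) the server must sustain."""
    # Round up so a fractional stream still counts as a full concurrent request.
    concurrent = max(1, math.ceil(num_devs * active_fraction))
    return concurrent, concurrent * tok_s_per_stream


if __name__ == "__main__":
    # Assumed: 30 devs, half of them generating at any moment, 20 tok/s each.
    streams, total = required_throughput(30, 0.5, 20)
    print(f"{streams} concurrent streams, {total} tok/s aggregate")
```

Agentic coding workflows tend to send long prompts, so prefill time and KV cache size per stream would need their own estimate on top of this.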

If anyone is running something similar:

  • what did you end up with?
  • what bottleneck surprised you?
  • anything you’d do differently?

Appreciate any feedback... I'm trying to avoid building something that looks good on paper but feels sluggish in real use.

Cheers.

submitted by /u/TomatilloFine682