What’s the best GPU cluster/configuration $30k can buy?

Edit: The consensus I’m getting is that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal, and with which GPU configuration?

Hello,

I’m trying to figure out a realistic on-prem setup for a small team (approx. 20–30 developers) to use a local coding/agent model (thinking of something like Kimi K2.5 or GLM 5.1).

I guess my constraints are:

everything has to stay on-prem
VRAM is important, but bandwidth and low latency are essential
decent UX is important (not expecting instant responses obvy, but I also don’t want it to feel laggy or constantly queued)
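
To make the VRAM vs. bandwidth trade-off concrete, here is the rough back-of-envelope math I have in mind. It's only a sketch: the function names are made up for illustration, and every number in the example is a placeholder, not a spec of any real model or device. It assumes decoding is memory-bandwidth bound (each generated token reads the active weights once) and ignores KV cache, interconnect overhead and batching gains, so treat the results as rough bounds rather than predictions.

```python
def weights_gb(total_params_b: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB.

    total_params_b is in billions of parameters, so
    (1e9 params * bytes_per_param) / 1e9 bytes-per-GB cancels out.
    """
    return total_params_b * bytes_per_param


def decode_tokens_per_s(bandwidth_gb_s: float,
                        active_params_b: float,
                        bytes_per_param: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumption: each token requires reading all *active* weights once,
    so speed <= memory bandwidth / bytes read per token.
    """
    return bandwidth_gb_s / (active_params_b * bytes_per_param)


def per_user_worst_case(tokens_per_s: float, concurrent_users: int) -> float:
    """Pessimistic per-user speed if requests got no benefit from batching.

    Real servers batch requests, so actual per-user speed is usually
    better than this; it is only a floor for sanity checking.
    """
    return tokens_per_s / concurrent_users


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    total_b, active_b = 1000.0, 32.0   # MoE model: total vs. active params (billions)
    bytes_pp = 0.5                     # ~4-bit quantization
    bandwidth = 1000.0                 # GB/s of memory bandwidth
    users = 25

    print("weights (GB):", weights_gb(total_b, bytes_pp))
    single = decode_tokens_per_s(bandwidth, active_b, bytes_pp)
    print("single-stream bound (tok/s):", single)
    print("worst-case per user (tok/s):", per_user_worst_case(single, users))
```

Plugging in real numbers for a candidate box (memory capacity, memory bandwidth) and a candidate model (total and active parameters, quantization) should at least show whether a setup is in the right ballpark before anyone spends money.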

My initial pick was a cluster of 4 DGX Sparks connected through a switch, but I read a few articles about heat and latency issues which steered me away from it. A cluster of Mac Studios was my second option, but given how difficult it is to get your hands on a couple of 512GB Macs nowadays, I don't think it's a viable option either. On top of that, it's not tailored for batch processing (vllm-mlx is still rudimentary in that regard).

I rambled a lot, but I guess my question is: what’s the best hardware + model + serving setup that $30k can buy that actually feels “comfortable” for 20–30 devs using it in parallel?

If anyone is running something similar:

what did you end up with?
what bottleneck surprised you?
anything you’d do differently?

Appreciate any feedback... I'm trying to avoid building something that looks good on paper but feels sluggish in real use.

Cheers.

submitted by /u/TomatilloFine682