Has anyone here actually used local LLMs for decision-making inside real workflows?

I’ve been spending some time experimenting with local models recently, mostly trying to move beyond the usual chat or coding assistant use cases. What I’m really interested in is whether they can reliably sit inside a workflow and make decisions, not just generate text.

For example, taking something like incoming messages or form inputs and having the model decide what should happen next. In theory it sounds straightforward, but in practice it’s been a bit unpredictable. Even when the prompts are tightly structured, the outputs don’t always stay consistent enough to trust across multiple steps.
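For what it's worth, the pattern that has gotten me the furthest is forcing the model into a tiny, closed decision vocabulary and validating hard before anything downstream runs. A minimal sketch of what I mean (the action names and the `parse_decision` helper are made up for illustration; the actual model call is out of scope here):

```python
import json

# Closed set of actions the workflow is allowed to trigger.
ALLOWED_ACTIONS = {"escalate", "auto_reply", "archive", "ask_human"}

def parse_decision(raw: str):
    """Validate the model's raw output against a strict decision schema.

    Returns the parsed decision dict, or None if anything is off --
    in which case the caller should retry or fall back to rules.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    action = data.get("action")
    confidence = data.get("confidence")
    if action not in ALLOWED_ACTIONS:
        return None
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        return None
    return {"action": action, "confidence": float(confidence)}

# Typical model output that passes validation:
good = parse_decision('{"action": "escalate", "confidence": 0.92}')
print(good)  # {'action': 'escalate', 'confidence': 0.92}

# Anything malformed or out-of-vocabulary is rejected outright:
print(parse_decision("Sure! I would escalate this one."))  # None
print(parse_decision('{"action": "delete_everything", "confidence": 0.9}'))  # None
```

The point is that nothing the model says is trusted directly: either the output parses into the closed vocabulary, or it's treated as a failure.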

I’ve been running smaller quantized models locally to keep things fast, and they’re surprisingly capable. But reliability starts to break down once you depend on them for anything that needs repeatable structure. It almost feels less like a model limitation and more like a pipeline problem, though I’m not completely sure yet.
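On the pipeline point: what helped me most was treating the model as an unreliable component and wrapping it, i.e. retry a couple of times, then fall back to a deterministic default rather than ever acting on a bad output. A rough sketch of that wrapper (both `call_model` and `validate` are placeholders for whatever local inference call and schema check you're actually using):

```python
from typing import Callable, Optional

def decide_with_fallback(
    prompt: str,
    call_model: Callable[[str], str],
    validate: Callable[[str], Optional[dict]],
    retries: int = 2,
    default: str = "ask_human",
) -> dict:
    """Ask the model for a decision; if it can't produce a valid one
    within `retries` extra attempts, return a deterministic fallback."""
    for _ in range(retries + 1):
        decision = validate(call_model(prompt))
        if decision is not None:
            return decision
    # Deterministic fallback: a flaky output never triggers a real action.
    return {"action": default, "confidence": 0.0, "fallback": True}

# Example with a stubbed model that fails once, then succeeds:
outputs = iter(["garbage", '{"ok": true}'])

def fake_model(_prompt: str) -> str:
    return next(outputs)

def fake_validate(raw: str):
    return {"action": "auto_reply"} if raw == '{"ok": true}' else None

result = decide_with_fallback("route this message", fake_model, fake_validate)
print(result)  # {'action': 'auto_reply'}
```

With this shape, the "decision-making" part of the workflow stays deterministic at the edges even when the model in the middle isn't.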

What I can’t figure out is whether people are actually pushing local models this far in real setups, or if most are still keeping them at the assistive level. I’m especially curious how others are dealing with consistency when the output actually matters, not just for readability but for triggering actions.

Would be really interesting to hear if anyone here has managed to make this work in a stable way, or if you ended up falling back to hybrid setups or more traditional logic.

submitted by /u/Comfortable-Week7646