Hey everyone,
I’m pretty new to running LLMs locally and I’m trying to figure out what works best for my setup. I’d love to hear from people who are already using local models for similar stuff.
My specs:
· RTX 5070 (12GB VRAM)
· 32GB DDR5 RAM
· Ryzen 5 7500F
· 1TB NVMe SSD
I mostly do cybersecurity work, both red and blue team stuff. That means a lot of code analysis (Python, C, JS, some assembly), reverse engineering help, writing small proof-of-concept scripts, summarizing threat reports, and occasionally brainstorming attack paths or defense strategies. So the model needs to be comfortable with infosec topics and not refuse every second prompt just because it mentions an exploit or malware.
I’ve read about uncensored and abliterated models, but I’m honestly not sure if they’re necessary for this kind of work. Are they actually better, or can a well-prompted "normal" model handle it just fine? I don’t want it to be completely unhinged, but I also can’t have it refusing to discuss legitimate security research. What’s your real-world experience?
Also trying to figure out what size model makes sense for my VRAM. Should I stick to 7B-14B models to keep things fast, or is it worth trying something like a 32B with partial offloading to system RAM? What quants (Q4_K_M, Q5_K_M, etc.) do you guys run on similar hardware?
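For reference, this is the rough math I've been using to guess what fits in 12GB. It's only a sketch: the bits-per-weight figures are approximate values I'm assuming for these quant types (real GGUF files vary by model), the 2 GiB I reserve for KV cache and runtime overhead is a guess, and the function names are just mine.

```python
# Back-of-envelope estimate of quantized weight size vs. available VRAM.
# ASSUMPTION: approximate bits per weight for each quant type; actual
# file sizes differ from model to model.
APPROX_BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.7, "Q8_0": 8.5}


def weights_gib(params_billion: float, quant: str) -> float:
    """Approximate size of the quantized weights in GiB."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    total_bytes = params_billion * 1e9 * bits / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, quant: str,
                 vram_gib: float = 12.0, reserve_gib: float = 2.0) -> bool:
    """True if the weights alone fit after reserving some headroom.

    ASSUMPTION: reserve_gib is a guessed allowance for KV cache,
    activations, and whatever else is using the GPU.
    """
    return weights_gib(params_billion, quant) <= vram_gib - reserve_gib


for size in (7, 14, 32):
    for quant in ("Q4_K_M", "Q5_K_M"):
        verdict = "fits" if fits_in_vram(size, quant) else "needs offloading"
        print(f"{size}B {quant}: ~{weights_gib(size, quant):.1f} GiB, {verdict}")
```

By this estimate a 14B at Q4_K_M is around 8 GiB and fits, while a 32B at Q4_K_M is around 18 GiB and would have a big chunk sitting in system RAM. Happy to be corrected if these assumptions are off.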
For tools, I’ve played a bit with Ollama and LM Studio. Any reason to pick one over the other for infosec? I sometimes need to paste large logs or entire decompiled functions, so context length matters. Is 32k enough, or do I really want a model with a 128k+ context window, like Qwen2.5?
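Part of why I'm asking about context: as far as I understand it, the KV cache grows linearly with context length and eats VRAM on top of the weights. Here's a minimal sketch of that estimate. The layer count, KV head count, and head dimension are placeholder values in the shape of a 7B-class model with grouped-query attention, not numbers I've verified against any specific model's config, and I'm assuming an unquantized fp16 cache.

```python
# Rough KV cache size: 2 (keys and values) * layers * kv_heads * head_dim
# * tokens * bytes per element.
# ASSUMPTION: the default architecture numbers below are illustrative
# placeholders; check the real model's config before trusting the result.


def kv_cache_gib(context_tokens: int,
                 n_layers: int = 28,
                 n_kv_heads: int = 4,
                 head_dim: int = 128,
                 bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in GiB for a given context length."""
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token_bytes * context_tokens / 2**30


for ctx in (32 * 1024, 128 * 1024):
    print(f"{ctx} tokens: ~{kv_cache_gib(ctx):.2f} GiB of KV cache")
# 32768 tokens: ~1.75 GiB of KV cache
# 131072 tokens: ~7.00 GiB of KV cache
```

If those placeholder numbers are anywhere near realistic, a full 128k context would take most of a 12GB card by itself, which is why I'm wondering whether 32k is the practical ceiling on my hardware.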
Lastly, are there people here with similar specs (especially the 12GB 5070) running LLMs for security work? I’d like to hear what you’re using day to day and how the performance feels.
Cheers, and thanks for any pointers. I’ll test things out and report back.