/u/InternalMode8159 - Provide.ai

The options I see online seem to make the model slower

/u/InternalMode8159 / May 17, 2026

These are the options I'm currently using. Setting parallel to 1, increasing draft, or adding draft-min-P at 0.75 doesn't seem to improve anything. I have a 5090 and I'm running inside Docker. Right now it runs at 100 tok/s, and when I modify these options it falls…
