I’ve been testing something recently and it’s starting to mess with how I think about “correct answers.”
Same prompt.
Same model.
Same temperature and settings.
But the outputs don’t just vary a little.
Sometimes they take completely different reasoning paths, entirely different routes to the answer.
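If you want to reproduce it, here’s a minimal sketch (assuming the OpenAI Python client with an API key in the environment; the model name, prompt, and temperature are placeholders, not my exact setup):

```python
# Sample the same prompt N times under identical settings and compare outputs.
# Assumes `pip install openai` and OPENAI_API_KEY set; model/prompt are placeholders.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost? Think step by step, then write 'Final answer: ...'."
)

outputs = []
for _ in range(5):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder model
        temperature=0.7,       # identical settings on every run
        messages=[{"role": "user", "content": PROMPT}],
    )
    outputs.append(resp.choices[0].message.content)

# Eyeball the divergence: same prompt, same settings, different reasoning paths.
for i, out in enumerate(outputs):
    print(f"--- run {i + 1} ---\n{out}\n")
```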
And here’s the strange part:
Sometimes the final answer is still exactly the same.
If different runs can take completely different paths,
but still land on the same answer,
what exactly are we calling “correct”?
Is it:
- actual understanding?
- just one of many possible paths landing in the same place?
- or something closer to luck than we’d like to admit?
If the path changes every time, even under the same setup:
- can we really call it reliable?
- does a single “accuracy” number still mean much?
- or are we just seeing different routes occasionally converge? (a rough way to measure that below)
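Interestingly, the self-consistency literature (Wang et al. 2022, arXiv:2203.11171) treats that convergence as signal rather than luck: sample several reasoning paths and majority-vote the final answers. A crude sketch of measuring agreement across runs (the “Final answer:” extraction regex is an assumption tied to the prompt in the sketch above):

```python
# Majority-vote the final answers across sampled outputs (self-consistency style).
# The extraction regex assumes the prompt asks for a "Final answer: ..." line.
import re
from collections import Counter

def extract_answer(text: str) -> str | None:
    m = re.search(r"final answer:\s*(.+)", text, re.IGNORECASE)
    return m.group(1).strip() if m else None

def answer_agreement(outputs: list[str]) -> tuple[str | None, float]:
    answers = [a for a in (extract_answer(o) for o in outputs) if a is not None]
    if not answers:
        return None, 0.0
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / len(answers)

# With the `outputs` list from the earlier sketch:
# winner, rate = answer_agreement(outputs)
# print(f"majority answer: {winner!r}, agreement: {rate:.0%}")
```

High agreement across genuinely different paths is at least a different kind of evidence than one run happening to be “right”.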
Curious if others have noticed this, and how you think about it.