
Why is GPT-4 giving different answers with same prompt & temperature=0?

Tags:

gpt-4

This is my code for calling the GPT-4 model:

messages = [
    {"role": "system", "content": system_msg},
    {"role": "user", "content": req},
]

response = openai.ChatCompletion.create(
    engine="******-gpt-4-32k",
    messages=messages,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)

answer = response["choices"][0]["message"]["content"]

Keeping system_msg and req constant, with temperature=0, I still get different answers. For instance, the last time I ran this 10 times, I got 3 distinct answers. They are similar in concept but differ in wording.

Why is this happening?

Kavya Bhandari asked Sep 02 '25 13:09

2 Answers

This blog post by Sherman Chann argues that:

Non-determinism in GPT-4 is caused by Sparse MoE [mixture of experts].

Note that it’s now possible to set a seed parameter. From platform.openai.com/docs (mirror):

Reproducible outputs

Beta

Chat Completions are non-deterministic by default (which means model outputs may differ from request to request). That being said, we offer some control towards deterministic outputs by giving you access to the seed parameter and the system_fingerprint response field.

To receive (mostly) deterministic outputs across API calls, you can:

  • Set the seed parameter to any integer of your choice and use the same value across requests you'd like deterministic outputs for.

  • Ensure all other parameters (like prompt or temperature) are the exact same across requests.

Sometimes, determinism may be impacted due to necessary changes OpenAI makes to model configurations on our end. To help you keep track of these changes, we expose the system_fingerprint field. If this value is different, you may see different outputs due to changes we've made on our systems.
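Putting that together with the question's code, a minimal sketch of the same request with a fixed seed might look like this (assuming the pre-1.0 `openai` package the question uses; the API call itself is left commented out since it needs a live key and deployment):

```python
# Same parameters as in the question, plus a fixed seed.
request_kwargs = dict(
    engine="******-gpt-4-32k",  # deployment name from the question
    temperature=0,
    top_p=1,
    seed=42,  # any integer; reuse the same value across requests
)

# response = openai.ChatCompletion.create(messages=messages, **request_kwargs)
# fingerprint = response.get("system_fingerprint")
# If `fingerprint` differs between two calls, OpenAI changed the backend
# configuration, and outputs may differ even with the same seed.
```

Note that the docs promise only *mostly* deterministic outputs: the seed constrains sampling, but backend changes (surfaced via `system_fingerprint`) can still alter results.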

Franck Dernoncourt answered Sep 05 '25 15:09


Found a possible explanation here: https://community.openai.com/t/observing-discrepancy-in-completions-with-temperature-0/73380

TL;DR: discrepancies can occur due to floating-point operations when two tokens have very close probabilities. Even a single-token difference affects the whole subsequent chain and leads to divergent generations.
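The effect can be sketched with a toy greedy-decoding step (a hypothetical illustration, not GPT-4's actual internals): when two tokens have nearly identical logits, a tiny floating-point perturbation, e.g. from a different reduction order on the GPU, can flip which one is on top, and every later token is then conditioned on a different prefix.

```python
# Two runs produce almost identical logits for the same position,
# but a ~1e-6 floating-point wobble swaps the top candidate.
logits_run1 = {"cat": 10.000001, "dog": 10.000000}
logits_run2 = {"cat": 10.000000, "dog": 10.000001}

# temperature=0 means greedy decoding: always take the argmax.
greedy1 = max(logits_run1, key=logits_run1.get)  # "cat"
greedy2 = max(logits_run2, key=logits_run2.get)  # "dog"
```

Once the two runs emit different tokens here, the generations diverge from this point onward, which matches seeing answers that are similar in concept but worded differently.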

Kavya Bhandari answered Sep 05 '25 17:09


