Let's Talk AI Artificial Intelligence,Fine Tuning Temperature in LLM API Calls: Balancing Creativity and Predictability

Temperature in LLM API Calls: Balancing Creativity and Predictability

The temperature parameter in API calls to Large Language Models (LLMs), such as OpenAI’s GPT, plays a crucial role in determining how the generated text varies in creativity and predictability.

Antropic defines it as “Temperature is a parameter that controls the randomness of a model’s predictions during text generation. Higher temperatures lead to more creative and diverse outputs, allowing for multiple variations in phrasing and, in the case of fiction, variation in answers as well. Lower temperatures result in more conservative and deterministic outputs that stick to the most probable phrasing and answers. Adjusting the temperature enables users to encourage a language model to explore rare, uncommon, or surprising word choices and sequences, rather than only selecting the most likely predictions.”

** This article is written for OpenAI, but contains annotations for Anthropic models.

1. What is Temperature?

Temperature is a numerical value typcially between 0 and 2 that determines how random or deterministic the model’s output is. It regulates the probability distribution with which the model selects words when generating a response

2. How Does Temperature Work?

  • Each time an LLM generates a word, it calculates a probability distribution of possible next words.
  • The temperature setting influences how these probabilities are processed:
    • Low temperature (<1.0): The model selects more probable words, resulting in predictable and coherent responses.
    • High temperature (>1.0): The model selects less probable words, making the text more creative and varied.
    • 0.0: The model always chooses the most likely word, leading to deterministic and formal responses.
    • 2.0: Words are chosen almost randomly, which can result in illogical or incoherent responses.
Anthropic: “The temperature parameter is used to manipulate this probability distribution before sampling the next token. If the temperature is low (close to 0.0), the probability distribution becomes more peaked, with high probabilities assigned to the most likely tokens. This makes the model more deterministic and focused on the most probable or “safe” choices. If the temperature is high, the probability distribution becomes more flattened, with the probabilities of less likely tokens increasing. This makes the model more random and exploratory, allowing for more diverse and creative outputs.”
Temperature

3. Effects of Temperature

OpenAI
Temperature EffectUsage Scenarios
0.0 – 0.3Highly predictable and formalScientific articles, legal documents
0.4 – 0.7Balance between accuracy and creativityBlog posts, emails, product descriptions
0.4 – 0.7Balance between accuracy and creativityBlog posts, emails, product descriptions
1Default Value OpenAI API 
1.3 – 2.0Highly experimental and unpredictableAbstract art descriptions, random inspiration

 

Important note: for ChatGPT  is the default temperature setting not explicitly documented. However, some users have observed that the web interface may have a default temperature of 0.7, but this isn’t officially

Anthropic – more details
  • Ranges from 0.0 to 1.0.
  • Defaults to 1.0.
  • Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.

Note that even with temperature of 0.0, the results will not be fully deterministic.

Important note: no documentation found on the default value for the temperature for Claude.ai

4. Practical Example

Let’s say you use the following prompt: “Write a short summary of the book ‘1984’ by George Orwell.”

Temperature = 0.2 (Low)

  • “The book ‘1984’ by George Orwell describes a dystopian society where the government exercises total control over individuals. The protagonist, Winston Smith, tries to rebel against the regime of Big Brother.”
    • Factually accurate
    • Concise and to the point
    • Limited creativity

Temperature = 1.2 (High)

  • “In a bleak, technologically oppressed world, Winston Smith struggles with love, betrayal, and the illusion of freedom. Shadows whisper state propaganda, and thoughts are as dangerous as actions. Can he escape the all-seeing eyes of Big Brother?”
    • More narrative and descriptive
    • Surprising word choices
    • More creative but potentially less precise

5. When Should You Use Different Temperature Settings?

  • For strict and factual answers → Use a low temperature (0.0 – 0.4).
  • For a mix of fact and creativity → Use a medium temperature (0.5 – 0.8).
  • For brainstorming sessions or creative experiments → Use a high temperature (0.9 – 2.0).

Receive Latest Updates!

We don’t spam! Read our privacy policy for more info.

2 thoughts on “Temperature in LLM API Calls: Balancing Creativity and Predictability”

    1. hi James,
      OpenAI has a temperature range of 0 – 2 vs Anthropic only 0 – 1range. 1.2 as a temperature is setting the LLM to be more creative in its answers: it will stick less to the details of the training data (less replication) and will favor a bi t more less probable tokens as the next tokens when it constructs the text, so a more “creative” solution.
      When setting the Anthropic temperature value to > 1 e.g. 1.2, the API resets it to the maximum range of 1 and givs you this warning : “Warning: Temperature should be between 0 and 1. Setting to 1.0.”
      thanks
      Jan

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Post