Temperature in LLM API Calls: Balancing Creativity and Predictability

The temperature parameter in API calls to Large Language Models (LLMs), such as OpenAI’s GPT, plays a crucial role in determining how the generated text varies in creativity and predictability.

Antropic defines it as “Temperature is a parameter that controls the randomness of a model’s predictions during text generation. Higher temperatures lead to more creative and diverse outputs, allowing for multiple variations in phrasing and, in the case of fiction, variation in answers as well. Lower temperatures result in more conservative and deterministic outputs that stick to the most probable phrasing and answers. Adjusting the temperature enables users to encourage a language model to explore rare, uncommon, or surprising word choices and sequences, rather than only selecting the most likely predictions.”

** This article is written for OpenAI, but contains annotations for Anthropic models.

1. What is Temperature?

Temperature is a numerical value typcially between 0 and 2 that determines how random or deterministic the model’s output is. It regulates the probability distribution with which the model selects words when generating a response

2. How Does Temperature Work?

Each time an LLM generates a word, it calculates a probability distribution of possible next words.
The temperature setting influences how these probabilities are processed:
- Low temperature (<1.0): The model selects more probable words, resulting in predictable and coherent responses.
- High temperature (>1.0): The model selects less probable words, making the text more creative and varied.
- 0.0: The model always chooses the most likely word, leading to deterministic and formal responses.
- 2.0: Words are chosen almost randomly, which can result in illogical or incoherent responses.

Anthropic: “The temperature parameter is used to manipulate this probability distribution before sampling the next token. If the temperature is low (close to 0.0), the probability distribution becomes more peaked, with high probabilities assigned to the most likely tokens. This makes the model more deterministic and focused on the most probable or “safe” choices. If the temperature is high, the probability distribution becomes more flattened, with the probabilities of less likely tokens increasing. This makes the model more random and exploratory, allowing for more diverse and creative outputs.”

3. Effects of Temperature

OpenAI

Temperature	Effect	Usage Scenarios
0.0 – 0.3	Highly predictable and formal	Scientific articles, legal documents
0.4 – 0.7	Balance between accuracy and creativity	Blog posts, emails, product descriptions
0.4 – 0.7	Balance between accuracy and creativity	Blog posts, emails, product descriptions
1	Default Value OpenAI API
1.3 – 2.0	Highly experimental and unpredictable	Abstract art descriptions, random inspiration

Important note: for ChatGPT is the default temperature setting not explicitly documented. However, some users have observed that the web interface may have a default temperature of 0.7, but this isn’t officially

Anthropic – more details

Ranges from 0.0 to 1.0.
Defaults to 1.0.
Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.

Note that even with temperature of 0.0, the results will not be fully deterministic.

Important note: no documentation found on the default value for the temperature for Claude.ai

4. Practical Example

Let’s say you use the following prompt: “Write a short summary of the book ‘1984’ by George Orwell.”

Temperature = 0.2 (Low)

“The book ‘1984’ by George Orwell describes a dystopian society where the government exercises total control over individuals. The protagonist, Winston Smith, tries to rebel against the regime of Big Brother.”
- Factually accurate
- Concise and to the point
- Limited creativity

Temperature = 1.2 (High)

“In a bleak, technologically oppressed world, Winston Smith struggles with love, betrayal, and the illusion of freedom. Shadows whisper state propaganda, and thoughts are as dangerous as actions. Can he escape the all-seeing eyes of Big Brother?”
- More narrative and descriptive
- Surprising word choices
- More creative but potentially less precise

5. When Should You Use Different Temperature Settings?

For strict and factual answers → Use a low temperature (0.0 – 0.4).
For a mix of fact and creativity → Use a medium temperature (0.5 – 0.8).
For brainstorming sessions or creative experiments → Use a high temperature (0.9 – 2.0).

2 thoughts on “Temperature in LLM API Calls: Balancing Creativity and Predictability”

James Carstairs says:

March 8, 2025 at 10:46 am

You have a high example with temperature at 1.2 – what is the impact of that being over the suggested range of 0-1 ?

1. jan.syssauw@syssauw.com says:
  
  March 8, 2025 at 11:18 am
  
  hi James,
  OpenAI has a temperature range of 0 – 2 vs Anthropic only 0 – 1range. 1.2 as a temperature is setting the LLM to be more creative in its answers: it will stick less to the details of the training data (less replication) and will favor a bi t more less probable tokens as the next tokens when it constructs the text, so a more “creative” solution.
  When setting the Anthropic temperature value to > 1 e.g. 1.2, the API resets it to the maximum range of 1 and givs you this warning : “Warning: Temperature should be between 0 and 1. Setting to 1.0.”
  thanks
  Jan

Temperature in LLM API Calls: Balancing Creativity and Predictability

1. What is Temperature?

2. How Does Temperature Work?

3. Effects of Temperature

OpenAI

Anthropic – more details

4. Practical Example

Temperature = 0.2 (Low)

Temperature = 1.2 (High)

5. When Should You Use Different Temperature Settings?

Like this:

Related

2 thoughts on “Temperature in LLM API Calls: Balancing Creativity and Predictability”

Leave a Reply Cancel reply

Temperature in LLM API Calls: Balancing Creativity and Predictability

1. What is Temperature?

2. How Does Temperature Work?

3. Effects of Temperature

OpenAI

Anthropic – more details

4. Practical Example

Temperature = 0.2 (Low)

Temperature = 1.2 (High)

5. When Should You Use Different Temperature Settings?

Share this:

Like this:

Related

2 thoughts on “Temperature in LLM API Calls: Balancing Creativity and Predictability”

Leave a Reply Cancel reply

Related Post

What Are AI Hallucinations, Why Do They Happen, and How to Minimize Them?What Are AI Hallucinations, Why Do They Happen, and How to Minimize Them?

OpenAI Introduces O3 and O4 Mini: Better AI Reasoning and Tool UseOpenAI Introduces O3 and O4 Mini: Better AI Reasoning and Tool Use

15 GPT-4.1 Pro Prompt Tips (API Only)15 GPT-4.1 Pro Prompt Tips (API Only)