Advanced Settings
Temperature, frequency penalties, and other generation controls.
Last updated March 2026
Overview
The Advanced Settings panel gives you fine-grained control over how AI models generate text. These settings affect the randomness, diversity, repetition, and length of the output.
Most writers never need to touch these — the defaults work well for the vast majority of use cases. But if you want to push a model toward more creative (or more predictable) output, this is where you do it.

The Advanced Settings panel — fine-tune how the AI generates text
Temperature
Temperature controls how “random” the AI's word choices are. It's the single most impactful setting for shaping the feel of generated prose.
| Value | Behavior | Good For |
|---|---|---|
| 0.0 | Deterministic — always picks the most likely next word | Technical writing, outlines, consistency checks |
| 0.5–0.7 | Balanced — mostly predictable with occasional surprises | Most fiction writing, dialogue, descriptions |
| 0.8–1.0 | Creative — more varied word choices, unexpected phrasings | Poetry, experimental prose, brainstorming |
| 1.2–2.0 | Very random — unpredictable, sometimes incoherent | Wild brainstorming, surrealist writing (use with caution) |
The default temperature varies by model and writing style. Most models default to around 0.7–0.8, which is a good balance between creativity and coherence. Writing styles can override this — for example, a “literary” style might set temperature higher to encourage more varied vocabulary.
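Under the hood, temperature works by rescaling the model's raw scores (logits) before they are turned into probabilities: low values sharpen the distribution toward the most likely word, high values flatten it. A minimal sketch of that math (illustrative code, not the app's actual implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, scaled by temperature."""
    if temperature == 0:
        # Temperature 0 degenerates to greedy decoding:
        # all probability mass on the single most likely word.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Dividing logits by temperature sharpens (T < 1) or flattens (T > 1)
    # the resulting distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax(logits, temperature=0.5))  # sharper: mass concentrates on word 0
print(softmax(logits, temperature=1.5))  # flatter: choices more evenly spread
```

Running this shows why low temperature feels predictable: at 0.5 the top word dominates, while at 1.5 the runners-up get a real chance of being picked.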
Top P (Nucleus Sampling)
Top P (also called nucleus sampling) is another way to control diversity. Where temperature controls how random each choice is, Top P controls how many candidate words are in the running at each step.
A Top P of 0.9 (the default) means the AI samples from the smallest set of candidate words whose combined probability adds up to 90%, discarding the unlikely tail. Lowering this narrows the pool: a Top P of 0.5 keeps only the candidates that together account for the top 50% of probability mass, producing more focused output.
| Value | Effect |
|---|---|
| 0.5–0.7 | More focused, less diverse word choices |
| 0.9 (default) | Good balance — ignores only very unlikely words |
| 1.0 | Considers all possible words (maximum diversity) |
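The filtering step can be sketched as follows. Note that the cutoff is by cumulative probability, not by a fixed count of words, so the size of the surviving pool varies from step to step (illustrative code, not any model's actual implementation):

```python
def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of words whose cumulative probability
    reaches top_p, then renormalize the survivors."""
    # Walk candidates from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break  # nucleus is complete; drop the remaining tail
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus_filter(probs, top_p=0.9))  # drops the 0.05 tail word
```

With these probabilities, the first three words already cover 95% of the mass, so the fourth is excluded; at Top P = 0.5 only the first word would survive.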
Frequency Penalty
Frequency Penalty reduces repetition by penalizing words that have already appeared in the output. The more often a word appears, the less likely the model is to use it again.
| Value | Effect |
|---|---|
| 0.0 (default) | No penalty — words can repeat naturally |
| 0.2–0.5 | Light penalty — reduces obvious repetition |
| 0.6–1.0 | Moderate penalty — noticeably less repetitive, more varied vocabulary |
| 1.5–2.0 | Strong penalty — can make prose feel forced or unnatural |
A light frequency penalty (0.2–0.5) is useful if you notice the AI repeating the same phrases or descriptions. Be careful with values above 1.0 — heavy penalties can make the model avoid common words it actually needs to use, leading to awkward prose.
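In OpenAI-style APIs, the frequency penalty works by subtracting penalty × count from the score of each token already in the output, so a word that has appeared twice is penalized twice as hard as one that appeared once. A rough sketch of that adjustment (function and variable names are illustrative):

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty=0.5):
    """Subtract penalty * (times already generated) from each token's logit."""
    counts = Counter(generated_tokens)
    return [l - penalty * counts.get(i, 0) for i, l in enumerate(logits)]

logits = [2.0, 2.0, 2.0]
history = [0, 0, 1]  # token 0 appeared twice, token 1 once, token 2 never
print(apply_frequency_penalty(logits, history, penalty=0.5))
# token 0 is penalized most: [1.0, 1.5, 2.0]
```

This also shows why very high penalties backfire: a common word like "the" racks up a large count quickly and gets pushed out of reach, which is exactly the "forced or unnatural" prose described above.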
Presence Penalty
Presence Penalty encourages the model to bring up new topics and ideas. Unlike Frequency Penalty (which penalizes based on how often a word appeared), Presence Penalty penalizes a word simply for having appeared at all.
| Value | Effect |
|---|---|
| 0.0 (default) | No penalty — model writes naturally |
| 0.2–0.5 | Light encouragement to explore new ground |
| 0.6–1.0 | Moderate push toward new topics and vocabulary |
This is most useful for brainstorming or when you want the AI to explore different aspects of a scene rather than circling back to the same descriptions. For most fiction writing, the default of 0.0 works fine.
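The contrast with the frequency penalty is easiest to see in code: the presence penalty is a flat, one-time subtraction once a token has appeared at all, regardless of how many times (again an illustrative sketch, not the app's implementation):

```python
def apply_presence_penalty(logits, generated_tokens, penalty=0.5):
    """Subtract a flat penalty from any token that has already appeared."""
    seen = set(generated_tokens)
    return [l - (penalty if i in seen else 0.0) for i, l in enumerate(logits)]

logits = [2.0, 2.0, 2.0]
history = [0, 0, 1]  # token 0 twice, token 1 once: both penalized equally
print(apply_presence_penalty(logits, history, penalty=0.5))  # [1.5, 1.5, 2.0]
```

Because every used token takes the same flat hit, unused tokens (and the new topics they represent) gain a relative advantage, which is why this setting nudges the model toward fresh ground rather than just away from repeats.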
Max Tokens
Max Tokens sets the maximum length of the AI's output. This is measured in provider tokens (roughly 4 characters per token, or about 0.75 words per token).
Each model has its own hard maximum:
| Max Output | Models |
|---|---|
| 16K tokens | Grok 4.1 Fast |
| 32K tokens | GPT-4.1 Nano, GPT-4.1, Grok 4 |
| 64K tokens | Gemini 2.5 Flash Lite, Gemini 3 Flash, Gemini 3.1 Flash Lite, Gemini 2.5 Pro, Gemini 3.1 Pro, Claude 4.5 Haiku, Claude 4.5 Sonnet, Claude 4.6 Sonnet |
| 128K tokens | GPT-5 Mini, GPT-5.2, GPT-5.4, Claude 4.6 Opus |
You can set a lower max to control output length. For example, if you want short, punchy paragraphs, set max tokens to 500–1,000. For full chapter-length generations, leave it at the model's default maximum.
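To pick a sensible cap, you can convert a target length into tokens with the heuristics above (~4 characters per token, ~0.75 words per token). A small helper as a sketch; real tokenizers vary by model and language, so treat these as ballpark figures, not exact counts:

```python
def estimate_tokens(text):
    """Return (chars-based, words-based) rough token estimates."""
    by_chars = len(text) / 4     # heuristic: ~4 characters per token
    by_words = len(text.split()) / 0.75  # heuristic: ~0.75 words per token
    return by_chars, by_words

sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sample))  # (11.0, 12.0)
```

So a 750-word scene lands around 1,000 tokens, comfortably inside even the smallest 16K cap; it's the chapter-length generations where the per-model maximums start to matter.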
Model Override
Writing styles can specify a preferred model — for example, a style designed for literary fiction might prefer Claude 4.5 Sonnet. The model override setting lets you lock a specific model for your session, overriding any style preferences.
This is useful when you want to use a particular style's prose instructions but pair them with a different model. For example, you might use a romantic style (designed for Claude) with Grok 4 for its Freedom rating.
To set a model override, use the model selector dropdown in the toolbar. Your selection takes priority over the style's preferred model until you change it or close the session.
Model Compatibility Notes
Not all models support all settings. Here's what to know:
| Setting | Supported Models | Notes |
|---|---|---|
| Temperature | All 16 models | Universally supported |
| Top P | All 16 models | Universally supported |
| Frequency Penalty | All except Gemini 3.x preview models | Silently ignored on Gemini 3 Flash, 3.1 Flash Lite, 3.1 Pro |
| Presence Penalty | All except Gemini 3.x preview models | Same limitation as Frequency Penalty |
| Max Tokens | All 16 models | Capped at each model's hard maximum |
Recommended Defaults
If you're not sure what to set, here are sensible defaults for common scenarios:
| Scenario | Temperature | Top P | Freq. Penalty | Presence Penalty |
|---|---|---|---|---|
| General fiction | 0.7 | 0.9 | 0.0 | 0.0 |
| Literary/poetic | 0.9 | 0.9 | 0.3 | 0.2 |
| Action/thriller | 0.6 | 0.9 | 0.0 | 0.0 |
| Outlines/planning | 0.4 | 0.8 | 0.0 | 0.0 |
| Brainstorming | 1.0 | 0.95 | 0.0 | 0.5 |
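If you drive a model through an API directly, the table above maps naturally onto preset dictionaries. A hypothetical sketch; the parameter names follow common LLM-API conventions, not any specific SDK:

```python
# Presets mirroring the recommended defaults table above.
PRESETS = {
    "general_fiction": {"temperature": 0.7, "top_p": 0.9,
                        "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "literary_poetic": {"temperature": 0.9, "top_p": 0.9,
                        "frequency_penalty": 0.3, "presence_penalty": 0.2},
    "action_thriller": {"temperature": 0.6, "top_p": 0.9,
                        "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "outlines_planning": {"temperature": 0.4, "top_p": 0.8,
                          "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "brainstorming": {"temperature": 1.0, "top_p": 0.95,
                      "frequency_penalty": 0.0, "presence_penalty": 0.5},
}

def settings_for(scenario):
    """Look up a preset, falling back to general fiction."""
    return PRESETS.get(scenario, PRESETS["general_fiction"])

print(settings_for("brainstorming"))
```

Keeping presets in one place like this makes it easy to tweak a scenario once and reuse it everywhere, which is essentially what the app's writing styles do for you.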
When to Adjust These Settings
The honest answer: most of the time, you don't need to. The defaults work well for the vast majority of creative writing. Consider adjusting when:
- Output feels too predictable — increase Temperature by 0.1–0.2.
- Output is too random or incoherent — decrease Temperature.
- The AI keeps repeating phrases — add a light Frequency Penalty (0.2–0.4).
- The AI circles back to the same ideas — add a light Presence Penalty (0.2–0.4).
- You need shorter/longer output — adjust Max Tokens.
- A style's preferred model isn't right — use Model Override.
For the deepest understanding of how the AI sees your story, read Context Building & Story Bible.