Advanced Settings
Temperature, frequency penalties, and other generation controls.
Last updated March 2026
Overview
The Advanced Settings panel gives you fine-grained control over how AI models generate text. These settings affect the randomness, diversity, repetition, and length of the output.
Most writers never need to touch these — the defaults work well for the vast majority of use cases. But if you want to push a model toward more creative (or more predictable) output, this is where you do it.

The Advanced Settings panel — fine-tune how the AI generates text
Temperature
Temperature controls how “random” the AI's word choices are. It's the single most impactful setting for shaping the feel of generated prose.
| Value | Behavior | Good For |
|---|---|---|
| 0.0 | Deterministic — always picks the most likely next word | Technical writing, outlines, consistency checks |
| 0.5–0.7 | Balanced — mostly predictable with occasional surprises | Most fiction writing, dialogue, descriptions |
| 0.8–1.0 | Creative — more varied word choices, unexpected phrasings | Poetry, experimental prose, brainstorming |
| 1.2–2.0 | Very random — unpredictable, sometimes incoherent | Wild brainstorming, surrealist writing (use with caution) |
The default temperature varies by model and writing style. Most models default to around 0.7–0.8, which is a good balance between creativity and coherence. Writing styles can override this — for example, a “literary” style might set temperature higher to encourage more varied vocabulary.
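Under the hood, temperature works by rescaling the model's raw scores (logits) before they are turned into probabilities: low values sharpen the distribution toward the most likely word, high values flatten it. A minimal sketch of that math (illustrative code, not the app's actual implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, scaled by temperature."""
    if temperature == 0:
        # Temperature 0 degenerates to greedy decoding:
        # all probability mass on the single most likely word.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Dividing logits by temperature sharpens (T < 1) or flattens (T > 1)
    # the resulting distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax(logits, temperature=0.5))  # sharper: mass concentrates on word 0
print(softmax(logits, temperature=1.5))  # flatter: choices more evenly spread
```

Running this shows why low temperature feels predictable: at 0.5 the top word dominates, while at 1.5 the runners-up get a real chance of being picked.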
Top P (Nucleus Sampling)
Top P (also called nucleus sampling) is another way to control diversity. Where temperature controls how random each choice is, Top P controls how many candidate words are in the running at each step.
A Top P of 0.9 (the default) means the AI samples from the smallest set of candidate words whose combined probability adds up to 90%, discarding the unlikely tail. Lowering this narrows the pool: a Top P of 0.5 keeps only the candidates that together account for the top 50% of probability mass, producing more focused output.
| Value | Effect |
|---|---|
| 0.5–0.7 | More focused, less diverse word choices |
| 0.9 (default) | Good balance — ignores only very unlikely words |
| 1.0 | Considers all possible words (maximum diversity) |
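The filtering step can be sketched as follows. Note that the cutoff is by cumulative probability, not by a fixed count of words, so the size of the surviving pool varies from step to step (illustrative code, not any model's actual implementation):

```python
def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of words whose cumulative probability
    reaches top_p, then renormalize the survivors."""
    # Walk candidates from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break  # nucleus is complete; drop the remaining tail
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus_filter(probs, top_p=0.9))  # drops the 0.05 tail word
```

With these probabilities, the first three words already cover 95% of the mass, so the fourth is excluded; at Top P = 0.5 only the first word would survive.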
Frequency Penalty
Frequency Penalty reduces repetition by penalizing words that have already appeared in the output. The more often a word appears, the less likely the model is to use it again.
| Value | Effect |
|---|---|
| 0.0 (default) | No penalty — words can repeat naturally |
| 0.2–0.5 | Light penalty — reduces obvious repetition |
| 0.6–1.0 | Moderate penalty — noticeably less repetitive, more varied vocabulary |
| 1.5–2.0 | Strong penalty — can make prose feel forced or unnatural |
A light frequency penalty (0.2–0.5) is useful if you notice the AI repeating the same phrases or descriptions. Be careful with values above 1.0 — heavy penalties can make the model avoid common words it actually needs to use, leading to awkward prose.
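In OpenAI-style APIs, the frequency penalty works by subtracting penalty × count from the score of each token already in the output, so a word that has appeared twice is penalized twice as hard as one that appeared once. A rough sketch of that adjustment (function and variable names are illustrative):

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty=0.5):
    """Subtract penalty * (times already generated) from each token's logit."""
    counts = Counter(generated_tokens)
    return [l - penalty * counts.get(i, 0) for i, l in enumerate(logits)]

logits = [2.0, 2.0, 2.0]
history = [0, 0, 1]  # token 0 appeared twice, token 1 once, token 2 never
print(apply_frequency_penalty(logits, history, penalty=0.5))
# token 0 is penalized most: [1.0, 1.5, 2.0]
```

This also shows why very high penalties backfire: a common word like "the" racks up a large count quickly and gets pushed out of reach, which is exactly the "forced or unnatural" prose described above.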
Presence Penalty
Presence Penalty encourages the model to bring up new topics and ideas. Unlike Frequency Penalty (which penalizes based on how often a word appeared), Presence Penalty penalizes a word simply for having appeared at all.
| Value | Effect |
|---|---|
| 0.0 (default) | No penalty — model writes naturally |
| 0.2–0.5 | Light encouragement to explore new ground |
| 0.6–1.0 | Moderate push toward new topics and vocabulary |
This is most useful for brainstorming or when you want the AI to explore different aspects of a scene rather than circling back to the same descriptions. For most fiction writing, the default of 0.0 works fine.
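The contrast with the frequency penalty is easiest to see in code: the presence penalty is a flat, one-time subtraction once a token has appeared at all, regardless of how many times (again an illustrative sketch, not the app's implementation):

```python
def apply_presence_penalty(logits, generated_tokens, penalty=0.5):
    """Subtract a flat penalty from any token that has already appeared."""
    seen = set(generated_tokens)
    return [l - (penalty if i in seen else 0.0) for i, l in enumerate(logits)]

logits = [2.0, 2.0, 2.0]
history = [0, 0, 1]  # token 0 twice, token 1 once: both penalized equally
print(apply_presence_penalty(logits, history, penalty=0.5))  # [1.5, 1.5, 2.0]
```

Because every used token takes the same flat hit, unused tokens (and the new topics they represent) gain a relative advantage, which is why this setting nudges the model toward fresh ground rather than just away from repeats.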
Max Tokens
Max Tokens sets the maximum length of the AI's output. This is measured in provider tokens (roughly 4 characters per token, or about 0.75 words per token).
Each model has its own hard maximum:
| Max Output | Models |
|---|---|
| 16K tokens | Grok 4.1 Fast |
| 32K tokens | GPT-4.1 Nano, GPT-4.1, Grok 4 |
| 64K tokens | Gemini 2.5 Flash Lite, Gemini 3 Flash, Gemini 3.1 Flash Lite, Gemini 2.5 Pro, Gemini 3.1 Pro, Claude 4.5 Haiku, Claude 4.5 Sonnet, Claude 4.6 Sonnet |
| 128K tokens | GPT-5 Mini, GPT-5.2, GPT-5.4, Claude 4.6 Opus |
You can set a lower max to control output length. For example, if you want short, punchy paragraphs, set max tokens to 500–1,000. For full chapter-length generations, leave it at the model's default maximum.
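To pick a sensible cap, you can convert a target length into tokens with the heuristics above (~4 characters per token, ~0.75 words per token). A small helper as a sketch; real tokenizers vary by model and language, so treat these as ballpark figures, not exact counts:

```python
def estimate_tokens(text):
    """Return (chars-based, words-based) rough token estimates."""
    by_chars = len(text) / 4     # heuristic: ~4 characters per token
    by_words = len(text.split()) / 0.75  # heuristic: ~0.75 words per token
    return by_chars, by_words

sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sample))  # (11.0, 12.0)
```

So a 750-word scene lands around 1,000 tokens, comfortably inside even the smallest 16K cap; it's the chapter-length generations where the per-model maximums start to matter.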
Model Override
Writing styles can specify a preferred model — for example, a style designed for literary fiction might prefer Claude 4.5 Sonnet. The model override setting lets you lock a specific model for your session, overriding any style preferences.
This is useful when you want to use a particular style's prose instructions but pair them with a different model. For example, you might use a romantic style (designed for Claude) with Grok 4 for its Freedom rating.
To set a model override, use the model selector dropdown in the toolbar. Your selection takes priority over the style's preferred model until you change it or close the session.
Model Compatibility Notes
Not all models support all settings. Here's what to know:
| Setting | Supported Models | Notes |
|---|---|---|
| Temperature | All 16 models | Universally supported |
| Top P | All 16 models | Universally supported |
| Frequency Penalty | All except Gemini 3.x preview models | Silently ignored on Gemini 3 Flash, 3.1 Flash Lite, 3.1 Pro |
| Presence Penalty | All except Gemini 3.x preview models | Same limitation as Frequency Penalty |
| Max Tokens | All 16 models | Capped at each model's hard maximum |
Recommended Defaults
If you're not sure what to set, here are sensible defaults for common scenarios:
| Scenario | Temperature | Top P | Freq. Penalty | Presence Penalty |
|---|---|---|---|---|
| General fiction | 0.7 | 0.9 | 0.0 | 0.0 |
| Literary/poetic | 0.9 | 0.9 | 0.3 | 0.2 |
| Action/thriller | 0.6 | 0.9 | 0.0 | 0.0 |
| Outlines/planning | 0.4 | 0.8 | 0.0 | 0.0 |
| Brainstorming | 1.0 | 0.95 | 0.0 | 0.5 |
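If you drive a model through an API directly, the table above maps naturally onto preset dictionaries. A hypothetical sketch; the parameter names follow common LLM-API conventions, not any specific SDK:

```python
# Presets mirroring the recommended defaults table above.
PRESETS = {
    "general_fiction": {"temperature": 0.7, "top_p": 0.9,
                        "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "literary_poetic": {"temperature": 0.9, "top_p": 0.9,
                        "frequency_penalty": 0.3, "presence_penalty": 0.2},
    "action_thriller": {"temperature": 0.6, "top_p": 0.9,
                        "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "outlines_planning": {"temperature": 0.4, "top_p": 0.8,
                          "frequency_penalty": 0.0, "presence_penalty": 0.0},
    "brainstorming": {"temperature": 1.0, "top_p": 0.95,
                      "frequency_penalty": 0.0, "presence_penalty": 0.5},
}

def settings_for(scenario):
    """Look up a preset, falling back to general fiction."""
    return PRESETS.get(scenario, PRESETS["general_fiction"])

print(settings_for("brainstorming"))
```

Keeping presets in one place like this makes it easy to tweak a scenario once and reuse it everywhere, which is essentially what the app's writing styles do for you.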
When to Adjust These Settings
The honest answer: most of the time, you don't need to. The defaults work well for the vast majority of creative writing. Consider adjusting when:
- Output feels too predictable — increase Temperature by 0.1–0.2.
- Output is too random or incoherent — decrease Temperature.
- The AI keeps repeating phrases — add a light Frequency Penalty (0.2–0.4).
- The AI circles back to the same ideas — add a light Presence Penalty (0.2–0.4).
- You need shorter/longer output — adjust Max Tokens.
- A style's preferred model isn't right — use Model Override.
For the deepest understanding of how the AI sees your story, read Context Building & Story Bible.