Phrase Bias Tutorial

written by OccultSage, with thanks to AlexeiP, Pause, Aini, and u/sgt_brutal

Note: Biases are extremely strong on Krake. You will need to reduce their intensity considerably to avoid soft bans or constant looping. It is suggested to add a 0 immediately after the decimal point. Thus, 0.3 would become 0.03. Anything over 0.05 is usually too strong.

Introduction
NovelAI has an advanced and powerful feature, Phrase Bias, that you can use to adjust the bias of groups of tokens. This is more powerful than the earlier Logit Bias feature that is available via the API but never exposed directly in the UI, as the older functionality only biases single tokens rather than groups of tokens.

To reach the Phrase Bias functionality, open the Options panel on the right-hand side of the NovelAI interface and scroll down until you get to the Advanced Options expansion. Click on that to open it, and scroll down further to find the Phrase Bias interface.

The drop-down under Phrase Bias selects a Phrase Bias Group, which is a set of phrases that share the same bias. Phrase Bias Groups can be added, removed, and edited. When you type the text you want to bias into the text field and confirm it, it is added to the current Phrase Bias Group.

The slider right below the text entry allows you to bias in ranges from 'almost never' on the left-hand side to 'almost always' on the right-hand side. A light touch is suggested, and the slider is non-linear to reflect this.

You may find more usage details here: NovelAI_Features

But before we get into how to use Phrase Biasing, let's discuss what biasing is and how it affects our phrases and tokens.

What is Biasing?
NovelAI at its heart is an advanced text prediction system that constructs responses one token at a time. The closest human language analogue for a token would be a syllable, although it does not usually map or correspond 1:1; there are words with multiple syllables that are a single token.

For each token, NovelAI picks from a list of probable tokens. That list is shaped by:
 * The Sigurd (GPT) model itself.
 * What text already exists in the context sent, including Story, activated Lorebook entries, Author's Note, and Memory.
 * The module in use (such as Cross-Genre, 19th Century Romance, and Lovecraft)
 * Generation settings.
 * Token bans.
 * ... and now Phrase Biases.

Each token has a probability assigned to it by the above process. For any text we submit, a probability is calculated for each candidate next token. (The example prompt and its table of top token probabilities are not reproduced here.) The generation node then picks one of the candidates at random, according to those weights.

When we generate a number of tokens, the above process is repeated once per token. With Token Streaming enabled, these tokens are fed back to the NovelAI UI one at a time.
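The pick-one-token-at-a-time process can be sketched in a few lines of Python. This is only an illustration, not NovelAI's actual implementation: it assumes a bias is simply added to a token's raw score (its logit) before the scores are turned into probabilities. The candidate tokens, their scores, the bias of 1.0, and the helper names `softmax`, `apply_bias`, and `sample` are all hypothetical, and the bias is not on the same scale as the NovelAI slider.

```python
import math
import random

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def apply_bias(logits, biases):
    """Return a copy of logits with each bias added to its token's score."""
    return {tok: score + biases.get(tok, 0.0) for tok, score in logits.items()}

def sample(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical scores for the next token; not real model output.
logits = {" the": 2.0, " a": 1.5, ".": 1.0, "\n": 0.2}

before = softmax(logits)                         # no bias
after = softmax(apply_bias(logits, {".": 1.0}))  # bias for the period

# Generating several tokens repeats the weighted pick once per token.
# (A real model would recompute the scores after every pick.)
rng = random.Random(0)
generated = [sample(after, rng) for _ in range(5)]

for tok in logits:
    print(repr(tok), round(before[tok], 3), "->", round(after[tok], 3))
```

Raising one token's score lowers every other token's probability, because the probabilities must still sum to 1. That is why a strong bias can crowd out everything else.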

What if we could adjust the probabilities of these tokens? Well, with Phrase Biasing, we can!

What Can We Do With Phrase Biasing?
We can do many things with Phrase Biasing that do not seem obvious at first glance.

We can:
 * Change the pacing of the story entirely by controlling when NovelAI finishes sentences, when it goes to the next paragraph, how much dialogue is generated, and how long each exchange of dialogue lasts.
 * Enhance NovelAI's grammatical usage by biasing against overused words or against weak ways of beginning sentences.
 * Control how a NovelAI story starts by biasing for or against common phrases that show up in your story.
 * Control the theme, slant, and emphasis of the story.

... and much more!

Story Pacing
We can use Phrase Biasing to control the pacing of the story.

Credit to https://www.reddit.com/user/sgt_brutal/ for his original post that crystallized my thoughts on this!

Before
We generate around 1200 characters on the default module, General: Cross-Genre, with the Storywriter preset and no biases, so we can get a general feel for the shape of the text.

Settings
There are four main tokens that control the flow of a story:
 * A period (.) -- going on to the next sentence. If we bias for a period, sentences get shorter; conversely, if we bias against periods, sentences get longer.
 * A double quotation mark (") -- biasing for this will cause more dialogue, biasing against it will cause less dialogue. Note that this is for opening quotations at the beginning of a line.
 * A double quotation mark followed by a space -- biasing for this will cause ping-pong conversations.
 * A line break -- going on to the next paragraph. NovelAI generally talks about one subject per paragraph, so if we want more or less detailed descriptions, we can bias against or for line breaks.
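To see why biasing a single flow token changes the shape of the text, here is a rough back-of-the-envelope sketch. It rests on two simplifying assumptions that are not from NovelAI itself: the bias is added to the period's logit while all other scores stay the same, and every token has the same chance of being a period. The 5% base chance, the bias values, and the function names are hypothetical.

```python
import math

def biased_probability(p, bias):
    """Probability of a token after adding `bias` to its logit.

    Assumes the other tokens' logits are unchanged, so the odds of this
    token versus everything else are scaled by exp(bias).
    """
    odds = p / (1.0 - p) * math.exp(bias)
    return odds / (1.0 + odds)

def expected_tokens_per_sentence(p_period):
    """Mean of a geometric distribution: tokens up to and including the period."""
    return 1.0 / p_period

base = 0.05  # hypothetical chance that any given token is a period

for bias in (-0.5, 0.0, 0.5):
    p = biased_probability(base, bias)
    print(f"bias {bias:+.1f}: p(period) = {p:.3f}, "
          f"about {expected_tokens_per_sentence(p):.1f} tokens per sentence")
```

Under these assumptions, even a modest shift in the period's probability compounds over every token of a sentence, which is one way to understand why a light touch is usually enough.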

Let's demonstrate by typing in the following into the Phrase Bias text field:

Once entered, let's slide the bias towards -0.1. Remember, a light touch is often all that's required! Make sure the Unbias When Generated checkbox is disabled.

After
Now we generate twice and see what kind of output we get!

Bias -0.1 against:

Amazing! The paragraphs are much longer and more detailed, and have far less dialogue! Next, let's see what happens when we slide from `-0.1` to `+0.1` and generate twice:

Bias +0.1 for:

We have a completely different writing style: terse and dialogue-heavy!

Control NovelAI Grammatical Usage
We can control how often speakers get referenced, as well as aim for better writing by biasing against or for certain words. For example, let's work from a writing advice article about common pitfalls here:

https://owl.purdue.edu/owl/general_writing/academic_writing/conciseness/avoid_common_pitfalls.html

Avoiding Using Expletives At The Beginning of Sentences
Expletives are phrases of the form it + be-verb or there + be-verb.

Bias -0.1 against:

Avoid Circumlocutions In Favor of Direct Expressions
Circumlocutions are commonly used roundabout expressions that take several words to say what could be said more succinctly.

Bias -0.1 against:

Avoid Overusing Noun Forms Of Verbs
Use verbs when possible rather than noun forms known as nominalizations. Sentences with many nominalizations usually have forms of be as the main verbs.

Bias -0.1 against:

Bias +0.1 for:

Trying It Out
Let's see what kind of output we get when we generate a few times:

This has definitely helped improve the output that NovelAI produces! There are many other things we can control as far as grammar goes, such as the overuse of particular words. This is not intended to be comprehensive, but a starter for ideas.

Control The Theme Of The Story
One of the most interesting things about Phrase Biasing is that we can use it to control the theme of the story.

A Story About Rainbows, Unicorns, and Girls on an Island
Bias +0.43 for:

Sample Output:

A Worldship's First Contact
Enable Ensure Completion After Start.

Bias +0.3 for:

Prompt:

Sample Output:

Layering Phrase Biases
A common difficulty that users may run into is that when they bias for or against a phrase, any other word that begins with the same token becomes much less likely.

For example, if a user biases for a word that tokenizes into more than one token, any other word that starts with the same first token will have a strong bias towards the biased word's second token on the next token.

We can solve this with multiple biases that are layered:
 * A bias for:
 * A slight bias against

NovelAI will generate the first token of the phrase and, on the next token, the slight bias against will reduce the bias for, so other words that share that first token stay possible.
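The layering idea can be sketched as follows. This is a toy model, not NovelAI's real algorithm: it assumes that the first token of a biased phrase is always biased, and that a later token of the phrase is biased only when the context ends with the tokens before it. The word split (" un" + "icorn"), the candidate scores, the bias values, and the function names are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def apply_phrase_biases(logits, context, phrase_biases):
    """Add each phrase's bias to whichever of its tokens could come next.

    Assumed model: token k of a phrase is biased only when the context
    ends with the phrase's first k tokens (always true for k = 0).
    """
    out = dict(logits)
    for tokens, bias in phrase_biases:
        for k, tok in enumerate(tokens):
            tail = list(context[len(context) - k:])
            if tail == list(tokens[:k]) and tok in out:
                out[tok] += bias
    return out

# The model has just generated " un"; hypothetical scores for what follows.
context = ["The", " un"]
logits = {"icorn": 0.5, "der": 2.0, "til": 1.5, "less": 1.0}

single = [([" un", "icorn"], 3.0)]      # bias for the whole phrase
layered = single + [(["icorn"], -1.5)]  # plus a slight bias against its second token

p_plain = softmax(logits)
p_single = softmax(apply_phrase_biases(logits, context, single))
p_layered = softmax(apply_phrase_biases(logits, context, layered))

for tok in logits:
    print(tok, round(p_plain[tok], 3), round(p_single[tok], 3), round(p_layered[tok], 3))
```

In this toy model the single bias lets "icorn" crowd out "der", "til", and "less", while the layered pair still favors "icorn" over its unbiased level but leaves room for the other continuations.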

Notes and Caveats

 * When you enter a phrase without enclosing it in curly braces, it is preceded by a space. Most words in a sentence are preceded by a space when tokenized, but this means that the starts of paragraphs or quotes escape the bias.
 * Pay attention to the Unbias When Generated setting! Sometimes you want this enabled, and sometimes you don't!
 * Ensure Completion After Start can really bite you. Avoid enabling this for words with beginning tokens that are common!

Questions, Feedback, Ideas
Questions and feedback are always appreciated. Find @OccultSage on one of the many NovelAI discords!

The two bias files created while writing this tutorial are:
 * Story Flow Control.bias
 * Better Writing Biases.bias