NovelAI Features


-

= Detailed Concepts =

This section seeks to explain, in depth, the various elements that make up NovelAI.

-

'GPT' Artificial Intelligence
Science fiction has us think of artificial intelligence as something for robots and alien spaceships, but the reality is very different. In this context, an artificial intelligence is quite unlike the natural intelligence exhibited by humans. The phrase that conjures thoughts of robotic assassins and intelligent holographic people is actually artificial general intelligence, or AGI.

The AI that NovelAI uses is a text transformer. This can be thought of as a very, very advanced form of autocomplete - it chooses words in sequence based on recognized patterns. It works by converting text to Tokens, estimating the probability of each possible next Token in the sequence, then returning the result.

No, you are not talking to a machine with thoughts and feelings. While NovelAI is very advanced, it is purely designed as a literature generation service. At times, you may feel like the AI understands you, and it may even seem emotional. This is the AI producing a very convincing facsimile of human literature - exactly what it was designed to do.

The AI is only capable of producing text that looks convincingly human, without understanding language on its own.

As such, NovelAI has no understanding of morality, as it has no understanding of anything. Please keep this in mind when experiencing the narratives presented in your Stories, and remember you can always ban certain Tokens from appearing.

Parameters
Think of an AI like a human brain. Inside the brain, you find neurons, and those neurons are connected by synapses.

The number of parameters is the number of synapses. It's a measure of the density of the neural network's connections. The denser it is, the richer the connections, and the broader the scope of the AI's creativity.

The model is, in reality, a vector space, and each parameter is a set of floating point numbers that constitutes a vector. Each vector is a connection between Tokens, and that network of vectors is an attempt to represent human language mathematically.

A model with more parameters takes considerably more memory on the machine's hardware, but is capable of greater creativity and more 'natural' language.


-

Tokens
The AI interprets text by converting it to tokens. Tokens are how the AI sees pieces of text. Much like morphemes, Tokens are combined to form words or sentences. Because the AI has no judgment of its own, it relies on evaluating the relationships between Tokens based on its training data, then determining the most likely Token to come next in the sequence.

Think of it as a huge game of probabilities. There is no winner, but there are more likely answers, and the AI picks from those.



For example, a raincoat is composed of two Tokens - the Token for rain and the Token for coat. When put together, the AI recognizes the pattern as raincoat, and evaluates its training material to figure out what Tokens are associated with raincoat. Not all Tokens are whole words - some Tokens are as simple as punctuation marks, spaces, and even partial words like mah.

Because the AI is quite powerful, it's capable of evaluating patterns with up to 2048 Tokens in total. When you hit Send, the Current Context is fed to the AI as Tokens, the AI estimates the most likely next Token in the sequence, then repeats the process until the Completion is returned.
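The loop described above can be sketched in a few lines of Python. This is a toy illustration, not NovelAI's implementation: `toy_next_token_probs` is a hypothetical stand-in for the real model, and Tokens are shown as plain strings rather than numeric IDs.

```python
import random

MAX_CONTEXT_TOKENS = 2048  # context limit quoted in the text above


def toy_next_token_probs(context):
    """Hypothetical stand-in for the model: maps a context to Token probabilities."""
    if context and context[-1] == "rain":
        return {"coat": 0.7, "drop": 0.2, "bow": 0.1}
    return {"rain": 0.5, "the": 0.3, ".": 0.2}


def generate(context, output_length, rng):
    """Pick one Token at a time and append it until the Completion is full."""
    completion = []
    for _ in range(output_length):
        # Only the most recent Tokens fit in the model's window.
        window = (context + completion)[-MAX_CONTEXT_TOKENS:]
        probs = toy_next_token_probs(window)
        tokens = list(probs)
        weights = [probs[token] for token in tokens]
        # Weighted random choice: likely Tokens win more often, but not always.
        completion.append(rng.choices(tokens, weights=weights, k=1)[0])
    return completion


print(generate(["the"], 5, random.Random(0)))
```

Running it several times with different seeds shows the "game of probabilities": the likeliest Token usually wins, but less likely ones still appear.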

Because NovelAI works by identifying links between Tokens, including a Token will cause it to have a higher chance of appearing. This means that writing a negative statement will still cause the AI to consider the thing you negated as a possibility.

This is similar to ironic process theory. Instead, you should phrase things positively where possible.

If you wish to experiment with Tokenization, the Author's Note and Memory fields provide an accurate Token count for everything you type within.


Story Library


Also known as the "left panel", this part of the interface controls everything outside of your Story. This includes the Settings menu, as well as the list of your Stories.

The menu icon (Ξ) contains the Settings menu, and is also where you can log out. Below it is the Search Field (🔍), which allows you to search through your Story collection by typing next to the magnifying glass.

Every Story displays its Title, Description, the date it was last edited, and whether or not it is marked as a Favorite. The Title and Description can be modified by switching to the Story, then adjusting them in the Story Options. You can switch to a Story by simply clicking on it in the Story Library.

A new Story can be started by selecting the New Story button (+), found at the bottom of the Story Library.


-

Input Field
Here is the Input Field. This section of the interface is used to control the edit history of the Story, as well as being a place to type additions to your Story.



The edit history can be thought of as a timeline - every change you make is a step forward in the timeline. Undoing will take a step backward, Redoing will take a step forward, and Retry will make a new attempt at the last Completion. The Retry Tree is basically a way to jump between timelines - any time the AI sends a Completion that's on the same point as an existing Completion, it will create a new entry in the Retry Tree.

More information can be found in their dedicated categories below.

Undo ↩
This option will take a step backwards in the edit history.
 * It will never overwrite anything in the edit history, only stepping backwards.
 * This will remove entire sentences at a time from the point you began editing, and will remove entire Completions by the AI.

Redo ↪
This option will take a step forwards in the edit history.
 * It will never overwrite anything in the edit history, only stepping forwards, and always with the most recent entry in the Retry Tree.
 * This functions inversely to Undo - instead of removing entire sentences and Completions, it will return them.
 * This is not affected by the location of your text cursor.

Retry Tree 0


This option provides a list of all the attempted Completions at this point in the Edit History.
 * You can choose the one you felt was the most appropriate.
 * This does not count as a Completion.

📕 Lorebook
Opens the Lorebook window.

Retry 🔁
This option will remove the last Completion, then make a new one.
 * While doing this, it will also create a new entry in the Retry Tree.
 * This does not affect any text you edit - if there's no Completion at this point in the edit history, it's the same as hitting Send.

Send ➡
This option will place all of the text in the input field onto the end of your Story, send all of the Current Context to the AI, then ask the AI to send a Completion.

This is a new step in the edit history, and does not create a new entry in the Retry Tree.


-

= Story Options =



This section of the interface controls settings specific to your currently active Story. There are two tabs here - Story and Options.

Story Title
The name you gave to the Story, and the one you look for when using Search.

Story Description
A short blurb to explain the contents of the Story, or a witty quip to help you identify it at a glance.

Create a new tag
Adds a new tag which you can search for.

Saved Tags
All the tags that apply to this Story. You can remove them by clicking the small cross next to them. The ordering does not matter.

Memory, Author's Note, Lorebook and Context have their own entries in this section.


-

Context
View Last Context opens a window which displays all the tokens sent to the AI for the previous generation. This helps you check if anything you feel is important was omitted. View Current Context does the same, but for the input you're about to send.




-

Prompt
The prompt is displayed in cream by default. It is the first piece of text fed into the AI. If you have put anything into the Memory or Author's Note, it will be inserted into the context before the context is sent to the AI.




Injected Text
Injected Text is any text that is not part of the story, but part of the context. All of these elements are injected text:


 * The Memory.


 * The Author's Note, or A/N colloquially.


 * Lorebook entries.


 * Ephemeral Context entries.

Fundamentally, all Injected Text works the same way. It's read by the AI, and influences its generation. Traditionally, all Injected elements are encased between square brackets [ ], as this reduces the risk of the style of their content "leaking" into the generation, but this is usually not an issue.

There are two important things to consider about injected text:


 * Position determines the strength of the injection's influence. Closer to the end of the context is stronger; closer to the top is weaker.


 * Style determines how it influences the generation. Generally, you want to stay close to your Story's style, perhaps with minor concessions such as removing determiners, prepositions, etc.


-

Memory
By default, the Memory is inserted at the top of the context, before anything else. Its position may be adjusted for a stronger (closer to the bottom) or a weaker (further to the top) effect. It is traditionally used to make the AI remember broad context elements.




-

Author's Note
The Author's Note, or A/N, is identical in format and use to Memory, but by default it is inserted three newlines before the last token in the input. It has a greater influence as a result. The A/N's position may be adjusted for a stronger (closer to the bottom) or a weaker (further to the top) effect. Traditionally, it is used to give immediate instructions or to specify style, author, and tone elements.




-

Lorebook


The Lorebook allows you to create entries for specific elements in your story. This helps the AI have more information about characters, places, items, concepts and so on.

You can import a Lorebook file by clicking the 📤 Import button at the top left.

Basics
To open the main Lorebook window, click the 📕 button, or Open Lorebook in the Story tab on the right. To access an entry, type its name then press Enter.

Click ➕ Add Entry to create a new entry. You can name it in the first text box. The second contains the entry's contents. Only the content and settings are read by the AI. The entry name is just an identifier.

Enabled determines if the entry will be inserted in the context if detected. If it is disabled, it won't trigger. This is useful to reduce context cluttering if you don't need details about specific things.

Keys are all the words that the AI will associate with this entry. If the AI reads one of these words, the connected entry will be inserted in the context.

Type the key and press Enter to register it. Keys are case insensitive by default.
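As a rough illustration of case-insensitive Key detection, here is a minimal Python sketch. The `activated_entries` function and the dictionary layout are hypothetical, and plain substring matching is an assumption made for brevity; the real matching rules may differ.

```python
def activated_entries(entries, recent_text):
    """Return the names of entries with at least one Key found in recent_text.

    entries: dict mapping an entry name to a list of Keys (hypothetical layout).
    Matching ignores case, mirroring the default behaviour described above.
    """
    haystack = recent_text.lower()
    return [
        name
        for name, keys in entries.items()
        if any(key.lower() in haystack for key in keys)
    ]


lorebook = {"Angela": ["angela", "the pilot"], "Harbor": ["docks"]}
print(activated_entries(lorebook, "ANGELA stepped off the ship."))
```

Only the "Angela" entry triggers here, because none of the Keys for "Harbor" appear in the text.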

To make a Key case-sensitive, preface it with  and close it by  :

Advanced Lorebook Settings

 * Search Range: Determines how many characters of text will be read by the AI when it looks for lorebook keys.


 * Force Activation: If turned ON, the entry will ALWAYS be in the context (if it can fit in there).


 * Key-Relative Insertion: By default, Lorebook Entries are inserted relative to the top, or the bottom of the text, see Insertion Position. When this toggle is ON, entries are inserted relative to the last occurrence of the Key found in the context.


 * Cascading Activation: When ON, this entry will also look for its keys in other Lorebook entries, the Memory, and the Author's note. Search Range will be disregarded if this toggle is ON.

You may also use a  (a newline marker), which helps isolate the entry further by separating it with a full newline.
 * Prefix & Suffix: These two are intended to work in tandem to allow for lengthier entries without losing coherence when the entry is trimmed. For example, you could add the prefix  and the suffix   to encapsulate the entirety of your entry despite trimming. If your entry read as , and the last sentence was trimmed, it would still read as   despite the end of the entry being trimmed - your prefix and suffix still remain.


 * Token Budget: Keeps this number of tokens in the context window for this entry. This will overwrite other content if necessary! It's recommended to set it a little lower than the entry's full size.


 * Insertion Order: The higher this number, the earlier the entry is processed. Entries with a low value may be dropped to save space for those with a higher value. If you have three entries with orders 500, 0, and -500, they will be processed from highest (500) to lowest (-500).


 * Insertion Position: How far from the top (if positive) or the bottom (if negative) the entry will be inserted in the window. The unit is defined in Insertion Type. It can be a number of tokens, sentences, or newlines.

As an example, if you set it to -3 Newline, then it will insert the entry's text as soon as it finds the third newline, reading back from the bottom of the window. -1 will mean it is always placed at the very bottom of the Context, just as positive 1 will always place it at the very top of the Context. 0 will always be the very top.


 * Trim Direction: If the entry needs to be inserted partially due to lack of room in the context window, should it be trimmed from the beginning towards the end (Top), from the end towards the beginning (Bottom), or omitted entirely if it can't fit fully (Do Not Trim)?
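The effect of Insertion Order can be sketched as follows. This is a simplified illustration under stated assumptions: `select_entries` and the entry dictionaries are hypothetical, trimming is ignored (an entry either fits whole or is dropped), and token counts are given directly instead of being computed.

```python
def select_entries(entries, context_budget):
    """Process entries from highest to lowest order, keeping those that still fit."""
    included = []
    remaining = context_budget
    for entry in sorted(entries, key=lambda e: e["order"], reverse=True):
        if entry["tokens"] <= remaining:
            included.append(entry["name"])
            remaining -= entry["tokens"]
    return included


entries = [
    {"name": "low", "order": -500, "tokens": 30},
    {"name": "high", "order": 500, "tokens": 60},
    {"name": "middle", "order": 0, "tokens": 50},
]
print(select_entries(entries, context_budget=100))
```

With a budget of 100 tokens, the order-500 entry is placed first and the order-0 entry no longer fits, which shows how lower values can lose out on space.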


Why use Brackets?
Bracketed text is often assumed to be read by the AI as "Read this, but this is not part of the text." This helps it keep things in memory without trying to continue from them as if they were sentences in the text.

Strictly speaking, this isn't entirely true, but it's the easiest way to explain it for the average user. Different bracketing techniques exist, but the specific effect of containing text in brackets is unclear and debated.


 * Traditionally, entire entries are encased in brackets.


 * You can also make several blocks, one per element, e.g. one block for style, one block for author, and one for immediate action in the Author's Note.


 * You can encase only descriptive passages in Injected Text entries if they differ from the usual style of your prose.


Context Viewer
The Context Viewer is a powerful tool to identify what elements were used by the AI in the last generation. This helps you diagnose Memory, Author's Note and Lorebook usage. Check for bloat, trimmed entries, or ones that take too much space using this tool.



Identifiers
Lists the Identifier of each element of the context, which describes its origin as one of the following:
 * Story: From the main text.
 * Memory: From the Memory block.
 * Author's Note: From the Author's Note block.
 * Display Name of a Lorebook Entry: The name that you gave that entry in the Lorebook, not its keys.

Inclusion
Lists if this element is included in the context:
 * Included: Successfully inserted in the context.
 * Partially Included: Inserted in the context but some trimming was performed.
 * Not Included: Insertion in the context failed or was not attempted.

Reason
Lists the reason for this element's inclusion or omission:

Included
 * Default: Reserved to Story, Memory and Author's Note. Included by default.
 * Key Activated: This Lorebook entry was triggered by one of its keys.
 * Forced: This Lorebook entry was activated because it was set to Forced.

Omitted
 * Disabled: This Lorebook entry was omitted because it was disabled.
 * No key: This Lorebook entry was omitted because it could not find any of its keys in the text.
 * No space: This entry was omitted because it could not be allocated enough tokens to fit.
 * No text: This entry was deactivated because it contains no text.

Key
Lists the key that triggered this Lorebook entry.

Reserved
Lists the amount of tokens reserved for this entry. This is usually lower than the Reserved Tokens setting of that entry, as that setting is the upper limit.

Tokens
Lists how many tokens are used by this entry solely on its own. Tokenization can cause a couple more (or sometimes fewer) tokens to be used when this entry is placed in the text.

Trim Type
Lists how this entry was trimmed. There are four trim steps, which occur in this sequence:
 * Fit the entire entry without trimming. (No Trim) If it doesn't fit, go to the next step:
 * The entry was trimmed to a new line character inside its text. (New Line) If this results in the entry having less than 30% of its allocated token content inserted, go to the next step:
 * The entry was trimmed to a sentence delimiter (period, ellipsis, semicolon). (Sentence) If this causes the entry to have less than 30% of its allocated token content inserted, go to the next step:
 * The entry is trimmed by the individual token, and then all the content that can fit in the space that remains is inserted. (Token) If this STILL fails, this is likely because the Prefix and Suffix can't fit in the context, so the entire entry is omitted.
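The four trim steps above can be sketched in Python. This is a simplified illustration, not the actual implementation: `count_tokens` is a crude word counter standing in for the real tokenizer, the sketch always keeps the beginning of the entry, and Prefix and Suffix handling is left out.

```python
import re

MIN_FRACTION = 0.3  # the 30% threshold described above


def count_tokens(text):
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def longest_fit(pieces, joiner, budget):
    """Join as many leading pieces as fit within the token budget."""
    kept = []
    for piece in pieces:
        if count_tokens(joiner.join(kept + [piece])) > budget:
            break
        kept.append(piece)
    return joiner.join(kept)


def trim_entry(text, budget):
    """Return (trimmed_text, trim_type), trying each step in sequence."""
    if count_tokens(text) <= budget:
        return text, "No Trim"
    by_line = longest_fit(text.split("\n"), "\n", budget)
    if count_tokens(by_line) >= MIN_FRACTION * budget:
        return by_line, "New Line"
    # Split after a period, ellipsis, or semicolon that is followed by whitespace.
    sentences = re.split(r"(?<=[.;…])\s+", text)
    by_sentence = longest_fit(sentences, " ", budget)
    if count_tokens(by_sentence) >= MIN_FRACTION * budget:
        return by_sentence, "Sentence"
    return " ".join(text.split()[:budget]), "Token"


entry = "One two three. Four five six.\nSeven eight nine ten."
for budget in (20, 7, 4, 2):
    print(budget, trim_entry(entry, budget))
```

Shrinking the budget walks the same entry through each trim type in turn, which is the sequence the Context Viewer reports.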

Advanced Context Settings


Remember all the advanced settings of the Lorebook? Those are used here, but for the Story, Memory and Author's Note. This allows you to:


 * Fine-tune the maximum size of these blocks.


 * Make Memory or the Author's Note get trimmed before the Story by setting it to a higher priority.


 * Change the way these three blocks are trimmed.


 * Force suffixes and prefixes that you won't need to write in the blocks directly.


-

Ephemeral Context


Ephemeral Context entries are effectively time-sensitive context injections. Think Mission Impossible:

Every time you generate text, you perform a step. Ephemeral Context entries wait a certain number of steps, appear, remain for a certain number of steps, and disappear.

The syntax example is as follows:

Several symbols are used to define the type of information specified:


 * {} Contains the block.


 * The first number specifies the exact starting step, if necessary. You can also specify negative steps using -


 * + specifies the delay in steps before activation. +0 will trigger immediately. Adding r to it will make it repeat after the number of steps set passes, even if the entry is still active. As a result, make sure the delay is longer than the duration if you don't want the entry to be always on, if it repeats.


 * ~ specifies the duration of the entry, in steps, before it disables.


 * , followed by + or - specifies the insertion position of the entry, in new lines. + starts from the top of the context, - starts from the bottom of the context.


 * : specifies the beginning of the text content of the entry.
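To make the symbols above more concrete, here is a minimal parsing sketch in Python. It is reconstructed only from the list above: the field order (start, delay, duration, position, text), the `parse_ephemeral` helper, and the sample entry are assumptions made for illustration, not a confirmed specification of the syntax.

```python
import re

# Assumed layout: {[!][start][+delay[r]][~duration][,±position]:text}
EPHEMERAL = re.compile(
    r"\{(?P<invert>!)?"
    r"(?P<start>-?\d+)?"
    r"(?:\+(?P<delay>\d+)(?P<repeat>r)?)?"
    r"(?:~(?P<duration>\d+))?"
    r"(?:,(?P<position>[+-]\d+))?"
    r":(?P<text>.*)\}",
    re.DOTALL,
)


def parse_ephemeral(entry):
    """Split an Ephemeral Context entry into its fields (sketch only)."""
    match = EPHEMERAL.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"not an Ephemeral Context entry: {entry!r}")
    g = match.groupdict()
    return {
        "inverted": g["invert"] is not None,
        "start": int(g["start"]) if g["start"] else None,
        "delay": int(g["delay"]) if g["delay"] else 0,
        "repeat": g["repeat"] is not None,
        "duration": int(g["duration"]) if g["duration"] else None,
        "position": int(g["position"]) if g["position"] else None,
        "text": g["text"],
    }


print(parse_ephemeral("{+30~15,+5:[Angela's amnesia temporarily dissipates.]}"))
```

Under these assumptions, the sample entry parses to a delay of 30 steps, a duration of 15 steps, and a position five new lines from the top of the context.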

Thus:  will add "[Angela's amnesia temporarily dissipates.]" to the context, five new lines from the top of the context, for fifteen steps, starting thirty steps after you set up this entry. Effectively, it'll be on half the time.

You may also add a ! after the first curly brace to be able to specify a temporarily inactive entry. This makes it always present except during the Ephemeral Context's entry duration.

This one will be off half the time, when the other entry is active.

You can also type out Ephemeral Context entries directly in the Input box.


-

= Options Tab =

AI Modules


AI Modules are data modules that are inserted into the AI's memory in order to influence the text it will generate. These modules reduce the total context space by twenty tokens when in use, but are not tokens in themselves.

Each module is similar to a "mini-finetune", a corpus of text that was used to adjust the AI based on how it is written. Different modules have different effects, which depend on your own writing and the ideas, characters and scenarios you write about.

There are three types of modules: Style, Theme and Inspiration.


 * Styles are based on multiple works from the same author.


 * Themes are based on multiple works, from multiple authors, but from the same genre.


 * Inspirations are based on a singular, specific work, from a single author.

Experiment to find what works best with what you enjoy and want to write about!

Dataset Cleaning Guidelines
Preparation of a clean dataset is the most important factor in the creation of a proper AI Module. Simply scraping a wiki or throwing together some PDF conversions is vastly insufficient and will at best result in suboptimal outputs riddled with leaked symbols, odd spacing, and circular repetition.

Luckily cleaning your dataset is not difficult, only time-consuming. Assuming you have the patience and follow these guidelines, you too can create high quality AI modules.

General Overview
Dataset files should be... Plain text in UTF-8 encoded  format with no tags/markdown/html, instead focusing on standard formatted English prose.

One paragraph per newline. That is, no paragraphs should be split onto multiple lines. To visualize this, it helps to turn on Word Wrapping in whatever text editor you are using (in Notepad++ this is done by checking View > Word Wrap).

No empty newlines between paragraphs (i.e. double spacing). If desired, you could leave a single empty newline between chapters as a chapter break, but it is instead recommended that you leave only  on its own newline between chapters (replacing the chapter title).

Do not leave any leading/trailing space, tabs or other whitespace. This includes checking for any spaces after the end of a sentence before a newline.

Use regular single and double quotation characters (' and ") not the fancy ones (‘ ’ and “ ”). The AI is trained using the former and thus will not use the latter properly.

Try to focus in on only one specific subject matter and ensure all included material is focused on what your module should achieve in terms of what you expect from its output. This process necessitates nuance, not stuffing as much as you can into your dataset. Keep in mind this can be tricky, as for example some Steampunk novels don't actually talk all that much about the type of content one would relate to the Steampunk genre and thus in practice it is not very effective using them to train a "Steampunk" module.

A little data goes a long way. 1MB to 5MB still provides excellent results for an authorial or thematic style assuming you provide it enough training steps. On this note, feel free to experiment with short data in general. There is nothing stopping you from turning a short prompt into a module, and this would also require much fewer training steps (perhaps within the range of 50-100 for a typical scenario prompt).

If you want to avoid the same characters appearing constantly or other forms of overfitting, try to keep the data balanced with a variety of names, phrases, and terms by including stories featuring different characters or locations.

Do not expect a module to memorize relational/factual data. For instance if you feed it a story with Pokémon descriptions, while it will reference those same Pokémon, it may still get their types, moves, or other data you included wrong.

The form or format in which the data exists plays a significant role in the form it will generate as AI output. For this reason, avoid wikis or other encyclopedic data unless you specifically want to generate encyclopedia entries (which could be useful for utility modules as content generators).

Cleaning Headings & Auxiliary Sections
Before anything else, it is important to note that discretion is key here, as you typically want to be as non-destructive to your data as possible considering how easy it is to completely ruin your data with a single misguided Find and Replace. However, unless you want to see them leaked in the AI's output, make sure to remove Forewords/Afterwords, Acknowledgements, Author's Notes, About the Authors, and any other sections that have nothing to do with the story or data. This also includes any author commentary or excerpts *unless* it is diegetic to (takes place within the narrative of) the story.

Chapter titles can be replaced with  alone on a newline to signify breaks between chapters or   for breaks between individual short stories as these are conventions of the base training data and the AI is used to working with them.

In general it is a good idea to remove, replace, or trim down anything too repetitive (such as numbered chapters, titles that use the same prefixes, or stylistic phrases like  repeated twenty times in a row) as this will increase the chance of these leaking into your output. That said, if you find yourself removing too much from a work it might just be a better idea to exclude it from your dataset altogether.

Cleaning Prose
Keep an eye out for odd symbols, characters, or other unusual formatting, such as card suits or other symbols used as chapter breaks, or Japanese quotation marks (「 and 」), which are often found in visual novels and can be replaced with regular quotation marks.

On this note, scanning a book into a digital copy often introduces formatting errors. You can see examples of this occasionally with underscores, such as from OCR software reading  as  , or with things like telepathic dialogue, which is typically italicized. Since raw text has no characters for italics, it will appear as __dialogue encapsulated by underscores like this__, which needs to be replaced with quotation marks or angled brackets (< and >), as the AI knows to associate them with non-verbal dialogue.

Another common scanning issue is having chapters start in all uppercase capital letters (ie. ) or with the first letter of the first word separated from the rest of the word (ie.  ).

Extra spacing is yet another common issue such as with possessive indicators (ie. ) or at the end of sentences before a newline. On that note extra newlines aren't good either as the AI tends to associate them with chapter breaks or a passage of time.

Also be on the lookout for vertical bars, which can be replaced with colons. Lastly, if there are any square brackets ([ and ]) in an unedited file, you might want to remove them unless they are encapsulating something such as an indication of time, location, point-of-view or some other note you intend to use to nudge the AI in your story.

Miscellaneous Tips

 * Don't worry about and don't include the well-known  token. It is not needed for training AI modules as the training process handles that already.
 * Don't worry about your clean-up or tokens being perfect. AI Modules are not as powerful as the actual underlying fine-tuned model and thus don't have as strong an effect as the existing model's training. If your resulting module produces at least fun or interesting results, that's still very much a success.
 * Make use of good feature-rich text editors such as Notepad++ or at least something that supports regular expressions which greatly cut down on editing time.

Recommended Dataset Tools
 * Gnurro's ReFormatter
 * Belverk's Cleaning Python Script
 * ScrapeFandom
 * Notepad++
 * Regex101

Useful Regular Expressions
A regular expression or regex is a sequence of characters that can be used to quickly find and replace more complex patterns of text. The following are some useful regular expressions for cleaning datasets. Keep in mind these are potentially highly destructive and one should exercise much care when using these. Batch replacement (ie. "Replace All") with these is not recommended.

/ / Selects headers, titles, and anything else that does not end in punctuation before a new line.

/ / Selects sequences of text in all uppercase capital letters. You can optionally replace the selected text with normal lower case (preserving the first uppercase letter) using this in the Replace field / / though keep in mind character names and other proper nouns may lose their first uppercase letter this way.

/ / Selects various problems involving quotations.

/ / Selects cases where the first letter of a word is separated by a space (common scanning error).

/ / Catches many of the auxiliary sections of a work.
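For readers who prefer scripting to a text editor, the same kinds of checks can be sketched with Python's `re` module. The patterns below are hypothetical ones written for this illustration to match the descriptions above; they are not the expressions originally listed on this page, and every match should still be reviewed by hand.

```python
import re

# Lines that do not end in sentence punctuation or a closing quote (likely headers).
NO_END_PUNCTUATION = re.compile(r"^[^\n]*[^\s.!?\"']$", re.MULTILINE)
# Runs of words written entirely in capital letters.
ALL_CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b")
# A capital letter split from the rest of its word at the start of a line.
# Also matches real one-letter words such as "A" or "I", so never batch replace.
DETACHED_FIRST_LETTER = re.compile(r"^([A-Z]) (?=[a-z])", re.MULTILINE)
# Spaces or tabs left at the end of a line.
TRAILING_WHITESPACE = re.compile(r"[ \t]+$", re.MULTILINE)

sample = "Chapter One\nT he night was dark.  \nTHE DOOR opened.\n"

print(NO_END_PUNCTUATION.findall(sample))
print(ALL_CAPS_RUN.findall(sample))
print(DETACHED_FIRST_LETTER.sub(r"\1", sample))
print(TRAILING_WHITESPACE.sub("", sample))
```

Printing the matches first, and only substituting after checking them, keeps the process as non-destructive as the guidelines recommend.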

-

Generation Presets


In order to make selecting the AI's various generation settings easier, NovelAI offers several generation presets.

Settings are divided into three categories:


 * User: Settings you have defined and saved, or imported.


 * Scenario: Settings that came included in the scenario you imported.


 * Defaults: Settings designed by NAI community researchers.

You can Import a .preset file, or export the currently selected custom preset in the same format.

Use the ➕ button to create a new preset based on the current generation settings.

Use the ✍ button to edit the preset's name.


Generation Options
These settings allow you to fine-tune the generation settings to your liking. These get really technical, so only explore them if you like messing with the finer things. Otherwise, leave them at their defaults; they're usually good as is.

Most of them deal with the Pool of possible tokens. To understand what this means, look at these examples:

could result in "up, down, left, right, across, around, fancy" and so on.

would result in fewer potential matches, such as "hot, bright, with".



Randomness (Temperature)
True to its name, the Randomness setting (or "Temperature") increases the likelihood of less-expected tokens during text generation. This works by dividing logits by the Temperature before sampling. In plain English, this means the next part of the sentence will be more unexpected, as elements that have less of a chance of appearing are granted a greater likelihood of being used.
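The effect of dividing logits by the Temperature can be seen in a small Python sketch. The function name and the sample logits are made up for illustration; the conversion from logits to probabilities uses the standard softmax.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide each logit by the temperature, then convert to probabilities."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5))  # low Randomness: sharper distribution
print(softmax_with_temperature(logits, 2.0))  # high Randomness: flatter distribution
```

A higher value flattens the distribution, so the least likely token gains probability at the expense of the most likely one.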

Max Output Length
This setting will adjust the highest number of Tokens returned at once by the AI in each Completion. It will not always hit this target, but it will never exceed it. Be aware that reducing the Output Length and enabling the Trim AI Responses option may cause undesirable effects such as very short sentences.

Trim AI Responses
A sentence delimiter separates subsequent text from the previous clause. Trimming AI responses to the last delimiter will prevent words from appearing after the latest delimiter in the generation.

Basically, this setting prevents dangling words from appearing like this:

To this:

Keep in mind that this can cause very short outputs depending on the current settings.
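A minimal Python sketch of this kind of trimming is shown below. The `trim_to_last_delimiter` helper, the choice of delimiters, and the sample sentence are assumptions made for illustration; text without any delimiter is returned unchanged here, which may not match the real behaviour.

```python
def trim_to_last_delimiter(text, delimiters=".!?"):
    """Cut the text after its last sentence delimiter, keeping a closing quote."""
    last = max(text.rfind(delimiter) for delimiter in delimiters)
    if last == -1:
        return text  # no delimiter found: leave the text as it is (assumption)
    end = last + 1
    # Keep quotation marks that directly follow the delimiter.
    while end < len(text) and text[end] in "\"'":
        end += 1
    return text[:end]


print(trim_to_last_delimiter("The door creaked open. She stepped"))
```

The dangling words after the final period are dropped, which is also why a short Completion can shrink to a single sentence.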


-

Top-K Sampling
This setting affects the pool of tokens the AI will pick from by only selecting the most likely tokens, then redistributing the probability for those that remain. The pool will only contain the K most likely tokens. If the setting is set to 10, then your pool will contain the 10 most likely tokens. (Top-10 Sampling).

In plain English, lowering this setting causes more consistent Completions at the cost of creativity.
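As a small illustration, Top-K filtering can be sketched in Python as follows. The function name and the sample probabilities are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and rescale them so they sum to 1."""
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(probability for _, probability in kept)
    return {token: probability / total for token, probability in kept}


pool = {"up": 0.4, "down": 0.3, "left": 0.2, "right": 0.1}
print(top_k_filter(pool, 2))
```

With K set to 2, only "up" and "down" remain, and their probabilities are redistributed to fill the whole pool.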

Nucleus Sampling
Relating to the previous setting, this adds up the probability of each potential Token in descending order of likelihood until it reaches the value specified. This value is a cumulative probability threshold rather than a count of Tokens - therefore, lowering this value creates a smaller subset of probable Tokens.

In plain English, lowering this setting causes more consistent Completions at the cost of creativity.

As an example, if the most likely token has 30% chance, the second 25, the third 20, the fourth 10, the fifth 5, and the sixth 3, and your setting is at 0.9 (90%), then you would do: 30+25+20+10+5 = 90. The sixth most likely token and onwards will be removed from the pool.
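The arithmetic in this example can be reproduced with a short Python sketch. The `nucleus_filter` helper and the token names are hypothetical, and whole-number percentages are used to avoid floating point rounding. The final redistribution of probability among the kept tokens is left out.

```python
def nucleus_filter(percentages, threshold):
    """Keep tokens in descending order until their total reaches the threshold."""
    ranked = sorted(percentages, key=lambda pair: pair[1], reverse=True)
    kept = []
    cumulative = 0
    for token, percent in ranked:
        kept.append(token)
        cumulative += percent
        if cumulative >= threshold:
            break
    return kept


pool = [("first", 30), ("second", 25), ("third", 20),
        ("fourth", 10), ("fifth", 5), ("sixth", 3)]
print(nucleus_filter(pool, 90))
```

The first five tokens add up to exactly 90, so the sixth token and everything after it are removed from the pool.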

Tail-Free Sampling
A tail in this context is the least-likely subset of Tokens to be chosen in a Completion. This alternative sampling method works by trimming the least-likely tokens by searching for the estimated tail's probability, removing that tail to the best of its ability, then re-normalizing the remaining sample.

This method may have a smaller impact on creativity while maintaining consistency. However, take note that it tends to behave strangely if your context does not contain a lot of data.
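One common way to estimate where the tail begins is to look at how sharply the sorted probabilities bend (their second differences). The sketch below follows that approach; whether NovelAI's implementation matches it exactly is an assumption, and the function name and the parameter `z` are illustrative.

```python
def tail_free_filter(probs: dict, z: float) -> dict:
    """Trim the low-probability tail using second differences of sorted probabilities."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    values = [p for _, p in ranked]
    if len(values) < 3:
        # Too few tokens to estimate any curvature; leave the pool alone.
        return dict(ranked)
    # First and (absolute) second differences of the sorted probabilities.
    d1 = [b - a for a, b in zip(values, values[1:])]
    d2 = [abs(b - a) for a, b in zip(d1, d1[1:])]
    total = sum(d2)
    if total == 0:
        return dict(ranked)
    keep = 1  # the most likely token is always kept
    cumulative = 0.0
    for weight in d2:
        cumulative += weight / total
        if cumulative > z:
            break  # everything from here on is treated as the tail
        keep += 1
    kept = ranked[:keep]
    # Re-normalize the remaining sample.
    norm = sum(p for _, p in kept)
    return {token: p / norm for token, p in kept}


pool = tail_free_filter({"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.05, "e": 0.05}, z=0.5)
print(sorted(pool))
```

With very few candidate tokens the second differences carry little information, which is one way to picture why the method can behave strangely on sparse data.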

Repetition Penalty
Because text generation is based on patterns, repetition is a constant concern. The Repetition Penalty introduces an artificial dampener to the probability of a token depending on the frequency of its appearance in the Current Context.

As such, increasing this value makes a word less likely to appear for each time it shows up in the text. Do take note that this can get really awkward with words that are recurrent in the current context, such as names, or objects being discussed. With high Repetition Penalty, the AI may find itself unable to use a word repeatedly, and will need to substitute it with another which may be inappropriate.

Repetition Penalty Range
Defines the number of tokens that will be checked for repetitions, starting from the last token generated. The larger the range, the more tokens are checked.
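A minimal sketch of how a penalty and its range might interact is shown below. It assumes the penalty is applied to raw scores (logits) once per occurrence inside the range; the function name, the dictionary representation, and that per-occurrence scaling are all assumptions rather than NovelAI's documented formula.

```python
from collections import Counter


def apply_repetition_penalty(logits: dict, context: list,
                             penalty: float, penalty_range: int) -> dict:
    """Dampen the score of tokens that already appear in the recent context."""
    # Only the last `penalty_range` tokens are checked for repetitions.
    recent = context[-penalty_range:] if penalty_range > 0 else []
    counts = Counter(recent)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token not in adjusted:
            continue
        # Assumption: the penalty compounds for each time the token shows up.
        factor = penalty ** count
        score = adjusted[token]
        # Dividing a positive score or multiplying a negative one both
        # make the token less likely.
        adjusted[token] = score / factor if score > 0 else score * factor
    return adjusted


scores = {"cat": 2.0, "dog": 1.0, "the": -1.0}
story = ["the", "cat", "the", "cat", "sat"]
print(apply_repetition_penalty(scores, story, penalty=2.0, penalty_range=3))
```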

Repetition Penalty Slope
The penalty to repeated tokens is applied differently based on distance from the final token. The distribution of that penalty follows an S-shaped curve. If the slope is set to 0, that curve will be completely flat and all tokens will be penalized equally. If it is set to a very high value, it will act more like two steps: early tokens will receive little to no penalty, but later ones will be considerably penalized.
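To visualize the curve, the sketch below builds one weight per token in the penalty range using a tanh-based S-curve. The exact curve NovelAI uses is not stated here, so the formula, the function names, and the way weights scale the penalty are hypothetical; the sketch only reproduces the two behaviors described above (flat at 0, step-like at high values).

```python
import math


def slope_weights(n: int, slope: float) -> list:
    """Return one weight per token in the range, oldest first, newest last."""
    if n <= 0:
        return []
    if n == 1:
        return [1.0]
    weights = []
    for i in range(n):
        x = 2 * i / (n - 1) - 1  # -1 for the oldest token, +1 for the newest
        # S-shaped curve, scaled so the newest token always has weight 1.
        weights.append((1 + math.tanh(slope * x / 2)) / (1 + math.tanh(slope / 2)))
    return weights


def effective_penalties(penalty: float, n: int, slope: float) -> list:
    """Blend between no penalty (1.0) and the full penalty using the weights."""
    return [1 + (penalty - 1) * w for w in slope_weights(n, slope)]


print(slope_weights(5, 0))    # flat: every token weighted equally
print(slope_weights(5, 50))   # close to two steps: early tokens near 0
```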

Min Output Length
Much like Output Length, this setting adjusts the lowest number of Tokens returned at once by the AI in each Completion. The output will never be below this target, but may exceed it. Setting this option towards its maximum also makes the AI more likely to produce long sentences.

End of Sampling Token ID
If you wish to end your Completions upon reaching a specific Token, simply add it to this field. Doing so will cause the Completion to end prematurely upon generating the Token.

This can be used to trim the output to single sentences by inputting punctuation, or to increase the accuracy of Lorebook entries by pausing Completion when a defined word is reached.
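The stopping behavior can be pictured with a toy generation loop. Everything here is illustrative: `next_token` stands in for the model, tokens are shown as strings rather than numeric IDs, and keeping the end token in the output is an assumption.

```python
from typing import Callable, List, Optional


def generate(next_token: Callable[[List[str]], str],
             max_length: int,
             end_token: Optional[str] = None) -> List[str]:
    """Collect tokens until max_length is reached or end_token is generated."""
    output: List[str] = []
    while len(output) < max_length:
        token = next_token(output)
        output.append(token)
        if end_token is not None and token == end_token:
            break  # the Completion ends as soon as the end token appears
    return output


# A canned "model" that simply replays a fixed list of tokens.
canned = iter(["Hello", " world", ".", " More", " text"])
print(generate(lambda so_far: next(canned), max_length=10, end_token="."))
```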

Ban Token
Any Tokens added here will have their likelihoods reduced to zero. This means they will not appear in Completions. As this adjusts the relationships between Tokens, this will have an impact on the phrasing chosen by the AI. Be careful about what you ban, because this can heavily disrupt output if used incorrectly.

When you add a new token to the banlist, it also adds every case variation of the token, as well as the token with a preceding space.

To prevent that from happening, and exclusively ban this token, add curly braces { } around the token you want to ban before pressing enter.
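The expansion rule can be sketched as follows. The function names are hypothetical, and "case variation" is assumed here to mean lowercase, uppercase, and capitalized forms; the real banlist may generate a different set and works on token IDs rather than strings.

```python
def expand_ban_entry(entry: str) -> set:
    """Return every token string that a single banlist entry would ban."""
    # Curly braces mark an exclusive ban: only the exact token is banned.
    if len(entry) >= 2 and entry.startswith("{") and entry.endswith("}"):
        return {entry[1:-1]}
    # Assumed case variations: as typed, lowercase, uppercase, capitalized.
    cases = {entry, entry.lower(), entry.upper(), entry.capitalize()}
    # Each variation is also banned with a preceding space.
    return cases | {" " + variant for variant in cases}


def apply_bans(logits: dict, banned: set) -> dict:
    """Give banned tokens a score of negative infinity, i.e. zero likelihood."""
    return {t: (float("-inf") if t in banned else s) for t, s in logits.items()}


print(sorted(expand_ban_entry("cat")))
print(expand_ban_entry("{cat}"))
```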

Banned Tokens
Relating to the previous setting, this field shows every Token currently blacklisted for generation. Clicking one of these tags will remove it from the list.

If you inserted an exclusive token (using curly braces), it will be displayed with brackets around it.

Ban Bracket Generation
At times, you may wish to include hints to the AI that are not considered for text generation. These can be encapsulated in square brackets ([ and ]) to relay information that will affect the Current Context while not being considered part of the actual text.

These most often take the form of hints. For more information on what to put between square brackets, see Keeping Track.

⬆ Return to Page Top

Duplication, Import and Export
At the bottom of the options menu, you'll find various options for saving and loading stories and other data.


 * Duplicate Story creates a copy of the story in your library and switches to it. The story will have "- Copy" appended to its name.
 * Export Generation Settings exports only the AI settings you have currently selected. This is useful if you find settings you believe are worth sharing.
 * Export Story allows you to download the story as a .story file, copy the entire story JSON to your clipboard (can cause massive lag!) or download a .scenario file, if you are using placeholders.
 * Import is used to import anything created by NovelAI with the exception of themes. This means .story, .lorebook, .scenario and .generationsetting files. Alternatively, you can drag & drop these files to the main text box to import them.

You may export the entire story Lorebook as a .lorebook file by clicking the 📥 Export Lorebook button at the top left of the Lorebook window.

⬆ Return to Page Top

= Content Creation =

Importing
You can create exportable scenarios from the options menu. You import them the same way you import a story: by dropping the file on the webpage, or by clicking the import story button when creating a new story.

Importing a scenario with placeholders
When importing a scenario, you may be asked to fill in some information. Those are called placeholders. You can simply edit those fields to your liking.



Turning on Import Settings will import the Generation settings that were used by the Scenario author.

Adding a placeholder
You can add a placeholder anywhere within your story's prompt, memory, author's note and lorebook entries.

Placeholders have to be written in this format: ${order#id[default]:description}. The placeholder is divided into four parts:


 * order: the order in which the placeholders will be displayed. 1 goes first, then 2, etc.
 * id: the only mandatory part of the placeholder, it has to be unique. If you have more than one instance of the id, it will use the same value for each of the placeholders.
 * default: the default content of the field when importing the value.
 * description: the text displayed above the input field when importing the value. If there is no description set, the text displayed will be the id. You may want to word this like a question for ease.

To get the field from the above picture, you have to enter

You cannot put these characters inside the text fields of the placeholder: $ {} [] # : @ ^ |
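For authors who want to check a prompt before exporting it, the four parts can be pulled apart with a regular expression. The sketch below assumes the syntax ${order#id[default]:description}, with everything except the id being optional; the pattern, the class, and the function name are illustrative and not part of NovelAI.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed syntax: ${order#id[default]:description}
PLACEHOLDER = re.compile(
    r"\$\{"
    r"(?:(?P<order>\d+)#)?"          # optional order number and pound sign
    r"(?P<id>[^\[\]:{}#]+)"          # mandatory id
    r"(?:\[(?P<default>[^\]]*)\])?"  # optional default in square brackets
    r"(?::(?P<description>[^}]*))?"  # optional description after a colon
    r"\}"
)


class Placeholder(NamedTuple):
    order: Optional[int]
    id: str
    default: str
    description: str


def find_placeholders(text: str) -> List[Placeholder]:
    """Extract every placeholder found in a prompt, memory, or lorebook text."""
    found = []
    for m in PLACEHOLDER.finditer(text):
        order = int(m["order"]) if m["order"] is not None else None
        # With no description set, the id is displayed instead.
        found.append(Placeholder(order, m["id"], m["default"] or "",
                                 m["description"] or m["id"]))
    return found


print(find_placeholders("You are ${1#name[Alice]:What is your name?}, a ${job}."))
```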

Note for the lorebook entries
For the lorebook entries, you can add placeholders on the title of an entry, its descriptions and its keys. If you want to use regex for your keys, you have to prefix the expression with a  for example to match with the name of a character, you have to write

Placeholder-filling Order
Placeholders are requested to be completed by the user in the alphabetical order of their ID. This means that if you start their id with a number that increments with each placeholder, they will be requested in the order of that number.

Example: Job will be requested before Gender, because its ID comes before it.

You can also define the order by preceding the id with a number, followed by a pound sign (#):

If different entries have the same Order number, they will be processed alphabetically according to their id.
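The resulting ordering can be expressed as a sort key. In this sketch each placeholder is an (order, id) pair, and placeholders without an Order number are assumed to come after the numbered ones; that fallback, like the function name, is an assumption.

```python
from typing import List, Optional, Tuple


def request_order(placeholders: List[Tuple[Optional[int], str]]) -> List[str]:
    """Return placeholder ids in the order the user would be asked for them."""
    ordered = sorted(
        placeholders,
        # Numbered entries first (by number), then alphabetically by id.
        key=lambda item: (item[0] is None, item[0] or 0, item[1]),
    )
    return [placeholder_id for _, placeholder_id in ordered]


print(request_order([(2, "job"), (1, "name"), (2, "gender"), (None, "age")]))
```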

Placeholder Table of Contents
You can create a Table of Contents for placeholders where you can insert a large amount of them in advance, allowing you to easily keep track of all the placeholders you have defined.

The syntax is identical to normal placeholders, with these notable differences:


 * It must be inserted at the absolute top of the prompt.
 * The initiating symbol is a percentage sign (%) rather than a dollar sign.
 * Every Placeholder must be on its own new line.



⬆ Return to Page Top

Formats
Formats are different ways of writing Memory, Author's Note and Lorebook entries. Contributor Valahraban wrote an extensive research report that goes into great detail on several formats, their utility, and how to use them.

NovelAI does not recommend, endorse, or otherwise support any format type in particular.

⬆ Return to Page Top

-

= Account Settings =

The Account Settings can be accessed by opening the Story Library, clicking on the Menu button (Ξ), then Settings.

There are six settings currently:


 * Default Storage Location: Switches between using Local Storage (Off) and Serverside storage (On) by default.
 * Force 1024 Token Context Limit: If you have the Scroll or Opus Subscription plan, you can choose to use the Tablet plan context window by turning this setting On. This lets you check how a story behaves for users on the smaller plan, for example.
 * Model Selection: Allows you to choose between Calliope, the original finetuned 2.7B GPT-Neo model used in the Alpha, or Sigurd, a finetuned GPT-J 6B model.
 * Trim Trailing Spaces: If a loose space is detected at the end of the text, it will be cleaned up. Trailing whitespace can cause generation inconsistency.
 * Gesture Controls: Enables swipe-based controls for the side tabs and the input box on mobile devices and touchscreens.
 * Trim AI Responses Default: Sets the default on/off state for the Trim AI Responses setting.

⬆ Return to Page Top