How Chatzy AI generates responses

When you upload documents or connect data sources, Chatzy AI doesn’t just memorize everything at once.
Instead, it uses a structured process to break down, store, and retrieve information so answers are accurate, relevant, and fast.
This guide explains how answers are generated, why chunking matters, what the context window is, and how you can customize these settings.

How Answers Are Generated

1. User asks a question
   The AI checks your connected knowledge base.
2. Search and retrieval
   Instead of scanning the entire knowledge base, it works with content that has been split into smaller pieces (called chunks) and retrieves only the most relevant ones.
3. Ranking and filtering
   Both semantic (meaning-based) and keyword-based search are used to find the best matches.
4. Context building
   Relevant chunks, past conversation history, and the base system prompt are merged into a single context window.
5. Answer generation
   The AI uses this context to produce the final response.
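
The ranking and context-building steps above can be illustrated with a minimal sketch. This is not Chatzy AI's actual implementation: the `ScoredChunk` type, the `semantic_weight` parameter, and the order in which the parts are joined are all assumptions made for illustration, and the two scores are assumed to be precomputed and normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    text: str
    keyword_score: float   # assumed: normalized keyword-match score in [0, 1]
    semantic_score: float  # assumed: normalized embedding similarity in [0, 1]


def hybrid_score(chunk: ScoredChunk, semantic_weight: float = 0.5) -> float:
    """Blend both scores; semantic_weight is a hypothetical tuning knob."""
    return (semantic_weight * chunk.semantic_score
            + (1 - semantic_weight) * chunk.keyword_score)


def build_context(system_prompt: str, history: list[str],
                  chunks: list[ScoredChunk], question: str,
                  top_k: int = 3) -> str:
    """Keep the top_k chunks by hybrid score and merge everything into one string."""
    ranked = sorted(chunks, key=hybrid_score, reverse=True)[:top_k]
    parts = [system_prompt]
    parts += [chunk.text for chunk in ranked]
    parts += history
    parts.append(question)
    return "\n\n".join(parts)
```

Raising `semantic_weight` favors meaning-based matches, while lowering it favors exact keyword matches.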

What Is Chunking

Chunking is the process of splitting text into smaller, meaningful parts.
Without chunking, retrieval is less efficient and the AI might miss details.
With chunks that are too small, it may lose the surrounding context. The goal is the right chunk size: large enough to preserve meaning, small enough to be searchable.
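
As a rough illustration of the idea, the sketch below splits text into chunks of a target character length, breaking only on whitespace so words stay intact. The function name `chunk_text` and the `max_chars` parameter are hypothetical, and Chatzy AI's own splitter may work differently (for example, by respecting sentence or section boundaries).

```python
def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of up to max_chars, breaking on whitespace.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The next word would overflow, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```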

Ideal Chunk Size

Every dataset is unique, but here are general guidelines:

Small chunks (300–600 characters)
- Better precision for fact-based queries
- Risk of missing surrounding context
Medium chunks (800–1,500 characters)
- Balanced precision and context retention
- Recommended default for most documents
Large chunks (2,000+ characters)
- Preserve flow of long narratives (e.g., policies, manuals)
- May include extra irrelevant content

Formula:
No. of chunks × chunk length = context length

Example:
3 chunks × 1,500 characters = 4,500 characters of context available for answer generation.
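
The arithmetic can be checked with a few lines of code. Because chunk sizes here are given in characters while context windows are usually measured in tokens, the sketch also includes a very rough conversion. The figure of 4 characters per token is only a common rule of thumb for English text, not a Chatzy AI setting, and real tokenizers vary.

```python
import math


def retrieved_context_chars(num_chunks: int, chunk_length: int) -> int:
    """No. of chunks x chunk length = context length (in characters)."""
    return num_chunks * chunk_length


def estimate_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough estimate only; chars_per_token is an assumed rule of thumb."""
    return math.ceil(num_chars / chars_per_token)


chars = retrieved_context_chars(3, 1500)   # 4500 characters
approx_tokens = estimate_tokens(chars)     # 1125 under the 4-chars-per-token assumption
```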

What Is the Context Window

The context window is the maximum amount of text the AI can consider at once.
It includes:

- Base prompt (system rules and personality)
- Selected knowledge chunks
- Conversation history (if enabled)
- Your current question

When the limit is reached, older or less relevant text is pushed out.

Example:
If the AI has a 16k-token context window and you feed it 20k tokens, about 4k tokens will not fit and will be cut off.
This is why good chunking and retrieval are crucial.
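
One possible way to stay within the limit is to drop the oldest conversation messages first. The sketch below shows that policy only as an assumption; the exact rule Chatzy AI applies when text is pushed out is not specified here. `count_tokens` is a crude stand-in that counts whitespace-separated words, not a real tokenizer, and `fit_history` is a hypothetical helper name.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_history(fixed_parts: list[str], history: list[str], window: int) -> list[str]:
    """Keep the newest history messages that fit after the fixed parts.

    fixed_parts holds text that must always be included (base prompt,
    selected chunks, current question). Oldest messages are dropped first.
    """
    budget = window - sum(count_tokens(part) for part in fixed_parts)
    kept: list[str] = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(history):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return list(reversed(kept))
```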

Customization Options

Chatzy AI lets you adjust how data is processed and retrieved:

- Chunk length and count – Adjust how your data is split
- Hybrid search – Combine keyword and semantic matching
- Historical context – Control how many past messages are remembered
- Max tokens – Limit maximum response length
- Token distribution – Decide how much of the window is used for knowledge, history, or reasoning

These settings help you balance accuracy, speed, and cost efficiency.
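
To make the token distribution idea concrete, here is a small sketch that divides a context window among knowledge, history, and reasoning according to percentage shares. The function name `split_window`, the share names, and the 50/30/20 split are illustrative assumptions, not Chatzy AI defaults or its actual settings format.

```python
def split_window(window_tokens: int, shares: dict[str, int]) -> dict[str, int]:
    """Divide window_tokens proportionally to integer shares (e.g. percentages)."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    # Integer division keeps every budget a whole number of tokens.
    return {name: window_tokens * share // total for name, share in shares.items()}


budgets = split_window(16_000, {"knowledge": 50, "history": 30, "reasoning": 20})
# budgets == {"knowledge": 8000, "history": 4800, "reasoning": 3200}
```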

Best Practices

- Use medium-sized chunks (800–1,500 characters) for most cases
- Enable hybrid search when your data includes both structured terms and natural language
- Keep critical information together in the same chunk (e.g., policy rule with its explanation)
- If answers feel too vague, try smaller chunks; if too fragmented, try larger chunks

Summary

Chatzy AI generates answers by retrieving relevant chunks, combining them with conversation context, and producing a natural reply.
Chunking ensures data is manageable and searchable.
The context window defines how much information the AI can process at once.
You can customize settings to optimize accuracy and cost for your specific use case.

By understanding these concepts, you can design knowledge bases that are faster, smarter, and more reliable in answering user queries.
