Documentation Index
Fetch the complete documentation index at: https://docs.chatzy.ai/llms.txt
Use this file to discover all available pages before exploring further.
Chunking
Chunking
Chunking is the process of splitting large documents or datasets into smaller, manageable pieces called chunks.
Chunking ensures that long documents remain searchable, indexable, and retrievable.
Why Chunking Matters
AI models can only process a limited number of tokens at once.Chunking ensures that long documents remain searchable, indexable, and retrievable.
Best Practices
- Smaller chunks → Higher precision (AI focuses better) but may lose broader context
- Larger chunks → Preserve context but may retrieve less relevant information
- The right balance depends on your data and use case
Summary
Chunking makes your data searchable and structured, preparing it for retrieval systems like RAG.RAG (Retrieval-Augmented Generation)
RAG (Retrieval-Augmented Generation)
RAG (Retrieval-Augmented Generation) improves chatbot accuracy by combining knowledge retrieval with LLM generation.
Instead of relying only on the model’s memory, responses are grounded in your uploaded data.
✅ Grounds answers in your data
✅ Works well for large knowledge bases
✅ Configurable retrieval behavior
✅ More accurate and trustworthy responses
In short: Chunking makes your data searchable, and RAG ensures the model uses the right information at the right time.
Instead of relying only on the model’s memory, responses are grounded in your uploaded data.
How RAG Works in Chatzy
1) Knowledge Base Upload & Processing
- Users upload their Knowledge Base in the Data section
- The system:
- Splits content into chunks
- Generates vector embeddings
- Stores them in the database
- Number of chunks
- Chunk length
2) Query Understanding
When a user sends a message:- The system checks if KB(knowledge base) sources are available
- A lightweight GPT model analyzes:
- User message
- Conversation history
- It generates a search query based on user intent
null.3) Retrieval Step
If query is null- No KB context is attached
- The LLM answers normally
- Vector search is performed on the KB
- Top N relevant chunks are retrieved
4) Hybrid Search (Optional)
Hybrid Search combines similarity search and text (keyword) search into a final relevancy score.How it works:- Extra chunks are fetched as a broad match
- This step may include some noise
- The system then reranks results
- Only the top N most relevant chunks are kept
- Better precision
- Less irrelevant context
- Higher response quality
5) Generation Step
- Retrieved chunks are combined
- Added to the system prompt as context
- Sent to the LLM with instructions
RAG Process Flow
Benefits of RAG
✅ Reduces hallucinations✅ Grounds answers in your data
✅ Works well for large knowledge bases
✅ Configurable retrieval behavior
✅ More accurate and trustworthy responses
In short: Chunking makes your data searchable, and RAG ensures the model uses the right information at the right time.
Advance Settings for Conversational AI Agent
Advance Settings for Conversational AI Agent
Location Request
If your chatbot needs the user’s live location during the conversation, you can instruct it to send a location request button. This is useful for scenarios such as booking a service visit, delivery verification, or assigning the nearest agent.Important:You must clearly mention the use case inside your Base Prompt, explaining when the bot should send this JSON.Use this when:
- You need the user’s address or live location for delivery, service visits, event check-ins, etc.
Call Permission Request
Use this when your AI agent needs to ask the user for permission to place a WhatsApp call (required for business-initiated calls).Why it’s needed:- WhatsApp requires explicit user consent before initiating an outbound call.
- Sales follow-ups
- Demo or consultation calls
- Appointment confirmation calls
Specify when the bot should request permission inside your Base Prompt along with json so it triggers appropriately.(e.g., “If the user requests a callback, use ONLY the following for call permission request”).JSON format:
WhatsApp Call Button
This option allows the agent to display a Call Now button directly inside the conversation.Use this when:- You want the user to start the call themselves (no approval template required).
- Ideal for support lines, quick escalation, or urgent help.
In your Base Prompt, describe in which situations this button should appear (e.g., “If the user asks to talk to support, use ONLY the following for call button”).JSON format: