Introduction

A well-structured knowledge base is essential for your AI Agents to deliver useful, accurate, and coherent responses. If the documents you upload are poorly structured, incomplete, or hard to interpret, the AI Agent may provide incorrect or fabricated answers. On the other hand, when the information is clear, organized, and intentionally written, the AI Agent better understands the context and delivers higher-quality responses. Creating a solid knowledge base is like training a new team member: you need to explain things clearly, provide well-crafted documents, and ensure they understand them.


Use AI-readable formats

AI Agents can only read digital text. Scanned files, or files whose text is embedded in an image, are unusable even if they look good visually.

Recommended: .docx, .txt, .md, .pdf (only if the text is selectable)

Examples

Correct: A .docx document exported from Word using heading styles, paragraphs, and lists.

Correct: A plain .txt file with sections separated by titles.

⚠️ Acceptable: A .pdf generated from Word or another digital tool, as long as you can select the text with your mouse.

Incorrect: A PDF scanned from paper, or any image-based file (even if it looks professional, the AI won’t be able to interpret it).

How can you tell if your file is valid? Open the PDF and try to select and copy the text. If you can’t, it won’t work for the AI. Don’t use screenshots or photos of documents as source files.
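
If you have many files to review, a first pass by file extension can be automated. The sketch below is only an illustration: the function name and the three labels are hypothetical, and the extension alone cannot tell you whether a PDF has selectable text, so PDFs are flagged for the manual select-and-copy check described above.

```python
from pathlib import Path

# Formats the guide recommends without conditions.
TEXT_FORMATS = {".docx", ".txt", ".md"}


def classify_source_file(filename: str) -> str:
    """Return 'ok', 'check', or 'not_recommended' from the extension alone."""
    ext = Path(filename).suffix.lower()
    if ext in TEXT_FORMATS:
        return "ok"
    if ext == ".pdf":
        # Selectable text cannot be confirmed from the extension; check by hand.
        return "check"
    # Screenshots, photos, and any other format fall through to here.
    return "not_recommended"


for name in ["Billing Policies.docx", "Product Catalog.pdf", "scan_001.png"]:
    print(name, "->", classify_source_file(name))
```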


Clear and topic-based structure

An effective knowledge base is organized by topic. Don’t cram everything into a single long, ambiguous document. Use titles and subtitles, and segment the content by purpose. Do this:

  • Split your content into separate files by theme or department (e.g., “Billing Policies”, “Delivery FAQs”, “Product A Technical Sheet”).
  • Use hierarchical headings (Main title, Subtitle, Key points).

Examples

Correct: A document about warranties with sections like “What’s covered?”, “Duration”, “How to file a claim”.

Incorrect: A file that mixes sales policies, contact info, product specs, and support in one continuous page.

Add a table of contents at the beginning if the document has more than 3 sections. This helps AI Agents quickly navigate the content.
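
For .md files, a table of contents can be generated from the headings. This is a minimal sketch under two assumptions: headings use the Markdown "#" prefix, and every heading counts as a section. The function name is hypothetical, and lines that start with "#" inside code samples would be picked up by mistake.

```python
import re

# Matches Markdown headings such as "## Duration".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def build_toc(markdown_text: str, min_sections: int = 4) -> str:
    """Return an indented table of contents, or '' for short documents."""
    headings = []
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    # "More than 3 sections" means at least 4 headings.
    if len(headings) < min_sections:
        return ""
    top = min(level for level, _ in headings)
    return "\n".join(
        "  " * (level - top) + "- " + title for level, title in headings
    )


doc = "# Warranty\n## What's covered?\n## Duration\n## How to file a claim\n"
print(build_toc(doc))
```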


Write with complete context

Each text fragment must make sense on its own. The AI often reads a single paragraph in isolation, so that paragraph shouldn’t depend on titles or earlier content to be understood. Do this:

  • Write with explicit context.
  • Include product names, locations, dates, or conditions directly in the sentence.

Examples

Correct: “The warranty policy for electric cars is valid in Mexico and lasts 24 months from the purchase date.”

Incorrect: “It has a 2-year coverage” (What is covered? Where? Starting when?)
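
One rough way to spot paragraphs that lean on earlier context is to look for ones that open with a pronoun. The sketch below is a heuristic only: the word list and function name are assumptions, and it will also flag legitimate openings such as “This warranty policy…”, so treat the result as a list to review, not a list of errors.

```python
import re

# Paragraph openers that usually point back to earlier text.
VAGUE_OPENERS = re.compile(r"^\W*(it|this|that|they|these|those)\b", re.IGNORECASE)


def find_context_free_paragraphs(text: str) -> list[str]:
    """Return paragraphs that start with a pronoun-like word."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [p for p in paragraphs if VAGUE_OPENERS.match(p)]


sample = (
    "The warranty policy for electric cars is valid in Mexico "
    "and lasts 24 months from the purchase date.\n\n"
    "It has a 2-year coverage."
)
for paragraph in find_context_free_paragraphs(sample):
    print("Review:", paragraph)
```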


Use clear, precise, and consistent language

AI does not interpret ambiguity, jargon, or tangled phrases like a human would. It’s essential to write clearly, avoid unnecessary synonyms, and define all terms. Do this:

  • Use short and direct sentences.
  • Stick to the same term for each concept (e.g., don’t switch between “client” and “user”).
  • Define acronyms the first time you use them.

Examples

Correct: “The CRM (Customer Relationship Management) helps manage all prospects.”

Incorrect: “The lead tool or user system allows…” (confusing and inconsistent)

Run a quick search in your document to ensure you’re consistently using the same key terms. This minimizes semantic errors in AI Agent responses.
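
The quick search can be scripted when a document is long. This is a small sketch, assuming you already know which competing terms to look for; the function name is hypothetical, and the matching is a simple case-insensitive whole-word search that also accepts a trailing “s”.

```python
import re
from collections import Counter


def count_term_variants(text: str, variants: list[str]) -> Counter:
    """Count whole-word occurrences of each variant (singular or plural)."""
    counts = Counter()
    for term in variants:
        pattern = r"\b" + re.escape(term) + r"s?\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = "Each client gets a login. The user can reset it. Clients may contact support."
counts = count_term_variants(sample, ["client", "user"])
used = [term for term, n in counts.items() if n > 0]
if len(used) > 1:
    print("Mixed terms found:", dict(counts))
```

If more than one variant appears, pick one term and replace the others throughout the document.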


Use templates and repeatable formats

When uploading many similar documents (like product sheets, processes, or guides), using a consistent structure helps the model recognize patterns.

Do this:

  • Use fixed templates for each document type.
  • Keep the same order of sections in every file.

Examples

Correct: Each product includes: name, description, specifications, FAQs, support contact.

Incorrect: Each product sheet has a different layout; some are missing descriptions, and others have no separation between sections.

Create a base template you can duplicate for every new document. This keeps documents consistent with little effort and speeds up your documentation process.
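
A template can also be checked automatically. The sketch below assumes each section title sits on its own line and matches the template wording exactly (ignoring case); the section list mirrors the product example above, and the function name is hypothetical.

```python
REQUIRED_SECTIONS = ["Name", "Description", "Specifications", "FAQs", "Support contact"]


def check_template(text: str, required: list[str] = REQUIRED_SECTIONS) -> tuple[list[str], bool]:
    """Return (missing sections, whether the found sections are in template order)."""
    lines = [line.strip().lower() for line in text.splitlines()]
    positions = {}
    for section in required:
        key = section.lower()
        if key in lines:
            positions[section] = lines.index(key)
    missing = [s for s in required if s not in positions]
    found = [positions[s] for s in required if s in positions]
    return missing, found == sorted(found)


sheet = "Name\nProduct A\n\nSpecifications\n10 kg\n\nDescription\nA compact unit.\n"
missing, in_order = check_template(sheet)
print("Missing sections:", missing)
print("Sections in template order:", in_order)
```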


Include questions and examples

Anticipating the user’s questions helps the model better understand natural language. Including well-written Q&A examples significantly improves the quality of generated responses. Do this:

  • Include a small FAQ section.
  • Write real or representative questions with clear answers.

Examples

Correct:

Q: Do you offer international shipping?
A: We currently ship within Mexico only. More countries will be added soon.

Incorrect: Stating only “we don’t offer international shipping” out of context or as an item in a bulleted list.


Eliminate redundancy and contradictions

Two files with different versions of the same data can confuse the AI and lead to incorrect answers. The AI won’t “choose the best” version; it will mix whatever it finds.

Do this:

  • Centralize key data (pricing, dates, conditions).
  • Keep only the most recent version of each document.

Examples

Correct: A single file titled “Updated Warranty Policies - March 2025.pdf”

Incorrect: Three different files stating the warranty is 6, 12, and 24 months respectively.
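
Conflicts like this can be caught before upload by comparing the same data point across files. The sketch below is an illustration under narrow assumptions: it only looks for a number of months mentioned in the same sentence as the word “warranty”, and the file names and function name are hypothetical.

```python
import re

# A number of months that follows "warranty" within the same sentence.
WARRANTY_MONTHS = re.compile(r"warranty[^.\n]*?(\d+)\s*months?", re.IGNORECASE)


def find_conflicting_warranty_terms(documents: dict[str, str]) -> dict[int, list[str]]:
    """Map each warranty length to the files stating it; empty if they agree."""
    by_value: dict[int, list[str]] = {}
    for name, text in documents.items():
        for match in WARRANTY_MONTHS.finditer(text):
            by_value.setdefault(int(match.group(1)), []).append(name)
    return by_value if len(by_value) > 1 else {}


docs = {
    "policy_old.txt": "The warranty lasts 6 months.",
    "policy_2024.txt": "Warranty coverage is 12 months.",
    "policy_2025.txt": "The warranty is valid for 24 months.",
}
for months, files in find_conflicting_warranty_terms(docs).items():
    print(months, "months stated in", files)
```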


Keep your knowledge base up to date

Your agent’s quality also depends on how current your information is. Documents should always be relevant and up to date. Do this:

  • Review your knowledge base monthly or quarterly.
  • Revisit it whenever there are changes to your products or services.
  • Remove outdated files and upload updated versions with clear dates.

Examples

Correct: “Product Catalog - updated April 2025”

Incorrect: Mixed documents with no clarity on which one is current.
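
If you track the last update date of each document, a review reminder is easy to script. This sketch assumes a quarterly review window of 90 days; the document names, dates, and function name are made up for illustration.

```python
from datetime import date


def documents_due_for_review(
    last_updated: dict[str, date], today: date, max_age_days: int = 90
) -> list[str]:
    """Return the names of documents older than the review window, sorted."""
    return sorted(
        name
        for name, updated in last_updated.items()
        if (today - updated).days > max_age_days
    )


inventory = {
    "Product Catalog": date(2025, 4, 1),
    "Warranty Policies": date(2024, 11, 15),
}
print(documents_due_for_review(inventory, today=date(2025, 6, 1)))
```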


What to include by industry

Each industry has key information that should always be present in its documents. Make sure to include these details so your agents have full context:

Industry     | Key information to include
Education    | Enrollment dates, academic levels, course formats
Healthcare   | Procedures, schedules, type of services
Real Estate  | Location, pricing, contact methods
Billing      | Payment methods, invoice validity, support contact
Automotive   | Available models, warranties, dealership locations
Jewelry      | Product types, materials, return policies

Example: In the healthcare industry, a document that just says “available in the morning” isn’t enough. It’s better to write: “Dr. Ramirez sees patients at Clínica del Sur from Monday to Friday, 8:00 AM to 2:00 PM.”