Knowledge Base & RAG

Overview

In ZGI, the knowledge base is the core module that connects private enterprise data with large-model applications. It uses RAG (Retrieval-Augmented Generation) to turn unstructured content such as documents, FAQs, policies, product information, and research reports into searchable, recallable, and manageable knowledge assets, so that agents and workflows answer questions from reliable context rather than relying solely on the model’s own memory.

Core value: Turn enterprise documents into a contextual layer that AI can retrieve and cite, improving answer accuracy, reducing hallucinations, and supporting ongoing maintenance by workspace, folder, document, and segment.

Core concepts

Concept	Description
Knowledge base	A collection of knowledge built for a specific business domain or data source; it can be managed by workspace or organized into folders.
Document	Raw data imported into the knowledge base. The system records its language, word count, hit count, segment count, index status, and file metadata.
Segment	The smallest recall unit after a document has been cleaned and split. Segments can be enabled, disabled, edited, deleted, or imported in batches.
Sub-segment	Used for finer-grained content maintenance; suited to long paragraphs, hierarchical material, and scenarios that need precise recall.
Retrieval configuration	Parameter settings such as semantic retrieval, graph retrieval, Top K, score threshold, and Rerank reordering.
Hit test	Validates recall results against real questions before an app is released; supports single tests, external knowledge base tests, batch tests, and test reports.
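The hierarchy in the table above can be pictured as a small data model. The following is a minimal sketch only: the class names, field names, and sample content are illustrative assumptions, not ZGI's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubSegment:
    text: str


@dataclass
class Segment:
    text: str
    enabled: bool = True   # assumption: disabled segments are skipped at recall time
    hit_count: int = 0
    sub_segments: List[SubSegment] = field(default_factory=list)


@dataclass
class Document:
    name: str
    language: str
    index_status: str = "pending"   # hypothetical status value
    segments: List[Segment] = field(default_factory=list)

    @property
    def segment_count(self) -> int:
        return len(self.segments)


@dataclass
class KnowledgeBase:
    name: str
    workspace: str
    documents: List[Document] = field(default_factory=list)

    def recallable_segments(self) -> List[Segment]:
        """Return every enabled segment across all documents."""
        return [s for d in self.documents for s in d.segments if s.enabled]


# Illustrative sample data (not taken from any real knowledge base).
kb = KnowledgeBase(name="HR policies", workspace="hr")
doc = Document(name="reimbursement.md", language="en")
doc.segments.append(Segment("Submit receipts within 30 days."))
doc.segments.append(Segment("Outdated clause.", enabled=False))
kb.documents.append(doc)
```

In this sketch, disabling a segment removes it from `recallable_segments()` without deleting it, which mirrors the enable/disable maintenance described in the table.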

RAG workflow

Prepare data

Create a knowledge base and upload documents, or continue to add information to an existing knowledge base.

Parsing and cleaning

The system parses, cleans, and segments files, and generates indexing tasks.
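To make the cleaning and segmentation step concrete, here is a minimal sketch that normalizes whitespace and packs paragraphs into segments up to a size limit. The `clean` and `segment` functions and the `max_chars` parameter are assumptions made for illustration; ZGI's real parser and segmentation rules are not described in this page.

```python
import re
from typing import List


def clean(text: str) -> str:
    """Collapse runs of spaces/tabs and trim whitespace on each line."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines)


def segment(text: str, max_chars: int = 200) -> List[str]:
    """Split cleaned text on blank lines, then pack paragraphs into segments
    of at most max_chars characters. A single paragraph longer than
    max_chars is kept whole rather than cut mid-sentence."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", clean(text)) if p.strip()]
    segments: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            segments.append(current)
            current = para
    if current:
        segments.append(current)
    return segments


raw = "Title   here\n\nFirst   paragraph.\n\n\nSecond paragraph."
small = segment(raw, max_chars=20)    # each paragraph becomes its own segment
large = segment(raw, max_chars=200)   # everything fits into one segment
```

A smaller `max_chars` yields more, shorter segments, which tends to make recall more precise at the cost of less context per hit.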

Vectorization and indexing

The system uses the configured Embedding model to generate vectors and write them to the vector index. When GraphFlow is enabled, documents also go through the graph indexing process: extraction, alignment, and storage.

Retrieve and rerank

After the user asks a question, the system recalls candidate segments according to the retrieval configuration and can optionally apply a Rerank model to improve relevance ranking.
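The interaction between Top K, the score threshold, and reranking can be sketched in a few lines. This is an assumption-laden illustration: the toy three-dimensional vectors stand in for real embeddings, `overlap_scorer` is a simple word-overlap stand-in for a Rerank model, and none of the function names come from ZGI.

```python
import math
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[str, float]  # (segment text, similarity score)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float],
             index: List[Tuple[str, Sequence[float]]],
             top_k: int = 3,
             score_threshold: float = 0.5) -> List[Hit]:
    """Score every segment, drop those below the threshold, keep the best top_k."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in index]
    kept = [hit for hit in scored if hit[1] >= score_threshold]
    kept.sort(key=lambda hit: hit[1], reverse=True)
    return kept[:top_k]


def overlap_scorer(query: str, text: str) -> float:
    """Stand-in for a rerank model: share of query words found in the text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rerank(query: str, hits: List[Hit],
           scorer: Callable[[str, str], float]) -> List[Hit]:
    """Reorder recalled hits by the scorer's relevance to the query text."""
    return sorted(hits, key=lambda hit: scorer(query, hit[0]), reverse=True)


# Toy index with hand-written vectors (illustrative only).
index = [
    ("refund policy for damaged goods", [0.9, 0.1, 0.0]),
    ("shipping times by region", [0.1, 0.9, 0.0]),
    ("refund processing time", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]

hits = retrieve(query_vec, index, top_k=2, score_threshold=0.5)
reranked = rerank("refund processing time", hits, overlap_scorer)
strict = retrieve(query_vec, index, top_k=3, score_threshold=0.99)
```

Here the threshold removes the unrelated shipping segment, and the rerank step moves the segment that matches the query wording to the top even though its vector score was slightly lower.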

Generate answer

The agent or workflow passes the retrieved context to the LLM to generate an informed answer.
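As a rough illustration of how retrieved context reaches the LLM, the sketch below assembles numbered segments and the user question into one prompt string. The `build_prompt` helper and the prompt wording are assumptions; the actual prompt used by ZGI agents and workflows is not documented here.

```python
from typing import List


def build_prompt(question: str, segments: List[str]) -> str:
    """Number each retrieved segment and place it ahead of the question."""
    context = "\n".join(f"[{i}] {seg}" for i, seg in enumerate(segments, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "What is the reimbursement process?",
    ["Submit receipts within 30 days.", "Approvals go through the finance team."],
)
```

Numbering the segments makes it possible for the model to cite which piece of context supports each part of its answer.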

Supported capabilities

Capability	Description
Knowledge base management	Create, query, update, and delete knowledge bases; view the knowledge base list by organization or workspace.
Folder management	Create, edit, and delete knowledge base folders; move knowledge bases into specified folders.
Document management	Upload documents, view details and metadata, enable/disable in batches, archive/unarchive, and delete in batches.
Segment maintenance	View segments, edit content, import in batches, enable/disable, and delete; view keywords, token count, and hit count.
Question enhancement	Attach questions to segments, with support for question generation and batch import, to make recall more stable in FAQ scenarios.
Retrieval test	Hit testing, external knowledge base testing, asynchronous batch testing, test records, and reports.
Graph capabilities	View knowledge graph data and run graph retrieval, for knowledge Q&A that depends heavily on entity relationships.

How to build a knowledge base

Enter the “Knowledge Base” module of the console and click New Knowledge Base.
Fill in the name, description, workspace and icon, and select the data source type.
Configure the indexing method, Embedding model and retrieval parameters; if you need entity relationship retrieval, you can enable GraphFlow.
Upload files or import existing files, then wait for parsing, cleaning, segmentation, and indexing to complete.
Go to the document details page to check segment quality; if necessary, edit segments or add segment questions or sub-segments.
Enter a real business question in “Recall Test” and check whether the Top K, threshold, and Rerank configurations behave as expected.
Connect the knowledge base to the agent or workflow and use it as a context retrieval source in the application.

Maintenance and optimization suggestions

Split the knowledge base by business topic to avoid mixing completely unrelated knowledge in the same index
Prioritize cleaning the table of contents, headers and footers, duplicate disclaimers and invalid tables in the source document to reduce noise
Assign an owner to keep time-sensitive content such as policies, product specifications, prices, and terms of service up to date, and archive old documents regularly
Use segment questions to strengthen fixed Q&A scenarios such as FAQs, regulations, and product descriptions
Cover high-frequency questions, boundary questions and synonymous questions through batch testing to avoid judging the effect based on a single test sample
Raise the score threshold or narrow the knowledge base when results are too broad; lower the threshold, increase Top K, or improve segmentation when recall is insufficient
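The batch-testing and tuning suggestions above can be illustrated with a small evaluation loop that measures hit rate across several questions and compares Top K settings. Everything here is a hypothetical sketch: the `RANKED` results, segment ids, test cases, and function names are made up for illustration and do not represent ZGI's batch test feature or its report format.

```python
from typing import Callable, Dict, List

# Hypothetical ranked recall results per question (best match first).
RANKED: Dict[str, List[str]] = {
    "How do I claim expenses?": ["seg-12", "seg-07", "seg-03"],
    "expense reimbursement steps": ["seg-07", "seg-12", "seg-01"],
    "Who approves travel costs?": ["seg-03", "seg-09", "seg-21"],
}

# Each case pairs a question with the segment that should be recalled.
CASES = [
    {"question": "How do I claim expenses?", "expected": "seg-12"},
    {"question": "expense reimbursement steps", "expected": "seg-12"},
    {"question": "Who approves travel costs?", "expected": "seg-21"},
]


def make_retriever(top_k: int) -> Callable[[str], List[str]]:
    """Return a fake retriever that truncates the ranked list to top_k."""
    return lambda question: RANKED.get(question, [])[:top_k]


def hit_rate(cases: List[Dict[str, str]],
             retrieve_fn: Callable[[str], List[str]]) -> float:
    """Fraction of cases whose expected segment appears in the recalled set."""
    if not cases:
        return 0.0
    hits = sum(1 for c in cases if c["expected"] in retrieve_fn(c["question"]))
    return hits / len(cases)


# Compare configurations across the whole case set, not a single sample.
rates = {k: hit_rate(CASES, make_retriever(k)) for k in (1, 2, 3)}
```

In this toy data, a synonymous phrasing and a boundary question only succeed at larger Top K values, which is the kind of gap a single test question would not reveal.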

Knowledge graph enhanced retrieval

The ceiling of traditional RAG is semantic similarity: vector search can only find passages with similar wording. In enterprise scenarios, however, many questions are relational, for example Company A’s contract → terms → liquidated damages → related persons. These questions require relational reasoning rather than semantic matching. When documents are imported, ZGI automatically extracts entities (company, person name, contract number, amount, date), establishes relationship edges between them, and stores them in a graph database. When a user query involves relationships among multiple entities, graph traversal follows the path between those entities and returns the connected results directly.

Dimension	Pure vector search	ZGI graph + vector
Semantic matching	Strong	Also supported
Relational queries	Not supported	Graph traversal reasoning
Cross-document associations	Documents are isolated from each other	Unified entity graph
Relational question recall rate	70-85%	99%+
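The difference between similarity search and relational lookup can be shown with a tiny graph traversal. This is a minimal sketch under stated assumptions: the entities, relation names, and the breadth-first `find_path` helper are hypothetical and do not describe ZGI's graph database or GraphFlow internals.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, str]  # (source entity, relation, target entity)


def build_graph(edges: List[Edge]) -> Dict[str, List[Tuple[str, str]]]:
    """Build an adjacency list keyed by source entity."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for src, rel, dst in edges:
        graph.setdefault(src, []).append((rel, dst))
    return graph


def find_path(graph: Dict[str, List[Tuple[str, str]]],
              start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of entities from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical entities extracted from several documents.
edges = [
    ("Company A", "signed", "Contract C-001"),
    ("Contract C-001", "contains", "Clause 8"),
    ("Clause 8", "defines", "Liquidated damages"),
    ("Liquidated damages", "concerns", "Person X"),
]
graph = build_graph(edges)
path = find_path(graph, "Company A", "Person X")
reverse = find_path(graph, "Person X", "Company A")
```

No single passage needs to mention both "Company A" and "Person X" for the path to be found, which is the case pure vector search struggles with. Because the edges in this sketch are directed, the reverse lookup finds no path.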

Application scenarios

Scenario	Example
Internal knowledge Q&A	When an employee asks “What is the company’s reimbursement process?”, the system recalls the relevant clauses from the policy documents and generates an answer.
Intelligent customer service and FAQ automation	Customer service bots answer customer inquiries based on product descriptions, after-sales policies, and FAQs.
R&D and operations assistant	Build a knowledge base from API documents, deployment manuals, and incident records to assist troubleshooting.
Sales and solution support	Retrieve product information, industry cases, and quotation rules to quickly produce customer-facing materials.
Compliance and legal search	Search contract templates, compliance policies, and audit instructions to help locate key terms.
Private deployment Q&A	Run RAG with local models and a private vector store so that data never leaves the organization’s environment.

Advantages of ZGI Knowledge Base

Multi-level governance across knowledge bases, documents, segments, and sub-segments, suited to long-term operation
Visual retrieval configuration that supports semantic retrieval, graph retrieval, threshold control, and Rerank
Integration with the Model Network, so that model dependencies such as Embedding, Rerank, and LLM are managed in one place
Permission control: knowledge base viewing, management, recall testing, folder management, and locking can each be authorized by role
A complete testing loop with single tests, batch tests, test history, and reports, making it easy to evaluate quality before going live
