Complete LightRAG Guide: The Simplest Local RAG System Tutorial [2024 Latest]
LightRAG: A Lightweight Retrieval-Augmented Generation System
In the current era of rapid Large Language Model (LLM) development, enabling LLMs to better utilize external knowledge has become a key challenge. While Retrieval-Augmented Generation (RAG) technology enhances model performance by incorporating relevant knowledge during generation, traditional RAG systems are often complex and resource-intensive.
LightRAG, developed by the HKU Data Science Lab, offers a lightweight solution. It combines knowledge graphs with vector retrieval, processing textual knowledge efficiently while capturing the structured relationships between pieces of information.
Why Choose LightRAG?
Among various RAG systems, LightRAG stands out with these advantages:
- Lightweight Design:
  - Minimal dependencies, quick deployment
  - Supports incremental updates without rebuilding indices
  - Low memory footprint, suitable for individual developers
- Dual Retrieval Mechanism:
  - Vector retrieval captures semantic similarity
  - Graph retrieval discovers knowledge relationships
  - Smart fusion of both retrieval results
- Flexible Model Support:
  - Supports mainstream LLMs (OpenAI, HuggingFace, etc.)
  - Compatible with open-source embedding models
  - Customizable retrieval strategies
Quick Start
1. Installation
Recommended installation via pip:
pip install lightrag-hku
2. Basic Usage Example
Let’s understand LightRAG’s workflow through a practical example:
import os
from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete

# Create working directory
WORKING_DIR = "./my_rag_project"
os.makedirs(WORKING_DIR, exist_ok=True)

# Initialize LightRAG
# Note: gpt_4o_mini_complete calls the OpenAI API, so OPENAI_API_KEY must be set
# in your environment before running this example.
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete
)

# Prepare sample documents
documents = [
    "Artificial Intelligence (AI) is a branch of computer science dedicated to developing systems that can simulate human intelligence.",
    "Machine Learning is a core AI technology that enables computers to learn and improve from data.",
    "Deep Learning is a subset of machine learning that uses multi-layer neural networks to handle complex problems."
]

# Insert documents
rag.insert(documents)

# Test query
query = "Please explain the relationship between AI, Machine Learning, and Deep Learning"
result = rag.query(query, param=QueryParam(mode="hybrid"))
print(result)
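If you only want to inspect what the retriever pulled in, rather than generate an answer, recent lightrag-hku releases expose an only_need_context flag on QueryParam. A minimal sketch; verify the field name against your installed version:
# Return the assembled retrieval context instead of an LLM-generated answer.
# `only_need_context` exists in recent lightrag-hku releases; check your version.
context = rag.query(query, param=QueryParam(mode="hybrid", only_need_context=True))
print(context)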
3. Advanced Usage Tips
Using Ollama Models
To run fully local, open-source models through Ollama, configure LightRAG as follows:
import os
import logging
from lightrag import LightRAG, QueryParam
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc
# Set logging level
logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.INFO)
# Create working directory
WORKING_DIR = "./my_rag_project"
os.makedirs(WORKING_DIR, exist_ok=True)
# Initialize LightRAG with Ollama model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,
    llm_model_name="gemma2:2b",  # Use Gemma 2B model
    llm_model_max_async=4,  # Maximum concurrent requests
    llm_model_max_token_size=32768,
    llm_model_kwargs={
        "host": "http://localhost:11434",  # Ollama service address
        "options": {"num_ctx": 32768}  # Context window size
    },
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts,
            embed_model="nomic-embed-text",  # Use nomic-embed-text as embedding model
            host="http://localhost:11434"
        ),
    ),
)

# Insert documents and query
documents = [
    "Artificial Intelligence (AI) is a branch of computer science dedicated to developing systems that can simulate human intelligence.",
    "Machine Learning is a core AI technology that enables computers to learn and improve from data.",
    "Deep Learning is a subset of machine learning that uses multi-layer neural networks to handle complex problems."
]

# Insert documents
rag.insert(documents)

# Query using different retrieval modes
modes = ["naive", "local", "global", "hybrid"]
query = "Please explain the relationship between AI, Machine Learning, and Deep Learning"
for mode in modes:
    print(f"\nResults using {mode} mode:")
    result = rag.query(query, param=QueryParam(mode=mode))
    print(result)
Key advantages of using Ollama models:
- Fully open-source, local deployment
- Supports multiple open-source models
- Customizable model parameters
- No API key required
Note: Before using Ollama models, you need to install and start the Ollama service and pull the models referenced above (for example, ollama pull gemma2:2b and ollama pull nomic-embed-text). For detailed installation instructions, refer to the Ollama documentation.
Optimizing Retrieval Performance
LightRAG provides multiple retrieval modes, each suited to different scenarios:
- naive: Suitable for simple Q&A
- local: Best for context-dependent questions
- global: Ideal for questions requiring global knowledge
- hybrid: Combines all advantages, recommended as default mode
# Example: Using different retrieval modes for different question types
# Simple fact query
fact_query = "What is machine learning?"
print(rag.query(fact_query, param=QueryParam(mode="naive")))
# Context-dependent question
context_query = "Why is deep learning considered a subset of machine learning?"
print(rag.query(context_query, param=QueryParam(mode="local")))
# Question requiring multiple knowledge points
complex_query = "How has AI technology evolved over time?"
print(rag.query(complex_query, param=QueryParam(mode="hybrid")))
Custom Knowledge Graph
LightRAG allows importing custom knowledge graphs, particularly useful for domain-specific knowledge:
# Build domain knowledge graph
domain_kg = {
    "entities": [
        {
            "entity_name": "Machine Learning",
            "entity_type": "Technology",
            "description": "Method enabling computers to learn from data",
            "source_id": "tech_doc_1"
        },
        {
            "entity_name": "Neural Network",
            "entity_type": "Model",
            "description": "Mathematical model simulating brain structure",
            "source_id": "tech_doc_2"
        }
    ],
    "relationships": [
        {
            "src_id": "Machine Learning",
            "tgt_id": "Neural Network",
            "description": "Contains",
            "keywords": "core technology,basic model",
            "weight": 1.0,
            "source_id": "tech_doc_1"
        }
    ]
}
# Import knowledge graph
rag.insert_custom_kg(domain_kg)
Entity Deletion
LightRAG supports deleting entities from the knowledge base by entity name:
# Initialize LightRAG instance
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete
)
# Delete specified entity
rag.delete_by_entity("entity_name_to_delete")
This feature is particularly useful for:
- Removing outdated knowledge
- Correcting incorrect information
- Maintaining knowledge base accuracy
Multi-file Type Support
LightRAG integrates with the textract library to support various file formats:
import textract
from lightrag import LightRAG
# Initialize LightRAG
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete
)

# Supported file type examples
file_types = {
    'PDF': 'document.pdf',
    'Word': 'document.docx',
    'PowerPoint': 'presentation.pptx',
    'CSV': 'data.csv'
}

# Process different file types
for file_type, file_path in file_types.items():
    try:
        # Extract text content using textract
        text_content = textract.process(file_path)
        # Convert binary content to string and insert into LightRAG
        rag.insert(text_content.decode('utf-8'))
        print(f"Successfully processed {file_type} file: {file_path}")
    except Exception as e:
        print(f"Error processing {file_type} file: {e}")
Textract supports major file formats including:
- PDF documents (.pdf)
- Word documents (.doc, .docx)
- PowerPoint presentations (.ppt, .pptx)
- Excel spreadsheets (.xls, .xlsx)
- CSV files (.csv)
- Plain text files (.txt)
- RTF documents (.rtf)
Note: textract must be installed separately (pip install textract), and some formats rely on additional system-level libraries, so extra packages may be required depending on your platform.
4. Performance Optimization Tips
- Document Preprocessing:
  - Chunk long documents appropriately (see the sketch after this list)
  - Maintain coherence within each text chunk
  - Remove irrelevant formatting information
- Retrieval Parameter Tuning:
  - Adjust the top_k parameter to control how many items are retrieved
  - Balance accuracy and speed based on requirements
  - Set appropriate similarity thresholds
- System Configuration:
  - Choose a suitable embedding model
  - Adjust cache size based on data scale
  - Regularly clean unused index data
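These knobs map onto constructor and query parameters. A minimal sketch, assuming the chunk_token_size, chunk_overlap_token_size, and top_k names used by recent lightrag-hku releases (check your installed version):
# Chunking is controlled when the index is built; retrieval depth when querying.
# Parameter names below are assumptions based on recent lightrag-hku releases.
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete,
    chunk_token_size=1200,         # target tokens per chunk
    chunk_overlap_token_size=100,  # overlap between adjacent chunks for coherence
)

# Lower top_k favors speed; raise it when recall matters more than latency.
result = rag.query(
    "What is machine learning?",
    param=QueryParam(mode="hybrid", top_k=20),
)
print(result)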
5. Knowledge Graph Visualization
LightRAG supports exporting knowledge graphs as interactive HTML pages or importing them into Neo4j for analysis.
HTML Visualization
Use the pyvis library to generate an interactive knowledge graph visualization:
import networkx as nx
from pyvis.network import Network
import random
# Load graph from GraphML file
G = nx.read_graphml("./my_rag_project/graph_chunk_entity_relation.graphml")
# Create Pyvis network
net = Network(height="100vh", notebook=True)
# Convert NetworkX graph to Pyvis network
net.from_nx(G)
# Add random colors to nodes
for node in net.nodes:
    node["color"] = "#{:06x}".format(random.randint(0, 0xFFFFFF))
# Save as HTML file
net.show("knowledge_graph.html")
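Opening knowledge_graph.html in a browser gives a zoomable, draggable view of the extracted entities and relations. The GraphML file is written to the working directory during document insertion, so run the ingestion step before generating the visualization.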
Export to Neo4j
You can also export the knowledge graph to Neo4j database for deeper analysis:
import os
from neo4j import GraphDatabase
from lightrag.utils import xml_to_json
# Neo4j connection configuration
NEO4J_URI = "bolt://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "your_password"
# Batch processing parameters
BATCH_SIZE_NODES = 500
BATCH_SIZE_EDGES = 100
def export_to_neo4j():
    # Convert GraphML to JSON format
    xml_file = "./my_rag_project/graph_chunk_entity_relation.graphml"
    json_data = xml_to_json(xml_file)
    if not json_data:
        return

    # Get nodes and edges data
    nodes = json_data.get("nodes", [])
    edges = json_data.get("edges", [])

    # Create Neo4j driver
    driver = GraphDatabase.driver(
        NEO4J_URI,
        auth=(NEO4J_USERNAME, NEO4J_PASSWORD)
    )

    try:
        with driver.session() as session:
            # Create nodes in batches so large graphs are not sent in one request
            for i in range(0, len(nodes), BATCH_SIZE_NODES):
                node_batch = nodes[i:i + BATCH_SIZE_NODES]
                session.execute_write(
                    lambda tx, batch=node_batch: tx.run("""
                        UNWIND $nodes AS node
                        MERGE (e:Entity {id: node.id})
                        SET e.entity_type = node.entity_type,
                            e.description = node.description,
                            e.source_id = node.source_id,
                            e.displayName = node.id
                        WITH e, node
                        CALL apoc.create.addLabels(e, [node.entity_type])
                        YIELD node AS labeledNode
                        RETURN count(*)
                    """, {"nodes": batch})
                )

            # Create relationships in batches
            for i in range(0, len(edges), BATCH_SIZE_EDGES):
                edge_batch = edges[i:i + BATCH_SIZE_EDGES]
                session.execute_write(
                    lambda tx, batch=edge_batch: tx.run("""
                        UNWIND $edges AS edge
                        MATCH (source {id: edge.source})
                        MATCH (target {id: edge.target})
                        WITH source, target, edge,
                             CASE
                                 WHEN edge.keywords CONTAINS 'lead' THEN 'lead'
                                 WHEN edge.keywords CONTAINS 'participate' THEN 'participate'
                                 WHEN edge.keywords CONTAINS 'uses' THEN 'uses'
                                 ELSE SPLIT(edge.keywords, ',')[0]
                             END AS relType
                        CALL apoc.create.relationship(source, relType, {
                            weight: edge.weight,
                            description: edge.description,
                            keywords: edge.keywords,
                            source_id: edge.source_id
                        }, target) YIELD rel
                        RETURN count(*)
                    """, {"edges": batch})
                )
    finally:
        driver.close()

if __name__ == "__main__":
    export_to_neo4j()
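Note: the Cypher above relies on apoc.create.addLabels and apoc.create.relationship, so the target Neo4j instance must have the APOC plugin installed; without it, the export will fail.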
Practical Application Cases
LightRAG works well across a range of applications:
- Customer Service QA Systems:
  - Rapid response to user inquiries
  - Accurate extraction of product knowledge
  - Support for multi-turn dialogue
- Document Retrieval Systems:
  - Intelligent summarization of long documents
  - Cross-document knowledge association
  - Precise location of key information
- Knowledge Base Management:
  - Automatic knowledge graph construction
  - Discovery of relationships between knowledge items
  - Support for knowledge updates