SimpleRAG - Advanced

Current Configuration

Setting	Value
RAG Mode	Normal RAG
Embedding Model	Gemini Embedding API
Vector Database	Qdrant Cloud
Document Collection	simple_rag_docs
Graph Collection	simple_rag_graph
Preferred LLM	claude
Chunk Size	1000
Chunk Overlap	200
Results Count (Top K)	5
API Rate Limit	60 calls per minute
Embedding Cache	Enabled
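
The settings above could be held in a single validated object. The following is a minimal sketch, not SimpleRAG's actual configuration code: the class name `RagConfig` and all field names are hypothetical, and only the values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    # Field names are hypothetical; the default values mirror the table above.
    rag_mode: str = "normal"
    embedding_model: str = "gemini"
    document_collection: str = "simple_rag_docs"
    graph_collection: str = "simple_rag_graph"
    preferred_llm: str = "claude"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    top_k: int = 5
    rate_limit_per_minute: int = 60
    embedding_cache: bool = True

    def __post_init__(self) -> None:
        # An overlap as large as the chunk itself would never advance.
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")
        if self.top_k < 1 or self.rate_limit_per_minute < 1:
            raise ValueError("top_k and rate_limit_per_minute must be >= 1")


config = RagConfig()
```

With these values, each new chunk starts `chunk_size - chunk_overlap` = 800 units after the previous one.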

RAG Mode Comparison

📚 Normal RAG

Speed: Fast processing
Accuracy: Good for direct facts
Use Case: Simple Q&A, document search
Storage: Single vector collection
Processing: Chunking + embedding

🕸️ Graph RAG

Speed: Slower, more thorough
Accuracy: Excellent for relationships
Use Case: Complex reasoning, connections
Storage: Two collections (docs + graph)
Processing: Entity extraction + graph building

Performance Features

1. Rate Limiting

SimpleRAG rate-limits its API calls (60 calls per minute in the current configuration) so that it stays within provider quotas and avoids service interruptions. This is especially important for Graph RAG, which makes additional API calls for entity extraction.
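
One common way to enforce such a limit is a sliding-window limiter. The sketch below is an illustration under assumptions, not SimpleRAG's implementation: the class name and the injectable `clock` and `sleep` parameters are hypothetical, and the source does not say which limiting algorithm is used.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowRateLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(
        self,
        max_calls: int = 60,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the current window

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return waited
            # Wait until the oldest call expires, then re-check.
            delay = self._calls[0] + self.period - now
            self._sleep(delay)
            waited += delay
```

Calling `acquire()` before every embedding or extraction request keeps the caller under the configured limit. Injecting `clock` and `sleep` makes the limiter testable without real waiting.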

2. Embedding Cache

To improve performance and reduce API costs, SimpleRAG caches embeddings locally. This is particularly beneficial for Graph RAG, where similar entities might be processed multiple times.
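
A cache of this kind can be keyed on a hash of the input text. The following sketch keeps everything in memory and takes the embedding function as a parameter. The class name, the `embed_fn` parameter, and the SHA-256 key are assumptions for illustration; the source does not describe how or where SimpleRAG stores its cache.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """In-memory cache that only calls embed_fn for text it has not seen."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn  # hypothetical stand-in for the real API call
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str) -> str:
        # Hashing gives fixed-size keys regardless of text length.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

Note that a hash-keyed cache only helps for byte-identical text. Near-duplicate entity descriptions still trigger separate API calls.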

3. Progress Tracking

For long-running operations like Graph RAG indexing, SimpleRAG provides detailed real-time progress indicators showing entity extraction, relationship mapping, and graph building status.
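
Stage-based progress reporting can be sketched with a small tracker that invokes a callback on every update. The stage names below follow the three stages named above. The class, its methods, and the callback signature are hypothetical and do not reflect SimpleRAG's actual progress API.

```python
from typing import Callable, Dict

STAGES = ("entity_extraction", "relationship_mapping", "graph_building")


class ProgressTracker:
    """Track completed steps per stage and notify a callback on each update."""

    def __init__(
        self,
        totals: Dict[str, int],
        on_update: Callable[[str, int, int], None] = lambda stage, done, total: None,
    ) -> None:
        self.totals = dict(totals)
        self.done = {stage: 0 for stage in totals}
        self._on_update = on_update

    def advance(self, stage: str, steps: int = 1) -> None:
        if stage not in self.totals:
            raise KeyError(f"unknown stage: {stage}")
        # Clamp so a stage can never report more than 100%.
        self.done[stage] = min(self.done[stage] + steps, self.totals[stage])
        self._on_update(stage, self.done[stage], self.totals[stage])

    def fraction(self) -> float:
        """Overall completion in [0, 1] across all stages."""
        total = sum(self.totals.values())
        return sum(self.done.values()) / total if total else 1.0
```

A UI layer could pass an `on_update` callback that refreshes a progress bar, while the indexing code only calls `advance()`.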

4. Dual Collection Storage

Graph RAG uses two Qdrant collections: one for document chunks and another for graph elements (entities and relationships), enabling hybrid search strategies.

How It Works

Normal RAG Process

1. Document parsing and text extraction
2. Split into overlapping chunks
3. Generate embeddings using Gemini API
4. Store in Qdrant vector database
5. Query with semantic similarity search
6. Generate answer with Claude LLM
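
The chunking and similarity-search steps above can be sketched without any external service. This example assumes that chunk size and overlap are measured in characters (the source does not say whether they are characters or tokens) and ranks precomputed vectors by cosine similarity in plain Python rather than querying Qdrant. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into chunks of `size` characters sharing `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(reverse=True)
    return [(i, score) for score, i in scored[:k]]
```

With the configured values (1000 and 200), a 2,500-character document yields three chunks starting at offsets 0, 800, and 1600.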

Graph RAG Process

1. Document parsing and text extraction
2. Split into larger, context-rich chunks
3. Extract entities and relationships using Gemini
4. Build knowledge graph with NetworkX
5. Generate embeddings for graph elements
6. Store both docs and graph in Qdrant
7. Query both collections for hybrid results
8. Generate context-aware answer with Claude

Graph RAG Technical Details

Entity Extraction

Graph RAG uses Gemini Pro to identify and categorize entities (PERSON, ORGANIZATION, CONCEPT, LOCATION, EVENT) in the document text, recording a description for each entity and the relationships between entities.
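
Whatever model performs the extraction, its output has to be parsed and validated before it can feed the graph. The sketch below assumes the model is prompted to return JSON with `entities` and `relationships` keys. That schema, the dataclasses, and the `parse_extraction` function are hypothetical; only the five entity types come from the description above.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

ENTITY_TYPES = {"PERSON", "ORGANIZATION", "CONCEPT", "LOCATION", "EVENT"}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    description: str = ""


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    relation: str


def parse_extraction(raw: str) -> Tuple[List[Entity], List[Relationship]]:
    """Parse an assumed JSON response, dropping anything that fails validation."""
    data = json.loads(raw)
    entities = []
    for item in data.get("entities", []):
        etype = str(item.get("type", "")).upper()
        # Keep only named entities whose type is one of the five categories.
        if item.get("name") and etype in ENTITY_TYPES:
            entities.append(Entity(item["name"], etype, item.get("description", "")))
    known = {e.name for e in entities}
    # Keep only relationships whose endpoints are both accepted entities.
    relationships = [
        Relationship(r["source"], r["target"], r.get("relation", "related_to"))
        for r in data.get("relationships", [])
        if r.get("source") in known and r.get("target") in known
    ]
    return entities, relationships
```

Filtering at this stage keeps malformed or unexpected model output from producing dangling edges later.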

Knowledge Graph Construction

SimpleRAG builds a NetworkX graph in which entities are nodes and relationships are edges. This enables graph traversal and neighborhood analysis, which supply additional context at retrieval time.
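
To illustrate the node, edge, and neighborhood ideas without extra dependencies, the sketch below uses a plain adjacency dictionary as a stand-in for NetworkX. It is not the NetworkX API. The class, its method names, and the choice of an undirected graph are assumptions made for the example.

```python
from collections import deque
from typing import Dict, Set


class EntityGraph:
    """Tiny undirected graph: entities are nodes, relationships are edges."""

    def __init__(self) -> None:
        self.nodes: Dict[str, dict] = {}           # name -> attributes
        self.adj: Dict[str, Dict[str, str]] = {}   # name -> {neighbor: relation}

    def add_entity(self, name: str, **attrs) -> None:
        self.nodes.setdefault(name, {}).update(attrs)
        self.adj.setdefault(name, {})

    def add_relationship(self, source: str, target: str, relation: str) -> None:
        self.add_entity(source)
        self.add_entity(target)
        self.adj[source][target] = relation
        self.adj[target][source] = relation

    def neighborhood(self, name: str, hops: int = 1) -> Set[str]:
        """Breadth-first search: all entities within `hops` edges of `name`."""
        if name not in self.nodes:
            return set()
        seen = {name}
        frontier = deque([(name, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == hops:
                continue  # do not expand beyond the requested radius
            for neighbor in self.adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        return seen - {name}
```

At query time, the neighborhood of each matched entity can be added to the retrieved context, so the answer can draw on connections that no single chunk states directly.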

Hybrid Search Strategy

Graph RAG combines traditional semantic search over document chunks with graph-based retrieval of entities and relationships, so results include both direct facts and contextual connections.
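
A hybrid query of this kind searches both collections with the same query vector and returns the two result sets side by side. The sketch below models each collection as an in-memory list of `(id, vector, payload)` tuples instead of calling Qdrant. The function names and the result layout are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[str, Sequence[float], dict]  # (id, vector, payload)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(collection: List[Point], query_vec: Sequence[float], k: int) -> List[dict]:
    """Return the k points most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), pid, payload) for pid, vec, payload in collection]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [{"id": pid, "score": score, "payload": payload}
            for score, pid, payload in scored[:k]]


def hybrid_search(
    query_vec: Sequence[float],
    docs: List[Point],
    graph: List[Point],
    k: int = 5,
) -> Dict[str, List[dict]]:
    # Keep the two result sets separate so the prompt can label them.
    return {
        "chunks": search(docs, query_vec, k),
        "graph": search(graph, query_vec, k),
    }
```

Keeping chunk hits and graph hits in separate lists, rather than merging them by score, lets the answer-generation step present them as distinct kinds of context.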

Enhanced Prompting

Graph RAG generates specialized prompts that include both document context and relevant entities/relationships, enabling more sophisticated reasoning and answer generation.
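
A prompt that carries both kinds of context might be assembled as follows. The function name, the section labels, and the instruction wording are illustrative assumptions. The source does not show SimpleRAG's actual prompt template.

```python
from typing import List, Tuple


def build_graph_prompt(
    question: str,
    chunks: List[str],
    entities: List[Tuple[str, str, str]],        # (name, type, description)
    relationships: List[Tuple[str, str, str]],   # (source, relation, target)
) -> str:
    """Combine document chunks and graph facts into one labeled prompt."""
    lines = ["Answer the question using only the context below.", ""]
    lines.append("Document context:")
    lines += [f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)]
    lines += ["", "Entities:"]
    lines += [f"- {name} ({etype}): {desc}" for name, etype, desc in entities]
    lines += ["", "Relationships:"]
    lines += [f"- {src} --{rel}--> {dst}" for src, rel, dst in relationships]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)
```

Numbering the chunks and listing relationships as explicit triples gives the model something concrete to cite when its answer depends on a connection between entities.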

About Enhanced SimpleRAG

Enhanced SimpleRAG extends the original system with Graph RAG capabilities, providing two complementary approaches to document Q&A:

Normal RAG: Fast, efficient semantic search perfect for direct factual queries
Graph RAG: Advanced knowledge graph reasoning ideal for understanding relationships and complex connections

Both modes use the same Gemini API key for embeddings and Claude for answer generation, with Qdrant storing the vector representations. The system automatically handles the complexity of entity extraction, graph construction, and hybrid retrieval strategies.

This dual-mode approach lets you choose speed for direct factual queries and depth for queries that require reasoning about relationships and connections in your documents.
