A Production-Grade Retrieval-Augmented Generation Chatbot for Zero-Knowledge Proofs, Noir, and Tornado Cash
ZK Contextual RAG Bot is a production-grade Retrieval-Augmented Generation (RAG) chatbot that combines cutting-edge LLM technology with vector-based document retrieval. It provides instant, accurate answers to questions about Zero-Knowledge Proofs, Noir programming language, and Tornado Cash privacy protocols.
The bot uses LangChain Expression Language (LCEL) with modern imports and ChromaDB for efficient semantic search, making it ideal for knowledge-intensive applications.
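For readers new to LCEL, the core idea is that chain steps compose with the `|` operator. A minimal, standalone sketch (not this project's full pipeline, which appears below):

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# LCEL composes steps with the `|` operator: prompt -> model
chain = ChatPromptTemplate.from_template("Summarize: {text}") | ChatOpenAI()

result = chain.invoke({"text": "ZKPs prove a statement without revealing the witness."})
print(result.content)
```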
- 🎯 **Intelligent RAG Pipeline**: Combines LLM reasoning with document retrieval for accurate, contextual answers
- ⚡ **Production-Ready LCEL**: Modern LangChain Expression Language with simplified, stable imports
- 🔐 **Privacy-Focused Content**: Specialized knowledge base on ZKPs, Noir, and Tornado Cash
- 💾 **Vector Database**: ChromaDB for fast semantic similarity search
- 💬 **Conversational Memory**: Maintains chat history for coherent multi-turn conversations
- ☁️ **Cloud-Ready**: Easily deployable on Google Colab with ngrok tunneling
- 🎨 **User-Friendly UI**: Clean Streamlit interface with real-time feedback
- 🔍 **Efficient Retrieval**: Top-3 document retrieval with configurable parameters
```
┌─────────────────────────────────────────┐
│        User Query (Streamlit UI)        │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│      Query Processing & Context         │
│         (with Chat History)             │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│         Generate Embeddings             │
│         (OpenAI Embeddings)             │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│      Vector Similarity Search           │
│        (ChromaDB Retrieval)             │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│        LCEL Chain Execution             │
│        (RunnablePassthrough)            │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│     Format Prompt with Context          │
│          + Chat History                 │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│     GPT-4 Response Generation           │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│       Display Response (UI)             │
│         Store in Memory                 │
└─────────────────────────────────────────┘
```
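To make the retrieval stages concrete, here is a minimal sketch of an isolated ChromaDB similarity search, assuming the `vectorstore` built in the setup sections below:

```python
# Embedding + similarity search in isolation: the query is embedded and
# matched against the ChromaDB index; `vectorstore` is the Chroma
# instance built in the setup section below.
docs = vectorstore.similarity_search("What are Zero-Knowledge Proofs?", k=3)
for doc in docs:
    print(doc.page_content[:80])  # preview the top-3 matching chunks
```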
- Python 3.8 or higher
- OpenAI API Key
- ngrok token (for cloud deployment)
- Google Colab account (optional, for cloud hosting)
```bash
# Clone the repository
git clone https://github.com/solo938/ZKWhisper.git
cd ZKWhisper

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
echo "OPENAI_API_KEY=your_api_key_here" > .env

# Run the application
streamlit run app.py
```

For Google Colab, install the dependencies in a notebook cell:

```python
!pip install streamlit langchain langchain-community langchain-openai chromadb openai tiktoken pyngrok pysqlite3-binary -q
```

Then create the knowledge base file:

```python
knowledge_base_content = """
# Zero-Knowledge Proofs, Noir, and Tornado Cash Knowledge Base
# [Your knowledge base content here]
"""
with open("knowledge_base.md", "w") as f:
    f.write(knowledge_base_content)

print("✅ Knowledge base created successfully!")
```

Finally, launch Streamlit in the background and expose it with an ngrok tunnel:

```python
from pyngrok import ngrok

ngrok.set_auth_token("YOUR_NGROK_TOKEN")
ngrok.kill()  # close any existing tunnels before opening a new one

!streamlit run app.py &

public_url = ngrok.connect(8501)
print(f"🚀 View your RAG Bot here: {public_url}")
```

Configure the application through a `.env` file:

```env
# OpenAI Configuration
OPENAI_API_KEY=sk-your-api-key-here
# Database Configuration
CHROMA_DB_PATH=./chroma_db
# Application Settings
MODEL_NAME=gpt-4
TEMPERATURE=0
MAX_TOKENS=2048
# Retrieval Settings
TOP_K_RESULTS=3
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```

These settings are loaded into a `Config` class:

```python
import os
from dotenv import load_dotenv

load_dotenv()

class Config:
    # API Keys
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

    # Database
    CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")

    # LLM Settings
    MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4")
    TEMPERATURE = float(os.getenv("TEMPERATURE", 0))
    MAX_TOKENS = int(os.getenv("MAX_TOKENS", 2048))

    # Retrieval Settings
    TOP_K_RESULTS = int(os.getenv("TOP_K_RESULTS", 3))
    CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", 1000))
    CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", 200))
```

Load the knowledge base and split it into overlapping chunks:

```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("knowledge_base.md")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
splits = splitter.split_documents(docs)
```

Embed the chunks and persist them in ChromaDB:

```python
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(
    splits,
    embeddings,
    persist_directory="./chroma_db"
)
vectorstore.persist()
```

Build the LCEL chain that retrieves context and generates answers:

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-4", temperature=0)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# A minimal example prompt wiring retrieved context and chat history
# together (adapt the system message to your needs)
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using the following context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
])

rag_chain = (
    RunnablePassthrough.assign(
        context=lambda x: format_docs(retriever.get_relevant_documents(x["input"]))
    )
    | qa_prompt
    | llm
)
```

In app.py, Streamlit's session messages are converted into LangChain message objects for the chat history:

```python
import streamlit as st
from langchain_core.messages import HumanMessage, AIMessage
formatted_chat_history = []
for msg in st.session_state.messages[:-1]:
    if msg["role"] == "user":
        formatted_chat_history.append(HumanMessage(content=msg["content"]))
    elif msg["role"] == "assistant":
        formatted_chat_history.append(AIMessage(content=msg["content"]))
```

Invoke the chain with a question and the accumulated chat history:

```python
# User asks a question
question = "What are Zero-Knowledge Proofs?"

# The bot retrieves relevant documents and generates a response
response = rag_chain.invoke({
    "input": question,
    "chat_history": chat_history
})
```

For multi-turn conversations, pass the previous exchange as chat history:

```python
# Question 1
user_input_1 = "Explain Noir programming language"
response_1 = rag_chain.invoke({
    "input": user_input_1,
    "chat_history": []
})

# Question 2 (with context from previous)
user_input_2 = "How is it used for ZKPs?"
response_2 = rag_chain.invoke({
    "input": user_input_2,
    "chat_history": [
        HumanMessage(content=user_input_1),
        AIMessage(content=response_1.content)
    ]
})
```

Run the test suite with pytest:

```bash
# Run all tests
pytest tests/

# Run specific test file
pytest tests/test_rag.py -v

# Run with coverage
pytest tests/ --cov=. --cov-report=html

# Run integration tests
pytest tests/test_integration.py -v
```

- Uses ngrok for public tunneling
- No infrastructure setup needed
- Free tier available
- See Colab section above
```bash
# Push to GitHub
git push origin main

# Connect repo to Streamlit Cloud
# Visit: https://share.streamlit.io
```

```bash
# Build Docker image
docker build -t zkwhisper:latest .

# Run container
docker run -p 8501:8501 \
  -e OPENAI_API_KEY=your_key \
  zkwhisper:latest
```

Alternatively, deploy via your cloud provider's container or serverless services.
| Metric | Value | Notes |
|---|---|---|
| Document Load Time | ~500ms | Cached after first run |
| Embedding Generation | ~1-2s | Per query |
| Retrieval Time | ~100ms | ChromaDB similarity search |
| LLM Response Time | ~5-15s | Depends on GPT-4 load |
| Total Response Time | ~8-20s | End-to-end |
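These figures will vary with network conditions and model load. A quick sketch for measuring end-to-end latency on your own setup, assuming the `rag_chain` built above:

```python
import time

# Time a single end-to-end query; compare against the table above
start = time.perf_counter()
response = rag_chain.invoke({"input": "What is Tornado Cash?", "chat_history": []})
print(f"Total response time: {time.perf_counter() - start:.1f}s")
```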
- **Cache Results**: Use `@st.cache_resource` for expensive operations (see the sketches below)
- **Reduce Chunk Size**: Smaller chunks = faster retrieval
- **Limit Top-K**: Reduce `TOP_K_RESULTS` from 3 to 1-2
- **Use gpt-3.5-turbo**: Faster and cheaper than gpt-4
- **Batch Requests**: Group multiple queries together
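As an example of the caching tip, a minimal sketch that caches the vector store across Streamlit reruns (`load_vectorstore` is a hypothetical helper name; it assumes the persisted index from the setup section):

```python
import streamlit as st
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

@st.cache_resource  # built once per process, reused across Streamlit reruns
def load_vectorstore():
    # Reopen the persisted ChromaDB index instead of re-embedding documents
    return Chroma(
        persist_directory="./chroma_db",
        embedding_function=OpenAIEmbeddings(),
    )

vectorstore = load_vectorstore()
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
```

And for the batching tip, LCEL chains expose a `.batch()` method that runs multiple inputs concurrently:

```python
# Run several queries in one call; LCEL parallelizes under the hood
questions = ["What is Noir?", "How does Tornado Cash work?"]
responses = rag_chain.batch(
    [{"input": q, "chat_history": []} for q in questions]
)
```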
**LangChain import errors**

Solution: This project uses modern LCEL with `RunnablePassthrough`. The old chain imports don't work in newer LangChain versions. Ensure you're using the latest code.

**ChromaDB SQLite version error**

Solution: The pysqlite3 fix at the top of app.py resolves this:

```python
import pysqlite3
import sys

# Swap the stdlib sqlite3 module for pysqlite3 before ChromaDB imports it
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
```

**OpenAI API errors**

Solution: Verify your OpenAI API key is valid and has sufficient credits, and check the .env file format.

**Slow responses**

Solution:
- Reduce `CHUNK_SIZE` in config
- Switch to `gpt-3.5-turbo`
- Reduce `TOP_K_RESULTS` to 1-2
- Enable caching

**ngrok tunnel fails to connect**

Solution:
- Verify your ngrok token is correct
- Run `ngrok.kill()` to close existing tunnels
- Wait 10-15 seconds for the new tunnel to establish

See TROUBLESHOOTING.md for more solutions.
- Support for PDF document uploads
- Multi-language support
- Fine-tuned models for domain-specific QA
- Real-time collaboration features
- Advanced filtering and faceted search
- Response caching and optimization
- Web3 integration (wallet authentication)
- Mobile app (React Native)
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit changes (`git commit -m 'Add AmazingFeature'`)
- Push to branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
```bash
# Install development dependencies
pip install -r requirements-dev.txt

# Run code quality checks
black . && flake8 . && isort .

# Run tests before committing
pytest tests/
```

- Setup Guide - Detailed installation instructions
- Architecture - Technical deep dive
- API Reference - Complete API documentation
- Troubleshooting - Common issues & solutions
- Deployment Guide - Production deployment
This project is licensed under the MIT License - see LICENSE file for details.
- LangChain - The orchestration framework
- Streamlit - The web UI framework
- ChromaDB - Vector database
- OpenAI - LLM provider
- Zero-Knowledge Proof research community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Twitter: @solo938
- ✅ **Educational**: Learn about ZKPs, Noir, and privacy protocols
- ✅ **Research**: Quick reference for cryptographic concepts
- ✅ **Development**: Q&A assistant for builders
- ✅ **Documentation**: Always-available knowledge base
- ✅ **Integration**: Embed RAG capabilities in your apps