31 changes: 31 additions & 0 deletions .cursor/rules/backend-rag-conventions.mdc
@@ -0,0 +1,31 @@
---
description: Rules for Python backend development, Adalflow RAG pipeline, and API conventions.
globs:
- "api/**/*.py"
alwaysApply: false
---

# Backend & RAG Conventions

This project uses FastAPI and the `adalflow` framework to implement its RAG pipeline.

## Python & FastAPI Patterns
- **Pydantic Models**: Use Pydantic for all request/response schemas. Define them in `api/api.py` or `api/websocket_wiki.py`.
- **Async Handlers**: Use `async def` for API endpoints and WebSocket handlers to maintain high concurrency.
- **Error Handling**: Use `fastapi.HTTPException` for REST errors. For WebSockets, send a descriptive text message before closing the connection.

## Adalflow & RAG Logic
- **Component Pattern**: The `RAG` class in `api/rag.py` must inherit from `adal.Component`.
- **Memory**: Use the `Memory` class (from `adalflow`) for conversation history management.
- **Data Pipeline**: Use `api/data_pipeline.py` for repository cloning and document parsing. Respect the exclusion filters defined in `repo.json`.
- **Embedders**: Always initialize embedders via `get_embedder` in `api/tools/embedder.py`.
- **Prompts**: Centralize all LLM prompts in `api/prompts.py`. Use templates for dynamic content.

## Vector Database (FAISS)
- Vector indices are managed by `DatabaseManager` in `api/data_pipeline.py`.
- Validate embedding dimensions before insertion (see `_validate_and_filter_embeddings` in `api/rag.py`).
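
The dimension check described above could look roughly like the sketch below. It is an illustration only, not the project's `_validate_and_filter_embeddings`: the helper name `filter_by_dimension` and the rule of keeping the most common vector length are assumptions.

```python
from collections import Counter
from typing import List, Sequence


def filter_by_dimension(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Keep only vectors whose length matches the most common length.

    Hypothetical helper; the majority-vote rule is an assumption,
    not taken from api/rag.py.
    """
    # Empty vectors can never be indexed, so drop them first.
    non_empty = [list(vec) for vec in embeddings if len(vec) > 0]
    if not non_empty:
        return []
    # Treat the most frequent dimension as the expected one.
    target_dim, _ = Counter(len(vec) for vec in non_empty).most_common(1)[0]
    return [vec for vec in non_empty if len(vec) == target_dim]
```

Filtering before insertion matters because a FAISS index is built for a single fixed dimension, so one mismatched vector would otherwise fail the whole batch.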

## Anti-Patterns
- Do not hardcode API keys; use environment variables via `api/config.py`.
- Avoid blocking synchronous calls in async routes; if one is unavoidable, wrap it with `run_in_threadpool` from `fastapi.concurrency`.
- Do not bypass the `DatabaseManager` for file system operations in `~/.adalflow`.
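
A minimal sketch of the first two anti-pattern fixes, using only the standard library. `asyncio.to_thread` stands in here for FastAPI's `run_in_threadpool`, and the helper names `get_api_key`, `blocking_work`, and `handler` are hypothetical; none of them come from `api/config.py`.

```python
import asyncio
import os
import time


def get_api_key(name: str) -> str:
    """Read a key from the environment instead of hardcoding it (hypothetical helper)."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def blocking_work(seconds: float) -> str:
    """Stand-in for a synchronous call such as a repository clone or file parse."""
    time.sleep(seconds)
    return "done"


async def handler() -> str:
    # Offload the blocking call to a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_work, 0.01)
```

In a real route the same shape applies: the `async def` endpoint awaits the offloaded call rather than invoking the blocking function directly.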
33 changes: 33 additions & 0 deletions .cursor/rules/frontend-conventions.mdc
@@ -0,0 +1,33 @@
---
description: Rules for Next.js frontend, TypeScript types, and UI component development.
globs:
- "src/**/*.{ts,tsx}"
alwaysApply: false
---

# Frontend & UI Conventions

The frontend is a Next.js application focused on repository visualization and AI interaction.

## Next.js & React Patterns
- **App Router**: Use the `src/app` directory for routing. Dynamic repository routes follow the `[owner]/[repo]` pattern.
- **Server vs. Client**: Use `"use client"` only for components requiring state, effects, or browser APIs (e.g., `Ask.tsx`, `WikiTreeView.tsx`).
- **TypeScript**: Strictly type all component props and API responses. Keep these types in sync with the backend Pydantic models.

## AI & Chat UI
- **WebSockets**: Use `src/utils/websocketClient.ts` for all chat interactions. Do not implement raw WebSocket logic in components.
- **Markdown Rendering**: Use the `Markdown` component for AI responses to support syntax highlighting and Mermaid diagrams.
- **Streaming**: Handle partial message updates in the UI to provide a "typing" effect.

## State & Data
- **Local Storage**: Cache generated wiki structures in `localStorage` to allow instant navigation.
- **i18n**: Use `next-intl` for all user-facing strings. Access via `useTranslations` hook.
- **Context**: Use `LanguageContext` for global language state.

## Styling
- **Tailwind CSS**: Use Tailwind for all styling. Follow the existing design system (dark/light mode support via `next-themes`).
- **Icons**: Use `lucide-react` for consistent iconography.

## Anti-Patterns
- Do not fetch data directly from the backend in components; use the Next.js API proxy routes in `src/app/api/`.
- Avoid large monolithic components; break down complex views like the Wiki page into smaller sub-components.
36 changes: 36 additions & 0 deletions .cursor/rules/project-overview.mdc
@@ -0,0 +1,36 @@
---
description: General project context, architecture, and tech stack overview.
globs:
- "**/*"
alwaysApply: true
---

# DeepWiki-Open Project Overview

DeepWiki-Open is a full-stack RAG (Retrieval-Augmented Generation) application designed to generate and interact with documentation for software repositories.

## Tech Stack
- **Frontend**: Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS, Mermaid.js.
- **Backend**: FastAPI (Python 3.10+), Adalflow (RAG framework), FAISS (Vector DB).
- **AI**: Provider-agnostic (OpenAI, Google Gemini, Ollama, Azure, OpenRouter, etc.).

## Core Architecture
- **Modular Monolith**: The backend is organized into functional modules within `api/`.
- **RAG Pipeline**: Uses `adalflow` to orchestrate document ingestion, embedding, and retrieval.
- **Communication**: REST for configuration/metadata; WebSockets (`/ws/chat`) for real-time streaming chat.
- **Persistence**:
  - Repositories: `~/.adalflow/repos/`
  - Vector DBs: `~/.adalflow/databases/`
  - Wiki Cache: `~/.adalflow/wikicache/` (JSON)
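
The persistence layout above can be expressed as a small path helper. This is a sketch under assumptions: the function names and the `<key>.json` file naming are hypothetical, and only the three directory names and the JSON format come from the list above.

```python
from pathlib import Path
from typing import Dict

# Root directory named in the overview.
ADALFLOW_ROOT = Path.home() / ".adalflow"


def storage_dirs(root: Path = ADALFLOW_ROOT) -> Dict[str, Path]:
    """Map each persistence area to its directory under the root."""
    return {
        "repos": root / "repos",
        "databases": root / "databases",
        "wikicache": root / "wikicache",
    }


def wiki_cache_file(key: str, root: Path = ADALFLOW_ROOT) -> Path:
    # The "<key>.json" naming is hypothetical; the overview only says the cache is JSON.
    return storage_dirs(root)["wikicache"] / f"{key}.json"
```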

## Key Directories
- `api/`: FastAPI backend logic.
- `api/tools/`: Embedders and utility clients.
- `src/app/`: Next.js pages and API routes.
- `src/components/`: Reusable React components.
- `src/utils/`: Frontend utilities (WebSocket client, etc.).

## Development Principles
- **Provider Agnostic**: Always use the factory patterns in `api/config.py` and `api/tools/embedder.py` when adding LLM support.
- **Type Safety**: Maintain parity between Pydantic models in `api/` and TypeScript interfaces in `src/types`.
- **Streaming First**: Prioritize streaming responses for AI interactions to improve UX.
1 change: 0 additions & 1 deletion api/api.py
@@ -7,7 +7,6 @@
import json
from datetime import datetime
from pydantic import BaseModel, Field
-import google.generativeai as genai
import asyncio

# Configure logging
5 changes: 3 additions & 2 deletions api/config/embedder.json
@@ -18,8 +18,9 @@
"client_class": "GoogleEmbedderClient",
"batch_size": 100,
"model_kwargs": {
-"model": "text-embedding-004",
-"task_type": "SEMANTIC_SIMILARITY"
+"model": "gemini-embedding-001",
+"task_type": "SEMANTIC_SIMILARITY",
+"output_dimensionality": 3072
}
},
"embedder_bedrock": {
8 changes: 7 additions & 1 deletion api/config/generator.json
@@ -20,9 +20,15 @@
}
},
"google": {
-"default_model": "gemini-2.5-flash",
+"default_model": "gemini-3-flash-preview",
"supportsCustomModel": true,
"models": {
+"gemini-3-flash-preview": {
+"temperature": 1.0,
+"top_p": 0.8,
+"top_k": 20,
+"thinking_level": "high"
+},
"gemini-2.5-flash": {
"temperature": 1.0,
"top_p": 0.8,