diff --git a/.cursor/rules/backend-rag-conventions.mdc b/.cursor/rules/backend-rag-conventions.mdc new file mode 100644 index 00000000..dbf712ff --- /dev/null +++ b/.cursor/rules/backend-rag-conventions.mdc @@ -0,0 +1,31 @@ +--- +description: Rules for Python backend development, Adalflow RAG pipeline, and API conventions. +globs: + - "api/**/*.py" +alwaysApply: false +--- + +# Backend & RAG Conventions + +This project uses FastAPI and the `adalflow` framework to build a robust RAG pipeline. + +## Python & FastAPI Patterns +- **Pydantic Models**: Use Pydantic for all request/response schemas. Define them in `api/api.py` or `api/websocket_wiki.py`. +- **Async Handlers**: Use `async def` for API endpoints and WebSocket handlers to maintain high concurrency. +- **Error Handling**: Use `fastapi.HTTPException` for REST errors. For WebSockets, send a descriptive text message before closing the connection. + +## Adalflow & RAG Logic +- **Component Pattern**: The `RAG` class in `api/rag.py` must inherit from `adal.Component`. +- **Memory**: Use the `Memory` class (from `adalflow`) for conversation history management. +- **Data Pipeline**: Use `api/data_pipeline.py` for repository cloning and document parsing. Respect the exclusion filters defined in `repo.json`. +- **Embedders**: Always initialize embedders via `get_embedder` in `api/tools/embedder.py`. +- **Prompts**: Centralize all LLM prompts in `api/prompts.py`. Use templates for dynamic content. + +## Vector Database (FAISS) +- Vector indices are managed by `DatabaseManager` in `api/data_pipeline.py`. +- Ensure embedding dimensions are validated before insertion (see `_validate_and_filter_embeddings` in `api/rag.py`). + +## Anti-Patterns +- Do not hardcode API keys; use environment variables via `api/config.py`. +- Avoid blocking synchronous calls in async routes (use `run_in_threadpool` if necessary). +- Do not bypass the `DatabaseManager` for file system operations in `~/.adalflow`. \ No newline at end of file diff --git a/.cursor/rules/frontend-conventions.mdc b/.cursor/rules/frontend-conventions.mdc new file mode 100644 index 00000000..82181932 --- /dev/null +++ b/.cursor/rules/frontend-conventions.mdc @@ -0,0 +1,33 @@ +--- +description: Rules for Next.js frontend, TypeScript types, and UI component development. +globs: + - "src/**/*.{ts,tsx}" +alwaysApply: false +--- + +# Frontend & UI Conventions + +The frontend is a Next.js application focused on repository visualization and AI interaction. + +## Next.js & React Patterns +- **App Router**: Use the `src/app` directory for routing. Dynamic repository routes follow the `[owner]/[repo]` pattern. +- **Server vs. Client**: Use `"use client"` only for components requiring state, effects, or browser APIs (e.g., `Ask.tsx`, `WikiTreeView.tsx`). +- **TypeScript**: Strictly type all component props and API responses. Sync with backend Pydantic models. + +## AI & Chat UI +- **WebSockets**: Use `src/utils/websocketClient.ts` for all chat interactions. Do not implement raw WebSocket logic in components. +- **Markdown Rendering**: Use the `Markdown` component for AI responses to support syntax highlighting and Mermaid diagrams. +- **Streaming**: Handle partial message updates in the UI to provide a "typing" effect. + +## State & Data +- **Local Storage**: Cache generated wiki structures in `localStorage` to allow instant navigation. +- **i18n**: Use `next-intl` for all user-facing strings. Access via `useTranslations` hook. +- **Context**: Use `LanguageContext` for global language state. 
+ +## Styling +- **Tailwind CSS**: Use Tailwind for all styling. Follow the existing design system (dark/light mode support via `next-themes`). +- **Icons**: Use `lucide-react` for consistent iconography. + +## Anti-Patterns +- Do not fetch data directly from the backend in components; use the Next.js API proxy routes in `src/app/api/`. +- Avoid large monolithic components; break down complex views like the Wiki page into smaller sub-components. \ No newline at end of file diff --git a/.cursor/rules/project-overview.mdc b/.cursor/rules/project-overview.mdc new file mode 100644 index 00000000..89106306 --- /dev/null +++ b/.cursor/rules/project-overview.mdc @@ -0,0 +1,36 @@ +--- +description: General project context, architecture, and tech stack overview. +globs: + - "**/*" +alwaysApply: true +--- + +# DeepWiki-Open Project Overview + +DeepWiki-Open is a full-stack RAG (Retrieval-Augmented Generation) application designed to generate and interact with documentation for software repositories. + +## Tech Stack +- **Frontend**: Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS, Mermaid.js. +- **Backend**: FastAPI (Python 3.11+), Adalflow (RAG framework), FAISS (Vector DB). +- **AI**: Provider-agnostic (OpenAI, Google Gemini, Ollama, Azure, OpenRouter, etc.). + +## Core Architecture +- **Modular Monolith**: The backend is organized into functional modules within `api/`. +- **RAG Pipeline**: Uses `adalflow` to orchestrate document ingestion, embedding, and retrieval. +- **Communication**: REST for configuration/metadata; WebSockets (`/ws/chat`) for real-time streaming chat. +- **Persistence**: + - Repositories: `~/.adalflow/repos/` + - Vector DBs: `~/.adalflow/databases/` + - Wiki Cache: `~/.adalflow/wikicache/` (JSON) + +## Key Directories +- `api/`: FastAPI backend logic. +- `api/tools/`: Embedders and utility clients. +- `src/app/`: Next.js pages and API routes. +- `src/components/`: Reusable React components. +- `src/utils/`: Frontend utilities (WebSocket client, etc.). + +## Development Principles +- **Provider Agnostic**: Always use the factory patterns in `api/config.py` and `api/tools/embedder.py` when adding LLM support (see the sketch below). +- **Type Safety**: Maintain parity between Pydantic models in `api/` and TypeScript interfaces in `src/types`. +- **Streaming First**: Prioritize streaming responses for AI interactions to improve UX. 
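+
+A minimal sketch of the provider-agnostic flow (the shape of the dict returned by `get_model_config` is inferred from its call sites in `api/simple_chat.py` and `api/websocket_wiki.py`, so treat the exact keys as assumptions):
+
+```python
+from api.config import get_model_config
+
+# Resolve provider/model settings from configuration instead of hardcoding them.
+model_config = get_model_config("google", "gemini-2.5-flash")
+kwargs = model_config["model_kwargs"]           # e.g. {"model": ..., "temperature": ...}
+model_name = kwargs["model"]
+temperature = kwargs.get("temperature", 0.7)    # defaults mirror the fallback paths
+```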
\ No newline at end of file diff --git a/api/api.py b/api/api.py index d40e73f9..4decd724 100644 --- a/api/api.py +++ b/api/api.py @@ -7,7 +7,6 @@ import json from datetime import datetime from pydantic import BaseModel, Field -import google.generativeai as genai import asyncio # Configure logging diff --git a/api/config/embedder.json b/api/config/embedder.json index 0101ac08..b87e01c7 100644 --- a/api/config/embedder.json +++ b/api/config/embedder.json @@ -18,8 +18,9 @@ "client_class": "GoogleEmbedderClient", "batch_size": 100, "model_kwargs": { - "model": "text-embedding-004", - "task_type": "SEMANTIC_SIMILARITY" + "model": "gemini-embedding-001", + "task_type": "SEMANTIC_SIMILARITY", + "output_dimensionality": 3072 } }, "embedder_bedrock": { diff --git a/api/config/generator.json b/api/config/generator.json index f8817909..a6c20e61 100644 --- a/api/config/generator.json +++ b/api/config/generator.json @@ -20,9 +20,15 @@ } }, "google": { - "default_model": "gemini-2.5-flash", + "default_model": "gemini-3-flash-preview", "supportsCustomModel": true, "models": { + "gemini-3-flash-preview": { + "temperature": 1.0, + "top_p": 0.8, + "top_k": 20, + "thinking_level": "high" + }, "gemini-2.5-flash": { "temperature": 1.0, "top_p": 0.8, diff --git a/api/google_embedder_client.py b/api/google_embedder_client.py index b604fd8e..54da4049 100644 --- a/api/google_embedder_client.py +++ b/api/google_embedder_client.py @@ -1,4 +1,4 @@ -"""Google AI Embeddings ModelClient integration.""" +"""Google AI Embeddings ModelClient integration using the new google-genai SDK.""" import os import logging @@ -9,10 +9,10 @@ from adalflow.core.types import ModelType, EmbedderOutput try: - import google.generativeai as genai - from google.generativeai.types.text_types import EmbeddingDict, BatchEmbeddingDict + from google import genai + from google.genai import types as genai_types except ImportError: - raise ImportError("google-generativeai is required. Install it with 'pip install google-generativeai'") + raise ImportError("google-genai is required. Install it with 'pip install google-genai'") log = logging.getLogger(__name__) @@ -20,9 +20,9 @@ class GoogleEmbedderClient(ModelClient): __doc__ = r"""A component wrapper for Google AI Embeddings API client. - This client provides access to Google's embedding models through the Google AI API. - It supports text embeddings for various tasks including semantic similarity, - retrieval, and classification. + This client provides access to Google's embedding models through the Google AI API + using the new google-genai SDK. It supports text embeddings for various tasks + including semantic similarity, retrieval, and classification. Args: api_key (Optional[str]): Google AI API key. Defaults to None. @@ -39,15 +39,16 @@ class GoogleEmbedderClient(ModelClient): embedder = adal.Embedder( model_client=client, model_kwargs={ - "model": "text-embedding-004", - "task_type": "SEMANTIC_SIMILARITY" + "model": "gemini-embedding-001", + "task_type": "SEMANTIC_SIMILARITY", + "output_dimensionality": 3072 } ) ``` References: - Google AI Embeddings: https://ai.google.dev/gemini-api/docs/embeddings - - Available models: text-embedding-004, embedding-001 + - Available models: gemini-embedding-001, text-embedding-004 """ def __init__( @@ -56,7 +57,7 @@ def __init__( env_api_key_name: str = "GOOGLE_API_KEY", ): """Initialize Google AI Embeddings client. - + Args: api_key: Google AI API key. If not provided, uses environment variable. 
env_api_key_name: Name of environment variable containing API key. @@ -64,6 +65,7 @@ def __init__( super().__init__() self._api_key = api_key self._env_api_key_name = env_api_key_name + self._client = None self._initialize_client() def _initialize_client(self): @@ -73,58 +75,35 @@ def _initialize_client(self): raise ValueError( f"Environment variable {self._env_api_key_name} must be set" ) - genai.configure(api_key=api_key) + self._client = genai.Client(api_key=api_key) def parse_embedding_response(self, response) -> EmbedderOutput: """Parse Google AI embedding response to EmbedderOutput format. - + Args: - response: Google AI embedding response (EmbeddingDict or BatchEmbeddingDict) - + response: Google AI embedding response from embed_content + Returns: EmbedderOutput with parsed embeddings """ try: from adalflow.core.types import Embedding - + embedding_data = [] - - if isinstance(response, dict): - if 'embedding' in response: - embedding_value = response['embedding'] - if isinstance(embedding_value, list) and len(embedding_value) > 0: - # Check if it's a single embedding (list of floats) or batch (list of lists) - if isinstance(embedding_value[0], (int, float)): - # Single embedding response: {'embedding': [float, ...]} - embedding_data = [Embedding(embedding=embedding_value, index=0)] - else: - # Batch embeddings response: {'embedding': [[float, ...], [float, ...], ...]} - embedding_data = [ - Embedding(embedding=emb_list, index=i) - for i, emb_list in enumerate(embedding_value) - ] + + # New SDK returns response with .embeddings attribute + # Each embedding has .values which is a list of floats + if hasattr(response, 'embeddings') and response.embeddings: + for i, emb in enumerate(response.embeddings): + if hasattr(emb, 'values'): + embedding_data.append(Embedding(embedding=list(emb.values), index=i)) + elif isinstance(emb, (list, tuple)): + embedding_data.append(Embedding(embedding=list(emb), index=i)) else: - log.warning(f"Empty or invalid embedding data: {embedding_value}") - embedding_data = [] - elif 'embeddings' in response: - # Alternative batch format: {'embeddings': [{'embedding': [float, ...]}, ...]} - embedding_data = [ - Embedding(embedding=item['embedding'], index=i) - for i, item in enumerate(response['embeddings']) - ] - else: - log.warning(f"Unexpected response structure: {response.keys()}") - embedding_data = [] - elif hasattr(response, 'embeddings'): - # Custom batch response object from our implementation - embedding_data = [ - Embedding(embedding=emb, index=i) - for i, emb in enumerate(response.embeddings) - ] + log.warning(f"Unexpected embedding format at index {i}: {type(emb)}") else: - log.warning(f"Unexpected response type: {type(response)}") - embedding_data = [] - + log.warning(f"Unexpected response structure: {type(response)}") + return EmbedderOutput( data=embedding_data, error=None, @@ -145,43 +124,44 @@ def convert_inputs_to_api_kwargs( model_type: ModelType = ModelType.UNDEFINED, ) -> Dict: """Convert inputs to Google AI API format. 
- + Args: input: Text input(s) to embed - model_kwargs: Model parameters including model name and task_type + model_kwargs: Model parameters including model name, task_type, output_dimensionality model_type: Should be ModelType.EMBEDDER for this client - + Returns: Dict: API kwargs for Google AI embedding call """ if model_type != ModelType.EMBEDDER: raise ValueError(f"GoogleEmbedderClient only supports EMBEDDER model type, got {model_type}") - + # Ensure input is a list if isinstance(input, str): - content = [input] + contents = [input] elif isinstance(input, Sequence): - content = list(input) + contents = list(input) else: raise TypeError("input must be a string or sequence of strings") - - final_model_kwargs = model_kwargs.copy() - - # Handle single vs batch embedding - if len(content) == 1: - final_model_kwargs["content"] = content[0] - else: - final_model_kwargs["contents"] = content - - # Set default task type if not provided - if "task_type" not in final_model_kwargs: - final_model_kwargs["task_type"] = "SEMANTIC_SIMILARITY" - - # Set default model if not provided - if "model" not in final_model_kwargs: - final_model_kwargs["model"] = "text-embedding-004" - - return final_model_kwargs + + final_kwargs = { + "model": model_kwargs.get("model", "gemini-embedding-001"), + "contents": contents, + } + + # Build EmbedContentConfig if we have additional parameters + config_params = {} + + if "task_type" in model_kwargs: + config_params["task_type"] = model_kwargs["task_type"] + + if "output_dimensionality" in model_kwargs: + config_params["output_dimensionality"] = model_kwargs["output_dimensionality"] + + if config_params: + final_kwargs["config"] = genai_types.EmbedContentConfig(**config_params) + + return final_kwargs @backoff.on_exception( backoff.expo, ) def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED): """Call Google AI embedding API. - + Args: api_kwargs: API parameters model_type: Should be ModelType.EMBEDDER - + Returns: Google AI embedding response """ if model_type != ModelType.EMBEDDER: raise ValueError(f"GoogleEmbedderClient only supports EMBEDDER model type") - - log.info(f"Google AI Embeddings API kwargs: {api_kwargs}") - + + log.info(f"Google AI Embeddings API call with model: {api_kwargs.get('model')}") + try: - # Use embed_content for single text or batch embedding - if "content" in api_kwargs: - # Single embedding - response = genai.embed_content(**api_kwargs) - elif "contents" in api_kwargs: - # Batch embedding - Google AI supports batch natively - contents = api_kwargs.pop("contents") - response = genai.embed_content(content=contents, **api_kwargs) - else: - raise ValueError("Either 'content' or 'contents' must be provided") - + # Use the new SDK client.models.embed_content + response = self._client.models.embed_content(**api_kwargs) return response - + except Exception as e: log.error(f"Error calling Google AI Embeddings API: {e}") raise async def acall(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED): """Async call to Google AI embedding API. - - Note: Google AI Python client doesn't have async support yet, - so this falls back to synchronous call. + + Note: The underlying google-genai call is synchronous, so this method + offloads it to a worker thread via asyncio.to_thread to avoid + blocking the event loop.
""" - # Google AI client doesn't have async support yet - return self.call(api_kwargs, model_type) \ No newline at end of file + import asyncio + return await asyncio.to_thread(self.call, api_kwargs, model_type) diff --git a/api/main.py b/api/main.py index fe083f55..63e4fd5c 100644 --- a/api/main.py +++ b/api/main.py @@ -50,14 +50,12 @@ def patched_watch(*args, **kwargs): logger.warning(f"Missing environment variables: {', '.join(missing_vars)}") logger.warning("Some functionality may not work correctly without these variables.") -# Configure Google Generative AI -import google.generativeai as genai +# Note: Google GenAI configuration is now done at client instantiation time +# with the new google-genai SDK. The API key is passed to genai.Client() from api.config import GOOGLE_API_KEY -if GOOGLE_API_KEY: - genai.configure(api_key=GOOGLE_API_KEY) -else: - logger.warning("GOOGLE_API_KEY not configured") +if not GOOGLE_API_KEY: + logger.warning("GOOGLE_API_KEY not configured - Google Gemini models will not work") if __name__ == "__main__": # Get port from environment variable or use default diff --git a/api/poetry.lock b/api/poetry.lock index a2446bba..e07df94a 100644 --- a/api/poetry.lock +++ b/api/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 2.0.1 and should not be changed by hand. +# This file is automatically @generated by Poetry 2.2.1 and should not be changed by hand. [[package]] name = "adalflow" @@ -37,7 +37,7 @@ faiss-cpu = ["faiss-cpu (>=1.8.0)"] google-generativeai = ["google-generativeai (>=0.7.2)"] groq = ["groq (>=0.9.0)"] lancedb = ["lancedb (>=0.5.2)"] -mcp = ["mcp (>=1.9.4,<2.0.0)"] +mcp = ["mcp (>=1.9.4,<2.0.0) ; python_version >= \"3.10\""] ollama = ["ollama (>=0.2.1)"] openai = ["openai (>=1.97.1)"] pgvector = ["pgvector (>=0.3.1)"] @@ -197,7 +197,7 @@ propcache = ">=0.2.0" yarl = ">=1.17.0,<2.0" [package.extras] -speedups = ["Brotli", "aiodns (>=3.3.0)", "backports.zstd", "brotlicffi"] +speedups = ["Brotli ; platform_python_implementation == \"CPython\"", "aiodns (>=3.3.0)", "backports.zstd ; platform_python_implementation == \"CPython\" and python_version < \"3.14\"", "brotlicffi ; platform_python_implementation != \"CPython\""] [[package]] name = "aiosignal" @@ -261,8 +261,8 @@ files = [ [package.extras] doc = ["sphinx", "sphinxcontrib-trio"] -test = ["black", "coverage", "flake8", "flake8-2020", "flake8-bugbear", "mypy", "pytest", "pytest-cov"] -typetest = ["mypy", "pyright", "typing-extensions"] +test = ["black ; implementation_name == \"cpython\"", "coverage", "flake8", "flake8-2020", "flake8-bugbear", "mypy ; implementation_name == \"cpython\"", "pytest", "pytest-cov"] +typetest = ["mypy ; implementation_name == \"cpython\"", "pyright", "typing-extensions"] [[package]] name = "attrs" @@ -705,7 +705,7 @@ files = [ ] [package.dependencies] -cffi = {version = ">=2.0.0", markers = "python_full_version >= \"3.9\" and platform_python_implementation != \"PyPy\""} +cffi = {version = ">=2.0.0", markers = "python_full_version >= \"3.9.0\" and platform_python_implementation != \"PyPy\""} [package.extras] docs = ["sphinx (>=5.3.0)", "sphinx-inline-tabs", "sphinx-rtd-theme (>=3.0.0)"] @@ -1000,51 +1000,51 @@ requests = ">=2.18.0,<3.0.0" [package.extras] async-rest = ["google-auth[aiohttp] (>=2.35.0,<3.0.0)"] -grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0)", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0)"] +grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", 
"grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0) ; python_version >= \"3.11\""] grpcgcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] [[package]] name = "google-api-core" -version = "2.26.0" +version = "2.28.1" description = "Google API client core library" optional = false python-versions = ">=3.7" groups = ["main"] -markers = "python_version < \"3.14\"" +markers = "python_version <= \"3.13\"" files = [ - {file = "google_api_core-2.26.0-py3-none-any.whl", hash = "sha256:2b204bd0da2c81f918e3582c48458e24c11771f987f6258e6e227212af78f3ed"}, - {file = "google_api_core-2.26.0.tar.gz", hash = "sha256:e6e6d78bd6cf757f4aee41dcc85b07f485fbb069d5daa3afb126defba1e91a62"}, + {file = "google_api_core-2.28.1-py3-none-any.whl", hash = "sha256:4021b0f8ceb77a6fb4de6fde4502cecab45062e66ff4f2895169e0b35bc9466c"}, + {file = "google_api_core-2.28.1.tar.gz", hash = "sha256:2b405df02d68e68ce0fbc138559e6036559e685159d148ae5861013dc201baf8"}, ] [package.dependencies] google-auth = ">=2.14.1,<3.0.0" googleapis-common-protos = ">=1.56.2,<2.0.0" grpcio = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\" and python_version < \"3.14\""} -grpcio-status = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\" and python_version < \"3.14\""} +grpcio-status = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\""} proto-plus = [ {version = ">=1.25.0,<2.0.0", markers = "python_version >= \"3.13\""}, - {version = ">=1.22.3,<2.0.0", markers = "python_version < \"3.13\""}, + {version = ">=1.22.3,<2.0.0"}, ] protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0" requests = ">=2.18.0,<3.0.0" [package.extras] async-rest = ["google-auth[aiohttp] (>=2.35.0,<3.0.0)"] -grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0)", "grpcio (>=1.75.1,<2.0.0)", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0)", "grpcio-status (>=1.75.1,<2.0.0)"] +grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", "grpcio (>=1.75.1,<2.0.0) ; python_version >= \"3.14\"", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", "grpcio-status (>=1.75.1,<2.0.0) ; python_version >= \"3.14\""] grpcgcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] [[package]] name = "google-api-python-client" -version = "2.185.0" +version = "2.187.0" description = "Google API Client Library for Python" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "google_api_python_client-2.185.0-py3-none-any.whl", hash = "sha256:00fe173a4b346d2397fbe0d37ac15368170dfbed91a0395a66ef2558e22b93fc"}, - {file = "google_api_python_client-2.185.0.tar.gz", hash = "sha256:aa1b338e4bb0f141c2df26743f6b46b11f38705aacd775b61971cbc51da089c3"}, + {file = "google_api_python_client-2.187.0-py3-none-any.whl", hash = "sha256:d8d0f6d85d7d1d10bdab32e642312ed572bdc98919f72f831b44b9a9cebba32f"}, + {file = "google_api_python_client-2.187.0.tar.gz", hash = "sha256:e98e8e8f49e1b5048c2f8276473d6485febc76c9c47892a8b4d1afa2c9ec8278"}, ] [package.dependencies] @@ -1056,56 +1056,86 @@ uritemplate = ">=3.0.1,<5" [[package]] name = "google-auth" -version = "2.41.1" +version = "2.45.0" description = "Google Authentication 
Library" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "google_auth-2.41.1-py2.py3-none-any.whl", hash = "sha256:754843be95575b9a19c604a848a41be03f7f2afd8c019f716dc1f51ee41c639d"}, - {file = "google_auth-2.41.1.tar.gz", hash = "sha256:b76b7b1f9e61f0cb7e88870d14f6a94aeef248959ef6992670efee37709cbfd2"}, + {file = "google_auth-2.45.0-py2.py3-none-any.whl", hash = "sha256:82344e86dc00410ef5382d99be677c6043d72e502b625aa4f4afa0bdacca0f36"}, + {file = "google_auth-2.45.0.tar.gz", hash = "sha256:90d3f41b6b72ea72dd9811e765699ee491ab24139f34ebf1ca2b9cc0c38708f3"}, ] [package.dependencies] cachetools = ">=2.0.0,<7.0" pyasn1-modules = ">=0.2.1" +requests = {version = ">=2.20.0,<3.0.0", optional = true, markers = "extra == \"requests\""} rsa = ">=3.1.4,<5" [package.extras] aiohttp = ["aiohttp (>=3.6.2,<4.0.0)", "requests (>=2.20.0,<3.0.0)"] +cryptography = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)"] enterprise-cert = ["cryptography", "pyopenssl"] -pyjwt = ["cryptography (<39.0.0)", "cryptography (>=38.0.3)", "pyjwt (>=2.0)"] -pyopenssl = ["cryptography (<39.0.0)", "cryptography (>=38.0.3)", "pyopenssl (>=20.0.0)"] +pyjwt = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "pyjwt (>=2.0)"] +pyopenssl = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "pyopenssl (>=20.0.0)"] reauth = ["pyu2f (>=0.1.5)"] requests = ["requests (>=2.20.0,<3.0.0)"] -testing = ["aiohttp (<3.10.0)", "aiohttp (>=3.6.2,<4.0.0)", "aioresponses", "cryptography (<39.0.0)", "cryptography (<39.0.0)", "cryptography (>=38.0.3)", "cryptography (>=38.0.3)", "flask", "freezegun", "grpcio", "mock", "oauth2client", "packaging", "pyjwt (>=2.0)", "pyopenssl (<24.3.0)", "pyopenssl (>=20.0.0)", "pytest", "pytest-asyncio", "pytest-cov", "pytest-localserver", "pyu2f (>=0.1.5)", "requests (>=2.20.0,<3.0.0)", "responses", "urllib3"] +testing = ["aiohttp (<3.10.0)", "aiohttp (>=3.6.2,<4.0.0)", "aioresponses", "cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "cryptography (>=38.0.3)", "flask", "freezegun", "grpcio", "mock", "oauth2client", "packaging", "pyjwt (>=2.0)", "pyopenssl (<24.3.0)", "pyopenssl (>=20.0.0)", "pytest", "pytest-asyncio", "pytest-cov", "pytest-localserver", "pyu2f (>=0.1.5)", "requests (>=2.20.0,<3.0.0)", "responses", "urllib3"] urllib3 = ["packaging", "urllib3"] [[package]] name = "google-auth-httplib2" -version = "0.2.0" +version = "0.3.0" description = "Google Authentication Library: httplib2 transport" optional = false -python-versions = "*" +python-versions = ">=3.7" +groups = ["main"] +files = [ + {file = "google_auth_httplib2-0.3.0-py3-none-any.whl", hash = "sha256:426167e5df066e3f5a0fc7ea18768c08e7296046594ce4c8c409c2457dd1f776"}, + {file = "google_auth_httplib2-0.3.0.tar.gz", hash = "sha256:177898a0175252480d5ed916aeea183c2df87c1f9c26705d74ae6b951c268b0b"}, +] + +[package.dependencies] +google-auth = ">=1.32.0,<3.0.0" +httplib2 = ">=0.19.0,<1.0.0" + +[[package]] +name = "google-genai" +version = "1.56.0" +description = "GenAI Python SDK" +optional = false +python-versions = ">=3.10" groups = ["main"] files = [ - {file = "google-auth-httplib2-0.2.0.tar.gz", hash = "sha256:38aa7badf48f974f1eb9861794e9c0cb2a0511a4ec0679b1f886d108f5640e05"}, - {file = "google_auth_httplib2-0.2.0-py2.py3-none-any.whl", hash = "sha256:b65a0a2123300dd71281a7bf6e64d65a0759287df52729bdd1ae2e47dc311a3d"}, + {file = 
"google_genai-1.56.0-py3-none-any.whl", hash = "sha256:9e6b11e0c105ead229368cb5849a480e4d0185519f8d9f538d61ecfcf193b052"}, + {file = "google_genai-1.56.0.tar.gz", hash = "sha256:0491af33c375f099777ae207d9621f044e27091fafad4c50e617eba32165e82f"}, ] [package.dependencies] -google-auth = "*" -httplib2 = ">=0.19.0" +anyio = ">=4.8.0,<5.0.0" +distro = ">=1.7.0,<2" +google-auth = {version = ">=2.45.0,<3.0.0", extras = ["requests"]} +httpx = ">=0.28.1,<1.0.0" +pydantic = ">=2.9.0,<3.0.0" +requests = ">=2.28.1,<3.0.0" +sniffio = "*" +tenacity = ">=8.2.3,<9.2.0" +typing-extensions = ">=4.11.0,<5.0.0" +websockets = ">=13.0.0,<15.1.0" + +[package.extras] +aiohttp = ["aiohttp (<3.13.3)"] +local-tokenizer = ["protobuf", "sentencepiece (>=0.2.0)"] [[package]] name = "google-generativeai" -version = "0.8.5" +version = "0.8.6" description = "Google Generative AI High level API client library and tools." optional = false python-versions = ">=3.9" groups = ["main"] files = [ - {file = "google_generativeai-0.8.5-py3-none-any.whl", hash = "sha256:22b420817fb263f8ed520b33285f45976d5b21e904da32b80d4fd20c055123a2"}, + {file = "google_generativeai-0.8.6-py3-none-any.whl", hash = "sha256:37a0eaaa95e5bbf888828e20a4a1b2c196cc9527d194706e58a68ff388aeb0fa"}, ] [package.dependencies] @@ -1123,14 +1153,14 @@ dev = ["Pillow", "absl-py", "black", "ipython", "nose2", "pandas", "pytype", "py [[package]] name = "googleapis-common-protos" -version = "1.71.0" +version = "1.72.0" description = "Common protobufs used in Google APIs" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "googleapis_common_protos-1.71.0-py3-none-any.whl", hash = "sha256:59034a1d849dc4d18971997a72ac56246570afdd17f9369a0ff68218d50ab78c"}, - {file = "googleapis_common_protos-1.71.0.tar.gz", hash = "sha256:1aec01e574e29da63c80ba9f7bbf1ccfaacf1da877f23609fe236ca7c72a2e2e"}, + {file = "googleapis_common_protos-1.72.0-py3-none-any.whl", hash = "sha256:4299c5a82d5ae1a9702ada957347726b167f9f8d1fc352477702a1e851ff4038"}, + {file = "googleapis_common_protos-1.72.0.tar.gz", hash = "sha256:e55a601c1b32b52d7a3e65f43563e2aa61bcd737998ee672ac9b951cd49319f5"}, ] [package.dependencies] @@ -1354,7 +1384,7 @@ httpcore = "==1.*" idna = "*" [package.extras] -brotli = ["brotli", "brotlicffi"] +brotli = ["brotli ; platform_python_implementation == \"CPython\"", "brotlicffi ; platform_python_implementation != \"CPython\""] cli = ["click (==8.*)", "pygments (==2.*)", "rich (>=10,<14)"] http2 = ["h2 (>=3,<5)"] socks = ["socksio (==1.*)"] @@ -1675,7 +1705,7 @@ PyJWT = {version = ">=1.0.0,<3", extras = ["crypto"]} requests = ">=2.0.0,<3" [package.extras] -broker = ["pymsalruntime (>=0.14,<0.19)", "pymsalruntime (>=0.17,<0.19)", "pymsalruntime (>=0.18,<0.19)"] +broker = ["pymsalruntime (>=0.14,<0.19) ; python_version >= \"3.6\" and platform_system == \"Windows\"", "pymsalruntime (>=0.17,<0.19) ; python_version >= \"3.8\" and platform_system == \"Darwin\"", "pymsalruntime (>=0.18,<0.19) ; python_version >= \"3.8\" and platform_system == \"Linux\""] [[package]] name = "msal-extensions" @@ -2153,14 +2183,14 @@ files = [ [[package]] name = "proto-plus" -version = "1.26.1" +version = "1.27.0" description = "Beautiful, Pythonic protocol buffers" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "proto_plus-1.26.1-py3-none-any.whl", hash = "sha256:13285478c2dcf2abb829db158e1047e2f1e8d63a077d94263c2b88b043c75a66"}, - {file = "proto_plus-1.26.1.tar.gz", hash = 
"sha256:21a515a4c4c0088a773899e23c7bbade3d18f9c66c73edd4c7ee3816bc96a012"}, + {file = "proto_plus-1.27.0-py3-none-any.whl", hash = "sha256:1baa7f81cf0f8acb8bc1f6d085008ba4171eaf669629d1b6d1673b21ed1c0a82"}, + {file = "proto_plus-1.27.0.tar.gz", hash = "sha256:873af56dd0d7e91836aee871e5799e1c6f1bda86ac9a983e0bb9f0c266a568c4"}, ] [package.dependencies] @@ -2250,7 +2280,7 @@ typing-inspection = ">=0.4.2" [package.extras] email = ["email-validator (>=2.0.0)"] -timezone = ["tzdata"] +timezone = ["tzdata ; python_version >= \"3.9\" and platform_system == \"Windows\""] [[package]] name = "pydantic-core" @@ -2811,6 +2841,22 @@ typing-extensions = {version = ">=4.10.0", markers = "python_version < \"3.13\"" [package.extras] full = ["httpx (>=0.27.0,<0.29.0)", "itsdangerous", "jinja2", "python-multipart (>=0.0.18)", "pyyaml"] +[[package]] +name = "tenacity" +version = "9.1.2" +description = "Retry code until it succeeds" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "tenacity-9.1.2-py3-none-any.whl", hash = "sha256:f77bf36710d8b73a50b2dd155c97b870017ad21afe6ab300326b0371b3b05138"}, + {file = "tenacity-9.1.2.tar.gz", hash = "sha256:1169d376c297e7de388d18b4481760d478b0e99a777cad3a9c86e556f4b697cb"}, +] + +[package.extras] +doc = ["reno", "sphinx"] +test = ["pytest", "tornado (>=4.5)", "typeguard"] + [[package]] name = "tiktoken" version = "0.12.0" @@ -2959,7 +3005,7 @@ files = [ ] [package.extras] -brotli = ["brotli (>=1.0.9)", "brotlicffi (>=0.8.0)"] +brotli = ["brotli (>=1.0.9) ; platform_python_implementation == \"CPython\"", "brotlicffi (>=0.8.0) ; platform_python_implementation != \"CPython\""] h2 = ["h2 (>=4,<5)"] socks = ["pysocks (>=1.5.6,!=1.5.7,<2.0)"] zstd = ["zstandard (>=0.18.0)"] @@ -2983,12 +3029,12 @@ h11 = ">=0.8" httptools = {version = ">=0.6.3", optional = true, markers = "extra == \"standard\""} python-dotenv = {version = ">=0.13", optional = true, markers = "extra == \"standard\""} pyyaml = {version = ">=5.1", optional = true, markers = "extra == \"standard\""} -uvloop = {version = ">=0.15.1", optional = true, markers = "(sys_platform != \"win32\" and sys_platform != \"cygwin\") and platform_python_implementation != \"PyPy\" and extra == \"standard\""} +uvloop = {version = ">=0.15.1", optional = true, markers = "sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\" and extra == \"standard\""} watchfiles = {version = ">=0.13", optional = true, markers = "extra == \"standard\""} websockets = {version = ">=10.4", optional = true, markers = "extra == \"standard\""} [package.extras] -standard = ["colorama (>=0.4)", "httptools (>=0.6.3)", "python-dotenv (>=0.13)", "pyyaml (>=5.1)", "uvloop (>=0.15.1)", "watchfiles (>=0.13)", "websockets (>=10.4)"] +standard = ["colorama (>=0.4) ; sys_platform == \"win32\"", "httptools (>=0.6.3)", "python-dotenv (>=0.13)", "pyyaml (>=5.1)", "uvloop (>=0.15.1) ; sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\"", "watchfiles (>=0.13)", "websockets (>=10.4)"] [[package]] name = "uvloop" @@ -2997,7 +3043,7 @@ description = "Fast implementation of asyncio event loop on top of libuv" optional = false python-versions = ">=3.8.1" groups = ["main"] -markers = "(sys_platform != \"win32\" and sys_platform != \"cygwin\") and platform_python_implementation != \"PyPy\"" +markers = "sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\"" files = [ {file = 
"uvloop-0.22.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:ef6f0d4cc8a9fa1f6a910230cd53545d9a14479311e87e3cb225495952eb672c"}, {file = "uvloop-0.22.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:7cd375a12b71d33d46af85a3343b35d98e8116134ba404bd657b3b1d15988792"}, @@ -3404,4 +3450,4 @@ propcache = ">=0.2.1" [metadata] lock-version = "2.1" python-versions = "^3.11" -content-hash = "b558e94d5d8bdcc4273f47c52c8bfa6f4e003df0cf754f56340b8b98283d4a8d" +content-hash = "1933fdd12b1645fa83d6e66b8f1683ca2d927836a0bacd2a631f04adefb115d2" diff --git a/api/pyproject.toml b/api/pyproject.toml index 09760f8b..5c6d9e6c 100644 --- a/api/pyproject.toml +++ b/api/pyproject.toml @@ -12,7 +12,8 @@ python = "^3.11" fastapi = ">=0.95.0" uvicorn = { extras = ["standard"], version = ">=0.21.1" } pydantic = ">=2.0.0" -google-generativeai = ">=0.3.0" +google-genai = ">=1.0.0" +google-generativeai = ">=0.8.0" # Required by adalflow's GoogleGenAIClient - temporary until adalflow migrates to google-genai tiktoken = ">=0.5.0" adalflow = ">=0.1.0" numpy = ">=1.24.0" diff --git a/api/simple_chat.py b/api/simple_chat.py index 41a184ed..f5f771d0 100644 --- a/api/simple_chat.py +++ b/api/simple_chat.py @@ -3,7 +3,8 @@ from typing import List, Optional from urllib.parse import unquote -import google.generativeai as genai +from google import genai +from google.genai import types as genai_types from adalflow.components.model_client.ollama_client import OllamaClient from adalflow.core.types import ModelType from fastapi import FastAPI, HTTPException @@ -11,7 +12,7 @@ from fastapi.responses import StreamingResponse from pydantic import BaseModel, Field -from api.config import get_model_config, configs, OPENROUTER_API_KEY, OPENAI_API_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY +from api.config import get_model_config, configs, OPENROUTER_API_KEY, OPENAI_API_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, GOOGLE_API_KEY from api.data_pipeline import count_tokens, get_file_content from api.openai_client import OpenAIClient from api.openrouter_client import OpenRouterClient @@ -450,14 +451,23 @@ async def chat_completions_stream(request: ChatCompletionRequest): model_type=ModelType.LLM, ) else: - # Initialize Google Generative AI model (default provider) - model = genai.GenerativeModel( - model_name=model_config["model"], - generation_config={ - "temperature": model_config["temperature"], - "top_p": model_config["top_p"], - "top_k": model_config["top_k"], - }, + # Initialize Google Generative AI client (new google-genai SDK) + google_client = genai.Client(api_key=GOOGLE_API_KEY) + google_model_name = model_config["model"] + + # Build thinking_config if thinking_level is specified (for Gemini 3 models) + thinking_config = None + if "thinking_level" in model_config: + thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["thinking_level"].upper() + ) + logger.info(f"Using thinking_level: {model_config['thinking_level']}") + + google_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["temperature"], + top_p=model_config["top_p"], + top_k=model_config["top_k"], + thinking_config=thinking_config ) # Create a streaming response @@ -550,10 +560,14 @@ async def response_stream(): "DASHSCOPE_WORKSPACE_ID) environment variables with valid values." 
) else: - # Google Generative AI (default provider) - response = model.generate_content(prompt, stream=True) + # Google Generative AI (default provider) - using new google-genai SDK + response = google_client.models.generate_content_stream( + model=google_model_name, + contents=prompt, + config=google_generation_config + ) for chunk in response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: yield chunk.text except Exception as e_outer: @@ -711,22 +725,31 @@ async def response_stream(): "DASHSCOPE_WORKSPACE_ID) environment variables with valid values." ) else: - # Google Generative AI fallback (default provider) + # Google Generative AI fallback (default provider) - using new google-genai SDK model_config = get_model_config(request.provider, request.model) - fallback_model = genai.GenerativeModel( - model_name=model_config["model_kwargs"]["model"], - generation_config={ - "temperature": model_config["model_kwargs"].get("temperature", 0.7), - "top_p": model_config["model_kwargs"].get("top_p", 0.8), - "top_k": model_config["model_kwargs"].get("top_k", 40), - }, + + # Build thinking_config if thinking_level is specified + fallback_thinking_config = None + if "thinking_level" in model_config["model_kwargs"]: + fallback_thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["model_kwargs"]["thinking_level"].upper() + ) + + fallback_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["model_kwargs"].get("temperature", 0.7), + top_p=model_config["model_kwargs"].get("top_p", 0.8), + top_k=model_config["model_kwargs"].get("top_k", 40), + thinking_config=fallback_thinking_config ) - - fallback_response = fallback_model.generate_content( - simplified_prompt, stream=True + + fallback_client = genai.Client(api_key=GOOGLE_API_KEY) + fallback_response = fallback_client.models.generate_content_stream( + model=model_config["model_kwargs"]["model"], + contents=simplified_prompt, + config=fallback_generation_config ) for chunk in fallback_response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: yield chunk.text except Exception as e2: logger.error(f"Error in fallback streaming response: {str(e2)}") diff --git a/api/websocket_wiki.py b/api/websocket_wiki.py index a6acac50..c7204de3 100644 --- a/api/websocket_wiki.py +++ b/api/websocket_wiki.py @@ -3,12 +3,14 @@ from typing import List, Optional, Dict, Any from urllib.parse import unquote -import google.generativeai as genai +from google import genai +from google.genai import types as genai_types from adalflow.components.model_client.ollama_client import OllamaClient from adalflow.core.types import ModelType from fastapi import WebSocket, WebSocketDisconnect, HTTPException from pydantic import BaseModel, Field +from api.config import GOOGLE_API_KEY from api.config import ( get_model_config, configs, @@ -559,14 +561,23 @@ async def handle_websocket_chat(websocket: WebSocket): model_type=ModelType.LLM ) else: - # Initialize Google Generative AI model - model = genai.GenerativeModel( - model_name=model_config["model"], - generation_config={ - "temperature": model_config["temperature"], - "top_p": model_config["top_p"], - "top_k": model_config["top_k"] - } + # Initialize Google Generative AI client (new google-genai SDK) + google_client = genai.Client(api_key=GOOGLE_API_KEY) + google_model_name = model_config["model"] + + # Build thinking_config if thinking_level is specified (for Gemini 3 models) + 
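# NOTE: generator.json stores "thinking_level" in lowercase ("high"); it is + # upper-cased below to match the SDK's ThinkingLevel enum naming (an + # assumption based on google-genai's types). + 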
thinking_config = None + if "thinking_level" in model_config: + thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["thinking_level"].upper() + ) + logger.info(f"Using thinking_level: {model_config['thinking_level']}") + + google_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["temperature"], + top_p=model_config["top_p"], + top_k=model_config["top_k"], + thinking_config=thinking_config ) # Process the response based on the provider @@ -685,10 +696,14 @@ async def handle_websocket_chat(websocket: WebSocket): # Close the WebSocket connection after sending the error message await websocket.close() else: - # Google Generative AI (default provider) - response = model.generate_content(prompt, stream=True) + # Google Generative AI (default provider) - using new google-genai SDK + response = google_client.models.generate_content_stream( + model=google_model_name, + contents=prompt, + config=google_generation_config + ) for chunk in response: - if hasattr(chunk, 'text'): + if hasattr(chunk, 'text') and chunk.text: await websocket.send_text(chunk.text) await websocket.close() @@ -856,22 +871,31 @@ async def handle_websocket_chat(websocket: WebSocket): ) await websocket.send_text(error_msg) else: - # Google Generative AI fallback (default provider) + # Google Generative AI fallback (default provider) - using new google-genai SDK model_config = get_model_config(request.provider, request.model) - fallback_model = genai.GenerativeModel( - model_name=model_config["model_kwargs"]["model"], - generation_config={ - "temperature": model_config["model_kwargs"].get("temperature", 0.7), - "top_p": model_config["model_kwargs"].get("top_p", 0.8), - "top_k": model_config["model_kwargs"].get("top_k", 40), - }, + + # Build thinking_config if thinking_level is specified + fallback_thinking_config = None + if "thinking_level" in model_config["model_kwargs"]: + fallback_thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["model_kwargs"]["thinking_level"].upper() + ) + + fallback_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["model_kwargs"].get("temperature", 0.7), + top_p=model_config["model_kwargs"].get("top_p", 0.8), + top_k=model_config["model_kwargs"].get("top_k", 40), + thinking_config=fallback_thinking_config ) - - fallback_response = fallback_model.generate_content( - simplified_prompt, stream=True + + fallback_client = genai.Client(api_key=GOOGLE_API_KEY) + fallback_response = fallback_client.models.generate_content_stream( + model=model_config["model_kwargs"]["model"], + contents=simplified_prompt, + config=fallback_generation_config ) for chunk in fallback_response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: await websocket.send_text(chunk.text) except Exception as e2: logger.error(f"Error in fallback streaming response: {str(e2)}")
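
For reference, the Google streaming path now shared by `api/simple_chat.py` and `api/websocket_wiki.py` reduces to the pattern below (a condensed sketch of the diff above; the model name, prompt, and sampling values are illustrative):

```python
from google import genai
from google.genai import types as genai_types

client = genai.Client(api_key="YOUR_GOOGLE_API_KEY")  # the app resolves this from api/config.py

config = genai_types.GenerateContentConfig(
    temperature=1.0,
    top_p=0.8,
    top_k=20,
    # Only set for Gemini 3 models, per the generator.json entry above.
    thinking_config=genai_types.ThinkingConfig(thinking_level="HIGH"),
)

# generate_content_stream replaces the old GenerativeModel(...).generate_content(stream=True).
for chunk in client.models.generate_content_stream(
    model="gemini-3-flash-preview",
    contents="Summarize this repository's architecture.",
    config=config,
):
    if chunk.text:  # skip metadata-only chunks with no text
        print(chunk.text, end="")
```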