diff --git a/.cursor/rules/backend-rag-conventions.mdc b/.cursor/rules/backend-rag-conventions.mdc new file mode 100644 index 00000000..dbf712ff --- /dev/null +++ b/.cursor/rules/backend-rag-conventions.mdc @@ -0,0 +1,31 @@ +--- +description: Rules for Python backend development, Adalflow RAG pipeline, and API conventions. +globs: + - "api/**/*.py" +alwaysApply: false +--- + +# Backend & RAG Conventions + +This project uses FastAPI and the `adalflow` framework to build a robust RAG pipeline. + +## Python & FastAPI Patterns +- **Pydantic Models**: Use Pydantic for all request/response schemas. Define them in `api/api.py` or `api/websocket_wiki.py`. +- **Async Handlers**: Use `async def` for API endpoints and WebSocket handlers to maintain high concurrency. +- **Error Handling**: Use `fastapi.HTTPException` for REST errors. For WebSockets, send a descriptive text message before closing the connection. + +## Adalflow & RAG Logic +- **Component Pattern**: The `RAG` class in `api/rag.py` must inherit from `adal.Component`. +- **Memory**: Use the `Memory` class (from `adalflow`) for conversation history management. +- **Data Pipeline**: Use `api/data_pipeline.py` for repository cloning and document parsing. Respect the exclusion filters defined in `repo.json`. +- **Embedders**: Always initialize embedders via `get_embedder` in `api/tools/embedder.py`. +- **Prompts**: Centralize all LLM prompts in `api/prompts.py`. Use templates for dynamic content. + +## Vector Database (FAISS) +- Vector indices are managed by `DatabaseManager` in `api/data_pipeline.py`. +- Ensure embedding dimensions are validated before insertion (see `_validate_and_filter_embeddings` in `api/rag.py`). + +## Anti-Patterns +- Do not hardcode API keys; use environment variables via `api/config.py`. +- Avoid blocking synchronous calls in async routes (use `run_in_threadpool` if necessary). +- Do not bypass the `DatabaseManager` for file system operations in `~/.adalflow`. \ No newline at end of file diff --git a/.cursor/rules/frontend-conventions.mdc b/.cursor/rules/frontend-conventions.mdc new file mode 100644 index 00000000..82181932 --- /dev/null +++ b/.cursor/rules/frontend-conventions.mdc @@ -0,0 +1,33 @@ +--- +description: Rules for Next.js frontend, TypeScript types, and UI component development. +globs: + - "src/**/*.{ts,tsx}" +alwaysApply: false +--- + +# Frontend & UI Conventions + +The frontend is a Next.js application focused on repository visualization and AI interaction. + +## Next.js & React Patterns +- **App Router**: Use the `src/app` directory for routing. Dynamic repository routes follow the `[owner]/[repo]` pattern. +- **Server vs. Client**: Use `"use client"` only for components requiring state, effects, or browser APIs (e.g., `Ask.tsx`, `WikiTreeView.tsx`). +- **TypeScript**: Strictly type all component props and API responses. Sync with backend Pydantic models. + +## AI & Chat UI +- **WebSockets**: Use `src/utils/websocketClient.ts` for all chat interactions. Do not implement raw WebSocket logic in components. +- **Markdown Rendering**: Use the `Markdown` component for AI responses to support syntax highlighting and Mermaid diagrams. +- **Streaming**: Handle partial message updates in the UI to provide a "typing" effect. + +## State & Data +- **Local Storage**: Cache generated wiki structures in `localStorage` to allow instant navigation. +- **i18n**: Use `next-intl` for all user-facing strings. Access via `useTranslations` hook. +- **Context**: Use `LanguageContext` for global language state. 
+ +## Styling +- **Tailwind CSS**: Use Tailwind for all styling. Follow the existing design system (dark/light mode support via `next-themes`). +- **Icons**: Use `lucide-react` for consistent iconography. + +## Anti-Patterns +- Do not fetch data directly from the backend in components; use the Next.js API proxy routes in `src/app/api/`. +- Avoid large monolithic components; break down complex views like the Wiki page into smaller sub-components. \ No newline at end of file diff --git a/.cursor/rules/project-overview.mdc b/.cursor/rules/project-overview.mdc new file mode 100644 index 00000000..89106306 --- /dev/null +++ b/.cursor/rules/project-overview.mdc @@ -0,0 +1,36 @@ +--- +description: General project context, architecture, and tech stack overview. +globs: + - "**/*" +alwaysApply: true +--- + +# DeepWiki-Open Project Overview + +DeepWiki-Open is a full-stack RAG (Retrieval-Augmented Generation) application designed to generate and interact with documentation for software repositories. + +## Tech Stack +- **Frontend**: Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS, Mermaid.js. +- **Backend**: FastAPI (Python 3.11+), Adalflow (RAG framework), FAISS (Vector DB). +- **AI**: Provider-agnostic (OpenAI, Google Gemini, Ollama, Azure, OpenRouter, etc.). + +## Core Architecture +- **Modular Monolith**: The backend is organized into functional modules within `api/`. +- **RAG Pipeline**: Uses `adalflow` to orchestrate document ingestion, embedding, and retrieval. +- **Communication**: REST for configuration/metadata; WebSockets (`/ws/chat`) for real-time streaming chat. +- **Persistence**: + - Repositories: `~/.adalflow/repos/` + - Vector DBs: `~/.adalflow/databases/` + - Wiki Cache: `~/.adalflow/wikicache/` (JSON) + +## Key Directories +- `api/`: FastAPI backend logic. +- `api/tools/`: Embedders and utility clients. +- `src/app/`: Next.js pages and API routes. +- `src/components/`: Reusable React components. +- `src/utils/`: Frontend utilities (WebSocket client, etc.). + +## Development Principles +- **Provider Agnostic**: Always use the factory patterns in `api/config.py` and `api/tools/embedder.py` when adding LLM support (see the sketch below). +- **Type Safety**: Maintain parity between Pydantic models in `api/` and TypeScript interfaces in `src/types`. +- **Streaming First**: Prioritize streaming responses for AI interactions to improve UX. 
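+
+A minimal sketch of the provider-agnostic flow (the shape of the dict returned by `get_model_config` is inferred from its call sites in `api/simple_chat.py` and `api/websocket_wiki.py`, so treat the exact keys as assumptions):
+
+```python
+from api.config import get_model_config
+
+# Resolve provider/model settings from configuration instead of hardcoding them.
+model_config = get_model_config("google", "gemini-2.5-flash")
+kwargs = model_config["model_kwargs"]           # e.g. {"model": ..., "temperature": ...}
+model_name = kwargs["model"]
+temperature = kwargs.get("temperature", 0.7)    # defaults mirror the fallback paths
+```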
\ No newline at end of file diff --git a/api/api.py b/api/api.py index d40e73f9..4decd724 100644 --- a/api/api.py +++ b/api/api.py @@ -7,7 +7,6 @@ import json from datetime import datetime from pydantic import BaseModel, Field -import google.generativeai as genai import asyncio # Configure logging diff --git a/api/config/embedder.json b/api/config/embedder.json index 0101ac08..b87e01c7 100644 --- a/api/config/embedder.json +++ b/api/config/embedder.json @@ -18,8 +18,9 @@ "client_class": "GoogleEmbedderClient", "batch_size": 100, "model_kwargs": { - "model": "text-embedding-004", - "task_type": "SEMANTIC_SIMILARITY" + "model": "gemini-embedding-001", + "task_type": "SEMANTIC_SIMILARITY", + "output_dimensionality": 3072 } }, "embedder_bedrock": { diff --git a/api/config/generator.json b/api/config/generator.json index f8817909..a6c20e61 100644 --- a/api/config/generator.json +++ b/api/config/generator.json @@ -20,9 +20,15 @@ } }, "google": { - "default_model": "gemini-2.5-flash", + "default_model": "gemini-3-flash-preview", "supportsCustomModel": true, "models": { + "gemini-3-flash-preview": { + "temperature": 1.0, + "top_p": 0.8, + "top_k": 20, + "thinking_level": "high" + }, "gemini-2.5-flash": { "temperature": 1.0, "top_p": 0.8, diff --git a/api/google_embedder_client.py b/api/google_embedder_client.py index b604fd8e..54da4049 100644 --- a/api/google_embedder_client.py +++ b/api/google_embedder_client.py @@ -1,4 +1,4 @@ -"""Google AI Embeddings ModelClient integration.""" +"""Google AI Embeddings ModelClient integration using the new google-genai SDK.""" import os import logging @@ -9,10 +9,10 @@ from adalflow.core.types import ModelType, EmbedderOutput try: - import google.generativeai as genai - from google.generativeai.types.text_types import EmbeddingDict, BatchEmbeddingDict + from google import genai + from google.genai import types as genai_types except ImportError: - raise ImportError("google-generativeai is required. Install it with 'pip install google-generativeai'") + raise ImportError("google-genai is required. Install it with 'pip install google-genai'") log = logging.getLogger(__name__) @@ -20,9 +20,9 @@ class GoogleEmbedderClient(ModelClient): __doc__ = r"""A component wrapper for Google AI Embeddings API client. - This client provides access to Google's embedding models through the Google AI API. - It supports text embeddings for various tasks including semantic similarity, - retrieval, and classification. + This client provides access to Google's embedding models through the Google AI API + using the new google-genai SDK. It supports text embeddings for various tasks + including semantic similarity, retrieval, and classification. Args: api_key (Optional[str]): Google AI API key. Defaults to None. @@ -39,15 +39,16 @@ class GoogleEmbedderClient(ModelClient): embedder = adal.Embedder( model_client=client, model_kwargs={ - "model": "text-embedding-004", - "task_type": "SEMANTIC_SIMILARITY" + "model": "gemini-embedding-001", + "task_type": "SEMANTIC_SIMILARITY", + "output_dimensionality": 3072 } ) ``` References: - Google AI Embeddings: https://ai.google.dev/gemini-api/docs/embeddings - - Available models: text-embedding-004, embedding-001 + - Available models: gemini-embedding-001, text-embedding-004 """ def __init__( @@ -56,7 +57,7 @@ def __init__( env_api_key_name: str = "GOOGLE_API_KEY", ): """Initialize Google AI Embeddings client. - + Args: api_key: Google AI API key. If not provided, uses environment variable. 
env_api_key_name: Name of environment variable containing API key. @@ -64,6 +65,7 @@ def __init__( super().__init__() self._api_key = api_key self._env_api_key_name = env_api_key_name + self._client = None self._initialize_client() def _initialize_client(self): @@ -73,58 +75,35 @@ def _initialize_client(self): raise ValueError( f"Environment variable {self._env_api_key_name} must be set" ) - genai.configure(api_key=api_key) + self._client = genai.Client(api_key=api_key) def parse_embedding_response(self, response) -> EmbedderOutput: """Parse Google AI embedding response to EmbedderOutput format. - + Args: - response: Google AI embedding response (EmbeddingDict or BatchEmbeddingDict) - + response: Google AI embedding response from embed_content + Returns: EmbedderOutput with parsed embeddings """ try: from adalflow.core.types import Embedding - + embedding_data = [] - - if isinstance(response, dict): - if 'embedding' in response: - embedding_value = response['embedding'] - if isinstance(embedding_value, list) and len(embedding_value) > 0: - # Check if it's a single embedding (list of floats) or batch (list of lists) - if isinstance(embedding_value[0], (int, float)): - # Single embedding response: {'embedding': [float, ...]} - embedding_data = [Embedding(embedding=embedding_value, index=0)] - else: - # Batch embeddings response: {'embedding': [[float, ...], [float, ...], ...]} - embedding_data = [ - Embedding(embedding=emb_list, index=i) - for i, emb_list in enumerate(embedding_value) - ] + + # New SDK returns response with .embeddings attribute + # Each embedding has .values which is a list of floats + if hasattr(response, 'embeddings') and response.embeddings: + for i, emb in enumerate(response.embeddings): + if hasattr(emb, 'values'): + embedding_data.append(Embedding(embedding=list(emb.values), index=i)) + elif isinstance(emb, (list, tuple)): + embedding_data.append(Embedding(embedding=list(emb), index=i)) else: - log.warning(f"Empty or invalid embedding data: {embedding_value}") - embedding_data = [] - elif 'embeddings' in response: - # Alternative batch format: {'embeddings': [{'embedding': [float, ...]}, ...]} - embedding_data = [ - Embedding(embedding=item['embedding'], index=i) - for i, item in enumerate(response['embeddings']) - ] - else: - log.warning(f"Unexpected response structure: {response.keys()}") - embedding_data = [] - elif hasattr(response, 'embeddings'): - # Custom batch response object from our implementation - embedding_data = [ - Embedding(embedding=emb, index=i) - for i, emb in enumerate(response.embeddings) - ] + log.warning(f"Unexpected embedding format at index {i}: {type(emb)}") else: - log.warning(f"Unexpected response type: {type(response)}") - embedding_data = [] - + log.warning(f"Unexpected response structure: {type(response)}") + return EmbedderOutput( data=embedding_data, error=None, @@ -145,43 +124,44 @@ def convert_inputs_to_api_kwargs( model_type: ModelType = ModelType.UNDEFINED, ) -> Dict: """Convert inputs to Google AI API format. 
- + Args: input: Text input(s) to embed - model_kwargs: Model parameters including model name and task_type + model_kwargs: Model parameters including model name, task_type, output_dimensionality model_type: Should be ModelType.EMBEDDER for this client - + Returns: Dict: API kwargs for Google AI embedding call """ if model_type != ModelType.EMBEDDER: raise ValueError(f"GoogleEmbedderClient only supports EMBEDDER model type, got {model_type}") - + # Ensure input is a list if isinstance(input, str): - content = [input] + contents = [input] elif isinstance(input, Sequence): - content = list(input) + contents = list(input) else: raise TypeError("input must be a string or sequence of strings") - - final_model_kwargs = model_kwargs.copy() - - # Handle single vs batch embedding - if len(content) == 1: - final_model_kwargs["content"] = content[0] - else: - final_model_kwargs["contents"] = content - - # Set default task type if not provided - if "task_type" not in final_model_kwargs: - final_model_kwargs["task_type"] = "SEMANTIC_SIMILARITY" - - # Set default model if not provided - if "model" not in final_model_kwargs: - final_model_kwargs["model"] = "text-embedding-004" - - return final_model_kwargs + + final_kwargs = { + "model": model_kwargs.get("model", "gemini-embedding-001"), + "contents": contents, + } + + # Build EmbedContentConfig if we have additional parameters + config_params = {} + + if "task_type" in model_kwargs: + config_params["task_type"] = model_kwargs["task_type"] + + if "output_dimensionality" in model_kwargs: + config_params["output_dimensionality"] = model_kwargs["output_dimensionality"] + + if config_params: + final_kwargs["config"] = genai_types.EmbedContentConfig(**config_params) + + return final_kwargs @backoff.on_exception( backoff.expo, ) def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED): """Call Google AI embedding API. - + Args: api_kwargs: API parameters model_type: Should be ModelType.EMBEDDER - + Returns: Google AI embedding response """ if model_type != ModelType.EMBEDDER: raise ValueError(f"GoogleEmbedderClient only supports EMBEDDER model type") - - log.info(f"Google AI Embeddings API kwargs: {api_kwargs}") - + + log.info(f"Google AI Embeddings API call with model: {api_kwargs.get('model')}") + try: - # Use embed_content for single text or batch embedding - if "content" in api_kwargs: - # Single embedding - response = genai.embed_content(**api_kwargs) - elif "contents" in api_kwargs: - # Batch embedding - Google AI supports batch natively - contents = api_kwargs.pop("contents") - response = genai.embed_content(content=contents, **api_kwargs) - else: - raise ValueError("Either 'content' or 'contents' must be provided") - + # Use the new SDK client.models.embed_content + response = self._client.models.embed_content(**api_kwargs) return response - + except Exception as e: log.error(f"Error calling Google AI Embeddings API: {e}") raise async def acall(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED): """Async call to Google AI embedding API. - - Note: Google AI Python client doesn't have async support yet, - so this falls back to synchronous call. + + Note: The underlying google-genai call is synchronous, so this method + offloads it to a worker thread via asyncio.to_thread to avoid + blocking the event loop.
""" - # Google AI client doesn't have async support yet - return self.call(api_kwargs, model_type) \ No newline at end of file + import asyncio + return await asyncio.to_thread(self.call, api_kwargs, model_type) diff --git a/api/main.py b/api/main.py index fe083f55..63e4fd5c 100644 --- a/api/main.py +++ b/api/main.py @@ -50,14 +50,12 @@ def patched_watch(*args, **kwargs): logger.warning(f"Missing environment variables: {', '.join(missing_vars)}") logger.warning("Some functionality may not work correctly without these variables.") -# Configure Google Generative AI -import google.generativeai as genai +# Note: Google GenAI configuration is now done at client instantiation time +# with the new google-genai SDK. The API key is passed to genai.Client() from api.config import GOOGLE_API_KEY -if GOOGLE_API_KEY: - genai.configure(api_key=GOOGLE_API_KEY) -else: - logger.warning("GOOGLE_API_KEY not configured") +if not GOOGLE_API_KEY: + logger.warning("GOOGLE_API_KEY not configured - Google Gemini models will not work") if __name__ == "__main__": # Get port from environment variable or use default diff --git a/api/poetry.lock b/api/poetry.lock index a2446bba..e07df94a 100644 --- a/api/poetry.lock +++ b/api/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 2.0.1 and should not be changed by hand. +# This file is automatically @generated by Poetry 2.2.1 and should not be changed by hand. [[package]] name = "adalflow" @@ -37,7 +37,7 @@ faiss-cpu = ["faiss-cpu (>=1.8.0)"] google-generativeai = ["google-generativeai (>=0.7.2)"] groq = ["groq (>=0.9.0)"] lancedb = ["lancedb (>=0.5.2)"] -mcp = ["mcp (>=1.9.4,<2.0.0)"] +mcp = ["mcp (>=1.9.4,<2.0.0) ; python_version >= \"3.10\""] ollama = ["ollama (>=0.2.1)"] openai = ["openai (>=1.97.1)"] pgvector = ["pgvector (>=0.3.1)"] @@ -197,7 +197,7 @@ propcache = ">=0.2.0" yarl = ">=1.17.0,<2.0" [package.extras] -speedups = ["Brotli", "aiodns (>=3.3.0)", "backports.zstd", "brotlicffi"] +speedups = ["Brotli ; platform_python_implementation == \"CPython\"", "aiodns (>=3.3.0)", "backports.zstd ; platform_python_implementation == \"CPython\" and python_version < \"3.14\"", "brotlicffi ; platform_python_implementation != \"CPython\""] [[package]] name = "aiosignal" @@ -261,8 +261,8 @@ files = [ [package.extras] doc = ["sphinx", "sphinxcontrib-trio"] -test = ["black", "coverage", "flake8", "flake8-2020", "flake8-bugbear", "mypy", "pytest", "pytest-cov"] -typetest = ["mypy", "pyright", "typing-extensions"] +test = ["black ; implementation_name == \"cpython\"", "coverage", "flake8", "flake8-2020", "flake8-bugbear", "mypy ; implementation_name == \"cpython\"", "pytest", "pytest-cov"] +typetest = ["mypy ; implementation_name == \"cpython\"", "pyright", "typing-extensions"] [[package]] name = "attrs" @@ -705,7 +705,7 @@ files = [ ] [package.dependencies] -cffi = {version = ">=2.0.0", markers = "python_full_version >= \"3.9\" and platform_python_implementation != \"PyPy\""} +cffi = {version = ">=2.0.0", markers = "python_full_version >= \"3.9.0\" and platform_python_implementation != \"PyPy\""} [package.extras] docs = ["sphinx (>=5.3.0)", "sphinx-inline-tabs", "sphinx-rtd-theme (>=3.0.0)"] @@ -1000,51 +1000,51 @@ requests = ">=2.18.0,<3.0.0" [package.extras] async-rest = ["google-auth[aiohttp] (>=2.35.0,<3.0.0)"] -grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0)", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0)"] +grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", 
"grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0) ; python_version >= \"3.11\""] grpcgcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] [[package]] name = "google-api-core" -version = "2.26.0" +version = "2.28.1" description = "Google API client core library" optional = false python-versions = ">=3.7" groups = ["main"] -markers = "python_version < \"3.14\"" +markers = "python_version <= \"3.13\"" files = [ - {file = "google_api_core-2.26.0-py3-none-any.whl", hash = "sha256:2b204bd0da2c81f918e3582c48458e24c11771f987f6258e6e227212af78f3ed"}, - {file = "google_api_core-2.26.0.tar.gz", hash = "sha256:e6e6d78bd6cf757f4aee41dcc85b07f485fbb069d5daa3afb126defba1e91a62"}, + {file = "google_api_core-2.28.1-py3-none-any.whl", hash = "sha256:4021b0f8ceb77a6fb4de6fde4502cecab45062e66ff4f2895169e0b35bc9466c"}, + {file = "google_api_core-2.28.1.tar.gz", hash = "sha256:2b405df02d68e68ce0fbc138559e6036559e685159d148ae5861013dc201baf8"}, ] [package.dependencies] google-auth = ">=2.14.1,<3.0.0" googleapis-common-protos = ">=1.56.2,<2.0.0" grpcio = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\" and python_version < \"3.14\""} -grpcio-status = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\" and python_version < \"3.14\""} +grpcio-status = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\""} proto-plus = [ {version = ">=1.25.0,<2.0.0", markers = "python_version >= \"3.13\""}, - {version = ">=1.22.3,<2.0.0", markers = "python_version < \"3.13\""}, + {version = ">=1.22.3,<2.0.0"}, ] protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0" requests = ">=2.18.0,<3.0.0" [package.extras] async-rest = ["google-auth[aiohttp] (>=2.35.0,<3.0.0)"] -grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0)", "grpcio (>=1.75.1,<2.0.0)", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0)", "grpcio-status (>=1.75.1,<2.0.0)"] +grpc = ["grpcio (>=1.33.2,<2.0.0)", "grpcio (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", "grpcio (>=1.75.1,<2.0.0) ; python_version >= \"3.14\"", "grpcio-status (>=1.33.2,<2.0.0)", "grpcio-status (>=1.49.1,<2.0.0) ; python_version >= \"3.11\"", "grpcio-status (>=1.75.1,<2.0.0) ; python_version >= \"3.14\""] grpcgcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"] [[package]] name = "google-api-python-client" -version = "2.185.0" +version = "2.187.0" description = "Google API Client Library for Python" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "google_api_python_client-2.185.0-py3-none-any.whl", hash = "sha256:00fe173a4b346d2397fbe0d37ac15368170dfbed91a0395a66ef2558e22b93fc"}, - {file = "google_api_python_client-2.185.0.tar.gz", hash = "sha256:aa1b338e4bb0f141c2df26743f6b46b11f38705aacd775b61971cbc51da089c3"}, + {file = "google_api_python_client-2.187.0-py3-none-any.whl", hash = "sha256:d8d0f6d85d7d1d10bdab32e642312ed572bdc98919f72f831b44b9a9cebba32f"}, + {file = "google_api_python_client-2.187.0.tar.gz", hash = "sha256:e98e8e8f49e1b5048c2f8276473d6485febc76c9c47892a8b4d1afa2c9ec8278"}, ] [package.dependencies] @@ -1056,56 +1056,86 @@ uritemplate = ">=3.0.1,<5" [[package]] name = "google-auth" -version = "2.41.1" +version = "2.45.0" description = "Google Authentication 
Library" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "google_auth-2.41.1-py2.py3-none-any.whl", hash = "sha256:754843be95575b9a19c604a848a41be03f7f2afd8c019f716dc1f51ee41c639d"}, - {file = "google_auth-2.41.1.tar.gz", hash = "sha256:b76b7b1f9e61f0cb7e88870d14f6a94aeef248959ef6992670efee37709cbfd2"}, + {file = "google_auth-2.45.0-py2.py3-none-any.whl", hash = "sha256:82344e86dc00410ef5382d99be677c6043d72e502b625aa4f4afa0bdacca0f36"}, + {file = "google_auth-2.45.0.tar.gz", hash = "sha256:90d3f41b6b72ea72dd9811e765699ee491ab24139f34ebf1ca2b9cc0c38708f3"}, ] [package.dependencies] cachetools = ">=2.0.0,<7.0" pyasn1-modules = ">=0.2.1" +requests = {version = ">=2.20.0,<3.0.0", optional = true, markers = "extra == \"requests\""} rsa = ">=3.1.4,<5" [package.extras] aiohttp = ["aiohttp (>=3.6.2,<4.0.0)", "requests (>=2.20.0,<3.0.0)"] +cryptography = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)"] enterprise-cert = ["cryptography", "pyopenssl"] -pyjwt = ["cryptography (<39.0.0)", "cryptography (>=38.0.3)", "pyjwt (>=2.0)"] -pyopenssl = ["cryptography (<39.0.0)", "cryptography (>=38.0.3)", "pyopenssl (>=20.0.0)"] +pyjwt = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "pyjwt (>=2.0)"] +pyopenssl = ["cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "pyopenssl (>=20.0.0)"] reauth = ["pyu2f (>=0.1.5)"] requests = ["requests (>=2.20.0,<3.0.0)"] -testing = ["aiohttp (<3.10.0)", "aiohttp (>=3.6.2,<4.0.0)", "aioresponses", "cryptography (<39.0.0)", "cryptography (<39.0.0)", "cryptography (>=38.0.3)", "cryptography (>=38.0.3)", "flask", "freezegun", "grpcio", "mock", "oauth2client", "packaging", "pyjwt (>=2.0)", "pyopenssl (<24.3.0)", "pyopenssl (>=20.0.0)", "pytest", "pytest-asyncio", "pytest-cov", "pytest-localserver", "pyu2f (>=0.1.5)", "requests (>=2.20.0,<3.0.0)", "responses", "urllib3"] +testing = ["aiohttp (<3.10.0)", "aiohttp (>=3.6.2,<4.0.0)", "aioresponses", "cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (<39.0.0) ; python_version < \"3.8\"", "cryptography (>=38.0.3)", "cryptography (>=38.0.3)", "flask", "freezegun", "grpcio", "mock", "oauth2client", "packaging", "pyjwt (>=2.0)", "pyopenssl (<24.3.0)", "pyopenssl (>=20.0.0)", "pytest", "pytest-asyncio", "pytest-cov", "pytest-localserver", "pyu2f (>=0.1.5)", "requests (>=2.20.0,<3.0.0)", "responses", "urllib3"] urllib3 = ["packaging", "urllib3"] [[package]] name = "google-auth-httplib2" -version = "0.2.0" +version = "0.3.0" description = "Google Authentication Library: httplib2 transport" optional = false -python-versions = "*" +python-versions = ">=3.7" +groups = ["main"] +files = [ + {file = "google_auth_httplib2-0.3.0-py3-none-any.whl", hash = "sha256:426167e5df066e3f5a0fc7ea18768c08e7296046594ce4c8c409c2457dd1f776"}, + {file = "google_auth_httplib2-0.3.0.tar.gz", hash = "sha256:177898a0175252480d5ed916aeea183c2df87c1f9c26705d74ae6b951c268b0b"}, +] + +[package.dependencies] +google-auth = ">=1.32.0,<3.0.0" +httplib2 = ">=0.19.0,<1.0.0" + +[[package]] +name = "google-genai" +version = "1.56.0" +description = "GenAI Python SDK" +optional = false +python-versions = ">=3.10" groups = ["main"] files = [ - {file = "google-auth-httplib2-0.2.0.tar.gz", hash = "sha256:38aa7badf48f974f1eb9861794e9c0cb2a0511a4ec0679b1f886d108f5640e05"}, - {file = "google_auth_httplib2-0.2.0-py2.py3-none-any.whl", hash = "sha256:b65a0a2123300dd71281a7bf6e64d65a0759287df52729bdd1ae2e47dc311a3d"}, + {file = 
"google_genai-1.56.0-py3-none-any.whl", hash = "sha256:9e6b11e0c105ead229368cb5849a480e4d0185519f8d9f538d61ecfcf193b052"}, + {file = "google_genai-1.56.0.tar.gz", hash = "sha256:0491af33c375f099777ae207d9621f044e27091fafad4c50e617eba32165e82f"}, ] [package.dependencies] -google-auth = "*" -httplib2 = ">=0.19.0" +anyio = ">=4.8.0,<5.0.0" +distro = ">=1.7.0,<2" +google-auth = {version = ">=2.45.0,<3.0.0", extras = ["requests"]} +httpx = ">=0.28.1,<1.0.0" +pydantic = ">=2.9.0,<3.0.0" +requests = ">=2.28.1,<3.0.0" +sniffio = "*" +tenacity = ">=8.2.3,<9.2.0" +typing-extensions = ">=4.11.0,<5.0.0" +websockets = ">=13.0.0,<15.1.0" + +[package.extras] +aiohttp = ["aiohttp (<3.13.3)"] +local-tokenizer = ["protobuf", "sentencepiece (>=0.2.0)"] [[package]] name = "google-generativeai" -version = "0.8.5" +version = "0.8.6" description = "Google Generative AI High level API client library and tools." optional = false python-versions = ">=3.9" groups = ["main"] files = [ - {file = "google_generativeai-0.8.5-py3-none-any.whl", hash = "sha256:22b420817fb263f8ed520b33285f45976d5b21e904da32b80d4fd20c055123a2"}, + {file = "google_generativeai-0.8.6-py3-none-any.whl", hash = "sha256:37a0eaaa95e5bbf888828e20a4a1b2c196cc9527d194706e58a68ff388aeb0fa"}, ] [package.dependencies] @@ -1123,14 +1153,14 @@ dev = ["Pillow", "absl-py", "black", "ipython", "nose2", "pandas", "pytype", "py [[package]] name = "googleapis-common-protos" -version = "1.71.0" +version = "1.72.0" description = "Common protobufs used in Google APIs" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "googleapis_common_protos-1.71.0-py3-none-any.whl", hash = "sha256:59034a1d849dc4d18971997a72ac56246570afdd17f9369a0ff68218d50ab78c"}, - {file = "googleapis_common_protos-1.71.0.tar.gz", hash = "sha256:1aec01e574e29da63c80ba9f7bbf1ccfaacf1da877f23609fe236ca7c72a2e2e"}, + {file = "googleapis_common_protos-1.72.0-py3-none-any.whl", hash = "sha256:4299c5a82d5ae1a9702ada957347726b167f9f8d1fc352477702a1e851ff4038"}, + {file = "googleapis_common_protos-1.72.0.tar.gz", hash = "sha256:e55a601c1b32b52d7a3e65f43563e2aa61bcd737998ee672ac9b951cd49319f5"}, ] [package.dependencies] @@ -1354,7 +1384,7 @@ httpcore = "==1.*" idna = "*" [package.extras] -brotli = ["brotli", "brotlicffi"] +brotli = ["brotli ; platform_python_implementation == \"CPython\"", "brotlicffi ; platform_python_implementation != \"CPython\""] cli = ["click (==8.*)", "pygments (==2.*)", "rich (>=10,<14)"] http2 = ["h2 (>=3,<5)"] socks = ["socksio (==1.*)"] @@ -1675,7 +1705,7 @@ PyJWT = {version = ">=1.0.0,<3", extras = ["crypto"]} requests = ">=2.0.0,<3" [package.extras] -broker = ["pymsalruntime (>=0.14,<0.19)", "pymsalruntime (>=0.17,<0.19)", "pymsalruntime (>=0.18,<0.19)"] +broker = ["pymsalruntime (>=0.14,<0.19) ; python_version >= \"3.6\" and platform_system == \"Windows\"", "pymsalruntime (>=0.17,<0.19) ; python_version >= \"3.8\" and platform_system == \"Darwin\"", "pymsalruntime (>=0.18,<0.19) ; python_version >= \"3.8\" and platform_system == \"Linux\""] [[package]] name = "msal-extensions" @@ -2153,14 +2183,14 @@ files = [ [[package]] name = "proto-plus" -version = "1.26.1" +version = "1.27.0" description = "Beautiful, Pythonic protocol buffers" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "proto_plus-1.26.1-py3-none-any.whl", hash = "sha256:13285478c2dcf2abb829db158e1047e2f1e8d63a077d94263c2b88b043c75a66"}, - {file = "proto_plus-1.26.1.tar.gz", hash = 
"sha256:21a515a4c4c0088a773899e23c7bbade3d18f9c66c73edd4c7ee3816bc96a012"}, + {file = "proto_plus-1.27.0-py3-none-any.whl", hash = "sha256:1baa7f81cf0f8acb8bc1f6d085008ba4171eaf669629d1b6d1673b21ed1c0a82"}, + {file = "proto_plus-1.27.0.tar.gz", hash = "sha256:873af56dd0d7e91836aee871e5799e1c6f1bda86ac9a983e0bb9f0c266a568c4"}, ] [package.dependencies] @@ -2250,7 +2280,7 @@ typing-inspection = ">=0.4.2" [package.extras] email = ["email-validator (>=2.0.0)"] -timezone = ["tzdata"] +timezone = ["tzdata ; python_version >= \"3.9\" and platform_system == \"Windows\""] [[package]] name = "pydantic-core" @@ -2811,6 +2841,22 @@ typing-extensions = {version = ">=4.10.0", markers = "python_version < \"3.13\"" [package.extras] full = ["httpx (>=0.27.0,<0.29.0)", "itsdangerous", "jinja2", "python-multipart (>=0.0.18)", "pyyaml"] +[[package]] +name = "tenacity" +version = "9.1.2" +description = "Retry code until it succeeds" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "tenacity-9.1.2-py3-none-any.whl", hash = "sha256:f77bf36710d8b73a50b2dd155c97b870017ad21afe6ab300326b0371b3b05138"}, + {file = "tenacity-9.1.2.tar.gz", hash = "sha256:1169d376c297e7de388d18b4481760d478b0e99a777cad3a9c86e556f4b697cb"}, +] + +[package.extras] +doc = ["reno", "sphinx"] +test = ["pytest", "tornado (>=4.5)", "typeguard"] + [[package]] name = "tiktoken" version = "0.12.0" @@ -2959,7 +3005,7 @@ files = [ ] [package.extras] -brotli = ["brotli (>=1.0.9)", "brotlicffi (>=0.8.0)"] +brotli = ["brotli (>=1.0.9) ; platform_python_implementation == \"CPython\"", "brotlicffi (>=0.8.0) ; platform_python_implementation != \"CPython\""] h2 = ["h2 (>=4,<5)"] socks = ["pysocks (>=1.5.6,!=1.5.7,<2.0)"] zstd = ["zstandard (>=0.18.0)"] @@ -2983,12 +3029,12 @@ h11 = ">=0.8" httptools = {version = ">=0.6.3", optional = true, markers = "extra == \"standard\""} python-dotenv = {version = ">=0.13", optional = true, markers = "extra == \"standard\""} pyyaml = {version = ">=5.1", optional = true, markers = "extra == \"standard\""} -uvloop = {version = ">=0.15.1", optional = true, markers = "(sys_platform != \"win32\" and sys_platform != \"cygwin\") and platform_python_implementation != \"PyPy\" and extra == \"standard\""} +uvloop = {version = ">=0.15.1", optional = true, markers = "sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\" and extra == \"standard\""} watchfiles = {version = ">=0.13", optional = true, markers = "extra == \"standard\""} websockets = {version = ">=10.4", optional = true, markers = "extra == \"standard\""} [package.extras] -standard = ["colorama (>=0.4)", "httptools (>=0.6.3)", "python-dotenv (>=0.13)", "pyyaml (>=5.1)", "uvloop (>=0.15.1)", "watchfiles (>=0.13)", "websockets (>=10.4)"] +standard = ["colorama (>=0.4) ; sys_platform == \"win32\"", "httptools (>=0.6.3)", "python-dotenv (>=0.13)", "pyyaml (>=5.1)", "uvloop (>=0.15.1) ; sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\"", "watchfiles (>=0.13)", "websockets (>=10.4)"] [[package]] name = "uvloop" @@ -2997,7 +3043,7 @@ description = "Fast implementation of asyncio event loop on top of libuv" optional = false python-versions = ">=3.8.1" groups = ["main"] -markers = "(sys_platform != \"win32\" and sys_platform != \"cygwin\") and platform_python_implementation != \"PyPy\"" +markers = "sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\"" files = [ {file = 
"uvloop-0.22.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:ef6f0d4cc8a9fa1f6a910230cd53545d9a14479311e87e3cb225495952eb672c"}, {file = "uvloop-0.22.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:7cd375a12b71d33d46af85a3343b35d98e8116134ba404bd657b3b1d15988792"}, @@ -3404,4 +3450,4 @@ propcache = ">=0.2.1" [metadata] lock-version = "2.1" python-versions = "^3.11" -content-hash = "b558e94d5d8bdcc4273f47c52c8bfa6f4e003df0cf754f56340b8b98283d4a8d" +content-hash = "1933fdd12b1645fa83d6e66b8f1683ca2d927836a0bacd2a631f04adefb115d2" diff --git a/api/pyproject.toml b/api/pyproject.toml index 09760f8b..5c6d9e6c 100644 --- a/api/pyproject.toml +++ b/api/pyproject.toml @@ -12,7 +12,8 @@ python = "^3.11" fastapi = ">=0.95.0" uvicorn = { extras = ["standard"], version = ">=0.21.1" } pydantic = ">=2.0.0" -google-generativeai = ">=0.3.0" +google-genai = ">=1.0.0" +google-generativeai = ">=0.8.0" # Required by adalflow's GoogleGenAIClient - temporary until adalflow migrates to google-genai tiktoken = ">=0.5.0" adalflow = ">=0.1.0" numpy = ">=1.24.0" diff --git a/api/simple_chat.py b/api/simple_chat.py index 41a184ed..f5f771d0 100644 --- a/api/simple_chat.py +++ b/api/simple_chat.py @@ -3,7 +3,8 @@ from typing import List, Optional from urllib.parse import unquote -import google.generativeai as genai +from google import genai +from google.genai import types as genai_types from adalflow.components.model_client.ollama_client import OllamaClient from adalflow.core.types import ModelType from fastapi import FastAPI, HTTPException @@ -11,7 +12,7 @@ from fastapi.responses import StreamingResponse from pydantic import BaseModel, Field -from api.config import get_model_config, configs, OPENROUTER_API_KEY, OPENAI_API_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY +from api.config import get_model_config, configs, OPENROUTER_API_KEY, OPENAI_API_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, GOOGLE_API_KEY from api.data_pipeline import count_tokens, get_file_content from api.openai_client import OpenAIClient from api.openrouter_client import OpenRouterClient @@ -450,14 +451,23 @@ async def chat_completions_stream(request: ChatCompletionRequest): model_type=ModelType.LLM, ) else: - # Initialize Google Generative AI model (default provider) - model = genai.GenerativeModel( - model_name=model_config["model"], - generation_config={ - "temperature": model_config["temperature"], - "top_p": model_config["top_p"], - "top_k": model_config["top_k"], - }, + # Initialize Google Generative AI client (new google-genai SDK) + google_client = genai.Client(api_key=GOOGLE_API_KEY) + google_model_name = model_config["model"] + + # Build thinking_config if thinking_level is specified (for Gemini 3 models) + thinking_config = None + if "thinking_level" in model_config: + thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["thinking_level"].upper() + ) + logger.info(f"Using thinking_level: {model_config['thinking_level']}") + + google_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["temperature"], + top_p=model_config["top_p"], + top_k=model_config["top_k"], + thinking_config=thinking_config ) # Create a streaming response @@ -550,10 +560,14 @@ async def response_stream(): "DASHSCOPE_WORKSPACE_ID) environment variables with valid values." 
) else: - # Google Generative AI (default provider) - response = model.generate_content(prompt, stream=True) + # Google Generative AI (default provider) - using new google-genai SDK + response = google_client.models.generate_content_stream( + model=google_model_name, + contents=prompt, + config=google_generation_config + ) for chunk in response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: yield chunk.text except Exception as e_outer: @@ -711,22 +725,31 @@ async def response_stream(): "DASHSCOPE_WORKSPACE_ID) environment variables with valid values." ) else: - # Google Generative AI fallback (default provider) + # Google Generative AI fallback (default provider) - using new google-genai SDK model_config = get_model_config(request.provider, request.model) - fallback_model = genai.GenerativeModel( - model_name=model_config["model_kwargs"]["model"], - generation_config={ - "temperature": model_config["model_kwargs"].get("temperature", 0.7), - "top_p": model_config["model_kwargs"].get("top_p", 0.8), - "top_k": model_config["model_kwargs"].get("top_k", 40), - }, + + # Build thinking_config if thinking_level is specified + fallback_thinking_config = None + if "thinking_level" in model_config["model_kwargs"]: + fallback_thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["model_kwargs"]["thinking_level"].upper() + ) + + fallback_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["model_kwargs"].get("temperature", 0.7), + top_p=model_config["model_kwargs"].get("top_p", 0.8), + top_k=model_config["model_kwargs"].get("top_k", 40), + thinking_config=fallback_thinking_config ) - - fallback_response = fallback_model.generate_content( - simplified_prompt, stream=True + + fallback_client = genai.Client(api_key=GOOGLE_API_KEY) + fallback_response = fallback_client.models.generate_content_stream( + model=model_config["model_kwargs"]["model"], + contents=simplified_prompt, + config=fallback_generation_config ) for chunk in fallback_response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: yield chunk.text except Exception as e2: logger.error(f"Error in fallback streaming response: {str(e2)}") diff --git a/api/websocket_wiki.py b/api/websocket_wiki.py index a6acac50..c7204de3 100644 --- a/api/websocket_wiki.py +++ b/api/websocket_wiki.py @@ -3,12 +3,14 @@ from typing import List, Optional, Dict, Any from urllib.parse import unquote -import google.generativeai as genai +from google import genai +from google.genai import types as genai_types from adalflow.components.model_client.ollama_client import OllamaClient from adalflow.core.types import ModelType from fastapi import WebSocket, WebSocketDisconnect, HTTPException from pydantic import BaseModel, Field +from api.config import GOOGLE_API_KEY from api.config import ( get_model_config, configs, @@ -559,14 +561,23 @@ async def handle_websocket_chat(websocket: WebSocket): model_type=ModelType.LLM ) else: - # Initialize Google Generative AI model - model = genai.GenerativeModel( - model_name=model_config["model"], - generation_config={ - "temperature": model_config["temperature"], - "top_p": model_config["top_p"], - "top_k": model_config["top_k"] - } + # Initialize Google Generative AI client (new google-genai SDK) + google_client = genai.Client(api_key=GOOGLE_API_KEY) + google_model_name = model_config["model"] + + # Build thinking_config if thinking_level is specified (for Gemini 3 models) + 
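# NOTE: generator.json stores "thinking_level" in lowercase ("high"); it is + # upper-cased below to match the SDK's ThinkingLevel enum naming (an + # assumption based on google-genai's types). + 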
thinking_config = None + if "thinking_level" in model_config: + thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["thinking_level"].upper() + ) + logger.info(f"Using thinking_level: {model_config['thinking_level']}") + + google_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["temperature"], + top_p=model_config["top_p"], + top_k=model_config["top_k"], + thinking_config=thinking_config ) # Process the response based on the provider @@ -685,10 +696,14 @@ async def handle_websocket_chat(websocket: WebSocket): # Close the WebSocket connection after sending the error message await websocket.close() else: - # Google Generative AI (default provider) - response = model.generate_content(prompt, stream=True) + # Google Generative AI (default provider) - using new google-genai SDK + response = google_client.models.generate_content_stream( + model=google_model_name, + contents=prompt, + config=google_generation_config + ) for chunk in response: - if hasattr(chunk, 'text'): + if hasattr(chunk, 'text') and chunk.text: await websocket.send_text(chunk.text) await websocket.close() @@ -856,22 +871,31 @@ async def handle_websocket_chat(websocket: WebSocket): ) await websocket.send_text(error_msg) else: - # Google Generative AI fallback (default provider) + # Google Generative AI fallback (default provider) - using new google-genai SDK model_config = get_model_config(request.provider, request.model) - fallback_model = genai.GenerativeModel( - model_name=model_config["model_kwargs"]["model"], - generation_config={ - "temperature": model_config["model_kwargs"].get("temperature", 0.7), - "top_p": model_config["model_kwargs"].get("top_p", 0.8), - "top_k": model_config["model_kwargs"].get("top_k", 40), - }, + + # Build thinking_config if thinking_level is specified + fallback_thinking_config = None + if "thinking_level" in model_config["model_kwargs"]: + fallback_thinking_config = genai_types.ThinkingConfig( + thinking_level=model_config["model_kwargs"]["thinking_level"].upper() + ) + + fallback_generation_config = genai_types.GenerateContentConfig( + temperature=model_config["model_kwargs"].get("temperature", 0.7), + top_p=model_config["model_kwargs"].get("top_p", 0.8), + top_k=model_config["model_kwargs"].get("top_k", 40), + thinking_config=fallback_thinking_config ) - - fallback_response = fallback_model.generate_content( - simplified_prompt, stream=True + + fallback_client = genai.Client(api_key=GOOGLE_API_KEY) + fallback_response = fallback_client.models.generate_content_stream( + model=model_config["model_kwargs"]["model"], + contents=simplified_prompt, + config=fallback_generation_config ) for chunk in fallback_response: - if hasattr(chunk, "text"): + if hasattr(chunk, "text") and chunk.text: await websocket.send_text(chunk.text) except Exception as e2: logger.error(f"Error in fallback streaming response: {str(e2)}")
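
For reference, the Google streaming path now shared by `api/simple_chat.py` and `api/websocket_wiki.py` reduces to the pattern below (a condensed sketch of the diff above; the model name, prompt, and sampling values are illustrative):

```python
from google import genai
from google.genai import types as genai_types

client = genai.Client(api_key="YOUR_GOOGLE_API_KEY")  # the app resolves this from api/config.py

config = genai_types.GenerateContentConfig(
    temperature=1.0,
    top_p=0.8,
    top_k=20,
    # Only set for Gemini 3 models, per the generator.json entry above.
    thinking_config=genai_types.ThinkingConfig(thinking_level="HIGH"),
)

# generate_content_stream replaces the old GenerativeModel(...).generate_content(stream=True).
for chunk in client.models.generate_content_stream(
    model="gemini-3-flash-preview",
    contents="Summarize this repository's architecture.",
    config=config,
):
    if chunk.text:  # skip metadata-only chunks with no text
        print(chunk.text, end="")
```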