Changes from 9 commits
47 changes: 4 additions & 43 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,46 +1,7 @@
# querychat: Chat with your data in any language
# querychat <a href="https://posit-dev.github.io/querychat/"><img src="pkg-r/man/figures/logo.png" align="right" height="138" alt="querychat website" /></a>

querychat is a multilingual package that allows you to chat with your data using natural language queries. It's available for:
QueryChat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs).

- [R - Shiny](pkg-r/README.md)
- [Python - Shiny for Python](pkg-py/README.md)
To get started, see the [official website](https://posit-dev.github.io/querychat/).

## Overview

Imagine typing questions like these directly into your dashboard, and seeing the results in real time:

* "Show only penguins that are not species Gentoo and have a bill length greater than 50mm."
* "Show only blue states with an incidence rate greater than 100 per 100,000 people."
* "What is the average mpg of cars with 6 cylinders?"

querychat is a drop-in component for Shiny that allows users to query a data frame using natural language. The results are available as a reactive data frame, so they can be easily used from Shiny outputs, reactive expressions, downloads, etc.

| ![Animation of a dashboard being filtered by a chatbot in the sidebar](animation.gif) |
|-|

[Live demo](https://jcheng.shinyapps.io/sidebot/)

**This is not as terrible an idea as you might think!** We need to be very careful when bringing LLMs into data analysis, as we all know that they are prone to hallucinations and other classes of errors. querychat is designed to excel in reliability, transparency, and reproducibility by using this one technique: denying it raw access to the data, and forcing it to write SQL queries instead.

## How it works

### Powered by LLMs

querychat's natural language chat experience is powered by LLMs (like GPT-4o, Claude 3.5 Sonnet, etc.) that support function/tool calling capabilities.

### Powered by SQL

querychat doesn't send the raw data to the LLM, asking it to guess summary statistics. Instead, the LLM generates precise SQL queries to filter the data or directly calculate statistics. This is crucial for ensuring reliability, transparency, and reproducibility:

- **Reliability:** Today's LLMs are excellent at writing SQL, but bad at direct calculation.
- **Transparency:** querychat always displays the SQL to the user, so it can be vetted instead of blindly trusted.
- **Reproducibility:** The SQL query can be easily copied and reused.

Currently, querychat uses DuckDB for its SQL engine when working with data frames. For database sources, it uses the native SQL dialect of the connected database.
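
The division of labor described above can be sketched in a few lines. This is an illustration only, not querychat's implementation: the standard-library `sqlite3` module stands in for DuckDB, the `cars` table and its rows are made up, and `llm_generated_sql` is a hand-written stand-in for what an LLM might return for "What is the average mpg of cars with 6 cylinders?".

```python
import sqlite3

# Hypothetical sample data; not taken from querychat.
rows = [
    ("Mazda RX4", 21.0, 6),
    ("Datsun 710", 22.8, 4),
    ("Hornet 4 Drive", 21.4, 6),
    ("Valiant", 18.1, 6),
]

# sqlite3 is used here only as a stand-in for the DuckDB engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, mpg REAL, cyl INTEGER)")
conn.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)

# Stand-in for the SQL an LLM would generate. The model only needs the
# schema (column names and types) to write this, never the rows themselves.
llm_generated_sql = "SELECT AVG(mpg) AS avg_mpg FROM cars WHERE cyl = 6"

# The SQL engine, not the LLM, performs the calculation.
(avg_mpg,) = conn.execute(llm_generated_sql).fetchone()

# Showing the query next to the result lets the user vet and reuse it.
print(llm_generated_sql)
print(round(avg_mpg, 2))
```

Because the answer comes from executing the query, the same SQL string can be copied into any compatible engine to reproduce the result.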

## Language-specific Documentation

For detailed information on how to use querychat in your preferred language, see the language-specific READMEs:

- [R Documentation](pkg-r/README.md)
- [Python Documentation](pkg-py/README.md)
Or, the README for [R](pkg-r/README.md) and [Python](pkg-py/README.md).
4 changes: 2 additions & 2 deletions docs/index.html
@@ -222,9 +222,9 @@
</a>
</div>
<h1 class="package-title">querychat</h1>
<p class="package-subtitle">Chat with your data in any language</p>
<p class="package-subtitle">Explore data using natural language queries</p>
<p class="package-description">
A drop-in component for Shiny that allows you to chat with your data using natural language queries.
QueryChat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs).
Available for both R and Python.
</p>
<img src="animation.gif"
46 changes: 38 additions & 8 deletions pkg-py/README.md
Original file line number Diff line number Diff line change
@@ -1,20 +1,50 @@
# querychat for Python
# querychat <a href="https://posit-dev.github.io/querychat/py/"><img src="https://posit-dev.github.io/querychat/images/querychat.png" align="right" height="138" alt="querychat website" /></a>

Please see [the package documentation site](https://posit-dev.github.io/querychat/py/index.html) for installation, setup, and usage.
<p>
<!-- badges start -->
<a href="https://pypi.org/project/querychat/"><img alt="PyPI" src="https://img.shields.io/pypi/v/querychat?logo=python&logoColor=white&color=orange"></a>
<a href="https://choosealicense.com/licenses/mit/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a>
<a href="https://pypi.org/project/querychat"><img src="https://img.shields.io/pypi/pyversions/querychat.svg" alt="versions"></a>
<a href="https://github.com/posit-dev/querychat"><img src="https://github.com/posit-dev/querychat/actions/workflows/test.yml/badge.svg?branch=main" alt="Python Tests"></a>
<!-- badges end -->
</p>

If you are looking for querychat Python examples,
you can find them in the `examples/` directory.

QueryChat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs). For analysts, it offers an intuitive web application where they can quickly ask questions of their data and receive verifiable data-driven answers. For software developers, QueryChat provides a comprehensive Python API to access core functionality -- including chat UI, generated SQL statements, resulting data, and more. This capability enables the seamless integration of natural language querying into bespoke data applications.

## Installation

You can install the package from PyPI using pip:
Install the latest stable release [from PyPI](https://pypi.org/project/querychat/):

```bash
pip install querychat
```

Or you can install querychat directly from GitHub:
## Quick start

```bash
pip install "querychat @ git+https://github.com/posit-dev/querychat"
The main entry point is the [`QueryChat` class](https://posit-dev.github.io/querychat/py/reference/QueryChat.html). It requires a [data source](https://posit-dev.github.io/querychat/py/data-sources.html) (e.g., a pandas or polars data frame) and a name for the data.

```python
from querychat import QueryChat
from querychat.data import titanic

qc = QueryChat(titanic(), "titanic")
app = qc.app()
# app.run()
```

<p align="center">
<img src="docs/images/quickstart.png" alt="QueryChat interface showing natural language queries" width="85%">
</p>

## Custom apps

Build your own custom web apps with natural language querying capabilities, such as [this one](https://github.com/posit-conf-2025/llm/blob/main/_solutions/25_querychat/25_querychat_02-end-app.R) which provides a bespoke interface for exploring Airbnb listings:

<p align="center">
<img src="docs/images/airbnb.png" alt="A custom app for exploring Airbnb listings, powered by QueryChat." width="85%">
</p>

## Learn more

See the [website](https://posit-dev.github.io/querychat/py) to learn more.
2 changes: 1 addition & 1 deletion pkg-py/docs/_quarto.yml
@@ -72,7 +72,7 @@ quartodoc:
sidebar: reference/_sidebar.yml
css: reference/_styles-quartodoc.css
sections:
- title: The Querychat class
- title: The QueryChat class
desc: The starting point for any QueryChat session
contents:
- name: QueryChat
141 changes: 20 additions & 121 deletions pkg-py/docs/build.qmd
@@ -346,9 +346,25 @@ def _():

This is equivalent to the user asking the LLM to "reset" or "show all data".

## Multiple datasets
## Multiple tables

You can use multiple QueryChat instances in a single app to explore different datasets. Just ensure each instance has a different table name (or `id` which derives the table name) to avoid conflicts. Here's an example with two datasets:
Currently, you have two options for exploring multiple tables in QueryChat:

1. Join the tables into a single table before passing to QueryChat
2. Use multiple QueryChat instances in the same app

The first option makes it possible to chat with multiple tables inside a single chat interface, whereas the second option requires a separate chat interface for each table.
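
A minimal sketch of the first option, assuming pandas data frames. The `passengers` and `ports` tables, their columns, and the `passengers_ports` table name are invented for illustration. Only the `QueryChat(data, name)` call shape comes from the quick start, and it is left commented out so the snippet runs without querychat installed.

```python
import pandas as pd

# Hypothetical tables that share a `port_code` key.
passengers = pd.DataFrame(
    {
        "passenger_id": [1, 2, 3],
        "port_code": ["S", "C", "S"],
        "fare": [7.25, 71.28, 8.05],
    }
)
ports = pd.DataFrame(
    {
        "port_code": ["S", "C", "Q"],
        "port_name": ["Southampton", "Cherbourg", "Queenstown"],
    }
)

# A left join keeps every passenger row and adds the matching port name.
joined = passengers.merge(ports, on="port_code", how="left")
print(joined)

# The single joined table can then be handed to QueryChat:
# from querychat import QueryChat
# qc = QueryChat(joined, "passengers_ports")
```

Joining up front means the LLM sees one schema, so a single chat can answer questions that span both original tables.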

::: {.callout-note}
### Multiple filtered tables

We plan to support multiple filtered tables in a future release -- if you're interested in this feature, please upvote [the relevant issue](https://github.com/posit-dev/querychat/issues/6).
:::

Here's an example of the second approach, using two separate `QueryChat` instances to explore both the `titanic` and `penguins` datasets within the same app:

<details>
<summary> <code>app.py </code> </summary>

```python
from seaborn import load_dataset
@@ -384,127 +400,10 @@ ui.page_opts(
)
```

![](/images/multiple-datasets.png){fig-alt="Screenshot of a querychat app with two datasets: titanic and penguins." class="lightbox shadow rounded mb-3"}


## Complete example

Here's a complete example bringing together multiple concepts - a Titanic survival analysis dashboard with natural language exploration, coordinated visualizations, and custom controls:

```python
from shiny import reactive
from shiny.express import input, render, ui
from querychat.express import QueryChat
from querychat.data import titanic
import plotly.express as px

# Create QueryChat
qc = QueryChat(
    titanic(),
    "titanic",
    data_description="Titanic passenger data with survival outcomes",
)

# Page configuration
ui.page_opts(
    title="Titanic Survival Analysis",
    fillable=True,
    class_="bslib-page-dashboard",
)

# Create sidebar with chat
with ui.sidebar(width=400):
    qc.ui()
    ui.hr()
    ui.input_action_button("reset", "Reset Filters", class_="w-100")

# Summary cards
with ui.layout_columns():
    with ui.value_box(showcase=ui.icon("users")):
        "Passengers"

        @render.text
        def count():
            return str(len(qc.df()))

    with ui.value_box(showcase=ui.icon("heart")):
        "Survival Rate"

        @render.text
        def survival():
            rate = qc.df()['survived'].mean() * 100
            return f"{rate:.1f}%"

    with ui.value_box(showcase=ui.icon("coins")):
        "Avg Fare"

        @render.text
        def fare():
            avg = qc.df()['fare'].mean()
            return f"${avg:.2f}"

# Main content area with visualizations
with ui.layout_columns():
    with ui.card():
        with ui.card_header():
            "Data Table"

            @render.text
            def table_title():
                return f" - {qc.title()}" if qc.title() else ""

        @render.data_frame
        def data_table():
            return qc.df()

    with ui.card():
        ui.card_header("Survival by Class")

        @render.plot
        def survival_by_class():
            df = qc.df()
            summary = df.groupby('pclass')['survived'].mean().reset_index()
            fig = px.bar(
                summary,
                x='pclass',
                y='survived',
                labels={'pclass': 'Class', 'survived': 'Survival Rate'},
            )
            return fig

with ui.layout_columns():
    with ui.card():
        ui.card_header("Age Distribution")

        @render.plot
        def age_dist():
            df = qc.df()
            fig = px.histogram(df, x='age', nbins=30)
            return fig

    with ui.card():
        ui.card_header("Fare by Class")

        @render.plot
        def fare_by_class():
            df = qc.df()
            fig = px.box(df, x='pclass', y='fare', color='survived')
            return fig
</details>

# Reset button handler
@reactive.effect
@reactive.event(input.reset)
def handle_reset():
    qc.sql("")
    qc.title(None)
    ui.notification_show("Filters cleared", type="message")
```
![](/images/multiple-datasets.png){fig-alt="Screenshot of a querychat app with two datasets: titanic and penguins." class="lightbox shadow rounded mb-3"}

This dashboard demonstrates:
- Natural language filtering through chat
- Multiple coordinated views (cards, table, plots)
- Custom reset button alongside natural language
- Dynamic titles reflecting current state
- Responsive layout that updates together

## See also

8 changes: 4 additions & 4 deletions pkg-py/docs/context.qmd
@@ -2,13 +2,13 @@
title: Provide context
---

To improve the LLM's ability to accurately translate natural language queries into SQL, it often helps to provide relevant metadata. Querychat automatically provides things like column names and data types to the LLM, but you can enhance this further with additional context like [data descriptions](#data-description). You can also provide [custom instructions](#extra-instructions) to add additional behaviors and even supply a fully [custom prompt template](#custom-template), if desired.
To improve the LLM's ability to accurately translate natural language queries into SQL, it often helps to provide relevant metadata. QueryChat automatically provides things like column names and data types to the LLM, but you can enhance this further with additional context like [data descriptions](#data-description). You can also provide [custom instructions](#extra-instructions) to add additional behaviors and even supply a fully [custom prompt template](#custom-template), if desired.

All of this information is provided to the LLM as part of the **system prompt** -- a string of text containing instructions and context for the LLM to consider when responding to user queries.
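
To make the idea concrete, here is a rough sketch of how such a prompt string could be assembled from the pieces mentioned above. It is not QueryChat's actual template: the `build_system_prompt` helper, its wording, and the section labels are all hypothetical, and the schema is a hand-written example.

```python
def build_system_prompt(table_name, schema, data_description="", extra_instructions=""):
    """Assemble a system prompt from table metadata and optional context.

    Hypothetical helper for illustration; QueryChat's real prompt differs.
    """
    # One line per column, e.g. "- fare (DOUBLE)".
    column_lines = "\n".join(f"- {name} ({dtype})" for name, dtype in schema.items())
    parts = [
        f"You translate questions about the table `{table_name}` into SQL.",
        f"Columns:\n{column_lines}",
    ]
    # Optional sections are included only when the caller supplies them.
    if data_description:
        parts.append(f"Data description:\n{data_description}")
    if extra_instructions:
        parts.append(f"Additional instructions:\n{extra_instructions}")
    return "\n\n".join(parts)


prompt = build_system_prompt(
    "titanic",
    {"survived": "INTEGER", "pclass": "INTEGER", "fare": "DOUBLE"},
    data_description="Titanic passenger data with survival outcomes",
)
print(prompt)
```

The point of the sketch is that column names and types are always present, while descriptions and instructions are layered on only when provided.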

## Default prompt

To see the complete system prompt that Querychat generates for the LLM, use the `system_prompt` property. This is useful for debugging and understanding exactly what context the LLM is using:
To see the complete system prompt that QueryChat generates for the LLM, use the `system_prompt` property. This is useful for debugging and understanding exactly what context the LLM is using:

```python
from querychat import QueryChat
@@ -32,7 +32,7 @@ By default, the system prompt contains the following components:

## Data description {#data-description}

If your column names are descriptive, Querychat may already work well without additional context. However, if your columns are named `x`, `V1`, `value`, etc., you should provide a data description. Use the `data_description` parameter for this:
If your column names are descriptive, QueryChat may already work well without additional context. However, if your columns are named `x`, `V1`, `value`, etc., you should provide a data description. Use the `data_description` parameter for this:

```{.python filename="titanic-app.py"}
from pathlib import Path
@@ -46,7 +46,7 @@ qc = QueryChat(
app = qc.app()
```

Querychat doesn't need this information in any particular format -- just provide what a human would find helpful:
QueryChat doesn't need this information in any particular format -- just provide what a human would find helpful:

```{.markdown filename="data_description.md"}
This dataset contains information about Titanic passengers, collected for predicting survival.
Binary file added pkg-py/docs/images/airbnb.png
Binary file modified pkg-py/docs/images/multiple-datasets.png
Binary file modified pkg-py/docs/images/plotly-data-view.png
Binary file modified pkg-py/docs/images/querychat.png
Binary file modified pkg-py/docs/images/quickstart-filter.png
Binary file modified pkg-py/docs/images/quickstart-summary.png
Binary file modified pkg-py/docs/images/quickstart.png
Binary file modified pkg-py/docs/images/rich-data-views.png
Binary file removed pkg-py/docs/images/sidebot.png
Binary file not shown.