Releases: llamastack/llama-stack

v0.3.4

03 Dec 19:05

What's Changed

Full Changelog: v0.3.3...v0.3.4

v0.3.3

24 Nov 21:21

What's Changed

  • fix: allowed_models config did not filter models (backport #4030) by @mergify[bot] in #4223
  • fix: Vector store persistence across server restarts (backport #3977) by @mergify[bot] in #4225
  • fix: enable SQLite WAL mode to prevent database locking errors (backport #4048) by @mergify[bot] in #4226
  • fix(docs): fix glob vulnerability (backport #4193) by @mergify[bot] in #4227
  • fix: enforce allowed_models during inference requests (backport #4197) by @mergify[bot] in #4228
  • fix: update hard-coded google model names (backport #4212) by @mergify[bot] in #4229
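For readers unfamiliar with the WAL fix listed above (#4226), here is a minimal sketch of switching a SQLite database to write-ahead logging with Python's standard sqlite3 module. The helper name open_with_wal and the demo file name are hypothetical, and this is not taken from the Llama Stack code base; it only illustrates the general technique.

```python
import os
import sqlite3
import tempfile


def open_with_wal(path: str) -> sqlite3.Connection:
    """Open a SQLite database file and switch it to write-ahead logging.

    Hypothetical helper for illustration; not the project's implementation.
    """
    # A busy timeout makes writers wait briefly for a lock instead of failing at once.
    conn = sqlite3.connect(path, timeout=5.0)
    # The PRAGMA returns the journal mode that is actually in effect afterwards.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    return conn


if __name__ == "__main__":
    # WAL needs a file-backed database, so the demo uses a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_with_wal(os.path.join(tmp, "demo.db"))
        print(conn.execute("PRAGMA journal_mode").fetchone()[0])
        conn.close()
```

In WAL mode readers do not block a writer and a writer does not block readers, which is why it tends to reduce "database is locked" errors under concurrent access.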

Full Changelog: v0.3.2...v0.3.3

v0.3.2

12 Nov 23:22

What's Changed

  • fix: only set UV_INDEX_STRATEGY when UV_EXTRA_INDEX_URL is present by @ashwinb in #4017
  • fix(ci): export UV_INDEX_STRATEGY to current shell before running uv sync by @ashwinb in #4019
  • fix: print help for list-deps if no args (backport #4078) by @mergify[bot] in #4083
  • docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (backport #4122) by @mergify[bot] in #4136
  • docs: clarify model identification uses provider_model_id not model_id (backport #4128) by @mergify[bot] in #4137
  • chore(ci): remove unused recordings (backport #4074) by @mergify[bot] in #4141
  • fix: harden storage semantics (backport #4118) by @mergify[bot] in #4138
  • fix(inference): enable routing of models with provider_data alone (backport #3928) by @mergify[bot] in #4142

Full Changelog: v0.3.1...v0.3.2

v0.3.1

31 Oct 23:05

What's Changed

  • feat(cherry-pick): fixes for 0.3.1 release by @ashwinb in #3998
  • fix(ci): install client from release branch before uv sync by @ashwinb in #4002
  • chore(release-0.3.x): handle missing external_providers_dir by @ashwinb in #4011
  • fix(ci): unset empty UV index env vars to prevent uv errors by @ashwinb in #4013
  • feat: support workers in run config by @ashwinb in #4014
  • docs: A getting started notebook featuring simple agent examples by @ashwinb in #4015

Full Changelog: v0.3.0...v0.3.1

v0.3.0

22 Oct 19:21

Highlights

  • Stable OpenAI-Compatible APIs
  • Llama Stack now separates APIs into stable (/v1/), experimental (/v1alpha/ and /v1beta/), and deprecated (marked with deprecated = True).
  • extra_body/metadata support for APIs which support extra functionality compared to the OpenAI implementation
  • Documentation overhaul: Migration to Docusaurus, modern formatting, and improved API docs
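The API levels in the highlights above can be told apart by route prefix. The sketch below is illustrative only: the function name api_level and the example paths are assumptions, not part of Llama Stack. It also cannot detect deprecated routes, since the notes describe deprecation as a flag (deprecated = True) rather than a prefix.

```python
def api_level(path: str) -> str:
    """Classify a request path by the stability prefixes named in the release notes.

    Hypothetical helper; returns "unknown" for anything outside those prefixes.
    """
    prefixes = {
        "/v1alpha/": "experimental",
        "/v1beta/": "experimental",
        "/v1/": "stable",
    }
    for prefix, level in prefixes.items():
        if path.startswith(prefix):
            return level
    return "unknown"


if __name__ == "__main__":
    # Example paths are made up for the demo.
    for example in ("/v1/models", "/v1alpha/example", "/v1beta/example", "/v2/other"):
        print(example, api_level(example))
```

Because every prefix ends with a slash, "/v1alpha/..." never matches the stable "/v1/" prefix, so the order of the checks does not matter.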

What's Changed

v0.2.23

26 Sep 21:41

Highlights

  • Overhauls documentation with Docusaurus migration and modern formatting.
  • Standardizes Ollama and Fireworks provider with OpenAI compatibility layer.
  • Combines dynamic model discovery with static embedding metadata for better model information.
  • Refactors server.main for better code organization.
  • Introduces API leveling with post_training and eval promoted to v1alpha.

What's Changed

New Contributors

Full Changelog: v0.2.22...v0.2.23

v0.2.22

16 Sep 20:15

Highlights

  • Migrated to unified "setups" system for test config
  • Added default inference store automatically during llama stack build
  • Introduced write queue for inference store
  • Proposed API leveling framework
  • Enhanced Together provider with embedding and dynamic model support

What's Changed

  • feat(tests): migrate to global "setups" system for test configuration by @ashwinb in #3390
  • chore: remove unused variable by @ehhuang in #3389
  • feat: include a default inference store during llama stack build by @mattf in #3373
  • feat: Add vector_db_id to chunk metadata by @are-ces in #3304
  • fix: Add missing files_api parameter to MemoryToolRuntimeImpl test by @akram in #3394
  • fix: pre-commit issues: non executable shebang file and removal of @pytest.mark.asyncio decorator by @akram in #3397
  • chore: update the vertexai inference impl to use openai-python for openai-compat functions by @mattf in #3377
  • ci: Re-enable pre-commit to fail by @leseb in #3399
  • fix: Fireworks chat completion broken due to telemetry by @slekkala1 in #3392
  • chore: logging perf improvements by @ehhuang in #3393
  • revert: Fireworks chat completion broken due to telemetry by @franciscojavierarceo in #3402
  • fix: unbound variable error in schedule-record-workflow.sh by @derekhiggins in #3401
  • chore: introduce write queue for inference_store by @ehhuang in #3383
  • docs: horizontal nav bar by @reluctantfuturist in #3407
  • chore(python-deps): bump pytest from 8.4.1 to 8.4.2 by @dependabot[bot] in #3359
  • chore(python-deps): bump locust from 2.39.1 to 2.40.1 by @dependabot[bot] in #3358
  • chore(python-deps): bump openai from 1.102.0 to 1.106.1 by @dependabot[bot] in #3356
  • chore(ui-deps): bump tailwindcss from 4.1.6 to 4.1.13 in /llama_stack/ui by @dependabot[bot] in #3362
  • chore: telemetry test by @ehhuang in #3405
  • chore: move benchmarking related code by @ehhuang in #3406
  • fix(inference_store): on duplicate chat completion IDs, replace by @ashwinb in #3408
  • chore: remove openai dependency from providers by @leseb in #3398
  • fix: AWS Bedrock inference profile ID conversion for region-specific endpoints by @skamenan7 in #3386
  • chore(replay): improve replay robustness with un-validated construction by @mattf in #3414
  • feat: add Azure OpenAI inference provider support by @leseb in #3396
  • chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG by @franciscojavierarceo in #3367
  • chore: update the vLLM inference impl to use OpenAIMixin for openai-compat functions by @mattf in #3404
  • chore(unit tests): remove network use, update async test by @mattf in #3418
  • feat: Add langchain llamastack Integration example notebook by @slekkala1 in #3314
  • fix: oasdiff enhancements and stability by @cdoern in #3419
  • fix: Improve pre-commit workflow error handling and feedback by @akram in #3400
  • feat: migrate to FIPS-validated cryptographic algorithms by @rhdedgar in #3423
  • chore(recorder, tests): add test for openai /v1/models by @mattf in #3426
  • chore(tests): always show slowest tests by @mattf in #3431
  • chore(recorder): add support for NOT_GIVEN by @mattf in #3430
  • chore(ui-deps): bump next from 15.3.3 to 15.5.3 in /llama_stack/ui by @dependabot[bot] in #3438
  • chore(ui-deps): bump @radix-ui/react-select from 2.2.5 to 2.2.6 in /llama_stack/ui by @dependabot[bot] in #3437
  • chore(recorder): update mocks to be closer to non-mock environment by @mattf in #3442
  • feat: create HTTP DELETE API endpoints to unregister ScoringFn and Benchmark resources in Llama Stack by @r3v5 in #3371
  • feat: add dynamic model registration support to TGI inference by @mattf in #3417
  • chore: various watsonx fixes by @leseb in #3428
  • feat: introduce api leveling proposal by @cdoern in #3317
  • fix: docker failing to start container[pydantic] by @slekkala1 in #3460
  • feat: add embedding and dynamic model support to Together inference adapter by @mattf in #3458
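The inference-store fix listed above (#3408) replaces an existing record when a chat completion ID is seen again. A minimal sketch of that "replace on duplicate key" behavior, using Python's sqlite3 module, follows. The table name, column names, and the store_completion helper are hypothetical and are not taken from the Llama Stack schema.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical schema keyed by completion ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chat_completions (id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def store_completion(conn: sqlite3.Connection, completion_id: str, payload: str) -> None:
    """Insert a completion; if the ID already exists, the old row is replaced."""
    conn.execute(
        "INSERT OR REPLACE INTO chat_completions (id, payload) VALUES (?, ?)",
        (completion_id, payload),
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_store()
    store_completion(conn, "chatcmpl-1", "first")
    store_completion(conn, "chatcmpl-1", "second")  # same ID: replaces, does not fail
    print(conn.execute("SELECT id, payload FROM chat_completions").fetchall())
```

Replacing rather than rejecting the duplicate means a repeated write cannot raise a uniqueness error, at the cost of silently discarding the earlier payload.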

New Contributors

Full Changelog: v0.2.21...v0.2.22

v0.2.21

08 Sep 22:30

Highlights

  • Testing infrastructure improvements and fixes
  • Backwards compatibility tests for core APIs
  • Added OpenAI Prompts API
  • Updated RAG Tool to use Files API and Vector Stores API
  • Descriptive MCP server connection errors

What's Changed

  • feat(files, s3, expiration): add expires_after support to S3 files provider by @mattf in #3283
  • docs: add VLM NIM example by @jiayin-nvidia in #3277
  • chore(migrate apis): move VectorDBWithIndex from embeddings to openai_embeddings by @mattf in #3294
  • chore(ui-deps): bump framer-motion from 11.18.2 to 12.23.12 in /llama_stack/ui by @dependabot[bot] in #3291
  • chore(ui-deps): bump @types/node from 20.17.47 to 24.3.0 in /llama_stack/ui by @dependabot[bot] in #3290
  • chore(ui-deps): bump eslint-config-next from 15.3.2 to 15.5.2 in /llama_stack/ui by @dependabot[bot] in #3288
  • chore(ui-deps): bump prettier from 3.5.3 to 3.6.2 in /llama_stack/ui by @dependabot[bot] in #3289
  • chore(ui-deps): bump @radix-ui/react-tooltip from 1.2.6 to 1.2.8 in /llama_stack/ui by @dependabot[bot] in #3287
  • chore(python-deps): bump locust from 2.39.0 to 2.39.1 by @dependabot[bot] in #3284
  • refactor: remove llama-api-client from pyproject.toml by @r3v5 in #3299
  • chore(python-deps): bump pymilvus from 2.6.0 to 2.6.1 by @dependabot[bot] in #3285
  • refactor: use generic WeightedInMemoryAggregator for hybrid search in SQLiteVecIndex by @r3v5 in #3303
  • fix: Fix mock vector DB schema in Qdrant tests by @varshaprasad96 in #3295
  • chore(python-deps): replace ibm_watson_machine_learning with ibm_watsonx_ai by @are-ces in #3302
  • chore: Improve error message for missing provider dependencies by @ehhuang in #3315
  • feat(tests): auto-merge all model list responses and unify recordings by @ashwinb in #3320
  • fix(tests): set inference mode to be replay by default by @ashwinb in #3326
  • chore: handle missing finish_reason by @ehhuang in #3328
  • fix: distro-codegen pre-commit hook file pattern by @derekhiggins in #3337
  • refactor(server): remove hardcoded 409 and 404 status codes in server.py using httpx constants by @r3v5 in #3333
  • fix: Make SentenceTransformer embedding operations non-blocking by @derekhiggins in #3335
  • chore: async inference store write by @ehhuang in #3318
  • fix: Move to older version for docker container failure [fireworks-ai] by @slekkala1 in #3338
  • fix: show descriptive MCP server connection errors instead of generic 500s by @skamenan7 in #3256
  • chore: unbreak inference store test by @ehhuang in #3340
  • fix: use lambda pattern for bedrock config env vars by @skamenan7 in #3307
  • fix: Fix locations of distribution runtime directories by @derekhiggins in #3336
  • feat!: Migrate Vector DB IDs to Vector Store IDs (breaking change) by @franciscojavierarceo in #3253
  • feat(batches, completions): add /v1/completions support to /v1/batches by @mattf in #3309
  • chore(sambanova test): skip with_n tests for sambanova, it is not implemented server-side by @mattf in #3342
  • feat(tests): introduce a test "suite" concept to encompass dirs, options by @ashwinb in #3339
  • feat: Updating Rag Tool to use Files API and Vector Stores API by @franciscojavierarceo in #3344
  • chore: update the gemini inference impl to use openai-python for openai-compat functions by @mattf in #3351
  • chore(gemini, tests): add skips for n and completions, gemini api does not support them by @mattf in #3350
  • chore: update the sambanova inference impl to use openai-python for openai-compat functions by @mattf in #3345
  • chore(groq test): skip with_n tests for groq, it is not supported server-side by @mattf in #3346
  • test: introduce api conformance test by @cdoern in #3257
  • chore: update the groq inference impl to use openai-python for openai-compat functions by @mattf in #3348
  • chore(groq test): skip completions tests for groq, api is not supported server-side by @mattf in #3347
  • chore: update the anthropic inference impl to use openai-python for openai-compat functions by @mattf in #3366
  • chore(ui-deps): bump react-dom and @types/react-dom in /llama_stack/ui by @dependabot[bot] in #3360
  • chore(ui-deps): bump sonner from 2.0.6 to 2.0.7 in /llama_stack/ui by @dependabot[bot] in #3364
  • chore(ui-deps): bump lucide-react from 0.510.0 to 0.542.0 in /llama_stack/ui by @dependabot[bot] in #3363
  • chore(ui-deps): bump @radix-ui/react-dropdown-menu from 2.1.14 to 2.1.16 in /llama_stack/ui by @dependabot[bot] in #3361
  • chore(github-deps): bump astral-sh/setup-uv from 6.6.0 to 6.6.1 by @dependabot[bot] in #3355
  • docs: Update changelog by @terrytangyuan in #3343
  • chore(github-deps): bump actions/stale from 9.1.0 to 10.0.0 by @dependabot[bot] in #3352
  • chore(github-deps): bump actions/setup-node from 4.4.0 to 5.0.0 by @dependabot[bot] in #3353
  • chore(github-deps): bump actions/setup-python from 5.6.0 to 6.0.0 by @dependabot[bot] in #3354
  • chore(github-deps): bump actions/checkout from 4.1.7 to 5.0.0 by @dependabot[bot] in #3357
  • feat: Add Kubernetes auth provider to use SelfSubjectReview and kubernetes api server by @akram in #2559
  • docs: add MongoDB to external provider list by @mohammaddaoudfarooqi in #3369
  • feat: Adding OpenAI Prompts API by @franciscojavierarceo in #3319
  • fix: environment variable typo in inference recorder error message by @derekhiggins in #3374
  • fix: use dataset version 4.0.0 or above by @slekkala1 in #3379
  • fix: pre-commit failing by @slekkala1 in #3381
  • fix(deps): bump datasets versions for all providers by @ashwinb in #3382

New Contributors

Full Changelog: v0.2.20...v0.2.21

v0.2.20

29 Aug 22:25

Key changes in this release:

Build and Environment

  • Environment improvements: fixed env var replacement to preserve types.
  • Docker stability: fixed container startup failures for Fireworks AI provider.
  • Removed absolute paths in build for better portability.

Features

  • UI Enhancements: Implemented file upload and VectorDB creation/configuration directly in UI.
  • Vector Store Improvements: Added keyword, vector, and hybrid search inside vector store.
  • Added S3 authorization support for file providers.
  • SQL Store: Added inequality support to where clause.

Documentation

  • Fixed post-training docs.
  • Added Contributor Guidelines for creating Internal vs. External providers.

Fixes

  • Removed unsupported bfcl scoring function.
  • Multiple reliability and configuration fixes for providers and environment handling.

Engineering / Chores

  • Cleaner internal development setup with consistent paths.
  • Incremental improvements to provider integration and vector store behavior.

What's Changed

  • docs: fix post_training docs by @cdoern in #3262
  • chore: remove absolute paths by @raghotham in #3263
  • docs: Contributor guidelines for creating Internal or External providers by @kelbrown20 in #3111
  • feat(UI): Implementing File Upload and VectorDB Creation/Configuration in Playground by @franciscojavierarceo in #3266
  • fix(env): env var replacement preserve types by @omertuc in #3270
  • fix: docker failing to start container [fireworks-ai] by @slekkala1 in #3267
  • chore(dev): add inequality support to sqlstore where clause by @mattf in #3272
  • feat(s3 auth): add authorization support for s3 files provider by @mattf in #3265
  • feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider by @r3v5 in #3064
  • fix: Remove bfcl scoring function as not supported by @slekkala1 in #3281

Full Changelog: v0.2.19...v0.2.20

New Contributors

  • @omertuc made their first contribution in #3270
  • @r3v5 made their first contribution in vector store hybrid search

v0.2.19

26 Aug 22:06

Highlights

What's Changed

  • chore: Faster npm pre-commit by @franciscojavierarceo in #3206
  • fix: disable ui-prettier & ui-eslint by @mattf in #3207
  • chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage by @Elbehery in #3061
  • fix: fix openai_embeddings for asymmetric embedding NIMs by @jiayin-nvidia in #3205
  • chore(files tests): update files integration tests and fix inline::localfs by @mattf in #3195
  • fix: Fix broken package-lock.json by @franciscojavierarceo in #3209
  • fix: Use pool_pre_ping=True in SQLAlchemy engine creation by @omertuc in #3208
  • fix: handle mcp tool calls in previous response correctly by @grs in #3155
  • chore: Update dependabot to capture package-lock.json by @franciscojavierarceo in #3212
  • chore(python-deps): bump weaviate-client from 4.16.5 to 4.16.9 by @dependabot[bot] in #3219
  • chore(python-deps): bump locust from 2.38.0 to 2.39.0 by @dependabot[bot] in #3221
  • chore(ui-deps): bump tailwind-merge from 3.3.0 to 3.3.1 in /llama_stack/ui by @dependabot[bot] in #3223
  • chore(ui-deps): bump @radix-ui/react-separator from 1.1.6 to 1.1.7 in /llama_stack/ui by @dependabot[bot] in #3222
  • chore(ui-deps): bump eslint-config-prettier from 10.1.5 to 10.1.8 in /llama_stack/ui by @dependabot[bot] in #3220
  • chore(ui-deps): bump @radix-ui/react-collapsible from 1.1.11 to 1.1.12 in /llama_stack/ui by @dependabot[bot] in #3218
  • chore(python-deps): bump chromadb from 1.0.16 to 1.0.20 by @dependabot[bot] in #3217
  • chore(ui-deps): bump typescript from 5.8.3 to 5.9.2 in /llama_stack/ui by @dependabot[bot] in #3216
  • chore(github-deps): bump actions/setup-node from 4.1.0 to 4.4.0 by @dependabot[bot] in #3214
  • chore(github-deps): bump amannn/action-semantic-pull-request from 5.5.3 to 6.1.0 by @dependabot[bot] in #3215
  • chore(python-deps): bump llama-api-client from 0.1.2 to 0.2.0 by @dependabot[bot] in #3173
  • chore(github-deps): bump actions/checkout from 4.2.2 to 5.0.0 by @dependabot[bot] in #3178
  • chore(github-deps): bump astral-sh/setup-uv from 6.4.3 to 6.5.0 by @dependabot[bot] in #3179
  • feat: Add CORS configuration support for server by @skamenan7 in #3201
  • feat: Remove initialize() Method from LlamaStackAsLibrary by @Elbehery in #2979
  • docs: update the docs for NVIDIA Inference provider by @jiayin-nvidia in #3227
  • fix: fix the error type in embedding test case by @jiayin-nvidia in #3197
  • refactor(logging): rename llama_stack logger categories by @Elbehery in #3065
  • feat(UI): Adding Conversation History by @franciscojavierarceo in #3203
  • feat(api): introduce /rerank by @ehhuang in #2940
  • feat: Add S3 Files Provider by @mattf in #3202
  • chore: Add UI linter back by @franciscojavierarceo in #3230
  • fix: ensure assistant message is followed by tool call message as expected by openai by @grs in #3224
  • chore: indicate to mypy that InferenceProvider.rerank is concrete by @mattf in #3238
  • chore: indicate to mypy that InferenceProvider.batch_completion/batch_chat_completion is concrete by @mattf in #3239
  • feat: implement query_metrics by @cdoern in #3074
  • feat(distro): fork off a starter-gpu distribution by @ashwinb in #3240
  • feat: Add optional idempotency support to batches API by @mattf in #3171
  • chore(github-deps): bump actions/setup-node from 4.1.0 to 4.4.0 by @dependabot[bot] in #3246
  • chore(ui-deps): bump remeda from 2.26.1 to 2.30.0 in /llama_stack/ui by @dependabot[bot] in #3242
  • chore(ui-deps): bump @testing-library/dom from 10.4.0 to 10.4.1 in /llama_stack/ui by @dependabot[bot] in #3244
  • chore(ui-deps): bump eslint-plugin-prettier from 5.4.0 to 5.5.4 in /llama_stack/ui by @dependabot[bot] in #3241
  • chore(github-deps): bump amannn/action-semantic-pull-request from 6.1.0 to 6.1.1 by @dependabot[bot] in #3248
  • chore(github-deps): bump astral-sh/setup-uv from 6.5.0 to 6.6.0 by @dependabot[bot] in #3247
  • chore(ui-deps): bump @testing-library/jest-dom from 6.6.3 to 6.8.0 in /llama_stack/ui by @dependabot[bot] in #3243
  • feat(testing): remove SQLite dependency from inference recorder by @derekhiggins in #3254
  • feat: Add example notebook for Langchain + LLAMAStack integration by @slekkala1 in #3228
  • chore: Add example notebook for Langchain + LLAMAStack integration (#3228) by @mattf in #3259
  • feat(distro): no huggingface provider for starter by @ashwinb in #3258

Full Changelog: v0.2.18...v0.2.19