feat(ci): add Bedrock integration tests with record/replay #4292

skamenan7 · 2025-12-03T19:08:02Z

What does this PR do?

Adds Bedrock integration tests to CI using a record/replay mechanism. Tests run against pre-recorded API responses, without the need for AWS credentials in CI.
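For readers new to the idea, here is a minimal sketch of record/replay, not the recorder this PR uses. `ReplayStore`, its in-memory dictionary, and the `backend` callable are hypothetical names for illustration; the real recorder persists recordings to disk.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class ReplayStore:
    """Hypothetical in-memory store keyed by a hash of the request."""

    def __init__(self, mode: str) -> None:
        if mode not in ("record", "replay"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.recordings: Dict[str, Any] = {}

    @staticmethod
    def key(request: Dict[str, Any]) -> str:
        # Canonical JSON so that dictionary key order does not change the hash.
        payload = json.dumps(request, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(self, request: Dict[str, Any], backend: Callable[[Dict[str, Any]], Any]) -> Any:
        k = self.key(request)
        if self.mode == "record":
            # Record mode: hit the real backend (needs credentials) and save the answer.
            response = backend(request)
            self.recordings[k] = response
            return response
        # Replay mode: never touch the backend, only serve saved answers.
        if k not in self.recordings:
            raise KeyError("no recording for this request; re-record it first")
        return self.recordings[k]
```

In replay mode the backend is never called, which is why a dummy API key is enough in CI. A request with no matching recording fails instead of falling back to the network.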

The main challenge was that Bedrock's OpenAI-compatible API doesn't support everything - no tool calling, no embeddings, no dynamic model listing. So instead of running the full base suite (which would fail on ~40 tests), I created a dedicated bedrock suite with just the tests that actually work.

Changes:

- New run-bedrock.yaml stack config with dummy API key for replay mode
- bedrock suite in suites.py pointing to 3 test functions (6 parametrized tests total)
- Config resolution fix for distro::file.yaml format in library mode
- Added docs/source/providers/inference/bedrock_recording_guide.md so contributors with AWS access can re-record tests when needed
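To illustrate the distro::file.yaml format, here is a rough sketch of how such a spec could map to a file on disk. `resolve_stack_config` and the `distributions_dir` parameter are hypothetical names, not the function changed in this PR; the directory layout is assumed from the `src/llama_stack/distributions/ci-tests/` path in this PR.

```python
from pathlib import Path


def resolve_stack_config(spec: str, distributions_dir: Path) -> Path:
    """Resolve 'distro::file.yaml' to a path; treat anything else as a plain path."""
    if "::" in spec:
        distro, _, filename = spec.partition("::")
        if not distro or not filename:
            raise ValueError(f"expected 'distro::file.yaml', got {spec!r}")
        # Assumption: each distro keeps its run configs in its own directory.
        return distributions_dir / distro / filename
    return Path(spec)
```

With `distributions_dir=Path("src/llama_stack/distributions")`, the spec `ci-tests::run-bedrock.yaml` would resolve to `src/llama_stack/distributions/ci-tests/run-bedrock.yaml` under this assumption.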

Closes #4095

Test Plan

Run from tests/integration/inference:


  uv run pytest -v \
    test_openai_completion.py::test_openai_chat_completion_non_streaming \
    test_openai_completion.py::test_openai_chat_completion_streaming \
    test_openai_completion.py::test_inference_store \
    --setup=bedrock \
    --stack-config=ci-tests::run-bedrock.yaml \
    --inference-mode=replay \
    -k "client_with_models"

Expected: 6 passed

test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-inference:chat_completion:non_streaming_01] PASSED [ 16%]
test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-inference:chat_completion:streaming_01] PASSED [ 33%]
test_openai_completion.py::test_inference_store[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-True] PASSED [ 50%]
test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-inference:chat_completion:non_streaming_02] PASSED [ 66%]
test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-inference:chat_completion:streaming_02] PASSED [ 83%]
test_openai_completion.py::test_inference_store[client_with_models-txt=bedrock/openai.gpt-oss-20b-1:0-False] PASSED [100%]

derekhiggins · 2025-12-04T08:42:32Z

src/llama_stack/distributions/ci-tests/run-bedrock.yaml

+providers:
+  inference:
+    - provider_id: bedrock
+      provider_type: remote::bedrock


remote::bedrock is already part of ci-tests/run.yaml. Can that be used instead (i.e. no need for a new run.yaml)?

Good point! I'll check whether ci-tests/run.yaml can be used directly. I created the separate file to have a minimal bedrock-only config, but if the existing run.yaml already includes the bedrock provider, I can simplify by using that instead.

docs/source/providers/inference/bedrock_recording_guide.md

src/llama_stack/testing/api_recorder.py

derekhiggins · 2025-12-04T09:01:41Z

tests/integration/ci_matrix.json

 {
  "default": [
    {"suite": "base", "setup": "ollama"},
+    {"suite": "bedrock", "setup": "bedrock", "allowed_clients": ["library"], "stack_config": "ci-tests::run-bedrock.yaml"},


I'm not sure we can just add this to CI here. Anybody updating tests that need new recordings would need a bedrock key (or we'd need a key in CI for new recordings; is that the plan?)

- Create dedicated bedrock suite with 3 compatible test functions
- Add run-bedrock.yaml stack config for CI
- Enable config resolution for distro::file.yaml format in library mode
- Add test recordings for streaming, non-streaming, and inference store
- Skip tool-calling tests for Bedrock (not supported by AWS)
- Add recording guide documentation for contributors

Tested with GPT-OSS model on us-west-2 region.

The bedrock suite uses specific test function paths like "test_file.py::test_function" in its roots. The pytest_ignore_collect hook was treating these as filesystem paths, causing 0 tests to be collected.

Changes:
- Strip "::test_function" suffix when checking file paths
- Add pytest_collection_modifyitems to filter to specific tests

Without this fix, cleanup_recordings.py marks bedrock recordings as unused and deletes them in CI.
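The two changes in this commit can be sketched as plain helper functions, leaving out the pytest hooks themselves. `split_root` and `matches_roots` are hypothetical names, not the code in this PR: the first strips the "::test_function" suffix so the remainder can be compared as a file path, and the second is the kind of check a `pytest_collection_modifyitems` hook could apply to each collected item's node ID.

```python
from typing import List, Optional, Tuple


def split_root(root: str) -> Tuple[str, Optional[str]]:
    """Split 'test_file.py::test_function' into (file path, test name or None)."""
    path, sep, test_name = root.partition("::")
    return path, (test_name if sep else None)


def matches_roots(nodeid: str, roots: List[str]) -> bool:
    """Return True if a pytest node ID is selected by any suite root."""
    node_path, _, rest = nodeid.partition("::")
    # Drop the parametrization suffix, e.g. 'test_x[param]' -> 'test_x'.
    node_func = rest.split("[", 1)[0]
    for root in roots:
        path, test_name = split_root(root)
        same_file = node_path == path
        inside_dir = node_path.startswith(path.rstrip("/") + "/")
        if not (same_file or inside_dir):
            continue
        # A root without '::' selects everything under that path.
        if test_name is None or test_name == node_func:
            return True
    return False
```

Matching on the function name without its parameter suffix means all parametrized variants of a listed test are kept.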

- Skip embedding model validation for bedrock setup (empty string forces override)
- Exclude stream_options from hash computation for consistent recordings

The bedrock provider adds stream_options for telemetry but this was causing hash mismatches since recordings were made before this was added. Now infrastructure/telemetry fields are excluded from the hash computation.
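A minimal sketch of the hash exclusion described above, assuming the request body is a JSON-serializable dictionary and that stream_options sits at the top level. `request_hash` and `EXCLUDED_FIELDS` are hypothetical names; the recorder in this PR may normalize requests differently.

```python
import hashlib
import json
from typing import Any, Dict

# Assumption: fields treated as infrastructure/telemetry rather than request content.
EXCLUDED_FIELDS = frozenset({"stream_options"})


def request_hash(body: Dict[str, Any]) -> str:
    """Hash a request body while ignoring excluded top-level fields."""
    filtered = {k: v for k, v in body.items() if k not in EXCLUDED_FIELDS}
    # Sorted keys and fixed separators give a stable serialization.
    canonical = json.dumps(filtered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme, a request recorded before stream_options was added and the same request sent later with stream_options produce the same hash, so the existing recording is still found in replay mode.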

- Remove separate run-bedrock.yaml in favor of modifying templates
- Update bedrock provider config: dummy API key for replay mode, us-west-2 region
- Pre-register bedrock model in ci-tests template (Bedrock /v1/models returns empty)
- Update ci_matrix.json to use default stack config

meta-cla bot added the CLA Signed label Dec 3, 2025

skamenan7 force-pushed the feat/bedrock-ci-4095 branch 7 times, most recently from 305be8e to 0667de8 Compare December 3, 2025 22:46

derekhiggins reviewed Dec 4, 2025

skamenan7 force-pushed the feat/bedrock-ci-4095 branch 3 times, most recently from 69eb2c4 to 51121ae Compare December 4, 2025 20:13

skamenan7 marked this pull request as ready for review December 4, 2025 21:36

skamenan7 requested review from ashwinb, bbrowning, cdoern, ehhuang, franciscojavierarceo, leseb, mattf and raghotham as code owners December 4, 2025 21:36

skamenan7 requested a review from derekhiggins December 4, 2025 22:07

skamenan7 force-pushed the feat/bedrock-ci-4095 branch from d36bfe4 to 800ef86 Compare December 5, 2025 15:10

skamenan7 added 5 commits December 8, 2025 08:25

test: update bedrock config test to match new defaults

bd9f2dd

skamenan7 force-pushed the feat/bedrock-ci-4095 branch from 800ef86 to bd9f2dd Compare December 8, 2025 13:25
