fix(core): handle unescaped newlines in JSON structured output #10965
Conversation
Some LLMs output actual newline characters inside JSON string values instead of properly escaped `\n` sequences, which breaks JSON parsing. This adds preprocessing to escape unescaped control characters within JSON string values before parsing, making structured output more robust across different LLM providers.

- Add escapeUnescapedControlCharsInJsonStrings() helper function
- Integrate it into preprocessText() in BaseFormatHandler
- Add unit tests for the helper and integration tests for streaming
🦋 Changeset detected. Latest commit: 428bc30. The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages.
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and publish to npm yourself or [setup this action to publish automatically](https://github.com/changesets/action#with-publishing). If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.

⚠️ `main` is currently in **pre mode** so this branch has prereleases rather than normal releases. If you want to exit prereleases, run `changeset pre exit` on `main`. ⚠️

# Releases

## @mastra/[email protected]

### Minor Changes

- Add stored agents support (#10953)

  Agents can now be stored in the database and loaded at runtime. This lets you persist agent configurations and dynamically create executable Agent instances from storage.

  ```typescript
  import { Mastra } from '@mastra/core';
  import { LibSQLStore } from '@mastra/libsql';

  const mastra = new Mastra({
    storage: new LibSQLStore({ url: ':memory:' }),
    tools: { myTool },
    scorers: { myScorer },
  });

  // Create agent in storage via API or directly
  await mastra.getStorage().createAgent({
    agent: {
      id: 'my-agent',
      name: 'My Agent',
      instructions: 'You are helpful',
      model: { provider: 'openai', name: 'gpt-4' },
      tools: { myTool: {} },
      scorers: { myScorer: { sampling: { type: 'ratio', rate: 0.5 } } },
    },
  });

  // Load and use the agent
  const agent = await mastra.getStoredAgentById('my-agent');
  const response = await agent.generate({ messages: 'Hello!' });

  // List all stored agents with pagination
  const { agents, total, hasMore } = await mastra.listStoredAgents({
    page: 0,
    perPage: 10,
  });
  ```

  Also adds a memory registry to Mastra so stored agents can reference memory instances by key.

### Patch Changes

- Add agentId and agentName attributes to MODEL_GENERATION spans. This allows users to correlate gen_ai.usage metrics with specific agents when analyzing LLM operation spans. The attributes are exported as gen_ai.agent.id and gen_ai.agent.name in the OtelExporter. (#10984)

- Fix JSON parsing errors when LLMs output unescaped newlines in structured output strings (#10965)

  Some LLMs (particularly when not using native JSON mode) output actual newline characters inside JSON string values instead of properly escaped `\n` sequences. This breaks JSON parsing and causes structured output to fail.

  This change adds preprocessing to escape unescaped control characters (`\n`, `\r`, `\t`) within JSON string values before parsing, making structured output more robust across different LLM providers.

- Fix toolCallId propagation in agent network tool execution. The toolCallId property was undefined at runtime despite being required by TypeScript type definitions in AgentToolExecutionContext. Now properly passes the toolCallId through to the tool's context during network tool execution. (#10951)

- Exports `convertFullStreamChunkToMastra` from the stream module for AI SDK stream chunk transformations. (#10911)

## @mastra/[email protected]

### Minor Changes

- Add stored agents support (#10953)

  Agents can now be stored in the database and loaded at runtime. This lets you persist agent configurations and dynamically create executable Agent instances from storage.

  Also adds a memory registry to Mastra so stored agents can reference memory instances by key.

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Minor Changes

- Add stored agents support (#10953)

  Agents can now be stored in the database and loaded at runtime. This lets you persist agent configurations and dynamically create executable Agent instances from storage.

  Also adds a memory registry to Mastra so stored agents can reference memory instances by key.

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Return NetworkDataPart on each agent-execution-event and workflow-execution-event in network streams (#10982)

- Fixed tool-call-suspended chunks being dropped in workflow-step-output when using AI SDK. Previously, when an agent inside a workflow step called a tool that got suspended, the tool-call-suspended chunk was not received on the frontend even though tool-input-available chunks were correctly received. (#10987)

  The issue occurred because tool-call-suspended was not included in the isMastraTextStreamChunk list, causing it to be filtered out in transformWorkflow. Now tool-call-suspended, tool-call-approval, object, and tripwire chunks are properly included in the text stream chunk list and will be transformed and passed through correctly.

  Fixes #10978

- Adds `withMastra()` for wrapping AI SDK models with Mastra processors and memory. (#10911)

  ```typescript
  import { openai } from '@ai-sdk/openai';
  import { generateText } from 'ai';
  import { withMastra } from '@mastra/ai-sdk';

  const model = withMastra(openai('gpt-4o'), {
    inputProcessors: [myGuardProcessor],
    outputProcessors: [myLoggingProcessor],
    memory: {
      storage,
      threadId: 'thread-123',
      resourceId: 'user-123',
      lastMessages: 10,
    },
  });

  const { text } = await generateText({ model, prompt: 'Hello!' });
  ```

  Works with `generateText`, `streamText`, `generateObject`, and `streamObject`.

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Fix `saveMessageToMemory` return type to match API response. The method now correctly returns `{ messages: (MastraMessageV1 | MastraDBMessage)[] }` instead of `(MastraMessageV1 | MastraDBMessage)[]` to align with the server endpoint response schema. (#10996)

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`5a1ede1`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `7761c77`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `fdf5a82`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Add agentId and agentName attributes to MODEL_GENERATION spans. This allows users to correlate gen_ai.usage metrics with specific agents when analyzing LLM operation spans. The attributes are exported as gen_ai.agent.id and gen_ai.agent.name in the OtelExporter. (#10984)

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## [email protected]

### Patch Changes

- Add `mastra studio` CLI command to serve the built playground as a static server (#10283)
- Fix default value showing on workflow form after user submits (#10983)
- Move to @posthog/react which is the actual way to use posthog in React. It also fixes (#10967)
- Move useScorers down to trace page to trigger it once for all trace spans (#10985)
- Update Observability Trace Spans list UI, so a user can expand/collapse span children/descendants and can filter the list by span type or name (#10378)
- Fix workflow trigger form overflow (#10986)
- Updated dependencies [`72df8ae`, `9198899`, `7761c77`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Fixed a bug where `[native code]` was incorrectly added to the output (#10971)

## [email protected]

### Patch Changes

- Fix default value showing on workflow form after user submits (#10983)
- Move useScorers down to trace page to trigger it once for all trace spans (#10985)
- Update Observability Trace Spans list UI, so a user can expand/collapse span children/descendants and can filter the list by span type or name (#10378)
- Fix workflow trigger form overflow (#10986)

## @mastra/[email protected]

### Patch Changes

- Fixed Docker build failure with Bun due to invalid `file://` URLs (#10960)
- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Add "Not connected" error detection to MCP auto-reconnection (#10994)

  Enhanced the MCPClient auto-reconnection feature to also detect and handle "Not connected" protocol errors. When the MCP SDK's transport layer throws this error (typically when the connection is in a disconnected state), the client will now automatically reconnect and retry the operation.

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `fdf5a82`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Fix default value showing on workflow form after user submits (#10983)
- Move useScorers down to trace page to trigger it once for all trace spans (#10985)
- Update Observability Trace Spans list UI, so a user can expand/collapse span children/descendants and can filter the list by span type or name (#10378)
- Add UI to match with the mastra studio command (#10283)
- Fix workflow trigger form overflow (#10986)
- Updated dependencies [`72df8ae`, `9198899`, `a54793a`, `653e65a`, `c6fd6fe`, `5a1ede1`, `92a2ab4`, `0bed332`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Add HonoApp interface to eliminate `as any` cast when passing Hono app to MastraServer. Users can now pass typed Hono apps directly without casting. (#10846)

  Fix example type issues in server-adapters

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

## @mastra/[email protected]

### Patch Changes

- Add HonoApp interface to eliminate `as any` cast when passing Hono app to MastraServer. Users can now pass typed Hono apps directly without casting. (#10846)

  Fix example type issues in server-adapters

- Updated dependencies [`72df8ae`, `9198899`, `653e65a`, `c6fd6fe`, `0bed332`]:
  - @mastra/[email protected]
  - @mastra/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Problem
Some LLMs (particularly when not using native JSON mode) output actual newline characters inside JSON string values instead of properly escaped `\n` sequences. This breaks JSON parsing and causes structured output to fail with errors like "invalid JSON".

Example of problematic LLM output:

```json
{"field": "line 1
line 2"}
```

Instead of valid JSON:

```json
{"field": "line 1\nline 2"}
```
Solution
Add preprocessing in BaseFormatHandler.preprocessText() to escape unescaped control characters (`\n`, `\r`, `\t`) within JSON string values before parsing.

The fix:
- Only applies to control characters inside JSON string values
- Leaves already-escaped sequences such as `\\n` untouched
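A minimal sketch of how such a helper could work is shown below. This is illustrative only, written against the behavior described above; the actual escapeUnescapedControlCharsInJsonStrings() shipped in this PR may be implemented differently:

```typescript
// Illustrative sketch only; the helper shipped in this PR may differ.
// Scans the text, tracks whether the current character sits inside a JSON
// string, and escapes raw control characters found there while leaving
// already-escaped sequences (the two characters "\" + "n") untouched.
function escapeUnescapedControlCharsInJsonStrings(text: string): string {
  let result = '';
  let inString = false;
  let escaped = false;

  for (const char of text) {
    if (escaped) {
      // Previous character was a backslash: emit this one verbatim.
      result += char;
      escaped = false;
      continue;
    }
    if (char === '\\') {
      result += char;
      escaped = true;
      continue;
    }
    if (char === '"') {
      inString = !inString;
      result += char;
      continue;
    }
    if (inString && char === '\n') {
      result += '\\n';
    } else if (inString && char === '\r') {
      result += '\\r';
    } else if (inString && char === '\t') {
      result += '\\t';
    } else {
      result += char;
    }
  }

  return result;
}

// Example: the raw newline inside the string value becomes an escaped \n,
// after which JSON.parse succeeds.
const repaired = escapeUnescapedControlCharsInJsonStrings('{"field": "line 1\nline 2"}');
console.log(JSON.parse(repaired).field); // "line 1" + newline + "line 2"
```

The key design point is tracking both string boundaries and the preceding-backslash state, so escaped quotes and already-escaped control sequences pass through unchanged while only raw control characters inside string values are rewritten.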
Testing
- Unit tests for the escapeUnescapedControlCharsInJsonStrings() helper
- Integration tests for streaming structured output

Notes
We were unable to fully reproduce the issue with current Mistral/OpenAI models, but the fix is defensive and the unit tests validate it works correctly when malformed JSON is encountered.