Replies: 3 comments
-
Is this a difference in the Completions API vs. the Responses API?
-
Hi @rngyn 👋

Why v5 feels slower:

**Practical workarounds**
2. Prefetch & cache outside the call
3. Reduce the count/detail
4. Stream the response

**About Completions vs Responses**

This isn’t mainly an API difference; it’s about where images are resolved. If the provider accepts URLs and fetches server-side, first-byte can be faster (no local downloads). If the SDK has to attach bytes, v5 will download all images up front (one reason for the 10–15× slowdown you saw).

**TL;DR**

Make images URL-based + CDN when possible; if bytes are required, prefetch in parallel and shrink them. v5’s “image parts” are blocking because the SDK resolves assets before sending the request.
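
To make the "prefetch in parallel and cache outside the call" suggestion concrete, here is a minimal sketch. The helper names (`prefetchImage`, `buildImageParts`, `Fetcher`) are hypothetical and not part of the AI SDK, and the `{ type: "image", image: Uint8Array }` shape is an assumption about what v5 image parts accept, so check it against your SDK version. The point is only the pattern: start all downloads at once with `Promise.all` and reuse bytes that were already fetched.

```typescript
// Hypothetical helpers: none of these names come from the AI SDK.
type Fetcher = (url: string) => Promise<Uint8Array>;

// Assumed shape of an image part; verify against your SDK version.
type ImagePart = { type: "image"; image: Uint8Array };

// In-memory cache keyed by URL. Storing the promise (not the bytes)
// means concurrent requests for the same URL share one download.
const cache = new Map<string, Promise<Uint8Array>>();

// Default fetcher using the global fetch (Node 18+ or browsers).
async function defaultFetcher(url: string): Promise<Uint8Array> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed to fetch ${url}: ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}

function prefetchImage(
  url: string,
  fetcher: Fetcher = defaultFetcher
): Promise<Uint8Array> {
  let pending = cache.get(url);
  if (!pending) {
    pending = fetcher(url);
    // Drop failed entries so a later call can retry the download.
    pending.catch(() => cache.delete(url));
    cache.set(url, pending);
  }
  return pending;
}

// Downloads all images concurrently and returns them in input order.
async function buildImageParts(
  urls: string[],
  fetcher: Fetcher = defaultFetcher
): Promise<ImagePart[]> {
  const bytes = await Promise.all(urls.map((u) => prefetchImage(u, fetcher)));
  return bytes.map((image): ImagePart => ({ type: "image", image }));
}
```

You could call `prefetchImage(url)` as soon as a user attaches an image, then `await buildImageParts(urls)` right before building the message, so most of the download time is already spent by the time the request is sent. Resizing or compressing the bytes before attaching them would be a separate step.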
-
This discussion was automatically locked because it has not been updated in over 30 days. If you still have questions about this topic, please ask us at community.vercel.com/ai-sdk
-
I recently upgraded from v4 to v5, and unfortunately it has badly hurt the performance of my application. My application relies heavily on image analysis, and prompts may include upwards of 20 images. In v4, these images were attached as part of "experimental_attachments" and the response time was acceptable: the first chunk arrived and rendered quickly.
However, since upgrading to v5 and using image parts in messages to include the images, the same images can slow the response by up to 15x. This has severely impacted the performance of my application.
Was there a fundamental architectural change between experimental_attachments and image parts? It looks like experimental_attachments was separate from the messages and therefore out-of-band, possibly non-blocking and consumed lazily by the LLM, so the response could start quickly. Now that images are part of the messages, they seem to be first-class citizens and blocking?
Can someone explain this to me and suggest a possible workaround? Do I have to implement a tool for dynamic image fetching now?
Thank you.