fix(optimizer)!: query schema directly when type annotation fails for processing UNNEST source #6451
Can you fix the type annotation instead?
Yeah, I second that suggestion in favor of avoiding making this more complicated than necessary. Keen to understand more if that is tricky to do; let me know and we can hop on a call and chat about this.
georgesittas
left a comment
@fivetran-BradfordPaskewitz keep in mind that you'll need to rebase and apply these changes to resolver.py due to 625654a.
3b0a522 to fea0204
fea0204 to 41cfa9e
georgesittas
left a comment
@fivetran-BradfordPaskewitz another quick round of comments from me, but this should be good to merge soon.
@tobymao wanna take a look as well? Another pair of eyes is a good idea for this one.
… processing UNNEST source
41cfa9e to 7428a39
georgesittas
left a comment
Great work @fivetran-BradfordPaskewitz, looks reasonable 👍
Issue: https://fivetran.atlassian.net/browse/RD-1066808
Problem: In BigQuery, when you UNNEST an ARRAY<STRUCT<...>>, the struct fields can be referenced as unqualified columns (e.g., WHERE type = 'x'). This worked in simple queries but failed in correlated subqueries: sqlglot's type annotation couldn't resolve columns from outer scopes, so the UNNEST expression was left untyped and qualify_columns didn't know which struct fields to expose.
Solution: Add a fallback in qualify_columns.py that queries the schema directly when type annotation fails. When processing an UNNEST source whose expression has no annotated type, walk up the parent scopes to find the column's type definition in the schema (handling both table sources and CTEs), then extract the struct field names and expose them as available columns in that scope.
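For readers who want a feel for the fallback, here is a minimal, self-contained sketch of the idea. It does not use sqlglot and is not the code in this PR: `Scope`, `SCHEMA`, `struct_field_names`, and `resolve_unnest_columns` are hypothetical stand-ins invented for illustration, and types are modelled as plain strings. It assumes a correlated query roughly of the shape `SELECT ... FROM orders AS o WHERE EXISTS (SELECT 1 FROM UNNEST(o.items) WHERE type = 'x')`, where `o` is only visible from the outer scope.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# Hypothetical schema: table name -> column name -> type string.
SCHEMA = {"orders": {"id": "INT64", "items": "ARRAY<STRUCT<type STRING, qty INT64>>"}}


@dataclass
class Scope:
    """Toy scope: maps a source alias to a table name, or to a dict of
    column types when the source is a CTE. Not sqlglot's Scope class."""
    sources: Dict[str, Union[str, Dict[str, str]]]
    parent: Optional["Scope"] = None


def struct_field_names(type_str: str) -> List[str]:
    """Return the top-level field names of an ARRAY<STRUCT<...>> type string."""
    prefix = "ARRAY<STRUCT<"
    text = type_str.strip()
    if not text.upper().startswith(prefix) or not text.endswith(">>"):
        return []
    body = text[len(prefix):-2]
    names, depth, start = [], 0, 0
    # Split on commas that are not nested inside another <...> pair.
    for i, ch in enumerate(body + ","):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            part = body[start:i].strip()
            if part:
                names.append(part.split()[0])
            start = i + 1
    return names


def resolve_unnest_columns(scope: Scope, alias: str, column: str, schema: dict) -> List[str]:
    """Walk up parent scopes until `alias` is found, then read the column's
    type from the schema (table source) or from the CTE's own column types."""
    current: Optional[Scope] = scope
    while current is not None:
        source = current.sources.get(alias)
        if source is not None:
            columns = schema.get(source, {}) if isinstance(source, str) else source
            return struct_field_names(columns.get(column, ""))
        current = current.parent
    return []


# The inner (correlated) scope does not define `o`; the outer scope does.
outer = Scope(sources={"o": "orders"})
inner = Scope(sources={}, parent=outer)
print(resolve_unnest_columns(inner, "o", "items", SCHEMA))  # ['type', 'qty']
```

The point of the sketch is the lookup order: the inner scope is tried first, then each parent in turn, and the struct fields are only exposed once a source that actually owns the column is found. An unknown alias or a non-struct column yields an empty list rather than an error.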