-
Notifications
You must be signed in to change notification settings - Fork 2.5k
feat(schema): Migrate json and proto converters to use HoodieSchema #17740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(schema): Migrate json and proto converters to use HoodieSchema #17740
Conversation
229119a to
c70a3e1
Compare
balaji-varadarajan-ai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM
| bigDecimal = new BigDecimal(obj.toString(), new MathContext(logicalType.getPrecision(), RoundingMode.UNNECESSARY)).setScale(logicalType.getScale(), RoundingMode.UNNECESSARY); | ||
| bigDecimal = new BigDecimal(obj.toString(), new MathContext(schema.getPrecision(), RoundingMode.UNNECESSARY)).setScale(schema.getScale(), RoundingMode.UNNECESSARY); | ||
| } | ||
| } catch (java.lang.NumberFormatException | ArithmeticException ignored) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Possible to import this?
| protected static boolean isValidDecimalTypeConfig(Schema schema) { | ||
| LogicalTypes.Decimal decimalType = (LogicalTypes.Decimal) schema.getLogicalType(); | ||
| protected static boolean isValidDecimalTypeConfig(HoodieSchema schema) { | ||
| LogicalTypes.Decimal decimalType = (LogicalTypes.Decimal) schema.toAvroSchema().getLogicalType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is still under the Avro package. I think converting everything to avro make sense.
During my test, IIRC, via the HoodieSchema#createDecimal helper, but creating a decimal that is backed by fix with an improper fixed size will fail silently.
A HoodieSchema will still be created, but it will not be an instance of HoodieSchema.Decimal. This might be a validation that we need to add in that helper method. Just something orthogonal that i happened to think of when reading this part of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really following this comment, is there anything needed in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope nothing needed. Just adding a comment here for future reference.
| fieldTypeProcessors.put(HoodieSchemaType.ENUM, generateEnumTypeHandler()); | ||
| fieldTypeProcessors.put(HoodieSchemaType.MAP, generateMapTypeHandler()); | ||
| fieldTypeProcessors.put(HoodieSchemaType.BYTES, generateBytesTypeHandler()); | ||
| fieldTypeProcessors.put(HoodieSchemaType.FIXED, generateFixedTypeHandler()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
~~Is it possible to shift some of the KV pairs in getLogicalFieldTypeProcessors to this scope?
e.g. DECIMAL, TIME, DATE, etc.~~
Edit: On second thoughts, ignore my comments above. We are in the Avro scope. So, concepts should be Avro-first instead of HoodieSchema-first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ok as it is then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, it's okay.
| // if incoming message does not contain the field, fieldDescriptor will be null | ||
| // if the field schema is a union, it is nullable | ||
| if (fieldSchema.getType() == Schema.Type.UNION && (fieldDescriptor == null || (!fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)))) { | ||
| if (fieldSchema.getType() == HoodieSchemaType.UNION && (fieldDescriptor == null || (!fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Check if fieldSchema#isNullable?
5a69440 to
5161f8f
Compare
voonhous
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
CI is green, merging this in. |
Describe the issue this Pull Request addresses
Updates the json and proto conversion utilities to use HoodieSchema instead of the Avro Schema when possible
Summary and Changelog
Impact
Removes reliance on Avro's schema class and replaces it with Hoodie's own schema system
Risk Level
Low
Documentation Update
Contributor's checklist