| hoodie.write.record.merge.mode | EVENT_TIME_ORDERING (when ordering field is set)<br />COMMIT_TIME_ORDERING (when ordering field is not set) | Determines the logic of merging different records with the same record key. Valid values: (1) `COMMIT_TIME_ORDERING`: use commit time to merge records, i.e., the record from later commit overwrites the earlier record with the same key. (2) `EVENT_TIME_ORDERING`: use event time as the ordering to merge records, i.e., the record with the larger event time overwrites the record with the smaller event time on the same key, regardless of commit time. The event time or preCombine field needs to be specified by the user. This is the default when an ordering field is configured. (3) `CUSTOM`: use custom merging logic specified by the user.<br />`Config Param: RECORD_MERGE_MODE`<br />`Since Version: 1.0.0`|
134
-
| hoodie.write.record.merge.strategy.id | N/A (Optional) | ID of record merge strategy. Hudi will pick `HoodieRecordMerger` implementations from `hoodie.write.record.merge.custom.implementation.classes` that have the same merge strategy ID. When using custom merge logic, you need to specify both this config and `hoodie.write.record.merge.custom.implementation.classes`.<br />`Config Param: RECORD_MERGE_STRATEGY_ID`<br />`Since Version: 0.13.0`<br />`Alternative: hoodie.datasource.write.record.merger.strategy` (deprecated) |
135
-
| hoodie.write.record.merge.custom.implementation.classes | N/A (Optional) | List of `HoodieRecordMerger` implementations constituting Hudi's merging strategy based on the engine used. Hudi selects the first implementation from this list that matches the following criteria: (1) has the same merge strategy ID as specified in `hoodie.write.record.merge.strategy.id` (if provided), (2) is compatible with the execution engine (e.g., SPARK merger for Spark, FLINK merger for Flink, AVRO for Java/Hive). The order in the list matters - place your preferred implementation first. Engine-specific implementations (SPARK, FLINK) are more efficient as they avoid Avro serialization/deserialization overhead.<br />`Config Param: RECORD_MERGE_IMPL_CLASSES`<br />`Since Version: 0.13.0`<br />`Alternative: hoodie.datasource.write.record.merger.impls` (deprecated) |
| hoodie.write.record.merge.mode | EVENT_TIME_ORDERING (when ordering field is set)<br />COMMIT_TIME_ORDERING (when ordering field is not set) | Determines the logic of merging different records with the same record key. Valid values: (1) `COMMIT_TIME_ORDERING`: use commit time to merge records, i.e., the record from later commit overwrites the earlier record with the same key. (2) `EVENT_TIME_ORDERING`: use event time as the ordering to merge records, i.e., the record with the larger event time overwrites the record with the smaller event time on the same key, regardless of commit time. The event time or ordering fields need to be specified by the user. This is the default when an ordering field is configured. (3) `CUSTOM`: use custom merging logic specified by the user.<br />`Config Param: RECORD_MERGE_MODE`<br />`Since Version: 1.0.0`|
134
+
| hoodie.write.record.merge.strategy.id | N/A (Optional) | ID of record merge strategy. Hudi will pick `HoodieRecordMerger` implementations from `hoodie.write.record.merge.custom.implementation.classes` that have the same merge strategy ID. When using custom merge logic, you need to specify both this config and `hoodie.write.record.merge.custom.implementation.classes`.<br />`Config Param: RECORD_MERGE_STRATEGY_ID`<br />`Since Version: 0.13.0`<br />`Alternative: hoodie.datasource.write.record.merger.strategy` (deprecated) |
135
+
| hoodie.write.record.merge.custom.implementation.classes | N/A (Optional) | List of `HoodieRecordMerger` implementations constituting Hudi's merging strategy based on the engine used. Hudi selects the first implementation from this list that matches the following criteria: (1) has the same merge strategy ID as specified in `hoodie.write.record.merge.strategy.id` (if provided), (2) is compatible with the execution engine (e.g., SPARK merger for Spark, FLINK merger for Flink, AVRO for Java/Hive). The order in the list matters - place your preferred implementation first. Engine-specific implementations (SPARK, FLINK) are more efficient as they avoid Avro serialization/deserialization overhead.<br />`Config Param: RECORD_MERGE_IMPL_CLASSES`<br />`Since Version: 0.13.0`<br />`Alternative: hoodie.datasource.write.record.merger.impls` (deprecated) |
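The difference between `COMMIT_TIME_ORDERING` and `EVENT_TIME_ORDERING` described in the rows above can be illustrated with a toy model. This is only a sketch of the stated semantics, not Hudi's implementation: the `Record` class, the `commit_seq` field (a stand-in for commit time), and the `merge` function are hypothetical names, and the tie-breaking rule (equal event times favor the later commit) is an assumption the rows above do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: str
    event_time: int  # value of the (hypothetical) ordering field
    commit_seq: int  # stand-in for commit time; larger means later


def merge(a: Record, b: Record, mode: str) -> Record:
    """Return the record that survives when two versions share a key."""
    if a.key != b.key:
        raise ValueError("merge only applies to records with the same key")
    earlier, later = sorted((a, b), key=lambda r: r.commit_seq)
    if mode == "COMMIT_TIME_ORDERING":
        # The record from the later commit always overwrites.
        return later
    if mode == "EVENT_TIME_ORDERING":
        # The larger event time wins regardless of commit order.
        # Assumption of this sketch: ties go to the later commit.
        return later if later.event_time >= earlier.event_time else earlier
    raise ValueError(f"unsupported merge mode in this sketch: {mode!r}")


# An out-of-order arrival: the second commit carries an older event time.
first_commit = Record("k1", "v-new-event", event_time=200, commit_seq=1)
second_commit = Record("k1", "v-old-event", event_time=100, commit_seq=2)

print(merge(first_commit, second_commit, "COMMIT_TIME_ORDERING").value)
print(merge(first_commit, second_commit, "EVENT_TIME_ORDERING").value)
```

With commit-time ordering the late-arriving record replaces the stored one; with event-time ordering the stored record survives because its ordering value is larger. `CUSTOM` mode is left out because its behavior depends on the user-supplied merger.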
| type | cow | The table type to create. `type = 'cow'` creates a COPY-ON-WRITE table, while `type = 'mor'` creates a MERGE-ON-READ table. Same as `hoodie.datasource.write.table.type`. More details can be found [here](table_types.md)|
581
581
| primaryKey | uuid | The primary key field names of the table separated by commas. Same as `hoodie.datasource.write.recordkey.field`. If this config is ignored, hudi will auto-generate primary keys. If explicitly set, primary key generation will honor user configuration. |
582
-
|preCombineField|| The ordering field(s) of the table. It is used for resolving the final version of the record among multiple versions. Generally, `event time` or another similar column will be used for ordering purposes. Hudi will be able to handle out-of-order data using the ordering field value. |
582
+
|orderingFields|| The ordering field(s) of the table. It is used for resolving the final version of the record among multiple versions. Generally, `event time` or another similar column will be used for ordering purposes. Hudi will be able to handle out-of-order data using the ordering field value. |
583
583
584
584
:::note
585
-
`primaryKey`, `preCombineField`, and `type` and other properties are case-sensitive.
585
+
`primaryKey`, `orderingFields`, and `type` and other properties are case-sensitive.
586
586
:::
587
587
588
588
#### Passing Lock Providers for Concurrent Writers
website/docs/sql_dml.md
Lines changed: 5 additions & 5 deletions
@@ -51,7 +51,7 @@ INSERT INTO hudi_cow_pt_tbl PARTITION(dt, hh) SELECT 1 AS id, 'a1' AS name, 1000
51
51
:::note Mapping to write operations
52
52
Hudi offers flexibility in choosing the underlying [write operation](write_operations.md) of a `INSERT INTO` statement using
53
53
the `hoodie.spark.sql.insert.into.operation` configuration. Possible options include *"bulk_insert"* (large inserts), *"insert"* (with small file management),
54
-
and *"upsert"* (with deduplication/merging). If ordering fields are not set, *"insert"* is chosen as the default. For a table with ordering fields set (via `preCombineField`),
54
+
and *"upsert"* (with deduplication/merging). If ordering fields are not set, *"insert"* is chosen as the default. For a table with ordering fields set (via `orderingFields`),
55
55
*"upsert"* is chosen as the default operation.
56
56
:::
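The default selection described in this note can be sketched as a small decision function. This is an illustrative model only, not Hudi code: `resolve_insert_into_operation` and its parameters are hypothetical names, with `configured` standing in for an explicit `hoodie.spark.sql.insert.into.operation` value and `ordering_fields` for the table's ordering fields.

```python
from typing import Optional, Sequence

# The options named in the note above.
VALID_OPERATIONS = ("bulk_insert", "insert", "upsert")


def resolve_insert_into_operation(
    configured: Optional[str], ordering_fields: Sequence[str]
) -> str:
    """Pick the write operation for an INSERT INTO statement (sketch)."""
    if configured is not None:
        # An explicit setting takes precedence over the defaults.
        if configured not in VALID_OPERATIONS:
            raise ValueError(f"unsupported operation: {configured!r}")
        return configured
    # No explicit setting: "upsert" when ordering fields are set,
    # otherwise "insert".
    return "upsert" if ordering_fields else "insert"


print(resolve_insert_into_operation(None, []))
print(resolve_insert_into_operation(None, ["ts"]))
print(resolve_insert_into_operation("bulk_insert", ["ts"]))
```

The sketch treats an explicit setting as overriding the default, which is how the note reads; it does not model any further validation Hudi may apply.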
57
57
@@ -101,7 +101,7 @@ update hudi_cow_pt_tbl set ts = 1001 where name = 'a1';
101
101
```
102
102
103
103
:::info
104
-
The `UPDATE` operation requires the specification of ordering fields (via `preCombineField`).
104
+
The `UPDATE` operation requires the specification of ordering fields (via `orderingFields`).
105
105
:::
106
106
107
107
### Merge Into
@@ -138,7 +138,7 @@ For a Hudi table with user configured primary keys, the join condition and the `
138
138
139
139
For a table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.
140
140
141
-
if the `hoodie.record.merge.mode` is set to `EVENT_TIME_ORDERING`, ordering fields (via `preCombineField`) are required to be set with value in the `UPDATE`/`INSERT` clause.
141
+
if the `hoodie.record.merge.mode` is set to `EVENT_TIME_ORDERING`, ordering fields (via `orderingFields`) are required to be set with value in the `UPDATE`/`INSERT` clause.
142
142
143
143
It is enforced that if the target table has primary key and partition key column, the source table counterparts must enforce the same data type accordingly. Plus, if the target table is configured with `hoodie.record.merge.mode` = `EVENT_TIME_ORDERING` where target table is expected to have valid ordering fields configuration, the source table counterpart must also have the same data type.
144
144
:::
@@ -148,7 +148,7 @@ Examples below
148
148
```sql
149
149
-- source table using hudi for testing merging into non-partitioned table
150
150
create table merge_source (id int, name string, price double, ts bigint) using hudi