Class DeltaMergeBuilder
- All Implemented Interfaces:
org.apache.spark.internal.Logging, org.apache.spark.sql.delta.util.AnalysisHelper

Builder to specify how to merge data from a source DataFrame into the target Delta table. You can specify any number of whenMatched, whenNotMatched, and whenNotMatchedBySource clauses.
Here are the constraints on these clauses.
- whenMatched clauses:
  - The condition in a whenMatched clause is optional. However, if there are multiple whenMatched clauses, then only the last one may omit the condition.
  - When there is more than one whenMatched clause and the conditions (or lack thereof) are such that a row satisfies multiple clauses, the action for the first satisfied clause is executed. In other words, the order of the whenMatched clauses matters.
  - If none of the whenMatched clauses match a source-target row pair that satisfies the merge condition, then the target row will not be updated or deleted.
  - If you want to update all the columns of the target Delta table with the corresponding columns of the source DataFrame, then you can use whenMatched(...).updateAll(). This is equivalent to whenMatched(...).updateExpr(Map( ("col1", "source.col1"), ("col2", "source.col2"), ...))
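For instance, here is a sketch (assuming a key-value deltaTable and a source DataFrame with a hypothetical "deleted" flag column) where clause order matters: the conditional delete must come before the unconditional update, because the first satisfied clause wins:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched("source.deleted = true")  // evaluated first; deletes flagged rows
  .delete()
  .whenMatched()                         // only the last clause may omit the condition
  .updateExpr(Map("value" -> "source.value"))
  .execute()

If the two clauses were swapped, the unconditional update would capture every matched row and the delete clause would never fire.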
- whenNotMatched clauses:
  - The condition in a whenNotMatched clause is optional. However, if there are multiple whenNotMatched clauses, then only the last one may omit the condition.
  - When there is more than one whenNotMatched clause and the conditions (or lack thereof) are such that a row satisfies multiple clauses, the action for the first satisfied clause is executed. In other words, the order of the whenNotMatched clauses matters.
  - If no whenNotMatched clause is present, or if it is present but the non-matching source row does not satisfy the condition, then the source row is not inserted.
  - If you want to insert all the columns of the target Delta table with the corresponding columns of the source DataFrame, then you can use whenNotMatched(...).insertAll(). This is equivalent to whenNotMatched(...).insertExpr(Map( ("col1", "source.col1"), ("col2", "source.col2"), ...))
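As a sketch (the "active" flag column on the source is hypothetical), a conditional insert that ignores non-matching source rows failing the condition:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenNotMatched("source.active = true")  // only active new rows are inserted
  .insertAll()
  .execute()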
- whenNotMatchedBySource clauses:
  - The condition in a whenNotMatchedBySource clause is optional. However, if there are multiple whenNotMatchedBySource clauses, then only the last one may omit the condition.
  - When there is more than one whenNotMatchedBySource clause and the conditions (or lack thereof) are such that a row satisfies multiple clauses, the action for the first satisfied clause is executed. In other words, the order of the whenNotMatchedBySource clauses matters.
  - If no whenNotMatchedBySource clause is present, or if it is present but the non-matching target row does not satisfy any whenNotMatchedBySource clause condition, then the target row will not be updated or deleted.
Scala example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.withSchemaEvolution()
.whenMatched()
.updateExpr(Map(
"value" -> "source.value"))
.whenNotMatched()
.insertExpr(Map(
"key" -> "source.key",
"value" -> "source.value"))
.whenNotMatchedBySource()
.updateExpr(Map(
"value" -> "target.value + 1"))
.execute()
Java example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.withSchemaEvolution()
.whenMatched()
.updateExpr(
new HashMap<String, String>() {{
put("value", "source.value");
}})
.whenNotMatched()
.insertExpr(
new HashMap<String, String>() {{
put("key", "source.key");
put("value", "source.value");
}})
.whenNotMatchedBySource()
.updateExpr(
new HashMap<String, String>() {{
put("value", "target.value + 1");
}})
.execute();
- Since:
- 0.3.0
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.sql.delta.util.AnalysisHelper
org.apache.spark.sql.delta.util.AnalysisHelper.FakeLogicalPlan, org.apache.spark.sql.delta.util.AnalysisHelper.FakeLogicalPlan$
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
Constructors

DeltaMergeBuilder(DeltaTable targetTable, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column onCondition, scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.plans.logical.DeltaMergeIntoClause> whenClauses)
Method Summary
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> execute()
Execute the merge operation based on the built matched and not matched actions.

DeltaMergeMatchedActionBuilder whenMatched()
Build the actions to perform when the merge condition was matched.

DeltaMergeMatchedActionBuilder whenMatched(String condition)
Build the actions to perform when the merge condition was matched and the given condition is true.

DeltaMergeMatchedActionBuilder whenMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was matched and the given condition is true.

DeltaMergeNotMatchedActionBuilder whenNotMatched()
Build the action to perform when the merge condition was not matched.

DeltaMergeNotMatchedActionBuilder whenNotMatched(String condition)
Build the actions to perform when the merge condition was not matched and the given condition is true.

DeltaMergeNotMatchedActionBuilder whenNotMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched and the given condition is true.

DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource()
Build the actions to perform when the merge condition was not matched by the source.

DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(String condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true.

DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true.

DeltaMergeBuilder withSchemaEvolution()
Enable schema evolution for the merge operation.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.sql.delta.util.AnalysisHelper
improveUnsupportedOpError, resolveReferencesForExpressions, toDataset, tryResolveReferences, tryResolveReferencesForExpressions, tryResolveReferencesForExpressions
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
DeltaMergeBuilder
public DeltaMergeBuilder(DeltaTable targetTable, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column onCondition, scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.plans.logical.DeltaMergeIntoClause> whenClauses)
-
-
Method Details
-
whenMatched
public DeltaMergeMatchedActionBuilder whenMatched()
Build the actions to perform when the merge condition was matched. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.
- Returns:
- (undocumented)
- Since:
- 0.3.0
-
whenMatched
public DeltaMergeMatchedActionBuilder whenMatched(String condition)
Build the actions to perform when the merge condition was matched and the given condition is true. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.
- Parameters:
condition
- boolean expression as a SQL formatted string- Returns:
- (undocumented)
- Since:
- 0.3.0
-
whenMatched
public DeltaMergeMatchedActionBuilder whenMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was matched and the given condition is true. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.
- Parameters:
condition
- boolean expression as a Column object- Returns:
- (undocumented)
- Since:
- 0.3.0
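For example, a sketch using a Column-typed condition built with org.apache.spark.sql.functions.expr (table and column names are illustrative):

import org.apache.spark.sql.functions.expr

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched(expr("source.value > target.value"))  // Column object rather than a SQL string
  .updateExpr(Map("value" -> "source.value"))
  .execute()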
-
whenNotMatched
public DeltaMergeNotMatchedActionBuilder whenNotMatched()
Build the action to perform when the merge condition was not matched. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.
- Returns:
- (undocumented)
- Since:
- 0.3.0
-
whenNotMatched
public DeltaMergeNotMatchedActionBuilder whenNotMatched(String condition)
Build the actions to perform when the merge condition was not matched and the given condition is true. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.
- Parameters:
condition
- boolean expression as a SQL formatted string- Returns:
- (undocumented)
- Since:
- 0.3.0
-
whenNotMatched
public DeltaMergeNotMatchedActionBuilder whenNotMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched and the given condition is true. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.
- Parameters:
condition
- boolean expression as a Column object- Returns:
- (undocumented)
- Since:
- 0.3.0
-
whenNotMatchedBySource
public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource()
Build the actions to perform when the merge condition was not matched by the source. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.
- Returns:
- (undocumented)
- Since:
- 2.3.0
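For example, a minimal sketch that deletes target rows with no corresponding source row (table names are illustrative):

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenNotMatchedBySource()
  .delete()
  .execute()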
-
whenNotMatchedBySource
public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(String condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.
- Parameters:
condition
- boolean expression as a SQL formatted string- Returns:
- (undocumented)
- Since:
- 2.3.0
-
whenNotMatchedBySource
public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.
- Parameters:
condition
- boolean expression as a Column object- Returns:
- (undocumented)
- Since:
- 2.3.0
-
withSchemaEvolution
public DeltaMergeBuilder withSchemaEvolution()
Enable schema evolution for the merge operation. This allows the schema of the target table/columns to be automatically updated based on the schema of the source table/columns.
- Returns:
- (undocumented)
- Since:
- 3.2.0
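As a sketch, enabling schema evolution lets insertAll()/updateAll() propagate source-only columns into the target schema (here we assume the source carries an extra column absent from the target):

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .withSchemaEvolution()  // new source columns are added to the target schema
  .whenMatched()
  .updateAll()
  .whenNotMatched()
  .insertAll()
  .execute()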
-
execute
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> execute()
Execute the merge operation based on the built matched and not matched actions.
- Returns:
- (undocumented)
- Since:
- 0.3.0
-