MERGE optimization
You can improve MERGE performance in several ways:
-
Use source tables that are smaller than target tables.
Projections for MERGE operations
The Vertica query optimizer automatically chooses the best projections to implement a merge operation. A good projection design strategy provides projections that help the query optimizer avoid extra sort and data transfer operations, and facilitate MERGE performance.
Tip
You can rely on Database Designer to generate projections that address merge requirements. You can then customize these projections as needed.For example, the following MERGE statement fragment joins source and target tables tgt and src, respectively, on columns tgt.a and src.b:
=> MERGE INTO tgt USING src ON tgt.a = src.b ...
Vertica can use a local merge join if projections for tables tgt and src use one of the following projection designs, where inputs are presorted by projection ORDER BY clauses:
-
Replicated projections are sorted on:
-
Column
afor tabletgt -
Column
bfor tablesrc
-
-
Segmented projections are identically segmented on:
-
Column
afor tabletgt -
Column
bfor tablesrc -
Corresponding segmented columns
-
Optimizing MERGE query plans
Vertica prepares an optimized query plan if the following conditions are all true:
-
The
MERGEstatement contains both matching clausesWHEN MATCHED THEN UPDATE SETandWHEN NOT MATCHED THEN INSERT. If theMERGEstatement contains only one matching clause, it uses a non-optimized query plan. -
The
MERGEstatement excludes update and insert filters. -
The target table join column has a unique or primary key constraint. This requirement does not apply to the source table join column.
-
Both matching clauses specify all columns in the target table.
-
Both matching clauses specify identical source values.
For details on evaluating an
EXPLAIN-generated query plan, see MERGE path.
The examples that follow use a simple schema to illustrate some of the conditions under which Vertica prepares or does not prepare an optimized query plan for MERGE:
CREATE TABLE target(a INT PRIMARY KEY, b INT, c INT) ORDER BY b,a;
CREATE TABLE source(a INT, b INT, c INT) ORDER BY b,a;
INSERT INTO target VALUES(1,2,3);
INSERT INTO target VALUES(2,4,7);
INSERT INTO source VALUES(3,4,5);
INSERT INTO source VALUES(4,6,9);
COMMIT;
Optimized MERGE statement
Vertica can prepare an optimized query plan for the following MERGE statement because:
-
The target table's join column
t.ahas a primary key constraint. -
All columns in the target table
(a,b,c)are included in theUPDATEandINSERTclauses. -
The
UPDATEandINSERTclauses specify identical source values:s.a,s.b, ands.c.
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);
OUTPUT
--------
2
(1 row)
The output value of 2 indicates success and denotes the number of rows updated/inserted from the source into the target.
Non-optimized MERGE statement
In the next example, the MERGE statement runs without optimization because the source values in the UPDATE/INSERT clauses are not identical. Specifically, the UPDATE clause includes constants for columns s.a and s.c and the INSERT clause does not:
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);
To make the previous MERGE statement eligible for optimization, rewrite the statement so that the source values in the UPDATE and INSERT clauses are identical:
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c -1
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a + 1, s.b, s.c - 1);