Hash segmentation clause
Specifies how to segment projection data for distribution across all cluster nodes. You can specify segmentation for a table and a projection. If a table definition specifies segmentation, Vertica uses it for that table's auto-projections.
It is strongly recommended that you use Vertica's built-in
HASH function, which distributes data evenly across the cluster, and facilitates optimal query execution.
Syntax
SEGMENTED BY expression ALL NODES [ OFFSET offset ]
Parameters
SEGMENTED BYexpression- A general SQL expression. Hash segmentation is the preferred method of segmentation. Vertica recommends using its built-in
HASHfunction, whose arguments resolve to table columns. If you use an expression other thanHASH, Vertica issues a warning.The segmentation expression should specify columns with a large number of unique data values and acceptable skew in their data distribution. In general, primary key columns that meet these criteria are good candidates for hash segmentation.
For details, see Expression Requirements below.
ALL NODES- Automatically distributes data evenly across all nodes when the projection is created. Node ordering is fixed.
OFFSEToffset- A zero-based offset that indicates on which node to start segmentation distribution.
This option is not valid for
CREATE TABLEandCREATE TEMPORARY TABLE.Important
If you create a projection for a table with the
OFFSEToption, be sure to create enough copies of each projection segment to satisfy system K-safety; otherwise, Vertica regards the projection as unsafe and cannot use it to query the table.You can ensure K-safety compliance when you create projections by combining
OFFSETandKSAFEoptions in theCREATE PROJECTIONstatement. On executing this statement, Vertica automatically creates the necessary number of projection copies.
Expression requirements
A segmentation expression must specify table columns as they are defined in the source table. Projection column names are not supported.
The following restrictions apply to segmentation expressions:
-
All leaf expressions must be constants or column references to a column in the
CREATE PROJECTION'sSELECTlist. -
The expression must return the same value over the life of the database.
-
Aggregate functions are not allowed.
-
The expression must return non-negative
INTEGERvalues in the range0 <= x < 263, and values are generally distributed uniformly over that range.Note
If the expression produces a value outside the expected range—for example, a negative value—no error occurs, and the row is added to the projection's first segment.
Examples
The following CREATE PROJECTION statement creates projection public.employee_dimension_super. It specifies to include all columns in table public.employee_dimension. The hash segmentation clause invokes the Vertica HASH function to segment projection data on the column employee_key; it also includes the ALL NODES clause, which specifies to distribute projection data evenly across all nodes in the cluster:
=> CREATE PROJECTION public.employee_dimension_super
AS SELECT * FROM public.employee_dimension
ORDER BY employee_key
SEGMENTED BY hash(employee_key) ALL NODES;