Locale handling in Vertica
The following sections describes how Vertica handles locale.
Session locale
Locale is session-scoped and applies only to queries executed in that session. You cannot specify locale for individual queries. When you start a session it obtains its locale from the configuration parameter DefaultSessionLocale.
Query restrictions
The following restrictions apply when queries are run with locale other than the default en_US@collation=binary:
-
When one or more of the left-side
NOT INcolumns isCHARorVARCHAR, multi-columnNOT INsubqueries are not supported . For example:=> CREATE TABLE test (x VARCHAR(10), y INT); => SELECT ... FROM test WHERE (x,y) NOT IN (SELECT ...); ERROR: Multi-expression NOT IN subquery is not supported because a left hand expression could be NULLNote
Even if columnstest.xandtest.yhave a NOT NULL constraint, an error occurs. -
If the outer query contains a
GROUP BYclause on aCHARorVARCHARcolumn, correlatedHAVINGclause subqueries are not supported. In the following example, theGROUP BY xin the outer query causes the error:=> DROP TABLE test CASCADE; => CREATE TABLE test (x VARCHAR(10)); => SELECT COUNT(*) FROM test t GROUP BY x HAVING x IN (SELECT x FROM test WHERE t.x||'a' = test.x||'a' ); ERROR: subquery uses ungrouped column "t.x" from outer query -
Subqueries that use analytic functions in the
HAVINGclause are not supported. For example:=> DROP TABLE test CASCADE; => CREATE TABLE test (x VARCHAR(10)); => SELECT MAX(x)OVER(PARTITION BY 1 ORDER BY 1) FROM test GROUP BY x HAVING x IN (SELECT MAX(x) FROM test); ERROR: Analytics query with having clause expression that involves aggregates and subquery is not supported
Collation and projections
Projection data is sorted according to the default en_US@collation=binary collation. Thus, regardless of the session setting, issuing the following command creates a projection sorted by col1 according to the binary collation:
=> CREATE PROJECTION p1 AS SELECT * FROM table1 ORDER BY col1;
In such cases, straße and strasse are not stored near each other on disk.
Sorting by binary collation also means that sort optimizations do not work in locales other than binary. Vertica returns the following warning if you create tables or projections in a non-binary locale:
WARNING: Projections are always created and persisted in the default
Vertica locale. The current locale is de_DE
Non-binary locale input handling
When the locale is non-binary, Vertica uses the
COLLATION function to transform input to a binary string that sorts in the proper order.
This transformation increases the number of bytes required for the input according to this formula:
result_column_width = input_octet_width * CollationExpansion + 4
The default value of configuration parameter CollationExpansion is 5.
Character data type handling
-
CHARfields are displayed as fixed length, including any trailing spaces. WhenCHARfields are processed internally, they are first stripped of trailing spaces. ForVARCHARfields, trailing spaces are usually treated as significant characters; however, trailing spaces are ignored when sorting or comparing either type of character string field using a non-binary locale. -
The maximum length parameter for
VARCHARandCHARdata type refers to the number of octets (bytes) that can be stored in that field and not number of characters. When using multi-byte UTF-8 characters, the fields must be sized to accommodate from 1 to 4 bytes per character, depending on the data.