ARRAY
Function statement:
ARRAY(<e1> T, ..., <en> T)
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from the given elements.
Return type: array<T>
Example:
> SELECT array(1, 2, 3);
[1,2,3]
FILTER
Function statement:
FILTER(<expr> array<T>, <predicate> function(T[, integer])->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the input array using the given predicate.
Return type: array<T>
Example:
> SELECT `filter`(array(1, 2, 3), x -> x % 2 == 1);
[1,3]
> SELECT `filter`(array(0, 2, 3), (x, i) -> x > i);
[2,3]
> SELECT `filter`(array(0, null, 2, 3, null), x -> x IS NOT NULL);
[0,2,3]
Function statement:
TRANSFORM(<expr> array<T>, <func> function(T[, integer])->U)
Supported engine: SparkSQL
Usage instructions: Use func to transform elements in an array.
Return type: array<U>
Example:
> SELECT transform(array(1, 2, 3), x -> x + 1);
[2,3,4]
> SELECT transform(array(1, 2, 3), (x, i) -> x + i);
[1,3,5]
ZIP_WITH
Function statement:
ZIP_WITH(<left> array<T>, <right> array<U>, <func> function(T, U)->R)
Supported engine: SparkSQL
Usage instructions: Use a function to merge two given arrays element-wise into a single array. If one array is shorter, nulls are appended at the end to match the length of the longer array before the function is applied.
Return type: array<R>
Example:
> SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x));
[{"y":"a","x":1},{"y":"b","x":2},{"y":"c","x":3}]
> SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y);
[4,6]
> SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y));
["ad","be","cf"]
FORALL
Function statement:
FORALL(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to all elements in the array.
Return type: boolean
Example:
> SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
true
> SELECT forall(array(1, null, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, null, 8), x -> x % 2 == 0);
NULL
AGGREGATE
Function statement:
AGGREGATE(<expr> array<T>, <start> U, <merge> function(U, T)->U, <finish> function(U)->R)
Supported engine: SparkSQL
Usage instructions: Aggregate the elements in the array expr.
Return type: R
Example:
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x);
6
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
60
EXISTS
Function statement:
EXISTS(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to one or more elements in the array.
Return type: boolean
Example:
> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
true> SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
false
> SELECT exists(array(1, null, 3), x -> x % 2 == 0);
NULL
> SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
true
> SELECT exists(array(1, 2, 3), x -> x IS NULL);
false
ARRAY_CONTAINS
Function statement:
ARRAY_CONTAINS(<expr> array<T>, <value> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if the array contains value.
Return type: boolean
Example:
> SELECT array_contains(array(1, 2, 3), 2);
true
ARRAYS_OVERLAP
Function statement:
ARRAYS_OVERLAP(<a> array<T>, <b> array<U>)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if a contains at least one non-null element that also exists in b. If the arrays have no common elements, and they are non-empty, and either contains null elements, it returns null. Otherwise, it returns false.
Return type: boolean
Example:
> SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
true
ARRAY_INTERSECT
Function statement:
ARRAY_INTERSECT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the intersection of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
[1,3]
ARRAY_JOIN
Function statement:
ARRAY_JOIN(<a> ARRAY<T>, <delimiter> string[, <nullReplacement> string])
Supported engines: SparkSQL and Presto
Usage instructions: Join the elements of the given array using a delimiter and an optional string to replace nulls. If nullReplacement is not set, all null values are filtered out.
Return type: string
Example:
> SELECT array_join(array('hello', 'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ', ',');
hello , world
ARRAY_POSITION
Function statement:
ARRAY_POSITION(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return the 1-based index of the first element of the array.
Return type: integer
Example:
> SELECT array_position(array(3, 2, 1), 1);
3
ARRAY_SORT
Function statement:
ARRAY_SORT(<a> array<T>[, <func> function(T, T)->integer])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array. If func is omitted, the array is sorted in ascending order.
Return type: array<T>
Example:
> SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);
[1,5,6]
> SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left < right then 1 when left > right then -1 else 0 end);
["dc","bc","ab"]
> SELECT array_sort(array('b', 'd', null, 'c', 'a'));
["a","b","c","d",null]
ARRAY_EXCEPT
Function statement:
ARRAY_EXCEPT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements in a that are not in b, without duplicates.
Return type: array<T>
Example:
> SELECT array_except(array(1, 2, 3), array(1, 3, 5));
[2]
ARRAY_UNION
Function statement:
ARRAY_UNION(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the union of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_union(array(1, 2, 3), array(1, 3, 5));
[1,2,3,5]
NAMED_STRUCT
Function statement:
NAMED_STRUCT(name1 K, val1 V, ...)
Supported engine: SparkSQL
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT named_struct("a", 1, "b", 2, "c", 3);
{"a":1,"b":2,"c":3}
STRUCT
Function statement:
STRUCT(<col1> T1, <col2> T2, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT struct('a', 'b', 'c');
{"col1":"a","col2":"b","col3":"c"}
> SELECT struct('a', 'b', 'c', 1, 2);
{"col1":"a","col2":"b","col3":"c","col4":1,"col5":2}
SLICE
Function statement:
SLICE(<a> array<T>, <start> integer, <length> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return a subset of array a starting from the index start (array hidden index starts from 1, if start is negative, it counts from the end) with the specified length length.
Return type: array<T>
Example:
> SELECT slice(array(1, 2, 3, 4), 2, 2);
[2,3]
> SELECT slice(array(1, 2, 3, 4), -2, 2);
[3,4]
ARRAYS_ZIP
Function statement:
ARRAYS_ZIP(<a1> array<T>, ...)
Supported engine: SparkSQL
Usage instructions: Return a merged array where each element is of struct type, with the Nth struct containing the Nth value of each input array.
Return type: array<struct<string, T>>
Example:
> SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
[{"0":1,"1":2},{"0":2,"1":3},{"0":3,"1":4}]
> SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
[{"0":1,"1":2,"2":3},{"0":2,"1":3,"2":4}]
SORT_ARRAY
Function statement:
SORT_ARRAY(<a> array<T>[, ascendingOrder boolean])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array in ascending or descending order. For double/float types, NaN is considered greater than any non-NaN element. Null elements are placed at the beginning of the returned array in ascending order or at the end in descending order.
Return type: array<T>
Example:
> SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
[null,"a","b","c","d"]
SHUFFLE
Function statement:
Supported engines: SparkSQL and Presto
Usage instructions: Return a random permutation of the given array.
Return type: array<T>
Example:
> SELECT shuffle(array(1, 20, 3, 5));
[3,1,5,20]
> SELECT shuffle(array(1, 20, null, 3));
[20,null,3,1]
ARRAY_MAX
Function statement:
Supported engines: SparkSQL and Presto
Usage instructions: Return the maximum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_max(array(1, 20, null, 3));
20
ARRAY_MIN
Function statement:
Supported engines: SparkSQL and Presto
Usage instructions: Return the minimum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_min(array(1, 20, null, 3));
1
FLATTEN
Function statement:
FLATTEN(<aa> array<array<T>>
Supported engines: SparkSQL and Presto
Usage instructions: Convert a two-dimensional array into a one-dimensional array.
Return type: array<T>
Example:
> SELECT flatten(array(array(1, 2), array(3, 4)));
[1,2,3,4]
SEQUENCE
Function statement:
SEQUENCE(<start> integer|date|timestamp, end integer|date|timestamp[, step integer|interval])
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from start to end (inclusive), incrementing by step. The type of the returned elements matches the type of start and end.
The start and stop expressions must parse to the same type. If the start and stop expressions are parsed to the date or timestamp type, then step must be parsed to an interval, year-month interval, or day-time interval type; otherwise, it must be parsed to the same type as start and end.
Return type: same as start
Example:
> SELECT sequence(1, 5);
[1,2,3,4,5]
> SELECT sequence(5, 1);
[5,4,3,2,1]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
[2018-01-01,2018-02-01,2018-03-01]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
[2018-01-01,2018-02-01,2018-03-01]
ARRAY_REPEAT
Function statement:
ARRAY_REPEAT(<element> T, <count> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array containing count occurrences of element.
Return type: array<T>
Example:
> SELECT array_repeat('123', 2);
["123","123"]
ARRAY_REMOVE
Function statement:
ARRAY_REMOVE(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Delete all elements equal to the element from the array.
Return type: array<T>
Example:
> SELECT array_remove(array(1, 2, 3, null, 3), 3);
[1,2,null]
ARRAY_DISTINCT
Function statement:
ARRAY_DISTINCT(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Delete duplicate values from the array.
Return type: array<T>
Example:
> SELECT array_distinct(array(1, 2, 3, null, 3));
[1,2,3,null]
ELEMENT_AT
Function statement:
ELEMENT_AT(<a> array<T>, <index> integer)
ELEMENT_AT(<m> map<K, V>, <key> K)
Supported engines: SparkSQL and Presto
Usage instructions: Return the array element at the given (1-based) index or the value for the given key.
Return type: T and V
Example:
> SELECT element_at(array(1, 2, 3), 2);
2
> SELECT element_at(map(1, 'a', 2, 'b'), 2);
b
MAP
Function statement:
Supported engines: SparkSQL and Presto
Usage instructions: Generate a map from the given elements.
Return type: MAP<K, V>
Example:
> SELECT map(1.0, '2', 3.0, '4');
{1.0:"2",3.0:"4"}
MAP_FROM_ARRAYS
Function statement:
MAP_FROM_ARRAYS(<keys> array<K>, <values> array<V>)
Supported engines: SparkSQL and Presto
Usage instructions: Create a map using the given key/value arrays. None of the elements in keys should be null.
Return type: map<K, V>
Example:
> SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
{1.0:"2",3.0:"4"}
MAP_KEYS
Function statement:
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the keys of the map.
Return type: array<K>
Example:
> SELECT map_keys(map(1, 'a', 2, 'b'));
[1,2]
MAP_VALUES
Function statement:
MAP_VALUES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the values of the map.
Return type: array<V>
Example:
> SELECT map_values(map(1, 'a', 2, 'b'));
["a","b"]
MAP_ENTRIES
Function statement:
MAP_ENTRIES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array of all entries in the given map.
Return type: array<struct<K, V>>
Example:
> SELECT map_entries(map(1, 'a', 2, 'b'));
[{"key":1,"value":"a"},{"key":2,"value":"b"}]
MAP_FROM_ENTRIES
Function statement:
MAP_FROM_ENTRIES(<entries> array<struct<K, V>>)
Supported engines: SparkSQL and Presto
Usage instructions: Return a map created from the given array of entries.
Return type: map<K, V>
Example:
> SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
{1:"a",2:"b"}
MAP_CONCAT
Function statement:
MAP_CONCAT(map1 map<K, V>, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Return the union of all given maps.
Return type: map<K, V>
Example:
> SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
{1:"a",2:"b",3:"c"}
MAP_FILTER
Function statement:
MAP_FILTER(<m> map<K, V>, <func> function(K, V)->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the entries in m using func.
Return type: map<K, V>
Example:
> SELECT map_filter(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
{1:0,3:-1}
MAP_ZIP_WITH
Function statement:
MAP_ZIP_WITH(<map1> map<K, V1>, <map2> map<K, V2>, <func> function(K, V1, V2)->R)
Supported engine: SparkSQL
Usage instructions: Merge two given maps into a single map using func. For keys that appear in only one map, their values will be set to NULL. If the input maps contain duplicate keys, only the first entry for each duplicate key is transmitted to func.
Return type: MAP<K, R>
Example:
> SELECT map_zip_with(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2));
{1:"ax",2:"by"}
Function statement:
TRANSFORM_KEYS(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the keys in the map using func.
Return type: map<R, V>
Example:
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
{2:1,3:2,4:3}
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
{2:1,4:2,6:3}
Function statement:
TRANSFORM_VALUES(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the values in the map using func.
Return type: map<K, R>
Example:
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
{1:2,2:3,3:4}
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
{1:2,2:4,3:6}
SIZE
Function statement:
SIZE(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT size(array('b', 'd', 'c', 'a'));
4
> SELECT size(map('a', 1, 'b', 2));
2
> SELECT size(NULL);
-1
CARDINALITY
Function statement:
CARDINALITY(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT cardinality(array('b', 'd', 'c', 'a'));
4
> SELECT cardinality(map('a', 1, 'b', 2));
2
> SELECT cardinality(NULL);
-1
ANY_MATCH / FORALL
Function statement:
ANY_MATCH(<expr> array<T>, x -> lambda(x))
FORALL(<expr> array<T>, x -> lambda(x))
Supported engine: SparkSQL
Usage instructions: For each element in the array, execute the lambda expression in sequence. The function returns true if at least one element returns true; otherwise, it returns false.
If any element in the array is NULL, the function returns NULL.
Return type: boolean
Example:
> SELECT any_match(array(1, 2, 3), x -> x % 2 == 0);
false
> SELECT any_match(array(2, 4, 8), x -> x % 2 == 0);
true
> SELECT any_match(array(1, null, 3), x -> x % 2 == 0);
NULL
Was this page helpful?