UDF Description
Users can write UDFs, package them into JAR files, and then define them as functions in Data Lake Compute (DLC) for use in query analysis. Currently, DLC UDFs use the Hive format: the UDF class inherits from org.apache.hadoop.hive.ql.exec.UDF and implements the evaluate method.
Example: a simple array UDF that computes the difference between adjacent elements. For instance, given the input [1, 3, 6, 10], it returns [0, 2, 3, 4].
import java.util.ArrayList;

import org.apache.hadoop.hive.ql.exec.UDF;

public class MyDiff extends UDF {
    public ArrayList<Integer> evaluate(ArrayList<Integer> input) {
        ArrayList<Integer> result = new ArrayList<Integer>();
        // The first element has no predecessor, so its difference is 0.
        result.add(0, 0);
        // Each subsequent element is the difference from the previous input value.
        for (int i = 1; i < input.size(); i++) {
            result.add(i, input.get(i) - input.get(i - 1));
        }
        return result;
    }
}
Reference POM file:
<dependencies>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.16</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>
Creating a Function
Note:
If you are creating a UDAF/UDTF function, add the corresponding _udaf/_udtf suffix to the function name.
If you are familiar with SQL syntax, you can create a function by executing a CREATE FUNCTION statement in Data Exploration (see the SQL sketch after the steps below), or you can use the visual interface. The process is as follows:
2. Enter Data Management through the left sidebar and select the database in which to create the function. If you need to create a new database, refer to Data Catalog and DMC.
3. Click Function to enter the function management page.
4. Click Create Function to create one. The UDF program package can be uploaded from a local file or selected from a COS path (COS-related permissions are required); for example, create the function by selecting a COS path.
The Function Class Name consists of the package information and the function execution class name.
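For the SQL path mentioned above, a statement along the following lines can be executed in Data Exploration. This is a minimal sketch assuming Hive-compatible CREATE FUNCTION syntax; the database name, function name, package/class name, and COS JAR path are placeholders to replace with your own values.

-- Register the example UDF from a JAR stored in COS (placeholder names and path).
CREATE FUNCTION demo_db.my_diff AS 'com.example.udf.MyDiff'
USING JAR 'cosn://examplebucket-1250000000/udf/my-diff-1.0.jar';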
Function Usage
2. Enter Data Exploration via the left navigation menu, select a compute engine, and then invoke the function with SQL.
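Once the function is created, it can be referenced in queries like a built-in function. The following is a minimal sketch assuming the MyDiff example above was registered as my_diff in the current database; the sample input is illustrative only.

-- Call the UDF on a literal array; the expected result is [0, 2, 3, 4].
SELECT my_diff(array(1, 3, 6, 10));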