Function Development

Last updated: 2024-11-01 16:35:05
    1. Log in to the WeData Console.
    2. Click Project List in the left menu and find the target project for the Function Development feature.
    3. Select the project and click to enter the Data Development module.
    4. Click Function Development in the left menu.

    Function Overview

    UDF functions uploaded through the Resource Management feature can be used in Function Development. After a function is categorized and its class name and usage are specified, it can be called in data development processes. Currently, Hive SQL, Spark SQL, and DLC SQL function types are supported.

    Creating a Function

    1. On the Function Development page, click the create button and select a new Hive SQL function, Spark SQL function, or DLC SQL function. Alternatively, click the button to the right of the target path under the corresponding function type to create a function of that type.
    
    2. Configure the function in the popup window, and click Save and Submit to complete function registration.
    
    Configuration information is shown below:
    Function Type: Create the function in one of the preset categories based on its nature. Categories include: Analytical Functions, Encryption Functions, Aggregate Functions, Logical Functions, Date and Time Functions, Mathematical Functions, Conversion Functions, String Functions, IP and Domain Functions, Window Functions, and Other Functions.
    Class Name: Enter the function's class name.
    Function File: Select the source of the function file:
    - Select resource file: choose the function file from the jar or zip resources uploaded through the Resource Management feature.
    - Specify COS path: obtain the function file from a path in the platform's COS bucket.
    Resource File: When the function file option is Select resource file, select the desired function file in the resource management directory.
    COS Path: When the function file option is Specify COS path, enter the path of the function file in the platform's COS bucket.
    Command Format: FunctionName(Input Parameters). For example, the command format of the sum function is sum(col).
    Usage Instructions: Instructions for using the custom function. For example, the usage instructions for the sum function are: calculates the summary value.
    Parameter Description: Parameter description for the custom function. For example, for the sum function: col: required; column values can be of DOUBLE, DECIMAL, or BIGINT type; STRING input is implicitly converted to DOUBLE for calculation.
    Returned Values: Return value description for the custom function. For example, the sum function returns DOUBLE type.
    Sample Code: Example for the custom function. For example, the example for the sum function is: calculate the total sales of all products, with the command example: select sum(sales) from table.
    3. After a function is modified, the version feature saves a history record, including the version number, submitter, submission time, change type, and remarks, and supports rolling back to a previous version.

    Function Example

    Spark SQL Function Development Example

    1. Create a Project
    Create a Maven project that includes the hive-exec dependency. You can create the project with the mvn command line or with the IDEA tool; replace groupId and artifactId with your own names.
    mvn archetype:generate -DgroupId=com.example -DartifactId=demo-hive -Dversion=1.0-SNAPSHOT -Dpackage=com.example
    
    2. Write the Code
    Include the hive-exec and junit test dependencies in the pom file.
    <dependencies>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>2.3.8</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
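    Note that the jar-with-dependencies assembly used in the packaging step below bundles every compile-scope dependency into the output jar. If the target cluster already provides the Hive classes, you may prefer to declare hive-exec with provided scope so it is compiled against but not bundled (an optional adjustment, not a platform requirement):
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>2.3.8</version>
        <!-- provided: available at compile time, supplied by the cluster at runtime -->
        <scope>provided</scope>
    </dependency>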
    Create a Java class in the src/main/java/com/example directory, extend the org.apache.hadoop.hive.ql.exec.UDF class, and write an evaluate method that implements the behavior of the custom function, e.g., converting the input string to uppercase.
    package com.example;
    
    import org.apache.hadoop.hive.ql.exec.UDF;
    
    public class UppercaseUDF extends UDF {
        // Convert the input string to uppercase; pass null through unchanged.
        public String evaluate(String input) {
            return input == null ? null : input.toUpperCase();
        }
    }
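    Since the pom already declares the junit test dependency, you can sanity-check the UDF with a minimal unit test before packaging. A sketch (the test class name is illustrative), placed under src/test/java/com/example:
    package com.example;
    
    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNull;
    
    import org.junit.Test;
    
    public class UppercaseUDFTest {
        @Test
        public void convertsToUppercase() {
            UppercaseUDF udf = new UppercaseUDF();
            assertEquals("HELLO", udf.evaluate("hello"));
            assertNull(udf.evaluate(null)); // null input passes through
        }
    }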
    3. Compile and Package
    Add the Maven packaging plugins below to the pom file. Then, in the project root directory, execute the mvn package command to compile and package. The generated package is named demo-hive-1.0-SNAPSHOT.jar.
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <!-- (start) for package jar with dependencies -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.0.0</version>
                <configuration>
                    <archive>
                        <!-- Specify the class where the main method is located -->
                        <manifest>
                            <mainClass>com.example.UppercaseUDF</mainClass>
                        </manifest>
                    </archive>
                    <!-- Do not change jar-with-dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <appendAssemblyId>false</appendAssemblyId>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>package</phase> <!-- bind to the packaging phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <!-- (end) for package jar with dependencies -->
        </plugins>
    </build>
    
    <repositories>
        <repository>
            <id>alimaven</id>
            <name>aliyun maven</name>
            <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
        </repository>
    </repositories>
    Execute the mvn package command:
    mvn package -Dmaven.test.skip=true
    You can also use the IDEA tool to package and generate a jar package with dependencies.
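    Optionally, you can confirm that the UDF class was bundled by listing the jar contents with the standard JDK jar tool (an optional check, not a WeData requirement):
    jar tf demo-hive-1.0-SNAPSHOT.jar | grep UppercaseUDF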
    
    4. Function Operations
    Enter the WeData Function Development page, create a custom function, fill in the full path name of the function class (com.example.UppercaseUDF), and select the corresponding resource file, which is the jar package implementing the custom function. If there is no resource file, create a resource first.
    4.1 Resource Upload:
    Upload the demo-hive-1.0-SNAPSHOT.jar function package through the resource management feature.
    
    4.2 Function Creation:
    Create a Spark SQL function through the function development feature.
    
    Sample function information:
    Function Type: Other Functions
    Class Name: com.example.UppercaseUDF
    Function File: Select resource file
    Resource File: demo-hive-1.0-SNAPSHOT.jar
    Command Format: UppercaseUDF(col)
    Usage Instructions: Converts the input string to uppercase format
    Parameter Description: Input parameter of string type
    Returned Values: Outputs the string in uppercase format
    4.3 Function Usage:
    In the development space, create a new SQL file, use the newly created function, and verify its functionality, as shown below.
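    For example (the table and column names are illustrative):
    select UppercaseUDF(name) from demo_table;
    -- e.g. input 'hello' returns 'HELLO'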

    DLC SQL Function Development Example

    Users can write UDF functions, package them as JAR files, and use them in Data Development - Function Management. Currently, DLC (Data Lake Compute) UDFs are in Hive format: they inherit from org.apache.hadoop.hive.ql.exec.UDF and implement the evaluate method. For the steps to create a function on the Function Development page, refer to the Hive SQL function example above.
    Java Code Example:
    import java.util.ArrayList;
    
    import org.apache.hadoop.hive.ql.exec.UDF;
    
    public class MyDiff extends UDF {
        // Return the element-wise differences of the input list;
        // the first element of the result is always 0.
        public ArrayList<Integer> evaluate(ArrayList<Integer> input) {
            ArrayList<Integer> result = new ArrayList<Integer>();
            result.add(0, 0);
            for (int i = 1; i < input.size(); i++) {
                result.add(i, input.get(i) - input.get(i - 1));
            }
            return result;
        }
    }
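    As an illustration of the behavior (the literal array below is hypothetical), calling the registered function in SQL:
    select MyDiff(array(1, 3, 6));
    -- returns [0, 2, 3]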
    Sample POM file:
    <dependencies>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.16</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>1.2.1</version>
        </dependency>
    </dependencies>
    