ETL (extract, transform, load) is the process of extracting data from a business system, cleaning and transforming it, and then loading it into a data warehouse. Its goal is to consolidate and standardize raw data so that enterprises can base decisions on it. An ETL job collects data from a data source, transforms the data or adds information to it, and loads the results into a data sink. You do not need to know a programming language to start an ETL job: select a data source and a data sink, then configure field mappings based on your business logic.
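Although no code is required to configure a job, the three stages described above can be pictured with a small sketch. The example below is a conceptual illustration only, not the product's implementation: the in-memory lists standing in for the source and sink, the `FIELD_MAPPING` dictionary, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]

# Hypothetical field mapping: source field name -> sink field name.
FIELD_MAPPING: Dict[str, str] = {"cust_name": "customer_name", "amt": "amount"}


def extract(source: Iterable[Record]) -> Iterator[Record]:
    """Extract: read records from the data source one at a time."""
    for record in source:
        yield record


def transform(record: Record, mapping: Dict[str, str]) -> Record:
    """Transform: keep only mapped fields, rename them, and trim strings."""
    out: Record = {}
    for src_field, sink_field in mapping.items():
        value = record.get(src_field)
        if isinstance(value, str):
            value = value.strip()  # simple cleaning step
        out[sink_field] = value
    return out


def load(records: List[Record], sink: List[Record]) -> None:
    """Load: append the transformed records to the data sink."""
    sink.extend(records)


def run_etl(source: Iterable[Record], sink: List[Record],
            mapping: Dict[str, str]) -> int:
    """Run the three stages in order and return the number of rows loaded."""
    rows = [transform(record, mapping) for record in extract(source)]
    load(rows, sink)
    return len(rows)


if __name__ == "__main__":
    source_rows = [{"cust_name": " Ada ", "amt": 10, "internal_flag": "x"}]
    sink_rows: List[Record] = []
    run_etl(source_rows, sink_rows, FIELD_MAPPING)
    print(sink_rows)
```

In this sketch the field mapping plays the same role as the mapping you configure for a job: it decides which source fields reach the sink and under what names, and unmapped fields such as `internal_flag` are dropped.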
This section shows you how to develop an ETL job with a private cluster. It includes the following documents: