Introduction to Data Engineering with Azure
Grouping, Select, and Sort Transformations in ADF
Now we will explore three essential transformations in Azure Data Factory: Grouping, Select, and Sort. These transformations are critical for organizing and structuring your data in the data flow. We will discuss how to use each transformation to manipulate data effectively.
For example, the Grouping transformation (exposed in ADF mapping data flows as the Aggregate transformation) lets you group sales data by the "Region" column and calculate the sum of "SalesAmount" to determine the total sales for each region. Similarly, grouping by "ProductCategory" and applying a count function can provide the number of products sold in each category.
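To make the grouping semantics concrete, here is a minimal Python sketch of the same two aggregations. It is only an analogy for what the transformation computes, not ADF code; the sample rows and the helper names `group_sum` and `group_count` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; column names follow the example in the text.
sales = [
    {"Region": "East", "ProductCategory": "Toys", "SalesAmount": 120.0},
    {"Region": "West", "ProductCategory": "Toys", "SalesAmount": 80.0},
    {"Region": "East", "ProductCategory": "Books", "SalesAmount": 50.0},
]


def group_sum(rows, key, value):
    """Sum the `value` column for each distinct value of the `key` column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def group_count(rows, key):
    """Count the rows for each distinct value of the `key` column."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return dict(counts)


total_by_region = group_sum(sales, "Region", "SalesAmount")
count_by_category = group_count(sales, "ProductCategory")

print(total_by_region)    # {'East': 170.0, 'West': 80.0}
print(count_by_category)  # {'Toys': 2, 'Books': 1}
```

Each output row corresponds to one group, which is why the result has fewer rows than the input.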
For example, you can use the Select transformation to rename the column "Cust_ID" to "CustomerID" for better clarity. Additionally, you can drop unnecessary columns, such as "TempData", to streamline the dataset for further analysis.
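The rename-and-drop behaviour can be sketched in plain Python as follows. This is an illustration of the idea rather than ADF syntax; the `select_columns` helper and the sample customer rows are assumptions made for the example.

```python
def select_columns(rows, rename=None, drop=()):
    """Return new rows with columns renamed per `rename` and `drop` columns removed."""
    rename = rename or {}
    return [
        {rename.get(col, col): val for col, val in row.items() if col not in drop}
        for row in rows
    ]


# Hypothetical sample rows; "Cust_ID" and "TempData" come from the text.
customers = [
    {"Cust_ID": 1, "Name": "Ana", "TempData": "x"},
    {"Cust_ID": 2, "Name": "Ben", "TempData": "y"},
]

cleaned = select_columns(
    customers,
    rename={"Cust_ID": "CustomerID"},
    drop={"TempData"},
)

print(cleaned)
# [{'CustomerID': 1, 'Name': 'Ana'}, {'CustomerID': 2, 'Name': 'Ben'}]
```

The row count is unchanged; only the shape of each row differs.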
For example, the Sort transformation can be used to arrange sales data in descending order based on the "TotalSales" column, ensuring the highest sales appear first. Alternatively, you could sort employee records by "HireDate" in descending order to view the most recently hired employees first; ascending order would instead list the longest-serving employees first.
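The effect of sort direction is easy to check with a small Python sketch. The rows below are hypothetical and the code only mirrors the ordering logic, not ADF itself. Note that ascending order on a date column puts the earliest date first, while descending order puts the most recent date first.

```python
from datetime import date

# Hypothetical sample rows; "TotalSales" and "HireDate" come from the text.
sales_totals = [
    {"Region": "East", "TotalSales": 170.0},
    {"Region": "West", "TotalSales": 80.0},
    {"Region": "North", "TotalSales": 210.0},
]
employees = [
    {"Name": "Ana", "HireDate": date(2021, 3, 1)},
    {"Name": "Ben", "HireDate": date(2019, 7, 15)},
    {"Name": "Cy", "HireDate": date(2023, 1, 9)},
]

# Descending by TotalSales: highest sales first.
by_sales_desc = sorted(sales_totals, key=lambda r: r["TotalSales"], reverse=True)

# Ascending by HireDate: earliest hire first.
by_hire_asc = sorted(employees, key=lambda r: r["HireDate"])

# Descending by HireDate: most recent hire first.
by_hire_desc = sorted(employees, key=lambda r: r["HireDate"], reverse=True)

print([r["Region"] for r in by_sales_desc])  # ['North', 'East', 'West']
print([r["Name"] for r in by_hire_asc])      # ['Ben', 'Ana', 'Cy']
print([r["Name"] for r in by_hire_desc])     # ['Cy', 'Ana', 'Ben']
```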
How to Use Grouping, Select, and Sort Transformations in ADF
- Create a new Data Flow or use an existing one;
- Drag a Grouping Transformation onto the Data Flow canvas;
- In the Grouping settings, choose the column(s) to group by (e.g., Region);
- Define aggregation logic for other columns (e.g., sum, average, max);
- Drag a Select Transformation onto the canvas;
- In the Select settings, choose the columns to keep, and rename or reorder them as needed;
- Drag a Sort Transformation onto the canvas;
- In the Sort settings, define the column(s) to sort by and choose the sort order (ascending or descending);
- Add a Sink Transformation to store the output in a destination like SQL or Blob Storage;
- Validate the Data Flow configuration to ensure everything is correct.
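The steps above can be sketched end to end in plain Python to show how the stages chain together. This is a conceptual analogy under stated assumptions, not ADF code: the source rows are hypothetical, the rename to "TotalSales" is chosen for illustration, and an in-memory CSV buffer stands in for a real sink such as SQL or Blob Storage.

```python
import csv
import io
from collections import defaultdict

# Hypothetical source rows standing in for the Data Flow's source dataset.
source = [
    {"Region": "East", "SalesAmount": 120.0},
    {"Region": "West", "SalesAmount": 80.0},
    {"Region": "East", "SalesAmount": 50.0},
    {"Region": "North", "SalesAmount": 210.0},
]

# Grouping: group by Region and sum SalesAmount.
totals = defaultdict(float)
for row in source:
    totals[row["Region"]] += row["SalesAmount"]
grouped = [{"Region": region, "SalesAmount": amount} for region, amount in totals.items()]

# Select: keep both columns and rename SalesAmount to TotalSales.
selected = [{"Region": r["Region"], "TotalSales": r["SalesAmount"]} for r in grouped]

# Sort: descending by TotalSales so the highest total comes first.
ordered = sorted(selected, key=lambda r: r["TotalSales"], reverse=True)

# Sink: write the result to an in-memory CSV (a stand-in for a real destination).
sink = io.StringIO()
writer = csv.DictWriter(sink, fieldnames=["Region", "TotalSales"], lineterminator="\n")
writer.writeheader()
writer.writerows(ordered)
output = sink.getvalue()

print(output)
```

Running the stages in this order means the sort operates on the already aggregated, already renamed rows, which keeps the sorted set small.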