Introduction to Data Engineering with Azure
Combining Data with Joins
Data often resides in multiple tables or sources, so merging it is essential for deriving meaningful insights. In this chapter, you'll learn how to use joins in Azure Data Factory (ADF) Data Flows to merge datasets efficiently.
ADF supports several types of joins:
- Inner Join: combines rows where keys match in both datasets;
- Left Outer Join: includes all rows from the left dataset plus matching rows from the right; right-side columns are null where there is no match;
- Right Outer Join: includes all rows from the right dataset plus matching rows from the left; left-side columns are null where there is no match;
- Full Outer Join: includes all rows from both datasets, with nulls in the columns of whichever side has no match;
- Cross Join: produces a Cartesian product of both datasets.
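To make the differences between these join types concrete, here is a minimal sketch in plain Python. It is not ADF code and does not use any ADF API; the `join_rows` helper and the `customers` and `orders` sample tables are hypothetical, and the sketch assumes the two inputs share only the key column.

```python
from itertools import product

def join_rows(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right, full, or cross."""
    left_cols = sorted({c for row in left for c in row})
    right_cols = sorted({c for row in right for c in row})

    def merged(l_row, r_row):
        # Start with every column set to None (null), then fill in the sides we have.
        out = {c: None for c in left_cols + right_cols}
        if r_row:
            out.update(r_row)
        if l_row:
            out.update(l_row)
        return out

    if how == "cross":
        # Cartesian product: every left row paired with every right row.
        return [merged(l, r) for l, r in product(left, right)]

    rows = []
    matched_right = set()
    for l in left:
        hits = [i for i, r in enumerate(right) if r[key] == l[key]]
        for i in hits:
            rows.append(merged(l, right[i]))
            matched_right.add(i)
        if not hits and how in ("left", "full"):
            rows.append(merged(l, None))  # unmatched left row, right columns null
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                rows.append(merged(None, r))  # unmatched right row, left columns null
    return rows

# Hypothetical sample data: customer 2 has no order, order 3 has no customer.
customers = [{"customer_id": 1, "name": "Ana"}, {"customer_id": 2, "name": "Ben"}]
orders = [{"customer_id": 1, "amount": 50}, {"customer_id": 3, "amount": 20}]

for how in ("inner", "left", "right", "full", "cross"):
    print(how, len(join_rows(customers, orders, "customer_id", how)))
```

With this sample data, the inner join keeps only customer 1, the outer joins add the unmatched rows with nulls on the missing side, and the cross join pairs every customer with every order regardless of the key.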
How to Combine Data with Joins in ADF
- Create a new Data Flow or open an existing one;
- Drag two Source Transformations onto the Data Flow canvas and connect them to the respective SQL tables;
- Drag a Join Transformation from the toolbox onto the canvas and connect the two sources to it;
- In the Join Transformation settings, select the Join Type and set the Join Condition;
- Add a Derived Column Transformation, or any other transformation, after the join to shape or enrich the joined data;
- Add a Sink Transformation to store the output;
- Validate the Data Flow configuration to ensure everything is correct.
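The steps above can be mirrored in a small plain-Python analogy: two sources, a join, a derived column, a sink, and a validation check. This is only a sketch of the data movement, not how ADF runs a Data Flow; every function name, the sample tables, and the `order_size` column are hypothetical.

```python
def source_customers():
    # Stands in for the first Source Transformation (a SQL table in the steps above).
    return [{"customer_id": 1, "name": "Ana"}, {"customer_id": 2, "name": "Ben"}]

def source_orders():
    # Stands in for the second Source Transformation.
    return [
        {"customer_id": 1, "amount": 120},
        {"customer_id": 1, "amount": 30},
        {"customer_id": 3, "amount": 20},
    ]

def validate(left, right, key):
    # Rough analogue of validation: fail early if the join key is missing.
    for name, rows in (("left", left), ("right", right)):
        if any(key not in row for row in rows):
            raise ValueError(f"join key {key!r} missing in {name} source")

def inner_join(left, right, key):
    # Join Transformation with Join Type = inner and an equality Join Condition.
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**r, **l} for l in left for r in index.get(l[key], [])]

def derive_columns(rows):
    # Derived Column Transformation: add a hypothetical order_size column.
    return [
        {**row, "order_size": "large" if row["amount"] >= 100 else "small"}
        for row in rows
    ]

def sink(rows, store):
    # Sink Transformation: here just an in-memory list.
    store.extend(rows)
    return len(rows)

def run_flow():
    customers, orders = source_customers(), source_orders()
    validate(customers, orders, "customer_id")
    joined = inner_join(customers, orders, "customer_id")
    store = []
    sink(derive_columns(joined), store)
    return store

result = run_flow()
for row in result:
    print(row)
```

Only the two orders belonging to customer 1 reach the sink, because the inner join drops the customer without orders and the order without a customer. Swapping in an outer join would keep those rows, but the derived column would then need to handle null amounts.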