Understanding Data Flows in Azure Data Factory

A Data Flow in Azure Data Factory (ADF) lets you design data transformations visually and run them at scale without writing code. For instance, imagine a scenario where you need to clean, enrich, and aggregate sales data from multiple regions. Instead of writing extensive SQL or Python scripts, you can use a Data Flow to visually map these transformations and execute them within ADF.

Key Components of Data Flows

  • Source Transformation: defines where the data originates, such as Azure Blob Storage or an Azure SQL Database;
  • Transformations: include operations like filtering, joining, aggregating, or deriving new columns to manipulate the data;
  • Sink Transformation: specifies the destination for the processed data, such as another SQL database, a data lake, or file storage.

We will start by building a simple data flow that uses only a source and a sink transformation.
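
If you prefer to see what such a flow looks like outside the visual canvas, the sketch below builds the same source-plus-sink shape with the azure-mgmt-datafactory Python SDK. It is only an illustration: the subscription, resource group, factory, and dataset names are placeholders, and the two datasets (SalesDataset and SalesOutputDataset) are assumed to already exist in the factory; the snippets in the next sections show one way to create them.

```python
# Minimal sketch (names are placeholders): register a mapping data flow that
# reads from one dataset and writes to another, with no transformations yet.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    DataFlowResource,
    MappingDataFlow,
    DataFlowSource,
    DataFlowSink,
    DatasetReference,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

data_flow = DataFlowResource(
    properties=MappingDataFlow(
        description="Pass sales data from source to sink unchanged",
        sources=[DataFlowSource(name="SalesSource",
                                dataset=DatasetReference(reference_name="SalesDataset"))],
        sinks=[DataFlowSink(name="SalesSink",
                            dataset=DatasetReference(reference_name="SalesOutputDataset"))],
        transformations=[],
        # The script is what ADF Studio generates behind the visual canvas;
        # here it simply wires the source stream straight into the sink.
        script=(
            "source(allowSchemaDrift: true,\n"
            "  validateSchema: false) ~> SalesSource\n"
            "SalesSource sink(allowSchemaDrift: true,\n"
            "  validateSchema: false) ~> SalesSink"
        ),
    )
)

client.data_flows.create_or_update(
    "<resource-group>", "<factory-name>", "SimpleSalesDataFlow", data_flow
)
```

In ADF Studio the same definition is produced for you as you drag transformations onto the canvas; the SDK route is mainly useful for automation.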

How to Set Up a Source Transformation

  1. Add a new Data Flow in the Author section of Azure Data Factory Studio;
  2. Drag a Source Transformation from the toolbox onto the Data Flow canvas;
  3. In the Source Transformation settings, select a Linked Service, such as Azure SQL Database or Azure Blob Storage, to connect to your data source (steps 3-5 are also sketched in code after this list);
  4. Choose an existing Dataset or create a new Dataset that represents the data to be ingested;
  5. Configure file format options if connecting to Blob Storage, or, for databases, provide a SQL query to filter or structure the incoming data;
  6. Validate the configuration and preview the data to ensure the source is correctly set up.
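
Steps 3-5 can also be done programmatically. The following sketch, using the same hypothetical names as before, registers an Azure Blob Storage linked service and a delimited-text (CSV) dataset that a source transformation could then select; the connection string, container, and file names are placeholders.

```python
# Rough code equivalent of steps 3-5 (placeholder names and credentials):
# a linked service for the storage account and a CSV dataset for the source.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    LinkedServiceResource,
    AzureBlobStorageLinkedService,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
    AzureBlobStorageLocation,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<factory-name>"

# Step 3: the linked service stores the connection to the storage account.
blob_ls = LinkedServiceResource(
    properties=AzureBlobStorageLinkedService(
        connection_string="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
    )
)
client.linked_services.create_or_update(rg, factory, "SalesBlobStorage", blob_ls)

# Steps 4-5: the dataset points at the CSV file and captures format options.
sales_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(reference_name="SalesBlobStorage"),
        location=AzureBlobStorageLocation(container="sales", file_name="regions.csv"),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
client.datasets.create_or_update(rg, factory, "SalesDataset", sales_ds)
```

Validation and data preview (step 6) remain interactive tasks: turn on the data flow debug session in ADF Studio to preview rows from this source.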

Sink Transformation for Processed Data

After defining transformations, use a Sink Transformation to specify where the transformed data will be stored. For example, you might save aggregated data back to the SQL database or export it as a CSV file to Blob Storage.
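
Continuing the same hypothetical example, the sketch below registers a CSV output dataset in Blob Storage and shows the sink transformation object that would reference it. As before, all names and connection details are placeholders, and the SalesBlobStorage linked service is the one created in the previous snippet.

```python
# Sink side of the hypothetical sales flow: a CSV output dataset in Blob Storage
# and the sink transformation that writes to it (all names are placeholders).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
    AzureBlobStorageLocation,
    DataFlowSink,
    DatasetReference,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<factory-name>"

# Output dataset: a folder in Blob Storage where the processed CSV files land.
output_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(reference_name="SalesBlobStorage"),
        location=AzureBlobStorageLocation(container="sales", folder_path="processed"),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
client.datasets.create_or_update(rg, factory, "SalesOutputDataset", output_ds)

# The sink transformation simply points at that dataset; it corresponds to the
# "SalesSink" step declared in the earlier data flow sketch and would be passed
# in the data flow's sinks list.
sales_sink = DataFlowSink(
    name="SalesSink",
    dataset=DatasetReference(reference_name="SalesOutputDataset"),
)
```

In ADF Studio you achieve the same thing by adding a Sink to the end of the flow on the canvas and selecting the output dataset.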
