Microsoft Fabric offers a robust platform for modern data management, enabling streamlined processing and advanced analytics. Data Pipelines, one of its core tools, automate extract, transform, and load (ETL) processes, which are essential for ingesting transactional data into analytical data stores. Fabric Data Pipelines are similar to Azure Data Factory and Synapse pipelines: you build a pipeline from multiple activities, connect them together to orchestrate a workflow, and run it interactively or schedule it to run automatically.
Today we will discuss the core activities and functions of data pipelines, and then create a sample pipeline using the Copy Data activity.
Data pipelines can be used in a variety of scenarios, such as copying data from source systems into a Lakehouse or Warehouse, orchestrating Dataflows, notebooks, and stored procedures, and scheduling recurring ETL jobs.
Before you start building pipelines in Microsoft Fabric, it’s important to grasp a few basic concepts. In data pipelines, you’ll come across elements such as activities, parameters, and pipeline runs.
Activities: Activities in Microsoft Fabric’s data pipelines serve two primary purposes: data transformation and control flow. Popular activities include Copy Data, Dataflow, Stored Procedure, ForEach, If Condition, and Lookup. The Lookup activity, for example, retrieves records or values from an external source so that subsequent activities can reference them.
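To make the orchestration idea concrete, here is a minimal sketch of how activities are chained in a pipeline definition, expressed as a Python dict that mirrors the ADF/Synapse-style JSON that Fabric pipelines share. The activity names and source type below are hypothetical; the point is the dependsOn wiring and the @activity() expression syntax.

```python
# Minimal sketch of a pipeline definition chaining a Lookup into a
# ForEach. The structure mirrors ADF/Synapse-style pipeline JSON that
# Fabric shares; activity names and type names are illustrative.
pipeline = {
    "name": "IngestSalesData",
    "properties": {
        "activities": [
            {
                "name": "LookupTableList",  # control activity: fetch a list of tables
                "type": "Lookup",
                "typeProperties": {"source": {"type": "LakehouseTableSource"}},
            },
            {
                "name": "CopyEachTable",  # runs its inner activities once per row
                "type": "ForEach",
                "dependsOn": [
                    # wait for the Lookup to succeed before starting
                    {"activity": "LookupTableList", "dependencyConditions": ["Succeeded"]}
                ],
                "typeProperties": {
                    # reference the Lookup output via the pipeline expression language
                    "items": {
                        "value": "@activity('LookupTableList').output.value",
                        "type": "Expression",
                    },
                    "activities": [{"name": "CopyTable", "type": "Copy"}],
                },
            },
        ]
    },
}
```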
Parameters: Parameters in pipelines allow for customization by providing specific values for each pipeline run. This flexibility enhances the reusability of pipelines, facilitating dynamic data ingestion and transformation processes.
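As a minimal sketch (the parameter and table names below are hypothetical), a parameter is declared once in the pipeline definition and then referenced from any activity with the @pipeline().parameters.&lt;name&gt; expression syntax that Fabric shares with ADF/Synapse:

```python
# Declare a pipeline parameter with a default value; the caller can
# override it on each run. Names here are illustrative.
parameters = {
    "tableName": {"type": "String", "defaultValue": "nyc_taxi_green"}
}

# An activity setting that resolves the parameter at run time, so the
# same pipeline can load a different table on every run.
destination_table = "@pipeline().parameters.tableName"
```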
Pipeline Runs: A pipeline run is created each time a pipeline is executed. Runs can be started on demand or scheduled at regular intervals. Use the unique run ID to review run details, confirm successful completion, and examine the specific execution settings.
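For an on-demand run outside the Fabric UI, you can call the Fabric REST API’s job scheduler. The sketch below is hedged: the workspace ID, pipeline item ID, and token are placeholders, the parameter payload is illustrative, and you should verify the endpoint paths and executionData shape against the current Fabric REST API reference.

```python
import requests

BASE = "https://api.fabric.microsoft.com/v1"
workspace_id = "<workspace-id>"      # placeholder
pipeline_id = "<pipeline-item-id>"   # placeholder
headers = {"Authorization": "Bearer <access-token>"}  # placeholder token

# Trigger an on-demand pipeline run; run-time parameter values (if any)
# are passed in the executionData payload.
resp = requests.post(
    f"{BASE}/workspaces/{workspace_id}/items/{pipeline_id}/jobs/instances",
    params={"jobType": "Pipeline"},
    headers=headers,
    json={"executionData": {"parameters": {"tableName": "nyc_taxi_green"}}},
)
resp.raise_for_status()

# The Location header points at the new job instance; its ID is the
# unique run ID used to review the run's details and final status.
status_url = resp.headers["Location"]
run = requests.get(status_url, headers=headers).json()
print(run.get("status"))  # e.g. NotStarted, InProgress, Completed, Failed
```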
Once the data pipeline is created, you’ll see an interface for configuring the ‘Copy Data’ activity. Selecting ‘Copy Data’ guides you through choosing a data source from the available options. For this example, let’s build the copy data activity using the sample dataset titled NYC Taxi-Green.
After selecting the source data, click ‘Next’ to proceed to the ‘Connect to data source’ step. Here, you’ll be able to preview the dataset.
After reviewing the dataset, click ‘Next’ to proceed to the ‘Choose data destination’ page. Here, you’ll need to select the destination where your data will be stored. For this example, we will select Lakehouse as the data destination.
Choose Lakehouse and proceed to select the specific Lakehouse where you want to store the data.
After selecting the Lakehouse, proceed to the next step, where you’ll configure how the dataset will be stored. Choose ‘Tables’ as the Root folder and ‘Load to new table’ under Load settings, and rename the table as desired. You also have the option to map the columns if you wish to modify column names or data types, as sketched below.
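Under the hood, these wizard choices end up in the copy activity’s sink and mapping settings. A hedged sketch of what that can look like in the pipeline definition (property names follow the ADF-style copy activity format that Fabric shares; the type names and column names here are illustrative):

```python
# Illustrative sink and column-mapping settings produced by the wizard:
# a Lakehouse table sink plus a translator that renames two columns.
copy_sink = {
    "type": "LakehouseTableSink",   # illustrative type name
    "tableName": "nyc_taxi_green",  # the table named in 'Load to new table'
}
translator = {
    "type": "TabularTranslator",
    "mappings": [
        # rename source columns on their way into the Lakehouse table
        {"source": {"name": "lpepPickupDatetime"}, "sink": {"name": "pickup_time"}},
        {"source": {"name": "fareAmount"}, "sink": {"name": "fare_amount"}},
    ],
}
```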
Click ‘Next’ to proceed to the ‘Review + save’ page, where you can review the configuration for copying the data from source to destination. After verifying the source and destination names, select ‘Save + Run’ to save and execute the copy data activity.
After selecting ‘Save + Run’, the activity will begin copying the data from the source to the Lakehouse table. You can monitor the status of the run from the output pane.
After completion, you can view all the settings under the ‘Activities’ tab by selecting the ‘Copy Data’ activity.
To conclude, Microsoft Fabric’s Data Pipeline simplifies the orchestration of ETL processes and makes it easy to extract data from a variety of source systems. As this article shows, core activities such as Copy Data streamline extraction tasks, while the intuitive interface enables efficient configuration of data transfer and storage. Together, these capabilities empower organizations to harness data effectively for actionable insights and informed decision-making.
Stay tuned for a series of upcoming blogs covering various experiences within Fabric.