In Microsoft Fabric, dataflows are one of the most efficient ways to ingest and transform data. Their low-code design and user-friendly interface make them a favorite among data professionals who want to build scalable data pipelines without heavy coding.
Currently, there are three Dataflow variants to choose from:
- Dataflow Gen 1
- Dataflow Gen 2
- Dataflow Gen 2 CI/CD (Git integration)
The question is: which one should you choose?
How We Tested Microsoft Fabric Dataflows
To answer that, I ran a performance comparison using a sizable dataset: three years of NYC taxi trip data from 2016 to 2018, totaling approximately 30 million rows. Each Dataflow type was tested three times under the same Microsoft Fabric capacity, and each test was executed independently to avoid any resource contention.
This approach allowed for a fair apples-to-apples comparison of speed and capacity consumption.
Performance Comparison of Dataflows
Dataflow Gen 1: Baseline Performance
Fabric Capacity
While Dataflow Gen 1 is not a native Microsoft Fabric artifact, I tested it under the same Fabric capacity for a fair comparison baseline. The outcome? Significantly slower performance, highlighting its limitations in modern Fabric environments.
Dataflow Gen 1 clocked in at an average refresh time of 19 minutes and 34 seconds.

Dataflow Gen 1 in Pro Workspace
To give Dataflow Gen 1 another chance on its home field, I moved its workspace back to Pro to see whether performance would differ. I used the same methodology and refreshed the dataflow three separate times. The individual results are listed below, and the average refresh time of the three runs was 14 minutes and 54 seconds.

Dataflow Gen 2: Faster and More Efficient
Now it’s time for Dataflow Gen 1’s successor, Dataflow Gen 2, to be put to the test. The new kid on the block returned impressive results with an average refresh time of 5 minutes and 55 seconds. Not only is Dataflow Gen 2 more performant than Gen 1, but it also comes with all the added features you get with a Fabric-backed item, such as the ability to designate a data destination.

Dataflow Gen 2 CI/CD: Performance with Git Integration
Last, but certainly not least, is the newest kid on the block, Dataflow Gen2 CI/CD. This version gives you everything the standard Gen 2 Dataflow offers, plus full CI/CD compatibility.
It clocked an impressive average refresh time of 6 minutes and 25 seconds. This makes it not only a very viable option, but also one of the most compelling considering its strong performance with the bonus of Git integration.
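To put the averages side by side, here is a minimal Python sketch that converts the refresh times reported above into seconds and computes a speedup ratio against Gen 1 on Fabric capacity. The dictionary labels and the `speedup` helper are my own illustrative names, not part of any Fabric tooling; only the four durations come from the tests in this article.

```python
from datetime import timedelta

# Average refresh times as reported in this article (three runs each).
avg_refresh = {
    "Gen 1 (Fabric capacity)": timedelta(minutes=19, seconds=34),
    "Gen 1 (Pro workspace)": timedelta(minutes=14, seconds=54),
    "Gen 2": timedelta(minutes=5, seconds=55),
    "Gen 2 CI/CD": timedelta(minutes=6, seconds=25),
}


def speedup(baseline: timedelta, candidate: timedelta) -> float:
    """Return how many times faster the candidate refresh is than the baseline."""
    return baseline.total_seconds() / candidate.total_seconds()


# Use Gen 1 on Fabric capacity as the comparison baseline (an illustrative choice).
baseline = avg_refresh["Gen 1 (Fabric capacity)"]
for name, duration in avg_refresh.items():
    ratio = speedup(baseline, duration)
    print(f"{name}: {duration.total_seconds():.0f} s ({ratio:.2f}x vs Gen 1 on Fabric)")
```

By this measure, Gen 2 refreshed roughly 3.3 times faster than Gen 1 on Fabric capacity, and Gen 2 CI/CD roughly 3 times faster.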

Capacity Consumption: Speed vs Cost
Speed only tells part of the story. We also need to consider Capacity Units (CUs) to understand the true cost of that performance.
Below is a breakdown of the CUs consumed by each of the Dataflows used in this exercise. Again, Dataflow Gen 1 is not a native Fabric item, but we’ll include its runs on Fabric capacity just for the purpose of comparison.

The results show that while Dataflow Gen 2 delivered the fastest refresh times, it required about 30% more capacity than the CI/CD version. Still, both Gen 2 types were far more efficient than Gen 1, which used over twice the capacity of its closest competitor.
Final Thoughts: Which Dataflow Should You Choose?
While every Dataflow variant has its role, Fabric’s performance optimizations give a clear edge to Gen2. To make the choice easier, I’ve summarized below when each version is the best fit.
Type | When to Use |
Dataflow Gen1 | When you’re not on Fabric…because it’s the only choice |
Dataflow Gen2 | When you’re on Fabric capacity and don’t need Git integration |
Dataflow Gen2 (CI/CD) | When you’re on Fabric and need Git integration |
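The table above boils down to two yes/no questions, which can be sketched as a small Python function. This is purely illustrative: `recommend_dataflow` and its parameter names are hypothetical, and the logic simply mirrors the three rows of the table rather than any official Microsoft guidance.

```python
def recommend_dataflow(on_fabric_capacity: bool, needs_git_integration: bool) -> str:
    """Map the two questions from the summary table to a Dataflow type."""
    if not on_fabric_capacity:
        # Outside Fabric, Gen1 is the only option, whether or not Git is wanted.
        return "Dataflow Gen1"
    if needs_git_integration:
        return "Dataflow Gen2 (CI/CD)"
    return "Dataflow Gen2"


print(recommend_dataflow(on_fabric_capacity=True, needs_git_integration=True))
```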
For most Microsoft Fabric users, Gen 2 Dataflows are the clear choice. They deliver faster refresh times, better efficiency, and advanced features that Gen 1 simply can’t match. And if Git integration is a requirement, Gen 2 CI/CD offers the perfect blend of speed and DevOps compatibility.