If you are a Power BI power user, you may have wondered: how many versions of Power BI Desktop are there? The quick answer is: it depends!
Depending on your organisation’s preferences, data governance requirements, and the platforms you intend to use for report deployment you may use either Power BI Desktop, the “standard version”, or Power BI Desktop RS (Report Server). Power BI Desktop has variations tailored to meet specific needs, such as cloud-based analytics or on-premises reporting. While many users might only encounter the standard version, there’s another important variant for specialised scenarios.
Power BI Desktop comes in two primary versions:
Power BI Desktop: This is the standard version most users rely on. It’s the go-to tool for transforming data, creating semantic models, and building interactive reports. This version is designed to seamlessly integrate with the Power BI Service hosted on Microsoft Fabric, enabling cloud-based sharing, collaboration, and advanced features like Direct Lake, AI-driven insights, and more. Regular updates ensure that this version includes the latest features and innovations, such as new Power Query and DAX functions, enhanced visuals, and cutting-edge integrations.
Power BI Desktop RS (Report Server): This is a specialised version of Power BI Desktop designed to work exclusively with Power BI Report Server, a locally hosted reporting platform. It is tailored for organisations that prefer to keep their data and reports on-premises due to regulatory, security, or strategic reasons, avoiding reliance on cloud services like the Power BI Service on Microsoft Fabric. Although the two versions look nearly identical in functionality, they serve distinct purposes. Power BI Desktop RS is specifically aligned with the capabilities of Power BI Report Server, supporting features available up to the latest release cycle of the server. For instance, Power BI Desktop RS updates are less frequent; typically released every few months, in line with Power BI Report Server’s update schedule; making it slightly behind the standard version in terms of cutting-edge features. However, it ensures stability and compatibility for on-premises deployments.
Managing data connections in Microsoft Fabric can be challenging if you’re unsure where to start. This blog post and its detailed YouTube video will help you find, manage, and share the existing data connections, making your workflow more efficient and streamlined. A meaningful use case for this feature is to reuse the existing connections leading to more controlled connections to the data sources. More on this later in this blog.
Understanding Data Connections in Microsoft Fabric
In Microsoft Fabric, a data connection links the platform to various data sources, whether in the cloud or on-premises. Different items in Microsoft Fabric, such as Data Factory Pipelines, Dataflows, Paginated reports, Semantic Models, KQL databases, and Mirrored Azure SQL databases (currently in preview), create these data connections.
Finding Data Connections
To find data connections in Microsoft Fabric:
Click on Settings at the top right of the page.
Select Manage connections and gateways.
Navigate to the Connections tab.
This tab displays all the connections shared with you or created by you. From here, you can check the status of each connection, remove old connections, and manage them as needed.
Manage connections and gateways
This page used to be called Manage Gateways where we could configure and manage on-premises data gateways. I have a very old blog post explaining the gateway setup and configuration in the cloud and on your local server here. While it’s an old post, the topics are still relevant, so check it out if you are interested in the gateway configuration.
Note
As the preceding image shows, the Data page is currently in publicPreview, hence, it is subject to change. It is also worthwhile to mention that not all connections are currently accessible via this page such as connections that are natively created by KQL databases within Fabric.
Check Connection Status
To check the connection status, click the status button of each connection. The result shows if the connection is online or offline.
I have been thinking about a mechanism to generate measures from numeric columns on Power BI data models. Of course, we can use Tabular Editor, but it requires some scripting, which is all right. However, the more advanced our requirements get, the more complex the C# script. In real-world development scenarios, it does not make sense to blindly create measures for all numeric columns, such as the key columns used to define relationships between tables, making C# scripting a bit more complex.
In this blog and accompanying YouTube video, I explain using Copilot within Power BI Desktop to create measures from numeric columns. This feature represents a significant advancement in Power BI’s capabilities as of April 2024, enabling data analysts and BI professionals to streamline parts of their data analysis tasks.
Prerequisites
As explained in a previous post here, we first need to enable Copilot on the Fabric Portal. Please note that Copilot in Power BI Desktop requires either Power BI Premium Capacity or AT LEAST an F64 Fabric Capacity. Unfortunately, Copilot is NOT available on PPU, Embedded capacities, Fabric capacities smaller than F64 and Fabric Trial (FT) capacities.
We also need to have the latest version of Power BI Desktop installed on our machine. With that, let’s begin.
YouTube Video
Here is the video on YouTube where I explain the same thing in less than 5 min. But if you are after more details, continue reading.
Introduction to Power BI and Copilot
As Power BI evolves, it incorporates more sophisticated AI-driven capabilities that simplify various aspects of data analytics. The integration of Copilot in Power BI Desktop enhances user interaction with data in many ways. Our focus on this blog is specifically using Copilot to create simple yet crucial measures based on numeric columns that previously required manual effort.
Use Copilot for Measure Creation
Using Copilot is straightforward and demonstrates impressive intelligence in its operational logic. The following steps explain how to do so:
Since May 2023 that Microsoft announced Microsoft Fabric for the first time, Power BI is a part of Microsoft Fabric. Hence, we use the term Microsoft Fabric throughout this post to refer to Power BI or Power BI Service.
The Problem
Implementing incremental refresh on Power BI is usually straightforward if we carefully follow the implementation steps. However in some real-world scenarios, following the implementation steps is not enough. In different parts of my latest book, Expert Data Modeling with Power BI, 2’nd Edition, I emphasis the fact that understanding business requirements is the key to every single development project and data modelling is no different. Let me explain it more in the context of incremental data refresh implementation.
Let’s say we followed all the required implementation steps and we also followed the deployment best practices and everything runs pretty good in our development environment; the first data refresh takes longer, we we expected, all the partitions are also created and everything looks fine. So, we deploy the solution to production environment and refresh the semantic model. Our production data source has substantially larger data than the development data source. So the data refresh takes way too long. We wait a couple of hours and leave it to run overnight. The next day we find out that the first refresh failed. Some of the possibilities that lead the first data refresh to fail are Timeout, Out of resources, or Out of memory errors. This can happen regardless of your licensing plan, even on Power BI Premium capacities.
Another issue you may face usually happens during development. Many development teams try to keep their development data source’s size as close as possible to their production data source. And… NO, I am NOT suggesting using the production data source for development. Anyway, you may be tempted to do so. You set one month’s worth of data using the RangeStart and RangeEnd parameters just to find out that the data source actually has hundreds of millions of rows in a month. Now, your PBIX file on your local machine is way too large so you cannot even save it on your local machine.
This post provides some best practices. Some of the practices this post focuses on require implementation. To keep this post at an optimal length, I save the implementations for future posts. With that in mind, let’s begin.
Best Practices
So far, we have scratched the surface of some common challenges that we may face if we do not pay attention to the requirements and the size of the data being loaded into the data model. The good news is that this post explores a couple of good practices to guarantee smoother and more controlled implementation avoiding the data refresh issues as much as possible. Indeed, there might still be cases where we follow all best practices and we still face challenges.
Note
While implementing incremental refresh is available in Power BI Pro semantic models, but the restrictions on parallelism and lack of XMLA endpoint might be a deal breaker in many scenarios. So many of the techniques and best practices discussed in this post require a premium semantic model backed by either Premium Per User (PPU), Power BI Capacity (P/A/EM) or Fabric Capacity.
The next few sections explain some best practices to mitigate the risks of facing difficult challenges down the road.
Practice 1: Investigate the data source in terms of its complexity and size
This one is easy; not really. It is necessary to know what kind of beast we are dealing with. If you have access to the pre-production data source or to the production, it is good to know how much data will be loaded into the semantic model. Let’s say the source table contains 400 million rows of data for the past 2 years. A quick math suggests that on average we will have more than 16 million rows per month. While these are just hypothetical numbers, you may have even larger data sources. So having some data source size and growth estimation is always helpful for taking the next steps more thoroughly.
Practice 2: Keep the date range between the RangeStart and RangeEnd small
Continuing from the previous practice, if we deal with fairly large data sources, then waiting for millions of rows to be loaded into the data model at development time doesn’t make too much sense. So depending on the numbers you get from the previous point, select a date range that is small enough to let you easily continue with your development without needing to wait a long time to load the data into the model with every single change in the Power Query layer. Remember, the date range selected between the RangeStart and RangeEnd does NOT affect the creation of the partition on Microsoft Fabric after publishing. So there wouldn’t be any issues if you chose the values of the RangeStart and RangeEnd to be on the same day or even at the exact same time. One important point to remember is that we cannot change the values of the RangeStart and RangeEnd parameters after publishing the model to Microsoft Fabric.