Incremental Refresh in Power BI, Part 3: Best Practices for Large Semantic Models


In the two previous posts of the Incremental Refresh in Power BI series, we learned what incremental refresh is, how to implement it, and the best practices for safely publishing semantic model changes to Microsoft Fabric (aka Power BI Service). This post focuses on a couple more best practices for implementing incremental refresh on large semantic models in Power BI.

Note

Since Microsoft first announced Microsoft Fabric in May 2023, Power BI has been a part of Microsoft Fabric. Hence, we use the term Microsoft Fabric throughout this post to refer to Power BI or Power BI Service.

The Problem

Implementing incremental refresh in Power BI is usually straightforward if we carefully follow the implementation steps. However, in some real-world scenarios, following the implementation steps is not enough. In different parts of my latest book, Expert Data Modeling with Power BI, 2nd Edition, I emphasise the fact that understanding business requirements is the key to every single development project, and data modelling is no different. Let me explain this in the context of implementing incremental data refresh.

Let’s say we followed all the required implementation steps and the deployment best practices, and everything runs pretty well in our development environment; the first data refresh takes longer, as we expected, all the partitions are created, and everything looks fine. So, we deploy the solution to the production environment and refresh the semantic model. Our production data source holds substantially more data than the development data source, so the data refresh takes way too long. We wait a couple of hours and then leave it to run overnight. The next day we find out that the first refresh failed. Some of the possible causes of the first data refresh failing are Timeout, Out of resources, or Out of memory errors. This can happen regardless of your licensing plan, even on Power BI Premium capacities.

Another issue you may face usually happens during development. Many development teams try to keep the size of their development data source as close as possible to their production data source. And… NO, I am NOT suggesting using the production data source for development. Anyway, you may be tempted to do so. You set one month’s worth of data using the RangeStart and RangeEnd parameters just to find out that the data source actually has hundreds of millions of rows in a month. Now, your PBIX file is so large that you cannot even save it on your local machine.

This post provides some best practices. Some of the practices this post focuses on require implementation. To keep this post at an optimal length, I save the implementations for future posts. With that in mind, let’s begin.

Best Practices

So far, we have scratched the surface of some common challenges that we may face if we do not pay attention to the requirements and the size of the data being loaded into the data model. The good news is that this post explores a couple of good practices that help towards a smoother and more controlled implementation, avoiding data refresh issues as much as possible. That said, there might still be cases where we follow all the best practices and still face challenges.

Note

While incremental refresh is available in Power BI Pro semantic models, the restrictions on parallelism and the lack of an XMLA endpoint might be a deal breaker in many scenarios. So many of the techniques and best practices discussed in this post require a premium semantic model backed by either Premium Per User (PPU), Power BI Capacity (P/A/EM) or Fabric Capacity.

The next few sections explain some best practices to mitigate the risks of facing difficult challenges down the road.

Practice 1: Investigate the data source in terms of its complexity and size

This one sounds easy, but it is not really. It is necessary to know what kind of beast we are dealing with. If you have access to the pre-production or production data source, it is good to know how much data will be loaded into the semantic model. Let’s say the source table contains 400 million rows of data for the past 2 years. Some quick math suggests that, on average, we will have more than 16 million rows per month. While these are just hypothetical numbers, you may have even larger data sources. So having an estimate of the data source’s size and growth is always helpful for taking the next steps more thoroughly.
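The quick math above can be sketched in a few lines of Python. This is only an illustration using the hypothetical numbers from this post; the function name is made up for the example, and in practice the row count would come from a query against your own data source.

```python
def average_rows_per_month(total_rows: int, months: int) -> float:
    """Return the average number of rows per month over the given period."""
    if months <= 0:
        raise ValueError("months must be a positive number")
    return total_rows / months


# Hypothetical numbers from the post: 400 million rows over the past 2 years
rows_per_month = average_rows_per_month(400_000_000, 24)
print(f"Average rows per month: {rows_per_month:,.0f}")
```

With these numbers, the average comes to roughly 16.7 million rows per month, which is the "more than 16 million" mentioned above.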

Practice 2: Keep the date range between the RangeStart and RangeEnd small

Continuing from the previous practice, if we deal with fairly large data sources, then waiting for millions of rows to be loaded into the data model at development time doesn’t make too much sense. So depending on the numbers you get from the previous point, select a date range that is small enough to let you easily continue with your development without needing to wait a long time to load the data into the model with every single change in the Power Query layer. Remember, the date range selected between the RangeStart and RangeEnd does NOT affect the creation of the partition on Microsoft Fabric after publishing. So there wouldn’t be any issues if you chose the values of the RangeStart and RangeEnd to be on the same day or even at the exact same time. One important point to remember is that we cannot change the values of the RangeStart and RangeEnd parameters after publishing the model to Microsoft Fabric.
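One way to pick a small enough range is to start from a row budget that your machine handles comfortably and work backwards. The following Python snippet is a rough sketch of that idea, not part of the incremental refresh feature itself; the function name, the row budget and the even-rows-per-day assumption are all hypothetical.

```python
from datetime import datetime, timedelta


def development_range(range_end: datetime, rows_per_day: float, row_budget: int):
    """Suggest RangeStart and RangeEnd values that stay within a row budget.

    Assumes the rows are spread evenly across days, which is a simplification.
    """
    if rows_per_day <= 0:
        raise ValueError("rows_per_day must be positive")
    # Whole days of data that fit in the budget (0 means start and end are equal)
    days = int(row_budget // rows_per_day)
    return range_end - timedelta(days=days), range_end


# Hypothetical: 400 million rows over about 730 days, and a 2 million row budget
start, end = development_range(datetime(2024, 1, 31), 400_000_000 / 730, 2_000_000)
print(f"RangeStart: {start:%Y-%m-%d}, RangeEnd: {end:%Y-%m-%d}")
```

If the budget is smaller than one day's worth of rows, the function returns the same value for both parameters, which is acceptable given that the selected range does not affect partition creation after publishing.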

Continue reading “Incremental Refresh in Power BI, Part 3: Best Practices for Large Semantic Models”

Microsoft Fabric: Overcome Reaching the Maximum Number of Fabric Trial Capacities


If you are evaluating Microsoft Fabric and do not currently own a Premium Capacity, chances are you’re using Microsoft Fabric Trial Capacities. All Power BI users within an organisation, or specific security groups that have been given the rights, can opt into Fabric Trial Capacities. Therefore, you may already have several Fabric Trial Capacities in your tenant. Your Fabric Administrators can control who can opt into the Fabric Trial capacities from the Fabric Admin Portal, in the Help and support settings section, by enabling the Users can try Microsoft Fabric paid features setting, as shown in the following image:

Enable Users can try Microsoft Fabric paid features for specific security groups via Fabric Admin Portal

The authorised users can then opt into Fabric Trial by following this process:

  1. Click the Account Manager on the top right corner of the page
  2. Click the Start trial button
  3. Click the Start trial button again
  4. Provide the required details
  5. Click the Extend my free trial button

The following image shows the preceding steps:

Start Fabric Free Trial

As you see, opting into Fabric Trial is simple, unless it isn’t!

There are cases where authorised users cannot start their Fabric Trial because their tenant has already exceeded the limit of available trial capacities. In that case, the users get the following message:

Continue reading “Microsoft Fabric: Overcome Reaching the Maximum Number of Fabric Trial Capacities”

Microsoft Fabric: Use Copilot to Generate Data Model Synonyms


One of my older posts explains how to enable Copilot on Fabric and how to use Copilot to generate Power BI reports. In this post, I aim to explain yet another use case for Copilot that can help us build a better and more useful semantic model in Power BI using synonyms. In an old post published in May 2016, I explained how to use Power BI synonyms to take our Power BI Q&A experience to another level. In that post, I explained how we could use synonyms to translate data model objects into different languages so the end-user could ask questions in their native language and get the results in Power BI. That was such a cool use case for synonyms, wasn’t it? Fast forward to December 2023, I believe Q&A is still one of the coolest Power BI features and stands out when demoing solutions to customers; therefore, it makes absolute sense to use synonyms to improve Q&A’s efficiency and accuracy. This blog post explores the possibility of using Copilot to define synonyms in Power BI Desktop.

Prerequisites

As explained here, we first need to enable Copilot on Fabric Service. Please note that the technique explained in this post requires either a Power BI Premium capacity or a Fabric capacity of at least F64. It won’t work on PPU, Embedded capacities, Fabric capacities smaller than F64, or Fabric Trial (FT) capacities.
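To make the capacity requirement easier to reason about, here is a small Python sketch that encodes the rule stated above. It is only an illustration: the function name is made up, and the SKU labels (such as "F64", "P1", "EM3", "PPU" or "FT1") are assumed naming conventions for the example rather than values read from any Fabric API.

```python
import re


def meets_copilot_capacity_requirement(sku: str) -> bool:
    """Check a capacity SKU label against the requirement described in this post.

    Assumed label format: letters followed by an optional size, e.g. "P1" or "F64".
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)?", sku.strip())
    if match is None:
        return False
    family = match.group(1).upper()
    size = int(match.group(2)) if match.group(2) else 0
    if family == "P":   # Power BI Premium capacities (P1 or higher)
        return size >= 1
    if family == "F":   # Fabric capacities (F64 or higher)
        return size >= 64
    # Anything else (PPU, Embedded, Fabric Trial) does not qualify
    return False


for label in ["P1", "F64", "F32", "PPU", "EM3", "FT1"]:
    print(label, meets_copilot_capacity_requirement(label))
```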

We also need to have the latest version of Power BI Desktop installed on our machine. With that, let’s begin.

Using Power BI Copilot to generate synonyms

While defining synonyms for the semantic model objects significantly helps with the Q&A experience, it is still a cumbersome process if done manually. So, if we meet the prerequisites, we can summon Copilot to the rescue. Follow these steps after opening a Power BI file in Power BI Desktop:

  1. Ensure you’re signed into Fabric service with your account
  2. Click the Insert tab
  3. Select the Q&A visual
  4. On the Q&A visual, click the Q&A Setup button shown with a gear icon
  5. On the Q&A Setup window, you should see a message offering to “Improve Q&A with synonyms from Copilot” at the top of the window; click the Add synonyms button

The following image shows the preceding steps:

Improve Q&A with synonyms from Power BI Copilot in Microsoft Fabric
Continue reading “Microsoft Fabric: Use Copilot to Generate Data Model Synonyms”

Microsoft Fabric: Generating Reports with Copilot


In Nov 2023, Microsoft announced Microsoft Fabric’s general availability and Public Preview of Copilot in Microsoft Fabric. In a previous post, I explained what Copilot means to Power BI developers, which is valid for other Fabric developers such as data engineers and data scientists as Copilot for Fabric helps with those experiences as well. But the main focus of this blog post is to discuss the requirements, how to enable Copilot, and how to use it from a Power BI development point of view. So, this blog will not discuss other aspects of Copilot in Microsoft Fabric. With that, let’s begin.

Requirements

Right off the bat, Copilot is only available on Power BI Premium capacities or their equivalent Fabric capacities. So, NO, it is NOT available on Power BI Pro, Premium Per User or Power BI Embedded Analytics. The Power BI items you want to use Copilot on must be in a Workspace assigned to a Power BI Premium P1 capacity or a Microsoft Fabric F64 capacity, or higher.

You also need to have a Contributor role on the premium workspace.

To use Copilot, your Microsoft Fabric Administrator must enable it from the Fabric Admin Portal. This setting is not available in all regions yet, but Microsoft is gradually rolling it out to more regions.

Enabling Copilot on Fabric Admin Portal

As mentioned before, your Fabric Administrator must enable Copilot features within the Admin Portal. Follow these steps to enable Copilot on your tenant after logging into Microsoft Fabric:

  1. Click Settings (the gear icon on the top right of the page)
  2. Click Admin portal
  3. Ensure that the Tenant settings tab is selected
  4. Scroll all the way down to the Copilot and Azure OpenAI Service (preview) section

Note

You can also use the search box and search for OpenAI to find the Copilot and Azure OpenAI Service (preview) section.

  5. Enable the Users can use a preview of Copilot and other features powered by Azure OpenAI setting
  6. Click the Apply button
  7. Enable the Data sent to Azure OpenAI can be processed outside your tenant’s geographic region, compliance boundary, or national cloud instance setting
  8. Click the Apply button again

That is it. You enabled the Copilot capabilities on your tenant.

The following image shows the preceding steps:

Enabling Copilot for Power BI in Fabric Service Admin Portal
Continue reading “Microsoft Fabric: Generating Reports with Copilot”