Incremental Refresh in Power BI, Part 1: Implementation in Power BI Desktop

Incremental-Refresh-in-Power-BI-Part-1-Implementation-in-Power-BI-Desktop

Incremental refresh, or in short, IR, refers to loading the data incrementally, which has been around in the world of ETL for data warehousing for a long time. Let us discuss incremental refresh (or incremental data loading) in a simple language to better understand how it works.

From a data movement standpoint, there are always two options when we transfer data from location A to location B:

  1. Truncation and load: We transfer the data as a whole from location A to location B. If location B has some data already, we entirely truncate the location B and reload the whole data from the location A to B
  2. Incremental load: We transfer the data as a whole from location A to location B just once for the first time. The next time, we only load the data changes from A to B. In this approach, we never truncate B. Instead, we only transfer the data that exists in A but not in B

When we refresh the data in Power BI, if we have not configured an incremental refresh, we use the first approach, which is truncation and load. Needless to say that in Power BI, the first approach only applies to tables with Import or Dual storage modes. Previously, the Incremental load was available only in the tables with either Import or Dual storage modes. But the new announcement from Microsoft about Hybrid Tables makes a big difference in how Incremental load works. With the Hybrid Tables, the Incremental load is available on a portion of the table when a specific partition is in Direct Query mode, while the rest of the partitions are in Import storage mode.

Incremental refresh used to be available only on Premium capacities, but from Feb 2020 onwards, it is also available in Power BI Pro with some limitations. However, the Hybrid Tables are currently available on Power BI Premium Capacity and Premium Per User (PPU) and not Pro. Let’s hope that Microsft will change its licensing plan for the Hybrid Tables in the future and make it available in Pro.

I will write about Hybrid Tables in a future blog post.

When we successfully configure the incremental refresh policies in Power BI, we always have two ranges of data; the historical range and the incremental range. The historical range includes all data processed in the past, and the incremental range is the current range of data to process. Incremental refresh in Power BI always looks for data changes in the incremental range, not the historical range. Therefore, the incremental refresh will not notice any changes in the historical data. When we talk about the data changes, we are referring to new rows inserted, updated or deleted, however, the incremental refresh detects updated rows as deleting the rows and inserting new rows of data.

Benefits of Incremental Refresh

Configuring incremental refresh is beneficial for large tables with hundreds of millions of rows. The following are some benefits of configuring incremental refresh in Power BI:

  • The data refreshes much faster than when we truncate and load the data as the incremental refresh only refreshes the incremental range
  • The data refresh process is less resource-intensive than refreshing the entire data all the time
  • The data refresh is less expensive and more maintainable than the non-incremental refreshes over large tables
  • The incremental refresh is inevitable when dealing with massive datasets with billions of rows that do not fit into our data model in Power BI Desktop. Remember, Power BI uses in-memory data processing engine; therefore, it is improbable that our local machine can handle importing billions of rows of data into the memory

Now that we understand what incremental refresh is, let us see how it works in Power BI.

Implementing Incremental Refresh Policies with Power BI Desktop

We currently can configure incremental refresh in the Power BI Desktop and in Dataflows contained in a Premium Workspace. In this blog post, we look at the incremental refresh implementation within the Power BI Desktop.

After we successfully implement the incremental refresh policies with the desktop, we publish the model to Power BI Service. The first data refresh takes longer as we transfer all data from the data source(s) to Power BI Service for the first time. After the first load, all future data refreshes will be incremental.

How to Implement Incremental Refresh

Implementing incremental refresh in Power BI is simple. There are two generic parts of the implementation:

  1. Preparing some prerequisites in Power Query and defining incremental policies in the data model
  2. Publishing the model to Power BI Service and refreshing the dataset

Let’s briefly get to some more details to quickly understand how the implementation works.

  • Preparing Prerequisites in Power Query
    • We require to define two parameters with DateTime data type in Power Query Editor. The names for the two parameters are RangeStart and RangeEnd, which are reserved for defining incremental refresh policies. As you know, Power Query is case sensitive, so the names of the parameters must be RangeStart and RangeEnd.
    • The next step is to filter the table by a DateTime column using the RangeStart and RangeEnd parameters when the value of the DateTime column is between RangeStart and RangeEnd.

Notes

  • The data type of the parameters must be DateTime
  • The datat tpe of the column we use for incremental refresh must be Int64 (integer) Date or DateTime.Therefore, for scenarios that our table has a smart date key instead of Date or DateTime, we have to convert the RangeStart and RangeEnd parameters to Int64
  • When we filter a table using the RangeStart and RangeEnd parameters, Power BI uses the filter on the DateTime column for creating partitions on the table. So it is important to pay attention to the DateTime ranges when filtering the values so that only one filter condition must have an “equal to” on RangeStart or RangeEnd, not both
Continue reading “Incremental Refresh in Power BI, Part 1: Implementation in Power BI Desktop”

Download Power BI Reports from Power BI Service

Download report from Power BI Service

A new cool feature added to Power BI Service is the ability to download Power BI reports from Power BI Service. This feature is highly demanded and it’s available from November 2016. I was really excited when I noticed that and I had to try it straight away. I was in a bus back to home on Friday, but, I couldn’t wait until I get home and test this cool feature. So I created a personal hotspot and started testing it in the bus. To make the level of my excitement clearer, I should reveal a secret. I get motion sick in the bus very quickly. It gets worth when I read something, even reading a text on my mobile. Man, it’s really horrible feeling. Knowing that I’ll potentially get sick, I turned on my tablet (a Windows 10 tablet of course) to test this new cool feature. So I logged into my Power BI Service account, I opened a report, clicked File menu and this is what I got

Inactive Download report from Power BI Service

But, why?

Two possibilities jumped into my head immediately:

  • The dataset of this particular report is not supported at the moment
  • The “Download report” feature is NOT supported in my area

So I opened Power BI Desktop and created a report on top of an Excel file very quickly, then I published it to the service and voila! It worked. So it is also available in my area.

 

Download report from Power BI Service

But, what was wrong with the previous report though? The dataset?

I checked the report’s dataset, it was on-premises SQL Server. Could it be a problem?

I created another Power BI report in Power BI Desktop on top of adventure works on SQL Server 2016. I published the model and interestingly the download report feature was still active. So how on earth I shouldn’t be able to download that report?

Well, I was in the bus, wobble about and I was feeling that the motion sickness symptom is coming for me and there were a bunch of “whys” in my head.

So I had to experiment some other datasets as well. I tested the following datasets:

  1. CSV files
  2. Folder
  3. SQL Server Direct Query
  4. SQL Server Analysis Services (SSAS) Multidimensional (Connect Live)
  5. SQL Server Analysis Services Tabular
  6. From Web
  7. Azure SQL Database
  8. Azure SQL Data Warehouse

Continue reading “Download Power BI Reports from Power BI Service”

Azure SQL Data Warehouse and Power BI

Azure SQL Data Warehouse and Power BI

Without a doubt cloud computing is going to change the future of data analytics and data visualisation very significantly. Microsoft Azure SQL Data Warehouse recently released for public preview. Combining Power BI as a powerful data visualisation tool with Azure SQL Data Warehouse will give the users the ability to see data insights of their data stored in Azure Data Warehouse very easily. In this post I explain how to install Azure SQL Data Warehouse and the the way it works with Power BI. Before going any further I’d like to have a look at the Azure SQL Data Warehouse very briefly.

What Is Azure SQL Data Warehouse?

Based on Microsoft documentation a SQL Data Warehouse is

Azure SQL Data Warehouse is an enterprise-class distributed database capable of processing petabyte volumes of relational and non-relational data.

Azure SQL Data Warehouse supports stored procedures, user-defined functions, indexes and collations. It uses columnstore index technology which significantly improves query performance as well as getting you up to 5 times compression in compare with traditional row based indexing.

I leave it to you learn more about Azure SQL Data Warehouse. But, it is important to keep in mind that there are some features like primary keys and foreign keys that are NOT supported in Azure SQL Data Warehouse which affect the way we use Power BI as a data visualisation tool over Azure SQL Data Warehouse. Without primary keys and foreign keys there is no physical relationships between the tables so Power BI service cannot detect any relationships by itself. There is a workaround for this that we can create some SQL views in Azure side to make it work. This can be an expensive solution. The other way is to load the data warehouse into a Power BI Desktop model which can detect the relationships automatically.

Now you know a bit bout Azure SQL Data Warehouse let’s get back to the subject and talk more about Power BI and Azure SQL Data Warehouse.

First things first. You need to have a Microsoft Azure subscription. If you don’t already have it you can use it for a one month trial here. You’ll also get $250 credit. But, remember that if you succeed the $250 in less than a month then you’ll need to pay for it if you want to use it longer.

Install Azure SQL Data Warehouse

After you get your Azure subscription, login to your account and you should see a dashboard like this

Install Azure SQL Data Warehouse

I’m not going to explain the above dashboard as it is out of scope of this article.

  • Click New

Install Azure SQL Data Warehouse 01

  • Click “Data + Storage” then click “SQL Data Ware House”

Install Azure SQL Data Warehouse 02

  • Enter a name for your database
  • Select a performance value

Continue reading “Azure SQL Data Warehouse and Power BI”

Power BI and Dynamics CRM

Dynamics CRM and Power BI

Microsoft is building lots of cloud based technologies these days and Microsoft Dynamics CRM is not an exception. With a powerful data visualisation tool like Power BI Microsoft took a great step to integrate Dynamics CRM and Power BI which means you can easily connect from Power BI website and Power BI Desktop to a cloud based Microsoft Dynamics CRM instance. In this article I express a step-by-step tutorial to be able to connect to connect to Dynamics CRM from Power BI website as well as Power BI Desktop.

Prerequisites:

  • You can only connect to Dynamics CRM Online (Cloud based Dynamics CRM) from both Power BI Desktop and Power BI website. If you have an older version on-premises Dynamics CRM and you’re willing to create visualisations on Power BI then you need to connect to the CRM database on SQL Server instance just like any other SQL Server databases.
  • You need to have a valid OData URL for a Dynamics CRM Online instance and an administrator must enable the OData endpoint in the CRM site settings. To find the OData endpoint address:
    • After browsing your CRM Online in a browser click “Customizations” from “Settings”

Dynamics CRM OData for Power BI

  • Click “Developer Resources”

Dynamics CRM OData for Power BI 2

  • Scroll down and then you can see OData URL under “Organization Data Service”

Dynamics CRM OData for Power BI 3

  • You should connect to Dynamics CRM Online using the same user account as your Power BI website. So if you have a different Power BI account then unfortunately you need to create a new account in Power BI which is identical to your CRM account.
  • Your browser’s popup blocker should be disabled or you should exclude https://login.microsoftonline.com from your popup blocker.
  • As Microsoft Dynamics CRM is integrated with Power BI you need to have Office 365 subscription. If you don’t have Dynamics CRM Online and Office 365 subscription, but, you’re willing to learn how Power BI and Dynamics CRM Online work together you can sign up for a 30-day trial of Microsoft Dynamics CRM here. There is also a trial guide for Microsoft Dynamics CRM Online and Office 365 here.
  • Continue reading “Power BI and Dynamics CRM”