Understanding Power BI Totals: The Math, the Model, and the Misconceptions

The long-running debate around how Power BI calculates totals in tables and matrices has been part of the community conversation for years. Greg Deckler has kept the topic alive through his ongoing “broken totals” posts on social media, often suggesting that Power BI should include a simple toggle to make totals behave more like Excel. His continued campaign prompted a detailed reply from Daniel Otykier in his article No More Measure Totals Shenanigans, and earlier, Diego Scalioni explored how DAX evaluates totals internally in his post Cache me if you can: DAX Totals behind the scenes.

This blog brings all those perspectives together from a scientific and comparative angle. It looks at how totals are calculated in Power BI and compares that behaviour with Tableau, Excel, Paginated Reports, QlikView and even T-SQL. The goal is not to take sides, but to clear up the confusion around what is happening under the hood.

If you are into podcasts and prefer the audio version of this blog, I have got you covered. Here is an AI-generated podcast for this post. 👇

Power BI’s Broken Totals – Myth Debunked

Are Power BI Totals Really Broken?

Let’s get one thing clear right at the start: no, Power BI totals are not broken. There is no “it depends” this time. What some interpret as broken behaviour is actually how DAX and the underlying model are designed to work.

This post is not personal; it is purely scientific and technical. While I have great respect for Greg and his significant contributions to the Power BI community, I disagree with the use of the word “BROKEN.” It sounds dramatic but does not reflect the full truth. Totals in Power BI behave exactly as the model and the maths define them to. Want to know why? Keep reading.

Why this matters

When someone with Greg’s influence keeps saying totals are “broken”, it really affects how new users see Power BI. Some even start thinking the tool itself is not reliable, when what they are actually seeing is that different reporting tools calculate their totals in different ways.

It helps to know the main calculation styles that these tools use:

  • Cell based: This is what you get in worksheet formulas and classic PivotTables that use Excel ranges. Totals are just simple sums of the shown items, with no model or relationships behind the scenes.
  • Model driven: This is how Power BI works and also Excel PivotTables that use the Data Model (Power Pivot) or connect to a tabular dataset. Measures are calculated again for every context, so totals depend on how filters and relationships are set.
  • Query driven: Tools like Paginated Reports work this way. The report runs a query, for example SQL or DAX, gets the dataset, and then sums or averages values in the report design. The author decides how each total should be calculated.
  • Hybrid (query and context driven): Tableau fits in here. It gets the data through a query but also lets you change the level of detail and how totals behave in the visual. So sometimes it acts like a query tool and sometimes more like a model one.

Most of the confusion happens when people compare results from these tools as if they all worked the same way. Once you understand the difference between cell based, model driven, query driven, and hybrid tools, the way Power BI shows its totals starts to make full sense.
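To make the model-driven behaviour concrete, here is a minimal DAX sketch. The measure name and the SampleData table are purely illustrative, not taken from Greg’s example:

Average Amount = AVERAGE ( SampleData[Amount] )
-- In a visual grouped by Category, each category row evaluates this measure against
-- that category's rows only, while the total row evaluates the same expression against
-- all the rows the filters allow. The total is therefore the overall average, not the
-- sum (or average) of the category-level results, which is what a cell-based tool
-- would show if it simply totalled the displayed cells.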

The problem that started it

Greg’s long-running example uses a small table with a single column of numbers and a DAX measure like this:

SUMX(SampleData, SampleData[Amount]) - 10

In the total row, the result shows 590, while he expects 580 (two groups of 290 each). Based on that, he argues that Power BI totals are “wrong”.

But DAX is only doing what it is told to do. In this measure, the subtraction of 10 happens after SUMX has added up the Amount for every row in the current filter context, not once per row. If the intention was to take 10 away per row, then the measure should be written like this:

SUMX(SampleData, SampleData[Amount] - 10)

This version gives the expected 580 because the subtraction now happens at the lowest level of detail, which is per row.

This might look like a small detail, but it is exactly where most of the confusion around totals begins. The difference is not about Power BI being wrong; it is about understanding where in the calculation the operation happens.
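For completeness, if the intention is instead to subtract 10 once per visible group, the measure can iterate over the groups explicitly. The following is only a sketch of that pattern, assuming the visual groups by a Category column as in the sample table in the next section:

SUMX (
    VALUES ( SampleData[Category] ),                -- one row per category visible in the current filter context
    CALCULATE ( SUM ( SampleData[Amount] ) ) - 10   -- subtract 10 once for each of those categories
)

With a table where each category sums to 300, this returns 290 per category and 580 at the total, because the total row now iterates over both categories and subtracts 10 for each of them.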

The math behind it

Before we look at the numbers, let’s first talk about what we are trying to do. We use Greg’s small and very simple table, which shows some amounts by Category and Colour:

Category    Colour    Amount
A           Red       100
A           Green     100
A           Blue      100
B           Red       100
B           Green     100
B           Blue      100
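To see why the original measure produces 590, here is the arithmetic against this table, written out as a quick sketch:

-- Category A : 100 + 100 + 100 = 300   ->  300 - 10 = 290
-- Category B : 100 + 100 + 100 = 300   ->  300 - 10 = 290
-- Total row  : all six rows  = 600     ->  600 - 10 = 590, not 290 + 290 = 580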
Continue reading “Understanding Power BI Totals: The Math, the Model, and the Misconceptions”

Endorsement in Power BI, Part 2, How to Endorse?

In the previous post I explained the basic concepts around endorsement in Power BI. We discussed that users’ ability to collaborate in creating and sharing artifacts is one of the key aspects of the user experience in Power BI. But in large organisations it would be hard, if not impossible, to judge an artifact’s quality without a mechanism for identifying it. Endorsement is the answer to this challenge. We discussed the following in the previous post:

In this post, I explain the following:

How do Power BI administrators enable certification and grant rights to security groups?

In the previous post, we discussed that a Power BI administrator must enable certification and grant sufficient rights to the relevant security groups, so that all members of the specified security group are authorised to certify artifacts. If you are a Power BI administrator, follow these steps to do so:

  1. After logging into Power BI Service, click the Settings button
  2. Click Admin Portal
  3. From the Tenant settings, scroll down to find the Export and sharing settings
  4. Find and expand the Certification setting
  5. Enable certification
  6. Enter the certification process documentation URL (if any)
  7. It is not recommended to enable this feature for the entire organisation, so select the Specific security groups option
  8. Type the security group name and select it from the list
  9. Click the Apply button

The following image shows the above steps:

Enabling certification from the Admin Portal in Power BI Service

It may take up to 15 minutes for the changes to go through. After that, all the members of the specified security group can certify the artifacts. In the next section, we see how to certify the supported artifacts.

Note

Everyone who has “write” permission on the Workspace containing the artifact can promote it. Therefore, the users or security groups with one of the Admin, Member, or Contributor roles in the Workspace can promote the artifacts.

However, no one should promote an artifact just because they can. Organisations usually have a promotion process to follow, but the boundaries around promoting an artifact are often much more relaxed than those around certifying it.

Continue reading “Endorsement in Power BI, Part 2, How to Endorse?”

Business Intelligence Components and How They Relate to Power BI

When I decided to write this blog post, I thought it would be a good idea to learn a bit about the history of Business Intelligence. I searched on the internet, and I found this page on Wikipedia. The term Business Intelligence as we know it today was coined by an IBM computer science researcher, Hans Peter Luhn, who in 1958 wrote a paper in the IBM Systems journal titled A Business Intelligence System, describing it as a specific process in data science. In the Objectives and principles section of his paper, Luhn defines business as “a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera”, and an intelligence system as “the communication facility serving the conduct of a business (in the broad sense)”. He then refers to Webster’s dictionary’s definition of the word Intelligence as “the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal”.

It is fascinating to see how a fantastic idea from the past can shape a future that helps us have a better life. Isn’t that precisely what we do in our daily BI processes, just as Luhn described a Business Intelligence System for the first time? How cool is that?

When we talk about the term BI today, we refer to a specific and scientific set of processes that transform raw data into valuable and understandable information for various business sectors (such as sales, inventory, law, etc.). These processes help businesses make data-driven decisions based on the facts hidden in their data.

Like everything else, BI processes have improved a lot over their lifetime. In this post, I will try to make some sensible links between today’s BI components and Power BI.

Generic Components of Business Intelligence Solutions

Generally speaking, a BI solution contains various components and tools that may vary from solution to solution, depending on the business requirements, data culture and the organisation’s maturity in analytics. But the processes are very similar to the following:

  • We usually have multiple source systems with different technologies containing the raw data, such as SQL Server, Excel, JSON, Parquet files etc…
  • We integrate the raw data into a central repository to reduce the risk of interrupting the source systems by constantly connecting to them. We usually load the data from the data sources into this central repository.
  • We transform the data to optimise it for reporting and analytical purposes, and we load it into another storage. We aim to keep the historical data in this storage.
  • We pre-aggregate the data into certain levels based on the business requirements and load the data into another storage. We usually do not keep the whole historical data in this storage; instead, we only keep the data required to be analysed or reported.
  • We create reports and dashboards to turn the data into useful information.

With the above processes in mind, a BI solution consists of the following components:

  • Data Sources
  • Staging
  • Data Warehouse/Data Mart(s)
  • Extract, Transform and Load (ETL)
  • Semantic Layer
  • Data Visualisation

Data Sources

One of the main goals of running a BI project is to enable organisations to make data-driven decisions. An organisation might have multiple departments using various tools to collect the relevant data every day, such as sales, inventory, marketing, finance, health and safety etc.

The data generated by the business tools are stored somewhere using different technologies. A sales system might store the data in an Oracle database, while the finance system stores the data in a SQL Server database in the cloud. The finance team also generate some data stored in Excel files.

The data generated by different systems are the source for a BI solution.

Staging

We usually have multiple data sources contributing to the data analysis in real-world scenarios. To be able to analyse all the data sources, we require a mechanism to load the data into a central repository. The main reason is that the business tools need to constantly store data in their underlying storage, so frequent connections to the source systems can put our production systems at risk of becoming unresponsive or performing poorly. The central repository where we store the data from various data sources is called Staging. We usually store the data in staging with no or minor changes compared to the data in the data sources. Therefore, the quality of the data stored in staging is usually low and requires cleansing in the subsequent phases of the data journey. In many BI solutions, we use staging as a temporary environment, so we delete the staging data regularly after it has been successfully transferred to the next stage, the data warehouse or data marts.

If we want to indicate the data quality with colours, it is fair to say the data quality in staging is Bronze.

Data Warehouse/Data Mart(s)

As mentioned before, the data in staging is not in its best shape and format. It is generated disparately by multiple data sources, so analysing it and creating reports on top of the data in staging would be challenging, time-consuming and expensive. We therefore need to find the links between the data sources, cleanse, reshape and transform the data, and make it better optimised for data analysis and reporting activities. We store the current and historical data in a data warehouse, so it is pretty normal to have hundreds of millions or even billions of rows of data over a long period. Depending on the overall architecture, the data warehouse might contain encapsulated business-specific data in a data mart or a collection of data marts. In data warehousing, we use different modelling approaches, such as Star Schema. As mentioned earlier, one of the primary purposes of having a data warehouse is to keep the history of the data. This is a massive benefit of having a data warehouse, but this strength comes with a cost: as the volume of data in the data warehouse grows, analysing it becomes more expensive. The data quality in the data warehouse or data marts is Silver.

Extract, Transform and Load (ETL)

In the previous sections, we mentioned that we integrate the data from the data sources in the staging area, then we cleanse, reshape and transform the data and load it into a data warehouse. To do so, we follow a process called Extract, Transform and Load or, in short, ETL. As you can imagine, the ETL processes are usually pretty complex and expensive, but they are an essential part of every BI solution.

Continue reading “Business Intelligence Components and How They Relate to Power BI”

Quick Tips: How to Enable Dataflows In Power BI Service

Dataflows in Power BI Service

Dataflows (Preview) in Power BI Service landed yesterday (6th November 2018). I had a bit of difficulty enabling this cool new feature, so I thought it would be good to write a quick tip about it. While Dataflows is still in preview at the time of writing, the situation may be totally different in the future.

Straight away, fully featured Dataflows are available in a Power BI Premium capacity or a Power BI Embedded capacity. While the feature is still in preview, you can also take advantage of a limited set of features with your Power BI Pro license. Features like “Linked entities from other dataflows” or “Computed Entities”, such as merging tables into a new table, are not available with a Power BI Pro license.

Dataflows Computed Entities

Enabling Dataflows

  • After signing in to Power BI Service, click “Settings”
  • Click “Admin Portal”

Power BI Service Admin Portal

  • Select the capacity type you are in, either Premium or Embedded
  • Click the capacity for which you would like to enable Dataflows

Managing a Premium Capacity in Power BI Admin Portal

  • Scroll down to find and click “Workloads” under “More Options”
  • Enable “Dataflows (Preview)”
  • If you stick to the default “Max Memory (%)” value, which is set to 20, you will get an error message saying “There was an issue updating your workload setting. Try again in a little while”. The error message is not helpful at all. The reason you get it is that the “Max Memory (%)” value must be a number between 27 and 100, while the default is 20.

Enabling Dataflows in Power BI Service

Continue reading “Quick Tips: How to Enable Dataflows In Power BI Service”