Optimising OData Refresh Performance in Power Query for Power BI and Excel

OData has been around for many years and has been adopted by many software solutions. Most of those solutions use OData to serve their transactional processes. But, as we know, Power BI is an analytical solution that can fetch hundreds of thousands (or millions) of rows of data in a single table, and OData is not optimised for that kind of workload. One of the biggest challenges many Power BI developers face when working with OData connections is performance. The performance depends on numerous factors, such as the size of the tables in the backend database that the OData connection is serving, the peak read volume over periods of time, and any throttling mechanism that controls over-utilisation of resources.

So, generally speaking, we do not expect blazing-fast data refresh performance over OData connections, which is why using OData connections for analytical tools such as Power BI is discouraged in many cases. So, what are the alternatives if we do not use OData connections in Power BI? Well, the best solution is to migrate the data into an intermediary repository, such as Azure SQL Database, Azure Data Lake Store or even a simple Azure Storage Account, and then connect from Power BI to that repository. We must decide on the intermediary repository depending on the business requirements, technology preferences, costs, desired data latency, future support requirements and available expertise.

But what if we do not have any other options for now, and we have to use an OData connection in Power BI without blowing out the size and cost of the project by moving the data to an intermediary space? And let’s face it, many organisations dislike the idea of using an intermediary space for various reasons. The simplest one is that they cannot afford the associated costs of intermediary storage, or they do not have the expertise to support the solution in the long term.

In this post, I am not discussing the solutions involving any alternatives; instead, I provide some tips and tricks that can improve the performance of your data refreshes over OData connections in Power BI.

Notes

The tips in this post will not give you blazing-fast data refresh performance over OData, but they will help you to improve the data refresh performance. So if you take all the actions explained in this post and you still do not get an acceptable performance, then you might need to think about the alternatives and move your data into a central repository.

If you are getting data from a D365 data source, you may want to look at some alternatives to the OData connection, such as Dataverse (SQL Endpoint), D365 Dataverse (Legacy) or Common Data Services (CDS). But keep in mind that even those connectors have some limitations and might not give you acceptable data refresh performance. For instance, Dataverse (SQL Endpoint) has an 80MB table size limitation. There might be other reasons for poor performance over those connections, such as extra wide tables. Believe me, I’ve seen some tables with more than 800 columns.

Some suggestions in this post apply to other data sources and are not limited to OData connections only.

Suggestion 1: Measure the data source size

It is always good to have an idea of the size of the data source we are dealing with, and an OData connection is no different. In fact, the backend tables on OData sources can be vast. I wrote a blog post about that before, so I suggest you use the custom function I wrote to understand the size of the data source. If your data source is large, the query in that post takes a long time to return the results, but you can filter the tables to get the results quicker.
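As a rough illustration of filtering before counting, the sketch below lists the feed's entities the same way the custom function does, keeps only a handful of named entities, and counts rows for those alone. The service URL and the entity names are placeholders I made up for the example, so replace them with your own.

```powerquery
let
    // Hypothetical OData service root; replace with your own
    ServiceUrl = "https://example.com/odata/",
    // Hypothetical entity names we want to measure
    EntitiesToMeasure = {"Customers", "Orders"},
    Source = OData.Feed(ServiceUrl),
    // Turn the feed into a two-column list of entity names and their data
    SourceToTable = Table.RenameColumns(
        Table.DemoteHeaders(Table.FromValue(Source)),
        {{"Column1", "Name"}, {"Column2", "Data"}}
      ),
    // Keep only the named entities that are actually tables
    SelectedEntities = Table.SelectRows(
        SourceToTable,
        each List.Contains(EntitiesToMeasure, [Name])
          and Type.Is(Value.Type([Data]), Table.Type)
      ),
    // Count rows only for the entities that survived the filter
    RowCountAdded = Table.AddColumn(
        SelectedEntities,
        "Table Row Count",
        each Table.RowCount([Data]),
        Int64.Type
      ),
    Result = Table.SelectColumns(RowCountAdded, {"Name", "Table Row Count"})
in
    Result
```

Because the row count is only requested for the entities left after the filter, the query should come back sooner than a scan of the whole feed, although how much sooner depends on the source.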

Suggestion 2: Avoid getting throttled

As mentioned earlier, many solutions have some throttling mechanisms to control the over-utilisation of resources. Sending many API requests may trigger throttling, which limits our access to the data for a short period of time. During that period, our calls are redirected to a different URL.

Tip 1: Disabling Parallel Loading of Tables

One of the many reasons that Power BI sends many API calls is that it loads data into multiple tables in parallel. We can disable this setting in Power BI Desktop by following these steps:

  1. Click the File menu
  2. Click Options and settings
  3. Click Options
  4. Click the Data Load tab from the CURRENT FILE section
  5. Untick the Enable parallel loading of tables option
Disabling Parallel Loading of Tables in Power BI Desktop

Quick Tips: OData Feed Analyser Custom Function in Power Query

OData Feed Analyser Custom Function in Power Query for Power BI and Excel

I have been working with OData data sources in Power BI for a while. One challenge is that I almost never have a good understanding of the underlying data model. It can be really hard and time-consuming if there is no one in the business who understands the underlying data model. I know, we can use $metadata to get the metadata schema from the OData feed, but let’s not go there. I am not an OData expert, but here is the thing for someone like me: I work with various data sources that I am not necessarily an expert in, yet I need to understand what the entities are, how they are connected and so on. So what if I do not have access to any SMEs (Subject Matter Experts) who can help me with that?

So, rather than getting involved with more OData options, let’s get into it.

The custom function below accepts an OData URL and then discovers all tables, their column count, their row count (more on this later), the number and list of related tables, and the number and list of columns of type text, type number and Decimal.Type.

// fnODataFeedAnalyser
(ODataFeed as text) => 
  let
    Source = OData.Feed(ODataFeed),
    SourceToTable = Table.RenameColumns(
        Table.DemoteHeaders(Table.FromValue(Source)), 
        {{"Column1", "Name"}, {"Column2", "Data"}}
      ),
    FilterTables = Table.SelectRows(
        SourceToTable, 
        each Type.Is(Value.Type([Data]), Table.Type)
      ),
    SchemaAdded = Table.AddColumn(FilterTables, "Schema", each Table.Schema([Data])),
    TableColumnCountAdded = Table.AddColumn(
        SchemaAdded, 
        "Table Column Count", 
        each Table.ColumnCount([Data]), 
        Int64.Type
      ),
    TableCountRowsAdded = Table.AddColumn(
        TableColumnCountAdded, 
        "Table Row Count", 
        each Table.RowCount([Data]), 
        Int64.Type
      ),
    NumberOfRelatedTablesAdded = Table.AddColumn(
        TableCountRowsAdded, 
        "Number of Related Tables", 
        each List.Count(Table.ColumnsOfType([Data], {Table.Type}))
      ),
    ListOfRelatedTables = Table.AddColumn(
        NumberOfRelatedTablesAdded, 
        "List of Related Tables", 
        each 
          if [Number of Related Tables] = 0 then 
            null
          else 
            Table.ColumnsOfType([Data], {Table.Type}), 
        List.Type
      ),
    NumberOfTextColumnsAdded = Table.AddColumn(
        ListOfRelatedTables, 
        "Number of Text Columns", 
        each List.Count(Table.SelectRows([Schema], each Text.Contains([Kind], "text"))[Name]), 
        Int64.Type
      ),
    ListOfTextColunmsAdded = Table.AddColumn(
        NumberOfTextColumnsAdded, 
        "List of Text Columns", 
        each 
          if [Number of Text Columns] = 0 then 
            null
          else 
            Table.SelectRows([Schema], each Text.Contains([Kind], "text"))[Name]
      ),
    NumberOfNumericColumnsAdded = Table.AddColumn(
        ListOfTextColunmsAdded, 
        "Number of Numeric Columns", 
        each List.Count(Table.SelectRows([Schema], each Text.Contains([Kind], "number"))[Name]), 
        Int64.Type
      ),
    ListOfNumericColunmsAdded = Table.AddColumn(
        NumberOfNumericColumnsAdded, 
        "List of Numeric Columns", 
        each 
          if [Number of Numeric Columns] = 0 then 
            null
          else 
            Table.SelectRows([Schema], each Text.Contains([Kind], "number"))[Name]
      ),
    NumberOfDecimalColumnsAdded = Table.AddColumn(
        ListOfNumericColunmsAdded, 
        "Number of Decimal Columns", 
        each List.Count(
            Table.SelectRows([Schema], each Text.Contains([TypeName], "Decimal.Type"))[Name]
          ), 
        Int64.Type
      ),
    ListOfDcimalColunmsAdded = Table.AddColumn(
        NumberOfDecimalColumnsAdded, 
        "List of Decimal Columns", 
        each 
          if [Number of Decimal Columns] = 0 then 
            null
          else 
            Table.SelectRows([Schema], each Text.Contains([TypeName], "Decimal.Type"))[Name]
      ),
    #"Removed Other Columns" = Table.SelectColumns(
        ListOfDcimalColunmsAdded, 
        {
          "Name", 
          "Table Column Count", 
          "Table Row Count", 
          "Number of Related Tables", 
          "List of Related Tables", 
          "Number of Text Columns", 
          "List of Text Columns", 
          "Number of Numeric Columns", 
          "List of Numeric Columns", 
          "Number of Decimal Columns", 
          "List of Decimal Columns"
        }
      )
  in
    #"Removed Other Columns"
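A minimal sketch of invoking the function, assuming it has been saved as a query named fnODataFeedAnalyser (the name in the comment at the top of the listing). The public Northwind sample service is used here purely as an illustrative URL; point it at your own feed instead.

```powerquery
let
    // Assumption: public Northwind sample service, used only for illustration
    ServiceUrl = "https://services.odata.org/V4/Northwind/Northwind.svc/",
    // Assumption: the function above is saved as a query named fnODataFeedAnalyser
    Analysis = fnODataFeedAnalyser(ServiceUrl),
    // Optional: sort so the largest tables appear first
    Sorted = Table.Sort(Analysis, {{"Table Row Count", Order.Descending}})
in
    Sorted
```

Sorting by row count is just a convenience for spotting the largest tables quickly; the function's output can be used as-is.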