Finding Minimum Date and Maximum Date Across All Tables in Power Query in Power BI and Excel

Finding Minimum Date and Maximum Date Across All Tables in Power Query in Power BI and Excel

When we talk about data analysis in Power BI, creating a Date table is inevitable. There are different methods to create a Date table either in DAX or in Power Query. In DAX you my use either CALENDAR() function or CALENDARAUTO() function to create the Date table. In Power Query you may use a combination of List.Dates()#date() and #duration() functions. Either way, there is one point that is always challenging and it is how to find out a proper date range, starting from a date in the past and ending with a date in the future, that covers all relevant dates within the data model. One simple answer is, we can ask the business. The SMEs know what the valid date range is..

While this is a correct argument it is not always the case. Especially with the Start Date which is a date in the past. In many cases the business says:

Lets’s have a look at the data to find out.

That is also a correct point, we can always a look at the data, find all columns with either Date or DateTime datatypes then sort the data in ascending or descending order to get the results. But what if there many of them? Then this process can be very time consuming.

Many of you may already thought that we can use CALENDARAUTO() in DAX and we are good to go. Well, that’s not quite right. In many cases there are some Date or DateTime columns that must not be considered in our Date dimension. Like Birth Date or Deceased Date. More on this later in this post.

In this post I share a piece of code I wrote for myself. I was in a situation to identify the Start Date and the End Date of the date dimension many times, so I thought it might help you as well.

How it works?

The Power Query expressions I share in this post starts with getting all existing queries using:

  • #sections intrinsic variable
  • Filtering out the current query name, which is GetMinMaxAllDates in my sample, to avoid getting the following error:

Expression.Error: A cyclic reference was encountered during evaluation.

Expression.Error: A cyclic reference was encountered during evaluation.
  • Filtering out the queries that are NOT as type table
  • Adding a new structured column named TableSchema that includes the tables’ structure
  • Expanding the TableSchema structured column keeping the Name and Kind columns and renaming the Name column to Column Name and the Kind column to Datatype
  • Filter the results to keep only the columns with either Date or DateTime datatypes
  • Filtering out unnecessary values from the Column Name like Birth Date
  • Adding a new column named Min Date that gets the minimum value of the column that appears in the Column Name column of the table value that appears in the Value column

Hmm! I suppose it is too much mentioning valuecolumn and table in different contexts. I hope I’m not making it even more confusing though.

  • Adding another new column named Max Date similar to how we created the Min Date
  • Extracting the minimum value of the Min Date column
  • Extracting the maximum values of the Max Date column
  • Showing the latter two as a list

So if you are looking for a solution here is the Power Query expressions that I use:

let
    AllQueries = #sections,
    RecordToTable = Record.ToTable(AllQueries[Section1]),
    FilterOutCurrentQuery = Table.SelectRows(RecordToTable, each [Name] <> "GetMinMaxAllDates" and Type.Is(Value.Type([Value]), type table) = true),
    AddTableSchemaColumn = Table.AddColumn(FilterOutCurrentQuery, "TableSchema", each try Table.Schema([Value]) otherwise null),
    ExpandTableSchema = Table.Buffer(Table.ExpandTableColumn(AddTableSchemaColumn, "TableSchema", {"Name", "Kind"}, {"Column Name", "Datatype"})),
    FilterTypes = Table.SelectRows(ExpandTableSchema, each ([Datatype] = "datetime" or [Datatype] = "date")),
    AddedMinDateColumn = Table.AddColumn(FilterTypes, "Min Date", each Date.From(List.Min(Table.Column([Value], [Column Name])))),
    AddedMaxDateColumn = Table.AddColumn(AddedMinDateColumn, "Max Date", each Date.From(List.Max(Table.Column([Value], [Column Name])))),
    FilterOutUnnecessaryColumns = Table.SelectRows(AddedMaxDateColumn, each ([Column Name] <> "BirthDate")),
    MinDate = List.Min(List.Combine({FilterOutUnnecessaryColumns[Min Date], FilterOutUnnecessaryColumns[Max Date]})),
    MaxDate = List.Max(List.Combine({FilterOutUnnecessaryColumns[Min Date], FilterOutUnnecessaryColumns[Max Date]})),
    MinMaxDates = {"Min Date = " & Text.From(MinDate), "Max Date = " & Text.From(MaxDate)}
in
        MinMaxDates

You can download the above expressions from here.

The image below illustrates the results of running the above code in Power Query Editor having 11 fact tables and 2 dimension tables. Those tables have 17 columns with either Date or DateTime datatypes:

GetMinMaxAllDates Query in Power Query

Note: Once again, you need to pass the current query name in the expressions above. In my case the current query name is GetMinMaxAllDates as shown in the image below:

Filtering out the current Query Name

Earlier in this post I mentioned that in many cases we do NOT want all Date or DateTime columns to be covered by the Date table. A good example for it is Birth Date and Deceased Date. If we do not note that then we can create a lot of irrelevant dates in our Date table like what we get as the Min Date in the above image which is 10/02/1916. As you can in the image above there is a FilterOutUnnecessaryColumns step. We click on that step to filter the unnecessary values from the Column Name column as shown in the image below:

Filtering out Birth Date

Click on the last step which is MinMaxDates to see the new values as shown in the image below:

New Min Date after fingering out the Birth Date column

By running the above query you get the valid date range, so you can now create a Date table with any method of choice, either in Power Query or DAX using the above date range. Remember, creating the Date table is completely separate process. This query is only helping us finding minimum and maximum valid dates across all tables loaded into the Power Query Editor.

Please note that this method is only to get the min and max valid dates across all tables. If you attempt to load the results into the data model in Power BI, the resulting table will be empty. This behaviour is the result of using #sections or #shared intrinsic variables in Power Query and their inherent limitations. While #sections is useful for accessing metadata about all queries, its dynamic nature can cause problems when trying to load data into the Power BI model. Therefore, we have to Disable query load to avoid getting an empty table in the data model.

Considerations

  • The above tables altogether have 40M rows and the GetMinMaxAllDates query ran in approximately 10 sec on my machine which is not bad at all. However, in larger tables it may take more to give you the results
  • You must have some queries already loaded into the Power BI Editor
  • This method also works in Direct Query mode, but you expect the query to take more time to get the results
  • The above query retrieves the min date and max date across all tables. When you create a Date table, be aware that the Date column should start from the 1st Jan of the min date going all the way up to the 31st Dec of the max date
  • This method works in Power Query Editor within Power BI Desktop RS as well
  • This method is NOT supported in Power BI Dataflows

Enjoy your Dating!


Discover more from BI Insight

Subscribe to get the latest posts sent to your email.

12 thoughts on “Finding Minimum Date and Maximum Date Across All Tables in Power Query in Power BI and Excel

    1. Hi Nate,

      Welcome to BIInsight.com and thanks for .
      I suppose I had to explain it more thoroughly, so thanks for your valuable feedback and bringing this to my attention.
      You have to put the current query name in the expressions to filter it out like the image below, otherwise you’ll get the “Cyclic…” error which is not pleasant!

      I’ll update the post in a min.

      Cheers.

      1. Right…after I asked this I figured it out. Appreciate it though…makes sense for sure. I can get that to work now…but more now the question is how to use the results in PQ to build a date table effectively? I already have a PQ date table where I have a “StartDate” and “EndDate” variable that can be set per report. I would like to replace those with the results of these and am wondering what code I would use to reference the results.

        1. I know!
          Well, at first my aim was to also create a Date table with the results of the above process, which makes absolute sense.
          Unfortunately, Power BI doesn’t like dynamic Power Query expressions and it raises an error message complaining as below which is nonsense.

          Error returned: ‘OLE DB or ODBC error: [Expression.Error] We cannot convert the value null to type Logical.. ‘.

          Therefore, I decided to share only the Min/Max date detection part.
          Having said that, the dynamic Date table creation using the above method works perfectly in Excel PQ.
          I may write another blogpost about that later, but by all means, give it a go using the outputs of the above expression in your PQ expressions to generate the Date table.
          If you could make it work please share your experience with us.

          Cheers.

          1. I used dynamic expression all the time. Here is my solution to get Min & Max date then generate a list of Calendar Date.

            I used a combination of techniques to auto-generate dates. Initially yours then parts from a stack overflow article; “Add missing date rows in Power BI/Power Query and take value of row above”.

            Query with comments:

            let
            // Get list of all queries
            AllQueries = #sections,
            // Convert list to table
            RecordToTable = Record.ToTable(AllQueries[Section1]),
            // Filter list of tables – Filter on the tables you want used
            #”FilterTables – Update List!” = Table.SelectRows(RecordToTable, each ([Name] = “CallDetails” or [Name] = “InteractionDetails”)),
            // Get table schema
            #”Added TableSchema” = Table.AddColumn(#”FilterTables – Update List!”, “TableSchema”, each try Table.Schema([Value]) otherwise null),
            // Expand Schema
            #”Expanded TableSchema” = Table.ExpandTableColumn(#”Added TableSchema”, “TableSchema”, {“Name”, “Kind”}, {“Column”, “Datatype”}),
            // Remove all non-date columns
            #”Filtered to date & datetime” = Table.SelectRows(#”Expanded TableSchema”, each [Datatype] = “datetime” or [Datatype] = “date”),
            // Get Minimum Date for all selected columns
            #”Add MinDate” = Table.AddColumn(#”Filtered to date & datetime”, “MinDate”, each Date.From(List.Min(Table.Column([Value], [Column])))),
            // Get Maximum date for all selected columns
            #”Add MaxDate” = Table.AddColumn(#”Add MinDate”, “MaxDate”, each Date.From(List.Max(Table.Column([Value], [Column])))),
            // convert to date to prevent errors
            Convert2Date = Table.TransformColumnTypes(#”Add MaxDate”,{{“MinDate”, type date}, {“MaxDate”, type date}}),
            // set start date variable by getting MIN date from MinDate column
            start = List.Min(Convert2Date[MinDate]),
            // set end date variable by getting MAX date from MaxDate column
            end = List.Max(Convert2Date[MaxDate]),
            // number of dates
            size = Number.From(end – start ) + 1,
            // Create dates from MinDate to MaxDate
            GenerateDates = List.Dates(start, size, #duration(1,0,0,0)),
            // new table with list of dates
            DateTable = Table.FromColumns({GenerateDates},{“CalendarDate”}),
            // join CalendarDate to existing table
            Joined = Table.Join(DateTable,”CalendarDate”,Convert2Date,”MinDate”,JoinKind.FullOuter),
            // Remove all columns other than Calendar Date
            #”RemCols<>CalendarDate” = Table.SelectColumns(Joined,{“CalendarDate”})
            //remove un-used columns
            in
            #”RemCols<>CalendarDate”

          2. Hi Dan,

            Welcome to BIInsight.com.
            Thanks for sharing your code, I appreciate that.
            I’m afraid this method won’t work in Power BI Desktop as you are using #sections intrinsic variable.
            This is NOT allowed in Power BI.
            You may use this in Excel without any issues on your local machine, but it breaks as soon as you publish the Excel file to the Power BI Service.

            Cheers.

  1. Hi Soheil,

    Thanks for this – its proving very interesting. Hoping that a year later you might have found a solution to the “OLE DB or ODBC” error? Would be great to implement this if possible

    1. Hi Luke,
      Welcome to BIInsight.com.
      Unfortunately, technology limitation doesn’t allow to generate the Date table using this method.
      We have to use the #sections intrinsic variable dynamically to get all columns with Date or DateTime data type.
      This is not allowed in Power Query.
      As mentioned in one of the previous columns, even if we use Power Query in Excel then publish the Excel to Power BI Service, the dataset refresh will fail.
      Cheers.

  2. Hi Soheil Bakhshi,

    Can you please elaborate the steps for:

    We have to use the #sections intrinsic variable dynamically to get all columns with Date or DateTime data type.

    Thank you!

    1. Hi Purushoth,

      Welcome to biinsight.com and thanks for your question.
      In Power Query, a section is a way to group related section members and name them within an M document.
      It is indeed a very technical concept that is a bit hard to explain, however, I already put a link to the related part of the official “M Language Specification” document explaining the #section intrinsic variable for those who want to know more details about it.
      But in short, what the #section intrinsic variable does is that it identifies all sections and their members within an M document as a record.
      Let’s unlock the preceding sentence:
      – To understand an M document, you can think of a single document that includes all M (Power Query) codes contained within the Power Query Editor (in Power BI Desktop or Excel)
      – The queries shown in within the Power Query Editor are the section members
      Here is a screenshot of using the #sections intrinsic variable in the Power Query Editor:
      #sections intrinsic variable in Power Query (M)
      As the screenshot shows, the #sections retrieves all the existing queries (including parameters). That is exactly the result that we are after.
      Hopefully that answers your question.

  3. I need to do something similar to this, but actually load the dates into PowerBI desktop from the query editor. It essentially is a table of all my tables and their max date. However, when I load in, the dates dont load in and the column becomes blank. It displays correctly in the Query Editor. is there a way to make it so I can load the end result of the query into Desktop?

    1. Thanks for pointing this out. I forgot to mention the limitations of using either #shared or #sections intrinsic variables in Power Query. I updated the blog with the following note:
      Please note that this method is only to get the min and max valid dates across all tables. If you attempt to load the results into the data model in Power BI, the resulting table will be empty. This behaviour is the result of using #sections or #shared intrinsic variables in Power Query and their inherent limitations. While #sections is useful for accessing metadata about all queries, its dynamic nature can cause problems when trying to load data into the Power BI model. Therefore, we have to Disable query load to avoid getting an empty table in the data model.
      However, there is a manual work around that might work for your scenario (which personally am not a fan of). You can replace the #sections with a manual code like below:

      {
      { "Query1", Query1 },
      { "Query2", Query2 },
      // Add more queries as needed
      }

      As mentioned earlier, this code is pretty manual, but, it should work.
      Hopefully that helps.

Leave a Reply

Your email address will not be published. Required fields are marked *


The reCAPTCHA verification period has expired. Please reload the page.