Business Intelligence Components and How They Relate to Power BI

Business Intelligence Components and How They Relate to Power BI

When I decided to write this blog post, I thought it would be a good idea to learn a bit about the history of Business Intelligence. I searched on the internet, and I found this page on Wikipedia. The term Business Intelligence as we know it today was coined by an IBM computer science researcher, Hans Peter Luhn, in 1958, who wrote a paper in the IBM Systems journal titled A Business Intelligence System as a specific process in data science. In the Objectives and principles section of his paper, Luhn defines the business as “a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera.” and an intelligence system as “the communication facility serving the conduct of a business (in the broad sense)”. Then he refers to Webster’s dictionary’s definition of the word Intelligence as the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal”.

It is fascinating to see how a fantastic idea in the past sets a concrete future that can help us have a better life. Isn’t it precisely what we do in our daily BI processes as Luhn described of a Business Intelligence System for the first time? How cool is that?

When we talk about the term BI today, we refer to a specific and scientific set of processes of transforming the raw data into valuable and understandable information for various business sectors (such as sales, inventory, law, etc…). These processes will help businesses to make data-driven decisions based on the existing hidden facts in the data.

Like everything else, the BI processes improved a lot during its life. I will try to make some sensible links between today’s BI Components and Power BI in this post.

Generic Components of Business Intelligence Solutions

Generally speaking, a BI solution contains various components and tools that may vary in different solutions depending on the business requirements, data culture and the organisation’s maturity in analytics. But the processes are very similar to the following:

  • We usually have multiple source systems with different technologies containing the raw data, such as SQL Server, Excel, JSON, Parquet files etc…
  • We integrate the raw data into a central repository to reduce the risk of making any interruptions to the source systems by constantly connecting to them. We usually load the data from the data sources into the central repository.
  • We transform the data to optimise it for reporting and analytical purposes, and we load it into another storage. We aim to keep the historical data in this storage.
  • We pre-aggregate the data into certain levels based on the business requirements and load the data into another storage. We usually do not keep the whole historical data in this storage; instead, we only keep the data required to be analysed or reported.
  • We create reports and dashboards to turn the data into useful information

With the above processes in mind, a BI solution consists of the following components:

  • Data Sources
  • Staging
  • Data Warehouse/Data Mart(s)
  • Extract, Transform and Load (ETL)
  • Semantic Layer
  • Data Visualisation

Data Sources

One of the main goals of running a BI project is to enable organisations to make data-driven decisions. An organisation might have multiple departments using various tools to collect the relevant data every day, such as sales, inventory, marketing, finance, health and safety etc.

The data generated by the business tools are stored somewhere using different technologies. A sales system might store the data in an Oracle database, while the finance system stores the data in a SQL Server database in the cloud. The finance team also generate some data stored in Excel files.

The data generated by different systems are the source for a BI solution.

Staging

We usually have multiple data sources contributing to the data analysis in real-world scenarios. To be able to analyse all the data sources, we require a mechanism to load the data into a central repository. The main reason for that is the business tools required to constantly store data in the underlying storage. Therefore, frequent connections to the source systems can put our production systems at risk of being unresponsive or performing poorly. The central repository where we store the data from various data sources is called Staging. We usually store the data in the staging with no or minor changes compared to the data in the data sources. Therefore, the quality of the data stored in the staging is usually low and requires cleansing in the subsequent phases of the data journey. In many BI solutions, we use Staging as a temporary environment, so we delete the Staging data regularly after it is successfully transferred to the next stage, the data warehouse or data marts.

If we want to indicate the data quality with colours, it is fair to say the data quality in staging is Bronze.

Data Warehouse/Data Mart(s)

As mentioned before, the data in the staging is not in its best shape and format. Multiple data sources disparately generate the data. So, analysing the data and creating reports on top of the data in staging would be challenging, time-consuming and expensive. So we require to find out the links between the data sources, cleanse, reshape and transform the data and make it more optimised for data analysis and reporting activities. We store the current and historical data in a data warehouse. So it is pretty normal to have hundreds of millions or even billions of rows of data over a long period. Depending on the overall architecture, the data warehouse might contain encapsulated business-specific data in a data mart or a collection of data marts. In data warehousing, we use different modelling approaches such as Star Schema. As mentioned earlier, one of the primary purposes of having a data warehouse is to keep the history of the data. This is a massive benefit of having a data warehouse, but this strength comes with a cost. As the volume of the data in the data warehouse grows, it makes it more expensive to analyse the data. The data quality in the data warehouse or data marts is Silver.

Extract, Transfrom and Load (ETL)

In the previous sections, we mentioned that we integrate the data from the data sources in the staging area, then we cleanse, reshape and transform the data and load it into a data warehouse. To do so, we follow a process called Extract, Transform and Load or, in short, ETL. As you can imagine, the ETL processes are usually pretty complex and expensive, but they are an essential part of every BI solution.

Continue reading “Business Intelligence Components and How They Relate to Power BI”

Quick Tips: Time Dimension with Time Bands at Seconds Granularity in Power BI and SSAS Tabular

Time Dimension with Time Bands at Seconds Granularity in Power BI and SSAS Tabular

I wrote some other posts on this topic in the past, you can find them here and here. In the first post I explain how to create “Time” dimension with time bands at minutes granularity. Then one of my customers required the “Time” dimension at seconds granularity which encouraged me to write the second blogpost. In the second blogpost though I didn’t do time bands, so here I am, writing the third post which is a variation of the second post supporting time bands of 5 min, 15 min, 30 min, 45 min and 60 min while the grain of the “Time” dimension is down to second. in this quick post I jump directly to the point and show you how to generate the “Time” dimension in three different ways, using T-SQL in SQL Server, using Power Query (M) and DAX. Here it is then:

Time Dimension at Second Grain with Power Query (M) Supporting Time Bands:

Copy/paste the code below in Query Editor’s Advanced Editor to generate Time dimension in Power Query:

let
Source = Table.FromList({1..86400}, Splitter.SplitByNothing()),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "ID"}}),
#"Time Column Added" = Table.AddColumn(#"Renamed Columns", "Time", each Time.From(#datetime(1970,1,1,0,0,0) + #duration(0,0,0,[ID])), Time.Type),
    #"Hour Added" = Table.AddColumn(#"Time Column Added", "Hour", each Time.Hour([Time]), Int64.Type),
    #"Minute Added" = Table.AddColumn(#"Hour Added", "Minute", each Time.Minute([Time]), Int64.Type),
    #"5 Min Band Added" = Table.AddColumn(#"Minute Added", "5 Min Band", each Time.From(#datetime(1970,1,1,Time.Hour([Time]),0,0) + #duration(0, 0, (Number.RoundDown(Time.Minute([Time])/5) * 5) + 5, 0)), Time.Type),
    #"15 Min Band Added" = Table.AddColumn(#"5 Min Band Added", "15 Min Band", each Time.From(#datetime(1970,1,1,Time.Hour([Time]),0,0) + #duration(0, 0, (Number.RoundDown(Time.Minute([Time])/15) * 15) + 15, 0)), Time.Type),
#"30 Min Band Added" = Table.AddColumn(#"15 Min Band Added", "30 Min Band", each Time.From(#datetime(1970,1,1,Time.Hour([Time]),0,0) + #duration(0, 0, (Number.RoundDown(Time.Minute([Time])/30) * 30) + 30, 0)), Time.Type),
#"45 Min Band Added" = Table.AddColumn(#"30 Min Band Added", "45 Min Band", each Time.From(#datetime(1970,1,1,Time.Hour([Time]),0,0) + #duration(0, 0, (Number.RoundDown(Time.Minute([Time])/45) * 45) + 45, 0)), Time.Type),
#"60 Min Band Added" = Table.AddColumn(#"45 Min Band Added", "60 Min Band", each Time.From(#datetime(1970,1,1,Time.Hour([Time]),0,0) + #duration(0, 0, (Number.RoundDown(Time.Minute([Time])/60) * 60) + 60, 0)), Time.Type),
    #"Removed Other Columns" = Table.SelectColumns(#"60 Min Band Added",{"Time", "Hour", "Minute", "5 Min Band", "15 Min Band", "30 Min Band", "45 Min Band", "60 Min Band"})
in
    #"Removed Other Columns"
Continue reading “Quick Tips: Time Dimension with Time Bands at Seconds Granularity in Power BI and SSAS Tabular”

Quick Tips: Time Dimension with Granularity of Seconds in Power BI and Analysis Services Tabular Models

Quick Tips: Time Dimension with Granularity of Seconds in Power BI and Analysis Services Tabular Models
Photo by Markus Spiske

In an article I posted a while back I showed different methods of creating Time dimension in Power BI and Tabular models. The Time dimension I explained was in Minutes. In this post I show you simple way to create Time dimension supporting Seconds. As this is a quick tip, I only show you how to get the Time and ID columns in the Time dimension. If you need to add time bands (time buckets) check this out for more details.

Time Dimension in Seconds Grain with Power Query (M):

Copy/paste the code below in Advanced Editor to generate Time dimension in Power Query:

let
Source = Table.FromList({1..86400}, Splitter.SplitByNothing()),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "ID"}}),
#"Time Column Added" = Table.AddColumn(#"Renamed Columns", "Time", each Time.From(#datetime(1970,1,1,0,0,0)+#duration(0,0,0,[ID]))),
#"Changed Type" = Table.TransformColumnTypes(#"Time Column Added",{{"ID", Int64.Type}, {"Time", type time}})
in
#"Changed Type" 
Time Dimension in Power Query

Time Dimension in Seconds Grain with DAX:

Run the DAX expression below in a new calculated Table in Power BI or SSAS Tabular model:

Time in DAX = ADDCOLUMNS(
                       GENERATESERIES(1, 86400, 1)
                        , "Time", TIME(0, 0, 0) + [Value]/86400
                        )
Time Dimension in Seconds Level with DAX
Continue reading “Quick Tips: Time Dimension with Granularity of Seconds in Power BI and Analysis Services Tabular Models”

Time Dimension in Power BI and SSAS Tabular Model Supporting Minutes Time Bands

2018-05-23 12_58_48-Symbols (Open in Visio).vsdx - Visio Professional

Date dimension has been discussed quite a lot on the Internet and you can find lots of valuable articles around it here and there. But what if you need to analyse your data in time level? A customer has a requirement to analyse their data in Minutes level. This means that the granularity of the fact table would be at minute level. So, if they store the data in their transactional database in seconds level, then we need to aggregate that data to minutes level. I don’t want to go there, just bear in mind that the granularity of your fact table is something that you must think about at the very first steps. In most cases, if not all cases, you’d be better to have a separate Time dimension. Then you need to have a TimeID or Time column in your fact table to be able to create a relationship between the Time dimension and the fact table. In this post I show you two ways to create Time dimension in Power BI:

  • Creating Time dimension with DAX
  • Creating Time dimension with Power Query (M)

Alternatively, you can take care of the Time dimension in the source system like SQL Server. Continue reading and you’ll find a T-SQL codes as complementary.

The techniques that I explain here can be done in SSAS Tabular model and Azure Analysis Services as well.

Requirements:

To follow the steps of building the test model you need to have:

  • Power BI Desktop: Download the latest version from here
  • A sample fact table containing time or datetime. I modified FactInternetSales from AdventureWorksDW and made it available for you to download in Excel format (find the download link at the bottom of the post)

Continue reading “Time Dimension in Power BI and SSAS Tabular Model Supporting Minutes Time Bands”