Performing Calculations in Tableau

Overview

Time: min
Objectives
  • How to perform calculations in tableau

Sometimes, the data lacks some fields that are necessary for analysis. In that case, we can createa new field(column)by performing desired calculations. Those fields are called calculated fields.The calculated field could be a row level calculated field or a column level calculated field.This new calculated field created is saved to the data source.

The 3 broad categories of calculations include – Basic Calculations

1.Row level Calculations –When we apply row level calculation, results are generated for every line in the datasource.

Let us see few examples –Transform/Split –Using split we can split strings and perform string manipulation. Click on customer name, transform, and split. This will split the customer’s name in two fields (first name, last name).

Screen Shot 2022-03-19 at 6 56 20 PM

Screen Shot 2022-03-19 at 6 56 38 PM

We can also splitthe string based on a delimiter/separator.Let us split the order id column to obtain only the numbers in the id.

Screen Shot 2022-03-19 at 6 57 08 PM

The column has a delimiter ‘-‘ in the format. We specify that as the delimiter. Next, we want all the numbers in the id which are at the end of the string. We select split off lastoption.A new calculated field has been created with the number string of the order id for each row in the data source.

Screen Shot 2022-03-19 at 6 57 46 PM

Finding strings/pattern -We can also spot substring or strings in the column. We will try to see product ids string that contains “TA” in it. Create a new calculated fieldwhich mentions the keyword contains. This will search for all the rows containing TA as a part of the string and returns a true value for them.

Screen Shot 2022-03-19 at 6 58 17 PM

Screen Shot 2022-03-19 at 6 58 46 PM

If/Else Conditional Calculation –Let us create a calculated field to see if the sales are profitable or not.

Screen Shot 2022-03-19 at 6 59 19 PM

Screen Shot 2022-03-19 at 6 57 46 PM

We add order id to the row shelf and profits to text marks card. Next, we add the calculated field created to the row shelf. The output is as shown.

Screen Shot 2022-03-19 at 6 59 43 PM

Screen Shot 2022-03-19 at 6 57 46 PM

Screen Shot 2022-03-19 at 7 00 38 PM

The visualization gives the profits across region and count of customers for each region.

Screen Shot 2022-03-19 at 7 01 11 PM

Allows calculations at data source as well as visualization level(just like basic calculations). But provides more control over the level of granularity. The level of detail could be high-level details (less granularity, summary, more aggregated) or low-level details (more granularity, more details, less aggregated). To understand the how the high-level/low-level details work, let us take an example-

Screen Shot 2022-03-21 at 6 15 44 PM

Screen Shot 2022-03-21 at 6 17 39 PM

Screen Shot 2022-03-21 at 6 18 54 PM

There are 3 types of LOD expressions:

Fixed expression is independent of view. When we use exclude expression, it is like removing some dimensions from the view. When we use include expression, it is like adding dimensions to the view. Syntax –{Type of expression [Dimension list] : Aggregate}

Screen Shot 2022-03-21 at 6 20 29 PM

Next, create a calculated field with the fixed expression for sum of sales.

Screen Shot 2022-03-21 at 6 20 59 PM

Screen Shot 2022-03-21 at 6 21 33 PM

We created a grand total field in the measures. Double click on the fieldfrom the measures.

Screen Shot 2022-03-21 at 6 22 26 PM

This gives us the grand total and sales. Move the measure names from rows shelf to column and add region to rows shelf.

Screen Shot 2022-03-21 at 6 22 53 PM

Here we are viewing sales across all the regions. Now remove region and add category from the dimensions. We can see that the grand total is the same irrespective of the whatever dimension we add to the row shelf. The fixed grand total value is independentof the dimensionsas the calculation does not contain any dimension. Let us try and add filter on the region. We can see that, even if we unselect some of the regions the grand total is unaffected.

Screen Shot 2022-03-21 at 6 23 38 PM

Exclude LOD –This type of calculation is sensitive to dimensions. Exclude does not return a scalar value (unlike fixed LOD), it returns an aggregated value.It gives dimension in the view minus what is in the expression.Let’s create some visualization using exclude calculation.

Screen Shot 2022-03-21 at 6 24 12 PM

In the above visualization, we exclude region, so now if we add region to the filter, the grand total will change when we select/unselect some regions.

Screen Shot 2022-03-21 at 6 24 46 PM

Let us add another dimension to the row shelf and observe the values. The exclude calculations values are now changed. Adding another dimension has affected the calculation values. We can add more than one dimension in the exclude LOD calculation. Edit the exclude calculation and add category and sub-category to the exclude calculation. Remove region.

Screen Shot 2022-03-21 at 6 25 35 PM

IncludeLOD –These calculations will include dimensions in the view and the dimension mentioned in the include expression.The results are aggregated.Let uscreate a table calculation to understand the topic.We create the calculated fieldas follows. We are including sub-category in it.

Screen Shot 2022-03-21 at 6 26 25 PM

Now add the calculated fieldthat we created.

Screen Shot 2022-03-21 at 6 27 10 PM

Now the visualization shows maximum of sum of sales as the total for each category. Now we remove sub-category from the row shelf.

Screen Shot 2022-03-21 at 6 27 58 PM

So now we don’t have sub-category in the visualization but since it is a part of the calculated field, we still have the max sum of sales of the sub-category for every category.

Screen Shot 2022-03-21 at 6 33 10 PM

Now when we click on sum of sales, we can see a quick table calculation option. And we have all the commonly used table calculations like running total, percent total etc. So,we are applying aggregation/calculationto the already existing aggregation in the visualization. Select percent of total from the options.

Screen Shot 2022-03-21 at 6 33 49 PM

Screen Shot 2022-03-21 at 6 34 06 PM

After applying percent of total, the original values in the table are changed to the percent values. Now let us see how the three aspects of table calculation –layout, direction and scope affect the calculation.

Table calculations are sensitive to the layout of the visualization. If we add another dimension let us say sub-category to row shelf and try and move the region dimension from row to column shelf, we see that the values in the tables change.Next, we can see that the percent of total is being calculated from left to right.

That means the calculation is sensitive to the direction of consideration.We can have the calculation from left to right or from top to bottom. To change the direction, we can select an option as follows.We can also see options like cell, order date and region which gives us the scope/partition for the calculation.

Screen Shot 2022-03-21 at 6 35 02 PM

Key Points