Performing Calculations in Tableau
Overview
Time: minObjectives
How to perform calculations in tableau
Sometimes, the data lacks some fields that are necessary for analysis. In that case, we can createa new field(column)by performing desired calculations. Those fields are called calculated fields.The calculated field could be a row level calculated field or a column level calculated field.This new calculated field created is saved to the data source.
The 3 broad categories of calculations include – Basic Calculations
- Level of Detail (LOD) expressions
-
Table Calculations
- Basic Calculations - Basic calculations can further be divided into Row-level calculations and aggregate calculations.
1.Row level Calculations –When we apply row level calculation, results are generated for every line in the datasource.
Let us see few examples –Transform/Split –Using split we can split strings and perform string manipulation. Click on customer name, transform, and split. This will split the customer’s name in two fields (first name, last name).
We can also splitthe string based on a delimiter/separator.Let us split the order id column to obtain only the numbers in the id.
The column has a delimiter ‘-‘ in the format. We specify that as the delimiter. Next, we want all the numbers in the id which are at the end of the string. We select split off lastoption.A new calculated field has been created with the number string of the order id for each row in the data source.
Finding strings/pattern -We can also spot substring or strings in the column. We will try to see product ids string that contains “TA” in it. Create a new calculated fieldwhich mentions the keyword contains. This will search for all the rows containing TA as a part of the string and returns a true value for them.
If/Else Conditional Calculation –Let us create a calculated field to see if the sales are profitable or not.
We add order id to the row shelf and profits to text marks card. Next, we add the calculated field created to the row shelf. The output is as shown.
-
Aggregate calculations
-
Generally, adding any measure to row or column shelf aggregates the measure. But we can create calculated field and perform aggregation on the rows in the data source.Let’s count the number of customers. This aggregation will give the count of customers against the dimension we choose in the visualization. Let us add region to the row shelf, profits,and the calculated field (Count_Customer) to the column shelf.
The visualization gives the profits across region and count of customers for each region.
- Level of Detail expression
Allows calculations at data source as well as visualization level(just like basic calculations). But provides more control over the level of granularity. The level of detail could be high-level details (less granularity, summary, more aggregated) or low-level details (more granularity, more details, less aggregated). To understand the how the high-level/low-level details work, let us take an example-
- Drag the sales to the text marks card.We can see the sumof sales of all the products. This has less granularity and more aggregation. This gives the high-level picture of sales for the orders table.
- Add region to the row column shelf. This adds a level of details because now we are viewing the sum of sales across the 4regions.This has added a little more granularity and little less aggregation as compared to the first step.
- Add states to the row shelf. This is a more granular visualization as compared to step 2. So, we started from a high-level visualization in step 1 and kept adding more details to reach to a low-level detail visualization.
There are 3 types of LOD expressions:
- Fixed
- Exclude
- Include
Fixed expression is independent of view. When we use exclude expression, it is like removing some dimensions from the view. When we use include expression, it is like adding dimensions to the view. Syntax –{Type of expression [Dimension list] : Aggregate}
-
- Fixed –It returns a scalar value. It is independent of the dimensions in the view. Let us create a fixed expression field. Start by adding sum of sales to the text marks card.
Next, create a calculated field with the fixed expression for sum of sales.
We created a grand total field in the measures. Double click on the fieldfrom the measures.
This gives us the grand total and sales. Move the measure names from rows shelf to column and add region to rows shelf.
Here we are viewing sales across all the regions. Now remove region and add category from the dimensions. We can see that the grand total is the same irrespective of the whatever dimension we add to the row shelf. The fixed grand total value is independentof the dimensionsas the calculation does not contain any dimension. Let us try and add filter on the region. We can see that, even if we unselect some of the regions the grand total is unaffected.
Exclude LOD –This type of calculation is sensitive to dimensions. Exclude does not return a scalar value (unlike fixed LOD), it returns an aggregated value.It gives dimension in the view minus what is in the expression.Let’s create some visualization using exclude calculation.
In the above visualization, we exclude region, so now if we add region to the filter, the grand total will change when we select/unselect some regions.
Let us add another dimension to the row shelf and observe the values. The exclude calculations values are now changed. Adding another dimension has affected the calculation values. We can add more than one dimension in the exclude LOD calculation. Edit the exclude calculation and add category and sub-category to the exclude calculation. Remove region.
IncludeLOD –These calculations will include dimensions in the view and the dimension mentioned in the include expression.The results are aggregated.Let uscreate a table calculation to understand the topic.We create the calculated fieldas follows. We are including sub-category in it.
Now add the calculated fieldthat we created.
Now the visualization shows maximum of sum of sales as the total for each category. Now we remove sub-category from the row shelf.
So now we don’t have sub-category in the visualization but since it is a part of the calculated field, we still have the max sum of sales of the sub-category for every category.
- Table Calculations Table calculations are the transformation that you apply to the values in the visualization. They are calculated based on what is present in the visualization.So, basically when we create a visualizationand then add the table calculation. They are added on top of the aggregated function used in visualization. Three things that will affect the table calculations are – Layout
- Direction/ addressing
- Scope/ partitioning Let us start with creating a visualization to understand it better. Add Region to the row shelf and order date to the column shelf. Add sum of sales to the text marks card.Click on totals in the analytics tab to get the grand total.
Now when we click on sum of sales, we can see a quick table calculation option. And we have all the commonly used table calculations like running total, percent total etc. So,we are applying aggregation/calculationto the already existing aggregation in the visualization. Select percent of total from the options.
After applying percent of total, the original values in the table are changed to the percent values. Now let us see how the three aspects of table calculation –layout, direction and scope affect the calculation.
Table calculations are sensitive to the layout of the visualization. If we add another dimension let us say sub-category to row shelf and try and move the region dimension from row to column shelf, we see that the values in the tables change.Next, we can see that the percent of total is being calculated from left to right.
That means the calculation is sensitive to the direction of consideration.We can have the calculation from left to right or from top to bottom. To change the direction, we can select an option as follows.We can also see options like cell, order date and region which gives us the scope/partition for the calculation.
Key Points