
Hi everyone,
I am attaching a sample workbook here.
The objective is to view data with and without jumps or outliers.
The dashboard shall let the user can select the X-axis is the days/SOH/Mileage, Y-axis is the discharge.
filters : date range, vin, mileage range, soh range, filter to include or exclude outliers or jumps
Additionally we need a threshold parameter for discharge. Based on the value we enter in the threshold parameter, we only see vehicles that are above this threshold. The vehicle numbers must be outputed as list.
Challenge :
Can anyone help me how to provide that filter to remove or have outliers after the box plot is constructed for the multiple metric axis??
@Vaishnavi Kilambi You have two different questions 1) filter out outliers and 2)filter based on threshold.
Tableau uses 1.5 times the IQR as the definition of an outlier, so you need a calculation that identifies points outside that limit. First, start with identifying the two values that define the IQR: the 1st and 3rd Quartiles. Because, every metric axis values needs its own range, the calculations are FIXED on [Metric Axis].
//C_1Q (LOD)
{ FIXED [METRIC AXIS]:PERCENTILE([Discharge],0.25)}
//C_3Q (LOD)
{ FIXED [METRIC AXIS]:PERCENTILE([Discharge],0.75)}
Next calculate the whisker length, which demarcates outliers
//C_WhiskerLength (LOD)
([C_3Q (LOD)]-[C_1Q (LOD)])*1.5
Finally, create a Boolean calculation to identify outliers (where TRUE is an outlier)
//C_Outlier (LOD)
[Discharge]<=([C_1Q (LOD)]-[C_WhiskerLength (LOD)])
OR
[Discharge]>=([C_3Q (LOD)]+[C_WhiskerLength (LOD)])
You can use this calculation to both color outliers when included and to filter out outliers.
Because you are using a FIXED LoD in the calculation, any filters will need to be placed in Context if you want the filter to affect the outlier calculation.
You must also make a change to the boxplot worksheet. By default, the worksheet is set to aggregate. This will result in the wrong values in the box plot because it does not weigh the [Discharge] based on how many records have that value. For example, 10061 has eleven records and 10029 has one record, but they would both be given weight of one. You need to go to Analysis on the menu and uncheck 'Aggregate Measures'.
You can see the difference on the two worksheets named 'Aggregated Table' and 'Disaggregated Table'. On the 'Aggregated table' the quartiles values calculated with a table calculation (which Tableau uses for the boxplot) has different values than the LOD calculation. On the 'Disaggregated table', the values are the same.
For the Threshold question, you didn't provide enough information to know exactly how you wanted to applied, but in general you would create a parameter (I used ALL values, but you could create a list or range to limit the possible values). Then create a filter to be above that value
//F_Threshold
SUM([Discharge])>=[P_Threshold]