Need a special offer?Find out if your project fits.
+

Distinct count in hierarchical data

Answered
Sophie L asked on January 31, 2022

Hello,
I encountered an issue trying to distinct count hierarchical entries in a dataset
The issue arises when 2 entries have an identical name under different branches of a hierarchy.
For instance, if I define a Geography, and want to display the distinct count of some data per city, when different country/regions may have a city with the same name, like "London, Ontario, Canada" vs "London, London, UK", or "Paris, IDF, France" vs "Paris, Texas, USA", I would expect my distinct count to consider those 2 cities as different, although they share the same name.
Here's a jsfiddle to illustrate, where I tried:

  1. Base case: distinct count (City) => green/Paris => (USA):1 + (France):1 = (Total):1 !
    • You might argue that a undistinct count would have returned the expected result, but this is only an example as my real production case really does require using a distinct count
  2. Adding a property "City Code" (type: "property", hierarchy: "City") hoping that this would allow considering cities as actually distinct in computing distinct count 
    • That didn't change anything. Maybe I misunderstand the intent of this "property type" (the basic example of "Product/ProductId" led me to believe this added this kind of discriminating attribute)
  3. Adding a "City Code" without making it a property of "City" (type: string) to use for the distinct count instead of "City"
    • That seems to work, but it may be slightly less intuitive (I still need to check if it solves my "real life" scenario)
    • Also, I was not able to select City Code as an aggregation if defined as a property of City
  4. Creating a Geography Hierarchy (Country/State/City)
    • I'm completely unable to count anything related to a node in that hierarchy...
    • In my real life scenario, I haven't defined that data hierarchically (although it is), but this is something that I could consider, and I'd like to understand better.

These all may come from my misunderstanding, but any help would be appreciated.

4 answers

Public
Sophie L February 1, 2022

Hello again, I may have a follow up question with regards to hierarchies.
I would like to display fields in a certain order when I select a hierarchy.
The sortingMethod page explains how to use that method to sort fields like Category.
The following works in a non hierarchical City definition:

const sortedCities = ["Paris", "Toronto", "London"];
const sortCities = (v1, v2) => sortedCities.findIndex(v1) - sortedCities.findIndex(v2);

flexmonster.sortingMethod("City", sortCities);

How can I define the sorting if City is a node inside the Geography Hierarchy?

mapping = {
"Country": { hierarchy: "Geography" },
"State": { type: "string", hierarchy: "Geography", parent: "Country" },
"City": { type: "string", hierarchy: "Geography", parent: "State" },
}

flexmonster.sortingMethod("Country", sortCountries)
flexmonster.sortingMethod("City", sortCities) // Doesn't sort
Public
Nadia Khodakivska Nadia Khodakivska Flexmonster February 2, 2022

Hello, Sophie,
 
Thank you for the detailed information.
 
We checked all of the described ways and consider the third one to be the best solution to your issue, but with some modifications. Please check the comments below to each of the approaches:

  1. You're correct that Flexmonster will treat members with the same name as identical ones while calculating the Distinct Count. It's necessary to make some modifications to your dataset to distinguish such members, like adding the City Code field in this case.
  2. Kindly note that the property type wasn't intended to distinguish the members with the same name - it is impossible to reach such values while aggregating data. Such fields can be accessed, for example, on the cellclick event.
  3. The property fields are not shown in the Field List. It is impossible to create a measure based on such fields. However, we suggest using the approach where you add the calculated value named, for example, "City Distinct Count", calculated based on Distinct Count of City Code (without the property type). Please check the following JSFiddle for illustration: https://jsfiddle.net/flexmonster/64r7szjp/.
  4. Hierarchy levels can only be used in rows/columns. With hierarchy or without it, we still recommend using "City Code" for Distinct Count.

Regarding your follow-up:
Currently, the sortingMethod can be applied only to non-hierarchical data.

Please let us know in case of other questions.
 
Kind regards,
Nadia

Public
Sophie L February 15, 2022

Hello Nadia, thank you for these explanations.
I have one follow-up question. Is it possible to disable aggregations on the "City" field, to only leave the computed value available?
 

Public
Nadia Khodakivska Nadia Khodakivska Flexmonster February 16, 2022

Hello, Sophie,
 
Thank you for the response.
 
Currently, it is impossible to disable all of the aggregations of a field. At least one aggregation should be selected. You can select one aggregation for the "City" field in the Mapping Object. For example,

mapping: {
"City": {
type: "string",
aggregations: ["count"],
}
}

Then you define only your calculated value in measures. If you want to ensure that the user won't make any changes in your report, Flexmonser provides the read-only mode and the ability to hide the toolbar. You are welcome to check the following JSFiddle for reference: https://jsfiddle.net/flexmonster/64r7szjp/45/
 
Please let us know if it works for you. Feel free to contact us if other questions arise.
 
Kind regards,
Nadia 

Please login or Register to Submit Answer