Elasticsearch-aggregation based on query instead of all data

posted on August 12th, 2020

Answered

Kevin Ullrich asked on August 12, 2020

Hi,
we are currently using the Trial-Version to check if this tool fits our needs. Looks great so far, however we've stumbled upon an issue.
We provide our customers with a custom search form which contains many fields the user can fill out to search for specifics (Search through specific years, by name (with or without wildcards), by age, etc etc). Now, we would like to add a pivot-grid to that page. So the user can process the data to their liking. However, these actions should be based on the search query and not on all data. So in short:
{
    "size": 0,
    "aggs": {
        "sumOfField":{
            "sum":{
                "field": "columnField"
            }
        }
    },
    "rowName":{
        "terms": {
            "field": "rowField",
            "size": 13
        },
        "aggs": {
            "sumColumnField":{
                "sum":{
                    "field": "columnField"
                }
            }
        }
    }
}
this is how it currently works, however we'd like to be able to run aggregations like so:
{
    "size": 0,
    "query": {
        "wildcard": {
            "randomField": {
                "value": "*random*"
            }
        }
    },
    "aggs": {
        "sumOfField": {
            "sum": {
                "field": "columnField"
            }
        }
    },
    "rowName": {
        "terms": {
            "field": "rowField",
            "size": 13
        },
        "aggs": {
            "sumColumnField": {
                "sum": {
                    "field": "columnField"
                }
            }
        }
    }
}

This makes it so the aggregations are based on the data returned by the query instead of all data. Is this possible? We've looked into things like slice object where you can provide a filter and/or query (which are kinda limiting to begin with), however these options do not change the amount of data aggregations have to go through.

EDIT: Since we run the elasticsearch requests through our backend (aka not directly connected to elasticsearch; as a proxy), it would be sufficient if flexmonster could send our search/query data together with its normal request data. If we could modifiy what flexmonster is sending (not only the headers), that would most likely fix our issue. We've looked at customizeAPIRequest, but that doesn't seem to work (for us).

5 answers

Public

Milena Pechura ⋅ Flexmonster ⋅ August 13, 2020

Hello!

Thank you for writing to us.

We would like to kindly recommend specifying subquery property of the dataSource object. The subquery value could be changed depending on the user input.

Another approach is to use custom data source API – our custom communication protocol allowing you to retrieve already aggregated data from a server to Flexmonster Pivot. You could implement the custom data source API server, which gets the data from Elastic Search, processes it, and sends it to Flexmonster.
The protocol gives you full control over data processing but requires more developers' resources than the previous solution.

Our team kindly suggests trying the subquery approach first.

Please let us know what you think. Do not hesitate to contact us in case further questions arise.

Kind regards,
Milena

Public

Kevin Ullrich ⋅ August 17, 2020

Hi!

This is exactly what we were looking for. I guess we didn't notice the subquery function for some reason. Fantastic!
Thank you very much.

Public

Milena Pechura ⋅ Flexmonster ⋅ August 17, 2020

Hi!

We are glad to hear that our recommendation helped.

Please let us know in case any further questions arise.

Best regards,
Milena

Public

bernard dradeou ⋅ April 20, 2021

Could you please provide an example of a subquery for elasticsearch ?
Whatever i have tried i get "query malformed, no start_object after query name"
"dataSource": {
"type": "elasticsearch",
/* the host for connection */
"node": "http://localhost:9200",
/* the name of Elasticsearch index to connect */
"index": "dbs",
"subquery":"'bool':{'must':[{'match':{'doc_type':'2'}}]}"
}

Public

Vera Didenko ⋅ Flexmonster ⋅ April 21, 2021

Hello,

Thank you for reaching out to us.

Here is an example of how subquery could be used for Elasticsearch:

"dataSource": {
   "type": "elasticsearch",
   "node": "https://olap.flexmonster.com:9200",
   "index": "fm-product-sales",
   "subquery": {
      "bool": {
         "filter": [
            {
             "terms": {
                "Category.keyword": ["Bikes", "Accessories"],
             }
            },
            {
             "terms": {
                "Color.keyword": ["blue", "red"]
             }
            }
         ]
      }
   },
   "mapping": {...}
}

We have prepared a JSFiddle example for illustration: https://jsfiddle.net/flexmonster/fo2kj4hw/.

Please let us know if this helps.

Kind regards,
Vera

Changes to Flexmonster Software License Agreement

Elasticsearch-aggregation based on query instead of all data

5 answers

Please login or Register to Submit Answer