calculated field for datasource.type ocsv and elasticsearch

posted on April 20th, 2020

Answered

Jaco asked on April 20, 2020

Hi,

I noticed that for datasource.type ocsv there is an option to create "calculated fields", but for datasource.type elasticsearch the option is not available. Why is that, and how can I enable it?

Hope to hear from you.
Regards Jaco

5 answers

Illia Yatsyshyn ⋅ Flexmonster ⋅ April 22, 2020

Hello, Jaco,

Thank you for reaching out to us.

Our team would like to kindly inform you that calculated values are not currently available for Elasticsearch. This is due to the specifics of the data exchange process between Flexmonster and Elasticsearch: the approach used imposes some limitations, including that calculated values cannot be implemented.

The feature is currently implemented for CSV and JSON data sources. Also, Flexmonster Data Server has been released with the minor update, version 2.8.5. It allows connecting to JSON files, CSV files, and SQL databases.

Connecting to the data source through the mentioned server allows using calculated values for all of these data sources.
You are welcome to find out more about it in our documentation.
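
For illustration, a calculated value is declared as a measure with a `formula` in the report's slice. The sketch below only builds the report objects and does not need Flexmonster to run. The file `sales.csv`, the fields `Category`, `Price`, and `Quantity`, the index name `sales`, and the server address `http://localhost:9500` are all assumptions made for the example, and the local types cover only the properties used here.

```typescript
// Minimal local types covering only the report properties used in this sketch;
// they are not Flexmonster's full type definitions.
interface Measure {
  uniqueName: string;
  aggregation?: string;
  formula?: string; // present only on calculated values
  caption?: string;
}

interface PivotReport {
  dataSource: { type: string; filename?: string; url?: string; index?: string };
  slice: { rows: { uniqueName: string }[]; measures: Measure[] };
}

// Calculated value: revenue per unit sold.
// "Price" and "Quantity" are assumed field names.
const revenuePerUnit: Measure = {
  uniqueName: "Revenue per Unit",
  formula: "sum('Price') / sum('Quantity')",
  caption: "Revenue per Unit",
};

// The same slice can be reused with different data sources.
function buildReport(dataSource: PivotReport["dataSource"]): PivotReport {
  return {
    dataSource: dataSource,
    slice: {
      rows: [{ uniqueName: "Category" }], // assumed field name
      measures: [{ uniqueName: "Price", aggregation: "sum" }, revenuePerUnit],
    },
  };
}

// A CSV file loaded directly by the component (hypothetical file name).
const csvReport = buildReport({ type: "csv", filename: "sales.csv" });

// The same report served through Flexmonster Data Server (hypothetical index name).
const dataServerReport = buildReport({
  type: "api",
  url: "http://localhost:9500",
  index: "sales",
});
```

Either object would be passed as the `report` when creating the pivot table; only the `dataSource` part differs between the two.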

Do not hesitate to contact us in case other questions arise.

Best regards,
Illia

Jaco ⋅ April 23, 2020

Hi Illia,

Thanks for your answer. I have upgraded to the new version 2.8.5 and have taken a look at the Flexmonster Data Server.
I really like the idea of what the FM Data Server is meant to do, namely to significantly reduce data loading time and enable analyzing large datasets.
With the FM Data Server it is indeed possible to get larger datasets from SQL Server and it is really fast.
But I came across the following issues:

The queries for some of our datasets return millions of records and lots of columns. The FM Data Server returns a timeout for these queries.
We want this data to be near-real-time, so we need the refresh interval at 1 minute (which is possible).
But this would mean that the FM Data Server would execute a resource-intensive query every minute with millions of records. This would put unnecessary (high and constant) pressure on the database, and that is not desirable.

So for us, at this moment, this is not a viable solution. Maybe if the data were synced incrementally instead of being read in full each time, it would work better for us.
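
The kind of syncing meant here can be sketched as a cache that remembers the latest modification time it has seen and asks the database only for rows changed after it. This is a minimal illustration, not a Flexmonster feature: the `id`, `updatedAt`, and `amount` columns and the `fetchChangedSince` stand-in for the database query are hypothetical.

```typescript
// A row as the sync sketch sees it; all column names are assumptions.
interface SourceRow {
  id: number;
  updatedAt: number; // epoch milliseconds of the last modification
  amount: number;
}

class IncrementalCache {
  private rows: { [id: number]: SourceRow } = {};
  private watermark = 0; // highest updatedAt applied so far

  // Timestamp to use in the next "changed since" query.
  since(): number {
    return this.watermark;
  }

  // Merge only the changed rows; returns how many rows were applied.
  apply(changed: SourceRow[]): number {
    for (const row of changed) {
      this.rows[row.id] = row; // insert new rows, overwrite updated ones
      if (row.updatedAt > this.watermark) {
        this.watermark = row.updatedAt;
      }
    }
    return changed.length;
  }

  size(): number {
    return Object.keys(this.rows).length;
  }

  total(): number {
    let sum = 0;
    for (const key of Object.keys(this.rows)) {
      sum += this.rows[Number(key)].amount;
    }
    return sum;
  }
}

// Stand-in for the database: returns rows modified after `since`.
function fetchChangedSince(table: SourceRow[], since: number): SourceRow[] {
  return table.filter(function (row) { return row.updatedAt > since; });
}

// Initial load: everything is new, so both rows are transferred.
const sourceTable: SourceRow[] = [
  { id: 1, updatedAt: 100, amount: 10 },
  { id: 2, updatedAt: 200, amount: 20 },
];
const orderCache = new IncrementalCache();
const firstSync = orderCache.apply(fetchChangedSince(sourceTable, orderCache.since()));

// Later: one row is added and one is updated; only those two are transferred.
sourceTable.push({ id: 3, updatedAt: 300, amount: 5 });
sourceTable[0] = { id: 1, updatedAt: 350, amount: 12 };
const secondSync = orderCache.apply(fetchChangedSince(sourceTable, orderCache.since()));
```

The sketch does not handle deleted rows, which would need a separate mechanism such as a soft-delete flag.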

Regards Jaco

Illia Yatsyshyn ⋅ Flexmonster ⋅ April 23, 2020

Hello, Jaco,

Thank you for your feedback; it will be considered when implementing future versions of Flexmonster Data Server.

Our team would like to kindly explain that Flexmonster Data Server allows loading only the data required for the current slice without a need to load the whole data set. It allows decreasing the loading time and dramatically boosts performance.

Concerning syncing instead of loading the whole chosen part at once, we would like to kindly inform you that such a feature is not available in the current version of Flexmonster Data Server. Even so, improvements related to updating the data are likely to appear in one of the future releases of Flexmonster Data Server.

For now, it is possible to create your own implementation of the custom data source API protocol with an update algorithm appropriate for your case. Detailed information about the protocol can be found in our documentation and in the following blog post.
A brief overview of the request and response structure is available in our API Reference.
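
As a rough illustration of what such a server does, the sketch below answers a grouped-sum request from data it holds in memory, so the server is free to keep that data current with whatever update algorithm suits the case. The request and response shapes are deliberately simplified assumptions and are not the protocol's actual field names or nesting, which should be taken from the API Reference; the `orders` index and its fields are hypothetical.

```typescript
// Simplified shapes for illustration only; NOT the real protocol's structure.
interface SelectRequest {
  index: string;    // which data set to query
  groupBy: string;  // field to group rows by
  sumField: string; // numeric field to aggregate
}

interface AggregatedRow {
  key: string;
  sum: number;
}

type DataRow = { [field: string]: string | number };

// Aggregate on the server so only the grouped totals travel to the client.
function handleSelect(
  data: { [index: string]: DataRow[] },
  req: SelectRequest
): AggregatedRow[] {
  const rows = data[req.index] || [];
  const sums: { [key: string]: number } = {};
  for (const row of rows) {
    const key = String(row[req.groupBy]);
    const value = Number(row[req.sumField]) || 0; // non-numeric values count as 0
    sums[key] = (sums[key] || 0) + value;
  }
  // Sort keys so the output order is deterministic.
  return Object.keys(sums).sort().map(function (key) {
    return { key: key, sum: sums[key] || 0 };
  });
}

// Hypothetical in-memory data set, kept up to date by the server's own logic.
const sampleStore: { [index: string]: DataRow[] } = {
  orders: [
    { country: "NL", amount: 10 },
    { country: "DE", amount: 7 },
    { country: "NL", amount: 5 },
  ],
};

const totalsByCountry = handleSelect(sampleStore, {
  index: "orders",
  groupBy: "country",
  sumField: "amount",
});
```

A real implementation would expose a handler like this over HTTP and would also have to describe the available fields and their members to the component.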

Also, we have prepared two open-source sample servers implementing the custom data source API. You are welcome to use them as a reference while creating your own implementation. Both of them are available in our GitHub repository.
An overview of them can be found in our documentation as well.

Finally, our team would like to ask which database you use and what the approximate size of the data set to be displayed in Flexmonster is.

Please contact us in case of additional questions.

Best regards,
Illia

Jaco ⋅ May 15, 2020

Hi Illia,

Regarding your remark:
"Our team would like to kindly explain that Flexmonster Data Server allows loading only the data required for the current slice without a need to load the whole data set. It allows decreasing the loading time and dramatically boosts performance."

In the Data Server you predefine the datasets (slices), and then we use filtering (mostly on the period) to get a new subset of data. To get those new subsets, we need the Data Server to contain all of the data we could ask for.
For example:
One of our main tables contains 18 million records. Our reports range from showing insight over the past 5 years to showing the current week.

So the Data Server must have all 18 million records, and then Flexmonster asks for the data for a given period. Reading in all those 18 million records with all the data connected to them (customer info, project info, etc.) is way too time-consuming, and so is keeping this data up to date.
Therefore, reading the entire dataset every x minutes is not an option.

We use both SQL Server and Elasticsearch.

Regards Jaco

Illia Yatsyshyn ⋅ Flexmonster ⋅ May 19, 2020

Hello, Jaco,

Thank you for providing us with such a detailed response.

Our team would like to confirm that every-minute data reloading can be too time-consuming for the mentioned data size.

We suggest considering splitting the data set into several indexes (for example, by year) in order to avoid loading the whole data set to the server. More information about how to do that can be found in our documentation.
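
A split by year could look roughly like the sketch below, which generates one index name and one SQL query per year and then points the report at the index for the year being analyzed. The `Orders` table, the `OrderDate` column, the `orders_<year>` naming scheme, and the server address are hypothetical; how the generated queries are registered in the Data Server configuration is described in the documentation and is not shown here.

```typescript
interface YearIndex {
  name: string;  // index name the report will refer to
  query: string; // SQL query that fills the index
}

// Build one index definition per year, from firstYear to lastYear inclusive.
// Table and column names are assumptions made for the example.
function buildYearIndexes(firstYear: number, lastYear: number): YearIndex[] {
  const indexes: YearIndex[] = [];
  for (let year = firstYear; year <= lastYear; year++) {
    indexes.push({
      name: "orders_" + year,
      query:
        "SELECT * FROM Orders WHERE OrderDate >= '" + year + "-01-01'" +
        " AND OrderDate < '" + (year + 1) + "-01-01'",
    });
  }
  return indexes;
}

// Client side: select the index that matches the year being analyzed.
function dataSourceForYear(year: number): { type: string; url: string; index: string } {
  return { type: "api", url: "http://localhost:9500", index: "orders_" + year };
}

const yearIndexes = buildYearIndexes(2016, 2020);
const currentYearSource = dataSourceForYear(2020);
```

A report that spans several years would still need either a wider index or separate reports per year; that trade-off is not addressed by this sketch.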

In case splitting the data set is not an option, we suggest considering developing your own implementation of the custom data source API. It can be complemented with an updating algorithm that would be appropriate for your case.

Please contact us in case additional assistance is needed.

Kind regards,
Illia
