Problem handling a large dataset with Elasticsearch and the Data Server

Ricky Man asked on November 5, 2020

Hi, I have a report with quite a lot of aggregations on a large dataset (~14 GB), but the loading time is a problem. Any suggestions for improving performance when loading a large dataset?
I tried Elasticsearch as the data source, but even after I increased the request timeout, the report still takes minutes to load.
Then I tried the Data Server with a local CSV file (also ~14 GB so far), but the job is killed when I start the server.
The logs look like this:
2020-11-05 06:41:37.1189|INFO|Flexmonster.DataServer.HostedServices.MonitorUserUpdateService|Monitor User Storage Service is running
2020-11-05 06:41:37.4189|INFO|Flexmonster.DataServer.Core.PrepopulatingCacheService|Prepopulation service start working
2020-11-05 06:41:37.4248|INFO|Flexmonster.DataServer.Core.DataStorages.DataStorage|Start loading index sample-index
2020-11-05 06:41:37.7646|INFO|Flexmonster.DataServer.Core.PrepopulatingCacheService|Index sample-index was loaded in 0.3443089 seconds
2020-11-05 06:41:37.7646|INFO|Flexmonster.DataServer.Core.DataStorages.DataStorage|Start loading index test-index

1 answer

Mykhailo Halaida Flexmonster November 9, 2020

Hi Ricky,
Thank you for writing to us.
We would suggest narrowing down your dataset if possible. At ~14 GB it is already rather large for a CSV file, and it gets even bigger once the Data Server has processed it.
The Data Server keeps the processed data in your machine’s RAM, and the machine may simply not have enough memory for it, which would explain why the job is killed on startup.
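One possible way to narrow down a CSV before pointing the Data Server at it is to stream it through a small script that keeps only the columns and rows the report needs. The sketch below is not part of Flexmonster. It uses only Python's standard `csv` module, and the function name `narrow_csv`, the file names, and the column names in the usage note are hypothetical. Because it reads one row at a time, the script itself should not need to hold the 14 GB file in memory.

```python
import csv


def narrow_csv(src_path, dst_path, keep_columns, row_filter=None):
    """Copy src_path to dst_path, keeping only keep_columns and the rows
    for which row_filter(row) is true. Returns the number of rows written.

    Hypothetical helper for illustration; rows are streamed one at a time.
    """
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)

        # Fail early if a requested column is not in the header row.
        header = reader.fieldnames or []
        missing = [c for c in keep_columns if c not in header]
        if missing:
            raise ValueError(f"columns not found in source: {missing}")

        # extrasaction="ignore" drops every column not listed in keep_columns.
        writer = csv.DictWriter(dst, fieldnames=keep_columns,
                                extrasaction="ignore")
        writer.writeheader()

        for row in reader:
            if row_filter is None or row_filter(row):
                writer.writerow(row)
                kept += 1
    return kept
```

For example, assuming the source file has `country`, `year`, and `amount` columns, `narrow_csv("full.csv", "report.csv", ["country", "amount"], lambda r: r["year"] == "2020")` would write only the 2020 rows with two columns. How much this reduces the Data Server's memory use depends on the data, so it is worth checking the size of the output file before loading it.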
We hope this helps.