Data visualisation in the service of Sherlock Holmes
“I see it, I deduce it…You see, but you do not observe.”
A Scandal in Bohemia.
A week ago we started a series of blog posts about data visualisation by covering the main aspects of qualitative visualisations. Following the logic of using visualisation at the different stages of a typical analytic process, we continue by describing the role of visualisation techniques in exploratory data analysis (EDA). It is a timely topic, as many EDA approaches have by now been assimilated into data mining as well as into big data analytics.
After preprocessing and cleaning the raw data, we can finally get our hands on it. At this stage, exploratory data analysis is a great place to start. EDA is primarily aimed at seeing what the data can tell us beyond the formal modelling or hypothesis-testing task. The term was coined and promoted by John Tukey, who likened it to detective work. And indeed, the role of an analyst conducting data exploration is in many ways similar to that of a detective investigating a crime. Why is that? Like any detective, a data researcher collects evidence, hints, and clues related to the central issue of the case. In other words, both explore the data in as many ways as possible until a plausible story emerges. This is a good place to cite “A Scandal in Bohemia” once more to emphasise the idea:
“It is a capital mistake to theorise before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
Thus, it is no wonder that data visualisation is commonly used as one of the main tools in EDA.
Returning to the data analysis process, we can distinguish the following major research goals:
- Observing a time-based process
- Exploring relationships
- Finding clusters
- Checking distributions, comparing mean differences and discovering outliers.
All of them can employ visualisation techniques to deepen our understanding of the data. By means of pivot tables, scatter plots, bar charts, histograms, multivariable charts, etc., one can explore and analyse huge amounts of information in order to arrive at a valid hypothesis for further investigation. Also, being able to do this kind of work entirely in a browser has a big impact on productivity and flexibility.
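To make a few of these goals more concrete, here is a minimal sketch in plain Python of three exploratory steps: a pivot-style summary, a binned distribution check, and an outlier check. The sales records are invented purely for illustration, the helper names (`pivot_sum`, `binned_counts`, `iqr_outliers`) are hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers, not the only one.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical sales records, invented purely for illustration.
records = [
    {"region": "North", "quarter": "Q1", "sales": 120},
    {"region": "North", "quarter": "Q2", "sales": 135},
    {"region": "South", "quarter": "Q1", "sales": 90},
    {"region": "South", "quarter": "Q2", "sales": 410},
    {"region": "East", "quarter": "Q1", "sales": 110},
    {"region": "East", "quarter": "Q2", "sales": 125},
    {"region": "West", "quarter": "Q1", "sales": 100},
    {"region": "West", "quarter": "Q2", "sales": 130},
]


def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, like a tiny pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    return {key: dict(cols) for key, cols in table.items()}


def binned_counts(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    bins = defaultdict(int)
    for value in values:
        bins[(value // bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


sales = [row["sales"] for row in records]
pivot = pivot_sum(records, "region", "quarter", "sales")
histogram = binned_counts(sales, bin_width=100)
outliers = iqr_outliers(sales)

print(pivot)      # regions as rows, quarters as columns
print(histogram)  # how many values fall into each 100-wide bin
print(outliers)   # values worth a closer look
```

On this toy data, the South Q2 figure of 410 is the only value flagged, which is exactly the kind of clue a chart would make visible at a glance and an analyst would then investigate further.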
So in our next blog post, we are going to run an exploratory data analysis with the help of the Flexmonster Pivot Component.
See you soon!