My Data Analytics Journey | What is Data Analytics?

What is Data Analytics?

Data Analytics is the process of collecting, cleaning and transforming data, then modelling and analysing it to answer key business questions and support sound, data-driven business decisions. There are four types of analytics:



The "What" Descriptive Analytics

After collecting and integrating the data for ease of use, a data analyst first has to ascertain the “What”, for example by using a chart to display information about a company’s past performance. For instance, imagine that you, a data analyst working at an ice-cream store, were given this simple dataset containing the Date, the Sales for that day, and the corresponding Temperature.



Even within this small dataset, it is hard to discern patterns from the numbers alone. As such, to get an overview of the dataset, a data analyst will plot one variable against another, as shown in the charts below.





From these two charts, we can tell that:
- Temperature is positively correlated with Sales
- Sales have dropped sharply over this time period
- Temperature has dropped as a whole
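
To put a number on the first observation, one option is the Pearson correlation coefficient, which ranges from -1 to 1. The snippet below is only a sketch: the readings are made-up values standing in for the ice-cream data (they are not the dataset from this post), and the function name pearson_r is my own.

```python
from math import sqrt

# Hypothetical readings for illustration only, NOT the dataset from this post.
temperatures = [36, 34, 33, 31, 29, 28]
sales = [705, 655, 640, 610, 560, 545]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (the 1/n factors cancel out in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(temperatures, sales)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 means that Sales tend to rise as Temperature rises. It does not, by itself, tell us that the heat is what causes the extra sales.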

The "Why" Diagnostic Analytics
The data analyst will then need to work with business departments such as the product and sales teams to come up with hypotheses and find out more about other factors that could have led to the decrease in sales.

The "What Will" Predictive Analytics
A simple prediction that can be made from this dataset is the expected sales for a day when the temperature is, say, 34 degrees. From the line of best fit (fitted using Ordinary Least Squares), we can predict that sales for that day might be in the ballpark of 660.
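
For readers curious about what the line of best fit involves, here is a minimal sketch of simple Ordinary Least Squares in plain Python. The data points are made up for illustration and deliberately lie on a straight line, so they are not the dataset from this post. The helper names fit_line and predict are also my own.

```python
# Hypothetical readings for illustration only, NOT the dataset from this post.
temperatures = [28, 30, 32, 36]
sales = [540, 580, 620, 700]


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by Ordinary Least Squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of co-deviations / sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Read the predicted y off the fitted line."""
    return intercept + slope * x


slope, intercept = fit_line(temperatures, sales)
predicted = predict(34, slope, intercept)
print(f"Predicted sales at 34 degrees: {predicted:.0f}")
```

With real data the points will not sit exactly on the line, so the prediction is an estimate rather than a guarantee, which is why "in the ballpark" is the right way to phrase it.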


The "How" Prescriptive Analytics

After carrying out the various forms of analytics, the data analyst will work with the business teams to come up with data-driven recommendations, such as appropriate prices for the products and any other changes to be rolled out.

The software available

To help a data analyst carry out these analyses, there are many software tools to choose from. I chose Tableau and Power BI for data visualisation because both offer a drag-and-drop interface, which bypasses a lot of the coding that an analyst would otherwise need to do to achieve similar visualisations. In fact, I picked up Tableau at university and Power BI during my internship at an FMCG firm. MySQL is also useful for designing and creating databases to store tabular data and the relations between tables.

Coding for Data Science & Machine Learning

Additionally, Python and R are both programming languages that enable us to carry out powerful machine learning, thanks to the many libraries available (a library is pre-written code from other programmers that we can simply import and use). For instance, to find the mean of 5 and 7, we can either manually add the two numbers and divide the sum by 2, or we can import mean from Python's statistics module and let it do the work. You can see how much such libraries help once the mathematics becomes more complex, as it quickly does when we move on to machine learning models.
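
The mean example looks like this in Python. The variable names are my own; statistics is part of Python's standard library, so there is nothing extra to install.

```python
from statistics import mean

# Option 1: work it out by hand.
manual_mean = (5 + 7) / 2

# Option 2: let the library function do it.
library_mean = mean([5, 7])

# Both approaches agree on the answer.
print(manual_mean == library_mean)
```

For two numbers the saving is tiny, but the library version stays one line long whether the list holds two values or two million.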


Hopefully this has been useful to you! Stay tuned for the next post where I’ll be going more in-depth into data modelling and how different data can be stored.  



 

