Introduction – The Difference Between Data Analyst and Data Scientist Positions
Data scientists and data analysts have a love-hate relationship. It is all in good spirit, of course, but the boundaries between these two areas of practice have only recently begun to take shape and are still not completely clear.
In the past, statisticians performed all of these roles. They retrieved the data, gained insights from it and built statistical models.
As the amount of data in organizations grew significantly and more skill was required to retrieve data from complex data warehouses, a new field called ‘Data mining’ emerged, which would later turn into the ‘Data Analyst’ role.
In parallel, another field called ‘Data Science’ emerged and inherited the statisticians’ other role: building statistical models.
The split of the statistician’s role into these two areas created confusion about the scope of each new role. Managers of many start-up companies wanted a Data Scientist in the company, a position that is considered very innovative. In practice they needed a Data Analyst, because what they wanted was to gain insights from the data, and they did not know how to distinguish between the two roles.
This phenomenon created strange situations. Data analysts were recruited as data scientists and brushed up their Python in order to learn how to run machine learning (ML) algorithms with the scikit-learn package. Meanwhile, data scientists who specialized in machine learning actually worked as data analysts and were frustrated that they were not building ML models and were performing a role that did not suit them.
This confusing situation still exists today (at least in Israel, where I live). There are still companies that treat the Data Analyst as the entry-level role and the Data Scientist as a more advanced data analyst who can also develop ML algorithms.
Despite this confusion, the boundaries between the professions are starting to become clearer. More companies realize that there is a role for data analysts, who answer business questions and know how to set up strategic KPI reports that help the company improve performance, and a separate role for data scientists, who develop machine learning products such as computer vision systems.
These days, a new family of algorithms known as AutoML is being developed, and it may once again challenge the boundaries between the roles.
What is AutoML
The term AutoML is a blend of the words Automation and Machine Learning, and it refers to algorithms that can automatically produce machine learning models. That means that instead of data scientists spending many hours creating machine learning models, the algorithms will create them.
In other words, instead of hiring a data scientist to train a neural network or to decide that a different statistical algorithm is a better choice, the AutoML algorithm will do that work. At the click of a button, the algorithm will take care of preparing the data, cleaning it, training several models and deciding which model is the most successful.
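To make the "train several models and decide which is the most successful" step concrete, here is a minimal sketch in plain Python. It is not how any particular AutoML product works. The three candidate models, the synthetic data and the function names (fit_mean, fit_linear, fit_nearest, auto_select) are all assumptions made up for this illustration. The idea it shows is only the selection loop: fit each candidate on one part of the data and keep the one with the lowest error on the part it has not seen.

```python
import random


def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Candidate: ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """Candidate: 1-nearest-neighbour, copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def auto_select(xs, ys, holdout=0.25, seed=0):
    """Fit every candidate on a training split and score it on a hold-out split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - holdout))
    train, valid = idx[:cut], idx[cut:]
    x_tr, y_tr = [xs[i] for i in train], [ys[i] for i in train]
    x_va, y_va = [xs[i] for i in valid], [ys[i] for i in valid]

    candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
    scores = {name: mse(fit(x_tr, y_tr), x_va, y_va) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)  # lowest hold-out error wins
    return best, scores


# Synthetic data (an assumption for the demo): a noisy straight line.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]

best_name, scores = auto_select(xs, ys)
print("selected model:", best_name)
print("hold-out errors:", scores)
```

Real AutoML tools search over far more model types and settings and also automate the data preparation, but the analyst-facing result is the same: a ranked list of candidates and one recommended model.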
It is important to emphasize that even fans of this approach admit that AutoML algorithms are still in their early stages, and that it may take some more time for them to mature and become stable enough to work without human intervention.
What does this mean for the future of the Data Scientist role? It’s too early to know. The internet is buzzing with discussions about the future role of Data Science in the organization. If a Data Engineer or a Data Analyst could easily build and run models, what would be unique about Data Science in the process? Maybe in the future we will only need expert Data Scientists who know how to build very complex models that AutoML algorithms cannot build. There is no settled opinion on these issues yet, but in this post we will focus on the impact of these developments on the role of the Data Analyst.
How AutoML can help a data analyst
As mentioned before, the role of a data analyst is to answer business questions, such as explaining why a phenomenon occurs in the data. Now the analyst can run an AutoML algorithm that identifies the most influential factor, and then focus their efforts on the business factors and the sociological origins of the question.
For example, if an analyst wants to know why sales have increased in Wisconsin, she can run an AutoML algorithm that locates the variable that most affected sales in Wisconsin, then focus her analysis on trying to figure out the business or sociological factors that caused the phenomenon in Wisconsin rather than in New York. The analyst can check whether a competing company has been set up in Wisconsin, or whether demonstrations against the company’s products have broken out in Wisconsin because of a YouTube video of people spreading conspiracy theories about the company. Such analyses require resources and time, and AutoML algorithms can free up that time for the data analyst.
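As a rough illustration of "locating the variable that most affected sales", the sketch below ranks a few candidate variables by the strength of their correlation with weekly sales. This is a deliberately simple stand-in, not what an AutoML tool actually runs, and the weekly figures and variable names (ad_spend, avg_price, store_count) are invented for the example. Keep in mind that a strong correlation only tells the analyst where to look; it does not prove that the variable caused the change.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rank_drivers(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical weekly figures for one state (made up for this example).
weekly_sales = [100, 104, 110, 118, 121, 130, 138, 145]
features = {
    "ad_spend": [10, 11, 12, 14, 14, 16, 18, 19],
    "avg_price": [20, 20, 19, 21, 20, 19, 20, 21],
    "store_count": [5, 5, 5, 5, 6, 6, 6, 6],
}

ranking = rank_drivers(features, weekly_sales)
for name, corr in ranking:
    print(f"{name}: {corr:.2f}")
```

Once the top variable is known, the analyst's real work starts: finding the business or sociological story behind it.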
As another example, a data analyst who wants to know the lifetime value (LTV) of the customers that came from various marketing channels can run an AutoML algorithm and, with the click of a button, get the customer value for each channel. The analyst can then focus on analyzing the return on investment in each channel (ROI analysis) and on understanding why the users who came from the TikTok campaign are worth more than the users who came from the campaigns on Google. Maybe using TikTok calms potential customers, so that when they sign up they tend to buy a larger subscription package?
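The ROI step that follows the LTV calculation is simple arithmetic, and a short sketch may help. Everything here is an assumption made for illustration: the customer records, the campaign spend and the function names. In particular, historical revenue per customer is used as a stand-in for LTV, whereas a real LTV figure would normally be a model's prediction of future revenue.

```python
from collections import defaultdict


def ltv_by_channel(customers):
    """Average revenue per customer for each acquisition channel."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for channel, revenue in customers:
        totals[channel] += revenue
        counts[channel] += 1
    return {channel: totals[channel] / counts[channel] for channel in totals}


def roi_by_channel(customers, spend):
    """ROI per channel, computed as (total revenue - spend) / spend."""
    revenue = defaultdict(float)
    for channel, value in customers:
        revenue[channel] += value
    return {channel: (revenue[channel] - cost) / cost for channel, cost in spend.items()}


# Hypothetical data: (channel, revenue from that customer so far).
customers = [
    ("tiktok", 120.0),
    ("tiktok", 180.0),
    ("google", 90.0),
    ("google", 60.0),
    ("google", 150.0),
]
# Hypothetical campaign spend per channel.
spend = {"tiktok": 100.0, "google": 200.0}

ltv = ltv_by_channel(customers)
roi = roi_by_channel(customers, spend)
print("value per customer:", ltv)
print("ROI per channel:", roi)
```

With numbers like these in hand, the interesting question is no longer the calculation itself but why one channel brings in more valuable customers than another.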
A note about working with machine learning models
Although this new field is still in its early stages, there are already companies offering AutoML services that can help analysts. In addition, there are BI tools that allow analysts to run key indicator algorithms (finding the most influential factor) and time series algorithms (trend line forecasting). If you are a data analyst who has access to these algorithms, do not hesitate to use them and enrich your analyses with them.
Also, when working with statistical models, it is important to check the model’s assumptions and to verify that the model is correct, regardless of whether it was developed by an AutoML algorithm or by a human being.
Summary
The development of AutoML algorithms raises questions about the future of the Data Scientist role and once again blurs the boundaries between Data Science and Data Analysis. The new developments also affect analysts, helping them focus their analysis on the business and sociological factors that are hidden behind the data.
This article was written by Yuval Marnin.
If you need to hire a freelance data analyst, you may contact me at: [email protected]
You may also hire me through the Upwork platform at this link:
https://www.upwork.com/freelancers/~018940225ce48244f0