Data scientist skills

Skills Of The Data Expert In Tomorrow’s Job Market: More Than Just Number Crunching

Have you ever been in a job interview where the interviewer asked a question that made you think “I don’t think a good answer or solution exists for this problem”, leaving you unsure how to respond even though you had the right skills, knowledge and experience for the job? I often ask such questions when I interview candidates. Why? Because I am testing their creativity. I ask questions to which I do not know the answer myself, or for which I do not know whether an answer exists at all, to see how the candidate deals with an unknown situation. I am testing candidates’ creativity more than their knowledge.

Big Data Skills In The Financial Sector Job Market

The Dutch-language digital radio program Financials van Morgen (Financials of Tomorrow) interviewed me in a session devoted to working in the financial sector. In this interview I talked about the skills required in (big) data analytics jobs in finance, and more broadly in any other industry. If you missed it, you can still listen to the podcast, or specifically to my interview. The financial sector is a good “use case” for discussing data skills because it is very data-intensive and IT-savvy. Many financial institutions have large IT departments and sizeable data / analytics departments.

Four Skill Areas For Data Professionals

Data jobs come in many flavors. Data vacancies have titles such as “data scientist”, “data analyst”, “BI analyst” (BI stands for Business Intelligence), “data engineer” and more. What kind of skills do we expect these people to have? In short, I believe a data & analytics team needs four skill areas: (1) data annotation and cleansing, (2) data analysis, (3) data interpretation and (4) data communication.

Data Annotation and Cleansing

This area is the least “sexy” part of the job, yet a critical success factor. The reality is that many datasets are not fully suitable for automation. Incompleteness, inconsistency and incorrectness are just a few known problems. Consider for example a dataset of your clients, where in some cases (not always) the client’s address appears in the company name field, or where in some cases the company’s name or country is simply omitted. If you want to analyse such a dataset, you first need to clean the data and annotate it, i.e. add the tags that are necessary for your analysis. Think about annotating whether someone is a client or not, or what type of client they are (e.g. B2C vs B2B). If these steps are not done well, the subsequent data analysis may fail because the data needed to produce the required insights will not be available.
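The client dataset example above can be sketched in a few lines of Python. This is only a minimal illustration: the records, the field names (company, country, vat_id), the address heuristic and the rule that a VAT id implies a B2B client are all assumptions made up for this sketch, not rules from any real system.

```python
import re

# Hypothetical client records; field names and values are invented for illustration.
clients = [
    {"company": "Acme BV", "country": "NL", "vat_id": "NL123"},
    {"company": "12 Main Street, Springfield", "country": "US", "vat_id": ""},
    {"company": "", "country": "", "vat_id": ""},
]

# Rough heuristic (an assumption): a "company name" that starts with a
# house number followed by a word looks like an address.
ADDRESS_PATTERN = re.compile(r"^\d+\s+\w+")


def find_issues(record):
    """Return a list of data quality issues found in one client record."""
    issues = []
    company = record.get("company", "").strip()
    if not company:
        issues.append("missing company name")
    elif ADDRESS_PATTERN.match(company):
        issues.append("address in company name field")
    if not record.get("country", "").strip():
        issues.append("missing country")
    return issues


def annotate(record):
    """Return a copy of the record with the tags needed for later analysis."""
    tagged = dict(record)
    tagged["issues"] = find_issues(record)
    # Assumed rule: a record with a VAT id is treated as a business client.
    tagged["client_type"] = "B2B" if record.get("vat_id") else "B2C"
    return tagged


annotated = [annotate(r) for r in clients]
for row in annotated:
    print(row["company"] or "(empty)", row["client_type"], row["issues"])
```

In practice the flagged records would go to someone who can correct them; the point of the sketch is only that cleansing and annotation are explicit steps that happen before any analysis.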

Data Analysis, Part 1: Routine Data Analysis

It goes without saying that a data analyst / data scientist (or any similar job) needs to have the hard skills required to do the job. They need to be familiar with analytics / BI tools, they need to have experience in data modelling techniques etc. But these hard skills alone will not suffice in today’s fast-changing environment. Data professionals require creativity and curiosity.

Data analysis tasks can be categorized as either routine, well-defined analyses or explorative analyses. The former is a situation where you already know a domain, and you know that if you perform a pre-defined analysis, you’ll be able to obtain specific insights. For example, if you examine your monthly sales and break them down by country or by client, you’ll be able to understand in which markets you are more successful.
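A routine analysis of this kind is essentially a pre-defined aggregation. The sketch below shows one possible form, assuming a hypothetical list of sales records with country, client and amount fields; all names and numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sales records for a single month (invented values).
sales = [
    {"country": "NL", "client": "Acme BV", "amount": 1200.0},
    {"country": "NL", "client": "Globex", "amount": 800.0},
    {"country": "DE", "client": "Initech", "amount": 500.0},
]


def total_by(records, key):
    """Sum the 'amount' field of the records, grouped by the given key."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


by_country = total_by(sales, "country")
by_client = total_by(sales, "client")

# Rank markets from highest to lowest revenue.
ranked = sorted(by_country.items(), key=lambda item: item[1], reverse=True)
for country, amount in ranked:
    print(country, amount)
```

The same function answers both the "which country" and the "which client" question, which is what makes the analysis routine: the question and the method are known in advance.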

Data Analysis, Part 2: Explorative Data Analysis

Explorative analysis is more innovative in its nature. You perform an analysis with the aim of identifying insights, but you do not know yet (or: not for sure) which parameters will deliver the insights. You then engage in an iterative process of trial-and-error, or try-learn-improve. Sometimes you learn that you need to redefine your questions, or obtain extra inputs (datasets). This iterative process requires creativity because you do not know in advance how to obtain the insights, and sometimes you do not even know exactly which insights you’re looking for. You just work with the assumption that the datasets contain interesting insights, and you’re trying to find out what exactly. In other words, neither your problem statement nor your process to solve the problem is well-defined in advance. Hence creativity is a “must have” skill. I would add that curiosity is also very important. Curious people are likely to be more creative in dealing with such ill-defined problems, because curiosity makes them try different approaches.

Data Interpretation – Understanding the Context

Analytic models help data experts see patterns and outliers in datasets, understand trends and spot anomalies/problems. Yet without interpreting the data analysis results, one cannot draw meaningful conclusions. That’s why data interpretation is a third important skill area. For example, data analysis may show that 40% of your revenue is generated in one country. Taken by itself, this figure may suggest that you’re successful in this country. In order to interpret the data, however, you need to understand the context. For example, if in the preceding 4 months 50%-60% of your revenue was generated in that country, you may think that there is a problem. When you add context parameters such as the actual sales amount and the spend on marketing per country, you can draw meaningful conclusions: maybe the share of this specific country decreased because you actually entered a new market (new country) successfully, and your total sales increased significantly.
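The revenue share example can be made concrete with a little arithmetic. The figures below are invented purely to illustrate the reasoning: country A's share drops from 55% to 40%, yet its absolute revenue grows, because a hypothetical new market C was added.

```python
def shares(revenue_by_country):
    """Return each country's share of total revenue as a fraction of 1."""
    total = sum(revenue_by_country.values())
    return {country: amount / total for country, amount in revenue_by_country.items()}


# Invented revenue figures for two periods (same currency).
previous = {"A": 55_000, "B": 45_000}
current = {"A": 60_000, "B": 45_000, "C": 45_000}  # C is a newly entered market

prev_share = shares(previous)["A"]
curr_share = shares(current)["A"]

print(f"Share of A: {prev_share:.0%} -> {curr_share:.0%}")
print(f"Revenue of A: {previous['A']} -> {current['A']}")
print(f"Total revenue: {sum(previous.values())} -> {sum(current.values())}")
```

Looking at the share alone suggests a decline; adding the absolute amounts as context shows that country A actually grew while the total grew faster.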

Data Interpretation – Dealing With Bias

“Context” is a tricky thing though, because it opens up room for introducing bias, i.e. the tendency to over- or under-estimate the value of a parameter in data analysis. For example, if you develop a model for understanding how tall 13-year-old boys are, but you use data about boys from China only, your model will be biased towards Chinese boys, and not suitable for other countries. This is an easy example because you can easily identify the bias. But sometimes bias is hidden, or implicit. One such case is when you are not aware of the bias, for example when you’re not aware that your sample dataset includes only Chinese boys. A more interesting case is when the bias is caused by the data expert’s own bias. Humans tend to have biases due to their prior experiences, and the risk is that such biases impact their data analysis. For example, if you do a study on alcohol consumption, and you expect that men drink more alcohol than women, you may avoid investigating other potential parameters that could define consumption of alcohol (e.g. profession, age, presence on social media). You’re looking for a specific answer (because of your bias), and therefore you’ll fail to identify the impact of other parameters on the answer. Consequently, data experts must have strong ethics and challenge themselves and their peers about potential biases in their data and in their analysis. The data expert is an investigator in pursuit of “the truth” while avoiding bias.
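One simple way to surface the first kind of hidden bias is to check how a sample is distributed over a field before modelling. The sketch below is a rough illustration only: the records, the field name country and the 90% threshold are assumptions chosen for this example, and such a check cannot detect an analyst's own preconceptions.

```python
from collections import Counter


def value_shares(records, field):
    """Return the share of records per value of the given field."""
    counts = Counter(record.get(field, "unknown") for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def dominant_values(records, field, threshold=0.9):
    """List values that make up at least `threshold` of the sample."""
    return [
        value
        for value, share in value_shares(records, field).items()
        if share >= threshold
    ]


# Hypothetical samples: one drawn from a single country, one mixed.
one_country = [{"id": i, "country": "CN"} for i in range(10)]
mixed = [{"id": i, "country": c} for i, c in enumerate(["CN", "NL", "US", "BR"])]

print(dominant_values(one_country, "country"))
print(dominant_values(mixed, "country"))
```

A non-empty result is a prompt to ask whether the sample really represents the population the model is meant for, not proof of a problem in itself.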

Data Communication

Finally, after data experts analyse and interpret data, and draw conclusions, they need to communicate the results (the derived insights), typically to business stakeholders, i.e. people who run the business and make decisions based on the generated insights.

The challenge here is that these stakeholders do not necessarily understand the process of creating these insights. Nor are they always aware of relevant constraints such as (lack of) data availability, or data quality. They are typically not interested in understanding how the insights were created (if your explanation is too technical, you’ll lose their attention), and often they may not have the patience to hear about your modelling decisions and the embedded assumptions. They just want to hear the bottom line, ideally in their own lingo. This is where the data expert needs communication skills, or else the new insights may go unused, or may be misunderstood by the business users, potentially leading them to make wrong decisions.

The Soft-Skilled Data Analytics Professional

The hard skills of number-crunching and modelling techniques are therefore just one part of the skillset for data analytics professionals. When teams get larger, some data experts may engage more in tasks that require hard skills, while others engage more in tasks that also require soft skills. Yet because data analysts are not domain experts, they will always work better when interacting with domain experts, including the users of the insights. A few years ago the hard skills of data analysis were still hard to find, and if you had these skills you could easily find a job as a data analyst / scientist / expert. As data analysis becomes a mainstream job, and the hard skills become more and more a “commodity”, I expect that the soft skills (as described in this article) will gradually become the differentiating factor in this job market.

What are your thoughts?

Suggested reading: Risk Identification in International Trade – From Data to Science to Insights

About the author

Ziv Baida