Have you recently been to a CIO Summit and learned about the various ways in which data science can revolutionize your business? Are your competitors building data products garnished with machine learning, and does your CEO feel that you should do the same? It is now time to hire your first superstar data scientist and get back into the game.
However, you soon figure out that hiring a data scientist is far more challenging than hiring a software developer, perhaps for the following three reasons: (1) it is difficult to write a job description for a data scientist role, (2) many data scientists are willing to apply, yet few have the required experience, and (3) few industry standards and benchmarks are available.
Ask them to build a basic version first
Data science and its closely allied cousins, like machine learning, do hold the promise of fundamentally changing the outcome of your business. However, it is important to remember that your first data product should probably be something much more straightforward, for example, a business intelligence dashboard that monitors the overall health of your business using key performance indicators (KPIs). Building something simple yet effective will help you encourage senior management to invest more in your data science team and take on more challenging data projects.
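To give a sense of how small such a first product can be, here is a minimal Python sketch of the calculation that might sit behind a basic dashboard. The `Order` record, its fields and the four KPIs are hypothetical choices made for illustration; your own business will have different metrics and data sources.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    """One sale. The fields are hypothetical and chosen for illustration."""
    day: date
    customer_id: str
    amount: float


def compute_kpis(orders):
    """Return a few headline figures for a list of orders."""
    revenue = sum(o.amount for o in orders)
    customers = {o.customer_id for o in orders}
    return {
        "total_revenue": revenue,
        "order_count": len(orders),
        "unique_customers": len(customers),
        # Guard against division by zero when there are no orders yet.
        "average_order_value": revenue / len(orders) if orders else 0.0,
    }


# Made-up sample data standing in for an export from a sales system.
orders = [
    Order(date(2024, 1, 2), "c1", 120.0),
    Order(date(2024, 1, 2), "c2", 80.0),
    Order(date(2024, 1, 3), "c1", 100.0),
]

kpis = compute_kpis(orders)
for name, value in kpis.items():
    print(f"{name}: {value}")
```

A candidate who can get figures like these in front of management quickly, and explain where each number comes from, has already delivered the kind of simple, effective result described above.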
Also, it is important to remember that predicting outcomes for your business with advanced tools such as machine learning only becomes reliable after multiple iterations, and such systems often take months, if not years, to build.
Ask them to review your data
The data produced or aggregated by your business can range from text to audio files, images and even videos. If your company handles medical or financial records, there are often additional security measures and industry standards for storing and retrieving such data-sets. Make sure that your first hire has previous experience in handling those data types.
Ask them to build data-pipelines
The data produced by your core business is often not adequately aggregated for use. For example, large chunks of the data can be stored in a non-digital format or kept under lock and key in private digital storage. Also, data can become corrupted over time, and new data may arrive in a different data type. Make sure that your first hire has experience in scraping and collating data from multiple data sources, and has expertise in building data-pipelines. This will help you create an operational skeleton for your business that can then be used to produce one or more data products. An efficiently built data-pipeline will also improve data security by ensuring that you can give restricted access to employees who join later.
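As a rough illustration of what a data-pipeline does at its simplest, the Python sketch below collates records from two sources, coerces fields that arrive in different data types, and sets aside corrupted records instead of silently dropping them. The source names (`crm`, `billing`), the field names and the sample records are all assumptions made for this example; a production pipeline would also need storage, scheduling and access control.

```python
from datetime import datetime


def parse_amount(value):
    """Coerce an amount that may arrive as a string or a number; None if unusable."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def parse_day(value):
    """Parse an ISO-style date string; None if missing or malformed."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except (TypeError, ValueError):
        return None


def run_pipeline(*sources):
    """Collate (name, records) pairs into clean rows plus a list of rejects."""
    clean, rejected = [], []
    for source_name, records in sources:
        for record in records:
            amount = parse_amount(record.get("amount"))
            day = parse_day(record.get("day"))
            if amount is None or day is None:
                # Keep corrupted records for later inspection.
                rejected.append((source_name, record))
                continue
            clean.append({"source": source_name, "day": day, "amount": amount})
    return clean, rejected


# Hypothetical exports: one source stores amounts as text, the other as numbers.
crm_export = [
    {"day": "2024-01-02", "amount": "120.50"},
    {"day": "2024-01-03", "amount": "n/a"},
]
billing_db = [
    {"day": "2024-01-02", "amount": 80},
    {"day": None, "amount": 15.0},
]

clean, rejected = run_pipeline(("crm", crm_export), ("billing", billing_db))
print(f"{len(clean)} clean rows, {len(rejected)} rejected")
```

Keeping the rejected records alongside the clean ones is a deliberate choice here: it lets you see how much of your data is unusable and where it came from, which is often the first thing management asks.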
Do not hire a ‘jack of all trades’
Data science is often used in the industry as an umbrella term for a whole set of overlapping skillsets that span from data preparation to artificial intelligence and data visualization. The software tools used by different types of data scientists vary. For example, a veteran computer programmer will know multiple programming languages such as Python, R and C++, and some big data technologies such as Hadoop, Spark and NoSQL databases. A scientist, by contrast, might recognize only a few of the programming languages but know advanced techniques such as MapReduce and machine learning. Similarly, a person with a business background will know some of the programming languages along with relevant tools and methodologies such as SAS and Scrum.
So it is wise to expect your first hire to be a ‘master’ engineer, rather than a ‘jack of all trades’. Also, knowing the gaps in the expertise of your first hire will help you plan a hiring roadmap for your data science team. It is often a good idea to hire a data engineer first to build the data-pipelines, and then hire a business intelligence expert or a statistician to optimize the output.
Make sure they have an original portfolio
During interviews, many candidates will show you a fantastic portfolio of what they have already built. This may include things like a program that can automatically detect words or a chat-bot that can work as a virtual assistant. It is important to remember that a lot of these projects are available in open-source data science forums. When reviewing a portfolio, make sure that you clearly understand what the candidate has personally contributed to each project.
While hiring your first data scientist, your mantra should be to “Hire for talent, train for tech skills.” Often the best hire is an experienced software programmer who is interested in making a career move into data science. Since data science is an emerging field, it is a good idea to encourage them to continuously learn from forums, MOOCs or universities that offer relevant courses.
Make sure that they are not purely academic
Data science is an offshoot of statistics. As a result, a lot of academic researchers from fields like economics, physics, mathematics, computer science and engineering have jumped on the data science bandwagon to make a career switch to the industry. Though some of the best data scientists do come from universities and allied research laboratories, it is important to remember that these specialists should only be hired to optimize, not build, your data product.
Data science is a marriage of business, software engineering and statistics. Josh Wills describes a Data Scientist as, “A person who is better at statistics than any software engineer and better at software engineering than any statistician.” So your first hire should possibly be a person with multiple years of software engineering experience, as well as some management experience. This will ensure that they can prioritize tasks and deliver an impact for your business.