As more organizations seek to analyze, understand, and get value from the terabytes and petabytes of data they collect, there has been a groundswell of interest in data science—and high demand for data scientists.
In fact, the role of data scientist is quickly extending beyond the data-crunching specialists who have deep skills and 100% focus on data analysis. We are seeing the emergence of citizen data scientists, who represent a potentially valuable capability for lines of business—and for CIOs like myself.
What exactly is a citizen data scientist? The role is much like that of a citizen developer: a power user, usually not in IT, who builds programs that previously would have required a professional software developer.
A citizen data scientist brings many of the benefits of a dedicated data scientist, using powerful tools that make it possible to manipulate big data in a way that until recently necessitated a professional data scientist with a master’s degree or even a PhD.
Gartner defines a citizen data scientist as someone who “creates or generates models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics.” Rather, they perform “simple and moderately sophisticated analytical tasks.”
From a CIO’s point of view, citizen data scientists represent potential allies in the effort to use analytics to drive better business decisions and outcomes. But that relationship will be most effective when citizen data scientists and CIOs are working together toward those objectives.
Why should a CIO support Citizen Data Scientists?
- Shortage of data scientists: The World Economic Forum ranks data analysts and scientists first in growing job demand over the next four years, and there are just not enough new people entering that career to keep up with the demand. A different approach is required to meet this growing need.
- Cost of data scientists: Those individuals who do pursue data science as a career can command a high salary that could put them out of reach for a small or mid-size company. A CIO has the responsibility of keeping IT costs low while delivering value through emerging technology. The amount that would have been needed to hire a full-time data scientist can instead be applied to the training and tools that empower the citizen data scientist.
- Business knowledge of existing workers who are not professional data scientists: The value brought to a company through data science can be spread out to the various departments. Rather than hiring focused data scientists, who must then learn the business and find ways to analyze and mine the data, a citizen data scientist is already a subject matter expert in their area of the business. They know what problems they are trying to solve and are best positioned to arrive at conclusions that are meaningful in the business context.
How can a CIO support Citizen Data Scientists?
- Training: Although the new tools make the work much easier for a citizen data scientist, there is still enough complexity in the processes to make good training a must. It doesn’t require going back to college; there are plenty of good courses available online to help the prospective citizen data scientist get up to speed. Check out sources such as Simplilearn, Udemy, and Codecademy.
- Tools: Before a citizen data scientist can dive into the data, they will need tools. These can range from cloud-based SaaS offerings to desktop applications, and selecting them is best done under the guidance and supervision of the IT department. To avoid data silos and shadow (or rogue) IT, it is up to the CIO to help determine the toolset that will provide consistent results across the enterprise while also ensuring cybersecurity best practices and governance. Which tools are available, and how to select the best ones for citizen data scientists, is an in-depth topic that I plan to cover in a future article.
- Data: The tools and training won’t get the citizen data scientist very far unless they also have access to the data they need to analyze. This is one of the most important areas for a CIO to be involved in. Data governance ensures that only the data that should be made available is accessible, and that the data is used consistently across departments. A lack of such oversight can produce flawed results that drive the wrong business decisions. Involvement in data governance also ensures that the CIO is aware of the various data science projects across the organization and can apply the necessary resources to support them, which is another way to achieve big results while minimizing IT spend.