Data Science

When used effectively, data science can provide insight and evidence to drive business objectives that would be hard to obtain any other way. It is increasingly common to see organisations turning the output of their data science team into products, reports and actionable insights that enhance their customer or stakeholder engagement.

Why data science?

In a number of areas, data science delivers results that were previously possible only with extensive manual processing, if they were possible at all. These are all essentially forms of pattern matching, including:

1. Classification

Classification is as simple as it sounds – working through a list of entities and assigning each one to one of a set of classes. The power of classification lies in the huge range of problems it can be applied to. For example, classifying your customers by their standard industry code just from looking at their website.
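As a minimal sketch of the idea, the Python below assigns a piece of website text to an industry class by counting keyword matches. The class names, the keyword lists and the function name classify_website are all hypothetical; a real classifier would normally be trained on labelled examples rather than hand-written rules.

```python
import re
from collections import Counter

# Hypothetical keyword lists per industry class. A production system would
# learn these associations from labelled data instead of hard-coding them.
INDUSTRY_KEYWORDS = {
    "retail": {"shop", "store", "basket", "delivery"},
    "finance": {"bank", "loan", "mortgage", "investment"},
    "software": {"api", "cloud", "developer", "platform"},
}


def classify_website(text):
    """Return the class whose keywords occur most often, or None if none match."""
    # Count every lowercase word in the text.
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    # Score each class by the total occurrences of its keywords.
    scores = {
        industry: sum(words[word] for word in keywords)
        for industry, keywords in INDUSTRY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(classify_website("Our cloud platform offers a developer API"))
```

Returning None when nothing matches keeps the sketch from forcing every entity into a class, which is usually the safer choice when the result feeds a business decision.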

2. Predictive analytics

Predictive analytics uses a broad set of inputs to calculate the probability of certain events happening and can also be applied to a wide range of problems. For example, predicting which of a range of markets to advertise in will deliver the best result.
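To illustrate, the sketch below combines several inputs into a single probability using a logistic function and then picks the market with the highest score. The feature names, weights and bias are invented for illustration only; in practice they would be fitted to historical campaign data.

```python
import math

# Hypothetical weights and bias, chosen by hand for this example.
WEIGHTS = {
    "past_conversion_rate": 40.0,
    "audience_overlap": 2.0,
    "cost_per_click": -1.5,
}
BIAS = -3.0


def success_probability(features):
    """Map a dict of input features to a probability between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def best_market(markets):
    """Return the name of the market with the highest predicted probability."""
    return max(markets, key=lambda name: success_probability(markets[name]))


markets = {
    "north": {"past_conversion_rate": 0.03, "audience_overlap": 0.5, "cost_per_click": 0.8},
    "south": {"past_conversion_rate": 0.05, "audience_overlap": 0.7, "cost_per_click": 0.6},
}
print(best_market(markets))
```

The output is a probability rather than a yes or no answer, so the business can decide how much risk it is willing to accept before acting on the prediction.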

3. Sentiment analysis

By using natural language processing it is possible to determine the sentiment of a message. For example, a customer service operative could be alerted that the customer they are about to talk to on a live chat is likely to be frustrated, based on their initial message.
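The live chat example can be sketched with a simple word-list approach. This is a deliberately crude stand-in for real natural language processing: the word lists, the threshold and the function names are assumptions made for illustration, and a production system would more likely use a trained language model.

```python
import re

# Hypothetical, deliberately tiny word lists.
NEGATIVE = {"angry", "broken", "frustrated", "terrible", "useless", "waiting"}
POSITIVE = {"great", "happy", "helpful", "love", "thanks"}


def sentiment_score(message):
    """Return positive-word count minus negative-word count for a message."""
    words = re.findall(r"[a-z']+", message.lower())
    return sum((word in POSITIVE) - (word in NEGATIVE) for word in words)


def likely_frustrated(message, threshold=-2):
    """Flag a message whose score is at or below the (assumed) threshold."""
    return sentiment_score(message) <= threshold


first_message = "I have been waiting for an hour and the app is still broken"
print(likely_frustrated(first_message))
```

A flag like this would only be a hint to the operative, not a verdict, since word counting cannot detect sarcasm or context.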

4. Fraud detection

Drawing on techniques similar to classification and predictive analytics, data science is increasingly used as an effective mechanism for the early detection of fraud, often from what appears to be only a small amount of data.
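One simple way to spot suspicious activity in a small data set is outlier detection. The sketch below flags transaction amounts that sit far from the median, using the modified z-score with its conventional 0.6745 constant and 3.5 cut-off. It is an illustrative technique chosen for this example, not a complete fraud model, and the function name and sample amounts are hypothetical.

```python
from statistics import median


def flag_unusual_amounts(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold."""
    mid = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(amount - mid) for amount in amounts)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [
        amount
        for amount in amounts
        if 0.6745 * abs(amount - mid) / mad > threshold
    ]


recent_payments = [42.0, 38.5, 45.0, 40.0, 41.5, 39.0, 950.0]
print(flag_unusual_amounts(recent_payments))
```

Using the median rather than the mean matters here: with only a handful of transactions, a single large payment would distort an average enough to hide itself.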

Building the team

Data science is well on its way to becoming a mainstream occupation, but the field still has a long way to develop. For that reason, ongoing professional development and training are critical for the team. A data scientist who does not keep learning will soon find their knowledge out of date.

Strategy for the team

It is vital to remember that you are building a science team, which is very different from a project team and so needs a different kind of strategy. The key difference is that results cannot be guaranteed as the team will be breaking new ground and not doing things that have been tried and tested before. This is the nature of research, which cannot be planned in the same way as a project. It is not uncommon for a data science team to experience repeated failure, or to have to abandon a particular line of research and start somewhere else.

Another area of difference from a standard business process is that it is generally not possible to validate ideas before implementation. Data science is by its very nature a lengthy and complex process of validation.

It should also be noted that the data science team might not be the best people to see the value in their results as that often requires a significant depth of business knowledge.

A strategy for a data science team should reflect this uncertainty:

  • Don’t set precise goals and targets for the team or expect validation early on, but give them the freedom to explore widely.
  • Accept that regular failure will be part of the process.
  • Ensure extensive engagement with people outside of the team, at all stages of the research, to help identify the value and possibility of the work being undertaken.

Tools and techniques

The tools and techniques of data science are evolving rapidly, and this applies to both commercial and open source tools. As in many industries, the first tools are often open source, having been developed in academia, and commercial tools follow as the market becomes clearer. Data science is still in its early stages, so open source tools dominate, while commercial tools can be expensive and niche. When new algorithms or techniques appear, it is often in the form of a new open source library or tool that is not yet polished.

A successful data science team gets to select their own tools. While most of the open source tools will be of high quality, some may not have the support or quality an enterprise expects, but that should not become an obstacle.

Finally, be prepared for some unusual requests for equipment or services. Big data requires big iron and it is not unusual for a data science team to request a thousand virtual servers, or a single machine with ten high-performance GPUs.

How we can help

We can help you in multiple ways:

  • Advise you on the right data science strategy for your organisation.
  • Help you build the team and the technology that supports them.
  • Accelerate the delivery of insights and value by designing a research programme that will deliver highly innovative results.
  • Review an existing data science function to ensure it is delivering best value.
  • Analyse your algorithms and the results they deliver to ensure their integrity and effectiveness.