What do telco operators need to succeed with machine learning?

Machine learning is a prominent topic in the field of telecommunications right now. With the promise of automated operational efficiencies, personalised customer offerings and fraud mitigation, it’s no surprise that operators are investing in this advancing technology. In this blog, we discuss what they actually need to prove successful with machine learning.

Easy access to lots of data

One of the things we know is essential to machine learning is data, and lots of it! For telecoms operators, this usually isn’t a problem; there are more mobile devices than people across the globe, and behind every mobile device there are masses of data. Network operators have access to call detail records, service usage statistics, network equipment utilisation, network performance metrics, customer experience metrics and so on. The challenge actually arises when it comes to accessing and interpreting this data.

For many network operators, big data is stored in multiple places and accessed using disparate tool sets. To build a successful machine learning model, training data that consists of a set of parameters and a known outcome is required. Building a usable training data set from multiple storage locations, when the outcome often isn’t automatically associated with the defined parameters, is a time-consuming and semi-manual task. As well as this, operators must be confident in the data quality, and will need to deploy automated data quality fixing processes to ensure accurate results from their machine learning models. In fact, accessing high volumes of quality data can prove to be one of the most time-consuming parts of developing a new machine learning use case.
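As a rough illustration of what assembling such a training set involves, the Python sketch below joins parameters held in one store to known outcomes held in another, and drops rows that fail a simple quality rule. The field names, the churn outcome and the sample values are all hypothetical, chosen only to make the idea concrete; a real pipeline would read from the operator's own systems and would likely repair bad values rather than just discard them.

```python
# Hypothetical usage statistics from one data store, keyed by subscriber ID.
usage_stats = [
    {"subscriber_id": "A1", "monthly_minutes": 320, "dropped_calls": 2},
    {"subscriber_id": "B2", "monthly_minutes": None, "dropped_calls": 7},  # missing value
    {"subscriber_id": "C3", "monthly_minutes": 45, "dropped_calls": 0},
    {"subscriber_id": "D4", "monthly_minutes": 510, "dropped_calls": 1},  # no known outcome
]

# Hypothetical known outcomes (did the subscriber churn?) from a second store.
churn_outcomes = {"A1": False, "B2": True, "C3": True}


def build_training_set(features, outcomes):
    """Join parameter rows to known outcomes, skipping unusable rows."""
    rows = []
    for record in features:
        label = outcomes.get(record["subscriber_id"])
        if label is None:
            continue  # no known outcome, so the row cannot be used for training
        values = [record["monthly_minutes"], record["dropped_calls"]]
        if any(v is None for v in values):
            continue  # simple quality rule: skip rows with missing parameters
        rows.append((values, label))
    return rows


training_set = build_training_set(usage_stats, churn_outcomes)
print(training_set)
```

Only two of the four sample subscribers end up in the training set here, which hints at why joining and cleaning data tends to take so much of the effort.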

The right use cases

In a competitive commercial environment such as telecommunications, finding the right use cases for machine learning is essential. It’s a hot topic in the industry right now, but that doesn’t necessarily mean that it is the right approach for everything. Andreas Vegas, Global Big Data Director at Telefonica, explained at this year’s Telco Data Analytics that Telefonica have achieved a 20% CAPEX saving via a new network deployment optimisation tool, which has no machine learning capabilities whatsoever. So, the effort should go into use cases where machine learning has the potential to deliver real impact. Have a read of our blog, 5 machine learning applications in telecoms, for a flavour of the types of use cases beginning to emerge.

Programmers AND telco experts

Machine learning roles are usually associated with software engineers and data scientists, who have the knowledge and experience to define and build the complex algorithms required. The types of skills needed include computer science fundamentals, programming, an understanding of probability and statistics, software engineering and data modelling expertise. These skills can sit with dedicated in-house resource, external partnerships with universities or research centres, and with machine learning vendors. The telecommunications industry is no different from any other in this respect; however, another resource is also required: telco data experts.

Telecoms data is notoriously complex, with incoherent naming conventions, entangled relationships and trends that are difficult to decipher. To build a successful machine learning model, an understanding of telecoms data is essential, not only to understand what it means but also to understand the relationships between different types of data. Programmers and telco data experts must work together to identify machine learning use cases and define the relevant data sets.

Computing power

The question of hardware requirements for machine learning is one that comes up often. Machine learning models can quite easily be built using an off-the-shelf computer; in fact, Microsoft have even managed to carry out machine learning on a Raspberry Pi. But the challenges arise when it comes to actually training the model. Generally, more training data means a more accurate model, but it also means more processing power and longer processing times. And when you are running hundreds of data sets with small tweaks for each epoch, the processing time can soon add up.
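To give a feel for how quickly this adds up, here is a back-of-the-envelope calculation in Python. The figures (200 training runs, 50 epochs each, 3 minutes per epoch) are purely hypothetical and not taken from any real deployment; the sketch also assumes every run is trained one after another on the same hardware.

```python
def total_training_hours(runs: int, epochs_per_run: int, minutes_per_epoch: float) -> float:
    """Total wall-clock hours if every run is trained one after another."""
    return runs * epochs_per_run * minutes_per_epoch / 60


# Hypothetical figures: 200 runs, 50 epochs each, 3 minutes per epoch.
hours = total_training_hours(runs=200, epochs_per_run=50, minutes_per_epoch=3)
print(f"{hours:.0f} hours, or about {hours / 24:.0f} days")
```

With these assumed numbers the total comes to 500 hours, or roughly three weeks of continuous processing, even though each individual epoch takes only a few minutes.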

Most telecoms operators already have access to an abundance of hardware, since they usually run their own data centres. For many machine learning use cases this can easily be sufficient, especially in the early days. If they also run a virtual environment, operators can quite easily spin up and spin down multiple workload processes as and when required to further resource their machine learning models. However, as machine learning use cases expand, or real-time use cases emerge, GPU-accelerated hardware should be considered. A GPU offers many more cores than a traditional CPU, allowing it to process parallel workloads much more efficiently. GPUs are also better suited to multiple repetitive tasks with just small tweaks, as is the case when training a machine learning model. Of course, this comes at a cost; NVIDIA have designed the DGX-1 specifically for machine learning with 8 GPUs, coming in at around £100,000.