This is the fourth post of a blog series by Gianluigi Bagnoli, Cesare Calabria, Stuart Clarke, Dayanand Karalkar, Yatsea Li, Jacob Tan and me, aiming to show how, as a partner, you can build your custom application with SAP Business Technology Platform to help your customers become more data driven and sustainable.

Previously in this series we introduced the case of a manufacturing company and the high-level architecture of a partner application to cope with its business challenges (An overview of sustainability on top of SAP BTP). Then, we saw how SAP AI Core works (Introduction of end-to-end ML ops with SAP AI Core) and how it can be used to develop an image segmentation model to automate defect detection in the factory (BYOM with TensorFlow in SAP AI Core for Defect Detection).

In this blog post I will present another use case for SAP AI Core: sound-based predictive maintenance. This is a good example of how deep learning can help with sustainability. In the following, I will first explain why this is so. Then I will go through some basics about sound, how deep learning can be applied to it, and how to use Convolutional Neural Networks (CNNs) to classify sound clips. Lastly, I will tell you why SAP AI Core and SAP AI Launchpad are particularly suitable for this use case.

Of course, you should take this story for what it is, just an example. However, I hope it can bring some food for thought about two facts:

  • data can (and should!) be used to foster sustainability.
  • SAP BTP can support you in building robust, scalable applications leveraging the most advanced Artificial Intelligence techniques.

I believe these key messages are relevant and applicable to plenty of use cases in any line of business.

What is Predictive Maintenance (PdM) and how does it relate to Sustainability?

There are three possible approaches to industrial maintenance, summarised in the figure below [1]:


Maintenance types [1]

Of all these strategies, PdM is the way to go for those who wish to find the optimum balance between preventing failures and avoiding over-maintenance: arm your factory with relatively inexpensive sensors, monitoring temperature, vibrations, motion data, apply predictive techniques to schedule maintenance when a failure is about to occur, and you’ll end up with a nice cut in operational costs.

In fact, in my experience as an AI technical consultant, the value proposition naturally associated with PdM was all about money: increasing production, cutting unnecessary costs. Interestingly, PdM is good for our planet too.

Various references highlight the potential of PdM for the improvement of industry sustainability, primarily via a reduction of energy consumption: …by implementing proper maintenance and energy monitoring continuously, one can detect inefficiencies and malfunctioning equipment and then any necessary actions could be taken to restore the equipment energy performance to an acceptable level by a user. Even, simple maintenance actions can have a dramatic effect in maintaining energy performance of an equipment [2].

Additionally, the improvement in energy and production efficiency cascades into other aspects of sustainability: reduced production costs, reduced cost of environmental compliance, reduced waste-disposal costs, better product quality, higher utilisation factor, higher reliability and better worker safety. It might sound like an overstatement at first, but taking into account the social importance of industry and, in particular, of manufacturing in our societies, while considering its huge impact on energy consumption, on the use of physical resources and emissions to the environment, Sustainable Manufacturing (SMA) can be considered as one of the most important issues to address for pursuing Sustainable Development. [..] Maintenance has a significant role in reducing economic, environmental and social impact of industrial systems and activities [3].

Not to forget, in circular economy business models, not only production lines can be subject to PdM, but also end products themselves: for instance, a virtuous business can lend washing machines instead of selling them, and then monitor how they are doing at their foster homes to optimise energy consumption and maintenance [4].

Here comes the sound 

With Industry 4.0, PdM can rely on the Internet of Things (IoT), artificial intelligence (AI), streaming analytics, cloud and/or edge computing, and it is becoming more and more feasible and advanced. Typically, PdM can be realised through sensors taking real-time measurements of equipment-related quantities such as speed, displacement, position, vibration, flow, temperature, voltage, power etc. AI techniques can be applied to detect anomalies in the equipment's behaviour or to predict its expected lifetime before failure. SAP's PdM solution, SAP Predictive Asset Insights, is based on this approach.

In recent years, a funny kind of sensor has started to play a role: microphones. It’s not hard to imagine that in some cases, an expert technician can listen to the noise of a certain piece of equipment and tell whether it is doing fine or needs a little oiling. The good news is that the task of listening can be automated.

Computer audition is the art of teaching computers to make sense of the acoustic world, and it is now a hot research topic. As humans, we use sound for three major tasks: communicating (talking), entertaining ourselves (music), and simply understanding the world around us (I hear a truck horn, therefore I move). It figures that research is evolving around the same pillars: think about a virtual assistant booking a hair salon appointment for you or Spotify building playlists tailored to your liking. And as for the third pillar, you can think of all sorts of applications, ranging from monitoring bird migration to fighting urban noise pollution. Acoustic-based PdM is also part of this picture.

If you are interested in the state of the art of Machine Learning (ML) models applied to machine fault prediction, you should check out the DCASE Challenge, where academic and industrial research groups come together to foster exchange and progress. Here you can find baseline models and high-quality open datasets for research purposes.

However, do not think sound-based condition monitoring is only something for academic geeks: it is indeed achievable and profitable in real industry. If you need proof, check out this report on a commercial solution co-financed by the European Horizon2020 program and have a look at the figures.

Deep Learning on sound, in a nutshell

If you are curious about how exactly a machine can learn to understand sound, in this section I will try to summarise the basics.

Sound is a pressure wave propagating through air. If you record any sound clip with a microphone, all you’re doing is sampling the amplitude of this wave at fixed time intervals: for CD-quality audio, for instance, 44,100 times a second, that is, at a sampling rate of 44.1 kHz. The most common uncompressed audio file format is the WAVE format (.wav), which contains an array of sound wave amplitude values plus some metadata.
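To make the idea of sampling tangible, here is a minimal sketch that uses only the Python standard library. It synthesises a pure tone, stores it as an in-memory .wav file and reads it back. The 440 Hz tone, the half-second duration and the helper names are assumptions chosen purely for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # CD quality: 44,100 amplitude samples per second
DURATION_S = 0.5       # hypothetical clip length
FREQ_HZ = 440.0        # hypothetical test tone


def synth_tone(freq_hz, duration_s, sample_rate):
    """Sample a sine wave at fixed time intervals, as 16-bit integers."""
    n_samples = int(duration_s * sample_rate)
    return [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]


def write_wav(samples, sample_rate):
    """Pack the amplitude array plus metadata into an in-memory .wav file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def read_wav(data):
    """Recover the amplitude array and the sampling rate from .wav bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        n_frames = wav_file.getnframes()
        raw = wav_file.readframes(n_frames)
    return list(struct.unpack(f"<{n_frames}h", raw)), sample_rate


tone = synth_tone(FREQ_HZ, DURATION_S, SAMPLE_RATE)
wav_bytes = write_wav(tone, SAMPLE_RATE)
decoded, rate = read_wav(wav_bytes)
print(f"{len(decoded)} samples at {rate} Hz = {len(decoded) / rate} s of audio")
```

The file is nothing more than a header (channels, sample width, sampling rate) followed by the list of amplitude values, which is why the clip duration is simply the number of samples divided by the sampling rate.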

If you want to start analysing an audio file, the first thing to do is to have a look at its two possible representations: the time domain representation, i.e., simply the sound waveform, and the frequency domain representation, i.e., the sound spectrogram. The latter represents how different frequencies are distributed in the sound clip as a function of time. Obtaining the spectrogram is easy, as there are many open-source libraries that can do it for you, for instance, in Python, librosa. Here below you can see the waveform and spectrogram of a sample of industrial machine noise.


Sound representations and feature extraction
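For readers who want to see what a spectrogram computation involves, the sketch below builds one with NumPy only: the clip is cut into short overlapping frames, each frame is windowed, and a Fourier transform gives the frequency content of that time slice. In practice a library such as librosa does this for you (for example via librosa.stft); the frame size, hop length and the synthetic 1 kHz "machine hum" used here are assumptions for illustration.

```python
import numpy as np


def spectrogram(signal, sample_rate, n_fft=1024, hop_length=256):
    """Magnitude spectrogram via a short-time Fourier transform (no padding)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; transpose so rows are frequencies, columns are time.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    times = np.arange(n_frames) * hop_length / sample_rate
    return freqs, times, magnitudes


sr = 16_000                                  # 16 kHz sampling rate
t = np.arange(int(1.5 * sr)) / sr            # 1.5-second clip
clip = np.sin(2 * np.pi * 1000 * t)          # synthetic 1 kHz hum (made up)

freqs, times, mags = spectrogram(clip, sr)
peak = freqs[mags.mean(axis=1).argmax()]     # dominant frequency of the clip
print(mags.shape, peak)
```

Each column of the resulting matrix is the spectrum of one short time slice, which is exactly what the spectrogram image displays as colour intensity.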

From either sound representation you can extract numerical features that can be used for statistical analysis, machine learning, even deep learning. There is no standard feature or technique to be applied; it all depends on the specific task you are trying to solve and on what works best in capturing the peculiarities of your data. librosa offers a rich set of feature extraction methods, but you can also engineer your own custom features if required.

Something that is used frequently for ML applications, however, is the so-called LogMel spectrogram. This is like the spectrogram, but the frequency axis, instead of being linear, is warped to mirror the sensitivity of the human ear, which discriminates low frequencies better than high ones, and the amplitudes are expressed on a logarithmic scale. Additionally, Mel Frequency Cepstral Coefficients, or MFCCs, a lower-dimensional representation derived from the LogMel spectrogram, are also widely used, especially to describe the human vocal tract. Here you can see what the spectrogram, LogMel spectrogram and MFCCs look like for our machine noise sample:


Spectrogram, LogMel Spectrogram and MFCC
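The relationship between these three representations can be sketched in a few lines of NumPy. The code below is an illustration under assumptions, not the implementation used by librosa or by the notebook: it uses one common formula for the mel scale, triangular filters, 40 mel bands, 13 coefficients and an unnormalised DCT. In a real project you would typically call librosa.feature.melspectrogram and librosa.feature.mfcc instead.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for i in range(n_mels):
        low, centre, high = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - low) / (centre - low)
        falling = (high - fft_freqs) / (high - centre)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank, edges


def log_mel_and_mfcc(signal, sample_rate, n_fft=1024, hop=256,
                     n_mels=40, n_mfcc=13):
    # Power spectrogram from a short-time Fourier transform.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)).T ** 2
    # Warp the frequency axis to mel bands, then take the log (in dB).
    bank, _ = mel_filterbank(sample_rate, n_fft, n_mels)
    log_mel = 10.0 * np.log10(bank @ power + 1e-10)
    # MFCCs: DCT-II along the mel axis, keeping the first coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel, dct_basis @ log_mel


sr = 16_000
t = np.arange(int(1.5 * sr)) / sr
clip = np.sin(2 * np.pi * 1000 * t)          # synthetic stand-in for a recording
log_mel, mfcc = log_mel_and_mfcc(clip, sr)
_, edges = mel_filterbank(sr, 1024, 40)
print(log_mel.shape, mfcc.shape)
print(np.round(np.diff(edges)[[0, -1]], 1))  # width of lowest vs highest band, Hz
```

The band widths printed at the end show the warping at work: mel bands are narrow at low frequencies and wide at high frequencies, so low frequencies are resolved more finely.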

Now, these kinds of features are nothing but 2D matrices of numbers, and in a way, even if they do not represent objects belonging to the visual world, they can be seen as images. Therefore, algorithms developed to make computers understand images can be useful for sound as well. As you learned in BYOM with TensorFlow in SAP AI Core for Defect Detection, Convolutional Neural Networks (CNNs) are the most successful class of such algorithms. Well, the same CNN architecture that can be trained to tell dogs from cats can also be trained to tell cat meows from dog barks. Or, if you wish, healthy equipment noise from suffering equipment noise.

Let’s get back to the case of our light guide plate manufacturer Bagnoli&Co. Let’s imagine each machine in the factory is being monitored with a microphone and sound samples have been collected for several weeks.


Bagnoli&Co production line

Technicians who know the machines by heart can help to annotate this sound dataset, in other words, to label sound clips as normal or anomalous, maybe even distinguishing between different categories of “anomalous”. We can then extract acoustic features from each sound clip and build a classifier for each machine in the factory, all based on CNNs.

You can find an end-to-end example of such a classifier, built with librosa and TensorFlow, in this notebook. The main steps are displayed in the figure below. Both the LogMel spectrogram and MFCCs are used as input features. The network architecture consists of 2 convolutional blocks, and it is loosely inspired by the algorithm presented in [5]. The dataset is totally made up just for demo purposes and consists of a few hundred 1.5-second sound clips sampled at 16 kHz.


Steps to train a sound classification model
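To give a feeling for what a network with two convolutional blocks looks like in TensorFlow, here is a minimal sketch. It is not the architecture of the notebook nor the one in [5]: the input shape, filter counts, pooling, dropout rate and the two-class output are assumptions, and random numbers stand in for real LogMel features and labels.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes: 40 mel bands x 90 time frames, normal vs anomalous.
N_MELS, N_FRAMES, N_CLASSES = 40, 90, 2


def build_sound_cnn(input_shape=(N_MELS, N_FRAMES, 1), n_classes=N_CLASSES):
    """A small CNN that treats a LogMel spectrogram as a one-channel image."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Convolutional block 1
        tf.keras.layers.Conv2D(16, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        # Convolutional block 2
        tf.keras.layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        # Classification head
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])


model = build_sound_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-ins for a batch of labelled feature maps (illustration only).
features = np.random.rand(8, N_MELS, N_FRAMES, 1).astype("float32")
labels = np.random.randint(0, N_CLASSES, size=8)

model.fit(features, labels, epochs=1, batch_size=4, verbose=0)
probabilities = model.predict(features, verbose=0)
print(probabilities.shape)
```

Each row of the prediction is a probability per class, which is the kind of "confidence level" a monitoring application would then act upon.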

If you fancy playing with CNNs applied to sound, clone the repository and help yourself. Just keep in mind this is one of the many examples you can find if you happen to do your own research on the topic. There are several architectures and approaches that can be applied to solve this task, and you might also want to explore unsupervised anomaly detection rather than supervised classification.

What I like about SAP AI Core

The use case presented in this post is at the bleeding edge of deep learning applications. Open-source access to datasets, common deep learning frameworks and standardised model formats is what enables research to evolve fast, and industry to follow quickly. Deep learning models are particularly resource-intensive and time-consuming objects: they need huge datasets to be trained on, and the training process itself is a computing-intensive, time-consuming, tricky process. Any ML developer working on a similar solution should make full use of the high-quality content available in the open-source community to achieve good results in reasonable time.

When I start working on a deep learning project, I like to look for related papers on the web and I often start by replicating the same network architecture and practising on the same dataset. Then, I progressively adapt the network to capture the specific features of my problem. In most cases, I use pretrained networks to avoid the burden of starting the training process from scratch.

SAP AI Core enables AI developers to proceed exactly this way. I will not go through all the details of the solution, since this was covered in Introduction of end-to-end ML ops with SAP AI Core. I’d just like to stress that it leaves developers complete freedom of choice in the deep learning library and models to be used.

Once the code is ready and packed in a Docker image, the user just needs to write declarative files in YAML format to specify which containers they want to run, and on what kind of infrastructure: whether GPUs are required or not, how many replicas are needed for a deployment to ensure resilience, etc. Based on these declarative files, SAP AI Core will configure the Kubernetes cluster and orchestrate your containers as required. Scalability is guaranteed, which ensures a quick and smooth transition from the prototyping phase to production.

From my perspective, the great benefit of SAP AI Core is enabling ML experts to build their model with maximum freedom, and at the same time, making it operational at production-level standards, in a fast, resilient, and scalable way. Additionally, it supports multitenant business models, which is advantageous for partners who may want to sell their AI solution to several customers [6]: you can maintain one algorithm, and train it effortlessly on N different datasets belonging to different customers. Data segregation is ensured through the concept of resource groups.

How can you get started with an example provided with source code

In the previous blog post we have seen step-by-step how to train and deploy a computer vision model in SAP AI Core. You can easily replicate the procedure for this sound classification model following the steps and code available here. A YouTube playlist by SAP HANA Academy is also available for you to learn and review the steps in detail.

Steps to train and deploy the model in SAP AI Launchpad

Besides using the ai-core-sdk python library, as documented in the notebooks linked in the previous paragraph, it is also possible to use SAP AI Launchpad to go through your ML lifecycle.

SAP AI Launchpad is a multitenant software-as-a-service application available in SAP BTP. It is designed mostly for AI administrators to manage all the Enterprise AI resources as simply as clicking buttons on an intuitive UX. Thanks to the SAP unified AI API, AI models can be trained and deployed in the same way, no matter what kind of models they are: SAP AI Core models, SAP Data Intelligence Pipelines, SAP AI Business Services, or even external AI services by other Cloud providers.

Let’s assume you have already created a resource group in SAP AI Core for our sound based PdM scenario. You can access SAP AI Core from SAP AI Launchpad and select that resource group, which will give you access to the related resources.

SAP AI Launchpad: resource groups

Under Scenarios, you will be able to inspect all the training workflows and serving executables related to the sound applications. These executables are constantly synced with the corresponding templates in GitHub.

SAP AI Launchpad: workflows and serving executables

You can then prepare a training configuration specifying which executable and training data you want to use, and in a few clicks, you are ready to submit an execution and monitor it until completion.

SAP AI Launchpad: model training

Similarly, to deploy the trained model, you just need to create a serving configuration, specifying this time the serving executable and the version of the trained model you desire, and again, in a few clicks, you can obtain an API endpoint for your AI model. That’s the starting point for integrating real-time AI prediction into your custom application, as you will see in the next blog post of this series.

SAP AI Launchpad: model deployment

For a detailed end-to-end guide to using SAP AI Launchpad for model training and deployment, you can check out here. 

Final remarks on PdM

In this post, we have seen the potential of PdM in the context of Sustainable Manufacturing, and we have seen the analytical background required to get started with sound-based PdM. There is an important concept that hasn’t been covered, and that is nonetheless worth mentioning.

Implementing PdM always requires some optimisation: whatever AI model you build, its response will never be as clear as “you need to schedule maintenance on machine XYZ”. The AI will tell you something like “I’ve detected a sound anomaly with a confidence level of 80%”, and based on that, you will need to decide whether or not it’s worth recommending that your customer stop the production line for maintenance. In other words, you need to find the threshold in the algorithm response above which it’s more convenient to act than to keep production flowing.
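One simple way to look for such a threshold is to compare, on historical data, the cost of acting too often with the cost of acting too late. The sketch below is a toy illustration in plain Python: the anomaly scores, the outcomes, the cost figures and the function names are all made up, and a real project would use a proper validation set and costs agreed with the customer.

```python
def total_cost(threshold, scores, is_faulty, cost_false_alarm, cost_missed_failure):
    """Cost of raising an alarm whenever the anomaly score reaches the threshold."""
    cost = 0.0
    for score, faulty in zip(scores, is_faulty):
        alarm = score >= threshold
        if alarm and not faulty:
            cost += cost_false_alarm       # production stopped for nothing
        elif not alarm and faulty:
            cost += cost_missed_failure    # failure not caught in time
    return cost


def best_threshold(scores, is_faulty, cost_false_alarm, cost_missed_failure):
    """Try every observed score (plus 'never alarm') and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(t, scores, is_faulty,
                                 cost_false_alarm, cost_missed_failure),
    )


# Hypothetical model confidences and what actually happened to the machine.
scores = [0.10, 0.35, 0.55, 0.60, 0.80, 0.95]
is_faulty = [False, False, True, False, True, True]

# A missed failure costs five times more than an unnecessary stop.
print(best_threshold(scores, is_faulty, cost_false_alarm=1, cost_missed_failure=5))
# An unnecessary stop costs ten times more than a missed failure.
print(best_threshold(scores, is_faulty, cost_false_alarm=10, cost_missed_failure=1))
```

With these toy numbers the best threshold moves from 0.55 to 0.8 when false alarms become the expensive mistake. The costs do not have to be monetary: they could just as well be expressed in energy consumed or CO2 emitted.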

The optimal working point for PdM is often sought based on strategic KPIs: maximising production efficiency, minimising downtime, etc. Energy-related KPIs are considered too rarely. So, our piece of advice is: consider adding energy efficiency as a goal of your AI solution. If you optimise energy consumption or CO2 emissions, it will be a win-win for your customer, as sustainability and profit nowadays go hand in hand. And surely, this doesn’t apply only to PdM.

What’s next

In the next blog post of this series, we will see how to build a custom application in BTP integrating an SAP AI Core model. As a reminder, here are all the episodes of our blog series:


[1] A systematic literature review of machine learning methods applied to predictive maintenance

[2] Maintenance for Energy Efficiency: a Review

[3] Maintenance for Sustainability in the Industry 4.0 context


[5] Acoustic-Based Emergency Vehicle Detection Using Convolutional Neural Networks

[6] SAP AI Core: Multitenancy


Sara Sampaio

Author Since: March 10, 2022
