DataBruin

DataBruin is an open-access, web-based graphical programming environment developed at UCLA for preprocessing multi-sensor monitoring data and developing deep learning solutions in general, with a particular focus on reliability and maintenance applications. By letting users implement data-flow diagrams in an intuitive drag-and-drop manner, DataBruin offers a fast, code-free prototyping platform that helps practitioners focus on concepts rather than syntax, and it guides them toward error-free prototyping of deep learning-based Prognostics and Health Management (PHM) models.  DataBruin also standardizes the structure of deep learning PHM projects by offering a comprehensive path from dataset preparation through prediction and assessment, along with standard auto-generated Python code for each analysis step.

DataBruin’s web application is divided into four independent modules:

Preprocessing Studio
This module adds a graphical programming layer on top of the Pandas library and includes several predefined blocks for data input/output, indexing, type conversion, filtering, missing-value handling, resampling, time windowing, and dataset reorganization.  It standardizes and simplifies the cleaning and preprocessing of tabular data and is optimized for multi-sensor monitoring data sources.
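
The kind of pipeline these blocks assemble can be sketched directly in Pandas. The sketch below uses hypothetical sensor names and values (not part of DataBruin) to illustrate time-based indexing, missing-value handling, and resampling:

```python
import numpy as np
import pandas as pd

# Hypothetical multi-sensor readings with a timestamp column and gaps.
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=6, freq="10s"),
    "vibration": [0.12, np.nan, 0.15, 0.14, np.nan, 0.18],
    "temperature": [60.1, 60.3, np.nan, 60.8, 61.0, 61.2],
})

df = df.set_index("timestamp")          # time-based indexing
df = df.interpolate(method="time")      # fill missing sensor readings
resampled = df.resample("30s").mean()   # downsample to a coarser rate
print(resampled.shape)
```

Each predefined block in the studio corresponds to an operation like these, with the equivalent Python code generated automatically.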

Feature Analysis Studio
This module, developed on top of Scikit-Learn, provides a toolbox for Principal Component Analysis (PCA) and clustering. In real-world multi-dimensional machinery datasets, several features often show a high level of correlation: multiple sensors may be located very close to each other within the industrial process or may record similar parameters, producing a large amount of redundant information in the dataset.  Transforming the data from the original physical features into mathematical principal components removes this redundancy, which can improve the performance of downstream models and significantly increase the computational efficiency of the modeling pipeline.
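
This redundancy-removal idea can be illustrated with Scikit-Learn's PCA on synthetic data (the two "nearby sensors" below are a made-up example, not DataBruin output):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Two nearby sensors recording nearly the same signal, plus one
# independent sensor: three physical features, only two informative.
base = rng.normal(size=(200, 1))
X = np.hstack([
    base,
    base + 0.01 * rng.normal(size=(200, 1)),   # near-duplicate sensor
    rng.normal(size=(200, 1)),                 # independent sensor
])

pca = PCA(n_components=2)
Z = pca.fit_transform(X)   # decorrelated principal components
print(Z.shape)
print(pca.explained_variance_ratio_.sum())
```

Two principal components retain essentially all of the variance of the three correlated features, so later models can work in a smaller, decorrelated space.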

Deep Learning Studio
This module is an application developed on the TensorFlow and Keras frameworks.  It divides any deep learning-based PHM project into five standard steps:

  • Dataset preparation,

  • Neural network architecture implementation,

  • Model compiling,

  • Tasks timeline arrangement,

  • Tasks runtime & assessments.

The module includes options for splitting the dataset into train and test segments, feature scaling, and label encoding.  Any more sophisticated cleaning and preparation is assumed to have been done in the Preprocessing Studio and Feature Analysis Studio.  A comprehensive set of core neural network layers, along with an extensive collection of convolutional, pooling, and recurrent layers, makes implementing complex neural network architectures as easy as arranging the corresponding blocks in the graphical programming environment.  In addition to almost all layer types available in TensorFlow 2.0, a few custom layers (i.e. blocks) are included to facilitate the implementation of specific architectures such as variational autoencoders and GANs.
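
The Keras code that such a block arrangement generates might look like the following sketch. The window length, sensor count, and layer sizes are illustrative assumptions, not values prescribed by DataBruin:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical architecture: a 1-D CNN plus recurrent layer over
# fixed-length sensor windows (64 time steps x 4 sensors).
inputs = keras.Input(shape=(64, 4))
x = layers.Conv1D(16, kernel_size=5, activation="relu")(inputs)
x = layers.MaxPooling1D(pool_size=2)(x)
x = layers.LSTM(32)(x)
outputs = layers.Dense(1)(x)   # e.g. a remaining-useful-life estimate

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
print(model.output_shape)
```

Each layer here corresponds to one block in the studio; rearranging blocks rearranges the equivalent layer calls.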

The models, their metrics, loss functions, and optimizers are fully customizable.  Complex models can be created as stacks of other models, and, depending on specific needs, each sub-model can be frozen during the training process.  Several tasks are included for fitting models to datasets, evaluating trained models on test data, making new predictions, assessing the quality of the predictions, performing autoencoder-based anomaly detection, and training GAN models.
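
In Keras terms, stacking models and freezing a sub-model can be sketched as follows (the encoder/head split and layer sizes are hypothetical, chosen only for illustration):

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small encoder sub-model, e.g. pretrained on healthy-machine data.
enc_in = keras.Input(shape=(16,))
encoder = keras.Model(enc_in, layers.Dense(8, activation="relu")(enc_in),
                      name="encoder")
encoder.trainable = False   # freeze the sub-model for this training phase

# Stack the frozen encoder under a trainable prediction head.
full_in = keras.Input(shape=(16,))
features = encoder(full_in)
outputs = layers.Dense(1)(features)
model = keras.Model(full_in, outputs)
model.compile(optimizer="adam", loss="mse")

# Only the head's kernel and bias remain trainable.
print(len(model.trainable_weights))
```

Setting `trainable = False` on a sub-model before compiling excludes its weights from gradient updates, which is the standard Keras mechanism for freezing.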