4.1. Laboratory
For the experimental forming tests of the demonstration component, the designed smart forming tool and a hydraulic press (Rapp & Seidt) are used. The press provides a maximum ram force of 1,200 kN (see Fig. 4 pos. 1, 2 & 3). In addition, a corresponding control system is provided to adjust the temperature of the individual zones in order to examine the warm forming process or other heat-assisted forming process routes. For the forming tests, the anti-friction agent Omega 35 is used due to its temperature resistance and good friction properties. The sheet metal forming is carried out as a single-stage process with a forming speed of 10 mm/s. In addition, a SCARA robot and a two-part conveyor belt are used for partial automation (see Fig. 4 pos. 4 & 5). The optical measuring system (GOM ARAMIS®) and the 3D laser imaging device (Gocator®) are used for inline data evaluation. The generated data are analyzed in real time and uploaded to ThingWorx (see Fig. 4 pos. 6 & 7).
To produce both good and defective parts, on the one hand the insertion position of the sheet is varied to obtain slanted parts or incompletely formed parts with wrinkles. On the other hand, the temperature of the forming tool is changed to produce greater distortion or even component cracks. Fig. 8 provides an overview of defects that may occur during the forming process of the sample part. In particular, the wrinkling (e) and the cracks (f) are shown as heatmaps generated by the AI; the heatmap indicates the position of the defect and the confidence of the detection.
4.2. Implementation of the DNN for quality assessment
The DNN for quality assessment is implemented using the Matlab® API. Predefined modules are available in the Deep Learning Toolbox with which the network architecture of a convolutional neural network can be implemented. The first step is to create a labeled image data set. For this purpose, the images are sorted into "defective" and "good" folders according to their quality information. The images are generated as described in Sections 3 and 4. The initial scan data set contains 267 images, the simulation data set 256 images and the hybrid data set 322 images. To train the neural network, the generated images are uploaded to an image datastore in Matlab. The images are then split into a training data set (90%) and a validation data set (10%). The training data set is augmented using random rotation and random translation. Finally, the images are resized to 227×227×3 to match the input layer of the DNN.
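The data preparation is implemented in Matlab; as a language-neutral illustration, the following Python sketch reproduces the 90/10 split described above. The file names are hypothetical placeholders, not the authors' data:

```python
import random

def split_dataset(paths, train_frac=0.9, seed=0):
    """Shuffle the labeled image paths and split them into a 90% training
    set and a 10% validation set, as described in the text."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * train_frac)
    return paths[:n_train], paths[n_train:]

# hypothetical file names standing in for the 267 scan images
scan_images = [f"scan_{i:03d}.png" for i in range(267)]
train_set, val_set = split_dataset(scan_images)
```

For the 267 scan images this yields 240 training and 27 validation images; the augmentation (random rotation and translation) is then applied to the training set, and all images are resized to 227×227×3.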
The architecture of a convolutional neural network (CNN) consists of several layers, with the convolutional layers being the most important component. These convolutional layers consist of filters whose parameters are trained weights. The size and number of the filters in the network architecture are determined via hyperparameters. Each filter is slid across the input matrices at intervals determined by the stride hyperparameter; at each position, the inputs are multiplied with the filter weights, summed, and the bias is added. The resulting output matrices are passed into an activation function. Like most recent networks, Squeezenet uses the ReLU activation function (compare Fig. 5). The ReLU function covers the range [0, ∞). In contrast, the sigmoid function covers the range [0, 1] and can therefore only be used to model probabilities, whereas ReLU can output any non-negative real number. When training CNNs, the main advantages of the ReLU function are that vanishing gradients are avoided and that training is more efficient [12].
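The filter arithmetic described above (weighted sum plus bias at stride intervals, followed by ReLU) can be sketched for a single 2-D channel. This is an illustrative NumPy implementation, not the toolbox code:

```python
import numpy as np

def conv2d_relu(image, kernel, bias=0.0, stride=1):
    """Valid 2-D convolution (strictly: cross-correlation, as in CNNs)
    followed by a ReLU activation."""
    kh, kw = kernel.shape
    h = (image.shape[0] - kh) // stride + 1
    w = (image.shape[1] - kw) // stride + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel) + bias  # weighted sum + bias
    return np.maximum(out, 0.0)                        # ReLU clips negatives
```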
Another important component of CNNs are pooling layers, as they summarize the outputs of neighboring groups of neurons in the same kernel map. The convolutional layers usually produce more output than input parameters; pooling reduces the resolution and thus the number of subsequent parameters and increases the robustness to noise and distortions [13]. In general, the groups of neurons clustered by adjacent pooling units do not overlap. More precisely, a pooling layer can be thought of as consisting of a grid of pooling units spaced s (stride) pixels apart, each summarizing a cluster of size z × z centred on the position of the pooling unit. Setting s = z results in traditional local pooling as commonly used in CNNs. With max-pooling, each cluster is summarized as its maximum unit value; with average-pooling, as the average value of all its units [14].

To reduce overfitting, dropout layers are used. Overfitting means that the DNN achieves better results on the training data set but worse results on the validation and test data sets. The main idea of a dropout layer is to randomly remove units (along with their connections) from the neural network during training to prevent units from co-adapting. During training, samples are thereby drawn from an exponential number of different "thinned" networks. At test time, the effect of averaging the predictions of all these networks can be approximated by simply using a single non-thinned network with smaller weights. This significantly reduces overfitting and yields substantial improvements over other regularization methods [15]. When applying neural networks for classification, the Softmax activation function is used for the output layer so that the output values can be interpreted as probabilities (compare Fig. 6).
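The non-overlapping pooling scheme described above (cluster size z × z with stride s = z) can be sketched as follows; a minimal NumPy illustration, assuming the input height and width are multiples of z:

```python
import numpy as np

def pool2d(x, z, mode="max"):
    """Non-overlapping pooling with stride s = z (traditional local pooling).
    Reshapes the map into z-by-z blocks and reduces each block."""
    h, w = x.shape[0] // z, x.shape[1] // z
    blocks = x[:h*z, :w*z].reshape(h, z, w, z)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # max-pooling: maximum per cluster
    return blocks.mean(axis=(1, 3))      # average-pooling: mean per cluster
```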
The Squeezenet architecture consists of 69 layers, whereby eight so-called "fire modules" (compare Fig. 7) are a special feature. A fire module consists of a 1×1 convolutional layer, which acts as a squeeze layer and feeds, after a ReLU activation, into an expand layer that contains a mix of 1×1 and 3×3 convolution filters. If the number of filters in the squeeze layer is set smaller than the sum of the filters in the expand layer, the number of input channels of the 3×3 convolutional filters decreases, and thus the number of parameters decreases overall [12].
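The parameter saving of a fire module can be illustrated by counting weights and biases; the filter counts below correspond to the first fire module of SqueezeNet 1.0 and are given only as an example:

```python
def fire_params(c_in, s1, e1, e3):
    """Parameter count (weights + biases) of a fire module with c_in input
    channels, s1 squeeze 1x1 filters, e1 expand 1x1 and e3 expand 3x3 filters."""
    squeeze = c_in * s1 + s1                          # 1x1 squeeze layer
    expand = s1 * e1 + e1 + s1 * 9 * e3 + e3          # 1x1 and 3x3 expand layers
    return squeeze + expand

# first fire module of SqueezeNet 1.0: 96 input channels, s1=16, e1=e3=64
small = fire_params(96, 16, 64, 64)
# a plain 3x3 convolution producing the same 128 output channels
plain = 96 * 9 * 128 + 128
```

With 16 squeeze filters the module needs 11,920 parameters, whereas the plain 3×3 convolution would need 110,720.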
After the last fire module (fire9), a 50% dropout layer is implemented. The pretrained Squeezenet DNN is used for initial testing and adapted to the investigated use case. As Squeezenet is pretrained for 1000 classes, the classification layer has to be replaced with a 2-class classification layer. In addition, a fully connected layer is added. In the presented use case, the Adam optimizer was used to optimize the parameters. Compared to stochastic gradient descent, Adam optimization often leads to a faster initial decrease in training loss [16] and has a default learning rate that works well across problem settings. In comparison to optimizers such as stochastic gradient descent with momentum, the Adam optimizer does not feature a fixed learning rate; instead, the effective step size is calculated individually for each time step. The updated parameters are calculated as shown in formula (1.1): \({\theta }_{t}\) represents the updated parameters, \(\alpha\) is the learning rate and \(\epsilon\) is a constant which is added to avoid division by zero. The algorithm also maintains an exponential moving average of the gradient (\({m}_{t}\)) and of the squared gradient (\({v}_{t}\)). The hyperparameters \({\beta }_{1}, {\beta }_{2}\in [0, 1)\) control the exponential decay rates of these moving averages. The moving average of the gradient \({m}_{t}\) is calculated according to formula (1.2) and that of the squared gradient \({v}_{t}\) according to formula (1.3). \(\nabla E\left({\theta }_{t-1}\right)\) represents the gradient of the loss function at the current parameter vector [16].
$${\theta }_{t}={\theta }_{t-1}-\frac{\alpha {m}_{t}}{\sqrt{{v}_{t}}+\epsilon }\quad(1.1)$$
$${m}_{t}={\beta }_{1}{m}_{t-1}+\left(1-{\beta }_{1}\right)\nabla E\left({\theta }_{t-1}\right)\quad(1.2)$$
$${v}_{t}={\beta }_{2}{v}_{t-1}+\left(1-{\beta }_{2}\right){\left[\nabla E\left({\theta }_{t-1}\right)\right]}^{2}\quad(1.3)$$
The parameters \(\alpha\), \(\epsilon\), \({\beta }_{1}\) and \({\beta }_{2}\) are set in the training options. In the presented case, a base learning rate of \(\alpha ={10}^{-4}\) is used, together with the default values \(\epsilon ={10}^{-8}\), \({\beta }_{1}=0.9\) and \({\beta }_{2}=0.999\). Due to limited computing memory, the complete data set is divided into mini-batches of 32 images. During training, the network is validated every 30 iterations against the validation data set.
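Formulas (1.1)-(1.3) with the stated hyperparameter values can be implemented directly. The NumPy sketch below follows the equations exactly as given above, i.e. without the bias-correction terms of the original Adam formulation:

```python
import numpy as np

def adam_step(theta, grad, m, v, alpha=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam parameter update following formulas (1.1)-(1.3)."""
    m = beta1 * m + (1 - beta1) * grad              # (1.2) moving avg of gradient
    v = beta2 * v + (1 - beta2) * grad**2           # (1.3) moving avg of squared gradient
    theta = theta - alpha * m / (np.sqrt(v) + eps)  # (1.1) parameter update
    return theta, m, v
```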
4.3. Comparison of different training strategies for the AI
The adaptation of the simulation parameters is based on the one hand on real-time process data (zone temperature, forming speed and duration, etc.) collected via the RTM, and on the other hand on perception-based data collected via a 3D imaging sensor [17]. The following section describes the perception-based quality assessment process using a DNN-based method. As a basis, a software interface for point-cloud data acquisition from the 3D laser imaging sensor is developed using a TCP/IP interface.
The interface provides the opportunity to capture 3D data from the sensor with a resolution of up to 0.06–0.09 mm in the XY direction and 0.0047 mm in the Z direction. The field of view is 142 × 190 mm and the scan rate is 4 Hz. The sensor provides a fully autonomous system including data processing, measurement, database, and a web server. Part dimensions and surface quality can be inspected inline using predefined features within the sensor system. The data can be accessed via TCP/IP from the Edge MicroServer to provide them as process performance data for condition-monitoring tasks. Furthermore, the additional image-based method is an advanced quality inspection instance that feeds the sensor point-cloud data into an AI-based analytics tool, which identifies the intensity of occurring errors via a heatmap (compare Fig. 8).
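As an illustration of the TCP/IP data acquisition, the sketch below decodes one hypothetical point-cloud frame. The frame layout (a uint32 point count followed by little-endian float32 x/y/z triplets) is an assumption for illustration only and does not reflect the actual Gocator protocol:

```python
import struct

def parse_profile(frame: bytes):
    """Decode one hypothetical sensor frame: a uint32 point count followed
    by that many little-endian float32 (x, y, z) triplets."""
    (n,) = struct.unpack_from("<I", frame, 0)
    pts = struct.unpack_from(f"<{3*n}f", frame, 4)
    return [pts[i:i+3] for i in range(0, 3*n, 3)]
```

In practice, such frames would be read from a TCP socket connected to the sensor and the decoded points assembled into the point cloud used for inspection.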
A widespread DNN based on the SqueezeNet architecture is used for the reference training and result evaluation [18]. The training is performed on a workstation with an Intel Xeon® Quad-Core E5-2690 processor. One of the biggest issues for applying this method on a broad scale is the need for a huge number of sample parts that represent defects and anomalies. For this purpose, the conventional point-cloud scan data provided by the 3D imaging sensor are enriched with forming simulation data. The simulation produces generic defects, e.g. cracks and surface deformation, and provides them as grayscale images. The main challenge at this step is to make the simulation images representative of the 3D-scan data provided by the sensor. Figure 9a) shows a sample of the scan data; Figure 9b) shows the filtered simulation data image. The data are converted into a grayscale representation and filtered to match. The results are divided into "good" and "defective" parts that are mixed in different training folders. The training based on the hybrid data set is performed with 20% real scan images and 80% simulated images.
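The conversion and matching step can be sketched as follows. The luminance weights and the choice of a simple mean filter are illustrative assumptions, as the exact filter used to align simulation and scan images is not specified here:

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale conversion of an H x W x 3 image."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def box_filter(img, k=3):
    """Simple k x k mean filter to soften simulation renderings so that
    they resemble the noisier 3D-scan images (assumed choice of filter)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for di in range(k):
        for dj in range(k):
            out += padded[di:di+img.shape[0], dj:dj+img.shape[1]]
    return out / (k * k)
```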
Figure 10 represents the confidence scores of the verification runs with the trained DNN. For verification, "defective" and "good" parts are provided to the sensor in random order. Based on the test results, the following statements can be made:
- For the "good" samples, the DNN trained with only simulation data reaches a median confidence score of 0.228 for "defective" and 0.772 for "good" (see Fig. 10). All components are classified correctly as "good" parts (see Fig. 11).
- The "defective" parts are classified, using simulation training data, with 0.996 as "defective" and 0.004 as "good". Figure 11 indicates that 17.95% of the defective scan samples are not correctly classified.
- The DNN trained with "good" scan images reaches a median of 0.254 for the "defective" score and 0.746 for the "good" score. 11.11% are not correctly classified (see Fig. 11).
- Using "defective" scan images, a median of 1.000 for the "defective" score and 0.000 for the "good" score is reached. 4.76% are not correctly classified.
- The DNN trained with "good" hybrid images reaches a median of 0.274 for the "defective" score and 0.726 for the "good" score. 20.59% are not correctly classified.
- The DNN trained with "defective" hybrid images reaches a median of 1.000 for the "defective" score and 0.000 for the "good" score. 1.67% are not correctly classified.
3. Advanced general-purpose multiphysics simulation software used by the automobile, aerospace, construction and civil engineering, military, manufacturing, and bioengineering industries. Developed by Dr. John O. Hallquist at Lawrence Livermore National Laboratory (LLNL).
4. Standalone design optimization and probabilistic analysis package with an interface to LS-DYNA.
5. Industrial 3D sensor developed by GOM GmbH, Zeiss.
6. 3D smart sensor for industrial quality inspection developed by LMI Technologies.
7. Commercial software of the US company MathWorks for solving mathematical problems.
8. Non-linear activation function used in multi-layer neural networks or deep neural networks.