A. space magnetic flux density acquisition
When the metal target is far away from the observation point, we can approximately think that the magnetization field model of the metal target is equivalent to the magnetic dipole model [17]. The magnetic field distribution of the magnetic dipole in space can be expressed by the following formula [18].
\({\mathbf{B}_{dipolar}}\left( \mathbf{r} \right)=\frac{{{\mu _0}}}{{4\pi }}\frac{1}{{{r^3}}}\left[ {3\left( {\mathbf{m} \cdot \hat{\mathbf{r}}} \right)\hat{\mathbf{r}} - \mathbf{m}} \right]\)
(1)
Here, m is the magnetic moment vector, with dimension [L2I]; r denotes the distance from the magnetic dipole's center to the observation point; \(\hat{\mathbf{r}}\) denotes the unit vector pointing from the magnetic dipole's center toward the observation point; and µ0 denotes the vacuum permeability.
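Equation (1) can be evaluated numerically in a few lines. The sketch below (the function name `dipole_field` is ours, not from the paper) computes the flux density of a dipole at an arbitrary observation point:

```python
import numpy as np

MU0 = 4 * np.pi * 1e-7  # vacuum permeability, T*m/A

def dipole_field(m, r_vec):
    """Magnetic flux density (T) of a dipole with moment m (A*m^2)
    at displacement r_vec (m) from its center, per Eq. (1)."""
    r = np.linalg.norm(r_vec)
    r_hat = r_vec / r
    return MU0 / (4 * np.pi * r**3) * (3 * np.dot(m, r_hat) * r_hat - m)

# Example: a 1 A*m^2 moment along z, observed 1 m away on the z-axis
B = dipole_field(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]))
# -> [0, 0, 2e-7] T, i.e. 200 nT along z
```

The 1/r³ falloff in Eq. (1) is why the dipole approximation, and hence this function, is only appropriate when the target is far from the observation point.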
To obtain more spatial magnetic anomaly data while keeping the distance between adjacent fluxgate sensitive units at 400 mm, we arranged 8 three-axis fluxgate magnetometers (model HSF923-2H5-AA, Xi'an Huashun) in an array. We call this structure the fluxgate magnetometer cube arrangement structure (FMCAS), as shown in Fig. 1. Each fluxgate magnetometer outputs x-, y-, and z-axis magnetic flux data, so the FMCAS yields 8 sets of three-axis data, 24 channels in total.
We acquire an east-west magnetic field measurement line north of the detection target. As shown in Fig. 2, we fasten the FMCAS to the sliding block with copper bolts and place it on a sliding track made of aluminum alloy. A laser distance sensor (model L-GAGE) records the position of the FMCAS relative to the sensor. The sliding block is made of wood; to ease sliding, we smeared grease on the track. Each time an assistant pulls the sliding block along the track with a rope, we obtain the experimental data of one measuring line.
With this method, each slide of the block yields the magnetic field data of one survey line, which contains the target magnetic anomaly. We keep the middle part of the survey line (the data at both ends fluctuate strongly as sliding starts and stops), divide this part into 100 equal segments, and obtain 101 position coordinates. Associating the FMCAS magnetic flux density data with each position gives a 101×24 set of position-indexed magnetic flux data.
We define the pseudo-total field xyzi of the ith fluxgate magnetometer (pseudo because the fluxgate magnetometers have not been calibrated) as follows:
$$xy{z_i}=\sqrt {{x_i}^{2}+{y_i}^{2}+{z_i}^{2}} \left( {i=1,2, \cdot \cdot \cdot ,8} \right)$$
In this formula, xyzi is the pseudo-total field of the ith fluxgate magnetometer, and xi, yi, and zi are the ith fluxgate magnetometer's x-, y-, and z-axis magnetic field outputs, respectively.
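The pseudo-total field is just the Euclidean norm of each sensor's three-axis output; for all 8 fluxgates at once it can be vectorized (a sketch, with the function name ours):

```python
import numpy as np

def pseudo_total_field(xyz):
    """Pseudo-total field for each fluxgate magnetometer.
    xyz: array of shape (8, 3) holding (x_i, y_i, z_i) outputs."""
    return np.sqrt((xyz ** 2).sum(axis=1))

readings = np.array([[3.0, 4.0, 12.0]] * 8)  # illustrative values
print(pseudo_total_field(readings))          # each entry is 13.0
```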
We construct the magnetic flux tensor matrix shown in Fig. 3; its size is [101, 8, 4]. The first dimension holds the 101 position indices; the second holds the labels of the 8 fluxgate magnetometers; the third holds the xyz pseudo-total field together with the x-, y-, and z-axis magnetic field components.
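Assembling the [101, 8, 4] tensor from the per-sensor components can be sketched as follows (placeholder random data stands in for the measured flux):

```python
import numpy as np

n_pos, n_sensors = 101, 8
# x, y, z outputs of each fluxgate at each of the 101 positions (placeholder)
xyz = np.random.randn(n_pos, n_sensors, 3)

# pseudo-total field as the first channel, then the x, y, z components
total = np.linalg.norm(xyz, axis=2, keepdims=True)
tensor = np.concatenate([total, xyz], axis=2)  # shape (101, 8, 4)
```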
Later, we will use the above-mentioned magnetic flux tensor matrix to train the recognition algorithm.
B. ResNet-18
ResNet (Residual Neural Network) was proposed by Kaiming He and three colleagues at Microsoft Research. Using the ResNet unit, they successfully trained a 152-layer neural network and won the ILSVRC 2015 classification competition with a top-5 error rate of 3.57%. ResNet's topology allows rapid training of very deep neural networks while considerably improving model accuracy.
When a neural network is deepened, gradients can explode or vanish. Normalized initialization [19–21] and intermediate normalization layers [22] mitigate this problem. Because of the nonlinear activation function ReLU, each input-to-output mapping is almost irreversible (information is lost) [23]; it is difficult to recover the full input from the output, which makes it very unlikely that features are fully preserved as forward propagation proceeds layer by layer. The residual learning module addresses this: through a skip connection, the output of an earlier layer (or layers) is added to the output computed by the current layer, and the sum is fed into the activation function as the current layer's output [24]. In this way the depth of the network can be greatly increased. The main structure of ResNet is shown in Table 1.
Table 1
ResNet architectures with different layers
| layer name | output size | 18-layer | 50-layer |
| --- | --- | --- | --- |
| conv1 | 112×112 | 7×7, 64, stride 2 | 7×7, 64, stride 2 |
| conv2_x | 56×56 | 3×3 max pool, stride 2; [3×3, 64; 3×3, 64] ×2 | 3×3 max pool, stride 2; [1×1, 64; 3×3, 64; 1×1, 256] ×3 |
| conv3_x | 28×28 | [3×3, 128; 3×3, 128] ×2 | [1×1, 128; 3×3, 128; 1×1, 512] ×4 |
| conv4_x | 14×14 | [3×3, 256; 3×3, 256] ×2 | [1×1, 256; 3×3, 256; 1×1, 1024] ×6 |
| conv5_x | 7×7 | [3×3, 512; 3×3, 512] ×2 | [1×1, 512; 3×3, 512; 1×1, 2048] ×3 |
|  | 1×1 | average pool, 1000-d fc, softmax | average pool, 1000-d fc, softmax |
We conduct our object classification and recognition research using the main network architecture of ResNet-18. In practice, we make the following modifications to the original ResNet-18:
Step 1. Remove the first 7×7 convolution layer of the original network;
Step 2. Replace the original second-layer 3×3 max pooling layer with a 3×3 convolution with 64 kernels;
Step 3. Replace the 1000 neurons of the final fully connected layer with 3 neurons.
Through the above processing we obtain an improved ResNet-18 network. The magnetic feature data, in the 101×8×4 format, is first convolved with a 3×3 kernel and then passed in turn through 8 ResNet blocks for the corresponding convolution operations. The data is then globally pooled before being fed into the fully connected layer. Figure 4 depicts the network structure.
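One way to realize Steps 1–3 in Keras is sketched below. This is an assumption-laden reconstruction, not the authors' code: the helper names are ours, and the block/filter schedule simply follows the standard ResNet-18 stages with the modified stem and 3-way head described above.

```python
import tensorflow as tf
from tensorflow.keras import layers

def modified_stem(inputs):
    # Steps 1-2: the 7x7 conv and 3x3 max pool are replaced
    # by a single 3x3 convolution with 64 kernels
    x = layers.Conv2D(64, 3, padding="same")(inputs)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def residual_block(x, filters):
    # Basic ResNet block: two 3x3 convs with an identity shortcut
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    if shortcut.shape[-1] != filters:
        # 1x1 projection when the channel count changes
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    return layers.ReLU()(layers.Add()([y, shortcut]))

inputs = layers.Input(shape=(101, 8, 4))  # magnetic flux tensor matrix
x = modified_stem(inputs)
for filters in (64, 64, 128, 128, 256, 256, 512, 512):  # 8 ResNet blocks
    x = residual_block(x, filters)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(3, activation="softmax")(x)  # Step 3: 3 classes
model = tf.keras.Model(inputs, outputs)
```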
2. Training the ResNet-18 model
All model training and evaluation experiments were carried out on a workstation with an NVIDIA GeForce RTX 2080 Ti GPU and 32 GB of RAM. The GPU version of TensorFlow runs on Windows 10 and is installed via the Anaconda 3 platform.
A. Dataset generation
We used iron balls of three different sizes as the training and recognition targets in the experiment. The 5 m long sliding track is located on the north side of the detection target, and the laser distance sensor that records the FMCAS position is located on the east side of the track. The data collector (model PXIe-4309, NI) is located 4 m northeast of the sliding track, as shown in Fig. 5.
A spatial Cartesian coordinate system is defined, as shown in Fig. 6, with the X axis pointing north and the Y axis pointing west; units are meters. The start of the sliding track is at coordinates (0.5, -2) and its end is at coordinates (0.5, 2). The FMCAS slides from the start of the track to its end, a distance of 4 m. During the slide, the laser distance sensor records in real time the distance (0–4 m) of the FMCAS from the track's starting point.
Each detection target starts at position (-0.2, -0.2) and ends at (0.2, 0.2). The target moves 0.1 m along the x-axis each time; when its x coordinate reaches 0.2 m, the target moves 0.1 m along the y-axis and the x-axis scan restarts, completing one row per cycle. This gives 25 target positions in total, and at each position the FMCAS slides 40 times. Each target therefore yields 1000 groups of data, which are used as the deep learning training set. The targets are then placed at 5 random positions within the square area from (-0.2, -0.2) to (0.2, 0.2), again with 40 slides each, giving 200 groups of data per target as the test set. The three targets we measured and their valid data are shown in Table 2.
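The 25-point scan grid and the resulting training-set size can be checked with a short sketch (variable names are ours):

```python
import numpy as np

# 5x5 grid of target positions from (-0.2, -0.2) to (0.2, 0.2), 0.1 m steps
coords = np.round(np.arange(-0.2, 0.25, 0.1), 1)
positions = [(x, y) for y in coords for x in coords]  # 25 points

runs_per_position = 40
print(len(positions) * runs_per_position)  # 1000 training groups per target
```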
Table 2
Statistical table of measured data
| Detection Target | Radius (mm) | Valid data (groups) | Invalid data (groups) |
| --- | --- | --- | --- |
| No. 1 iron ball | 51 | 1238 | 0 |
| No. 2 iron ball | 56.5 | 1223 | 0 |
| No. 3 iron ball | 67.5 | 1240 | 1 |
During the experiment, we set the sampling frequency to 100 Hz. The experimental space is disturbed by the power-frequency signal, while the target magnetic anomaly we are concerned with is the DC component. Therefore, in signal processing we low-pass filter the collected time-domain signal with a Butterworth filter, setting the passband edge to 2 Hz, the stopband edge to 12 Hz, and the filter order to 6. The comparison before and after low-pass filtering is shown in Fig. 7: the red curve is the unfiltered time-domain signal and the black curve is the low-pass filtered signal.
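The filtering step can be sketched with SciPy; the synthetic signal below (a DC level plus 50 Hz power-line interference) is illustrative, not measured data:

```python
import numpy as np
from scipy import signal

fs = 100.0             # sampling frequency, Hz
order, cutoff = 6, 2.0 # 6th-order low-pass, 2 Hz passband edge

b, a = signal.butter(order, cutoff, btype="low", fs=fs)

t = np.arange(0, 2, 1 / fs)
# DC anomaly of interest plus 50 Hz power-frequency interference
raw = 1.0 + 0.5 * np.sin(2 * np.pi * 50 * t)
filtered = signal.filtfilt(b, a, raw)  # zero-phase filtering
```

`filtfilt` applies the filter forward and backward so the DC component of interest is preserved without phase distortion; the 50 Hz interference, far above the 2 Hz cutoff, is strongly attenuated.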
We choose the starting point of the magnetic tensor matrix where the FMCAS is 1000 mm from the laser distance sensor, and the end point where it is 4000 mm away. We divide this 3000 mm span into 101 evenly spaced points. Taking the FMCAS position as the independent variable and the three-axis magnetic flux density of each fluxgate sensor as the dependent variable, we interpolate linearly between the sampled points to obtain each sensor's three-axis output at any required position. In this way we obtain the three-axis magnetic field data of the 8 fluxgate sensors at 101 equally spaced positions between 1000 mm and 4000 mm. The interpolation process is shown in Fig. 8.
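The resampling onto 101 equally spaced positions amounts to a 1-D linear interpolation per channel; a sketch with placeholder data (the variable names and the synthetic flux values are ours):

```python
import numpy as np

np.random.seed(0)
# Recorded slide positions (mm, from the laser distance sensor) and one
# sensor-axis flux reading at each position (placeholder linear data)
pos_raw = np.linspace(950, 4050, 400) + np.random.uniform(-2, 2, 400)
flux_raw = 1e-3 * pos_raw  # stand-in for one measured flux component

# 101 equally spaced query points between 1000 mm and 4000 mm
pos_query = np.linspace(1000, 4000, 101)
flux_query = np.interp(pos_query, pos_raw, flux_raw)
```

`np.interp` requires monotonically increasing sample positions, which holds here because the jitter is smaller than the nominal spacing; repeating this per sensor and axis fills the [101, 8, 4] tensor.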
Following the above procedure, we process each set of data into a magnetic flux tensor matrix and map the resulting matrices one-to-one to their labels.
B. Training of recognition network
According to Table 2, we divided the data set into 3088 training samples, 513 test samples, and 100 validation samples. The three sets are mutually exclusive. The model took 4 hours to train over 100 iterations, and the iteration curve is illustrated in Fig. 9.
For a sample set D, the classification accuracy is defined as follows:
\(acc\left( {f;D} \right)=\frac{1}{m}\sum\limits_{{i=1}}^{m} {\mathbb{I}\left( {f\left( {{x_i}} \right)={y_i}} \right)}\)
Here f represents the fully trained neural network, m is the total number of samples, and \(\mathbb{I}\left( \cdot \right)\) is the indicator function. The accuracy of the training set steadily increases as the number of iterations grows, eventually converging to 1; the accuracy of the test set increases gradually after the 60th iteration, eventually converging to roughly 0.9. To further verify the recognition effect, we added 7 groups of target-free ambient magnetic field data to the validation set, for a total of 107 groups. Table 3 shows the labels in the final validation set.
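The accuracy definition simply counts correct predictions over the sample set; a minimal sketch (the `accuracy` helper and toy data are ours):

```python
import numpy as np

def accuracy(f, X, y):
    """Classification accuracy of classifier f over samples (X, y)."""
    preds = np.array([f(x) for x in X])
    return np.mean(preds == y)

# Toy check: a classifier that always predicts class 0
X = np.zeros((10, 4))
y = np.array([0] * 9 + [1])
print(accuracy(lambda x: 0, X, y))  # 0.9
```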
Table 3
Validation set contains label table
| Detection Target | Valid data (groups) |
| --- | --- |
| No. 1 iron ball | 40 |
| No. 2 iron ball | 32 |
| No. 3 iron ball | 28 |
| No see | 7 |
To address the problem of target recognition accuracy, we add the target-free environmental magnetic field and set the target recognition threshold to 0.75. During recognition, if a target in the target library is inferred with a probability above 75%, we consider the recognition valid. If no target's inferred probability exceeds 75%, we conclude that the detected target is not in the target library and label it "No see". Applying the model to the validation set resulted in an accuracy of 84.1%. Recall, precision, and the F1 value are used as assessment indicators of the model's performance.
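The thresholded decision rule described above can be sketched as follows (the function name is ours; the probability vectors are illustrative):

```python
import numpy as np

LABELS = ["No. 1 iron ball", "No. 2 iron ball", "No. 3 iron ball"]
THRESHOLD = 0.75

def decide(softmax_probs):
    """Map the network's softmax output to a library target,
    or to 'No see' when no class clears the 75% threshold."""
    k = int(np.argmax(softmax_probs))
    return LABELS[k] if softmax_probs[k] > THRESHOLD else "No see"

print(decide(np.array([0.90, 0.06, 0.04])))  # No. 1 iron ball
print(decide(np.array([0.40, 0.35, 0.25])))  # No see
```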
\(Recall=\frac{{TP}}{{TP+FN}}\)
\(Precision=\frac{{TP}}{{TP+FP}}\)
\(Specificity=\frac{{TN}}{{TN+FP}}\)
\({F_1}=\frac{{2 \times Recall \times Precision}}{{Recall+Precision}}\)
TP (True Positive) denotes correctly identified positive examples; FP (False Positive) denotes negative examples misidentified as positive; FN (False Negative) denotes positive examples misidentified as negative; and TN (True Negative) denotes correctly identified negative examples. Figure 10 depicts the confusion matrix derived by calculation.
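The four metrics follow directly from the confusion counts; a compact sketch (the helper name and the example counts are ours, not the paper's values):

```python
def metrics(tp, fp, fn, tn):
    """Recall, precision, specificity, and F1 from confusion counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, specificity, f1

# Illustrative counts for one class of a one-vs-rest evaluation
r, p, s, f1 = metrics(tp=27, fp=1, fn=1, tn=78)
```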
After calculation, the model's overall accuracy is 84.1%. Table 4 shows the precision, recall, single-class accuracy, and F1 value for each class.
Table 4
Model reference index table
| Detection Target | Precision | Recall | Acc_single | F1 |
| --- | --- | --- | --- | --- |
| No. 1 iron ball | 88.2% | 75% | 86.9% | 0.81 |
| No. 2 iron ball | 87.5% | 87.5% | 92.5% | 0.87 |
| No. 3 iron ball | 96.4% | 96.4% | 96.3% | 0.96 |
| No see | 38.5% | 71.4% | 90.6% | 0.5 |