Field Road Segmentation Method Based on Two Channel Feature Fusion
https://doi.org/10.1007/s11042-024-20071-8
Abstract
Field road segmentation is one of the key technologies for realizing autonomous navigation of intelligent agricultural machinery. For the complex and changeable conditions of agricultural field roads, existing road segmentation models cannot meet the need for fast and accurate detection of drivable areas. In this paper, we propose a field road segmentation model (TFFNet) based on two channel feature fusion. First, an atrous spatial pyramid pooling module is used to extract and fuse multi-scale pooling features. Second, deep separable convolution modules and asymmetric atrous convolution modules (AACM) extract deep features along two paths, and these features are fused to obtain a feature map with rich global scene context information. Finally, a channel attention module is introduced to update the weights of the feature channels at each stage, so that the extracted feature information helps improve the accuracy of field road segmentation. Experiments show that, compared with the existing PSPNet, ENet and DeepLabV3+ models, the proposed model achieves better segmentation performance on our multi-climate field road dataset and strikes a balance between segmentation accuracy and speed that meets the needs of intelligent agricultural machinery for field road detection.
1 Introduction
Agriculture is the basic industry for the development of the national economy. With the continuous advancement of urbanization, the rural young and middle-aged labor force is moving to cities and towns, so that agricultural production faces labor shortages and declining productivity, which restricts the quality and efficiency of agricultural production to a certain extent [1]. Improving the level of automation and intelligence in the agricultural production process and developing intelligent agricultural
needs of intelligent agricultural machinery and equipment to quickly detect the drivable
area of the road.
This paper proposes a field road segmentation framework (TFFNet) based on a channel attention mechanism combined with two channel feature fusion. First, the atrous spatial pyramid pooling module is used to extract and fuse multi-scale pooling features of the input image. Second, multiple deep separable convolution modules and asymmetric atrous convolution modules (AACM) are used in sequence to extract deep feature information along two paths, and the feature information extracted by the two paths is fused. Then a channel attention module is introduced to update the weight of the feature map extracted at each stage according to the importance of its feature channels, so as to retain the channels carrying important feature information. Finally, the extracted feature map with rich feature information is decoded to obtain the segmentation result of the drivable area of the field road image. To verify the performance of the TFFNet model, experiments are conducted on a multi-climate field road dataset that we constructed. The experiments show that, compared with the existing PSPNet, ENet and DeepLabV3+ models, the proposed TFFNet model has better road segmentation performance and can meet the needs of intelligent agricultural machinery for field road detection.
In summary, our main contributions are as follows:
• We propose a field road segmentation model (TFFNet) based on two channel feature fusion. The model uses two paths to extract feature information at different scales from the input image and fuses them to obtain a feature map with rich feature information, thereby effectively improving the performance of field road segmentation.
• We propose an asymmetric atrous convolution module (AACM). This module obtains a larger receptive field without increasing the number of model parameters, capturing richer multi-scale context information and thereby improving the feature representation capability of the model.
• A dataset of field road images under various climatic conditions is constructed. We compare our method with the existing PSPNet, ENet and DeepLabV3+ models on this multi-climate field road dataset. The results show that TFFNet achieves better segmentation performance and can meet the needs of intelligent agricultural machinery for field road detection.
2 Dataset
In order to better evaluate the performance of our proposed field road segmentation model,
we produced a dataset containing field road images of various weather scenes, referred to
as the MFR dataset.
2.1 Image acquisition
In order to obtain field road images conveniently, we used a GoPro HERO 9 action camera, a Feiyu VIMBLE 2 stabilizer and a Devastator crawler mobile platform to build an image acquisition vehicle. During collection, the vehicle drives on the field road at a constant speed of 2 m/s, and the camera captures a road image every 3 s at a resolution of 1920 × 1280. All images in the MFR dataset were acquired by the image acquisition vehicle in two provinces, Guangxi and Henan, from April to November 2021. In addition, to make the dataset reflect field road scenes in actual production more realistically, and to evaluate the performance of our method more objectively, we collected road images under three climatic conditions: rainy, cloudy, and sunny. After filtering out unclear images, we obtained a total of 4286 images: 1025 rainy, 1497 cloudy and 1764 sunny.
2.2 Image preprocessing
In order to obtain a more accurate road segmentation dataset and effectively increase the diversity of data features, we performed a series of preprocessing steps on the collected field road images. First, to reduce memory usage when the network model performs feature extraction, we scaled all collected images to a resolution of 512 × 512. Second, since the collected original images have no semantic labels, we used the open-source pixel-level labeling tool Labelme to annotate the two object categories "road" and "background" in the scaled images and saved the labels as .json files, which were then batch-converted into .png semantic label images, yielding binary images of field roads and backgrounds. Some examples of field road scene images and their semantic labels from the MFR dataset are shown in Fig. 1. Then, to improve the generalization ability of the network model and avoid overfitting during training, we augmented the data through geometric transformations [28], increasing the number of images in the MFR dataset from 4286 to 38,574, including 9225 rainy, 13,473 cloudy and 15,876 sunny images. The geometric transformations flip, rotate and scale the original image and its semantic label image to expand the dataset: flipping is applied horizontally and vertically; rotation is applied from 0 to 180 degrees at 30-degree intervals; and scaling uses coefficients of 0.8, 0.9 and 1.1 [29]. Some field road images processed by these geometric transformations are shown in Fig. 2. Finally, we divided the 38,574 images into training, validation and test sets at a ratio of 7:2:1, giving 27,002 training images, 7,715 validation images and 3,857 test images.
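As a concrete illustration of the geometric augmentation described above, the following is a minimal sketch, assuming Pillow is available and using hypothetical file names; it applies the flips, 30-degree rotation steps and 0.8/0.9/1.1 scaling identically to an image and its label mask.

```python
# Hedged sketch of the geometric augmentation described in Section 2.2.
# Nearest-neighbour resampling is used for the label mask so class indices
# are not blended; file paths are illustrative assumptions.
from PIL import Image

ANGLES = range(0, 181, 30)          # 0..180 degrees in 30-degree steps
SCALES = (0.8, 0.9, 1.1)            # scaling coefficients from the paper

def augment_pair(image: Image.Image, label: Image.Image):
    """Yield (image, label) pairs produced by the geometric transforms."""
    # Horizontal and vertical flips
    yield image.transpose(Image.FLIP_LEFT_RIGHT), label.transpose(Image.FLIP_LEFT_RIGHT)
    yield image.transpose(Image.FLIP_TOP_BOTTOM), label.transpose(Image.FLIP_TOP_BOTTOM)
    # Rotations at 30-degree intervals
    for angle in ANGLES:
        yield (image.rotate(angle, resample=Image.BILINEAR),
               label.rotate(angle, resample=Image.NEAREST))
    # Scaling with the three coefficients
    w, h = image.size
    for s in SCALES:
        size = (int(w * s), int(h * s))
        yield (image.resize(size, Image.BILINEAR),
               label.resize(size, Image.NEAREST))

if __name__ == "__main__":
    img = Image.open("example_road.jpg").convert("RGB")   # hypothetical paths
    lab = Image.open("example_road_mask.png")
    for i, (im, lb) in enumerate(augment_pair(img, lab)):
        im.save(f"aug_{i}.jpg")
        lb.save(f"aug_{i}.png")
```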
3 Methodology
The field road segmentation model (TFFNet) based on two channel feature fusion proposed in this paper adopts an asymmetric encoder-decoder structure. In this section, we introduce the encoding network, the decoding network and the overall network structure of the TFFNet model in detail.
3.1 Encoder structure
The encoding network of our TFFNet model is mainly composed of four parts: the atrous spatial pyramid pooling module (ASPPM), the efficient channel attention mechanism module (ECAM), the deep separable convolution module (DSCM) and the asymmetric atrous convolution module (AACM), as shown in Fig. 3.
First, the number of channels of the input image is adjusted by a 1 × 1 convolution layer and downsampled by a max pooling layer to reduce the dimension of the feature map while retaining effective feature information. The atrous spatial pyramid pooling module (ASPPM) is then used to extract and fuse feature information at multiple scales to obtain more effective global scene context information. As shown in Fig. 4a, the ASPPM consists of a 1 × 1 convolutional layer, three atrous convolutional layers with dilation rates of 6, 12 and 18, and an adaptive global pooling branch. The atrous convolution layers with different dilation rates extract feature information at different scales.
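The following is a minimal PyTorch sketch of an ASPPM of this form (a 1 × 1 convolution, three 3 × 3 atrous convolutions with dilation rates 6/12/18 and an adaptive global pooling branch, concatenated and fused by a 1 × 1 convolution); the channel widths are illustrative assumptions rather than the paper's exact values.

```python
# Hedged sketch of an atrous spatial pyramid pooling module (ASPPM).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPPM(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, rates=(6, 12, 18)):
        super().__init__()
        def branch(k, d):
            pad = 0 if k == 1 else d
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=pad, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        # 1x1 branch plus three atrous 3x3 branches
        self.branches = nn.ModuleList([branch(1, 1)] + [branch(3, r) for r in rates])
        self.pool = nn.Sequential(                    # adaptive global pooling branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False), nn.ReLU(inplace=True))
        self.project = nn.Sequential(                 # fuse the five branches
            nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        g = F.interpolate(self.pool(x), size=x.shape[2:], mode="bilinear",
                          align_corners=False)
        return self.project(torch.cat(feats + [g], dim=1))

# Example: ASPPM(64, 64)(torch.randn(1, 64, 128, 128)).shape  # -> (1, 64, 128, 128)
```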
Then, the efficient channel attention mechanism module (ECAM) is used to update the weight of each feature channel of the extracted multi-scale feature map according to its importance, so as to obtain more effective feature information. As shown in Fig. 4b, the ECAM first compresses the input feature map into a 1 × 1 × C vector through a global pooling layer, then applies a one-dimensional convolution layer with kernel size K to estimate the importance of each feature channel and generate a new weight vector, and finally multiplies the original input features by the generated weights to obtain more effective feature information. The kernel size K of the one-dimensional convolution layer is determined adaptively from the number of channels, as shown in formula (1).
K = \left| \frac{\log_2 C + b}{\sigma} \right|_{odd} \quad (1)

where |\cdot|_{odd} denotes taking the nearest odd number and C is the number of channels. Following [30], \sigma and b are set to 2 and 1, respectively.
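A minimal PyTorch sketch of such an ECAM, following ECA-Net [30], is given below; the module applies global average pooling, a one-dimensional convolution whose kernel size is derived from formula (1) with σ = 2 and b = 1, and a sigmoid gate that reweights the input channels. The class name and interface are illustrative.

```python
# Hedged sketch of the efficient channel attention module (ECAM).
import math
import torch
import torch.nn as nn

class ECAM(nn.Module):
    def __init__(self, channels: int, sigma: int = 2, b: int = 1):
        super().__init__()
        k = int(abs((math.log2(channels) + b) / sigma))   # formula (1)
        k = k if k % 2 == 1 else k + 1                    # |.|_odd: nearest odd number
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                 # x: (N, C, H, W)
        y = x.mean(dim=(2, 3))                            # global average pooling -> (N, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)          # 1-D conv across channels
        w = self.sigmoid(y).unsqueeze(-1).unsqueeze(-1)   # per-channel weights
        return x * w                                      # reweighted feature map

# Example: ECAM(256)(torch.randn(2, 256, 32, 32)).shape  # -> (2, 256, 32, 32)
```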
Finally, the feature map output by the ECAM is down-sampled by a max pooling layer to further reduce its dimension. Five cascaded deep separable convolution modules (DSCM) and five cascaded asymmetric atrous convolution modules (AACM) then extract and fuse deep feature information of different scales along two paths, producing a feature map with rich feature information and completing the feature encoding of the original input image. As shown in Fig. 4c, assuming the input feature map has C channels, the deep separable convolution module (DSCM) applies a 3 × 3 convolution to the feature map of each channel to obtain C feature maps, which are then fused by a 1 × 1 convolutional layer. As shown in Fig. 4d, the asymmetric atrous convolution module (AACM) consists of 1 × 5 convolutional layers, 5 × 1 convolutional layers, 1 × 1 convolutional layers and 3 × 3 atrous convolutions with a dilation rate of 2. Features are extracted from the input through the 1 × 5 convolution layer, the 5 × 1 convolution layer and the 3 × 3 atrous convolution layer (dilation rate 2), and the three resulting feature maps are fused according to formulas (2)–(4). Because the convolution kernels have different sizes, asymmetric convolution reduces the information redundancy of ordinary convolution, and nonlinear activation functions are introduced between the asymmetric convolutions, which strengthens the complementarity of the extracted features and improves the model's ability to fit asymmetric features.

F'^{(j)} = \frac{\gamma_j}{\sigma_j} F^{(j)} + \frac{\bar{\gamma}_j}{\bar{\sigma}_j} \bar{F}^{(j)} + \frac{\hat{\gamma}_j}{\hat{\sigma}_j} \hat{F}^{(j)} \quad (2)

b_j = -\frac{\mu_j \gamma_j}{\sigma_j} - \frac{\bar{\mu}_j \bar{\gamma}_j}{\bar{\sigma}_j} - \frac{\hat{\mu}_j \hat{\gamma}_j}{\hat{\sigma}_j} + \beta_j + \bar{\beta}_j + \hat{\beta}_j \quad (3)

O_{:,:,j} + \bar{O}_{:,:,j} + \hat{O}_{:,:,j} = \sum_{k=1}^{C} M_{:,:,k} \ast F'^{(j)}_{:,:,k} + b_j \quad (4)

where j denotes the j-th convolution kernel; \mu_j and \sigma_j denote the batch-normalized channel mean and standard deviation; \gamma_j and \beta_j denote the scaling factor and offset; F'^{(j)} denotes the fused convolution kernel and b_j the fused bias; F^{(j)}, \bar{F}^{(j)} and \hat{F}^{(j)} denote the kernels of the 3 × 3 atrous convolution layer, the 1 × 5 convolution layer and the 5 × 1 convolution layer, respectively; O_{:,:,j}, \bar{O}_{:,:,j} and \hat{O}_{:,:,j} denote the corresponding outputs of these three layers; and M_{:,:,k} denotes the k-th channel of the input feature map.
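The sketch below illustrates, under stated assumptions, how the two deep-feature branches could look in PyTorch: DSCM as a depthwise 3 × 3 convolution followed by a 1 × 1 pointwise fusion, and AACM as the sum of a 1 × 5, a 5 × 1 and a 3 × 3 atrous (dilation 2) branch, which the additive fusion of formulas (2)–(4) folds into a single kernel and bias at inference. Channel widths and the concatenation used to fuse the two paths are assumptions.

```python
# Hedged sketch of the DSCM and AACM building blocks.
import torch
import torch.nn as nn

class DSCM(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(ch, ch, 3, padding=1, groups=ch, bias=False)
        self.pointwise = nn.Conv2d(ch, ch, 1, bias=False)   # fuses the per-channel maps
        self.bn = nn.BatchNorm2d(ch)
        self.act = nn.PReLU(ch)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class AACM(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        def conv_bn(k, pad, dil=1):
            return nn.Sequential(
                nn.Conv2d(ch, ch, k, padding=pad, dilation=dil, bias=False),
                nn.BatchNorm2d(ch))
        self.branch_1x5 = conv_bn((1, 5), (0, 2))
        self.branch_5x1 = conv_bn((5, 1), (2, 0))
        self.branch_atrous = conv_bn(3, 2, dil=2)            # 3x3, dilation rate 2
        self.act = nn.PReLU(ch)

    def forward(self, x):
        # Summing the three batch-normalised branches corresponds to the
        # additive kernel/bias fusion of formulas (2)-(4).
        return self.act(self.branch_1x5(x) + self.branch_5x1(x) + self.branch_atrous(x))

# Two-path extraction and fusion (illustrative):
# x = torch.randn(1, 128, 64, 64)
# f1 = nn.Sequential(*[DSCM(128) for _ in range(5)])(x)
# f2 = nn.Sequential(*[AACM(128) for _ in range(5)])(x)
# fused = torch.cat([f1, f2], dim=1)   # (1, 256, 64, 64)
```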
3.2 Decoder structure
The decoding network of our proposed TFFNet model is mainly composed of two upsampling bottleneck modules (UBM) and four common bottleneck modules (CBM) arranged alternately, as shown in Fig. 3. The feature map extracted by the encoder is upsampled twice by the upsampling bottleneck modules (UBM) to restore it to the size of the original image, and the four common bottleneck modules (CBM) decode the feature map to obtain the field road segmentation result. The common bottleneck module (CBM) consists of two 1 × 1 convolutional layers and one 3 × 3 convolutional layer, interspersed with batch normalization (BN) and PReLU layers, as shown in Fig. 5a. The upsampling bottleneck module (UBM) has a similar structure and is mainly composed of three 1 × 1 convolutional layers and one 2 × 2 deconvolutional layer, again interspersed with batch normalization (BN) and PReLU layers, as shown in Fig. 5b.
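A simplified PyTorch sketch of the two decoder blocks is shown below; the exact channel ratios are assumptions, and the residual skip path of ENet-style bottlenecks (which accounts for the third 1 × 1 convolution of the UBM) is omitted for brevity.

```python
# Hedged sketch of the common bottleneck module (CBM) and the upsampling
# bottleneck module (UBM) used in the decoder.
import torch
import torch.nn as nn

def _cbr(in_ch, out_ch, k):
    """Convolution + BatchNorm + PReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(out_ch), nn.PReLU(out_ch))

class CBM(nn.Module):
    def __init__(self, ch: int, mid: int):
        super().__init__()
        self.body = nn.Sequential(_cbr(ch, mid, 1), _cbr(mid, mid, 3), _cbr(mid, ch, 1))

    def forward(self, x):
        return self.body(x)

class UBM(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, mid: int):
        super().__init__()
        self.reduce = _cbr(in_ch, mid, 1)
        self.deconv = nn.Sequential(                      # 2x upsampling stage
            nn.ConvTranspose2d(mid, mid, 2, stride=2, bias=False),
            nn.BatchNorm2d(mid), nn.PReLU(mid))
        self.expand = _cbr(mid, out_ch, 1)

    def forward(self, x):
        return self.expand(self.deconv(self.reduce(x)))

# Example: UBM(256, 64, 64)(torch.randn(1, 256, 64, 64)).shape  # -> (1, 64, 128, 128)
```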
3.3 TFFNet structure
Our proposed TFFNet model adopts an asymmetric encoder-decoder structure, which con-
sists of a large encoding network and a small decoding network, as shown in Fig. 3.
Firstly, the TFFNet model down-samples the field road image to compress the feature
dimension, and extracts the feature information of multiple scales through the atrous spa-
tial pyramid pooling module (ASPPM) to obtain more effective global scene context fea-
ture information.
Secondly, the efficient channel attention mechanism module (ECAM) calibrates the
weight of each feature channel of the extracted multi-scale feature map to screen the fea-
ture layers that are beneficial to improve the performance of model segmentation.
Then, the filtered feature map is down-sampled a second time to further reduce its dimension and the amount of computation. The deep feature information at two different scales is then extracted along the two paths by the five cascaded deep separable convolution modules (DSCM) and asymmetric atrous convolution modules (AACM), and the features from the two scales are fused to obtain a feature map with rich feature information.
Finally, the extracted feature map is decoded through the two upsampling bottleneck modules (UBM) and four common bottleneck modules (CBM) to segment the drivable area of field roads.
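To make the overall data flow concrete, the following is a highly simplified, hedged skeleton of the pipeline just described; every named stage is replaced by a lightweight stand-in convolution block, so it illustrates only the wiring (stem and first downsampling, ASPPM, ECAM, second downsampling, two five-module paths, fusion, and the UBM/CBM decoder), not the real module internals or channel widths.

```python
# Hedged structural skeleton of the TFFNet data flow (stand-in modules only).
import torch
import torch.nn as nn

def stage(in_ch, out_ch):            # stand-in for ASPPM/ECAM/DSCM/AACM/CBM
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
                         nn.BatchNorm2d(out_ch), nn.PReLU(out_ch))

class TFFNetSkeleton(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 32, 1), nn.MaxPool2d(2))      # 1x1 conv + downsampling
        self.asppm = stage(32, 64)                                           # multi-scale pooling features
        self.ecam1 = stage(64, 64)                                           # channel reweighting
        self.down2 = nn.MaxPool2d(2)                                         # second downsampling
        self.path_dscm = nn.Sequential(*[stage(64, 64) for _ in range(5)])   # DSCM path
        self.path_aacm = nn.Sequential(*[stage(64, 64) for _ in range(5)])   # AACM path
        self.fuse = stage(128, 128)                                          # two-path fusion
        self.decoder = nn.Sequential(                                        # UBM/CBM-style decoder
            nn.ConvTranspose2d(128, 64, 2, stride=2), stage(64, 64),
            nn.ConvTranspose2d(64, 32, 2, stride=2), stage(32, 32),
            nn.Conv2d(32, num_classes, 1))

    def forward(self, x):
        x = self.down2(self.ecam1(self.asppm(self.stem(x))))
        fused = self.fuse(torch.cat([self.path_dscm(x), self.path_aacm(x)], dim=1))
        return self.decoder(fused)

# Example: TFFNetSkeleton()(torch.randn(1, 3, 512, 512)).shape  # -> (1, 2, 512, 512)
```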
4 Experiments
In this section, to demonstrate the effectiveness of the proposed TFFNet model, we conduct road segmentation experiments on the constructed MFR dataset. We first introduce the experimental details and evaluation metrics, and then compare and analyze the results of the TFFNet model and the existing PSPNet, ENet and DeepLabV3+ models on the MFR dataset.
4.1 Experimental details
To verify that the proposed TFFNet model segments the drivable regions of field roads well, we conduct experiments on the MFR dataset. Our experimental platform is a desktop computer with an Intel Core i5-9400 CPU, an NVIDIA Tesla K80 GPU and 8 GB of memory, running PyTorch 1.8.1, CUDA 11.1 and cuDNN v11.1.74.
To ensure the reliability of the experiments, the performance indicators of all network models compared with our TFFNet model are obtained by training and testing under the same experimental parameter settings. When training the TFFNet model on the MFR dataset, we use stochastic gradient descent (SGD) for end-to-end training with a batch size of 16, a momentum factor of 0.9, a weight decay of 0.0001 and an initial learning rate of 0.0001; the learning rate schedule follows the poly policy, the number of epochs is 300 and the total number of iterations is 506,400. To help training reach a stable learning state as soon as possible, the learning rate is warmed up at the beginning of training: it is increased linearly from 0 to 0.025 over the first 1000 batches and then decays exponentially as the number of iterations increases. In addition, during training the model is saved every 2 epochs to avoid losses caused by power failure or abnormal exits during long training runs.
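The snippet below sketches one way to realise this schedule in PyTorch: SGD with momentum 0.9 and weight decay 0.0001, a linear warm-up to 0.025 over the first 1000 batches, and a poly-style decay over the remaining iterations. The decay power of 0.9 is an assumption, since the text specifies the poly policy without giving the exponent.

```python
# Hedged sketch of the optimizer and learning-rate schedule described above.
import torch

def build_optimizer_and_scheduler(model, peak_lr=0.025, warmup_iters=1000,
                                  total_iters=506400, power=0.9):
    optimizer = torch.optim.SGD(model.parameters(), lr=peak_lr,
                                momentum=0.9, weight_decay=1e-4)

    def lr_lambda(it):
        if it < warmup_iters:                         # linear warm-up from 0 to peak_lr
            return it / warmup_iters
        # poly decay over the remaining iterations (power 0.9 assumed)
        progress = (it - warmup_iters) / max(1, total_iters - warmup_iters)
        return (1.0 - min(progress, 1.0)) ** power

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler

# Per-batch usage inside the training loop (illustrative):
# loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```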
4.2 Evaluation metrics
To evaluate the performance of the TFFNet model for field road segmentation more intuitively, we use the following indicators: pixel accuracy (PA), mean pixel accuracy (mPA), mean intersection over union (mIoU), the number of model parameters and the road segmentation time. The pixel accuracy (PA) is the ratio of the number of correctly predicted pixels to the total number of image pixels. The mean pixel accuracy (mPA) is, for each category, the ratio of the number of correctly predicted pixels to the total number of pixels of that category, averaged over all categories. The mean intersection over union (mIoU) is the ratio of the intersection to the union of the predicted and ground-truth pixel sets of each class, averaged over all classes. Assuming the dataset has K categories, let p_{ij} denote the number of pixels whose true class is i but which are predicted as class j, so that p_{ii} is the number of correctly predicted pixels of class i. The pixel accuracy (PA), mean pixel accuracy (mPA) and mean intersection over union (mIoU) are then calculated as shown in formulas (5)–(7):

PA = \frac{\sum_{i=0}^{K-1} p_{ii}}{\sum_{i=0}^{K-1} \sum_{j=0}^{K-1} p_{ij}} \quad (5)

mPA = \frac{1}{K} \sum_{i=0}^{K-1} \frac{p_{ii}}{\sum_{j=0}^{K-1} p_{ij}} \quad (6)

mIoU = \frac{1}{K} \sum_{i=0}^{K-1} \frac{p_{ii}}{\sum_{j=0}^{K-1} p_{ij} + \sum_{j=0}^{K-1} p_{ji} - p_{ii}} \quad (7)
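For reference, formulas (5)–(7) can be computed from a K × K confusion matrix as in the following NumPy sketch; the function and variable names are illustrative.

```python
# Hedged sketch of PA, mPA and mIoU computed from a KxK confusion matrix,
# whose entry p[i, j] counts pixels of true class i predicted as class j.
import numpy as np

def confusion_matrix(pred: np.ndarray, target: np.ndarray, num_classes: int) -> np.ndarray:
    mask = (target >= 0) & (target < num_classes)
    idx = num_classes * target[mask].astype(int) + pred[mask].astype(int)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def pa_mpa_miou(p: np.ndarray):
    diag = np.diag(p).astype(float)
    pa = diag.sum() / p.sum()                               # formula (5)
    mpa = np.mean(diag / np.maximum(p.sum(axis=1), 1))      # formula (6)
    union = p.sum(axis=1) + p.sum(axis=0) - diag            # formula (7)
    miou = np.mean(diag / np.maximum(union, 1))
    return pa, mpa, miou

# Example with K = 2 (road / background), hypothetical predictions:
# pred = np.array([[0, 1], [1, 1]]); target = np.array([[0, 1], [0, 1]])
# print(pa_mpa_miou(confusion_matrix(pred, target, 2)))
```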
We test the existing PSPNet, ENet and DeepLabV3+ models and the proposed TFFNet model on the MFR dataset; the results for each performance indicator are shown in Table 2. According to Table 2, for pixel accuracy (PA) our TFFNet model reaches 94.3%, which is 2.2%, 8.1% and 1.6% higher than the PSPNet, ENet and DeepLabV3+ models, respectively; for mean pixel accuracy (mPA) it reaches 93.7%, which is 4.6%, 8.9% and 3.4% higher, respectively; and for mean intersection over union (mIoU) it reaches 88.4%, which is 6.2%, 11.5% and 2.6% higher, respectively. This shows that the proposed TFFNet model can extract and fuse feature information at different scales through its two feature paths to obtain features with rich context information. At the same time, the efficient channel attention mechanism module (ECAM) screens the features that are effective for improving segmentation performance, so that the TFFNet model adapts well to field road scenes and segments the drivable area of field roads more accurately.
In order to more intuitively show the field road segmentation performance of the proposed TFFNet model, we randomly select three images from the MFR dataset, use DeepLabV3+, PSPNet, ENet and the proposed TFFNet model to segment them, and visualize the segmentation results, as shown in Fig. 7 (schematic diagram of the field road segmentation results of each model). It can be seen from Fig. 7c that
our proposed TFFNet model segments the drivable area of the field road with high accuracy under the three climatic conditions of sunny, cloudy and rainy weather. By fusing multi-scale contextual feature information and using the channel attention mechanism to screen effective features, the TFFNet model effectively avoids the influence of factors such as standing water, light and weeds during road segmentation. From Fig. 7d-e, it can be seen that when segmenting the drivable area of field roads, the DeepLabV3+ and PSPNet models are prone to mis-detecting "pedestrians" and "rice fields" as "roads" and to missing small targets; in addition, they are easily disturbed by environmental factors such as light, standing water and weeds. It can be seen from Fig. 7f that the ENet model also misidentifies "mountains", "rice fields" and "pedestrians" as "roads", and its detections are incomplete. This is mainly because the ENet model does not achieve a balance between speed and accuracy, resulting in poor adaptability to environmental factors such as standing water and weeds, which makes it less effective for segmenting drivable areas on field roads. In contrast, the proposed TFFNet model can still accurately identify the drivable areas of field roads in sunny, cloudy and rainy weather and shows good resistance to interference.
To test the speed of the proposed TFFNet model in detecting drivable areas on field roads, we randomly select 10 road images from the test set of the MFR dataset, use DeepLabV3+, PSPNet, ENet and the proposed TFFNet model to segment the drivable area in each of the 10 images, and record the time each model takes to segment one image; the average of the 10 recorded times is then used to represent the time the model needs to detect the drivable area of a single field road image, as shown in Table 3. From Table 3, the detection times of the DeepLabV3+, PSPNet and ENet models for a single road image are 205 ms, 175 ms and 126 ms, respectively, while our proposed TFFNet model needs only 121 ms, which is 41%, 30.8% and 3.9% shorter than DeepLabV3+, PSPNet and ENet, respectively. This shows that, compared with these three existing models, the proposed TFFNet model can detect drivable areas in field road images faster, achieves a balance between segmentation accuracy and speed, and meets the need for real-time detection of the drivable area while intelligent agricultural machinery drives on field roads.
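The per-image timing protocol described above can be reproduced with a simple measurement loop such as the hedged sketch below; `model` and `images` are assumed to be supplied by the caller, and GPU synchronisation is included so the timing reflects actual computation.

```python
# Hedged sketch of averaging per-image inference time over a set of test images.
import time
import torch

@torch.no_grad()
def mean_inference_time_ms(model, images, device="cuda"):
    model = model.to(device).eval()
    times = []
    for img in images:                       # each img: (3, H, W) tensor
        x = img.unsqueeze(0).to(device)
        if device.startswith("cuda"):
            torch.cuda.synchronize()
        start = time.perf_counter()
        model(x)
        if device.startswith("cuda"):
            torch.cuda.synchronize()
        times.append((time.perf_counter() - start) * 1000.0)
    return sum(times) / len(times)

# Example (hypothetical): mean_inference_time_ms(tffnet, test_images[:10])
```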
5 Conclusions
This paper proposes a field road segmentation model (TFFNet) based on two channel feature fusion. First, the atrous spatial pyramid pooling module is used to extract and fuse multi-scale pooling features of the input image; second, multiple deep separable convolution modules and asymmetric atrous convolution modules (AACM) are used to extract and fuse deep feature information along two paths; then a channel attention module is introduced to retain the features carrying important feature information; finally, the extracted feature map with rich feature information is decoded to obtain the segmentation result of the drivable area of the field road image. Experiments show that, compared with the existing PSPNet, ENet and DeepLabV3+ models, the performance of the proposed TFFNet model is improved in all respects. The TFFNet model has only 0.21 M parameters; its pixel accuracy (PA) and mean intersection over union (mIoU) reach 94.3% and 88.4%, respectively; and its detection time for a single road image is only 121 ms, which is 41%, 30.8% and 3.9% shorter than the DeepLabV3+, PSPNet and ENet models, respectively. In summary, the proposed road segmentation model (TFFNet) performs well on field road images under various weather conditions and can meet the needs of intelligent agricultural machinery for real-time field road detection.
Funding The authors did not receive support from any organization for the submitted work.
Data availability Data sharing not applicable to this article as no datasets were generated or analysed during
the current study.
Declarations
Ethics approval Not applicable.
Competing interests The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.
References
1. Qingkuan Meng, Xiaoxia Yang, Man Zhang et al (2021) Recognition of unstructured field road scene
based on semantic segmentation model[J]. Trans Chinese Soc Agri Eng (Transactions of the CSAE)
37(22):152–160.in Chinese with English abstract. https://doi.org/10.11975/j.issn.1002-6819.2021.22.
017 http://www.tcsae.org
2. Chengliang L, Hongzhen L, Yanming Li et al (2020) Analysis on status and development trend of
intelligent control technology for agricultural equipment[J]. Trans Chinese Agric Machinery 51(1):1–
18 (in Chinese with English abstract)
3. Chattha HS, Zaman QU, Chang YK et al (2014) Variable rate spreader for real-time spot-application of
granular fertilizer in wild blueberry[J]. Comput Electron Agric 100:70–78
4. Onishi Y, Yoshida T, Kurita H, et al. An automated fruit harvesting robot by using deep learning[C]//
Tokyo: The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec), 2018:
6–13.
5. Jianguo C, Yanming Li, Chengjin Q et al (2018) Design and test of capacitive detection system for
wheat seeding quantity[J]. Trans Chinese Soc Agri Eng (Transactions of the CSAE) 34(18):51–58 (in
Chinese with English abstract)
6. Zhang M, Ji Y, Li S, Cao R, Xu H, Zhang Z (2020) Research progress of agricultural machinery navigation technology [J]. Trans Chinese Soc Agric Machinery 51(04):1–18
7. Scharwachter T, Franke U (2015) Low-level fusion of color, texture and depth for robust road scene
understanding[C]//. IEEE In Intelligent Vehicles Symposium (IV) 2015:599–604
8. Das S, Mirnalinee TT, Varghese K (2011) Use of salient Features for the design of a multistage
framework to extract roads from high-resolution multispectral satellite images[J]. IEEE Trans Geosci
Remote Sens 49(10):3906–3931
9. Siran T (2019) Road segmentation of high-spatial resolution remote sensing images by considering
gradient and color information [J]. Sci Technol Eng 19(31):263–269
10. Cheng G, Zhu F, Xiang S, Pan C (2016) Road centerline extraction via semi-supervised segmentation
and multi direction nonmaximum suppression. IEEE Geosci Remote Sens Lett 13(4):545–549
11. Thenmozhi K, Reddy US (2019) Crop pest classification based on deep convolutional neural network and transfer learning[J]. Comput Electron Agric 164:104906
12. Liu S, Huang S, Xu X, et al. Efficient visual tracking based on fuzzy inference for intelligent transpor-
tation systems[J]. IEEE Transactions on Intelligent Transportation Systems, 2023.
13. Duong LT, Nguyen PT, Sipio CD et al (2020) Automated fruit recognition using EfficientNet and
MixNet[J]. Comput Electron Agric 171:105326
14. Jiang H, Zhang C, Qiao Y et al (2020) CNN feature based graph convolutional network for weed and
crop recognition in smart farming[J]. Comput Electron Agric 174:105450
15. Liu X, Hou S, Liu S et al (2023) Attention-based multimodal glioma segmentation with multi-attention
layers for small-intensity dissimilarity[J]. J King Saud Univ Comput Inform Sci 35(4):183–195
16. Gómez O, Mesejo P, Ibáez O et al. (2020) Deep architectures for high-resolution multi-organ chest
X-ray image segmentation. Neural Comput Appl 32(2)
17. Zhang M , Li X , Xu M , et al. (2020) Automated Semantic Segmentation of Red Blood Cells for
Sickle Cell Disease. IEEE J Biomed Health Inform (99):1–1
18. Liu S, Wang S, Liu X et al (2021) Human memory update strategy: a multi-layer template update
mechanism for remote visual monitoring[J]. IEEE Trans Multimedia 23:2188–2198
19. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation[J]. IEEE Trans Pattern Anal Mach Intell 39(4):640–651
20. Wang J, Kim J (2017) Semantic segmentation of urban scenes with a location prior map using lidar
measurements[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
Vancouver, BC, Canada 661–666
21. Zhang Z, Liu Q, Wang Y (2017) Road extraction by deep residual U-Net[J]. IEEE Geosci Remote Sens
Lett 32(99):1–5
22. He K, Zhang X, Ren S, et al. (2016) Deep residual learning for image recognition[C]// Conference on
Computer Vision and Pattern Recognition, IEEE 5410–5418
23. Zhe C , Chen Z (2017) RBNet: A Deep Neural Network for Unified Road and Road Boundary
Detection[C]// International Conference on Neural Information Processing
24. Chen LC, Papandreou G, Kokkinos I et al (2016) DeepLab: Semantic image segmentation with deep
convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Trans Pattern Anal Mach
Intell 40(4):834–848
25. Chen L C, Papandreou G, Schroff F, et al. (2017) Rethinking atrous convolution for semantic image
segmentation[C]// Computer Vision and Pattern Recognition, IEEE 3061–3070.
26. Chen L C, Zhu Y, Papandreou G, et al. (2018) Encoder-decoder with atrous separable convolution for
semantic image segmentation[C]//Computer Vision and Pattern Recognition, IEEE 4040–4048
27. Zhang Z, Xu C, Yang J et al (2018) Deep hierarchical guidance and regularization learning for end-to-
end depth estimation[J]. Pattern Recogn 83:430–442
28. Li Y, Wang H, Dang LM et al (2020) Crop pest recognition in natural scenes using convolutional neu-
ral networks[J]. Comput Electron Agri 169:105174
29. Wang J, Li Y, Feng H et al (2020) Common pests image recognition based on deep convolutional neu-
ral network[J]. Comput Electron Agric 179(1):105834
30. Wang Q , Wu B , Zhu P , et al. (2020) ECA-Net: Efficient Channel Attention for Deep Convolutional
Neural Networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition
(CVPR). IEEE