2066 IEEE WIRELESS COMMUNICATIONS LETTERS, VOL. 11, NO. 10, OCTOBER 2022

Lightweight Channel Estimation Networks for OFDM Systems

Jinbao Li and Qi Peng, Member, IEEE

Manuscript received 17 June 2022; accepted 18 July 2022. Date of publication 22 July 2022; date of current version 7 October 2022. This work was supported by the National Natural Science Foundation of China, under Grant 61601354. The associate editor coordinating the review of this article and approving it for publication was M. Derakhshani. (Corresponding author: Qi Peng.)
The authors are with the School of Microelectronics, Xidian University, Xi'an 710071, China (e-mail: [email protected]; [email protected]).
Digital Object Identifier 10.1109/LWC.2022.3193199
2162-2345 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Abstract: Channel estimation using neural networks has proven to be a breakthrough technology in the communications field. However, to achieve good performance, the existing studies mostly focus on building a complex neural network with a large number of layers, which consumes a lot of storage space and computing resources. In this letter, a lightweight channel estimation Transformer (LCET) is proposed for an orthogonal frequency division multiplexing (OFDM) system with pilots, based on the application of a convolutional neural network (CNN) and Transformer. The network uses a lightweight feature extraction CNN (LFEC) to extract channel features and then feeds the processed features into a lightweight-adjusted Transformer (LAT) for channel estimation. Specifically, LFEC extracts deep channel features at a low computational cost and retains the high-frequency information. The LAT solves the problem of long-range feature information loss with low resource consumption. The experimental results revealed that LCET performed better than the traditional least squares estimation algorithm and state-of-the-art learning network.

Index Terms: Channel estimation, deep learning, orthogonal frequency division multiplexing, lightweight network.

I. INTRODUCTION

To meet the increasing demand for communication, the data rate of wireless communication systems must be improved. Orthogonal frequency division multiplexing (OFDM) [1] is widely used in communication systems owing to its good anti-multipath fading ability and spectrum utilization. It is one of the key technologies in current 4G and 5G wireless communication networks. With the development of modern transportation vehicles, the fast mobility and high data rates of wireless communication systems will make the channel have doubly selective fading characteristics [2]. This is a challenge to the real-time performance of the system; therefore, it is important to improve the estimation performance of fading channels and ensure the reliability of mobile communication while maintaining the real-time performance of the system.

In OFDM systems, accurate acquisition of channel state information (CSI) is the key to transmitting data correctly, which directly affects the correlation detection and demodulation effect after channel estimation [3]. In recent years, deep learning (DL) has not only performed well in computer vision (CV) and natural language processing but has also been widely used in channel estimation [4]–[8].

The authors of [4] treated the communication system as a black box, utilized the powerful data processing capability of the fully connected layer, and proposed an end-to-end learning communication system based on a DL network. In [5], a combination of traditional methods and neural networks was proposed. Traditional methods are first used for rough calculations; thereafter, a neural network is used to obtain the results. The time-frequency response of the communication channel can be regarded as an image, allowing an image super-resolution CNN (SRCNN) used in CV to be used for channel estimation [6]–[8]. A combination of the SRCNN and DnCNN was used in [6] for channel estimation and achieved satisfactory results. A deep residual channel estimation network (ReEsNet) was used in [7] to estimate the channel. Furthermore, in [8], the channel distribution was modeled based on an image super-resolution generative adversarial network (SRGAN), which achieved state-of-the-art performance. In the training stage, the GAN learns from the adversarial discriminator and generator. In the prediction stage, only the generator part must be considered, which can reduce the model deployment difficulty.

In a doubly selective fading channel, the extraction of high-frequency and global features is the key to comprehensive channel estimation. However, in previous studies, high-frequency features are missing or not sufficiently extracted. The channel estimation methods based on deep learning tend to deepen the model to extract the features, thereby estimating the channel more accurately. As the model deepens, the neural network becomes bloated and its grasp of long-range features weakens.

Motivated by this, this letter proposes a DL-based OFDM channel estimation network, in which the time-frequency grid of the channel response is regarded as a two-dimensional (2-D) image and channel estimation is realized by a neural network. The high-frequency and global features of the channel are respectively extracted via the high-frequency and adaptive feature extraction modules of a lightweight CNN in front of the network. The extracted channel features are then fed into a lightweight adjusted Transformer network in the rear and comprehensively expressed and analyzed. In addition to saving computing costs, the network also has a good channel estimation performance.

II. SYSTEM MODEL

This letter was based on a single-antenna OFDM system. In a fast-moving scenario, consider an OFDM frame with $M$ subcarriers and $N$ time slots (each time slot contains an OFDM symbol). The ranges of the subcarrier index $m$ and slot index
$n$ are $[0, M-1]$ and $[0, N-1]$, respectively. The received signal $Y(m, n)$ with the cyclic prefix removed and discrete Fourier transform performed at the $m$th subcarrier and the $n$th slot can be expressed as follows:

$$Y(m, n) = H(m, n) X(m, n) + W(m, n), \tag{1}$$

where $Y, H, X, W \in \mathbb{C}^{M \times N}$, and $X(m, n)$, $H(m, n)$, and $W(m, n)$ represent the transmitted symbol, channel response, and additive white Gaussian noise with zero mean and variance $\sigma_w^2$ at the $m$th subcarrier and the $n$th time slot, respectively.

To estimate the channel response, pilot symbols with a specific arrangement are usually inserted into the OFDM frame. The pilot structure used in [4], [5] is a block-type pilot in which pilots are placed in all subcarriers. This pilot structure has a large pilot overhead, which is not conducive to improving the transmission efficiency. A lattice-type pilot that can save pilot overhead and improve transmission efficiency was used in this letter, as shown in Fig. 1.

Fig. 1. Pilot extraction process from the time-frequency grid. Colored dots represent pilots, which are inserted into the OFDM frame grid according to a specific rule.

The received symbol at the pilot position $Y_P \in \mathbb{C}^{M_P \times N_P}$ can be expressed as follows:

$$Y_P = H_P X_P + W_P, \tag{2}$$

where $X_P \in \mathbb{C}^{M_P \times N_P}$ represents the pilot symbols, and $H_P$ and $W_P \in \mathbb{C}^{M_P \times N_P}$ are the channel response and channel noise of the pilot position, respectively. The term $M_P$ denotes the number of pilot symbols placed along the subcarrier axis, whereas $N_P$ denotes the number of pilot symbols placed along the time-slot axis.

Because the pilot is artificially designed, the value at the pilot can be extracted from a known position. The goal of channel estimation is to estimate $H$ according to the information of the pilot position $H_P$.

III. PROPOSED METHOD

A. Image of Channel Response Time–Frequency Grid

In this letter, the complex-valued channel time–frequency response matrix (of size $M \times N$) between the OFDM system transmitter and receiver is regarded as two 2-D images (one image represents the real part, and the other represents the imaginary part). The channel matrix at the pilot position is regarded as a low-resolution channel matrix (of size $M_P \times N_P \times 2$), and is restored to a high-resolution channel matrix (of size $M \times N \times 2$) by image processing technology.

B. Overall Network Structure

We propose an LCET for selective channels, the structure of which is illustrated in Fig. 2. The network consists of three main parts: lightweight feature extract CNN (LFEC), lightweight adjusted Transformer (LAT), and post-upscale module. The LFEC is added before the LAT to completely utilize the Transformer's potential. First, we use a convolutional layer to extract the preliminary characteristics ($F_l$) from $H_P$. $F_l$ is then sent into the LFEC to extract higher-level features ($F_h$). The output of the LFEC is fed into the LAT to process the channel features extracted by the LFEC. Then the output of the LAT ($F_T$) and $F_l$ are fed into the post-upscale module simultaneously to restore the feature map to the channel size, and the outputs of the upscale modules are added. Finally, the features are adjusted through convolution to obtain the estimated channel.

C. Lightweight Feature Extract CNN

As the front end of the LCET, the LFEC extracts potential channel features in advance so that the model has an initial channel estimation capability. To reduce the computation time of the network, LFEC is proposed to reduce the size of the processing feature. However, the reduction in the feature map size often leads to the loss of feature details, leading to inaccurate channel estimation. To solve this problem in the LFEC, we extract the channel features by combining the Adaptive Feature Extract (AFE) and High-frequency Feature Extract (HFE) while reducing the size of the feature map.

The structure of the LFEC is shown in Fig. 2. First, adaptive feature and high-frequency feature extractions are conducted using AFE and HFE, respectively, and then the extracted features are concatenated. The concatenated features go through a convolution layer (to obtain $F_1$) to integrate the different features of the two branches, reducing the number of channels and parameters. For $F_1$, three AFEs are used to further extract the potential features (denoted as $F_1'$) for completing the channel estimation. Subsequently, $F_1'$ is concatenated with $F_1$, which has a residual connection with parameters, to preserve the initial details and obtain feature $F_2$. Because $F_2$ is the concatenation of two features, a conventional convolution layer with a 3 × 3 kernel is used to reduce the number of channels. The channel attention module [9] is then used to highlight channels with high activation values. Subsequently, AFE is used to extract the final features. Finally, the input feature ($F_{in}$) is added to the output ($F_{out}$) using the global residual connection to learn the residual information from the input and stabilize training.

1) AFE: This letter presents an adaptive feature extraction module for basic feature extraction, which has the advantages of a small number of parameters and good performance. The adaptive feature extract structure is shown in Fig. 3a; after two reduction-expansion modules (REMs), the channel feature depth gradually increases, and high-level features are extracted. To alleviate the vanishing-gradient problem caused by the depth increase, residual connections with parameters were added in the expansion process. The residual connection with parameters can adaptively adjust the residual features, which is equivalent to adding a simple attention
mechanism to better extract channel features, improve gradient flow, and automatically adjust the content of the residual feature map input. Notably, to achieve good channel estimation performance with few parameters, depthwise separable convolution (DSC) [10] replaces the traditional convolution in the AFE, which is used many times in the main network, thereby reducing the model parameters and achieving a lightweight design. In addition, the REM in Fig. 3a is composed of a half-channel DSC, a double-channel DSC, and a residual connection with parameters.

Fig. 2. Structure of LCET.

Fig. 3. (a) AFE and (b) HFE structures.

2) HFE: The high-frequency channel feature contains abundant contour information and is an indispensable part of an accurate channel estimation. Based on this, the HFE module is used to extract high-frequency channel features, as shown in Fig. 3b. The goal of the HFE is to estimate the high-frequency features of the image from the $H_P$ space.

First, an average pooling layer is applied to the input features. The kernel of the pooling layer is (3, 1), which indicates that each value of the output feature map of the average pooling layer is the average intensity within the 3 × 1 area of the input feature map. Bilinear interpolation is then applied to restore the output of the average pooling layer to the size of the HFE input. Finally, the averaged features are subtracted element by element from the input features to obtain the high-frequency features.

D. Lightweight-Adjusted Transformer

Typically, the standard Transformer [11] takes a one-dimensional (1-D) sequence as an input and learns the long-distance dependencies of the sequence. For channel estimation tasks, the input is always a 2-D matrix. A common way to convert a 2-D matrix into a 1-D sequence is to arrange each element of the matrix in row or column order. However, this method will destroy the correlation between each subcarrier in the channel matrix and reduce the channel estimation performance.

To process the 2-D matrix of the channel, unfolding and folding were used to perform pre- and post-Transformer operations on the features before and after the Transformer, as shown in Fig. 2. Specifically, the pre-Transformer operation involves unfolding the initial features (of size $C \times H \times W$) into patch sequences, which have $P = \frac{H}{a} \times \frac{W}{b}$ patches of size $C \times a \times b$, using a kernel size of $(a, b)$; i.e., $F_i^p \in \mathbb{R}^{C \times a \times b}$, $i = 1, 2, \ldots, P$. These sequences were fed directly into the LAT, where the input and output had the same sizes. After the LAT, the post-Transformer operation folds the output of the LAT back to its original feature size using a kernel of the same size as the pre-Transformer operation. Notably, the pre-Transformer operations reflect the location information of each patch; therefore, the position embedding steps in the standard Transformer were removed to further reduce the model parameters and save computing costs.

The main structure of the LAT is illustrated in Fig. 2. For simplicity and efficiency, only one self-attention structure and one feedforward structure were used. In addition, layer normalization (Norm) was used ahead of both structures, and residual connections were added after both structures. It is worth noting that to further reduce the model size, we set the Transformer head number to 1.

1) Self-Attention: Self-attention is an important component in the Transformer encoder. Specifically, the input is mapped to three different spaces: $q$ (query: to match others), $k$ (key: to be matched), and $v$ (value).

$$q_i = F_i^p W_i^q, \quad k_i = F_i^p W_i^k, \quad v_i = F_i^p W_i^v, \tag{3}$$

where $W_i^q$, $W_i^k$, and $W_i^v$ represent projection parameter matrices. The query $q$ attends to the keys $k$, and the resulting attention weights are applied to $v$ to obtain the output. The calculation of the self-attention can be written as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^T}{\sqrt{d_k}}\right) V \tag{4}$$

where $Q$, $K$ and $V$ are the matrices of $q$, $k$ and $v$, respectively, and $d_k$ represents the dimension of $k$.

2) Feed Forward: In addition to the self-attention layer, the Transformer encoder contains a fully connected feed-forward
network consisting of two linear layers with ReLU activation between them.

TABLE I
SIMULATION ENVIRONMENT PARAMETERS

Fig. 4. Ablation study for AFE, HFE and SNR-based coefficient.
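As an illustration of (3), (4), and the feed-forward network, the following minimal sketch computes single-head self-attention over a patch sequence in plain Python. It is not the authors' implementation. It assumes that each patch is flattened into a row vector and that one set of projection matrices is shared by all patches, and it leaves out layer normalization and the residual connections. The function names and the toy weights are hypothetical.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax_rows(S):
    """Numerically stable softmax applied to each row of S."""
    out = []
    for row in S:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def self_attention(F, Wq, Wk, Wv):
    """Single-head attention as in (3)-(4); F holds one flattened patch per row.

    Assumption: the same Wq, Wk, Wv are applied to every patch.
    """
    Q, K, V = matmul(F, Wq), matmul(F, Wk), matmul(F, Wv)
    d_k = len(K[0])
    K_t = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_t)]
    return matmul(softmax_rows(scores), V)

def feed_forward(X, W1, b1, W2, b2):
    """Two linear layers with a ReLU between them."""
    hidden = [[max(0.0, h + b) for h, b in zip(row, b1)] for row in matmul(X, W1)]
    return [[o + b for o, b in zip(row, b2)] for row in matmul(hidden, W2)]
```

A quick sanity check on the softmax: with all-zero query and key projections every score is equal, so each output row of `self_attention` is simply the mean of the value rows.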

E. Implementation Details
For the post-upscale part, we used deconvolution [12] to
convert the last small upscale into a channel dimension and
used one convolution layer before and after the deconvolu-
tion operation to aggregate the upscale information before and
after. In the LFEC, we set the initial value of the learnable
weights to 1. To save GPU memory, (a, b) in the pre- and
post-processing of the LAT was set to (2,1).
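The pre- and post-Transformer operations of Section III-D with the kernel (a, b) = (2, 1) chosen above can be sketched as follows. This is a minimal plain-Python illustration, assuming non-overlapping patches (stride equal to the kernel size), which is what the patch count P = (H/a) × (W/b) suggests. The helper names `unfold` and `fold` and the nested-list layout are hypothetical and are not taken from the authors' code.

```python
def unfold(feat, a, b):
    """Split a C x H x W nested list into (H//a)*(W//b) patches of size C x a x b."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    if H % a or W % b:
        raise ValueError("H and W must be divisible by a and b")
    patches = []
    for top in range(0, H, a):          # walk over patch rows
        for left in range(0, W, b):     # walk over patch columns
            patches.append([[row[left:left + b] for row in feat[c][top:top + a]]
                            for c in range(C)])
    return patches

def fold(patches, H, W, a, b):
    """Inverse of unfold: place the patches back into a C x H x W nested list."""
    C = len(patches[0])
    feat = [[[0] * W for _ in range(H)] for _ in range(C)]
    cols = W // b                        # number of patches per patch row
    for i, patch in enumerate(patches):
        top, left = (i // cols) * a, (i % cols) * b
        for c in range(C):
            for di in range(a):
                for dj in range(b):
                    feat[c][top + di][left + dj] = patch[c][di][dj]
    return feat
```

Because the patch index alone determines where a patch goes in `fold`, the sequence order carries the location information, which is consistent with the letter's argument for dropping the position embedding.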
To ensure reliable communication quality, communication systems usually operate under appropriate SNR. LCET needs to extract the high-frequency features of the channel for channel estimation, but the high-frequency features of the channel are easily affected by noise, while the channel data in low SNRs contains a lot of noise, which is not conducive to channel estimation. The loss function used to train the LCET network was $L_1$ (L1 loss), which treats data signals with different SNRs equally, but compared with data signals with low SNRs, data signals with high SNRs have a high degree of credibility. For a network more suitable for processing channel data with a mixed signal-to-noise ratio (SNR), the loss function was multiplied by an SNR-based coefficient as follows:

$$\mathrm{Loss} = L_1 \cdot \frac{1}{1 + e^{-0.4(\mathrm{SNR} - 12)}} \tag{5}$$

In this letter, we pay more attention to channel data with SNR higher than 15 dB. We choose a step-type function and adjust the parameters so that the step position is around 15 dB. In order to increase the robustness of the proposed network, the channel data with low SNRs cannot be completely abandoned, so we adjust the parameters to make the weight of the proposed loss function transition smoothly between low SNR and high SNR.

IV. EXPERIMENT RESULTS

In this letter, we considered a single-antenna situation in which an OFDM signal was generated according to the long-term evolution standard, and we used the lattice-type pilot to insert the pilot along the time and frequency axes. We used the VehicularA channel model, which is a multipath time-varying channel model based on wide-sense stationary uncorrelated scattering. The corresponding simulation parameters are listed in Table I.

Fig. 5. Channel estimation MSE for five estimation methods at a receiver speed of 80 km/h. (a) 18 × 7 pilots. (b) 12 × 7 pilots.

The dataset used for training and testing was generated offline based on the VehicularA channel model. The dataset contains 24,000 sets of channel samples, of which 60% were used for training, 20% for validation, and 20% for testing. The dataset used in this letter was a combination of six sets of data with an SNR ranging from 5 to 30 dB (at intervals of 5 dB), such that the trained model can adapt to a realistic environment, contributing to the robustness of the model. For LCET, the learning rate is set to 0.001 with a training batch size of 64 and a total of 200 epochs. In order to speed up the convergence, the Adam algorithm is used to optimize the model. This simulation was performed on a self-built server platform (Python 3.8, PyTorch 1.19.0, GPU: Nvidia 3090).

In order to verify the necessity of AFE and HFE in the LCET structure and the improvement of channel estimation performance by the SNR-based coefficient in the training stage, an ablation study was conducted in this letter, as shown in Fig. 4. The results show that AFE and HFE play an important role in channel estimation by LCET. In addition, the results also demonstrate that the channel estimation performance of the network in high SNRs can be improved by assigning different weights to the loss functions of different SNR signals during network training.

Generally, the performance of channel estimation improves with the number of pilots; however, an increase in the number of pilots reduces the transmission efficiency. With this trade-off, we simulated two cases, where the speed of the receiver was 80 km/h, and the number of pilots was 18 × 7 or 12 × 7. The mean square errors (MSEs) of six channel estimation methods for two pilot numbers are shown in Fig. 5. The
LS estimation uses the pilot locations to obtain information at the non-pilot locations through interpolation, and it performs the worst among the six estimation methods. The LMMSE is the MMSE for jointly Gaussian distributed random variables [13], and LMMSE usually requires second-order channel statistics and noise variance as prior information, so it can be used as the lower bound for best performance. For the other three deep learning-based methods, the ChannelNet [6] network model is large and the performance is not adequate; it is only better than the LS estimation. The performance of the LCET in the range of low SNR is 1–2 dB better than those of the ReEsNet [7] and SRGAN [8] with the same small model. The performance of the LCET in the range of high SNR is also better than that of the SRGAN by approximately 1 dB. As the number of pilots decreases, the channel estimation becomes more challenging. As shown in Fig. 5b, the LCET performance slightly decreases, but it is still superior to that of the SRGAN.

Fig. 6 shows the bit error ratio (BER) for six channel estimation methods and perfect channel. The results show that the BER of LCET is lower than that of the other three deep learning-based channel estimation networks and LS channel estimation, and comparable to that of LMMSE.

Fig. 6. The BER for six channel estimation methods and perfect channel.

In the online prediction stage, the floating point operations (FLOPs), parameters and prediction time of four DL-based channel estimation methods are shown in Table II. The results show that the number of parameters of LCET is the least among the four models, which is beneficial to reduce the resource consumption when the model is deployed to hardware. Due to the lightweight design considerations of LCET, the FLOPs of LCET are fewer than those of ChannelNet and SRGAN, and only more than those of ReEsNet. In order to extract channel features for channel estimation, LCET has a relatively deep network and requires a longer prediction time than other channel estimation networks. However, the existing neural network-based channel estimation requires hardware acceleration design to meet the real-time requirements. For the channel estimation network proposed in this letter, the accelerated architecture should focus on optimizing the memory architecture, off-chip memory access, data path and processing elements by using heterogeneous multi-core architecture.

TABLE II
COMPLEXITY COMPARISON OF FOUR DL-BASED CHANNEL ESTIMATION METHODS

V. CONCLUSION

This letter proposed a lightweight neural network for channel estimation in OFDM systems. In this network, the channel features are extracted on the basis of retaining the high-frequency features of the channel. The Transformer structure is used to integrate channel features and solve the problem of information loss caused by the increase of network depth. Moreover, the actual deployment requirements such as reducing model parameters and improving channel estimation performance were considered in the network design. The model proposed in this letter minimizes model parameters while ensuring reliable channel estimation performance. Experimental results show that the LCET proposed in this letter has fewer parameters and better channel estimation performance than the SRGAN, and that it is superior to the traditional LS channel estimation algorithm and close to the LMMSE algorithm for multiple pilot numbers.

REFERENCES

[1] L. J. Cimini, Jr., "Analysis and simulation of a digital mobile channel, using orthogonal frequency division modulation," IEEE Trans. Commun., vol. C-33, no. 7, pp. 665–675, Jul. 1985.
[2] E. P. Simon, L. Ros, H. Hijazi, and M. Ghogho, "Joint carrier frequency offset and channel estimation for OFDM systems via the EM algorithm in the presence of very high mobility," IEEE Trans. Signal Process., vol. 60, no. 2, pp. 754–765, Feb. 2012.
[3] C. Wen, W. Shih, and S. Jin, "Deep learning for massive MIMO CSI feedback," IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018.
[4] H. Ye, G. Y. Li, and B. Juang, "Power of deep learning for channel estimation and signal detection in OFDM systems," IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[5] X. Gao, S. Jin, C. Wen, and G. Y. Li, "ComNet: Combination of deep learning and expert knowledge in OFDM receivers," IEEE Commun. Lett., vol. 22, no. 12, pp. 2627–2630, Dec. 2018.
[6] M. Soltani, V. Pourahmadi, A. Mirzaei, and H. Sheikhzadeh, "Deep learning-based channel estimation," IEEE Commun. Lett., vol. 23, no. 4, pp. 652–655, Apr. 2019.
[7] L. Li, H. Chen, H. Chang, and L. Liu, "Deep residual learning meets OFDM channel estimation," IEEE Wireless Commun. Lett., vol. 9, no. 5, pp. 615–618, May 2020.
[8] S. Zhao, Y. Fang, and L. Qiu, "Deep learning-based channel estimation with SRGAN in OFDM systems," in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Nanjing, China, 2021, pp. 1–6.
[9] J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, "Squeeze-and-excitation networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 8, pp. 2011–2023, 1 Aug. 2020.
[10] A. G. Howard, "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861.
[11] A. Vaswani, N. Shazeer, and N. Parmar, "Attention is all you need," 2017, arXiv:1706.03762.
[12] V. Dumoulin and F. Visin, "A guide to convolution arithmetic for deep learning," 2016, arXiv:1603.07285.
[13] D. Neumann, T. Wiese, and W. Utschick, "Learning the MMSE channel estimator," IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2905–2917, Jun. 2018.
