THE USE OF REINFORCEMENT LEARNING FOR LTE NETWORKS
A.A. Qodirov (TUIT named after Muhammad al-Khwarizmi, Department of Data
Communication Systems and Networks, assistant and Doctoral student)
Artificial Intelligence and Machine Learning are popular and often confused
terms. Machine Learning (ML) is a subset of Artificial Intelligence: the science of
designing and applying algorithms that are able to learn from past cases. If some
behaviour occurred in the past, one may predict whether it can happen again; if there
are no past cases, there is no basis for prediction.
ML can be applied to hard problems such as credit card fraud detection,
self-driving cars, and face detection and recognition. ML algorithms iterate
repeatedly over large data sets, analyzing patterns in the data and
enabling machines to respond to situations for which they have not been
explicitly programmed. The machines learn from history to produce reliable
results. ML algorithms draw on Computer Science and Statistics to predict rational
outputs. There are 3 major areas of ML (Figure 1):
Figure 1. Types of Machine Learning
The theory of Machine Learning (ML) is a field of computer science, where
computing and network devices have the ability to learn without the need to be
explicitly programmed. In other words, the theory of ML sets the goal of finding and
developing mathematical models that allow computers to independently develop
algorithms for solving a particular computational or network problem. Reinforcement
learning (RL), an area of ML, introduces the concepts of agent, environment, and
reward, which describe the process of optimizing a particular network task. An agent
has a certain set of actions by which it interacts with the environment. After
performing an action, the agent receives a reward from the environment and, based on
the value of that reward, forms an estimate of how good its choice was. The main
objective of reinforcement learning is to map the current situation in the interaction
environment to actions so as to maximize the reward received.
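The objective described above can be written compactly in standard RL notation. The symbols are assumptions introduced here for illustration and do not come from the text: r_t is the reward at step t, gamma is a discount factor, and pi is the agent's policy (its rule for choosing actions).

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}, \qquad 0 \le \gamma < 1,
\qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\left[ G_t \right]
```

Here G_t is the discounted sum of future rewards, and the optimal policy is the one that maximizes its expected value.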
RL is a learning process in which an agent periodically makes decisions,
observes the results, and then automatically adjusts its strategy to achieve the optimal
policy. However, the RL process takes a long time to reach the optimal policy, as it
has to explore and gain knowledge of the entire system. Deep RL (DRL), which uses deep
learning, improves the learning speed and performance. In communications and
networking, DRL can be used to effectively address various problems and
challenges. For example, IoT devices and mobile users need to make local and
autonomous decisions, e.g., spectrum access, modulation techniques, coding
techniques, data rate selection, transmit power control, etc., to achieve the goals of
different networks including, e.g., throughput maximization and energy consumption
minimization. Under uncertain and stochastic environments, most decision-making
problems can be modeled as a Markov Decision Process (MDP). However, modern
networks are large-scale and complicated, so the computational complexity of
conventional MDP solution techniques rapidly becomes unmanageable. DRL can be used to
overcome this challenge. It enables network controllers, e.g., base stations, to
solve non-convex and complex problems, e.g., joint user association, computation,
and transmission scheduling, and to achieve optimal solutions without complete and
accurate network information. It allows network entities to learn and build
knowledge about the communication and networking environment. The network
entities can learn optimal policies, e.g., base station selection, channel selection,
handover decision, caching and offloading decisions, without knowing the channel
model or mobility pattern. It also provides autonomous decision-making: network
entities can make observations and obtain the best policy locally with minimal or
no information exchange among each other. This not only reduces
communication overheads but also improves the security and robustness of the
networks. In large-scale networks, e.g., IoT systems with thousands of devices, DRL
allows network controllers or IoT gateways to dynamically control user association,
spectrum access, and transmit power for a massive number of IoT devices and
mobile users. Other problems in communications and networking such as cyber-
physical attacks, interference management, and data offloading can be modeled as
games, e.g., a non-cooperative game. DRL has recently been used as an efficient
tool to solve such games, e.g., to find the Nash equilibrium, without complete
information.
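To make the MDP formulation mentioned above concrete, the following is a minimal sketch of value iteration on a toy two-state problem. Everything in it is a hypothetical illustration rather than a model from this work: the states (low/high battery of an IoT device), the actions (low/high transmit power), the transition probabilities and the rewards are all assumed numbers.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Solve a small MDP. P[a, s, s'] are transition probabilities,
    R[s, a] are expected immediate rewards."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_s' P[a, s, s'] * V[s']
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        delta = np.max(np.abs(V_new - V))
        V = V_new
        if delta < tol:
            break
    return V, Q.argmax(axis=1)

# Assumed toy model: state 0 = low battery, state 1 = high battery;
# action 0 = low transmit power, action 1 = high transmit power.
P = np.array([
    [[0.5, 0.5], [0.1, 0.9]],   # low power: battery tends to recover
    [[0.9, 0.1], [0.6, 0.4]],   # high power: battery tends to drain
])
R = np.array([
    [0.3, 0.6],                 # throughput-like reward in state 0
    [0.5, 1.0],                 # throughput-like reward in state 1
])

V, policy = value_iteration(P, R)
print("state values:", V)
print("greedy action per state:", policy)
```

Each sweep costs on the order of |S|^2 |A| operations, so the exact approach becomes impractical once the state space grows with the number of users, channels and power levels, which is the motivation for DRL given in the text.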
At the physical layer, LTE network performance is affected by factors associated
with the propagation characteristics of the electromagnetic signal in a wireless
transmission medium, namely: the level of attenuation, the presence of signal
reflections from obstacles, interference, as well as the level of electromagnetic noise
in the medium, both broadband and narrowband. First of all, this affects the
probability of correct reception and decoding of the transmitted information, which,
in turn, affects the behavior of layer 2 protocols, which should ensure the reliability
of information transfer at the data link layer. Ultimately, the negative conditions for
wireless signal transmission increase the delivery time of frames from one node to
another, since the layer 2 ARQ mechanism introduces a delay for the
retransmission of frames. The project will propose methods for selecting adaptive
modulation and coding (AMC) schemes based on reinforcement learning. In wireless
networks the data link layer, in addition to its main function of reliable information
delivery over the radio channel, implements a medium access control (MAC)
mechanism that controls the access of multiple devices to a common radio resource
in the frequency, time and spatial domains. Obviously, the more devices compete for
access to a shared radio resource, the longer the access latency. Accordingly, the
transmission time of a frame to the recipient grows with the number
of simultaneously competing subscriber devices. In addition, an increased number of
wireless devices increases the likelihood of interference, which, in turn, increases the
delivery time of information due to the frame retransmission procedure in the ARQ
mechanism. The project will develop methods for multiple access, distribution of
radio resources and frame transfer (HARQ, ARQ) based on reinforcement learning.
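As an illustration of how AMC selection could be framed as a learning problem, here is a minimal sketch using an epsilon-greedy bandit, one of the simplest reinforcement learning schemes. It is not the method proposed by the project. The MCS table, the per-scheme delivery probabilities and the reward definition (bits per symbol when a frame is delivered, zero when it must be retransmitted) are all assumptions chosen for the example, not values from the LTE specification.

```python
import random

# Hypothetical MCS table: (name, bits per symbol). Illustrative values only.
MCS = [("QPSK 1/2", 1.0), ("16QAM 1/2", 2.0), ("64QAM 3/4", 4.5)]
# Assumed probability that a frame is decoded correctly at some fixed SNR.
SUCCESS_PROB = [0.95, 0.70, 0.20]

def select_mcs(steps=20_000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(MCS)
    value = [0.0] * len(MCS)      # running mean of observed throughput per MCS
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(MCS))                       # explore
        else:
            a = max(range(len(MCS)), key=value.__getitem__)   # exploit
        # Reward: bits per symbol if the frame is delivered, 0 if it is lost
        # and has to be retransmitted by ARQ/HARQ.
        reward = MCS[a][1] if rng.random() < SUCCESS_PROB[a] else 0.0
        counts[a] += 1
        value[a] += (reward - value[a]) / counts[a]           # incremental mean
    return value, counts

value, counts = select_mcs()
best = max(range(len(MCS)), key=value.__getitem__)
print("preferred scheme:", MCS[best][0])
print("estimated throughput per scheme:", [round(v, 2) for v in value])
```

With these assumed numbers the expected throughputs are 0.95, 1.4 and 0.9 bits per symbol, so the middle scheme is the best trade-off between rate and retransmissions. A practical scheme would also condition the choice on the measured channel quality, which turns the bandit into a full RL problem with states.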
RL comprises three parts: an agent, an environment, and an interpreter. The goal of
RL is to train the agent so that, for a given environment state, it chooses
the optimal action that yields the highest reward. Q-learning can learn a good policy
by updating an action-value function without a model of the environment.
It can efficiently obtain an optimal policy when the state space and action space are
small. However, in practice, with complicated system models, these spaces are
usually large. As a result, the Q-learning algorithm may not be able to find the optimal
policy. The Deep Q-Learning (DQL) algorithm was introduced to overcome this
shortcoming: when the state-action space is large and complex, a deep Q-network can
be used to approximate the Q function. Modern wireless networks are becoming
more and more complex. Their design needs higher computing capacity, bigger
datasets, faster and more intelligent learning algorithms, more flexible input
mechanisms, etc. To achieve this, wireless networks need deep learning models that can
accept a large number of network performance parameters, such as link signal-to-noise
ratios (SNRs), channel holding time, link access success/collision rates, routing delay,
packet loss rate, bit error rate, etc., and analyze their intrinsic patterns.
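The tabular Q-learning update discussed above can be sketched on a toy channel-selection task. The environment is an assumption made for illustration only: two channels, exactly one of which is good in each slot, and the good channel stays the same in the next slot with probability stay_prob. The state is the channel that was good in the previous slot, the action is the channel to access, and the reward is 1 for a successful access.

```python
import random

def q_learning(steps=20_000, alpha=0.1, gamma=0.9, epsilon=0.1,
               stay_prob=0.9, seed=0):
    rng = random.Random(seed)
    n_states = n_actions = 2
    Q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0                      # channel that was good in the previous slot
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        # Assumed environment: the good channel persists with prob. stay_prob.
        good = state if rng.random() < stay_prob else 1 - state
        reward = 1.0 if action == good else 0.0
        next_state = good
        # Model-free update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        td_target = reward + gamma * max(Q[next_state])
        Q[state][action] += alpha * (td_target - Q[state][action])
        state = next_state
    return Q

Q = q_learning()
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(2)]
print("learned action per state:", policy)
```

The table Q has one entry per state-action pair, which is why this form only works for small spaces. In a deep Q-network the table is replaced by a neural network that takes the state (for example, a vector of SNRs and collision rates) as input and outputs one Q value per action.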