Fig. 10. An illustration of the structure of a 2D bio-inspired neural network. Source: Reproduced from Cao and Peng (2018).
using the Dempster–Shafer (Pagac et al., 1998) information rule.
Ni et al. (2017) proposed an improved dynamic bio-inspired neural network for AUV path planning. The network changes dynamically according to the detection range of the sensors and introduces virtual targets into the environment. Their experiments show that it can reduce the heavy computational burden of AUV path planning in very large 3D environments and cope with the situation in which the target direction cannot be found because the obstacle is larger than the detection range of the sensor. Considering that the AUV should not pass too close to obstacles, Cao and Peng (2018) proposed a potential field bio-inspired neural network. In this method, a repulsive potential field is used to enlarge the obstacles and an attractive potential field is introduced to optimize the path, which effectively keeps the AUV away from obstacles.
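As a rough illustration of how such repulsive and attractive terms can be combined, the Python sketch below adds a repulsive contribution that effectively inflates obstacles within a safety radius and an attractive contribution that pulls the AUV toward the goal. It is not the authors' implementation; the gains, the safety radius and the example coordinates are illustrative assumptions.

```python
import numpy as np

def potential_field_force(pos, goal, obstacles,
                          k_att=1.0, k_rep=100.0, d_safe=5.0):
    """Combine an attractive force toward the goal with repulsive forces
    that 'enlarge' obstacles by acting within the safety radius d_safe.
    All gains and the safety radius are illustrative assumptions."""
    # Attractive term: proportional to the vector from the AUV to the goal.
    force = k_att * (goal - pos)

    # Repulsive term: active only when the AUV is closer than d_safe to an obstacle.
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 0.0 < d < d_safe:
            # Classic repulsive potential gradient, pushing the AUV away from the obstacle.
            force += k_rep * (1.0 / d - 1.0 / d_safe) * diff / d**3
    return force

# Example: one planning step in 3D.
pos = np.array([0.0, 0.0, 0.0])
goal = np.array([10.0, 10.0, -5.0])
obstacles = [np.array([4.0, 4.0, -2.0])]
pos = pos + 0.1 * potential_field_force(pos, goal, obstacles)  # small step along the force
```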
In addition, a leader–follower biologically inspired neural network was also proposed for multi-AUV obstacle avoidance by Ding et al. (2014). Specifically, the velocity and trajectory of a virtual AUV are obtained from the position of the leading AUV, and the kinematic AUV formation control law is then designed by backstepping (Kwan and Lewis, 2000). Once obstacles are detected, the main AUV changes the formation into a straight line to pass through the obstacle area with a bio-inspired neural network.
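The virtual-AUV step of such a leader–follower scheme can be pictured with a small sketch: the follower's virtual target is a formation offset expressed in the leader's body frame and rotated into the world frame. This is a simplified 2D kinematic illustration with assumed function and variable names, not the backstepping controller of Kwan and Lewis (2000).

```python
import numpy as np

def virtual_target(leader_pos, leader_heading, offset, leader_vel):
    """Virtual AUV pose for a follower, derived from the leader's position.
    offset is the desired formation offset in the leader's body frame (assumption);
    the leader's velocity is reused as the virtual velocity for simplicity."""
    c, s = np.cos(leader_heading), np.sin(leader_heading)
    R = np.array([[c, -s],
                  [s,  c]])                    # rotation from body frame to world frame
    virtual_pos = leader_pos + R @ offset      # point on the virtual trajectory
    return virtual_pos, leader_vel

# Example: follower keeps station 5 m behind and 3 m to the left of the leader.
pos, vel = virtual_target(np.array([10.0, 2.0]), np.pi / 4,
                          np.array([-5.0, 3.0]), np.array([1.0, 1.0]))
```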
Moreover, Wu et al. (2018) added a lateral inhibition effect of obstacles to the bionic neural network, which is effective for solving various collision avoidance problems of a single AUV as well as of multiple AUVs in dynamic environments.
Sun et al. (2018a) applied the Glasius bio-inspired neural network to coverage path planning for multiple AUVs. The Glasius bionic neural network uses difference equations to calculate the activity values of neurons, which improves the self-adaptability of the algorithm and greatly reduces the path planning time of the AUVs.
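For intuition, the sketch below shows a generic discrete-time (difference-equation) activity update on a 2D grid in the spirit of Glasius-type networks. The transfer function, neighbourhood weights and gains are illustrative assumptions, not the exact formulation used by Sun et al. (2018a).

```python
import numpy as np

def gbnn_step(x, targets, obstacles, beta=1.0, w=0.2):
    """One difference-equation update of neuron activities on a grid map.
    x: current activity matrix; targets/obstacles: boolean masks.
    The tanh transfer function and lateral weight w are assumptions."""
    # External input: strong excitation at targets, strong inhibition at obstacles.
    I = np.zeros_like(x)
    I[targets] = 1.0
    I[obstacles] = -1.0

    # Lateral input: weighted sum of the positive part of the 4-neighbour activities
    # (boundary wrap-around kept for brevity).
    pos = np.maximum(x, 0.0)
    lateral = w * (np.roll(pos, 1, 0) + np.roll(pos, -1, 0) +
                   np.roll(pos, 1, 1) + np.roll(pos, -1, 1))

    # New activity through a bounded transfer function.
    return np.tanh(beta * (lateral + I))

# Example: let activity propagate from a target cell across a 20x20 grid.
x = np.zeros((20, 20))
targets = np.zeros((20, 20), dtype=bool); targets[15, 15] = True
obstacles = np.zeros((20, 20), dtype=bool); obstacles[8:12, 8:12] = True
for _ in range(50):
    x = gbnn_step(x, targets, obstacles)
# The AUV then moves toward the neighbouring cell with the highest activity.
```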
4.5. Reinforcement learning
Reinforcement learning allows an agent to learn how to perform a task by interacting with the environment via trial and error (Sutton and Barto, 1998). For obstacle avoidance of an AUV with reinforcement learning, the AUV first detects the surrounding information (e.g., information about obstacles) as its current state estimate $s_t$ and performs an action $a_t$, which transitions the state $s_t$ of the environment to a next state. The AUV then receives a reward signal $r_t$ from the environment, which is used to update its policy. The objective of the AUV is to learn an optimal policy that selects actions maximizing the expected cumulative reward. Fig. 11 illustrates the mechanism of AUV path planning with reinforcement learning.
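A minimal tabular Q-learning loop, sketched below in Python, illustrates this state–action–reward cycle. The environment interface (reset/step), the discretized states and the hyperparameters are assumptions for illustration rather than a specific method from the surveyed papers.

```python
import random
from collections import defaultdict

def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning for a discretized obstacle-avoidance task.
    env is assumed to expose n_actions, reset() -> state and
    step(action) -> (next_state, reward, done)."""
    q = defaultdict(lambda: [0.0] * env.n_actions)

    for _ in range(episodes):
        s = env.reset()                      # initial state estimate s_t
        done = False
        while not done:
            # epsilon-greedy selection of action a_t
            if random.random() < epsilon:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda i: q[s][i])

            s_next, r, done = env.step(a)    # environment transition and reward r_t

            # Q-learning update toward the bootstrapped target.
            target = r + (0.0 if done else gamma * max(q[s_next]))
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q
```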
The introduction of reinforcement learning into obstacle avoidance of an AUV enables it to learn from its own experience and gradually adapt to the environment without complete prior knowledge, or even without any prior knowledge at all. Kawano and Ura (2002b)