Moreover, Hou et al. (2020) designed a path planning algorithm based on the Deterministic Policy Gradient (Heess et al., 2015). Considering the nonlinear characteristics of the AUV, the vehicle is not abstracted as a point; instead, two suitable neural network approximators are developed to control the propeller power and the rudder position of the AUV. The reward function takes into account the distance to obstacles obtained by a sonar array and the difference between the current distance (from the current position to the target point) and the previous distance (from the previous position to the target point). Their simulation results show that the AUV can plan a collision-free path in unknown continuous environments.
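To make this kind of reward shaping concrete, the snippet below is a minimal sketch in Python; the weights, the safety distance d_safe, and the function itself are illustrative assumptions rather than the exact formulation used by Hou et al. (2020).

```python
import numpy as np

def shaped_reward(prev_pos, curr_pos, goal, sonar_ranges,
                  w_goal=1.0, w_obs=0.5, d_safe=5.0):
    """Illustrative reward: progress toward the goal minus a penalty that
    grows as the closest sonar return falls below an assumed safety distance."""
    prev_dist = np.linalg.norm(goal - prev_pos)   # previous distance to the target point
    curr_dist = np.linalg.norm(goal - curr_pos)   # current distance to the target point
    progress = prev_dist - curr_dist              # positive when the AUV moves toward the goal

    d_min = np.min(sonar_ranges)                  # nearest obstacle seen by the sonar array
    obstacle_penalty = max(0.0, d_safe - d_min) / d_safe  # 0 when all obstacles are far away

    return w_goal * progress - w_obs * obstacle_penalty

# Example call with assumed positions and sonar readings.
r = shaped_reward(np.array([0.0, 0.0]), np.array([1.0, 0.0]),
                  goal=np.array([10.0, 0.0]),
                  sonar_ranges=np.array([8.0, 6.0, 12.0]))
```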
With deep reinforcement learning, Havenstrøm et al. (2021) directly used 2D sonar images as one of the state inputs, and the output of the system is processed by a low-pass filter before acting on the control fins, so that the path tracking and obstacle avoidance tasks are completed at the same time. In their method, the idea of curriculum learning is introduced by constructing scenarios of increasing difficulty, from obstacle-free environments to environments with obstacles and currents. In addition, a quadratic penalty reward function is designed to analyze in detail the trade-off between path tracking and obstacle avoidance. They verified the effectiveness of their method on a 3D simulation platform.
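The idea of a quadratic penalty that balances the two objectives can be sketched as follows; the variable names, the weighting factor lambda_obs, and the exact functional form are assumptions for illustration and not the precise reward of Havenstrøm et al. (2021).

```python
def quadratic_penalty_reward(cross_track_error, obstacle_closeness, lambda_obs=0.5):
    """Illustrative quadratic trade-off between path tracking and obstacle avoidance.

    cross_track_error:  deviation from the desired path (larger is worse)
    obstacle_closeness: value in [0, 1] that grows as sonar returns get closer
    lambda_obs:         assumed weight deciding how much avoidance dominates tracking
    """
    tracking_penalty = cross_track_error ** 2
    avoidance_penalty = obstacle_closeness ** 2
    # Both terms are penalties, so the reward is their negated weighted sum.
    return -((1.0 - lambda_obs) * tracking_penalty + lambda_obs * avoidance_penalty)
```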
Although deep reinforcement learning has achieved good performance for AUV path planning, problems such as sample inefficiency still prevent it from being applied to AUVs in complex real-world environments.
4.7. Other algorithms
Besides the above popular methods, there is also literature on the application of some other algorithms to AUV path planning. For instance, He and Zhou (2010) proposed that the whole working space of the AUV should be divided dynamically for large-scale environments, and that the fast marching algorithm (Sethian, 1999) can be used to find a collision-free path.
Wang et al. (2013) proposed a vector-polar histogram method for AUV obstacle avoidance, which can determine an optimal movement direction for the AUV when its sensor detects the presence of multiple obstacles.
Sun and Zhu (2016) used the D* Lite algorithm (Koenig and Likhachev, 2005) to repeatedly confirm whether the path distance from the current point to the target point is the shortest. Besides, it can also quickly replan the path of the AUV when a moving obstacle is detected.
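The online replanning loop described here can be illustrated with a self-contained sketch; for brevity, a plain Dijkstra search on a small grid stands in for the actual D* Lite algorithm, which would reuse previous search effort instead of replanning from scratch, and the grid, positions, and obstacle cell are assumed values.

```python
import heapq

def plan(grid, start, goal):
    """Dijkstra shortest path on a 4-connected grid (1 = obstacle).
    Stands in for an incremental planner such as D* Lite."""
    rows, cols = len(grid), len(grid[0])
    frontier, came_from, cost = [(0, start)], {start: None}, {start: 0}
    while frontier:
        c, node = heapq.heappop(frontier)
        if node == goal:                                   # rebuild the path by walking back
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, col = node
        for nb in ((r + 1, col), (r - 1, col), (r, col + 1), (r, col - 1)):
            if 0 <= nb[0] < rows and 0 <= nb[1] < cols and grid[nb[0]][nb[1]] == 0:
                if nb not in cost or c + 1 < cost[nb]:
                    cost[nb] = c + 1
                    came_from[nb] = node
                    heapq.heappush(frontier, (c + 1, nb))
    return None

# Plan, follow the path for a few steps, detect a new obstacle on the remaining
# path, and replan from the current position.
grid = [[0] * 5 for _ in range(5)]
pos, goal = (0, 0), (4, 4)
path = plan(grid, pos, goal)
for _ in range(2):
    pos = path[path.index(pos) + 1]          # execute the next step of the current plan
grid[2][4] = 1                               # a moving obstacle is detected at cell (2, 4)
if (2, 4) in path:                           # the remaining plan is blocked
    path = plan(grid, pos, goal)             # replan around the new obstacle
print(path)
```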
Braginsky and Guterman (2016) designed an obstacle avoidance method for AUV that considers both the horizontal and vertical directions. In the horizontal direction, a two-layer obstacle avoidance algorithm is adopted, which includes a pre-planning method based on the BK product of fuzzy relations (Bui and Kim, 2006) and a reactive obstacle avoidance algorithm based on the potential field and edge detection algorithm (Borenstein and Koren, 1991). When an obstacle fills the entire field of view of the sonar, reactive vertical travel is triggered. In addition, Wang et al. (2016) proposed
a rolling window optimization algorithm to avoid unknown obstacles.
Yan et al. (2018a) used the largest polar angle algorithm to generate obstacle avoidance contours for irregular obstacles. Moreover, Liu et al. (2019) proposed a learning fixed-height histogram method based on the estimation of distribution algorithm (Larranaga and Lozano, 2001) to
complete path planning in dynamic environments. In the proposed
method, a plan window which can dynamically change its size is
introduced to deal with moving obstacles.
4.8. Discussion
Table 2 compares the advantages and disadvantages of the above surveyed popular methods for local path planning with unknown and/or dynamic obstacles.
Among them, the RRT algorithm is based on probabilistic sampling, which can simultaneously consider algebraic constraints caused by obstacles and differential constraints caused by the dynamics of the AUV. Through random sampling of points, the search space of RRT can be easily extended to unexplored areas, which makes it very suitable for solving path planning problems in high-dimensional spaces. Therefore, the RRT algorithm has a strong exploration ability in environments with unknown obstacles, although its real-time performance is lower than that of the other methods. The artificial potential field is simple in structure and easy to implement for low-level AUV control, and it plays an important role in real-time obstacle avoidance for AUV. However, the artificial potential field does not consider the dynamic constraints of the AUV or the size of obstacles. When multiple obstacles are close to each other, an AUV using the artificial potential field may fail to find a direction to travel and can easily fall into a local minimum point (a minimal sketch after this paragraph illustrates this failure mode). The fuzzy logic algorithm has strong robustness in dealing with practical problems, and has been widely used on AUVs to avoid unknown and dynamic obstacles. It does not need an accurate mathematical model, and is suitable for solving highly complex and nonlinear problems. However, the formulation of fuzzy rules and membership functions in the fuzzy logic algorithm relies heavily on experts' knowledge and cannot be changed once it is determined. Therefore, in unknown and uncertain environments, the fuzzy logic algorithm might not work well since no prior knowledge can be obtained.
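The local-minimum behaviour of the artificial potential field mentioned above can be reproduced with a minimal sketch; the gains k_att and k_rep, the influence radius d0, and the symmetric obstacle layout are assumed values chosen only to expose the failure mode.

```python
import numpy as np

def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=3.0):
    """Resultant of an attractive force toward the goal and repulsive forces
    from all obstacles within the influence radius d0 (illustrative gains)."""
    force = k_att * (goal - pos)                              # attractive term
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if d < d0:                                            # obstacle inside the influence radius
            force += k_rep * (1.0 / d - 1.0 / d0) / d ** 3 * diff
    return force

goal = np.array([10.0, 0.0])
# Two obstacles placed symmetrically about the straight line to the goal: their
# lateral components cancel and their longitudinal components oppose the
# attractive force, so the resultant changes sign along the line, i.e. there is
# an equilibrium point (local minimum) where the vehicle would get stuck.
obstacles = [np.array([3.0, 0.8]), np.array([3.0, -0.8])]
for x in (1.0, 2.0):
    print(f"x = {x}: force = {apf_force(np.array([x, 0.0]), goal, obstacles)}")
```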
On the other hand, neural networks can store empirical knowledge and deal with nonlinear mapping problems by learning autonomously with simple rules. However, traditional neural networks need to collect samples before learning, which is a very time-consuming process and might be difficult or even impossible in many situations for AUV path planning. The bio-inspired neural network was proposed to solve this problem. It does not need any pre-training process and is very suitable for dealing with unknown dynamic environments. However, the bio-inspired neural network still has the shortcoming that it cannot explain the reasoning basis for its output. Reinforcement learning (RL) has strong decision-making ability, and an AUV with reinforcement learning can plan an optimal path without any prior knowledge. Moreover, RL shows strong adaptability and flexibility in complex and uncertain environments. However, hand-crafted features need to be used for state representation in RL. In addition, the curse of dimensionality prevents the AUV from planning a path efficiently in high-dimensional environments, and delayed rewards result in slow convergence for AUV path planning with RL. Deep reinforcement learning (DRL) does not require manual features for state representation but learns the features automatically. It implements end-to-end learning from raw sensory input data to the AUV's actions (as illustrated by the sketch below), which allows the AUV to learn to plan an optimal path in high-dimensional and complex environments. However, usually millions of samples are needed for an AUV with DRL to learn to plan an optimal path, which prevents it from being applied to real AUV platforms.
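As an illustration of such an end-to-end mapping from raw sensor data to actions, the sketch below defines a small policy network that maps a sonar image directly to normalized thrust and rudder commands; the image size, the layer sizes, and the action dimension are assumptions for illustration and are not taken from any of the surveyed works.

```python
import torch
import torch.nn as nn

class SonarPolicy(nn.Module):
    """Illustrative end-to-end policy: raw sonar image -> [thrust, rudder]."""
    def __init__(self, action_dim=2):
        super().__init__()
        self.features = nn.Sequential(                     # convolutional feature extractor
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(                          # maps features to bounded actions
            nn.Linear(32 * 14 * 14, 128), nn.ReLU(),
            nn.Linear(128, action_dim), nn.Tanh(),           # actions normalized to [-1, 1]
        )

    def forward(self, sonar_image):
        return self.head(self.features(sonar_image))

policy = SonarPolicy()
dummy_sonar = torch.zeros(1, 1, 64, 64)                     # a single 64 x 64 sonar image
action = policy(dummy_sonar)                                 # normalized thrust and rudder angle
```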