Fig. 11. Illustration of AUV path planning with reinforcement learning. Source: Modified from Sutton and Barto (1998).
Kawano and Ura proposed a path planning algorithm combining Q-learning (Watkins and Dayan, 1992), a teaching method, and a Bayesian network for a non-holonomic AUV. The teaching method comprises intensive teaching, which suggests actions to the AUV, and global teaching, which concerns keeping a distance from the target point throughout the learning process. The learning experience is stored in a Bayesian network, which enables the AUV to deal with obstacles of any shape. In addition, the error caused by the coupling of the current and the yaw motion is taken as an input, and continuous iterative learning is used to better resist the current.
However, because of non-Markovian effects, the proposed method is slow to converge. They therefore further proposed a hierarchical reinforcement learning approach, in which the high level is a motion planning module that considers the position of the AUV, and the low level regulates the speed of the AUV to stabilize the yaw motion (Kawano and Ura, 2002a). As a result, the learning speed of the algorithm is shown to be significantly improved.
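As an illustration, a minimal sketch of the one-step tabular Q-learning update underlying such path planners is given below, with a hypothetical teaching hook that biases action selection toward a suggested heading. The grid size, reward constants, and the `suggest_action` hook are assumptions for illustration, not the original implementation.

```python
import numpy as np

# Minimal tabular Q-learning sketch for grid-based AUV path planning.
# Grid size, hyperparameters and the teaching hook are illustrative
# assumptions, not the method of Kawano and Ura.

N_STATES, N_ACTIONS = 100, 4          # e.g. a 10x10 grid, 4 candidate headings
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

Q = np.zeros((N_STATES, N_ACTIONS))

def suggest_action(state):
    """Hypothetical 'teaching' hook: return a suggested heading for this
    state (e.g. toward the target), or None if no suggestion applies."""
    return None

def select_action(state, rng):
    # A teaching suggestion takes priority; otherwise act epsilon-greedily.
    suggestion = suggest_action(state)
    if suggestion is not None:
        return suggestion
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def q_update(s, a, r, s_next):
    # Standard one-step Q-learning update (Watkins and Dayan, 1992).
    td_target = r + GAMMA * np.max(Q[s_next])
    Q[s, a] += ALPHA * (td_target - Q[s, a])
```

The global-teaching idea of keeping the vehicle at a distance from the target could equally be folded into the reward signal rather than the action-selection hook; the sketch only shows where such guidance can enter the learning loop.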
In consideration of the high risk of trial and error, Chen et al. (2009) proposed to use a neural network and case-based Q-learning (Greenwald et al., 2003) for AUV path planning. The neural network with multi-layer error feedback has a strong approximation ability, which improves the generalization of Q-learning, and case-based Q-learning is used to guarantee convergence. With the information provided by a multi-beam forward sonar, the proposed method was shown to enable the AUV to find an optimal path among multiple obstacles.
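To make the role of the network concrete, the following sketch shows Q-value approximation with a small feed-forward network trained by error back-propagation on the temporal-difference error. The feature size, hidden width, learning rate, and action count are assumptions for illustration and do not reproduce the cited architecture.

```python
import numpy as np

# Sketch: Q-value approximation with a small back-propagation network.
# Feature size, hidden width and learning rate are illustrative assumptions.

rng = np.random.default_rng(0)
N_FEATURES, N_HIDDEN, N_ACTIONS = 8, 32, 5   # e.g. sonar ranges -> headings
W1 = rng.normal(0.0, 0.1, (N_FEATURES, N_HIDDEN))
W2 = rng.normal(0.0, 0.1, (N_HIDDEN, N_ACTIONS))
LR, GAMMA = 1e-3, 0.95

def q_values(x):
    h = np.tanh(x @ W1)          # hidden layer activations
    return h, h @ W2             # one Q-value per steering action

def train_step(x, a, r, x_next, done):
    """One back-propagated TD update for the chosen action a."""
    global W1, W2
    h, q = q_values(x)
    _, q_next = q_values(x_next)
    target = r if done else r + GAMMA * np.max(q_next)
    err = np.zeros(N_ACTIONS)
    err[a] = target - q[a]                     # TD error on the taken action
    dh = (err @ W2.T) * (1.0 - h ** 2)         # back-propagate through tanh
    W2 += LR * np.outer(h, err)
    W1 += LR * np.outer(x, dh)
    return err[a]
```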
For the real-time obstacle avoidance of small AUVs, a single-beam sonar is used to measure obstacle information in turn, and the steering action is selected with a reinforcement learning method (Huang et al., 2014). When the AUV approaches an obstacle it receives a negative reward, and when it moves away from the obstacle it receives a positive reward. The simulation results show that the AUV can safely avoid obstacles within a 90-degree opening angle by learning to control the propeller and the course of the AUV.
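One possible encoding of this reward scheme is sketched below: the vehicle is penalised when the sonar-measured range to the obstacle shrinks and rewarded when it grows. The reward magnitudes and the collision threshold are assumptions, not values from the cited work.

```python
# Reward sketch: penalise approaching an obstacle, reward moving away.
# Magnitudes and the safety threshold are illustrative assumptions.

SAFE_RANGE = 5.0       # metres; hypothetical minimum safe range

def avoidance_reward(prev_range, curr_range):
    if curr_range < SAFE_RANGE:
        return -10.0                   # too close: strong penalty
    if curr_range < prev_range:
        return -1.0                    # approaching the obstacle
    if curr_range > prev_range:
        return 1.0                     # moving away from the obstacle
    return 0.0
```

The learned policy then selects the propeller and course commands that maximise the discounted sum of these rewards.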
In the presence of dynamic obstacles, Gore et al. (2019) show that, by allowing the AUV to obtain state information within a Markov decision process, it can learn to take the corresponding actions to obtain the path with minimum deviation from obstacles.
Noguchi and Maki (2019) applied SARSA(𝜆) to the path planning of an AUV and show that it can find collision-free paths to capture sea urchins in complex environments. In their method, considering the limitations and fuzziness of the information obtained from sonar sensors, a map based on occupancy probability is used to obtain the state information of the AUV.
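For reference, a minimal tabular SARSA(𝜆) update with accumulating eligibility traces is sketched below; the state index is assumed to come from a discretised occupancy map, and all constants are illustrative rather than those of the cited study.

```python
import numpy as np

# Minimal tabular SARSA(lambda) with accumulating eligibility traces.
# State indices are assumed to be discretised occupancy-map cells;
# all constants are illustrative assumptions.

N_STATES, N_ACTIONS = 400, 8
ALPHA, GAMMA, LAMBDA, EPSILON = 0.1, 0.95, 0.9, 0.1

Q = np.zeros((N_STATES, N_ACTIONS))
E = np.zeros_like(Q)                      # eligibility traces
rng = np.random.default_rng(0)

def epsilon_greedy(s):
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[s]))

def sarsa_lambda_step(s, a, r, s_next, a_next):
    """One on-policy update; traces spread the TD error to recent states."""
    global Q, E
    delta = r + GAMMA * Q[s_next, a_next] - Q[s, a]
    E[s, a] += 1.0                        # accumulating trace
    Q += ALPHA * delta * E
    E *= GAMMA * LAMBDA                   # decay all traces
```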
In addition, Bhopale et al. (2019) proposed a modified Q-learning algorithm based on a back-propagation neural network to deal with unknown obstacles in the environment. The proposed method overcomes the curse of dimensionality and introduces a factor