Path planning and obstacle avoidance for AUV: A review


C. Cheng et al., Ocean Engineering 235 (2021) 109355

5.2. Path planning in uncertain environments

For local path planning, the fuzzy logic algorithm does not need accurate mathematical models and has been shown to achieve good results. However, the fuzzy rules are usually defined from the expert’s experience and cannot adapt to the environment. In complex, uncertain underwater environments, no prior knowledge is available, and constructing fuzzy rules would be difficult or even impossible. Path planning for an AUV with reinforcement learning (RL) can find an optimal path by interacting with the environment without any prior knowledge. Therefore, an AUV with reinforcement learning can adapt flexibly and work well in complex and uncertain environments. An AUV with deep reinforcement learning (DRL) can even learn in high-dimensional and complex environments from raw sensory input data in an end-to-end way.
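To make the idea of planning purely by interaction concrete, the following is a minimal sketch of tabular Q-learning on a small grid, not a method taken from any of the surveyed papers. The grid size, obstacle cells, reward values and the function name `plan_with_q_learning` are all illustrative assumptions; a real AUV would have a continuous state, currents and sensor noise.

```python
import random

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # east, west, north, south

def plan_with_q_learning(width=5, height=5,
                         obstacles=frozenset({(1, 1), (2, 1), (3, 3)}),
                         start=(0, 0), goal=(4, 4), episodes=2000,
                         alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Learn a grid path by trial and error; all numbers are assumed toy values."""
    rng = random.Random(seed)
    q = {}  # (state, action index) -> estimated return, defaults to 0.0

    def step(state, a):
        nx, ny = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
        blocked = not (0 <= nx < width and 0 <= ny < height) or (nx, ny) in obstacles
        if blocked:
            return state, -5.0, False      # collision: stay in place, penalty
        if (nx, ny) == goal:
            return (nx, ny), 10.0, True    # goal reached, episode ends
        return (nx, ny), -1.0, False       # ordinary move costs one unit

    def greedy(state):
        return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))

    for _ in range(episodes):
        s = start
        for _ in range(100):               # cap on episode length
            # epsilon-greedy exploration
            a = rng.randrange(len(ACTIONS)) if rng.random() < epsilon else greedy(s)
            s2, r, done = step(s, a)
            # Q-learning target: reward plus discounted best next value
            target = r if done else r + gamma * q.get((s2, greedy(s2)), 0.0)
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (target - old)
            s = s2
            if done:
                break

    # Follow the learned greedy policy from the start cell.
    path, s = [start], start
    for _ in range(width * height):
        s, _, done = step(s, greedy(s))
        path.append(s)
        if done:
            break
    return path
```

Calling `plan_with_q_learning()` returns the list of visited cells. No map or rule base is supplied in advance; the agent only sees the rewards returned by `step`, which is the property the paragraph above attributes to RL.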

However, RL and DRL are sample inefficient and slow to converge because of scarce reward signals (Goecks et al., 2020). It is difficult or even impractical to design an efficient reward function for each task, which makes applying traditional RL and DRL methods directly to path planning of physical AUVs a great challenge (Riedmiller et al., 2018).
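The difference between a scarce and a denser reward signal can be sketched as follows. This is an illustration only: the second function uses potential-based reward shaping, a standard RL technique that is not described in this review, and the goal tolerance, discount factor and distance-based potential are assumptions chosen for the example.

```python
import math

def sparse_reward(next_pos, goal, tol=1.0):
    """Reward only on arrival: most transitions return 0, so learning is slow."""
    return 1.0 if math.dist(next_pos, goal) <= tol else 0.0

def shaped_reward(pos, next_pos, goal, gamma=0.99, tol=1.0):
    """Sparse reward plus a potential-based shaping term (assumed potential:
    negative Euclidean distance to the goal)."""
    def phi(p):
        return -math.dist(p, goal)
    return sparse_reward(next_pos, goal, tol) + gamma * phi(next_pos) - phi(pos)
```

With a goal at `(10, 0, 0)`, a step from `(0, 0, 0)` to `(1, 0, 0)` earns nothing under `sparse_reward` but a positive value under `shaped_reward`, while the reverse step is penalised. Choosing a suitable potential for every new task is itself a design effort, which is the difficulty the paragraph above points to.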

Sampling in a simulated environment is faster, cheaper and safer than learning directly in the real world. However, deploying a policy trained in simulation directly on a real AUV is difficult and risky, since there is a gap between simulation and reality (Kober and Peters, 2014). Many sim-to-real algorithms have been proposed to solve this problem, such as domain adaptation (Tzeng et al., 2015), inverse dynamics models (Christiano et al., 2016), domain randomization (Tobin et al., 2017) and progressive networks (Shojania and Li, 2007), but there seems to be no work on AUV path planning from this perspective yet (Zhao et al., 2020).
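Of the sim-to-real techniques listed above, domain randomization is the simplest to sketch: simulator parameters are redrawn for every training episode so that the policy cannot overfit to one particular simulated ocean. The parameter names and ranges below are hypothetical and would have to be chosen for a specific vehicle and simulator.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class SimParams:
    current_speed: float     # ocean current speed in m/s (assumed range)
    current_heading: float   # ocean current direction in rad
    drag_coeff: float        # multiplier on the nominal hull drag
    sensor_noise_std: float  # std of additive range-sensor noise in m

def sample_sim_params(rng):
    """Draw one random simulator configuration (ranges are illustrative)."""
    return SimParams(
        current_speed=rng.uniform(0.0, 1.0),
        current_heading=rng.uniform(-math.pi, math.pi),
        drag_coeff=rng.uniform(0.8, 1.2),
        sensor_noise_std=rng.uniform(0.0, 0.05),
    )

def randomized_episodes(n, seed=0):
    """Yield a fresh parameter set for each of n training episodes."""
    rng = random.Random(seed)
    for _ in range(n):
        yield sample_sim_params(rng)
```

A training loop would reconfigure the simulator with each yielded `SimParams` before running the episode, so the learned policy sees a spread of dynamics rather than a single model.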

On the other hand, some researchers have proposed leveraging human knowledge to speed up the AUV’s learning, e.g., by allowing a human trainer to provide demonstrations, evaluative feedback (Li et al., 2019a), etc. For example, Chu et al. (2020) proposed deep imitation reinforcement learning (DIRL) for motion control of unmanned underwater vehicles (UUVs). DIRL first learns a policy by imitation from expert demonstrations and then uses the learned policy to initialize the TD3 algorithm (Fujimoto et al., 2018). In addition, Zhang et al. (2020) proposed deep interactive reinforcement learning for the AUV path tracking task, which allows a human trainer to transfer knowledge by delivering evaluative feedback on the quality of the AUV’s actions. Therefore, how to make full use of human experience and knowledge to improve AUV path planning would be an interesting research direction.
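The idea of initializing a learner from demonstrations can be sketched with a deliberately simple stand-in: a linear heading controller whose gain is fitted to demonstration pairs by least squares. This is not the DIRL architecture of Chu et al. (2020), which uses neural networks and TD3; the demonstration values, the controller form and the rudder limit are all made up for illustration.

```python
def fit_linear_policy(demos):
    """Least-squares gain k for the assumed policy: rudder = k * heading_error."""
    num = sum(err * act for err, act in demos)
    den = sum(err * err for err, _ in demos)
    if den == 0.0:
        raise ValueError("demonstrations need non-zero heading errors")
    return num / den

def make_policy(k, max_rudder=0.5):
    """Return a saturated proportional policy (the limit is a hypothetical value)."""
    def policy(heading_error):
        return max(-max_rudder, min(max_rudder, k * heading_error))
    return policy

# Hypothetical (heading error in rad, rudder command in rad) pairs from a trainer.
demos = [(-0.4, -0.32), (-0.1, -0.08), (0.2, 0.16), (0.5, 0.40)]
k0 = fit_linear_policy(demos)
warm_start_policy = make_policy(k0)
```

An RL algorithm would then start from `warm_start_policy` instead of a random policy and refine it through interaction, which is the role the imitation stage plays in the approach described above.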

5.3. Combination of different path planning algorithms

The path planning methods surveyed above all have their own advantages and disadvantages in specific application scenarios. It would be immensely useful to combine multiple path planning algorithms, which can complement each other and better deal with unknown dynamic obstacles and complex situations. For instance, the combination of the fuzzy logic algorithm and reinforcement learning can improve the control accuracy of the system in strong currents (Yang et al., 2009). The adaptive neuro-fuzzy inference system and the particle swarm optimization algorithm can be used together to generate a feasible path in environments full of moving targets (Yan et al., 2018b). The quantum particle swarm optimization algorithm with selective differential evolution can significantly shorten the time to generate the best path (Lim et al., 2020a). In addition, Yao and Zhao (2018) used the improved genetic algorithm combined with the gray wolf optimization algorithm (Mirjalili et al., 2014) to optimize the coefficient of the improved interfered fluid dynamical system (Yao et al., 2015), and verified that the combination of a path planning algorithm and some mathematical methods can better deal with dynamic obstacles.
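As a rough illustration of using a metaheuristic to tune the coefficients of another planner, the sketch below implements a plain gray wolf optimizer. It is not the hybrid genetic and gray wolf scheme of Yao and Zhao (2018), and `planner_cost` is a hypothetical stand-in: in practice the cost would come from running the planner with the candidate coefficients and scoring the resulting path.

```python
import random

def gray_wolf_optimize(cost, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimize cost(x) within box bounds with a basic gray wolf optimizer."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best = list(min(wolves, key=cost))
    for it in range(n_iters):
        # The three best wolves (alpha, beta, delta) guide the pack.
        leaders = [list(w) for w in sorted(wolves, key=cost)[:3]]
        a = 2.0 * (1.0 - it / n_iters)  # shrinks from 2 toward 0 over the run
        for w in wolves:
            for d, (lo, hi) in enumerate(bounds):
                guess = 0.0
                for leader in leaders:
                    A = a * (2.0 * rng.random() - 1.0)
                    C = 2.0 * rng.random()
                    guess += leader[d] - A * abs(C * leader[d] - w[d])
                # New position: mean of the three leader-based estimates, clipped.
                w[d] = min(hi, max(lo, guess / 3.0))
        candidate = min(wolves, key=cost)
        if cost(candidate) < cost(best):
            best = list(candidate)
    return best, cost(best)

def planner_cost(coeffs):
    """Hypothetical placeholder for 'path quality as a function of two coefficients'."""
    c1, c2 = coeffs
    return (c1 - 2.0) ** 2 + (c2 - 0.5) ** 2

best_coeffs, best_cost = gray_wolf_optimize(planner_cost, [(0.0, 5.0), (0.0, 1.0)])
```

The optimizer only needs a cost function and bounds, so the same loop could wrap any planner whose behaviour depends on a few scalar coefficients.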
