Research on Robot Navigation Algorithm Based on Deep Reinforcement Learning

doi:10.16749/j.cnki.jecjtu.20230209.001

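Funding: National Natural Science Foundation of China (62067002, 61967006, 62062033); Natural Science Foundation of Jiangxi Province (20212BAB202008); Science and Technology Project of the Jiangxi Provincial Department of Transportation (2022X0040)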

Abstract:

When a mobile robot moves through a dense, dynamic crowd, an insufficient understanding of the environment leads to low navigation efficiency and weak generalization. To address this problem, a double-attention deep reinforcement learning algorithm is proposed. First, the sparse reward function is optimized by introducing a distance penalty term and a comfort distance, so that the robot approaches the target while maintaining navigation safety. Second, a state-value network based on double attention is designed to process environmental information, giving the navigation system both environmental understanding and real-time decision-making capability. Finally, the algorithm is verified in a simulation environment. Experimental results show that the proposed algorithm improves both navigation efficiency and the robustness of the navigation system: in 500 random test scenarios, the numbers of collisions and timeouts are both 0, the navigation success rate is higher than that of the comparison algorithms, and the average navigation time is 2% shorter than that of the best comparison algorithm. When the number of pedestrians or the navigation distance changes, the algorithm remains effective and its navigation time is still shorter than that of the comparison algorithms.
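
The abstract names two additions to the sparse reward, a distance penalty term and a comfort distance, but does not give the formula. The following is only a minimal sketch of how such a shaped reward is commonly written in crowd-navigation work. The function name, the `RewardConfig` fields, and every numeric value are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not the paper's settings.
    success_reward: float = 1.0      # reward for reaching the goal
    collision_penalty: float = -0.25  # reward on collision with a pedestrian
    comfort_distance: float = 0.2    # metres; closer than this is uncomfortable
    discomfort_factor: float = 0.5   # weight of the comfort-distance penalty
    progress_factor: float = 0.1     # weight of the distance (progress) term
    goal_tolerance: float = 0.3      # metres; goal counts as reached within this


DEFAULT_CONFIG = RewardConfig()


def shaped_reward(prev_goal_dist, goal_dist, min_human_dist, cfg=DEFAULT_CONFIG):
    """Return (reward, done) for one simulation step.

    prev_goal_dist / goal_dist: robot-to-goal distance before and after the step.
    min_human_dist: smallest surface-to-surface distance to any pedestrian
                    (zero or negative means a collision).
    """
    if min_human_dist <= 0.0:
        return cfg.collision_penalty, True
    if goal_dist <= cfg.goal_tolerance:
        return cfg.success_reward, True

    # Distance term: positive when the robot moved closer to the goal,
    # negative (a penalty) when it moved away.
    reward = cfg.progress_factor * (prev_goal_dist - goal_dist)

    # Comfort-distance term: penalise intruding into a pedestrian's comfort zone,
    # growing linearly with the depth of the intrusion.
    if min_human_dist < cfg.comfort_distance:
        reward -= cfg.discomfort_factor * (cfg.comfort_distance - min_human_dist)

    return reward, False
```

Under these assumed weights, a step that moves the robot away from the goal or into a pedestrian's comfort zone receives a small negative reward, so the agent gets feedback on every step instead of only at the end of an episode.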

Cite this article

Huang Xiaohui, Zeng Hui, Shu Yaosong, Xiong Liyan. Research on Robot Navigation Algorithm Based on Deep Reinforcement Learning[J]. Journal of East China Jiaotong University, 2023, 40(1): 67-74.

Published online: 2025-07-02