Research on Robot Navigation Algorithm Based on Deep Reinforcement Learning

Affiliation:

School of Information Engineering, East China Jiaotong University

Fund Project:

National Natural Science Foundation of China (62067002, 61967006, 62062033); Jiangxi Provincial Department of Education Project (No.200604, No.GJJ190317); Jiangxi Provincial Natural Science Foundation General Program (No.20212BAB202008)

    Abstract:

    When a mobile robot passes through a dense, dynamic crowd, an insufficient understanding of environmental information leads to low navigation efficiency and weak generalization. To address this problem, a double attention deep reinforcement learning algorithm is proposed. First, the sparse reward function is optimized by introducing a distance penalty term and a comfort distance, so that the robot approaches the target while maintaining navigation safety. Second, a state value network based on double attention is designed to process environmental information, giving the robot navigation system both environmental understanding and real-time decision-making capability. Finally, the algorithm is verified in a simulation environment. Experimental results show that the proposed algorithm improves both navigation efficiency and the robustness of the robot navigation system. Specifically, in 500 random test scenarios, the numbers of collisions and timeouts are both 0, the navigation success rate is better than that of the comparison algorithms, and the average navigation time is 2% shorter than that of the best comparison algorithm. When the number of pedestrians or the navigation distance changes, the algorithm remains effective, and its navigation time is shorter than that of the comparison algorithms.
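
The abstract describes the reward design only in words, so the following is a minimal sketch of how a sparse goal/collision reward could be densified with a distance term and a comfort-distance penalty. It is not the authors' formulation: the function name `shaped_reward`, every parameter, and all numeric defaults are assumptions chosen for illustration.

```python
def shaped_reward(prev_goal_dist, goal_dist, min_ped_dist, *,
                  goal_radius=0.3, comfort_dist=0.2,
                  w_progress=0.1, w_comfort=0.5,
                  r_goal=1.0, r_collision=-0.25):
    """Return (reward, done) for one time step.

    prev_goal_dist, goal_dist: robot-to-goal distance before and after the step.
    min_ped_dist: smallest surface-to-surface gap to any pedestrian
                  (negative means the robot overlaps a pedestrian).
    All weights and thresholds are hypothetical, not taken from the paper.
    """
    # Sparse terminal terms: collision and goal arrival end the episode.
    if min_ped_dist < 0.0:
        return r_collision, True
    if goal_dist <= goal_radius:
        return r_goal, True

    # Distance term: positive when the step reduces the distance to the goal,
    # negative (a penalty) when the robot moves away from it.
    reward = w_progress * (prev_goal_dist - goal_dist)

    # Comfort term: penalize intruding into a pedestrian's comfort zone,
    # growing linearly as the gap shrinks.
    if min_ped_dist < comfort_dist:
        reward -= w_comfort * (comfort_dist - min_ped_dist)

    return reward, False
```

With these assumed defaults, a step that closes 0.1 m toward the goal while keeping a 1 m gap earns a small positive reward, while the same step taken 0.1 m from a pedestrian is penalized overall, which is the trade-off between goal-seeking and safety that the abstract describes.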

History
  • Received: 2022-03-21
  • Last revised: 2022-04-02
  • Accepted: 2022-04-20
  • Published online: 2023-06-21