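Research on Robot Navigation Algorithm Based on Deep Reinforcement Learning

Author biography:

熊李艳 (Xiong Liyan, born 1968), female, professor, master's degree, supervisor of master's students; research interest: transportation big data. E-mail: 276477130@qq.com

CLC number:

U495; TP242

Funding:

National Natural Science Foundation of China (62067002, 61967006, 62062033); Natural Science Foundation of Jiangxi Province (20212BAB202008); Science and Technology Project of the Jiangxi Provincial Department of Transportation (2022X0040)
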
    Abstract:

    When a mobile robot moves through a dense, dynamic crowd, an insufficient understanding of the environment leads to low navigation efficiency and weak generalization. To address this problem, a double-attention deep reinforcement learning algorithm is proposed. First, the sparse reward function is optimized: a distance penalty term and a comfort distance are introduced so that the robot approaches the target while keeping navigation safe. Second, a state value network based on double attention is designed to process environmental information, giving the navigation system both environmental understanding and real-time decision-making ability. Finally, the algorithm is verified in a simulation environment. The experimental results show that the proposed algorithm improves both navigation efficiency and the robustness of the navigation system. Specifically, in 500 random test scenarios the numbers of collisions and timeouts are both 0, the navigation success rate is higher than that of the comparison algorithms, and the average navigation time is 2% shorter than that of the best comparison algorithm. When the number of pedestrians or the navigation distance changes, the algorithm remains effective, and its navigation time is shorter than that of the comparison algorithms.
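
The abstract names the ingredients of the optimized reward (a distance penalty term and a comfort distance) but does not give the formula. The following is a minimal Python sketch of how such a shaped reward might look, not the authors' definition. The function name `shaped_reward`, the `RewardConfig` fields, and every numeric constant are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not taken from the paper.
    success_reward: float = 1.0      # reward for reaching the goal
    collision_penalty: float = -0.25 # reward on collision with a pedestrian
    comfort_distance: float = 0.2    # desired clearance to pedestrians (m)
    discomfort_factor: float = 0.5   # weight of the comfort-distance violation
    distance_factor: float = 0.01    # weight of the distance penalty term
    goal_radius: float = 0.3         # goal counts as reached inside this radius (m)


def shaped_reward(robot_xy, goal_xy, min_ped_dist, cfg=None):
    """Return (reward, done) for one time step.

    min_ped_dist is the surface-to-surface distance to the nearest
    pedestrian; a negative value means the robot overlaps a pedestrian.
    """
    cfg = cfg or RewardConfig()
    goal_dist = math.dist(robot_xy, goal_xy)

    if min_ped_dist < 0.0:
        return cfg.collision_penalty, True   # collision ends the episode
    if goal_dist <= cfg.goal_radius:
        return cfg.success_reward, True      # goal reached

    # Dense term: penalize the remaining distance so every step gives a signal.
    reward = -cfg.distance_factor * goal_dist
    # Comfort term: penalize intruding into the comfort zone, scaled by depth.
    if min_ped_dist < cfg.comfort_distance:
        reward += cfg.discomfort_factor * (min_ped_dist - cfg.comfort_distance)
    return reward, False
```

In this sketch the distance term removes the sparsity of a goal-only reward, while the comfort term discourages passing closer to pedestrians than `comfort_distance` even when no collision occurs. How the paper actually weights or combines these terms is not stated in the abstract.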

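Cite this article

熊李艳, 舒垚淞, 曾辉, 黄晓辉. Research on Robot Navigation Algorithm Based on Deep Reinforcement Learning[J]. Journal of East China Jiaotong University, 2023, 40(1): 67-74.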

History
  • Published online: 2023-02-23