Research on Monocular SLAM System Based on Deep Learning (深層学習による単眼SLAMシステムの研究)

https://doi.org/10.18997/00009181
File kou_k_573.pdf (95.1 MB)
Item type Thesis or Dissertation
Publication date 2023-04-10
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_db06
Resource type doctoral thesis
Title
Title Research on Monocular SLAM System Based on Deep Learning
Language en
Title
Title 深層学習による単眼SLAMシステムの研究
Language ja
Language
Language eng
Author Zhou, Shi

Abstract
Description type Abstract
Description Simultaneous localization and mapping (SLAM) is a foundational technology that builds a local map of an unfamiliar environment and locates the device within it at the same time. With the rapid development of intelligent industry and the growing demand for intelligent living, SLAM has become a key technology and has attracted considerable attention in recent years. It is essential for mobile applications such as sweeping robots, delivery robots, firefighting robots, and self-driving vehicles, which must perceive their surroundings and plan their next step. This research studies SLAM for autonomous driving. Traffic scenes are complex: vehicles are parked at the roadside, other vehicles are moving on the road, and pedestrians are walking on the road. In addition, traffic scenes are outdoors, with changing illumination and varied road styles. A SLAM system for autonomous driving must remain robust under these challenges.
Most current SLAM systems are based on low-level features such as lines, corners, and ORB features. These feature-based systems achieve high accuracy because the features are reliable and unambiguous. However, such features are sensitive to the conditions found in complex traffic scenes. Meanwhile, following the excellent performance of deep learning in image processing, numerous deep learning methods have been proposed to estimate depth and pose from images. These methods are highly robust, but their accuracy is low relative to feature-based methods. In particular, deep learning-based pose estimation accumulates large errors without back-end optimization and loop closing. As a result, no purely deep learning-based SLAM method exists so far. Considering the high robustness of deep learning methods and the high accuracy of feature-based methods, this research studies how to bring the robustness of deep learning into feature-based SLAM systems. In this way, it proposes a SLAM system with both high accuracy and high robustness, and analyzes the proposed system through experiments on two public datasets and one self-created dataset.
Because traffic scenes differ in style from place to place and camera sensors differ between vehicles, deep learning models trained on one dataset cannot adapt to other traffic scenes. This research therefore proposes a self-supervised method for depth and pose estimation. According to the theory of scene reconstruction from multiple viewpoints, an image can be synthesized from another image of the same scene. This research exploits that relationship between adjacent frames and between stereo image pairs to synthesize images. The appearance difference between the synthesized image and the original "real" image is then used to supervise the training of the depth and pose estimation networks, so the method does not depend on labeled training data. Experimental results show that images are synthesized well during training and that view synthesis can serve as the self-supervision signal. In addition, the depth and pose results on three datasets show that the proposed method not only estimates high-quality depth and pose but also adapts to different datasets.
More accurate depth leads to a better SLAM system; feeding low-accuracy depth to ORB-SLAM3 strongly affects its performance. To obtain high-accuracy depth, this research uses a knowledge distillation mechanism. To combine the advantages of different depth estimation methods, it designs a stereo method, a two-frame method, and a single-frame method. During training, the mutual knowledge distillation mechanism selects, for each pixel, the best of the three methods as the teacher of the other two. The methods thus teach and learn from one another. For each pixel, the best depth serves as the pseudo-ground truth, so each method gains the advantages of the others while keeping its own. In addition, this research designs an ASP module and a DCP module for depth estimation, so that the network can perceive a large field of view and multi-scale information from images. Experiments on the three datasets show that the knowledge distillation mechanism or the designed network architecture alone improves the depth map only slightly, while the two together improve it markedly. The knowledge distillation mechanism and the designed network architecture are therefore effective for improving depth estimation.
The trajectory of the vehicle is built from the poses between adjacent frames, so more accurate poses lead to a more accurate trajectory. Inspired by feature-based methods, which estimate high-accuracy poses from reliable features, this research proposes to estimate pose using the projection of reliable pixels. Two masks, a boundary mask and an inferior projection mask, detect unreliable pixels in boundary regions and non-boundary regions, respectively. All pixels are thereby divided into two categories: reliable and unreliable. When training the pose estimation model, only information from reliable pixels is used to calculate the gradient, which largely eliminates the negative influence of unreliable pixels. Experiments on three datasets show that using only one of the two masks does not improve the pose, while using both improves it markedly. This suggests that together the two masks detect almost all unreliable pixels in the image, and that using the projection of reliable pixels while ignoring unreliable ones is an effective way to improve pose estimation.
Finally, the deep learning-based depth and pose are applied to ORB-SLAM3. The depth and pose estimated by the deep learning models are fed into ORB-SLAM3 along with each frame, building a novel pseudo-RGBD-I SLAM system that uses only image frames. The only real input is the frames, since the depth and pose are derived from them; this is therefore a monocular setting with pseudo-depth and pseudo-IMU. Experiments on the three datasets show that, for most image sequences, the results with and without the pseudo-depth and pseudo-IMU are comparable, which suggests that deep learning-based depth and pose can be applied to ORB-SLAM3 successfully. For some image sequences, ORB-SLAM3 fails when the vehicle turns left or right, while the proposed method succeeds. In addition, the displacement along each axis is smoother for the proposed method than for ORB-SLAM3. Together these results indicate that the proposed method is more stable than ORB-SLAM3. The proposed method therefore successfully applies deep learning-based depth and pose to ORB-SLAM3 and improves the robustness of the SLAM system. It also performs well on all three datasets, indicating strong adaptability.
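The per-pixel teacher selection and the reliable-pixel masking described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis implementation. The abstract does not say how the "best" method is chosen per pixel, so the sketch assumes it is the branch with the lowest per-pixel error (for example a photometric reconstruction error). The function names, array shapes, and the L1 distillation loss are all hypothetical choices made for illustration.

```python
import numpy as np


def select_pixelwise_teacher(depths, errors):
    """Pick, for every pixel, the depth of the branch with the lowest error.

    depths, errors: arrays of shape (num_branches, H, W), e.g. the stereo,
    two-frame, and single-frame branches. Using the lowest error as the
    selection rule is an assumption, not something the abstract states.
    Returns (pseudo_gt, teacher_idx), both of shape (H, W).
    """
    depths = np.asarray(depths, dtype=float)
    errors = np.asarray(errors, dtype=float)
    teacher_idx = np.argmin(errors, axis=0)  # index of the best branch per pixel
    pseudo_gt = np.take_along_axis(depths, teacher_idx[None], axis=0)[0]
    return pseudo_gt, teacher_idx


def distillation_losses(depths, pseudo_gt):
    """Mean absolute difference between each branch and the pseudo-ground truth.

    In real training the pseudo-ground truth would be treated as a constant
    (no gradient flows through the teacher); NumPy has no gradients, so that
    detail is only noted here.
    """
    depths = np.asarray(depths, dtype=float)
    return np.abs(depths - pseudo_gt[None]).mean(axis=(1, 2))


def masked_mean(per_pixel_error, reliable_mask):
    """Average an error map over reliable pixels only (hypothetical helper).

    reliable_mask would be the combination of the two masks from the
    abstract; how those masks are computed is not covered by this sketch.
    """
    mask = np.asarray(reliable_mask, dtype=bool)
    if not mask.any():
        return 0.0  # nothing reliable: contribute no loss
    return float(np.asarray(per_pixel_error, dtype=float)[mask].mean())
```

With three branches, a pixel where a branch is itself the teacher contributes zero to that branch's distillation loss, so each branch is only pulled toward the others where they are judged better.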
Table of contents
Description type TableOfContents
Description
1 Introduction
2 Self-supervised Joint Learning of Depth and Pose
3 Higher Accuracy Depth Estimation Using Mutual Knowledge Distillation
4 Higher Accuracy Pose Estimation Based on Reliable Projection
5 Applied Deep Learning-based Depth and Pose to ORB-SLAM3
6 Conclusion and Future Work
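Notes
Description type Other
Description Kyushu Institute of Technology doctoral thesis. Degree certificate number: 工博甲第573号. Date of degree conferral: March 24, 2023 (Reiwa 5)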
Keyword
Subject scheme Other
Subject Deep Learning
Keyword
Subject scheme Other
Subject SLAM
Keyword
Subject scheme Other
Subject Depth Estimation
Keyword
Subject scheme Other
Subject Visual Odometry
Keyword
Subject scheme Other
Subject 3D Computer Vision
Keyword
Subject scheme Other
Subject Autonomous Driving
Advisor
張, 力峰
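Degree grant number
Degree grant number 甲第573号
Degree name
Degree name Doctor of Engineering
Date of degree conferral
Date of degree conferral 2023-03-24
Degree-granting institution
Degree-granting institution identifier scheme kakenhi
Degree-granting institution identifier 17104
Degree-granting institution name Kyushu Institute of Technology
Academic year of degree conferral
Description type Other
Description Academic year 2022 (Reiwa 4)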
Publication type
Publication type VoR
Publication type resource http://purl.org/coar/version/c_970fb48d4fbd8a85
Access rights
Access rights open access
Access rights URI http://purl.org/coar/access_right/c_abf2
ID registration
ID registration 10.18997/00009181
ID registration type JaLC

Versions

Ver.1 2023-05-15 12:33:31.567893


Powered by WEKO3

