

時系列予測モデル・大規模言語モデルによる意味トリプルへの変換における自然言語記述の文法的複雑さの影響に関する研究

https://doi.org/10.18997/0002001057
d298ccd9-64b8-4154-b38e-288e7caf8f35
Name / File | License | Action
sei_k_498.pdf (12.3 MB)
Item type: 学位論文 / Thesis or Dissertation (1)
Publication date: 2024-11-21
Resource type
Resource type identifier: http://purl.org/coar/resource_type/c_db06
Resource type: doctoral thesis
Title
Title: The Effect of the Grammatical Complexity of Natural Language Descriptions in the Translation to Semantic Triples by Sequential Models: Its Performance and Internal Mechanisms
Language: en
Title
Title: 時系列予測モデル・大規模言語モデルによる意味トリプルへの変換における自然言語記述の文法的複雑さの影響に関する研究
Language: ja
Language
Language: jpn
Author: Manu, Shrivastava
Abstract
Description type: Abstract
Description: Industry 5.0 has put machines at the forefront of various industries. Machines are involved in every aspect of human life, making it critical to keep them in proper working condition. These machines are highly complex and rapidly evolving, resulting in a scarcity of highly skilled workers capable of repairing them. One way to bridge this gap is to develop a knowledge-based system that understands machine components, how they work, and the causes of machine failure, and thus helps maintain machines while reducing dependence on expert technicians. These knowledge bases, or ontologies, define concepts, relationships, and properties within a particular domain and can generate new inferences based on predicate logic. Manually extracting these concepts and relationships from raw text is very time-consuming, so in recent years machine learning techniques have been employed for these tasks: a sentence is taken as input, and the concepts and the relations between them are extracted. For these knowledge-based systems to be dependable, the extracted concepts and relationships must be accurate.
The task of extracting concepts and the relationships between them for ontology creation is called the Ontology Population Task (OPT). The OPT can be formulated either as a classification task or as a Neural Machine Translation (NMT) task. In the classification formulation, a sentence is given to a neural network model, and the model identifies the words that belong to a concept or a relation. In the NMT formulation, an input sentence is translated into a Resource Description Framework (RDF) triple. The current work aims to improve the quality of the NMT task. The input sentences to a machine learning model can have different structures, and the impact of these structures on sequential model performance is not well studied. Most Natural Language Processing (NLP) applications are trained on annotated data without regard to the structure of the training sentences; this can leave the training data skewed in terms of sentence structure, so that the distribution of sentences seen in training differs from the distribution encountered in real scenarios. In this work, to improve the quality of concepts and relations extracted from natural text, we analyze the effect of sentence structure on sequential models based on Bidirectional Long Short-Term Memory (BiLSTM) and Transformer architectures. We provide insight into the learning behavior of sequential models using statistical analysis methods such as the Kolmogorov-Smirnov test (KS test) and the Cramér-von Mises test (CvM test). We also evaluate model behavior on the extraction task with the mean-seeking forward Kullback-Leibler Divergence (KLD) and the mode-seeking backward KLD loss functions.
Finally, the thesis contributes mechanisms to improve the quality of concepts and relations extracted from natural text. The performance of a sequential model differs depending on the loss function used for learning; a Modified Jeffreys Divergence (MJD) proposed in this work, which combines the mean-seeking behavior of forward KLD with the mode-seeking behavior of backward KLD, contributes to this quality improvement. The statistical analysis shows that a sequential model's performance is affected by the structure of the sentences used for training, so the training data should contain a proper distribution of different sentence structures. Based on this insight, our proposed Structure Dependent Weighted Loss Function and a mechanism for selecting model checkpoints based on sentence type also improved the performance of the sequential model.
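The abstract contrasts the mean-seeking forward KLD with the mode-seeking backward KLD and combines them into a Modified Jeffreys Divergence. The exact form of the MJD is not given on this page, so the sketch below only illustrates the two KLD directions on discrete distributions and a plain 50/50 Jeffreys-style combination as a stand-in; the distributions `p` and `q` are made-up examples:

```python
import math

def forward_kld(p, q):
    # D_KL(P || Q): mean-seeking; penalizes Q for missing mass where P has mass.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def backward_kld(p, q):
    # D_KL(Q || P): mode-seeking; penalizes Q for placing mass where P has little.
    return forward_kld(q, p)

def jeffreys(p, q, alpha=0.5):
    # Convex combination of the two directions. alpha = 0.5 gives the classic
    # Jeffreys divergence; the thesis's MJD presumably weights or modifies the
    # two terms differently, which this sketch does not attempt to reproduce.
    return alpha * forward_kld(p, q) + (1 - alpha) * backward_kld(p, q)

p = [0.7, 0.2, 0.1]  # "true" distribution (illustrative)
q = [0.5, 0.3, 0.2]  # model distribution (illustrative)
print(forward_kld(p, q))   # mean-seeking direction
print(backward_kld(p, q))  # mode-seeking direction
print(jeffreys(p, q))      # combined divergence
```

Note the asymmetry: the two directions give different values for the same pair of distributions, which is exactly why a combined divergence can trade off mean-seeking and mode-seeking behavior during training.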
Table of contents
Description type: TableOfContents
Description: 1 Introduction | 2 Literature Review | 3 Basic Concept | 4 Methodology and Results | 5 Discussion and Conclusion
Remarks
Description type: Other
Description: Kyushu Institute of Technology doctoral dissertation. Degree register number: 生工博甲第498号. Date of degree conferral: September 25, 2024 (Reiwa 6).
Degree grant number
Degree grant number: 甲第498号 (No. 498)
Degree name
Degree name: 博士(工学) (Doctor of Engineering)
Degree conferral date
Degree conferral date: 2024-09-25
Degree-granting institution
Degree-granting institution identifier scheme: kakenhi
Degree-granting institution identifier: 17104
Degree-granting institution name: 九州工業大学 (Kyushu Institute of Technology)
Language: ja
Academic year of degree conferral
Description type: Other
Description: Academic year Reiwa 6 (2024)
Publication type
Publication type: VoR
Publication type resource: http://purl.org/coar/version/c_970fb48d4fbd8a85
Access rights
Access rights: open access
Access rights URI: http://purl.org/coar/access_right/c_abf2
ID registration
ID registration: 10.18997/0002001057
ID registration type: JaLC
Versions

Ver.1 2024-11-21 04:30:16.671447




Powered by WEKO3

