WEKO3

Opinion Mining by Utilizing Limited Computational Resources and Data

https://doi.org/10.18997/0002001049
eba8a7a1-f3d7-4f08-83c2-76b85e9ea9dd
Name / File License Action
jou_k_402.pdf (1.3 MB)
Item type Thesis or Dissertation(1)
Publication date 2024-11-20
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_db06
Resource type doctoral thesis
Title
Title Opinion Mining by Utilizing Limited Computational Resources and Data
Language en
Title
Title 限られた計算機資源やデータを有効利用した意見マイニング
Language ja
Language
Language jpn
Author Al-Mahmud,
Abstract
Description type Abstract
Description As the Internet’s role in daily communication expands, so does the practice of sharing opinions on platforms like Facebook, Twitter, and Amazon. This proliferation of user-generated content makes manual analysis of online opinion trends impractical, highlighting the need for automated opinion mining techniques. Opinion mining, a subfield of natural language processing (NLP), aims to extract and analyze opinions from textual data and plays a crucial role in market research, product feedback, and public opinion analysis. In NLP research, two primary approaches are used to categorize text: sentence-level and token-level classification. Sentence-level classification assigns entire sentences to predefined classes and is commonly used in sentiment analysis, topic classification, and document classification. Conversely, token-level classification is more granular, assigning labels to individual words or tokens within a text, which is essential for tasks like named entity recognition (NER), part-of-speech (POS) tagging, and fine-grained sentiment analysis.
This thesis addresses opinion mining under limited computational resources (such as memory and processing power) and limited data through four tasks: opinion holder detection, sentiment analysis, aspect-opinion extraction, and intent analysis. In this thesis, opinion holder detection consists of two steps: detecting the presence of opinion holders in the text and identifying them. For the first step, we employ DistilBERT as a feature extractor with logistic regression (LR), namely DistilBERT+LR, which uses limited computational resources while performing better than BERT+LR. The second step employs a character-level contextual string embedding (CSE) model with a conditional random field (CRF), namely CSE+CRF, which uses limited computational resources while exhibiting very competitive performance compared with heavyweight models.
In sentiment analysis for limited Bangla data, we apply stepwise learning with transformer-based models. This technique leverages an auxiliary task with larger datasets to improve performance on the main task with smaller datasets. We also explore the effectiveness of writers’ opinion expression styles (“nativeness”) between source and target data in stepwise learning.
Aspect-opinion extraction focuses on Bangla and addresses a limitation of conventional sentiment analysis, which assumes a single sentiment per text. Aspect-opinion extraction identifies multiple targets and opinions within a text, providing a more granular understanding. We propose a method that combines feature embeddings from different transformer-based models, followed by fine-tuning, to use the limited prepared Bangla data effectively and improve performance.
The intent analysis study extends conventional sentiment analysis by introducing additional classes, such as suggestion and sarcastic, for deeper insights. It handles the task at a granular level, unlike the token-level aspect-opinion extraction, which does not deal with sarcasm. Because no annotated Bangla data exist for this task, we generate data with ChatGPT as auxiliary data. We also prepare a limited set of user-generated Bangla data. We then apply semi-supervised self-training with transformer-based models to exploit the auxiliary data and enhance performance on the prepared user-generated Bangla data.
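The semi-supervised self-training mentioned at the end of the abstract can be illustrated with a minimal, generic sketch. It is not the thesis's implementation: a nearest-centroid classifier on toy 2-D feature vectors stands in for the transformer-based models, and the function names, the confidence threshold of 0.8, and the cap of five rounds are all assumptions made for illustration. The abstract does not say which data play the labeled and unlabeled roles, so the sketch only shows the general loop: train on labeled examples, pseudo-label the confident unlabeled ones, add them to the training set, and repeat.

```python
import math
from collections import Counter


def centroid_fit(X, y):
    """Fit one mean vector (centroid) per class. Stand-in for model training."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) from a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(x, mu)) for c, mu in centroids.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=5):
    """Hypothetical self-training loop; threshold and max_rounds are assumptions."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = centroid_fit(X_train, y_train)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                # Confident prediction: keep it as a pseudo-label.
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break  # Nothing new was learned, or nothing is left to label.
    # Refit on the final training set and return the still-unlabeled examples.
    return centroid_fit(X_train, y_train), pool
```

Calling `self_train(X_lab, y_lab, X_unlab)` returns the final model together with the examples that never reached the confidence threshold, so low-confidence items stay out of the training set instead of adding label noise.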
Table of contents
Description type TableOfContents
Description 1 Introduction | 2 Basic Model and Technique | 3 Dataset Construction and Opinion Holder Detection Using Pre-Trained Models | 4 Demonstration of Effectiveness of Nativeness in Stepwise Learning by Performing Bangla Sentiment Analysis | 5 Dataset Construction and Evaluation for Aspect-Opinion Extraction in Bangla Fine-Grained Sentiment Analysis | 6 Demonstration of Effectiveness of Exploiting ChatGPT-Generated Data to the Transformers-Based Models by Performing Bangla Intent Analysis | 7 Conclusion
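Remarks
Description type Other
Description Kyushu Institute of Technology doctoral dissertation. Diploma number: 情工博甲第402号. Date of degree conferral: 2024-09-25 (Reiwa 6)
Dissertation number
Dissertation number 甲第402号
Degree name
Degree name 博士(情報工学) (Doctor of Information Engineering)
Date of degree conferral
Date of degree conferral 2024-09-25
Degree-granting institution
Degree-granting institution identifier scheme kakenhi
Degree-granting institution identifier 17104
Language ja
Degree-granting institution name 九州工業大学 (Kyushu Institute of Technology)
Academic year of degree conferral
Description type Other
Description Academic year 2024 (Reiwa 6)
Publication type
Publication type VoR
Publication type resource http://purl.org/coar/version/c_970fb48d4fbd8a85
Access rights
Access rights open access
Access rights URI http://purl.org/coar/access_right/c_abf2
ID registration
ID registration 10.18997/0002001049
ID registration type JaLC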
Versions

Ver.1 2024-11-20 02:52:58.582933

Powered by WEKO3

