Dataset Construction and Verification for Detecting Factual Inconsistency in Japanese Summarization
http://hdl.handle.net/10228/0002001155
| Item type | Common Item Type (1) |
|---|---|
| Release date | 2025-01-24 |
| Title | |
| Title | Dataset Construction and Verification for Detecting Factual Inconsistency in Japanese Summarization |
| Language | en |
| Authors | Iwamoto, Keisuke; 嶋田, 和孝 (WEKO 13734) |
| Copyright information | |
| Rights | Copyright (c) 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
| Abstract | |
| Description type | Abstract |
| Description | Abstractive document summarization is one of the most important tasks in natural language processing. Many approaches based on large language models (LLMs) have been proposed. However, the output of LLMs is known to often include hallucinations, such as factual inconsistencies. Therefore, detecting factual inconsistencies in a summary is an important task in summarization. One solution is to use machine learning techniques. In general, machine learning approaches require a large amount of training data to produce a robust model. However, it is difficult to automatically collect article-summary pairs with factual inconsistencies from the Web because hand-written summaries on the Web are usually correct. Moreover, some existing datasets are written in English. In this paper, we propose approaches for automatically constructing Japanese datasets with factual inconsistencies. For this purpose, we utilize two approaches from previous studies: FactCC and SumFC. In addition, we propose a new approach for constructing summaries with exaggerated expressions, as one variety of factual inconsistency. We call the datasets JFactCC, JSumFC, and JExnoS. For JExnoS, we utilize a two-stage approach based on GPT-4 and BART to generate summaries with exaggerated expressions from correct article-summary pairs. We also verify the usefulness of each constructed dataset through a factual inconsistency detection experiment with BERT. |
| Language | en |
| Notes | |
| Description type | Other |
| Description | 2024 16th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), July 6 - 12, 2024, Takamatsu, Japan |
| Language | en |
| Bibliographic information | en : 2024 16th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 243-248, issue date 2024-10-15 |
| Publisher | |
| Publisher | IEEE |
| Keyword | |
| Subject scheme | Other |
| Subject | Factual inconsistency detection |
| Keyword | |
| Subject scheme | Other |
| Subject | Dataset construction |
| Keyword | |
| Subject scheme | Other |
| Subject | GPT-4 |
| Keyword | |
| Subject scheme | Other |
| Subject | BART |
| Keyword | |
| Subject scheme | Other |
| Subject | BERT |
| Language | |
| Language | eng |
| Resource type | |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Resource type | journal article |
| Version type | |
| Version type | AM |
| Version type resource | http://purl.org/coar/version/c_ab4af688f83e57aa |
| DOI | |
| Identifier type | DOI |
| Related identifier | https://doi.org/10.1109/IIAI-AAI63651.2024.00054 |
| ISBN | |
| Identifier type | ISBN |
| Related identifier | 979-8-3503-7790-3 |
| Conference description | |
| Conference name | IIAI International Congress on Advanced Applied Informatics |
| Language | en |
| Conference sequence number | 16 |
| Start year | 2024 |
| Start month | 07 |
| Start day | 06 |
| End year | 2024 |
| End month | 07 |
| End day | 12 |
| Peer reviewed | |
| Value | yes |
| Researcher information | |
| URL | https://hyokadb02.jimu.kyutech.ac.jp/html/196_ja.html |
| Paper ID (linked) | |
| Value | 10443066 |
| Linked ID | |
| Value | 12445 |