WEKO3
アイテム
オンライン文法圧縮に基づく自己索引とそのアプリケーション
https://doi.org/10.18997/00006315
https://doi.org/10.18997/00006315454a47d6-e4fc-467d-8dd6-930098b49d89
| 名前 / ファイル | ライセンス | アクション |
|---|---|---|
|
|
|
| アイテムタイプ | 学位論文 = Thesis or Dissertation(1) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| 公開日 | 2017-08-24 | |||||||||
| 資源タイプ | ||||||||||
| 資源タイプ識別子 | http://purl.org/coar/resource_type/c_db06 | |||||||||
| 資源タイプ | doctoral thesis | |||||||||
| タイトル | ||||||||||
| タイトル | Online Grammar-Based Self-Index and Its Applications | |||||||||
| 言語 | en | |||||||||
| タイトル | ||||||||||
| タイトル | オンライン文法圧縮に基づく自己索引とそのアプリケーション | |||||||||
| 言語 | ja | |||||||||
| 言語 | ||||||||||
| 言語 | eng | |||||||||
| 著者 |
髙畠, 嘉将
× 髙畠, 嘉将
|
|||||||||
| 抄録 | ||||||||||
| 内容記述タイプ | Abstract | |||||||||
| 内容記述 | Text collections including many repetitions of substrings, called highly repetitive text collections, have been increasing in various fields like the version control system and the genome database. For the efficient use of such texts collections, the importance of the compression algorithm and compressed indexes is rapidly increasing more and more. The grammar-based self-indexes are suitable for the problem of information retrieval on a compressed text because they support the random access on the compressed text. However, the constructing algorithms of existing grammar-based self-indexes are offline, that is, these algorithms require the whole input beforehand. Therefore, the memory consumption depends on the size of input text explicitly. In order to overcome this difficulty, we propose the first online algorithm for grammar-based self-index, called Online Edit-Sensitive Parsing index (OESP-index, for short). The proposed algorithm directly transforms the input text into the corresponding variable-length encoded string reading the input symbol one-by-one. Compared to the existing self-indexes, the memory consumption of our online algorithm depends on the size of output, that is, the size of compressed text. Additionally, we also present three applications based on the grammar-based compression: (i) the online pattern matching problem for string edit-distance with moves called online ESP (OESP); (ii) the string index for edit-distance with moves called siEDM; (iii) the online grammar compression for frequent pattern discovery in smaller space. For these applications, we demonstrate the performance of our algorithm for large-scale data. | |||||||||
| 目次 | ||||||||||
| 内容記述タイプ | TableOfContents | |||||||||
| 内容記述 | 1 Introduction||2 Preliminaries||3 Structure of the OESP-index||4 Online pattern matching for string edit distance with moves||5 siEDM: an efficient string index and search algorithm for edit distance with moves||6 Online grammar compression for frequent pattern discovery||7 Conclusions and future work | |||||||||
| 備考 | ||||||||||
| 内容記述タイプ | Other | |||||||||
| 内容記述 | 九州工業大学博士学位論文 学位記番号:情工博甲第322号 学位授与年月日:平成29年3月24日 | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | Grammar-based self-index | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | Grammar Compression | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | Edit distance with moves | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | Frequent pattern discovery | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | Edit-sensitive parsing | |||||||||
| アドバイザー | ||||||||||
| 坂本, 比呂志 | ||||||||||
| 学位授与番号 | ||||||||||
| 学位授与番号 | 甲第322号 | |||||||||
| 学位名 | ||||||||||
| 学位名 | 博士(情報工学) | |||||||||
| 学位授与年月日 | ||||||||||
| 学位授与年月日 | 2017-03-24 | |||||||||
| 学位授与機関 | ||||||||||
| 学位授与機関識別子Scheme | kakenhi | |||||||||
| 学位授与機関識別子 | 17104 | |||||||||
| 学位授与機関名 | 九州工業大学 | |||||||||
| 学位授与年度 | ||||||||||
| 内容記述タイプ | Other | |||||||||
| 内容記述 | 平成28年度 | |||||||||
| 出版タイプ | ||||||||||
| 出版タイプ | VoR | |||||||||
| 出版タイプResource | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |||||||||
| アクセス権 | ||||||||||
| アクセス権 | open access | |||||||||
| アクセス権URI | http://purl.org/coar/access_right/c_abf2 | |||||||||
| ID登録 | ||||||||||
| ID登録 | 10.18997/00006315 | |||||||||
| ID登録タイプ | JaLC | |||||||||