Research on Two-Dimensionalization Algorithms for Improving Emotion Recognition Accuracy in Speech Data and its Evaluation of Generalized Deployment
https://doi.org/10.18997/0002000700
| Item type | Thesis or Dissertation(1) | |||||||
|---|---|---|---|---|---|---|---|---|
| Publication date | 2024-05-28 | |||||||
| Resource type | ||||||||
| Resource type identifier | http://purl.org/coar/resource_type/c_db06 | |||||||
| Resource type | doctoral thesis | |||||||
| Title | ||||||||
| Title | Research on Two-Dimensionalization Algorithms for Improving Emotion Recognition Accuracy in Speech Data and its Evaluation of Generalized Deployment | |||||||
| Language | en | |||||||
| Alternative title | ||||||||
| Alternative title | 感情認識精度向上のための音声データの二次元化アルゴリズムの研究およびその汎用展開の評価 | |||||||
| Language | ja | |||||||
| Language | ||||||||
| Language | eng | |||||||
| Author |
Zijun, Yang
|
|||||||
| Abstract | ||||||||
| Description type | Abstract | |||||||
| Description | With the development of society and the intensification of competition, people face increasing pressure in their daily lives, which has a significant impact on individual mental health. This study explores how this psychosocial health issue can be addressed through speech emotion recognition. Speech, as a natural and intuitive way of expressing emotions, has been found to contain up to 38% of emotional information. Through in-depth sentiment analysis of speech, we can better understand the emotional state of individuals and provide feedback accordingly, thus helping to alleviate the stress they face in life. In this study, we start from speech and use a novel time series analysis method to transform speech time series into 2D images. In this process, we employ Hilbert curves to map the time series to the image space. In this way, we capture the dynamic features of speech in static images, which lays the foundation for subsequent emotion recognition. To recognize speech emotion accurately, we designed a neural network suited to this image representation. This neural network can effectively extract the key features in the image, thus realizing the recognition of different emotions. Through a large number of experiments, we show that our method achieves remarkable results in speech emotion recognition, providing a solid foundation for further research and application. In addition, this study optimizes other time series imaging algorithms. We improved the Gramian Angular Field algorithm by using different downsampling techniques and designed a neural network model for the Gramian Angular Field. This optimization makes our method more versatile and able to adapt to different time series data, providing a wider range of possibilities for future applications. 
In order to understand the individual’s emotional state more comprehensively, this study introduces the CyTex method in the extension of the method and incorporates the concept of speech rate for the segmentation of time series. This approach further improves the accuracy of speech emotion recognition and lays a solid foundation for future applications. In the segmentation processing of time series, we adopt the CyTex method, which effectively divides the time series while maintaining its continuity. This segmentation allows the neural network to learn the emotional information in each time period more precisely. Compared with the traditional holistic learning method, segmentation learning is more capable of capturing the subtle differences of emotional changes in speech, making the recognition results more accurate. At the same time, we introduce the concept of speech rate as a new analytical dimension to be incorporated into the time-series features. Speech rate is not only a surface feature of speech, but it also combines short-time features and rhythmic features to reflect the emotional information in speech more comprehensively. By considering speech rate in segmentation learning, we enable the neural network to be more sensitive to emotional changes in speech, thus improving the accuracy of recognition. Experiments demonstrate that the segmental learning approach, which introduces CyTex and speech rate, performs well in the speech emotion recognition task compared to the traditional holistic learning approach. This provides a more refined and accurate processing means for future speech emotion recognition applications and lays a more solid foundation for practical applications. Therefore, by adopting the CyTex method and introducing the concept of speech rate, we analyze the time series more carefully, which makes our algorithm achieve more satisfactory results in the emotion recognition task. 
This innovative approach provides new perspectives and methods in the field of speech emotion recognition and brings wider possibilities for future research and applications. This research transcends the confines of speech emotion recognition, extending its applicability to the realm of brainwave analysis. The methodologies, initially designed for speech, prove to be versatile as they are successfully applied to brainwave time series, achieving remarkable results in the identification of distinct epileptic seizure types. This breakthrough not only signifies the adaptability and efficacy of the proposed methods but also opens new avenues for applications in neurology and clinical diagnostics. In achieving excellence in epileptic seizure type recognition, the study sets the stage for future endeavors aimed at identifying depressive states and discerning emotional nuances through brainwave analysis. The envisioned expansion of research activities in these directions reflects the commitment to pushing the boundaries of knowledge and practical applications in mental health research. This forward momentum not only enhances our understanding of neurological disorders but also holds promise for the development of novel diagnostic tools and therapeutic interventions. The exploration of brainwave signals emerges as a powerful avenue for gaining profound insights into an individual’s mental state and emotional experiences. Through meticulous analysis of brainwave patterns, this study provides a nuanced understanding of cognitive processes, presenting itself as a valuable tool for researchers in psychology and neuroscience. The nuanced nature of brainwave data offers a rich tapestry of information, shedding light on the intricate interplay of emotions and mental states. In conclusion, this study, with a comprehensive scope spanning speech emotion recognition to brainwave analysis, has reached a pivotal milestone by excelling in epileptic seizure type identification. 
The transformative methodologies introduced in speech analysis seamlessly extend to the realm of brainwave time series, opening up new vistas for exploration. The fusion of innovative approaches with optimized time series imaging algorithms not only enables accurate emotional state recognition but also propels the research landscape into promising territories within neurology and mental health. With a commitment to ongoing research, the study serves as a beacon for future investigations, offering a wealth of tools and insights for understanding, mitigating, and addressing various aspects of individual life stress, mental health, and neurological disorders. |
|||||||
| Table of contents | ||||||||
| Description type | TableOfContents | |||||||
| Description | 1 Introduction| 2 Proposal 1: Speech Emotion Recognition Based on Gramian Angular Field| 3 Proposal 2: Speech Emotion Recognition Based on CyTex and Speech Rate| 4 Proposal 3: Speech Emotion Recognition Based on Hilbert Curve| 5 Applications of the proposed two-dimensionalization algorithm in other fields| 6 Summary and discussion| 7 Acknowledgement| 8 Reference | |||||||
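| Notes | ||||||||
| Description type | Other | |||||||
| Description | Kyushu Institute of Technology doctoral dissertation; degree certificate number: 工博甲第589号; date conferred: 2024-03-25 (Reiwa 6) | |||||||
| Degree number | ||||||||
| Degree number | 甲第589号 | |||||||
| Degree name | ||||||||
| Degree name | Doctor of Engineering (博士(工学)) | |||||||
| Date granted | ||||||||
| Date granted | 2024-03-25 | |||||||
| Degree grantor | ||||||||
| Degree grantor identifier scheme | kakenhi | |||||||
| Degree grantor identifier | 17104 | |||||||
| Degree grantor name | 九州工業大学 (Kyushu Institute of Technology) | |||||||
| Language | ja | |||||||
| Academic year of degree conferral | ||||||||
| Description type | Other | |||||||
| Description | Reiwa 5 academic year (FY2023) | |||||||
| Publication type | ||||||||
| Publication type | VoR | |||||||
| Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |||||||
| Access rights | ||||||||
| Access rights | open access | |||||||
| Access rights URI | http://purl.org/coar/access_right/c_abf2 | |||||||
| Identifier registration | ||||||||
| Identifier registration | 10.18997/0002000700 | |||||||
| Identifier registration type | JaLC | |||||||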
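
The abstract mentions two ways of turning a time series into a 2D image: placing samples along a Hilbert curve, and the Gramian Angular Field. The block below is a minimal sketch of the generic, textbook forms of both transforms, not the thesis's implementation. The function names, the zero-padding of short inputs, and the choice of the summation variant of the Gramian Angular Field are assumptions made for illustration. The thesis's downsampling step and its neural networks are not shown.

```python
import math


def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two. This is the standard iterative
    index-to-coordinate conversion.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def series_to_hilbert_image(series, order):
    """Place consecutive samples along a Hilbert curve of the given order.

    Assumption: inputs shorter than the grid are zero-padded and longer
    inputs are truncated.
    """
    side = 2 ** order
    image = [[0.0] * side for _ in range(side)]
    for d, value in enumerate(series[: side * side]):
        x, y = hilbert_d2xy(side, d)
        image[y][x] = value
    return image


def gramian_angular_summation_field(series):
    """Summation-type Gramian Angular Field: G[i][j] = cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        scaled = [0.0] * len(series)  # constant series: map to mid-range
    else:
        scaled = [(2 * v - hi - lo) / (hi - lo) for v in series]
    # Clamp against rounding error before taking the arccosine.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]
```

Because neighbouring indices on a Hilbert curve are also neighbouring pixels, samples that are close in time stay close in the image, which is the usual motivation for this layout. For example, `series_to_hilbert_image(samples, 5)` would produce a 32 x 32 grid from up to 1024 samples.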