Real-time subject recognition in spoken language using attention-enhanced BiLSTM networks
Abstract
Real-time analysis of linguistic logic during communication, enabled by intelligent speech recognition, is of great significance for the global promotion of a language. This paper examines linguistic structure and investigates subject recognition in spoken Japanese using deep learning methods. First, we analyze the subjects within Japanese linguistic logic and categorize them into four groups for subsequent classification and labeling. Second, we extract Mel-frequency cepstral coefficient (MFCC) features from the speech signal and fuse them with an attention-based BiLSTM network, which provides the basis for high-precision classification. Finally, an activation function is applied to classify subjects in spoken Japanese, yielding a recognition accuracy of 96.1%. This work serves as a stepping stone toward future speech keyword recognition and offers a new perspective for exploring linguistic logical structures.
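The fusion and classification steps summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of attention or the activation function, so the sketch below assumes dot-product scoring against a learned query vector and a softmax output over the four subject categories. All names (`attention_pool`, `classify`, `demo`, `query`, `class_weights`) are hypothetical, and random vectors stand in for the BiLSTM outputs computed from MFCC frames; it is not the authors' implementation.

```python
import math
import random

NUM_CLASSES = 4  # the abstract groups subjects into four categories


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, query):
    """Fuse per-frame vectors into one context vector.

    hidden_states: list of T vectors (stand-ins for BiLSTM outputs).
    query: hypothetical learned vector used to score each frame.
    Returns the weighted sum and the attention weights.
    """
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


def classify(context, class_weights, class_bias):
    """Linear layer followed by softmax (assumed output activation)."""
    logits = [dot(row, context) + b
              for row, b in zip(class_weights, class_bias)]
    return softmax(logits)


def demo(seed=0, frames=5, dim=6):
    """Run the sketch on random stand-in data (untrained parameters)."""
    rng = random.Random(seed)
    hidden_states = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(frames)]
    query = [rng.uniform(-1, 1) for _ in range(dim)]
    class_weights = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(NUM_CLASSES)]
    class_bias = [0.0] * NUM_CLASSES
    context, weights = attention_pool(hidden_states, query)
    probs = classify(context, class_weights, class_bias)
    return weights, probs


if __name__ == "__main__":
    weights, probs = demo()
    print("attention weights:", [round(w, 3) for w in weights])
    print("class probabilities:", [round(p, 3) for p in probs])
```

In this sketch the attention weights sum to one across frames, so the context vector is a convex combination of the per-frame vectors, and the predicted subject category would be the index of the largest class probability. In a trained system the query vector and classifier parameters would be learned jointly with the BiLSTM rather than drawn at random.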