نوع مقاله : مقاله پژوهشی
نویسندگان
1 دانشجوی کارشناسی ارشد، زبانشناسی رایانشی، مرکز زبانها و زبانشناسی، دانشگاه صنعتی شریف
2 استادیار، گروه زبانشناسی رایانشی، مرکز زبانها و زبانشناسی، دانشگاه صنعتی شریف
چکیده
کلیدواژهها
عنوان مقاله [English]
نویسندگان [English]
Spoken language understanding is considered as a specific domain of natural language understanding in which the uttered sentences are not as well-formed as written sentences. In the present paper, a text-based system of spoken language understanding is introduced for ticket reservation domain. This system is developed according to the data-driven approach and its architecture includes two main parts: first, extracting parameters of the model and second, assigning the most likely semantic tags to the sequence of words. "Hidden Markov Model" and "Viterbi" algorithm are applied in order to train the parameters and to tag the sequence of words. For this purpose, a corpus of commonly-used sentences in ticket reservation domain is collected and a specific tag is assigned to each word or a combination of words. In the training step, by using the tagged corpus, a sequence of possible tags is learned for a sequence of various words and in the testing step the most likely tag is assigned to a word or a combination of words according to the probabilities calculated in the previous step. Evaluation of the accuracy of system in recognizing the three key tags of departure, arrival and date is 91%.
کلیدواژهها [English]