Understanding Meaning in a Text-Based Dialogue System for the Ticket Reservation Domain

Jamshidlu, Paria; Bahrani, Mohammad

Article type: Research article

Authors

Paria Jamshidlu ¹

Mohammad Bahrani ²

¹ M.A. student, Computational Linguistics, Languages and Linguistics Center, Sharif University of Technology

² Assistant professor, Computational Linguistics Department, Languages and Linguistics Center, Sharif University of Technology

Abstract

Spoken language understanding is a specific area of natural language understanding in which the sentences uttered by the user do not follow the rules of grammar as closely as written sentences do. This paper introduces a text-based dialogue system for extracting the meaning of conversational sentences in the ticket reservation domain. The system is designed with data-driven methods. Its architecture has two main parts: extracting the model parameters, and assigning the most likely semantic tags to a sequence of words. A hidden Markov model is used for this purpose, and semantic tagging of the word sequence is carried out with the Viterbi algorithm. To this end, a corpus of sentences used in the ticket reservation domain is first collected, and a semantic tag is then assigned to each word or combination of words. In the training stage, the possible tag sequences for different word sequences are learned from the tagged corpus. In the testing stage, the most likely semantic tag for each word or combination of words is found using the probabilities extracted during training. In the experiments carried out, the proposed system reaches 91 percent accuracy in recognizing the three key tags of origin, destination, and date.
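The tagging step described in the abstract can be written compactly. The formulation below is only a sketch under assumptions the abstract does not confirm: a first-order hidden Markov model over semantic tags, with words $w_1, \dots, w_n$, tags $t_1, \dots, t_n$, and a hypothetical start-of-sentence tag $t_0$.

```latex
% Assumption: first-order HMM; t_0 is a hypothetical start-of-sentence tag.
% Decoding objective: the most likely tag sequence for the observed words.
\hat{t}_{1:n} = \arg\max_{t_{1:n}} \prod_{i=1}^{n} P(t_i \mid t_{i-1})\, P(w_i \mid t_i)

% Viterbi recursion for i = 1, \dots, n, initialised with \delta_0(t_0) = 1:
\delta_i(t) = P(w_i \mid t)\, \max_{t'} \delta_{i-1}(t')\, P(t \mid t')
```

Here $P(t_i \mid t_{i-1})$ and $P(w_i \mid t_i)$ stand for the transition and emission probabilities that would be estimated from the tagged corpus in the training stage, and $\delta_i(t)$ is the score of the best tag path that ends in tag $t$ at position $i$.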

Keywords

meaning understanding

dialogue system

data-driven method

hidden Markov model

Viterbi algorithm

Article title (English)

Understanding Meaning in a Text-Based Dialogue System for Specific Domain of Ticket Reservation

Authors (English)

Paria Jamshidlu ¹

Mohammad Bahrani ²

¹ M.A. student, Computational Linguistics, Languages and Linguistics Center, Sharif University of Technology

² Assistant professor, Computational Linguistics Department, Languages and Linguistics Center, Sharif University of Technology

Abstract (English)

Spoken language understanding is a specific area of natural language understanding in which uttered sentences are not as well-formed as written ones. This paper introduces a text-based spoken language understanding system for the ticket reservation domain. The system follows a data-driven approach, and its architecture has two main parts: first, extracting the parameters of the model, and second, assigning the most likely semantic tags to a sequence of words. A hidden Markov model and the Viterbi algorithm are applied to train the parameters and to tag the sequence of words, respectively. For this purpose, a corpus of commonly used sentences in the ticket reservation domain is collected, and a specific tag is assigned to each word or combination of words. In the training step, the possible tag sequences for various word sequences are learned from the tagged corpus; in the testing step, the most likely tag is assigned to each word or combination of words according to the probabilities calculated in training. In evaluation, the system recognizes the three key tags of departure, arrival, and date with 91% accuracy.
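The two stages named in the abstract, estimating the model parameters from a tagged corpus and then tagging with Viterbi, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the toy English corpus, the tag names (`FROM`, `TO`, `ON`, `ORIGIN`, `DEST`, `DATE`), the first-order model, and the add-one smoothing.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence tag

# Hypothetical toy corpus of (word, tag) pairs; the paper's corpus and tag set differ.
TRAIN = [
    [("from", "FROM"), ("tehran", "ORIGIN"), ("to", "TO"),
     ("shiraz", "DEST"), ("on", "ON"), ("friday", "DATE")],
    [("to", "TO"), ("tabriz", "DEST"), ("from", "FROM"),
     ("shiraz", "ORIGIN"), ("on", "ON"), ("monday", "DATE")],
    [("from", "FROM"), ("tabriz", "ORIGIN"), ("to", "TO"), ("tehran", "DEST")],
]

def train_hmm(sentences):
    """Training stage: count tag-to-tag transitions and tag-to-word emissions."""
    trans, emit, vocab = defaultdict(Counter), defaultdict(Counter), set()
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab

def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (the smoothing choice is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def viterbi(words, trans, emit, tags, vocab):
    """Testing stage: return the most likely tag sequence for `words`."""
    if not words:
        return []
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unseen words
    # best[i][t]: best log score of any tag path ending in tag t at position i
    best = [{t: log_prob(trans[START], t, n_tags)
                + log_prob(emit[t], words[0], n_words) for t in tags}]
    back = [{t: None for t in tags}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + log_prob(trans[p], t, n_tags)) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + log_prob(emit[t], words[i], n_words)
            back[i][t] = prev
    # Follow back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

MODEL = train_hmm(TRAIN)
sentence = "to shiraz from tehran on friday".split()
print(list(zip(sentence, viterbi(sentence, *MODEL))))
```

In this toy setup the cue words "from" and "to" carry their own tags, so the transition probabilities can separate origin from destination even though the same city names appear under both tags. Whether the paper's tag set works this way is not stated in the abstract.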

Keywords (English)

natural language understanding

spoken dialogue system

data-driven approach

Hidden Markov Model

Viterbi algorithm