54 Matching Annotations
  1. Mar 2019
  2. arxiv.org arxiv.org
    1. EVALUATING PREREQUISITE QUALITIES FOR LEARNING END-TO-END DIALOG SYSTEMS

    1. The goal here is explicitly not to improve the state of the art in the narrow domain of restaurantbooking, but to take a narrow domain where traditional handcrafted dialog systems are known toperform well, and use that to gauge the strengths and weaknesses of current end-to-end systemswith no domain knowledge

      本文的目标不是来提升在狭窄的酒店预定领域的效果,而是用一个传统的手工系统就有较好系统来对比没有领域知识的end-to-end系统的优劣。

      MEMORYNETWORKS

    2. Unsurprisingly, perfectly coded rule-based systems can solve the simulated tasks T1-T5 perfectly,whereas our machine learning methods cannot. However, it is not easy to build an effective rule-based

      最终结果说明,在给出的任务中基于规则的毫无疑问效果比模型的好,但是对于在真实场景的真实问题来说,MemNN效果更好

    3. SUPERVISEDEMBEDDINGMODELS

      和现在的架构很像

    4. We implemented a rule-based system for this task in the followingway. We initialized a dialog state using the 3 relevant slots for this task: cuisine type, location andprice range. Then we analyzed the training data and wrote a series of rules that fire for triggers likeword matches, positions in the dialog, entity detections or dialog state, to output particular responses,API calls and/or update a dialog state. Responses are created by combining patterns extracted fromthe training set with entities detected in the previous turns or stored in the dialog state. Overall webuilt 28 rules and extracted 21 patterns. We optimized the choice of rules and their application priority(when needed) using the validation set, reaching a validation per-response accuracy of 40.7%. Wedid not build a rule-based system forConciergedata as it is even less constrained.

      先用word匹配和正则等制定一个规则系统来作为baseline.

    5. LEARNING END-TO-END GOAL-ORIENTED DIALOG

    1. A Network-based End-to-End Trainable Task-oriented Dialogue System

      这个end-to-end的系统,在意图识别的阶段用的是cnn+LSTM 在状态管理(belief state tracking)也用的LSTM,在policy的时候自定义了一套算法,将前面的几个输出向量做了个线性模型,输出。

    2. Finally, the policy network output is generated bya three-way matrix transformation,

      策略生成是用前面的特征向量加和乘积

    3. a distributed representationgenerated by an intent network and a probabilitydistribution over slot-value pairs called the beliefstate

      造出来的一个belief state的概念:

      由intent网络生成的分布式表示和对slot-value组的概率表示叫做belief stat。

    1. An End-to-End Trainable Neural Network Model withBelief Tracking for Task-Oriented Dialog

    1. In learning such neural network based dialogmodel, we propose hybrid offline training and on-line interactive learning methods. We first let theagent to learn from human-human conversationswith offline supervised training. We then improvethe agent further by letting it to interact with usersand learn from user demonstrations and feedbackwith imitation and reinforcement learning.

      模型训练思路:

      • 1 首先离线有监督学习 人和人的对话数据
      • 2 然后让模型和人交互,基于反馈和模仿用强化学习来学习

      为了解决样本效率问题,提出了learning-from-user and learning-from-simulationl两个方案。

    2. We design neural net-work based dialog system that is able to ro-bustly track dialog state, interface with knowl-edge bases, and incorporate structured queryresults into system responses to successfullycomplete task-oriented dialog.

      基于神经网络的端到端的网络模型能够健壮的跟踪对话状态,和知识库交互,用结构化的信息来成功的完成任务驱动型对话。

    3. End-to-End Learning of Task-Oriented Dialogs

      端到端的task类型对话的鼻祖

    1. A potential draw-back with such pre-training approach is that themodel may suffer from the mismatch of dialoguestate distributions between supervised training andinteractive learning stages. While interacting withusers, the agent’s response at each turn has a di-rect influence on the distribution of dialogue statethat the agent will operate on in the upcoming di-alogue turns.

      策略学习也是对话过程很重要的一环。 最近的策略学习过程有用基于有监督的预训练然后线上强化学习再训练的来提高学习的方案。但是这种方案有个潜在的毛病,在离线的数据中受限于数据量,线上一旦碰到了不常见的情况,容易直接恢复不来。(这个问题应该只是推断吧?有什么实证么?)

      所以本文其实想说的是用一种方法来减轻线上和离线的差距。

    2. These system components areusually trained independently, and their optimiza-tion targets may not fully align with the overallsystem evaluation criteria (e.g. task success rateand user satisfaction). Moreover, errors made inthe upper stream modules of the pipeline propa-gate to downstream components and get amplified,making it hard to track the source of errors

      传统pipeline方案的问题点: 1 流程比较复杂,每步骤独立训练,但是流程输入和输出有依赖,错误放大,难以跟进。

    3. Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

      一个混合学习过程,在人类的指导教育和反馈下增强强化学习的过程

    1. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling

      用一个模型来解决两个不同类型的问题,intent detect是分类,填槽是序列标注。都用基于attention机制的RNN来搞定了

    2. The attentionmechanism later introduced in [12] enables the encoder-decodermodel to learn a soft alignment and to decode at the same time.

      本文中用到的attention-RNN算法。

      D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine trans-lation by jointly learning to align and translate,”arXiv preprintarXiv:1409.0473, 2014

    1. We present a general solution towards building task-orienteddialogue systems for online shopping, aiming to assist on-line customers in completing various purchase-related tasks,such as searching products and answering questions, in a nat-ural language conversation manner. As a pioneering work, weshow what & how existing natural language processing tech-niques, data resources, and crowdsourcing can be leveragedto build such task-oriented dialogue systems for E-commerceusage. To demonstrate its effectiveness, we integrate our sys-tem into a mobile online shopping application. To the bestof our knowledge, this is the first time that an dialogue sys-tem in Chinese is practically used in online shopping scenariowith millions of real consumers. Interesting and insightful ob-servations are shown in the experimental part, based on theanalysis of human-bot conversation log. Several current chal-lenges are also pointed out as our future directions

      整体来说,无法验证,没有任何实质的创新点。

      说是构建了一个第一个中文电商机器人对话系统(really?)

      M = (I, C, A)

      I是intent,C是product category, A是商品attribute。 M是根据用户Query得到的信息的表示。

      意图分类:PhraseLDA 1000个topic

      产品分类: a CNN-based approach that resembles (Huang et al. 2013)and (Shen et al. 2014

    2. System Formalization

      整个对话系统的格式化定义还是比较有意思

    1. Retrieval-based MethodsRetrieval-based methods choose a response from candidateresponses. The key to retrieval-based methods is message-response matching. Matching algorithms have to overcomesemantic gaps between messages and responses [28].

      基于检索的是从候选的回复中选出一个。检索式的关键是message-response的匹配。

      B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neu-ral network architectures for matching natural lan-guage sentences. InAdvances in neural informationprocessing systems, pages 2042–2050, 2014.

      单轮的匹配 match(X,Y) = X^TAy

      X:message的向量表示, y:回复的向量表示。

      H. Wang, Z. Lu, H. Li, and E. Chen. A dataset for re-search on short-text conversations. InProceedings ofthe 2013 Conference on Empirical Methods in NaturalLanguage Processing, pages 935–945, Seattle, Wash-ington, USA, October 2013. Association for Compu-tational Linguistics

      Z. Lu and H. Li. A deep architecture for matchingshort texts. InInternational Conference on Neural In-formation Processing Systems, pages 1367–1375, 2013.

      B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neu-ral network architectures for matching natural lan-guage sentences. InAdvances in neural informationprocessing systems, pages 2042–2050, 2014

      M. Wang, Z. Lu, H. Li, and Q. Liu. Syntax-based deepmatching of short texts.InIJCAI, 03 2015

      Y. Wu, W. Wu, Z. Li, and M. Zhou. Topic augmentedneural network for short text conversation.CoRR,2016

      多轮匹配

    2. TASK-ORIENTED DIALOGUESYSTEMSTask-oriented dialogue systems have been an important branchof spoken dialogue systems. In this section, we will reviewpipeline and end-to-end methods for task-oriented dialoguesystems.

      任务型对话系统整体来说可以分为两类:

      • 1 pipeline,也就是包含SLU+DST+PL+NLG
      • 2 end-to-end
    3. 2.2 End-to-End Methods

      在传统的task-oriented对话系统中,尽管有很多特定领域的人工定制,很难推广其他领域,更进一步的是pipeline的方法有两个局限。

      • 一个是信用分配问题,一个用户的反馈很难传播到上游每个组件中。
      • 另一个是问题流程的相互依赖。一个组件的输入依赖上一个组件的输出。一部分变动其他都得动。(这个真的是问题么?)

      这俩文章介绍来一种基于网络的end-to-end的可训练的task-oriented对话系统,方法是把对话系统学习看成从对话历史到回复响应的mapping,并用encoder-decoder模型来训练整个模型。不过这个系统是以有监督的方式训练的,不仅需要大量的训练数据,并且由于在训练数据中缺乏对对话控制的探索也不能找到一个鲁棒的好策略。

      • A network-based end-to-end trainable task-oriented di-alogue system
      • Learningend-to-end goal-oriented dialog.

      下文中,首次提出了一个联合训练dialogue state tracking和policy learning来优化得到更鲁棒的系统行为。

      • Towards end-to-end learn-ing for dialog state tracking and management us-ing deep reinforcement learning

      task-oriented系统经常需要query外部知识库,前面的系统是通过发出一个符号请求到知识库基于属性来获得条目。

    4. TASK-ORIENTED DIALOGUESYSTEMS

      一个典型的pipeline方法构建的task-oriented对话系统包含四部分:

      • Language understanding.NLU/SLU,目标是解析理解用户输入为intent,slot

      • Dialogue state tracker. 根据当前对话输入信息结合历史信息给出当前会话状态。

      • Dialogue policy learning.基于当前对话状态给出接下来要采取的行动

      • Natural language generation(NLG). 将映射的选择的动作行为转换生成对应的输出回复。

    5. 2.1.3 Policy learning

      策略学习 基于前面state tracker的状态表示,策略学习(policy learning)是来生成下一个可用的系统行动。无论是监督学习或者强化学习都可以用来优化策略学习。 H. Cuayhuitl, S. Keizer, and O. Lemon. Strategic di-alogue management via deep reinforcement learning.arxiv.org, 2015.

      通常都用一个基于规则的agent来初始化系统。 Z. Yan, N. Duan, P. Chen, M. Zhou, J. Zhou, andZ. Li. Building task-oriented dialogue systems for on-line shopping. InAAAI Conference on Artificial Intel-ligence, 2017

      然后用监督学习来基于规则生成的规则来学习。Building task-oriented dialogue systems for on-line shopping. 强化学习,Strategic di-alogue management via deep reinforcement learning.结果据说比很多系统,rule based,superviesed都好

    6. A statistical dialog system

      状态管理。

      统计对话系统维护了一个对真实状态基于多重假设来描述的分布,以应对噪声场景和歧义。

      • S. Young, M. Gai, S. Keizer, F. Mairesse, J. Schatz-mann, B. Thomson, and K. Yu. The hidden informa-tion state model: A practical framework for pomdp-based spoken dialogue management. 在DSTC比赛中结果形式是每轮对话中每个slot的一个概率分布。各种统计学方法如下:
      • 规则集, Z. Wang and O. Lemon. A simple and generic belieftracking mechanism for the dialog state tracking chal-lenge: On the believability of observed information. InSIGDIAL Conference, pages 423–432, 2013
      • CRF S. Lee and M. Eskenazi. Recipe for building robustspoken dialog state trackers: Dialog state trackingchallenge system description. InSIGDIAL Conference,pages 414–422, 2013

        S. Lee. Structured discriminative model for dialogstate tracking. InSIGDIAL Conference, pages 442–451, 2013

      H. Ren, W. Xu, Y. Zhang, and Y. Yan. Dialog statetracking using conditional random fields. InSIGDIALConference, pages 457–461, 2013.

      • maximum entropy model J. Williams. Multi-domain learning and generaliza-tion in dialog state tracking. InSIGDIAL Conference,pages 433–441, 2013.

      • web-style ranking J. D. Williams. Web-style ranking and slu combina-tion for dialog state tracking

      深度学习的状态管理。用一个滑动窗口来在任意数量可能值上输出一个概率序列。 M. Henderson, B. Thomson, and S. Young. Deep neu-ral network approach for the dialog state tracking chal-lenge. InProceedings of the SIGDIAL 2013 Confer-ence, pages 467–471, 2013

      多领域的RNN状态跟进模型: B. Thomson, M. Gasic, P.-H. Su, D. Vandyke, T.-H. Wen, and S. Young. Multi-domain dialog state tracking using recurrent neuralnetworks.

      基于neural belief tracker(NBT)来检测slot-value对。 Neural belief tracker: Data-driven dia-logue state tracking.

    7. Dialogue State Tracking

      跟进对话状态是保障dialog system的robust的核心。主要目标是预测每轮对话的用户目标。经典的状态结构通常叫做slot-filling 或者 sematic frame.

      传统用手工规则的方法: D. Goddeau, H. Meng, J. Polifroni, S. Seneff, andS. Busayapongchai. A form-based dialogue managerfor spoken language applications. InSpoken Language,1996. ICSLP 96. Proceedings., Fourth InternationalConference on, volume 2, pages 701–704. IEEE, 1996

      基于规则的方法倾向于常见的错误,然后很多结果并不是想要的。 J. D. Williams. Web-style ranking and slu combina-tion for dialog state tracking. InSIGDIAL Conference,pages 282–291, 2014

    8. Slot filling

      填槽这个问题更多的是看成一个序列标注的问题。句子中的每个词都打上一个语义标签。输入是由词组成的句子,输出是每个词对应的slot/concept IDs.

      DBN 类的处理:

      • A Deoras and R. Sarikaya. Deep belief network basedsemantic taggers for spoken language understanding.

        L. Deng, G. Tur, X. He, and D. Hakkani-Tur. Use ofkernel deep convex networks and end-to-end learningfor spoken language understanding

      RNN:

      • G. Mesnil, X. He, L. Deng, and Y. Bengio. Investi-gation of recurrent-neural-network architectures andlearning methods for spoken language understanding.Interspeech, 2013.
      • K. Yao, G. Zweig, M. Y. Hwang, Y. Shi, and D. Yu.Recurrent neural networks for language understand-ing. InInterspeech, 2013
      • R. Sarikaya, G. E. Hinton, and B. Ramabhadran.Deep belief nets for natural language call-routing
      • K. Yao, B. Peng, Y. Zhang, D. Yu, G. Zweig, andY. Shi. Spoken language understanding using longshort-term memory neural networks. InIEEE Insti-tute of Electrical & Electronics Engineers, pages 189 –194, 2014
    9. Language Understanding

      目标是根据一个用户utterance/query 得到其对应的语义slot。slots是预先根据场景定于的。通常来说有两种类型的表示,一个是句子级别的类别,例如用户的意图和utterance的类别。另外一个是单词级别的信息抽取,例如命名实体和槽位填充。

      意图识别是根据一句话来检测用户的意图。 基于深度学习的意图识别: L. Deng, G. Tur, X. He, and D. Hakkani-Tur. Use ofkernel deep convex networks and end-to-end learningfor spoken language understanding. InSpoken Lan-guage Technology Workshop (SLT), 2012 IEEE, pages210–215. IEEE, 2012

      G. Tur, L. Deng, D. Hakkani-T ̈ur, and X. He. Towardsdeeper understanding: Deep convex networks for se-mantic utterance classification. InAcoustics, Speechand Signal Processing (ICASSP), 2012 IEEE Interna-tional Conference on, pages 5045–5048. IEEE, 2012.

      D. Yann, G. Tur, D. Hakkani-Tur, and L. Heck. Zero-shot learning and clustering for semantic utteranceclassification using deep learning. 2014.

      尤其是这个用CNN来抽取query vector进行query分类。 H. B. Hashemi, A. Asiaee, and R. Kraft. Query intentdetection using convolutional neural networks. InIn-ternational Conference on Web Search and Data Min-ing, Workshop on Query Understanding, 2016

      P.-S. Huang, X. He, J. Gao, L. Deng, A. Acero, andL. Heck. Learning deep structured semantic modelsfor web search using clickthrough data. InProceedingsof the 22nd ACM international conference on Confer-ence on information & knowledge management, pages2333–2338. ACM, 2013

      Y. Shen, X. He, J. Gao, L. Deng, and G. Mesnil.Learning semantic representations using convolutionalneural networks for web search. InProceedings of the23rd International Conference on World Wide Web,pages 373–374. ACM, 2014.

    1. The first step is lexical analysis, i.e. word segmentation and part-of-speech (POS)tagging. The words and POS labels are used as features in the subsequent models. Forthe shared task we used HanLP [1] as our Chinese lexical analyzer.

      SLU 模型做法:

      • 1 第一步是词汇分析,也就是分词,然后词性标注。本文用的是HanLP做词性分析。

      • 2 第二步是槽位边界检测。这个任务看成一个用BILOU进行序列标注的。我们用了基于字和词的序列标注。基于字的 版本是用一个window为7的CRF,用此法特征和词典特征,另外基于词的的CRF模型是window size为5的词法特征,词性特征和词典特征。词典特征是指“当前字词是否 prefix/infix/suffix 在实体词典中某个条目关系。”每个CRF输出n(3)个输出,这整个2n个输出用到下一步。用基于字的序列标注是为了弥补分词效果差带来的可能影响。

      • 3 第三部是槽位类型识别。用的是LR+L正则分类器,预测出的slot,上下文的字词,上下文的词性标注作为特征。

      • 4 第四步是槽位纠正。这个是为了解决因为ASR导致的错误识别造成的结果。用的是一个基于搜索的方法。鉴于已经有各种槽位类型的词典,如果一个预测出来的槽位s类型T没有在对应的槽位词典中,那么就用s作为查询词来在根据最小编辑距离来查询槽位词典中的记录。这个操作会进行两次,一个是s作为中文字符,另一个是s作为拼音来查询。最好的结果是从这两个查处的结果中重新排序后得到的。

      • 5 最后一步是意图分类。用的是XGBoost及其默认参数。用到的特征是单词token,query length,以及前面步骤预测出来的槽位。

    2. Each rule is of the form “if thequery q is listed in a particular lexicon L, and the preceding queries and their predicteddomain labels satisfy certain conditions, then q is assigned a certain intent label and,with the exception of short commands, the entire q is regarded as a slot of the typecorresponding to L.” The rules are arranged in sequential order in accordance with theirpriorities

      规则的具体形式是,"如果query q被列在了一个特定的词汇表L,并且其前面的queries和它们预测出来的领域标签满足特定条件,那么q就可以被打上一个特定的意图标签,并且对于短的命令来说,整个query q是当作对应于L的一个槽位类型".所有规则按照优先级顺序组织的。

    3. Figure 1 shows the framework of our SLU system, which consists of the context-dependent rules for entity-only queries and the context-independent model for querieswith IISPs. The entire system feeds the query to the rules first. If the rule-based compo‐nent returns null result, that means the query is judged to contain IISPs and the model-based component will continue to process it. Otherwise, it means the query is regardedas entity-only and the result of the rules is returned as the final output

      一个query首先经过基于规则的无明显意图词的判定过程,如果是空的话那就意味含有IISPs基于模型的组件会继续来处理,否则的话也就意味着query被看作是只有实体的,那么规则的结果就作为最终结果直接返回。

    4. s in real use cases of dialog systems, the queries in the shared task can be roughlydivided into two kinds, viz. queries with intent-indicating salient phrases and querieswithout. By intent-indicating salient phrase (IISP) it is meant a phrase in the query thatshows the intent of the query. E.g. the phrase “” in the query “” andthe phrases “” in the query “” are IISPs.

      可以把预料文本分成2类,一类是有明显的预示意图的词语,另一类是没有。

  3. Feb 2019
    1. Spoken language understanding (SLU) comprises two tasks, intent identification andslot filling. That is, given the current query along with the previous queries in the samesession, an SLU system predicts the intent of the current query and also all slots (entitiesor labels) associated with the predicted intent. The significance of SLU lies in that eachtype of intent corresponds to a particular service API and the slots correspond to theparameters required by this API. SLU helps the dialog system to decide how to satisfythe user’s need by calling the right service with the right information

      SLU有俩事,意图识别+填槽。

      实践中的困难:

      • 1 意图分类的复杂性
      • 2 世界知识
      • 3 用户状态
    1. 对话管理也可以看成是一个分类任务,即每个对话状态和一个合适的对话动作相对应.和其它有监督的学习任务一样,分类器可以从标注的语料库中训练得到.但是,在某状态下系统应该选择的动作不能仅仅是模仿在训练数据中同一状态对应的动作,而应该是选择合适的动作能够导致一个成功的对话.因此,把对话过程看成是一个决策过程更为合适,从而根据对话的整体成功来优化动作的选择过程[32].因而这是一个规划问题,并且可以用强化学习[33]方法学习获得最优的结果
    2. 对话系统从本体构成和业务逻辑角度,可分为领域任务型和开放型的信息交互.领域任务型系统针对具体应用领域,具有比较清晰的业务语义单元的定义、本体结构以及用户目标范畴,例如航班查询、视频搜索、设备控制等等,这类交互往往是以完成特定的操作任务作为交互目标;而开放型信息交互则不针对特定领域,或说面向非常广泛的领域,交互目标并非业务任务,而是满足用户其它方面的需求,例如开放的百科问答、聊天等.它虽然能一定程度上显示人工智能的力量,但因其并不专注于帮助人解决现实任务问题,其实际使用范围较为狭窄.近年来,随着移动终端的高速发展,面向任务的自然人机对话系统和相关的认知控制理论得到了越来越多的学术和产业界重视,这也是本文讨论的重点
    1. We com-plement recent work by showing the effec-tiveness of simple sequence-to-sequenceneural architectures with a copy mecha-nism. Our model outperforms more com-plex memory-augmented models by 7% inper-response generation and is on par withthe current state-of-the-art on DSTC2, areal-world task-oriented dialogue dataset

      用一个带有copy机制的简单seq2seq框架超过现有最好的真实DSTC2 7个点。

    1. Both NLU and NLG are implementedwith template-based models

      这个地方的NLU和NLG都是用基于模版的模型。

    2. Symptom ExtractionWe follow the BIO(begin-in-out) schema for symptom identification(Figure 1). Each Chinese character is assigned alabel of ”B”, ”I” or ”O”. Also, each extractedsymptom expression is tagged withTrueorFalseindicating whether the patient suffers from thissymptom or not. In order to improve the anno-tation agreement between annotators, we createtwo guidelines for the self-report and the conver-sational data respectively. Each record is anno-tated by at least two annotators. Any inconsis-tency would be further judged by the third one.The Cohen’s kappa coefficient between two anno-tators are71%and67%for self-reports and con-versations respectively

      症状数据抽取,BIO格式。每个中文字符标注为“B","I","O".每个抽取出的症状根据病人真实情况打标为“True","False"。3人2个都标过的才有效,第三人评判。Cohhen kappa 相关性来作为标注标准。

    3. In this paper, we make a move to builda dialogue system for automatic diagno-sis. We first build a dataset collected froman online medical forum by extractingsymptoms from both patients’ self-reportsand conversational data between patientsand doctors. Then we propose a task-oriented dialogue system framework tomake the diagnosis for patients automat-ically, which can converse with patients tocollect additional symptoms beyond theirself-reports. Experimental results on ourdataset show that additional symptoms ex-tracted from conversation can greatly im-prove the accuracy for disease identifica-tion and our dialogue system is able tocollect these symptoms automatically andmake a better diagnosis

      In this paper, we make a move to builda dialogue system for automatic diagno-sis. We first build a dataset collected froman online medical forum by extractingsymptoms from both patients’ self-reportsand conversational data between patientsand doctors. Then we propose a task-oriented dialogue system framework tomake the diagnosis for patients automat-ically, which can converse with patients tocollect additional symptoms beyond theirself-reports. Experimental results on ourdataset show that additional symptoms ex-tracted from conversation can greatly im-prove the accuracy for disease identifica-tion and our dialogue system is able tocollect these symptoms automatically andmake a better diagnosis

      从一个在线医疗论坛抽取来病人的病情自述以及和医生的对话过程作为训练数据,结果表明从对话过程获得的病情描述能大幅提高医生对疾病的诊断,并且论文的对话系统能够有效的收集到这些信息帮助诊断。

    1. PyDial: A Multi-domain Statistical Dialogue System Toolkit

      一个开源的端到端的统计对话系统工具。

      其总的架构包含Sematic Decode,Belief Tracker,Policy Reply System,Language generator. 整体来说整个系统都支持了基于规则的判断过程,也融合了模型的支持。源码值得一看的。

    1. the dataset used in our experiment hasonly the tags of filled information slots extracted by patternmatching between dialogue log and final order information

      用到的数据集是一个coffee ordering的对话过程的数据,31567通对话,142412对会话。数据只有用正则匹配出来的填充的标签信息。

    2. the agent model swaps the in-put and output sequences, and it also takes the tag of filledinformation slots as an input which is extracted from dia-logue in previous turns by pattern matching with the orderinformation in ground truth

      agent model 构建前先预训练。网络结构和user model一样,但是输入和输出反转,同时也把之前对话中已经填充的槽位信息作为输入。但是这俩部分信息并不是简单的直接拼接在一起,而是来学习适合的attention 权重来更好的利用注意力机制。此外任何其他额外的语义意图标签都不必用。

    3. By directly learningfrom the raw dialogue logs, the network takes the agent ut-Figure 2: The network structure: encoder-decoder structurewith the attention mechanismteranceXa:xa1;xa2;:::;xanas the input sequence and takescorresponding user utteranceYu:yu1;yu2;:::;yumas the tar-get sequence.

      User model.直接用双向的LSTM,以agent的utterance作为X,对应的用户的utterance作为Y。

    4. In the task-oriented dialogues, a user usually firstly showsthe intention to the agent and then answers the agent’s ques-tions one by one to specify the demand.

      这个认识是说在通常场景中用户先表达出意图然后回答agent的一个个问题来具体自己的需求。

      用户通常是被动的,偶尔的有一轮问题。换句话说用户基本上都是在一轮中回答由agent提出的问题。所以可以基于一个用户只需要考虑一轮回答来给出回复这样的假设来构建user model,,让agent model来处理多轮对话。

    5. we propose a uSer andAgent Model IntegrAtion (SAMIA) framework inspired byan observation that the roles of the user and agent models areasymmetric. Firstly, this SAMIA framework model the usermodel as a Seq2Seq learning problem instead of ranking ordesigning rules. Then the built user model is used as a lever-age to train the agent model by deep reinforcement learning.In the test phase, the output of the agent model is filtered bythe user model to enhance the stability and robustness. Ex-periments on a real-world coffee ordering dataset verify theeffectiveness of the proposed SAMIA framework.

      吐槽现有机器人比较low都是手工规则,强化学习只适用有限的几个场景。所以受用户和agent角色的不对称关系造了samia。首先是用户模型不是规则或者排序而是seq2seq,然后基于用户模型来用强化学习构建agent。