12 Matching Annotations
  1. Dec 2019
    1. To gain insight in what aspects of the simplification process are challenging to our model, we present the most recurring types of errors from our test set

      Question 9: What suggestions and limitations do the authors suggest?

      The authors provide an error analysis to discuss the limitations of their model. They note that the model does not perform well on long sentences, does not always resolve pronouns correctly, and sometimes changes only the simple part of the sentence.

    2. Average ratings of crowdsourced human judgments on fluency, adequacy and complexity. Ratings significantly different from S2S-All-FA are marked with * (p < 0.05); statistical significance tests were calculated using a Student t-test. We provide 95% confidence intervals for each rating in the appendix

      Question 4: What are the criteria for solving this problem?

      The criterion for success is to outperform existing systems on existing datasets, both in terms of SARI, an important automatic evaluation metric, and in human evaluation. This table presents the human evaluation results.
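
      To illustrate the significance test mentioned in the table caption, here is a sketch of the independent two-sample Student t-test (equal variances) on invented 1-5 ratings; the rating values below are made up for illustration, not the paper's data.

      ```python
      import math
      import statistics

      def students_t(a, b):
          """Two-sample Student t statistic with pooled variance."""
          na, nb = len(a), len(b)
          pooled_var = ((na - 1) * statistics.variance(a)
                        + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
          se = math.sqrt(pooled_var * (1 / na + 1 / nb))
          return (statistics.mean(a) - statistics.mean(b)) / se

      # Hypothetical 1-5 fluency ratings for two systems
      ratings_a = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
      ratings_b = [3, 3, 4, 2, 3, 4, 3, 2, 3, 3]
      t = students_t(ratings_a, ratings_b)

      # Two-tailed critical value for df = 18 at p = 0.05 is about 2.101,
      # so |t| > 2.101 would earn the * marker in the table
      print(f"t = {t:.3f}, significant: {abs(t) > 2.101}")
      ```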

    3. Comparison of our models to baselines and state-of-the-art models using SARI. We also include oracle SARI scores (Oracle), given a perfect reranker. S2S-All-FA is significantly better than the DMASS and Hybrid baselines using a Student t-test (p < 0.05)

      Question 4: What are the criteria for solving this problem?

      The criterion for success is to outperform existing systems on existing datasets, both in terms of SARI, an important automatic evaluation metric, and in human evaluation. This table presents the automatic evaluation results.
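
      For reference, SARI compares the system output against both the source sentence and the reference simplifications, scoring the words that are added, kept, and deleted. Below is a simplified, unigram-only, single-reference sketch of the idea; the real metric averages n-gram scores up to 4-grams over multiple references and weights precision and recall differently.

      ```python
      def sari_unigram(source, output, reference):
          """Toy unigram SARI: F1 for add/keep, precision for delete."""
          src, out, ref = set(source.split()), set(output.split()), set(reference.split())

          def f1(prec, rec):
              return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

          # ADD: words in the output that are not in the source
          add_out, add_ref = out - src, ref - src
          add_p = len(add_out & add_ref) / len(add_out) if add_out else 0.0
          add_r = len(add_out & add_ref) / len(add_ref) if add_ref else 0.0

          # KEEP: source words retained in the output
          keep_out, keep_ref = src & out, src & ref
          keep_p = len(keep_out & keep_ref) / len(keep_out) if keep_out else 0.0
          keep_r = len(keep_out & keep_ref) / len(keep_ref) if keep_ref else 0.0

          # DELETE: source words dropped (SARI uses precision only here)
          del_out, del_ref = src - out, src - ref
          del_p = len(del_out & del_ref) / len(del_out) if del_out else 0.0

          return 100 * (f1(add_p, add_r) + f1(keep_p, keep_r) + del_p) / 3

      # Matching the reference exactly scores 100; copying the source scores low
      print(sari_unigram("the cat perched upon the mat",
                         "the cat sat on the mat",
                         "the cat sat on the mat"))  # → 100.0
      ```

      Note how a verbatim copy of the source earns nothing for additions or deletions, which is exactly why SARI penalizes the copy-heavy behavior this paper targets.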

    4. There are two main Seq2Seq models we will compare to in this work, along with the statistical model from Narayan and Gardent (2014). Zhang and Lapata (2017) proposed DRESS (Deep REinforcement Sentence Simplification), a Seq2Seq model that uses a reinforcement learning framework at training time to reward the model for producing sentences that score high on fluency, adequacy, and simplicity. This work showed state-of-the-art results on human evaluation. However, the sentences generated by this model are in general longer than the reference simplifications. Zhao et al. (2018) proposed DMASS (Deep Memory Augmented Sentence Simplification), a multi-layer, multi-head attention transformer architecture which also integrates simplification rules. This work has been shown to get state-of-the-art results in an automatic evaluation, training on the WikiLarge dataset introduced by Zhang and Lapata (2017). Zhao et al. (2018), however, does not perform a human evaluation, and restricting evaluation to automatic metrics is generally insufficient for comparing simplification models. Our model, in comparison, is able to generate shorter and simpler sentences according to Flesch-Kincaid grade level (Kincaid et al., 1975) and human judgments, and provide a comprehensive analysis using human evaluation and a qualitative error analysis.

      Question 6: How is the literature used in relationship to the problem?

      The literature is used to describe existing approaches to text simplification and to emphasize that current approaches do not focus on the diversity of the output.
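
      The Flesch-Kincaid grade level cited in the quoted passage is a simple formula over average sentence length and syllables per word. A sketch with a naive vowel-group syllable counter (an approximation; real implementations use better syllable estimates) is:

      ```python
      import re

      def count_syllables(word):
          # Naive approximation: count runs of consecutive vowels
          return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

      def flesch_kincaid_grade(sentences):
          """FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59"""
          words = [w for s in sentences for w in s.split()]
          syllables = sum(count_syllables(w) for w in words)
          return (0.39 * len(words) / len(sentences)
                  + 11.8 * syllables / len(words) - 15.59)

      # Short, monosyllabic sentences score at or below grade 0
      print(flesch_kincaid_grade(["the cat sat on the mat"]))
      ```

      Lower grades indicate simpler text, which is how the paper argues its outputs are simpler than the baselines'.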

    5. The main contributions of this paper can be summarized as follows:

       - We propose a custom loss function to replace standard cross entropy probabilities during training, which takes into account the complexity of content words.
       - We include a similarity penalty at inference time to generate more diverse simplifications, and we further cluster similar sentences together to remove highly similar candidates.
       - We develop methods to rerank candidate simplifications to promote fluency, adequacy, and simplicity, helping the model choose the best option from a diverse set of sentences

      Question 5: What is the significance of the research?

      The research proposes several techniques that encourage current models to produce more diverse output, in which the input sentence is substantially modified. These techniques are not specific to text simplification and can also help other tasks such as story generation and image captioning.
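
      The first contribution, a complexity-aware loss, can be sketched as a per-token cross entropy whose weight grows with the complexity of the target word. The complexity dictionary and the `1 + alpha * complexity` weighting below are illustrative assumptions, not the paper's exact formulation:

      ```python
      import math

      def weighted_cross_entropy(token_probs, targets, complexity, alpha=1.0):
          """Average NLL per token, upweighted for complex content words.

          token_probs: one dict per position mapping token -> model probability
          targets:     gold simplified tokens
          complexity:  token -> complexity score in [0, 1] (illustrative)
          """
          total = 0.0
          for probs, tok in zip(token_probs, targets):
              weight = 1.0 + alpha * complexity.get(tok, 0.0)
              total += -weight * math.log(probs[tok])
          return total / len(targets)

      # Toy example: the second target word is marked maximally complex,
      # so its log-loss counts double with alpha = 1.0
      probs = [{"the": 0.5, "a": 0.5}, {"dog": 0.25, "canine": 0.75}]
      loss = weighted_cross_entropy(probs, ["the", "canine"], {"canine": 1.0})
      print(f"{loss:.4f}")
      ```

      Upweighting complex words in this way pushes the model to pay attention to exactly the tokens it would otherwise prefer to copy verbatim.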

    6. We train our models on the Newsela Corpus.

      Question 7: What makes the methodology appropriate?

      The authors use Newsela, a benchmark dataset, to train their model, evaluate their approach, and compare their performance with state-of-the-art systems. This dataset is large and widely used by the NLP community for text simplification. The authors perform both automatic and human evaluation to show that their model outperforms the previous state of the art. They also show qualitative examples to emphasize their model's better performance in terms of diversity.

    7. semantic role labelers (Vickrey and Koller, 2008; Woodsend and Lapata, 2014), and can improve the grammaticality (fluency) and meaning preservation (adequacy) of translation output (Štajner and Popović, 2016)

      Question 3: What is the significance of this problem for the field?

      Text simplification is useful as a preprocessing step for other NLP tasks. Performing well on text simplification will improve the performance of downstream NLP tasks like machine translation, semantic parsing, syntactic parsing and document summarization.

    8. One of the main limitations in applying standard Seq2Seq models to simplification is that these models tend to copy directly from the original complex sentence too often, as this is the most common operation in simplification.

      Question 2: What problem is the paper addressing: What is the goal and what is the current state?

      Given a complex sentence, text simplification systems convert it into a simpler one. Current models often copy the input sentence with only minor changes, which does not necessarily simplify it. The goal is to encourage models to produce simplified sentences that are diverse and differ substantially from the input.

    9. Generating diverse outputs from Seq2Seq models could be used in a variety of NLP tasks, such as chatbots (Shao et al., 2017), image captioning (Vijayakumar et al., 2018), and story generation (Fan et al., 2018). In addition, the proposed techniques can also be extremely helpful in leveled and personalized text simplification, where the goal is to generate different sentences based on who is requesting the simplification

      Question 10: How does this paper address its audience?

      The authors address the importance of generating diverse outputs for the task. They suggest that focusing on the diversity of current models' output can also help with other tasks, such as story generation and chatbots.

    10. To improve the performance of future models, we see several options. We can improve the original alignments within the Newsela corpus, particularly in the case where sentences are split. Prior to simplification, we can use additional context around the sentences to perform anaphora resolution; at this point, we can also learn when to perform sentence splitting; this has been done in the Hybrid model (Narayan and Gardent, 2014), but has not yet been incorporated into neural models. Finally, we can use syntactic information to ensure the main clause of a sentence is not removed.

      Question 9: What suggestions and limitations do the authors suggest?

      The authors suggest that adding context information from adjacent sentences, improving the training data, and using syntactic information can help address the current errors.

    11. In this section, we compare the baseline models and various configurations of our model with both standard automatic simplification metrics and a human evaluation. We show qualitative examples where each of our contributions improves the generated simplification in Table 3.

      Question 8: Are the results and discussions appropriately discussed?

      Yes, the results are discussed and analyzed in the paper. The authors evaluate with both automatic metrics and human judges. Human evaluation is important for this task because automatic metrics do not fully capture qualities such as fluency and adequacy. The authors compare their approach with existing state-of-the-art systems, and for metrics where their approach underperforms, they discuss why. They also show examples where necessary to support their claims.

    12. Automatic text simplification aims to reduce the complexity of texts and preserve their meaning, making their content more accessible to a broader audience

      Question 1: What is the topic and why is it important?

      The topic of the paper is "Automatic Text Simplification". The goal of the task is to make text easier to understand while retaining its original meaning. It can help people with limited literacy skills, such as children, non-expert readers, non-native speakers, and individuals with language disorders.