4,426 Matching Annotations
  1. Mar 2026
    1. One prominent definition of accessibility is given by ISO 9241-171, which defines it as 'the usability of a product, service, environment or facility by people with the widest range of capabilities.'

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    2. Acceptability has two main dimensions [591]. The first dimension, practical acceptability, includes costs, the reliability of the interactive system, and its compatibility with other systems. The perceptions of utility and usability may also influence the judgment of practical acceptability.

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    3. ISO 9241-11 definition... defines usability as the 'extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.'

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    4. One shorthand way of expressing this is that utility is 'whether the functionality of a system in principle can do what is needed' [591, p. 25]. In practice, whether people can do anything concerns—among other things—usability.

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    5. The utility of an interactive system concerns its match with the tasks of users. If the match is good, the tool has high utility; if the tasks that users want to do are not supported by the tool, the tool has low utility.

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    6. Usability concerns how easily computer-based tools may be operated by users trying to accomplish a task. Usability differs from utility. Usability concerns whether users can use the product in a way that makes it possible to realize its utility; utility is about whether the goal is important to the user. Ideally, the user can use the tool without unnecessary effort so that the use is direct, transparent, and unnoticeable.

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    7. Usability is one of the best predictors of users' willingness to adopt software. For example, the User Burden Scale is a questionnaire for measuring the felt burden in software use [806]. It consists of six subscales: difficulty of use, physical burden, time and social burden, mental and emotional burden, privacy burden, and financial burden.

      Highlight what you think good software concepts would be and segment them by color-coded categories.

    1. Norman offered two central concepts to help us understand these cognitive efforts: the gulf of execution and the gulf of evaluation. These two concepts describe inferential breakpoints for users seeking to express their intentions and interpret feedback from the system, respectively.

      highlight the key concepts in this paper

    2. A significant early theory of dialogue interaction is the seven-stage model of Norman [600]. It considers interaction as goal-directed, turn-based dialogue.

      highlight the key concepts in this paper

    3. both the computer and the user may have initiative. For example, a pop-up window can be presented to confirm a risky selection. When there is a misunderstanding about the context of the dialogue, errors may happen, and the partners must recover from them.

      highlight the key concepts in this paper

    4. both the computer and the human participate in establishing a shared context. The computer does not simply receive a message; it also communicates the effects of that message. Therefore, the design of feedback, affordances, and cues is central to dialogue-based interaction.

      highlight the key concepts in this paper

    5. The key idea in the dialogue view of interaction is the organization of communication as a series of turns. Dialogue evolves through communication turns between two or more partners. In one turn, an appropriate communication act is made by one partner based on the communication context.

      highlight the key concepts in this paper

    6. Dialogue can be understood as computation, goal-directed action, communication, or embodied action. Each perspective provides specific methods for the analysis and design of dialogue.

      Highlight the sentences that capture the main point of this chapter

    7. The key idea in the dialogue view of interaction is the organization of communication as a series of turns. Dialogue evolves through communication turns between two or more partners. In one turn, an appropriate communication act is made by one partner based on the communication context. The act aims to get the other partner to do or understand something. This understanding then forms the context within which the other partner takes their turn.

      Highlight the sentences that capture the main point of this chapter

    1. However, self-attention alone is permutation-invariant, i.e., if we reorder the rows of X, then the mechanism has no built-in sense of which token came first. Since word order matters, we must inject positional information. We often add a position vector p_t to the token embedding: h^(0)_t = e(x_t) + p_t. One classical choice for the positional encoding is the sinusoidal positional encoding: p_t[2k] = sin(t / 10000^(2k/d)), p_t[2k+1] = cos(t / 10000^(2k/d)). The sinusoidal features give each position a distinct geometric signature across many frequencies. Nearby positions have related encodings while distant positions remain distinguishable. This lets the network reason about relative offsets.

      highlight where positional encoding is mentioned
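The encoding quoted above is straightforward to compute. A minimal sketch in plain Python (the function name and the list-of-lists output shape are my own conventions, not from the excerpt):

```python
import math

def sinusoidal_positions(seq_len, d_model):
    """Sinusoidal positional encodings from the excerpt:
    p_t[2k]   = sin(t / 10000**(2k/d))
    p_t[2k+1] = cos(t / 10000**(2k/d))
    Returns a seq_len x d_model table of floats.
    """
    table = []
    for t in range(seq_len):
        row = [0.0] * d_model
        for k in range(0, d_model, 2):
            # k already steps over even indices, so the exponent is k/d.
            freq = 1.0 / (10000 ** (k / d_model))
            row[k] = math.sin(t * freq)
            if k + 1 < d_model:
                row[k + 1] = math.cos(t * freq)
        table.append(row)
    return table
```

In use, each row would be added element-wise to the corresponding token embedding to form h^(0)_t, as the excerpt describes.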

    1. Such points about the origins of data and the processes of their collection are a key factor in civic text visualization. Indeed, a shift to emphasizing paradata can help draw attention to the representativeness of data.

      Show alternative approaches to text visualization beyond analytics

    2. In contrast, we could consider designing explicitly for multiple users. Doing so requires more than designing for different levels of expertise (see the following subsection for more on expertise) or designing for collaborative use, though both those things may be valuable in their own right. Rather, this dimension encourages accounting for the different types of relationalities that users may have with a system [cf. BB17].

      Show alternative approaches to text visualization beyond analytics

    3. Civic text visualizations similarly designed to foreground interpretation could help make clearer who is making these interpretive decisions, thereby highlighting the lack of neutrality and objectivity in data [DK20].

      Show alternative approaches to text visualization beyond analytics

    4. It is informative to contrast this analytic emphasis with other evolving discourses in information visualization. The prior work reviewed above illustrates a few alternative orientations, including rhetoric [HD11], feminism [DK16; DK20], ethics [Cor19], and others [DFCC13; VW08].

      Show alternative approaches to text visualization beyond analytics

    5. For example, CommunityPulse [JHSM21] uses common, simple visualizations and iconography, such as bar charts and emojis, to provide overviews of people's emotions towards civic agendas and ideas. Similarly, ConsiderIt [KMF*12b] uses bar charts to visualize people's stance towards ballot measures.

      Find civic text visualization systems that are explicitly named.

    6. Tools such as ConsiderIt [KMF*12b] or Opinion Space [FBRG10] are designed specifically for the public. In contrast, tools such as CommunityPulse [JHSM21] or CommunityClick [JKW*21] are focused more on supporting community leaders and decision makers.

      Find civic text visualization systems that are explicitly named.

    7. For example, MultiConVis [HC16b] makes prescriptive statements not only as to the sentimental valence of individual conversations but also as to the topics that each conversation is about. Similarly, ConsiderIt [KMF*12b] asks participants to place individual statements as either supporting or opposing a given ballot proposition.

      Find civic text visualization systems that are explicitly named.

    8. Some tools provide both computational and visualization features. For instance, CommunityPulse provides a scaffolding for multifaceted public input analysis using visualizations [JHSM21], and MultiConVis enables multilevel exploration and analysis of threaded conversations [HC16b].

      Highlight all civic participation approaches

    9. Researchers in HCI and digital civics have begun to explore methods to improve the analysis capabilities of visual analytics tools [JHSM21; MJS20b]. Although the broader community of visualization researchers acknowledges the importance of designing for varied levels of expertise [Mun14; GTS10; SNHS13], existing work on text analytics in general, as well as civic text visualizations in particular, focuses research efforts towards designing for analysts. Less effort has been put on designing and developing text visualization for non-experts—people who are not trained in or have had limited exposure to visualization and analytics.

      Highlight all civic participation approaches

    10. Improving the public input process has become an important goal in the field of digital civics [MNC*19; VCL*16; OW15]. To that end, researchers and practitioners have developed a variety of systems for, e.g., sharing public opinions [FBRG10], building consensus [KMF*12a; ZNB15], summarizing public input [19], or identifying people's priorities, reflections, and hidden insights [JHSM21].

      Highlight all civic participation approaches

    11. Previous work has introduced several online engagement platforms to enable the public to asynchronously provide their comments, ideas, and feedback around civic issues [19; 20b; MJN*18]. These engagement tools have used micro-tasks [MJN*18], visualizations [19], and forum-like discussions [20b] to engage disconnected and disenfranchised populations [MNC*19]. Others have proposed technologies to promote in-person engagement of reticent participants during town halls [JKW*21] and public meetings [LLS] using clicker-like devices.

      Highlight all civic participation approaches

    12. Despite their central importance in the civic engagement process, members of the general public are not necessarily involved in the analysis process. Hence, they are often left out of the loop when designing civic text visualizations—their requirements, aptitudes, knowledge, etc. are not given central consideration. Integrating participatory approaches in civic text visualization could pave the way not only for more inclusive analysis but also for leveraging the general public's knowledge to gather richer insights.

      Highlight all civic participation approaches

    1. social dynamics, such as shyness and a tendency to avoid confrontation with dominant personalities, can also hinder opinion sharing in town halls by favoring privileged individuals who are comfortable or trained to take part in contentious public discussions [27, 127].

      Highlight all civic participation approaches

    2. town halls inadvertently cater to a small number of privileged individuals, and silent participants often become disengaged despite physically attending the meetings [61]. Due to the lack of inclusivity, the outcome of such meetings often tends to feel unjust and opaque for the general public [39, 54].

      Highlight all civic participation approaches

    3. designing communitysourcing technologies to include marginalized opinions and amplify participation alone may not be enough to solve inequality of sharing opinions in the civic domain [26, 126]. Despite the success of previous works [25, 53, 90], technology is rarely integrated with existing manual practices and follow-ups of engagements between government officials and community members are seldom propagated to the community.

      Highlight all civic participation approaches

    4. Marginalization can be broadly defined as the exclusion of a population from mainstream social, economic, cultural, or political life [58], which still stands as a barrier to inclusive participation in the civic domain [48, 94]. Researchers in HCI and CSCW have explored various communitysourcing approaches to include marginalized populations in community activities, proceedings, and designs [48, 53, 81, 93, 132].

      Highlight all civic participation approaches

    5. Prior investigations by Bryan [29] and Gastil [56] showed a steady decline in civic participation in town halls due to the growing disconnect between local government and community members and the decline in social capital [43, 111, 113]. Despite the introduction of online methods to increase public engagement in the last decade [4, 5, 7, 37, 81, 93], government officials continue to prefer face-to-face meetings to engage the community in the decision-making process [32, 52, 94].

      Highlight all civic participation approaches

    6. Traditional community consultation methods, such as town halls, public forums, and workshops are the modus operandi for public engagement [52, 94]. For fair and impartial civic decision-making, the inclusivity of community members' feedback is paramount [60, 94, 126]. However, traditional methods rarely provide opportunities for inclusive public participation [30, 87, 95].

      Highlight all civic participation approaches

    7. Murphy used such systems to promote democracy and community partnerships [103]. Similarly, Boulianne et al. deployed clicker devices in contentious public discussions about climate change to gauge public opinions [25]. Bergstrom et al. used a single button device where the attendees anonymously voted (agree/disagree) on issues during the meeting. They showed that back-channel voting helped underrepresented users get more involved in the meeting [22].

      Highlight all civic participation approaches

    1. Again, p is the probability of seeing results as extreme (or more extreme) as those actually observed if the null hypothesis were true. So p is computed under the assumption that the null hypothesis is true. Yet it is common for researchers, teachers and even textbooks to think of p as the probability of the null hypothesis being true (or equivalently, of the results being due to chance), an error called the "fallacy of the transposed conditional" (Haller and Krauss, 2002; Cohen, 1994, p.999).

      p-value is misinterpreted and confusing

    1. This assessment raises two issues. First, it is arbitrary. If 10 of the 15 CIs included the predicted values, would the results also support the theory, or instead refute it? If one instead used 99% CIs, would positive results for 12 of the 15 predictions be enough to support the theory? This arbitrariness arises because CIs offer no principled method for generating an inference regarding the theory.

      Estimation is too messy / complex and not clear enough

    1. To illustrate this point Oakes posed a series of true/false questions regarding the interpretation of p-values to seventy experienced researchers and discovered that only two had a sound understanding of the underlying concept of significance [25].

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    2. failure to check assumptions about the data required by particular tests, over-testing and using inappropriate tests

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    3. abusing statistical tests, making illogical arguments as a result of tests, deriving inappropriate conclusions from nonsignificant results, and confusing the size of p-values with effect sizes.

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    4. This approach, fiercely promoted by Fisher in the 1930's [9], has become the gold standard in many disciplines including quantitative evaluations in HCI. However, the approach is rather counter-intuitive; many researchers misinterpret the meaning of the p-value.

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    1. We found that using MINE directly gave identical performance when the task was nontrivial, but became very unstable if the target was easy to predict from the context (e.g., when predicting a single step in the future and the target overlaps with the context).

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    2. We note that better [49, 27] results have been published on these target datasets, by transfer learning from a different source task.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    3. We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead the accuracy increases from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    4. For lasertag_three_opponents_small, contrastive loss does not help nor hurt. We suspect that this is due to the task design, which does not require memory and thus yields a purely reactive policy.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    5. Although this is a standard transfer learning benchmark, we found that models that learn better relationships in the children's books did not necessarily perform better on the target tasks (which are very different: movie reviews etc).

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    6. We found that more advanced sentence encoders did not significantly improve the results, which may be due to the simplicity of the transfer tasks (e.g., in MPQA most datapoints consist of one or a few words), and the fact that bag-of-words models usually perform well on many NLP tasks [48].

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    7. It is important to note that the window size (maximum context size for the GRU) has a big impact on the performance, and longer segments would give better results. Our model had a maximum of 20480 timesteps to process, which is slightly longer than a second.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    8. Interestingly, CPCs capture both speaker identity and speech contents, as demonstrated by the good accuracies attained with a simple linear classifier, which also gets close to the oracle, fully supervised networks.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    9. Figure 6 shows that for 4 out of the 5 games performance of the agent improves significantly with the contrastive loss after training on 1 billion frames.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    10. Despite being relatively domain agnostic, CPCs improve upon state-of-the-art by 9% absolute in top-1 accuracy, and 4% absolute in top-5 accuracy.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    11. We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead the accuracy increases from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    1. Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\nConfidence: <description of confidence, without any extra commentary whatsoever; just a short phrase!>\n\nThe question is: ${THE_QUESTION}

      please find the barebones practical information i need to implement this system or strategy

    2. Provide your ${k} best guesses and the probability that each is correct (0.0 to 1.0) for the following question. Give ONLY the guesses and probabilities, no other words or explanation. For example:\n\nG1: <first most likely guess, as short as possible; not a complete sentence, just the guess!>\n\nP1: <the probability between 0.0 and 1.0 that G1 is correct, without any extra commentary whatsoever; just the probability!>

      please find the barebones practical information i need to implement this system or strategy

    3. Each linguistic likelihood expression is mapped to a probability using responses from a human survey on social media with 123 respondents (Fagen-Ulmschneider, 2023). Ling. 1S-opt. uses a held out set of calibration questions and answers to compute the average accuracy for each likelihood expression, using these 'optimized' values instead.

      please find the barebones practical information i need to implement this system or strategy
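The "Ling. 1S-opt." calibration step described above reduces to averaging held-out accuracy per likelihood expression. A minimal sketch, assuming the calibration set is a list of (expression, was_correct) pairs; the example expressions are placeholders, not the paper's actual list:

```python
from collections import defaultdict

def calibrate_expressions(calibration_data):
    """Map each likelihood expression to its average accuracy on a
    held-out calibration set, as in Ling. 1S-opt.

    calibration_data: iterable of (expression, was_correct) pairs.
    Returns {expression: empirical accuracy}.
    """
    totals = defaultdict(lambda: [0, 0])  # expression -> [n_correct, n_seen]
    for expr, ok in calibration_data:
        totals[expr][0] += int(ok)
        totals[expr][1] += 1
    return {expr: c / n for expr, (c, n) in totals.items()}
```

These per-expression accuracies then replace the survey-derived probabilities when scoring the model's verbalized confidences.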

    4. Finally, our study is limited to short-form question-answering; future work should extend this analysis to longer-form generation settings.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    5. While our work demonstrates a promising new approach to generating calibrated confidences through verbalization, there are limitations that could be addressed in future work. First, our experiments are focused on factual recall-oriented problems, and the extent to which our observations would hold for reasoning-heavy settings is an interesting open question.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    6. the 1-stage and 2-stage verbalized numerical confidence prompts sometimes differ drastically in the calibration of their confidences. How can we reduce sensitivity of a model's calibration to the prompt?

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    7. Provide your best guess and the probability that it is correct (0.0 to 1.0) for the following question. Give ONLY the guess and probability, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\n Probability: <the probability between 0.0 and 1.0 that your guess is correct, without any extra commentary whatsoever; just the probability!>\n\nThe question is: ${THE_QUESTION}

      please find the barebones practical information i need to implement this system or strategy
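The 1-stage prompt above is a fixed template with the question substituted in. A small sketch that builds the prompt (template text quoted from the excerpt) and parses the reply; the regex-based parser is a hypothetical helper of mine, not from the paper:

```python
import re

def confidence_prompt(question):
    """Build the 1-stage verbalized numerical confidence prompt."""
    return (
        "Provide your best guess and the probability that it is correct "
        "(0.0 to 1.0) for the following question. Give ONLY the guess and "
        "probability, no other words or explanation. For example:\n\n"
        "Guess: <most likely guess, as short as possible; not a complete "
        "sentence, just the guess!>\n"
        "Probability: <the probability between 0.0 and 1.0 that your guess "
        "is correct, without any extra commentary whatsoever; just the "
        "probability!>\n\n"
        f"The question is: {question}"
    )

def parse_response(text):
    """Extract (guess, probability) from a model reply, or None."""
    guess = re.search(r"Guess:\s*(.+)", text)
    prob = re.search(r"Probability:\s*([01](?:\.\d+)?)", text)
    if not guess or not prob:
        return None
    return guess.group(1).strip(), float(prob.group(1))
```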

    8. Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation.

      please find the barebones practical information i need to implement this system or strategy

    9. To fit the temperature that is used to compute ECE-t and BS-t we split our total data into 5 folds. For each fold, we use it once to fit a temperature and evaluate metrics on the remaining folds. We find that fitting the temperature on 20% of the data yields relatively stable temperatures across folds.

      please find the barebones practical information i need to implement this system or strategy
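The 5-fold temperature-fitting protocol above can be sketched as follows. Two caveats: the excerpt does not say exactly how the temperature is applied to verbalized confidences, so scaling the log-odds (a common convention) is an assumption here, as is the grid-search minimizer:

```python
import math
import random

def fit_temperature(confidences, correct, grid=None):
    """Fit a scalar temperature T by minimizing the negative log-likelihood
    of correctness, with scaled confidence sigmoid(logit(p) / T).
    ASSUMPTION: log-odds scaling; the paper excerpt does not specify this.
    """
    if grid is None:
        grid = [0.1 * i for i in range(1, 51)]  # search T in (0, 5]
    eps = 1e-6

    def nll(T):
        total = 0.0
        for p, y in zip(confidences, correct):
            p = min(max(p, eps), 1 - eps)
            z = math.log(p / (1 - p)) / T       # temperature-scaled log-odds
            q = 1 / (1 + math.exp(-z))          # scaled confidence
            q = min(max(q, eps), 1 - eps)
            total -= math.log(q) if y else math.log(1 - q)
        return total

    return min(grid, key=nll)

def five_fold_temperatures(confidences, correct, k=5, seed=0):
    """Split the data into k folds and fit T on each fold in turn
    (i.e., on 20% of the data when k=5, as in the excerpt)."""
    idx = list(range(len(confidences)))
    random.Random(seed).shuffle(idx)
    temps = []
    for fold in (idx[i::k] for i in range(k)):
        temps.append(fit_temperature([confidences[i] for i in fold],
                                     [correct[i] for i in fold]))
    return temps
```

Comparing the per-fold temperatures is what lets one check, as the authors do, that the fitted values are stable across folds.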

    10. Additionally, the lack of technical details available for many state-of-the-art closed RLHF-LMs may limit our ability to understand what factors enable a model to verbalize well-calibrated confidences and differences in this ability across different models.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    11. With Llama2-70B-Chat, verbalized calibration provides improvement over conditional probabilities across some metrics, but the improvement is much less consistent compared to GPT-* and Claude-*.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    12. The verbal calibration of the open source model Llama-2-70b-chat is generally weaker than that of closed source models but still demonstrates improvement over its conditional probabilities by some metrics, and does so most clearly on TruthfulQA.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    13. Among the methods for verbalizing probabilities directly, we observe that generating and evaluating multiple hypotheses improves calibration (see Figure 1), similarly to humans (Lord et al., 1985), and corroborating a similar finding in LMs (Kadavath et al., 2022).

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    1. the psychology research community has been strongly questioning the value of NHST in psychology for some years now [6] and calling for a more meaningful reporting of statistical inference based on effect sizes, confidence intervals and Bayesian reasoning [9].

      Mentioning the problems with p-values

    2. Similarly, if the significance level is set at 0.05, then this is the probability of the data occurring by chance when there is no experimental effect, namely one in twenty times. The more tests that are done on a particular dataset, the more likely it is that some chance variation will be extreme enough to seem like significance.

      Mentioning the problems with p-values
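The "one in twenty times" arithmetic above compounds quickly across multiple independent tests on null data, which a two-line calculation makes concrete:

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one spurious 'significant' result when
    running n independent tests at level alpha on data with no real effect."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 5, 20):
    print(n, round(familywise_error(n), 3))  # 0.05, 0.226, 0.642
```

With 20 tests, odds are better than even that at least one comparison clears p < 0.05 by chance alone.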

    3. Violation of the assumptions of any statistical test can produce p values that bear little relation to the actual probabilities of outcomes and hence comparison to the significance level of 0.05 is meaningless.

      Mentioning the problems with p-values

    4. for an analysis to be sound, it is necessary that in the tests performed the probabilities of outcomes are accurately reflected in the p values produced by the tests. If this is not the case, then the NHST argument form is severely weakened.

      Mentioning the problems with p-values

    5. NHST is the most commonly encountered form of statistical inference and is what is usually associated with producing a null hypothesis, then testing it to give some statistic such as a t value, and then turning the statistic into a p value.

      Mentioning the problems with p-values

    1. The inclusion of counterfactuals often resulted in a substantial increase in precision, indicating that the models were better able to correctly classify relevant instances while reducing false positives.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    2. Mocha addresses two seemingly contradictory objectives: (1) generating labeled data that diversifies the training dataset to aid the model's learning, and (2) maintaining structural consistency across the batches of data presented to users to support their cognitive processes.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    3. The results of our study indicate that participants spent significantly less time annotating batches of counterfactuals when they were rendered according to SAT compared to other conditions i.e., supporting the participants' selective focus on the varying phrases, rather than phrases that stay consistent.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    4. From a cognitive perspective, the theme color aligns with the human's (theorized) structural mapping engine [27] by making relational discrepancies between the original and counterfactual examples more explicit.

      return any single sentence that describes an explicit or implicit connection to theory

    5. The last two prior works also combine Variation Theory (VT) and SAT together, as we did (i.e., a corollary of SAT referred to as Analogical Transfer/Learning Theory).

      return any single sentence that describes an explicit or implicit connection to theory

    6. Estes and Hasson [17] argue that while alignable differences can be more straightforward and easier for comparison, non-alignable differences can also provide key information that might otherwise remain overlooked.

      return any single sentence that describes an explicit or implicit connection to theory

    7. This symbiotic relationship stems from the fact that Structural Alignment Theory (SAT) enhances the salience of differences, while the way we used Variation Theory (VT) to generate contradicting examples across the boundaries of labels ensures that these differences are conceptually informative.

      return any single sentence that describes an explicit or implicit connection to theory

    8. Structural Alignment Theory states that humans naturally look for structural mapping between representations of objects to help them understand, compare, and infer relationships between said objects.

      return any single sentence that describes an explicit or implicit connection to theory

    9. According to Variation Theory, learners better understand concepts by observing variations along critical features (dimensions of variation) that define that concept and, separately, observing variations along superficial features that do not define that concept—all while other features, when possible, are held constant.

      return any single sentence that describes an explicit or implicit connection to theory

    1. Taken together, these findings almost unanimously show that, on average, AI-supported writing decreases but does not eliminate writers' feelings of ownership, underscoring the need for a larger theory of AI participation in the creative process.

      sentence that refers to a theory

    2. This can be understood through the frame of precarious work [5]; as writers feel that their work is increasingly precarious, the power differential between themselves and the organizations seeking to train LLMs grows larger.

      sentence that refers to a theory

    1. The study concluded with a 15-minute semi-structured interview. During the interview, participants saw screenshots from the three conditions and were asked which they preferred and disliked, why, what they wished the interface had, what influenced their skimming, and how they normally skimmed texts.

      sentence describing any interview procedures

    2. We used these mock-ups as design probes [31] to inspire ideation and elicit creative responses. Specifically, we asked participants to compare and contrast alternative mock-ups and reflect on how they could be used or improved to support their known or emerging synthesis and information-foraging goals.

      sentence describing any interview procedures

    3. In the first part of the session, we asked participants about their strategies for selecting publication venues for their manuscript submissions, how they identify and synthesize information from venues, their approaches to writing manuscripts, and finally, the technology they have used to help with these processes, current technology shortcomings, and ideas for addressing these challenges.

      sentence describing any interview procedures

    4. The interview sessions were divided into two parts: an open-ended semi-structured interview about their backgrounds and practices, followed by feedback on a range of mock-ups, including novel reified relationships between analogous sentences in different abstracts (Figure 2).

      sentence describing any interview procedures

    5. In order to determine (1) the context in which we might offer novel views of scientific abstracts and (2) the intelligibility of various novel prototype designs for reifying cross-abstract relationships, we conducted a formative interview study with 12 active researchers (see Appendix A for participant information).

      sentence describing any interview procedures

    6. pre-computing and reifying cross-document analogous relationships make it psychologically possible for users to engage—if they are willing to be guided by it. (Lower NFC users are more likely to fall into this category.)

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    7. Lower NFC participants were generally guided by emergent visual patterns created by the interactions between features, especially blocks of color spanning multiple sentences created when all three features are turned on.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    8. Dialectical activities cannot be done on a user's behalf by AI; with variation affordances, AI is supporting the user's engagement with the data themselves.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    9. In this sense, AbstractExplorer enables dialectical activities that users may otherwise have found to be too tedious or difficult to engage with.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    10. Our work demonstrates that designs informed by Structure-Mapping Theory can support users in navigating, making use of, and engaging with variation present in information.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    11. We posit that our approach can generalize to other domains such as journalism, code synthesis, and social media analytics where visual alignment of text can enable meaningful comparisons of underlying patterns to identify relational clarity.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    12. We demonstrate how slicing sentences according to roles and visually aligning them can help readers perceive cross-document relationships in a coherent manner.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    13. In this work, we introduce a new paradigm for exploring a large corpus of small documents by identifying roles at the phrasal and sentence levels, then slice on, reify, group, and/or align the text itself on those roles, with sentences left intact.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.

    14. Like prior Structural Mapping Theory (SMT)-informed work in text corpora representation, AbstractExplorer's features have enabled some users to see more of both the overview and the details at the same time, facilitating abstraction without losing context.

      statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.