3,162 Matching Annotations
  1. Last 7 days
    1. The Positional Diction Clustering (PDC) algorithm identified analogous sentences across many LLM responses, which were reified both as color-coordinated cross-document analogous text highlighting (like ParaLib) and in a novel ‘interleaved’ view where analogous sentences across documents were rendered in adjacent rows to enable more easy comparison [18].

      sentence related to color

    2. The Semantic Reader project [43] supports features that bring information from related papers into the focal paper’s reading environment. For example, Relatedly [54], part of the Semantic Reader project, highlights unexplored dissimilar information in related work sections of unread papers while lowlighting previously seen information.

      sentence related to color

    3. For example, GP-TSM [24] helps readers read more efficiently by modulating text saliency while preserving grammar. VarifocalReader [36] supports skimming by presenting abstract summaries alongside the source document, with machine-learned annotations highlighting key sentence segments in different colors.

      sentence related to color

    4. The Positional Diction Clustering (PDC) algorithm identified analogous sentences across many LLM responses, which were reified both as color-coordinated cross-document analogous text highlighting (like ParaLib) and in a novel ‘interleaved’ view where analogous sentences across documents were rendered in adjacent rows to enable more easy comparison [18].

      sentence related to color

    5. AbstractExplorer instantiates new minimally lossy SMT-informed techniques for skimming, reading, and reasoning about a corpus of similarly structured short documents: phrase-level role classification that drives sentence ordering, highlighting, and spatial alignment.

      sentence related to any theory

    6. Structural Mapping Theory (SMT) is a long-standing well-vetted theory from Cognitive Science that describes how humans attend to and try to compare objects by finding mental representations of them that can be structurally mapped to each other (analogies).

      sentence related to any theory

  2. Mar 2026
    1. An appealing alternative to conventional text-based interfaces through graphical user interfaces is the direct use of hands as an input device to provide natural human-computer interaction.

      sentence about GUIs

    2. A more thorough description of the current tools and techniques for interacting with computers as well as recent developments in the subject is provided in the next section.

      sentence about GUIs

    3. The evolving multi-modal and Graphical user interfaces (GUI) enable humans to interact with embodied character agents in a way that is not possible with other interface paradigms.

      sentence about GUIs

    4. The widely used graphical user interfaces (GUI) of today are found in desktop applications, internet browsers, mobile computers, and computer kiosks.

      sentence about GUIs

    1. In the context of close reading of research paper abstracts at scale, our findings suggest AbstractExplorer enabled participants to scale up the number of papers they could review through efficient skimming and find common patterns and outliers through sentence comparison, resulting in a rich synthesis of ideas and connections to foster deeper engagement with scholarly articles.

      sentence relating to methodology

    2. We extend existing approaches through automated role annotation, establishing alignments using grammatical chunk boundaries, and preserving sentences in their entirety, instead of relying on abstract meta-data.

      sentence relating to methodology

    3. In this work, we introduce a new paradigm for exploring a large corpus of small documents by identifying roles at the phrasal and sentence levels, then slice on, reify, group, and/or align the text itself on those roles, with sentences left intact.

      sentence relating to methodology

    4. Custom aspects are generated dynamically via API calls to a FastAPI back-end, which prompts an LLM to check whether each sentence in the filtered subset matches the aspect description—either in terms of overall content or a matching token—and extracts the most relevant chunk of that sentence to highlight.

      sentence relating to methodology

    5. After obtaining an expanded set of high-level chunk labels, we assign them to each of the sentence chunks by using LLMs in a multi-class classification few-shot learning task, with the initial labels and assignment as examples.

      sentence relating to methodology
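The few-shot multi-class labeling step quoted above can be sketched as a prompt builder. This is purely illustrative: the label set, example pairs, and wording below are invented, not the paper's actual prompts or chunk-role taxonomy.

```python
# Hypothetical sketch of a few-shot chunk-role classification prompt.
# Labels and examples are invented for illustration only.
LABELS = ["problem", "method", "result"]
EXAMPLES = [
    ("We study collaborative writing tools", "problem"),
    ("using a mixed-methods diary study", "method"),
]

def build_prompt(chunk: str) -> str:
    """Assemble a few-shot prompt: task instruction, labeled examples,
    then the unlabeled target chunk for the LLM to classify."""
    lines = [f"Classify the chunk into one of: {', '.join(LABELS)}."]
    for text, label in EXAMPLES:
        lines.append(f"Chunk: {text}\nLabel: {label}")
    lines.append(f"Chunk: {chunk}\nLabel:")
    return "\n".join(lines)

print(build_prompt("improving skimming efficiency at scale"))
```

The paper's human-in-the-loop step (a researcher merging newly generated labels into the existing set) would sit outside this function, between labeling rounds.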

    6. After identifying chunk boundaries, we again prompt an LLM to generate labels for chunks in a human-in-the-loop approach: starting from an initial set of labels for chunk roles, when a new label is generated, a researcher from the research team examines the new label and merges it with existing labels if appropriate, controlling for the total number of labels.

      sentence relating to methodology

    7. In the first stage, Sentence Segmentation and Categorization, abstracts are split into individual sentences using the NLTK package, and each sentence is classified into one of the five pre-defined aspects as listed in Section 4.1.1.

      sentence relating to methodology
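The segmentation step quoted above uses NLTK's `sent_tokenize` (the punkt tokenizer). As a dependency-free approximation of the same idea, a splitter can break on sentence-final punctuation followed by whitespace and a capital letter; this is my stand-in sketch, not the paper's code, and punkt handles abbreviations and other edge cases this regex does not.

```python
import re

def split_sentences(abstract: str) -> list[str]:
    """Rough sentence splitter: break after . ! or ? when followed by
    whitespace and an uppercase letter. A crude proxy for NLTK's
    sent_tokenize, which the paper actually uses."""
    parts = re.split(r'(?<=[.!?])\s+(?=[A-Z])', abstract.strip())
    return [p for p in parts if p]

text = "We present a new tool. It supports skimming! Does it scale? Yes."
print(split_sentences(text))
```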

    8. When users click on a bookmark icon to the left of any specific sentence in the Cross-Sentences Relationships Pane, that sentence is added to a bookmark list that can be viewed in the Bookmarked Sentences alternate pane.

      sentence relating to methodology

    9. Filtering enables users to narrow their focus to a subset of the corpus while still benefiting from features that help them recognize cross-sentence relationships within the remaining abstracts.

      sentence relating to methodology

    10. The Abstracts panel can be customized by users to display the full abstract text, an abstract “TLDR” (a shorter abstractive summary generated by an LLM), or both at the same time.

      sentence relating to methodology

    11. To allow users to contextualize individual sentences within their respective abstracts, we link the Cross-Sentence Relationship and Abstract panels: when users click on any sentence in the Cross-Sentence Relationships pane, the corresponding full abstract is automatically highlighted and scrolled into view in the Abstracts panel, offering additional context when needed.

      sentence relating to methodology

    12. Together, the vertical and horizontal juxtapositions are designed to help users identify both high-level commonalities and nuanced variations across structurally similar sentences.

      sentence relating to methodology

    13. These alignment options are intended to enable users to more easily read analogous chunks across sentences from different abstracts, ignoring details serving other roles within the sentence.

      sentence relating to methodology

    14. By default, sentences are vertically aligned by the middle of their shared structure tuple, but users can freely switch between the three alignment options using the button group atop the Cross-Sentence Relationship pane.

      sentence relating to methodology

    15. AbstractExplorer also aligns the sentences in three different ways, as illustrated in Figure 5: vertical alignment by the middle of the structure tuple (second element), vertical alignment by the left of the structure tuple (first element), and left-justified alignment (horizontal juxtapositions).

      sentence relating to methodology

    16. This ordering prioritizes dominant structural patterns (largest groups first) while exposing fine-grained variations (via length-sorted triplets), mirroring how humans compare sentences, if SMT is an accurate description in this domain of comparative close reading.

      sentence relating to methodology

    17. This allows users to first understand the different structure patterns and their commonality, before diving into close reading at scale of the sentences that share a particular structure by clicking any of the “Expand” toggles.

      sentence relating to methodology

    18. AbstractExplorer first segments sentences into grammar-preserving chunks—segments that respect grammatical boundaries, i.e., an LLM judges that the sentence can be truncated at that chunk boundary without breaking the grammatical integrity of the preceding text.

      sentence relating to methodology

    19. Viewing one aspect at a time enables users to closely read and compare just the analogous sentences of abstracts, which may be cognitively easier than the comparative close reading of many abstracts in their entirety, especially if cross-sentence relationships are pre-computed and reified in the interface.

      sentence relating to methodology

    20. AbstractExplorer classifies sentences into five pre-defined aspects common in CHI abstracts: Problem Domain, Gaps in Prior Work, Methodology/Contribution, Results/Findings, and Discussion/Conclusion.

      sentence relating to methodology

    21. We chose the sentence as our unit for cross-document alignment because: (1) it preserves complete propositional content (unlike phrases or words), (2) maintains grammatical coherence when isolated (unlike arbitrary text spans), and (3) serves as the minimal self-contained unit where aspects can be meaningfully compared.

      sentence relating to methodology

    22. To keep details at the forefront of the interface, we designed a mechanism to slice abstracts for viewing them from specific angles, allowing for comparative close reading at scale at the sentence level.

      sentence relating to methodology

    23. ABSTRACTEXPLORER is designed to help researchers (1) skim, read, and better familiarize themselves with the contents and composition style of a large corpus of abstracts and (2) reason about cross-paper relationships at scale without abstracting away the author-written sentences about their own work.

      sentence relating to methodology

    24. Finally, a summative study (Section 6) describes how researchers used ABSTRACTEXPLORER to familiarize themselves with a corpus of ~1000 CHI paper abstracts—reading across a larger and more diverse collection of abstracts and more easily discerning relationships and distributions across prior work.

      sentence relating to methodology

    25. Second, an ablation study with eye-tracking (Section 5) revealed that the three key features of ABSTRACTEXPLORER's central cross-sentence relationships pane (sentence order, role-coordinated highlighting, and alignment) work best in concert, not alone.

      sentence relating to methodology

    26. Three studies inform and validate ABSTRACTEXPLORER's design: First, a formative study (Section 3) suggested unmet needs and interest in our approach to supporting cross-document reasoning.

      sentence relating to methodology

    27. AbstractExplorer instantiates new minimally lossy SMT-informed techniques for skimming, reading, and reasoning about a corpus of similarly structured short documents: phrase-level role classification that drives sentence ordering, highlighting, and spatial alignment.

      sentence relating to methodology

    28. A summative study (N=16) describes how these features support users in familiarizing themselves with a corpus of paper abstracts from a single large conference with over 1000 papers.

      sentence relating to methodology

    29. AbstractExplorer has a unique combination of LLM-powered (1) faceted comparative close reading with (2) role highlighting enhanced by (3) structure-based ordering and (4) alignment.

      sentence relating to methodology

    30. In this work, we introduce a new paradigm for exploring a large corpus of small documents by identifying roles at the phrasal and sentence levels, then slice on, reify, group, and/or align the text itself on those roles, with sentences left intact.

      please find me the main contributions of this paper

    31. AbstractExplorer instantiates new minimally lossy SMT-informed techniques for skimming, reading, and reasoning about a corpus of similarly structured short documents: phrase-level role classification that drives sentence ordering, highlighting, and spatial alignment.

      please find me the main contributions of this paper

    32. AbstractExplorer has a unique combination of LLM-powered (1) faceted comparative close reading with (2) role highlighting enhanced by (3) structure-based ordering and (4) alignment. An ablation study (N=24) validated that these features work best together. A summative study (N=16) describes how these features support users in familiarizing themselves with a corpus of paper abstracts from a single large conference with over 1000 papers.

      please find me the main contributions of this paper

    33. We contribute: • Novel SMT theory-informed text analysis and rendering techniques for enabling cross-document skimming and comparative close reading at scale • AbstractExplorer, which instantiates these techniques for familiarizing oneself with a corpus of ∼1000 CHI paper abstracts. • Three studies informing and evaluating the benefits, challenges, and interactions between these techniques.

      please find me the main contributions of this paper

    34. The ablation and summative studies verified the value of AbstractExplorer, specifically showing that all three components of the Structural Mapping Engine—color coding, sentence ordering, and vertical alignment—are crucial for facilitating comparative close reading at scale.

      sentence relating to testing

    35. The study concluded with a 15-minute semi-structured interview. During the interview, participants saw screenshots from the three conditions and were asked which they preferred and disliked, why, what they wished the interface had, what influenced their skimming, and how they normally skimmed texts.

      sentence relating to testing

    36. After the ablation study validated the effectiveness of all three SMT-inspired features together (especially for lower NFC users), we completed the implementation of AbstractExplorer and evaluated its impact on researchers’ reading and sensemaking of a corpus of all ∼1000 paper abstracts from ACM CHI 2024.

      sentence relating to testing

    37. The most preferred condition (all three features enabled) was tied with the baseline no-features-enabled condition for lowest reported cognitive load. Specifically, 11 participants reported their lowest raw NASA-TLX scores in the all-three-features condition, and a different 11 participants reported their lowest raw NASA-TLX scores in the baseline condition.

      sentence relating to testing

    38. The most popular condition had all three features enabled, i.e., 11 out of 24 participants (≈ 50%) preferred Figure 7C, as shown in the “Preferred” columns of Table 1. The remaining participants were roughly evenly split between the no-features baseline (6 participants) and the without-alignment ablation condition (5 participants). One participant each liked the without-highlighting and without-ordering ablation conditions most, respectively.

      sentence relating to testing

    39. The specific research questions for this study were: (1) How do highlighting, alignment, and ordering affect reading patterns, user experience, and cognitive load? (2) How do participants’ valuation of these features relate to their Need for Cognition? (3) Does each feature provide value on its own, or only in conjunction with one or more of the other two features?

      sentence relating to testing

    40. In this study, we allowed participants to experience views of same-aspect sentences (Section 4.1.1) with different combinations of highlighting, ordering, and alignment (as described in Section 4.1.2 and Section 4.1.4) enabled or not, in order to understand which and/or what combinations most effectively supported users’ ability to skim and read laterally across documents.

      sentence relating to testing

    41. Three studies inform and validate ABSTRACTEXPLORER's design: First, a formative study (Section 3) suggested unmet needs and interest in our approach to supporting cross-document reasoning. Second, an ablation study with eye-tracking (Section 5) revealed that the three key features of ABSTRACTEXPLORER's central cross-sentence relationships pane (sentence order, role-coordinated highlighting, and alignment) work best in concert, not alone. Finally, a summative study (Section 6) describes how researchers used ABSTRACTEXPLORER to familiarize themselves with a corpus of ~1000 CHI paper abstracts—reading across a larger and more diverse collection of abstracts and more easily discerning relationships and distributions across prior work.

      sentence relating to testing

    42. A summative study (N=16) describes how these features support users in familiarizing themselves with a corpus of paper abstracts from a single large conference with over 1000 papers.

      sentence relating to testing

    43. an ablation study with eye-tracking (Section 5) revealed that the three key features of ABSTRACTEXPLORER's central cross-sentence relationships pane (sentence order, role-coordinated highlighting, and alignment) work best in concert, not alone.

      any sentence about eye-tracking, eye-trackers, etc.

    44. an ablation study with eye-tracking (Section 5) revealed that the three key features of ABSTRACTEXPLORER's central cross-sentence relationships pane (sentence order, role-coordinated highlighting, and alignment) work best in concert, not alone.

      sentence about eye-tracking

    45. an ablation study with eye-tracking (Section 5) revealed that the three key features of ABSTRACTEXPLORER's central cross-sentence relationships pane (sentence order, role-coordinated highlighting, and alignment) work best in concert, not alone.

      sentence about eye-tracking

    46. AbstractExplorer used variation affordances present in prior systems, e.g., color-coordinated highlighting of analogous text in Gero et al. [18], and introduced new ones, such as alignment of sentences based on analogous chunks within them, which had only been hypothesized in prior work.

      sentence related to Structural Mapping Theory (SMT)

    47. Structural Mapping Theory (SMT) is a long-standing well-vetted theory from Cognitive Science that describes how humans attend to and try to compare objects by finding mental representations of them that can be structurally mapped to each other (analogies).

      sentence related to Structural Mapping Theory (SMT)

    48. AbstractExplorer instantiates new minimally lossy SMT-informed techniques for skimming, reading, and reasoning about a corpus of similarly structured short documents.

      sentence related to Structural Mapping Theory (SMT)

    49. Lossless SMT-informed techniques have yet to be brought to bear in the context of researchers familiarizing themselves with a corpus of existing literature.

      sentence related to Structural Mapping Theory (SMT)

    50. This SMT-informed approach, which AbstractExplorer shares, tries to give this mental machinery “a leg up,” letting users perhaps skip some steps by accepting reified cross-document relationships identified by the computer.

      sentence related to Structural Mapping Theory (SMT)

  3. Feb 2026
    1. The real annoying thing about Opus 4.6/Codex 5.3 is that it’s impossible to publicly say “Opus 4.5 (and the models that came after it) are an order of magnitude better than coding LLMs released just months before it” without sounding like an AI hype booster clickbaiting, but it’s the counterintuitive truth to my personal frustration
    1. A generative AI like ChatGPT Data Analyst can take on the role of the evaluation software. It is expected that this manner of use will make the students' work easier, as less emphasis needs to be placed on the programming itself. Instead, teachers can incorporate exercises that encourage students to code more efficiently and accurately with the assistance of AI. This shifts the focus from finding the right command or function to examining and understanding the data more closely. As a consequence, students are better enabled to interpret the results of statistical evaluation software correctly, thus fulfilling goal 8 of the GAISE report.

      rhetoric: Schwarz uses a statement of transition to contrast the old education model (rote memorization of commands) with a new required model (critical examination).

      inference: This supports the argument that education and labor must start to pivot away from the "Generalist" process-oriented tasks. If the machine assistants handle the 'How' (the commands and functions), then the human must focus more on the 'Why' and the 'what does it mean (understanding/wisdom)'. This helps to validate the work of the assistants and helps to make it useful and valuable in the real world.

    2. statistical knowledge is still required in order to formulate the correct prompts and to ensure that the AI does not leave out any step of the analysis.

      rhetoric: author presents a prescriptive claim that AI needs humans with competent knowledge (in this case, statistics) to create prompts and ensure that the AI does not leave out any steps of the analysis. He positions domain knowledge not as a tool for using AI for statistical analysis, but a prerequisite for management of the AI and auditing the output.

      inference: In addition to policing and correcting the AI outputs, the deep domain knowledge is what allows the AI to do complex data analysis without mistakes, hallucinated results, or mathematically false outcomes. This is basically the job description of a human with "Augmented Human Wisdom". The human's value is no longer in doing math, but in possessing the vertical expertise (flesh/wisdom) to know exact what math needs to be done and ultimately auditing the assistant machine's work.

    3. ChatGPT Data Analyst clearly produced a false result here, precisely because the application assumptions for the ANOVA were not checked.

      rhetoric: Schwarz employs cause-and-effect reasoning here based on empirical testing. He links a specific technical failure (not checking assumptions) to a definitive unwanted outcome (a false result).

      inference: the "Data Analyst" function of ChatGPT hallucinated a result during the use of its core function! This is the best evidence so far of the 'Crisis of Truth' and the dangers of the 'Headless Automatons' in my essay. If a generalist with no deep knowledge uses AI, they are at great risk of blindly accepting mathematically false conclusions. Synthetic syntax without competent human validation is a liability.

    4. The results show that generative AI can facilitate data analysis for individuals with minimal knowledge of statistics, mainly by generating appropriate code, but only partly by following standard procedures.

      rhetoric: author uses comparative, objective statement (logos) to establish the main boundary of the technology's capability/capacity -- it excels at technical generation (things like coding) but fails at standard procedures (methodological adherence to SOPs).

      inference: this proves the 'Raising the Floor' concept. AI completely automates the entry-level syntax (the "Word"), meaning that the Generalist coder is obsolete! However, because it fails at standard procedures, it requires a human architect to guide it to outputs that are valuable in the real world.

    1. PWA have language deficits that require bespoke AAC supports. These supports may be enhanced by LLMs in software systems that use spoken user input to provide relevant suggestions that have grammatical and speech production support.

      rhetoric: concluding statement. this positions the LLM as an 'enhancement' to physical human limitation, rather than a replacement of the human subject.

      inference: This helps to validate the 'Augmented Human Wisdom' model. The future of AI is NOT replacing humans, but AI acting as a high-powered syntax engine that is strictly guided by human needs and human intent. The AI does not have 'agency', as it is a software tool that helps the human to execute their visions.

    2. Perseverations that are input into the system are essentially magnified by the system’s suggested sentences,

      rhetoric: authors explain an unintended consequence of using the AI tool: it scales the errors or the emptiness of the human prompt.

      inference: this is an excellent metaphor for the 'manager fallacy'. If the human user is incompetent (or provides empty or incomplete input), the AI does not magically create wisdom -- it just amplifies the user's incompetence in a highly articulate synthetic thought.

    3. Participant 2 stated the age of her daughters (“Name1 is 18, Name2 is 21”), Aphasia-GPT transformed it as “Name1 is 18 and 21”, which is an impossible, but related, hallucination

      rhetoric: researchers use a specific, clinical observation of an error to demonstrate the model's inability to comprehend logical reality despite the human relaying a perfectly structured sentence.

      inference: this shows that AI is amoral and lacks the lived experience necessary to make logical judgments that work in the real world. It can format a sentence beautifully, but it will not always understand that a single human cannot be two ages at once. This is why it is necessary for the "flesh" to test the output against reality.

    4. Aphasia-GPT is a real-time, AI-enabled web app designed to expand the words provided by a user into complete sentences as suggestions for a user to select.

      rhetoric: authors provide a definition of their creation (Aphasia-GPT) to describe its mechanism: taking a fragmented input and expanding it into a fully structured, complete output.

      inference: this is the embodiment of Harari's primary metonym of the word v flesh (syntax v human). In this example, Aphasia-GPT provides the words (syntax) to the fleshy human that struggles with those words, while also relying on the human to spark the intent of the communication. The human is using AI to communicate with words, because the words are very difficult for the human.

    1. The cost of the time that it takes to fix "workslop" could add up too, with a $186 monthly cost per employee on average, according to a survey of desk workers by BetterUp in partnership with the Stanford Social Media Lab. Forty percent of the workers surveyed said they received "workslop" in the last month and that it took an average of two hours to resolve each incident.

      $186 per employee per month!

      Annualized: 10 employees = $22,320; 25 employees = $55,800; 50 employees = $111,600; 100 employees = $223,200; 250 employees = $558,000; 500 employees = $1,116,000; 1000 employees = $2,232,000
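These per-headcount totals are $186 × 12 months × headcount; the arithmetic can be checked with a short script:

```python
# Yearly "workslop" cost from the BetterUp/Stanford figure of
# $186 per employee per month.
MONTHLY_COST_PER_EMPLOYEE = 186

def annual_workslop_cost(employees: int) -> int:
    """Annualize the per-employee monthly cost for a given headcount."""
    return MONTHLY_COST_PER_EMPLOYEE * 12 * employees

for n in (10, 25, 50, 100, 250, 500, 1000):
    print(f"{n:>4} employees: ${annual_workslop_cost(n):,}/year")
```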

    2. “Younger workers aren’t necessarily more careless, but they’re often using AI more frequently and earlier in their workflows," Dennison said. "There is also a training gap. Organizations often assume younger employees intuitively understand AI, yet provide little guidance on verification, risk, or appropriate use cases. As a result, AI may be treated as an answer engine rather than a support tool."

      this is another great quote, which helps to establish how orgs treat younger generations, and how they tend to overtrust their understanding of AI.

    3. 58% said direct reports submitted work that contained factual inaccuracies generated by AI tools, while fewer reported that AI failed to account for critical contextual factors. Other issues cited include low-quality content, poor recommendations and inappropriate messaging.

      from reporting managers, 58% said that employees submitted AI-generated work containing factual inaccuracies, and fewer reported that AI failed to account for "critical contextual factors", implying that the writing was generic and not directly applicable to the context it was written for. Other issues were: low-quality content, poor recommendations, and inappropriate messaging.

    4. 59% of managers saying that they had to invest additional time to correct or redo work created by AI. Similarly, 53% said their direct reports had to take on extra work, while 45% said they had to bring in co-workers to help fix the mistake.

      Extra time and money spent to repair errors made by AI but not caught by the human in the middle. 59% of managers (roughly three in five) needed to correct or redo work created by AI without a human auditing it. 53% said their direct reports had to take on extra work to repair the AI mistakes, and 45% also needed to bring in a (perhaps more senior) co-worker to help fix the mistake. I can imagine workers needing to fix a mistake that hits production code, and all of the thousands (or more) of mistakes that would need to be later repaired and rolled back. Very expensive and costly.