AI-powered analysis uncovers data at a scale and depth that legacy frameworks were not designed to accommodate.
Surprisingly: the sheer volume and depth of data surfaced by AI-powered security analysis has rendered traditional security frameworks ineffective. The defensive architectures built over the past few decades were never designed to handle information at this scale, which suggests the cybersecurity industry may need to be rebuilt from the ground up rather than merely patched and upgraded.
The cost of understanding what happens in a video has dropped by a factor of roughly 40, while the quality of that understanding has improved dramatically.
Most people assume AI video analysis is still early-stage and expensive, but the author points out that its cost has already fallen roughly 40-fold while quality has improved. This counterintuitive observation suggests video analysis may have crossed the threshold of practicality, poised to spawn entirely new categories of applications and challenging conventional assumptions about AI video processing.
Exposure alone is a completely meaningless tool for predicting displacement.
Most people assume that measuring how exposed a job's tasks are to AI can predict which jobs will be displaced, but the author argues this single metric is essentially meaningless because it ignores key factors such as price elasticity and shifts in demand. This challenges the dominant methodology in current research on AI's labor-market impact.
Interviews were video and audio recorded. We transcribed the audio using OpenAI's Whisper automatic speech recognition system and anonymized the transcripts before analysis. We analyzed the interview data using thematic analysis [1]. First, two members of the research team independently coded data from four randomly chosen participants (25% of the collected data) to generate low-level codes. Inter-coder reliability between the coders was 0.88 using Krippendorff's alpha [37]. The two coders then met to cross-check, resolve coding conflicts, and consolidate the codes into a codebook across two sessions. Using the codebook, the two coders each analyzed data from six randomly selected participants. The research team then met, discussed the analysis outcomes, and finalized themes over three sessions.
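The agreement statistic reported above can be sketched in a few lines. This is a minimal pure-Python Krippendorff's alpha for nominal labels from two coders with no missing data; the function name and data layout are illustrative, and a maintained package such as `krippendorff` is preferable in real analyses:

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(coder1, coder2):
    """Krippendorff's alpha for nominal data, two coders, no missing values."""
    assert len(coder1) == len(coder2)
    # Coincidence matrix: each coded unit contributes both ordered pairs.
    coincidences = Counter()
    for a, b in zip(coder1, coder2):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    n = sum(coincidences.values())
    # Marginal totals per category.
    marginals = Counter()
    for (a, _), count in coincidences.items():
        marginals[a] += count
    # Observed disagreement: off-diagonal mass of the coincidence matrix.
    d_observed = sum(c for (a, b), c in coincidences.items() if a != b) / n
    # Expected disagreement under chance pairing of values.
    d_expected = sum(
        marginals[a] * marginals[b] for a, b in permutations(marginals, 2)
    ) / (n * (n - 1))
    if d_expected == 0:  # only one category ever used
        return 1.0
    return 1.0 - d_observed / d_expected
```

With two coders who agree on three of four units (one coder labels the last unit `b`, the other `c`), this yields an alpha of about 0.63; perfect agreement yields 1.0.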
sentence describing how analysis was performed on data collected by the authors of this paper
We conducted a qualitative analysis of user study transcripts and survey responses using a Grounded Theory approach [8]. First, the lead researcher collected a list of participants' behaviors, approaches, reflections on their experience, and feedback about the interface. The researcher then systematically coded this data, revisiting the data multiple times and refining the codes to ensure consistency and coherence. Through this process, high-level themes were identified and organized using affinity diagramming. Once the thematic structure was finalized, the researcher gathered supporting evidence for each theme and synthesized the findings, which were reviewed by the research team to ensure agreement on the results.
Activity log data, which revealed how participants actually used the interface, echoed the above findings. According to the log data, participants spent most of their reading time (66.31%) with vertical alignment on the second element in structure pairs, followed by alignment on the first element (29.19%), and left-justified alignment (5.13%). Highlighting usage showed a similar preference: 91.13% of time with all chunks highlighted, 8.25% with partial highlighting, and minimal time (0.63%) without highlights.
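Percentages like these come from aggregating per-mode durations in the activity log. A minimal sketch, assuming a hypothetical event format of (mode, seconds) pairs rather than the paper's actual log schema:

```python
from collections import defaultdict

def time_share(events):
    """Aggregate (mode, seconds) log events into each mode's share of total time."""
    totals = defaultdict(float)
    for mode, seconds in events:
        totals[mode] += seconds
    grand_total = sum(totals.values())
    # Percentage of total recorded time spent in each mode.
    return {mode: 100.0 * t / grand_total for mode, t in totals.items()}
```

For example, events totaling 60 s on the second element, 30 s on the first, and 10 s left-justified yield shares of 60%, 30%, and 10%.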
In this section, we present findings on how AbstractExplorer supports comparative close reading at scale by integrating quantitative survey responses and log data with qualitative analysis of transcripts and open-ended responses. The qualitative analysis process is described in detail in Appendix H.
Throughout the two tasks, we also collected detailed interaction logs including counts of user-defined aspects created, duration of highlighting usage, and time allocation across the three possible alignment options.
Both the gaze data and the semi-structured interviews revealed that lower-NFC participants were more willing to be guided by the three features and consciously took advantage of them.
Using a two-tailed Mann-Whitney U Test, we found that participants who reported their lowest perceived cognitive load when all three features were enabled had significantly lower NFC than participants who reported their lowest cognitive load level when skimming with no features enabled—in the baseline interface (p=0.03).
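The test statistic behind such a comparison can be sketched in pure Python. This computes only the Mann-Whitney U (with midranks for ties); in practice, scipy.stats.mannwhitneyu would also supply the two-tailed p-value:

```python
def mann_whitney_u(x, y):
    """Two-sample Mann-Whitney U statistic, using midranks for ties."""
    combined = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[combined[k][1]] = midrank
        i = j + 1
    n1 = len(x)
    r1 = sum(ranks[:n1])  # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * len(y) - u1
    return min(u1, u2)
```

Completely separated samples give U = 0; fully interleaved samples give a U near n1*n2/2.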
The raw NASA-TLX score is the sum of the responses to all six NASA-TLX questions, after reverse-scoring the appropriate items.
To compute a participant's NFC score, we averaged their response to the six questions, each ranging from 1 to 7, after reversing the appropriate questions.
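The reverse-scoring and averaging step can be sketched as follows. Which items are reverse-scored is an assumption here (it is fixed by the NFC instrument), and the helper names are illustrative:

```python
def reverse_score(response, scale_max, scale_min=1):
    """Flip a Likert response, e.g. 2 on a 1-7 scale becomes 6."""
    return scale_max + scale_min - response

def nfc_score(responses, reversed_items):
    """Mean of six 1-7 NFC items after reversing the negatively worded ones.

    `reversed_items` holds the zero-based indices of reverse-scored items.
    """
    adjusted = [
        reverse_score(r, 7) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)
```

The raw NASA-TLX score mentioned earlier follows the same pattern, except the adjusted responses are summed rather than averaged.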
For simplicity of analysis, we denote participants with NFC scores above the overall median NFC of 5.42 (IQR = 0.583) as higher NFC, and the rest as lower NFC.
To contrast participants' gaze patterns in each condition, we used a Tobii Pro Spark eye-tracker placed below the desktop monitor used by all subjects; Tobii Pro Lab software recorded each participant's gaze over time in each condition.
We collected 80 sentences from our abstracts dataset labeled by our system as "Methodology/Contribution." Participants viewed the same 80 sentences in each condition—often with a different subset of sentences initially visible due to ordering changes—but only had two minutes to look at them in each condition.
After obtaining an expanded set of high-level chunk labels, we assign them to each of the sentence chunks by using LLMs in a multiclass classification few-shot learning task, with the initial labels and assignment as examples (see prompt used in Appendix D.3).
Then, we segment sentences within each aspect into grammar-preserving chunks (see prompt used in Appendix D.2). This results in grammatically coherent chunks that are the basis of structure patterns. After identifying chunk boundaries, we again prompt an LLM to generate labels for the chunks in a human-in-the-loop approach: starting from an initial set of labels for chunk roles, whenever a new label is generated, a researcher examines it and merges it with existing labels where appropriate, controlling the total number of labels.
We process this data in a three-stage pipeline (Figure 6). In the first stage, Sentence Segmentation and Categorization, abstracts are split into individual sentences using the NLTK package, and each sentence is classified into one of the five pre-defined aspects as listed in Section 4.1.1. Classification is performed by prompting an LLM (see prompt used in Appendix D.1) with the sentence and its full abstract.
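The first pipeline stage could be sketched as below. This is a hedged sketch under stated assumptions, not the authors' implementation: a naive regex splitter stands in for NLTK's sent_tokenize, the five aspect names are placeholders (the paper's actual aspects are defined in its Section 4.1.1), and the LLM call is stubbed out:

```python
import re

# Placeholder aspect names; the paper defines its own five aspects.
ASPECTS = ["Background", "Objective", "Method", "Result", "Conclusion"]

def split_sentences(abstract):
    """Naive sentence splitter standing in for nltk.tokenize.sent_tokenize."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]

def classify_sentence(sentence, abstract, llm=None):
    """Stub for the LLM classification prompt (Appendix D.1 in the paper).

    A real implementation would send the sentence plus its full abstract
    to an LLM and parse one of the five aspect labels from the reply.
    """
    if llm is not None:
        return llm(sentence, abstract)
    return ASPECTS[0]  # trivial fallback so the sketch runs without an LLM

def segment_and_categorize(abstract, llm=None):
    """Stage 1: split an abstract into sentences and label each with an aspect."""
    return [(s, classify_sentence(s, abstract, llm)) for s in split_sentences(abstract)]
```

Passing a callable as `llm` lets the same skeleton wrap any model client; the stub keeps the sketch self-contained.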
After the interviews, we analyzed the data using the process described in Appendix B.
To analyze annotation efficiency, we first conducted a Kruskal-Wallis rank sum test [39] to determine whether there were statistically significant differences in annotation time across the three conditions; because our data violated the homogeneity-of-variances assumption, non-parametric methods were more appropriate.
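The omnibus test above is rank-based; a minimal sketch of the H statistic, without the tie correction (in practice scipy.stats.kruskal applies the correction and returns the p-value used for the significance decision):

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for k independent samples (no tie correction)."""
    # Pool all observations, tagging each with its group index.
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    rank_sums = [0.0] * len(groups)
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    n_total = len(pooled)
    # H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
    return (12.0 / (n_total * (n_total + 1))) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
```

Under the null hypothesis, H is approximately chi-squared distributed with k-1 degrees of freedom, which is how the p-value is obtained.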
AI for Efficiency - Using AI to Get Faster at Analysis Tasks
AI Tools for Each Phase of Analysis
“Analysts need to be able to dissect exactly how the AI reached a particular conclusion or recommendation,” says Chief Business Officer Eric Costantini. “Neo4j enables us to enforce robust information security by applying access controls at the subgraph level.”