Hypothesis

4,387 Matching Annotations

Jun 2026
glassmanlab.seas.harvard.edu

How Notations Evolve: A Historical Analysis with Implications for Supporting User-Defined Abstractions

23
1. elglassman 24 Jun 2026
  
  in Public
  
  Studying software teams, Cherubini et al. [34] found a "tendency to adopt informal, ad-hoc notations" and a "limited adherence to standards of any sort."
  
  ai-pending study descriptions
2. elglassman 24 Jun 2026
  
  in Public
  
  Studies are also conducted on various existing notating practices, usually in specific domains (e.g., how programmers draw diagrams to communicate ideas [34, 62, 63]).
  
  ai-pending study descriptions
3. elglassman 24 Jun 2026
  
  in Public
  
  What seem today as obvious notations often have relatively short histories: for instance, arrows in diagrams emerged around the 18th century.
  
  context ai-user-approved
4. elglassman 24 Jun 2026
  
  in Public
  
  Notations are deployed and embedded throughout the process of HCI and software development.
  
  One or more sentences contextualizing the current work with typically uncited statements about the past.
  
  context ai-user-approved
5. elglassman 24 Jun 2026
  
  in Public
  
  Almost everything we do with computers involves notations.
  
  One or more sentences contextualizing the current work with typically uncited statements about the past.
  
  context ai-user-approved
6. elglassman 24 Jun 2026
  
  in Public
  
  These informal interactions can then lead to formal representations, but depend upon pre-existing formalisms known to both humans and AI.
  
  context ai-user-approved
7. elglassman 24 Jun 2026
  
  in Public
  
  Seemingly 'obvious' notations to academics are also not obvious to everyone: e.g., about one-third of the U.S. and German populations have low literacy in reading data visualizations [51].
  
  ai-pending context
8. elglassman 24 Jun 2026
  
  in Public
  
  As current AI technologies rely upon, reproduce, and amplify established, dominant, already-formalized abstractions and notations in order to function
  
  ai-pending context
9. elglassman 24 Jun 2026
  
  in Public
  
  Many notations are culturally learned and inherited.
  
  One or more sentences contextualizing the current work with typically uncited statements about the past.
  
  context ai-user-approved
10. elglassman 24 Jun 2026
  
  in Public
  
  From our analysis, we derive a set of initial implications for the design of future systems that create new abstractions (Section 5), including that notations primarily originate through linking metaphors and most often in a social—rather than a technical—context, and that notation design decisions around what to include as "meaningful" (and thus what to exclude) are often left implicit by inventors, but could be made explicit and become manipulable objects through reification [10].
  
  contribution ai-user-approved
11. elglassman 24 Jun 2026
  
  in Public
  
  Our work contributes to a longstanding dream of dynamic abstractions in HCI, where users can dynamically communicate and express themselves through notations (interfaces) that they are most comfortable with at the moment of expression, beyond ones predefined by developers [96, 143, 144, 148, 149].
  
  ai-pending contribution
12. elglassman 24 Jun 2026
  
  in Public
  
  Here we present suggestions for system designers, with concrete examples inspired by our patterns. These are just some interesting ideas that came to mind, rather than an exhaustive list.
  
  ai-pending contribution
13. elglassman 24 Jun 2026
  
  in Public
  
  Alongside the social stages of notation development above are three functional stages that emerge from reflection upon our analysis—descriptive, generative, and evaluative stages (borrowing terminology from Generative Theories of Interaction [11])
  
  ai-pending contribution
14. elglassman 24 Jun 2026
  
  in Public
  
  Our historical analysis suggests that, cognitively and socially, a notation proceeds by:
  (1) Enumerating dimensions of meaningful variation in the target domain, which proliferate as more situations are encountered or considered (whether by inventors or users)
  (2) Mapping dimensions of meaningful variation to perceptual channels of representation
  (3) Designing the notation to leverage perceptual affordances by visual analogy to embodied transformations like pouring cups or rotating shapes, and ensuring these "natural" manipulations hold meaning in the target domain
  
  ai-pending contribution
15. elglassman 24 Jun 2026
  
  in Public
  
  These stages form a spectrum and are not rigid boundaries. We clustered patterns into the most relevant stage for ease of presentation; however, patterns can be applicable across stages.
  
  ai-pending contribution
16. elglassman 24 Jun 2026
  
  in Public
  
  Our review identified many empirical patterns in the notation development process. We state each pattern, briefly describe it, and provide examples.
  
  ai-pending contribution
17. elglassman 24 Jun 2026
  
  in Public
  
  Our analysis identifies 33 patterns of how notations are created, evolved, and formalized over time, which are largely shared across histories and loosely categorized into three social stages of development (invention/incubation, dispersion/divergence, and institutionalization/sanctification) and three functional stages (descriptive, generative, and evaluative).
  
  contribution ai-user-approved
18. elglassman 24 Jun 2026
  
  in Public
  
  What about novel formalisms and notations? How are new abstractions created, evolved, and incrementally formalized over time—and how might new systems, in turn, be explicitly designed to support these processes?
  
  research question ai-user-approved
19. elglassman 24 Jun 2026
  
  in Public
  
  How might we co-create a new notation with a machine, and thereafter communicate through that notation, even share out the notation to broader communities?
  
  research question ai-user-approved
20. elglassman 24 Jun 2026
  
  in Public
  
  While current AI systems support "horizontal" translations from informal ideas to established notations, how should we ensure that the "vertical" process of creation—new notations, new abstractions—is also supported?
  
  research question ai-user-approved
21. elglassman 24 Jun 2026
  
  in Public
  
  How do humans ultimately develop new notations, new formalisms, and new abstractions, that they use to communicate with machines and each other?
  
  research question ai-user-approved
22. elglassman 24 Jun 2026
  
  in Public
  
  The use of notation happens every day in small ways, e.g., whenever people work together over a whiteboard or paper towards a joint objective. People jot down X's, boxes and arrows to stand for concepts they are working through.
  
  context ai-user-approved
23. elglassman 24 Jun 2026
  
  in Public
  
  Human-computer interactions have historically been mediated by formally-defined structures—such as command-line interfaces, graphical user interfaces, and programming languages—that provide an unambiguous mapping to an underlying formal model.
  
  context ai-user-approved

Tags

contribution

ai-user-approved

ai-pending

study descriptions

research question

context

Annotators

elglassman

URL

glassmanlab.seas.harvard.edu/papers/notationsCHI26.pdf
dl.acm.org

Self-Determination Theory in HCI Games Research: Current Uses and Open Questions

6
1. elglassman 24 Jun 2026
  
  in Public
  
  SDT broadly differentiates three types of motivation [157]: Intrinsic motivation denotes activity pursued for its inherently interesting or enjoyable qualities. Extrinsic motivation refers to activity pursued for a separable outcome. Amotivation denotes the absence of intentional motivation, where a person may no longer be aware why they pursue an activity.
  
  ai-pending Self-Determination Theory
2. elglassman 24 Jun 2026
  
  in Public
  
  Basic psychological needs theory (BPNT) posits three basic psychological needs that energise organismic processes: competence, the feeling of having an effect; autonomy, a sense that actions are self-endorsed and performed willingly; and relatedness, a sense of reciprocal care, value, and belonging in relation to other social figures and collectives [158].
  
  ai-pending Self-Determination Theory
3. elglassman 24 Jun 2026
  
  in Public
  
  SDT is broadly organised into six mini-theories, whose underlying concepts are continuously developed, critiqued, and revised (e.g., [186, 190, 191]).
  
  ai-pending Self-Determination Theory
4. elglassman 24 Jun 2026
  
  in Public
  
  At its core, SDT is a scientific theory [163], in that it contains a number of empirically-testable propositions [199] that generalise across varied contexts, which serve to explain and predict the impact of certain events on motivation and wellbeing.
  
  ai-pending Self-Determination Theory
5. elglassman 24 Jun 2026
  
  in Public
  
  SDT is a psychological macro-theory of human motivation, growth, and wellbeing [47, 48, 163] that characterises humans as fundamentally active organisms.
  
  ai-pending Self-Determination Theory
6. elglassman 24 Jun 2026
  
  in Public
  
  Self-Determination Theory (SDT), a major psychological theory of human motivation, has become increasingly popular in Human-Computer Interaction (HCI) research on games and play.
  
  ai-pending Self-Determination Theory

Tags

ai-pending

Self-Determination Theory

Annotators

elglassman

URL

dl.acm.org/doi/pdf/10.1145/3313831.3376723
dl.acm.org

Self-Determination Theory and HCI Games Research: Unfulfilled Promises and Unquestioned Paradigms

3
1. elglassman 24 Jun 2026
  
  in Public
  
  To our knowledge, the first SDT research involving videogames [18] was conducted shortly after Deci's original formulation of CET [129] and investigated whether extrinsic rewards would reduce intrinsic motivation even for 'highly intrinsically motivating' activities such as videogame play. Videogames' intrinsically motivating qualities were also examined in early research on learning [e.g., 351]; however, focused examination of other core SDT concepts such as need satisfaction largely began much later [365].
  
  ai-pending history
2. elglassman 24 Jun 2026
  
  in Public
  
  Research on games and play in HCI (henceforth HCI games research), however, has continued to employ broad psychological theories as foundational work [417, 556]. One prominent example can be seen in self-determination theory (SDT) [481, 483], an influential theory of human motivation, which has provided HCI games research with propositions and concepts that can help explain motivational and experiential qualities of games and game-adjacent systems (e.g., gamification).
  
  ai-pending history
3. elglassman 24 Jun 2026
  
  in Public
  
  Psychological concepts and models have long been employed in human–computer interaction (HCI) to theorise the human user [88]. However, early applications of cognitive psychological theory did not develop into a coherent foundation of knowledge about human factors [89, 109, 455]—circumstances that Rogers [456, p. 22] attribute to "the stark differences between a controlled lab setting and the messy real world setting" for which interactive artefacts and systems are designed. The deployment of broad theory in HCI has subsequently declined in the intervening years [455, 456], and this sporadic progress in theory development in domains such as usability and user experience (UX) has been identified as a cause for concern [249, 314].
  
  ai-pending history

Tags

history

ai-pending

Annotators

elglassman

URL

dl.acm.org/doi/pdf/10.1145/3673230
techstackups.com

GLM-5.2 vs Claude Opus | Tech Stackups

1
1. pyxelr 23 Jun 2026
  
  in Public
  
  GLM-5.2 vs Claude Opus
  
  Overview of GLM-5.2: It is Z.ai's latest flagship model, released with fully open weights under the permissive MIT license. It features a usable 1-million-token context window and dynamic capability routing via two thinking effort levels (High and Max).
  
  Core Limitations: GLM-5.2 is strictly text-only and lacks multimodal capabilities. It cannot process or analyze visuals, screenshots, or user interface states natively.
  
  Pricing Advantage: GLM-5.2 offers a substantial price reduction compared to top proprietary engines. Its API is priced at $1.40 per million input tokens and $4.40 per million output tokens, making its output generation over 5x cheaper than Claude Opus 4.8 ($5 input / $25 output).
  
  Head-to-Head Testing (WebGL Game from Scratch): Both models were prompted to build a third-person 3D platformer game in raw WebGL without utilizing external 3D engine libraries (such as Three.js).
  
  Claude Opus 4.8 Execution: Completed the build in 33 minutes and 30 seconds using ~217k output tokens ($21.92 estimated cost). It successfully implemented correct camera controllers, textures, animations, and valid win conditions.
  
  GLM-5.2 Execution: Took 1 hour, 10 minutes, and 40 seconds using ~131k output tokens ($5.39 real billed cost). While it successfully coded advanced mechanics like spring launch velocity, it introduced basic structural bugs—such as rendering the player backwards, omitting character textures, and ignoring win states.
  
  The Multimodal Verification Edge: Claude Opus leveraged its vision to inspect automated screenshots of the game, spotting and cleaning up debug overlays prior to completion. GLM-5.2 had to rely on a fallback script that sampled raw pixel colors; it verified the existence of the correct color palette but missed catastrophic visual rendering and layout bugs.
  
  Benchmark Performance: Official metrics place GLM-5.2 directly between Claude Opus 4.7 and 4.8. It trails Opus 4.8 on multi-file reasoning, repository-level debugging, and complex software architectures (such as SWE-Marathon and DeepSWE), but matches or exceeds frontier models on core code generation, tool use (MCP-Atlas), and math benchmarks (AIME 2026).
  
  Hacker News Discussion
  
  Orchestration and Tool Selection Over Model Scale: Commenters point out that the orchestration layer is becoming the primary differentiator in production AI. The core challenge for modern engineering agents is no longer raw token intelligence, but the ability to correctly navigate real-world toolchains and evaluate responses within complex environments.
  
  Shift from Mainframe to PC Era in AI: The discussion highlights an architectural shift from monolithic central cloud APIs toward decentralized execution. Users emphasize that open-weight deployments give developers long-term vendor optionality and structural independence from platform deprecations or policy shifts.
  
  High Compute and Output Latency Overhead: Multiple engineers note that while GLM-5.2 is remarkably smart for an open-weight model, it is highly token-hungry. Its extended reasoning traces can consume over 40k tokens and multiple minutes of thinking before outputting files, making inference speed an ongoing optimization bottleneck.
  
  The Practical Value of Local and Managed Hosting: The community highlights that having an MIT-licensed model at this tier eliminates vendor lock-in risks. For developers without massive on-premise hardware setups (such as multi-H100 configurations) to serve a 756B parameter model, using cost-effective managed endpoints like OpenRouter provides the perfect balance of massive savings and immediate API access.
  
  GLM Claude Opus AI LLM

Tags

AI

GLM

Claude

LLM

Opus

Annotators

pyxelr

URL

techstackups.com/comparisons/glm-5.2-vs-opus/
forsal.pl

AI will drive the Polish economy, but there are costs too. Well over a quarter of a million people could lose their jobs - Forsal.pl

1
1. pyxelr 23 Jun 2026
  
  in Public
  
  AI will drive the Polish economy, but there are costs too. Well over a quarter of a million people could lose their jobs
  
  The World Bank forecasts that AI could raise Poland's GDP by 12% by 2035, while at the same time reducing employment by as many as 350,000 jobs.
  
  The biggest gains are expected in IT and construction (growth of up to 25%). The financial sector may grow economically, but employment in it could fall by 25%. Programmers and the IT industry may also see a drop in the number of jobs. Construction could gain about 20% more jobs.
  
  If Poles are unwilling to change occupations, as many as 350,000 people will lose their jobs. With high worker mobility, the models put the job loss at only 3,000.
  
  The state budget will feel the changes: revenue from personal income tax (PIT) and social insurance (ZUS) contributions will fall, while revenue from CIT and VAT will rise.
  
  AI work IT data Poland polish

Tags

polish

AI

data

IT

work

Poland

Annotators

pyxelr

URL

forsal.pl/gospodarka/pkb/artykuly/11265929,ai-podbije-polska-gospodarke-pkb-w-gore-o-12-proc-ale-sa-tez-koszty.html
www.garfield.law

Garfield AI - Automated Debt Recovery & Legal Claims

1
1. tonz 23 Jun 2026
  
  in Public
  
  Garfield, the AI 'lawyer' service mentioned in [[HR consultant wins English court case using AI lawyer in apparent legal first]]
  
  Most of it is sending a reminder, and then a letter before taking legal action. Both can be automated, and mostly are, without needing AI. So what remains is the claims of: - starting court proceedings, - hiring an actual lawyer for representation in court, - suggesting how to deal with counterclaims. Only the last item seems to me to actually have something to it.
  
  ai lawyers examples

Tags

ai

examples

lawyers

Annotators

tonz

URL

garfield.law/
www.theguardian.com

HR consultant wins English court case using AI lawyer in apparent legal first

1
1. tonz 23 Jun 2026
  
  in Public
  
  Description of how AI 'won' a court case. A somewhat messy description of the actual case and of the role of AI in it. Says the AI is a commercially available service, Garfield, that was authorized for claims up to 10k.
  
  lawyers ai courtcase examples

Tags

ai

examples

lawyers

courtcase

Annotators

tonz

URL

theguardian.com/technology/2026/jun/22/artificial-intelligence-law-firm-wins-court-case-in-england-for-first-time
minid.net

The Web We Know Is Going to Disappear - Minid.net

1
1. tonz 22 Jun 2026
  
  in Public
  
  Via [[Frank Meeuwsen p]] - [ ] return #openweb #pkm #writing
  
  There is much to unpack here, and it is convoluted. E.g., K generation, when K needs an observer. Or an epistemological centipede when no original input remains.
  
  pkm ai open een, writing reading knowledge sensemaking

Tags

reading

open een, writing

knowledge

pkm

ai

sensemaking

Annotators

tonz

URL

minid.net/2026/6/15/the-web-is-going-to-dissapear
techcrunch.com

https://techcrunch.com/2026/06/21/when-the-trump-administration-cracks-down-on-anthropic-who-benefits/

1
1. fxp007 21 Jun 2026
  
  in Public
  
  Anthropic has not had the best relationship with the Trump administration in a way that stands apart from the other leading AI labs
  
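  Most people assume the Trump administration treats all AI labs the same way, but the author points out that Anthropic's relationship with the administration is especially strained, unlike that of the other leading AI labs.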
  
  non-consensus government-relations ai-laboratories

Tags

ai-laboratories

government-relations

non-consensus

Annotators

fxp007

URL

techcrunch.com/2026/06/21/when-the-trump-administration-cracks-down-on-anthropic-who-benefits/
openai.com

https://openai.com/index/samsung-electronics-chatgpt-codex-deployment

1
1. fxp007 21 Jun 2026
  
  in Public
  
  This historic deployment for OpenAI is particularly significant because Samsung Electronics, a global leader in technology and manufacturing, is embracing AI not as a tool limited to certain teams or functions, but as a core platform for improving how employees around the world work and innovate.
  
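  This quote emphasizes that Samsung Electronics is adopting AI not merely as a tool but as a core platform, which will greatly advance how its employees around the world work and innovate.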
  
  core-argument ai-adoption business-strategy

Tags

business-strategy

core-argument

ai-adoption

Annotators

fxp007

URL

openai.com/index/samsung-electronics-chatgpt-codex-deployment
www.wheresyoured.at

AI Is Slowing Down

1
1. pyxelr 21 Jun 2026
  
  in Public
  
  AI Is Slowing Down
  
  Unsustainable Revenue Requirements and Financial Imbalance:
  
  The AI industry is facing a harsh economic reality driven by aggressive over-investment in data center construction and massive compute commitments.
  
  To achieve baseline solvency, cover soaring operational expenses, and service its massive debt burdens, the AI sector as a whole must generate an astronomical $2 trillion to $3 trillion in annual revenue by 2030.
  
  Severe Debt Pressures on Tech Giants (Hyperscalers):
  
  Major AI labs and hyperscalers (such as Microsoft, Google, and Meta) find themselves locked in a capital-intensive infrastructure arms race.
  
  To sustain this frantic buildout of computational capacity, these corporations are under continuous pressure to issue hundreds of billions of dollars in debt or flood the market with massive equity, creating significant systemic risk if monetization fails to materialize.
  
  Extreme Disconnect Between Compute Supply and Real Demand:
  
  There is a staggering gap between the infrastructure being built and actual market consumption; current global demand for AI compute sits below $100 billion.
  
  Driven by their staggering long-term compute liabilities, frontline entities like OpenAI and Anthropic face an incredibly steep uphill battle, needing to scale their individual monthly revenues to at least $10 billion each by early 2028 just to remain solvent.
  
  Dangerous Market Concentration and Lack of Diversification:
  
  The commercial generative AI landscape is dangerously centralized, with just two companies—OpenAI and Anthropic—capturing roughly 89% of all startup revenue in the sector.
  
  This extreme consolidation reveals a critical lack of broad, diversified enterprise demand across the wider economy, meaning the massive server infrastructure being deployed relies almost entirely on the survival and growth of a tiny handful of players.
  
  Corporate Cost-Cutting and Strict Spending Caps by CFOs:
  
  Initial corporate enthusiasm for AI integration is stalling as enterprises encounter the harsh realities of variable pricing.
  
  As major AI vendors transitioned to usage-based token billing, companies like Uber, T-Mobile, and Brex experienced a severe lack of cost visibility; this has prompted CFOs to step in, mandate strict budget caps, and actively scale back their AI consumption to protect their bottom lines.
  
  AI finances

Tags

AI

finances

Annotators

pyxelr

URL

wheresyoured.at/ai-is-slowing-down/
arstechnica.com

Leaked financial docs show OpenAI is losing billions of dollars a year

1
1. pyxelr 21 Jun 2026
  
  in Public
  
  Leaked financial docs show OpenAI is losing billions of dollars a year
  
  Massive Net Losses: In 2025, OpenAI generated $13.07 billion in revenue but racked up $34 billion in total costs and expenses, resulting in an operating loss of $20.92 billion.
  
  One-Time Accounting Impact: Due to its transition from a non-profit to a for-profit entity, the company recorded a $41.55 billion loss from fair value changes in convertible interests and warrant liabilities. This brought the final net loss attributable to OpenAI to $38.53 billion.
  
  Year-over-Year Trajectory: Expenses and losses grew sharply compared to 2024, when OpenAI brought in $3.7 billion in revenue against $12.48 billion in total costs, yielding a net loss of $5.09 billion.
  
  Core Expense Breakdown (2025):
  
  Research and Development (R&D): $19.18 billion (up from $7.81 billion in 2024).
  
  Cost of Revenue: $7.5 billion (up from $2.65 billion in 2024).
  
  Sales and Marketing: $5.73 billion (up from $1.11 billion in 2024).
  
  General and Administrative: $1.57 billion.
  
  Strategic Capital Flow & Microsoft Relationship: OpenAI paid Microsoft $17.2 billion in service fees during 2025 ($10.59 billion for R&D/model training and $6.047 billion for computing cost of revenue). By the end of 2025, OpenAI still had a remaining liability of $3.64 billion to Microsoft.
  
  Inbound Funding: Strategic partners provided substantial inflows; OpenAI received $867 million from SoftBank and $303 million from Microsoft in 2025.
  
  Remaining Cushion: As of the close of 2025, OpenAI held slightly over $50 billion in total assets, with nearly half of that cushion (~$25 billion) maintained as liquid cash reserves.
  
  Hacker News Discussion
  
  R&D vs. Inference Costs: Commenters debate whether OpenAI can safely shift its massive R&D expenditure toward minimizing inference costs. While cheaper models like DeepSeek are heavily praised for personal and developer productivity, some argue stopping frontier model research means losing the structural race entirely.
  
  Diminishing Returns on Model Power: Users question whether a marginally smarter model justifies an exponentially higher cost. A central discussion point revolves around the financial viability of paying massive premiums for enterprise-tier models compared to utilizing low-cost API alternatives.
  
  The Math of Productivity Upgrades: A highly debated calculation suggests that even a 5% boost in productivity for a high-earning employee justifies hundreds of dollars in monthly subscriptions. However, critics counter that the financial surplus of that productivity is captured by companies and owners, rather than resulting in worker wage increases.
  
  The Path to Monetization: The consensus leans toward enterprise seat monetization (charging upwards of $2,000/month per corporate professional) and securing multi-billion dollar government contracts as the only viable business models. The inevitable integration of embedded or covert advertisements for free tiers is also viewed as highly likely.
  
  AGI as a Pseudo-Religious Goal: Several participants view Silicon Valley's relentless capitalization of unprofitable AI models as an irrational, faith-based pursuit of AGI (Artificial General Intelligence), comparing the narrative to religious prophecies.
  
  AI OpenAI FinOps

Tags

OpenAI

FinOps

AI

Annotators

pyxelr

URL

arstechnica.com/ai/2026/06/leaked-financial-docs-show-openai-is-losing-billions-of-dollars-a-year/
dgi6ph9bl5lv1x.archive.is

What Rutger Bregman does is ‘bash the left, cash in on the right’ | de Volkskr…

1
1. tonz 20 Jun 2026
  
  in Public
  
  [[Felienne Hermans p]] on Rutger Bregman on AI. Key points: misrepresentation of what Chomsky (on morality) and Bender say about AI (Chomsky that it cannot have morality, Bender the projection that we will inevitably see AI output as the result of thinking); embracing the belief that AI is a path to universal basic wealth, i.e., the techbro LT/EA thinking; his School for Moral Ambition positions 'good' companies (cf. social impact company) and dismisses other paths to change (demonstrations, politics, non-capitalist approaches) in order to stay within the assumptions of the current eco/tech/pol system. 'Bash the left, cash in on the right' (links bashen, rechts cashen) is her verdict. I think moving to the US means he doesn't even realize that he is shifting position.
  
  ai rutgerbregman techbro sil

Tags

ai

sil

rutgerbregman

techbro

Annotators

tonz

URL

dgi6ph9bl5lv1x.archive.is/
www.politico.eu

Europe must choose between AI and climate goals, data center lobby says

1
1. tonz 20 Jun 2026
  
  in Public
  
  https://web.archive.org/web/20260620090720/https://www.politico.eu/article/europe-choose-ai-climate-goals-data-center-chief-warns/
  
  "must burn planet for data centers"
  
  climatecrisis digital-policy datacenters ai

Tags

datacenters

digital-policy

climatecrisis

ai

Annotators

tonz

URL

politico.eu/article/europe-choose-ai-climate-goals-data-center-chief-warns/
rutgerbregman.substack.com

An Inconvenient Truth About AI

1
1. tonz 19 Jun 2026
  
  in Public
  
  Rutger Bregman on AI hype
  
  [ ] return
  
  ai

Tags

ai

Annotators

tonz

URL

rutgerbregman.substack.com/p/an-inconvenient-truth-about-ai
www.cusp.ai

Untitled document

3
1. fxp007 19 Jun 2026
  
  in Public
  
  Prof. Geoffrey Hinton
  
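  Hinton and LeCun both appear on the advisor list: two "godfathers of AI" jointly endorsing the same company, which is rare. Hinton has kept issuing AI safety warnings in recent years, yet his choice to back a field with clearly positive applications, such as AI for materials, is itself a statement of values: using scientific discovery to counterbalance the AI risk narrative.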
  
  Geoffrey Hinton 顾问 AI教父 AI for Good 背书信号
2. fxp007 19 Jun 2026
  
  in Public
  
  Prof. Max Welling
  
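  Max Welling serving as CTO is a telling choice. Welling is a core proponent of graph neural networks (GNNs) and equivariant neural networks (such as SE(3)-Transformers), and molecular and crystal structures naturally have symmetry and graph structure. His research background is almost tailor-made for molecular property prediction, and it brings more AI-native technical depth than a CTO with a purely cheminformatics background would.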
  
  Max Welling 图神经网络等变神经网络 AI原生 CTO
3. fxp007 19 Jun 2026
  
  in Public
  
  While nature took billions of years to perfect molecules, we are harnessing AI to unlock trillion-dollar materials breakthroughs in months, not millennia.
  
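  cusp.ai's core narrative: compressing hundreds of millions of years of evolution into breakthroughs within months. The sentence precisely captures the ultimate promise of AI for science: not assisting scientists, but replacing evolutionary time itself. "Months, not millennia" is a kind of time folding, much like AlphaFold's impact on protein folding, only with materials as the target.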
  
  AI for Science 材料发现时间压缩 AI加速科学

Tags

等变神经网络

AI教父

Max Welling

时间压缩

AI加速科学

背书信号

AI for Science

CTO

AI原生

顾问

Geoffrey Hinton

材料发现

AI for Good

图神经网络

Annotators

fxp007

URL

cusp.ai/
cloud.google.com

How the Open Knowledge Format can improve data sharing | Google Cloud Blog

1
1. fxp007 18 Jun 2026
  
  in Public
  
  these atoms of knowledge live in a variety of highly fragmented systems
  
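  This passage describes the reality in most organizations: the contextual knowledge that is actually useful (what a table means, how a metric is defined, runbooks, the join path between two systems) is scattered across data catalog APIs, wikis, code comments, shared folders, and the heads of a few senior engineers. Whenever a new AI agent needs to answer a question like "how do I compute weekly active users from the event stream", it has to piece the answer back together from these mutually incompatible fragments. This is a seriously underestimated obstacle to AI adoption, and as the number of agents grows, the problem gets worse quadratically.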
  
  知识碎片化 AI上下文数据目录

Tags

知识碎片化

AI上下文

数据目录

Annotators

fxp007

URL

cloud.google.com/blog/products/data-analytics/how-the-open-knowledge-format-can-improve-data-sharing
openai.com

Untitled document

1
1. fxp007 18 Jun 2026
  
  in Public
  
  Chemists found the suggestion both surprising and interesting
  
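  This is one of the most noteworthy details in the whole piece. TEMPO is a mild radical oxidant and is usually not an organic chemist's first instinct when considering a coupling reaction. The AI proposed a hypothesis that human experts found surprising yet plausible, which is the core of research value: not rediscovering what is already known, but finding connections within the existing knowledge space that sit in the blind spots of human vision. If the AI had merely recombined directions already in the literature in a systematic way, the result would not be worth publishing.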
  
  非共识假设 TEMPO AI创造力

Tags

AI创造力

TEMPO

非共识假设

Annotators

fxp007

URL

openai.com/index/ai-chemist-improves-reaction/
rorytruex.substack.com

Will AI Break the University?

1
1. JoeMurphy 18 Jun 2026
  
  in Public
  
  A key through line of all these tasks is that they are time consuming
  
  Ethan Mollick, in Co-Intelligence, makes the point that part of the signal of any letter of reference is that this person is so good that I'll burn my own time to tell you about them. Does the same "signal" concept apply to peer review and student work? (It's not entirely clear to me it does; evaluation is a different task than recommendation. But I still feel like it's worth asking how we signal value based on our use of time in evaluative processes.)
  
  AI time time management evaluation recommendation

Tags

AI

time management

recommendation

evaluation

time

Annotators

JoeMurphy

URL

rorytruex.substack.com/p/will-ai-break-the-university
www.tomtunguz.com

https://www.tomtunguz.com/local-coding-models/

1
1. fxp007 17 Jun 2026
  
  in Public
  
  Comparing agentic Qwen3.6 35b to Claude Opus is like a junior with knowledge across the board, that you really need to guide, versus a senior that thinks with you on architecture.
  
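  This analogy explains well the difference between local models and advanced cloud AI. Local models are powerful but still need considerable guidance, whereas cloud models such as Claude Opus are better able to reason about architecture on their own. Developers using local models should set realistic expectations and be prepared to provide more guidance.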
  
  ai-capabilities realistic-expectations
Visit annotations in context

Tags

realistic-expectations

ai-capabilities

Annotators

fxp007

URL

tomtunguz.com/local-coding-models/
www.tomtunguz.com www.tomtunguz.com

https://www.tomtunguz.com/golden-age-of-applications/

4
1. fxp007 17 Jun 2026
  
  in Public
  
  The nuances of tuning the carburetors & the timing belts of these complex beasts are tasks better assigned to a few vendors to deliver maximum intelligence per dollar & amortize the costs across a broader population.
  
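  The author likens AI systems to complex machinery that needs fine tuning (carburetors and timing belts). He suggests leaving this specialized work to a few vendors in order to maximize intelligence per dollar and spread the costs. This reflects a trend toward specialization and centralization in AI application development, and it matters for startups deciding whether to build AI capabilities in-house.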
  
  ai-operations cost-efficiency
2. fxp007 17 Jun 2026
  
  in Public
  
  Loops, the critical problem-definition exercise of this era, are hard to design. Systems design is an entire discipline... What is the best way to define a loop so an agentic system improves?
  
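  The author stresses the central role of 'loop' design in AI applications, calling it the critical problem-definition exercise of this era. This reflects the importance of systems design in AI application development, especially designing loops that let an agentic system keep improving. For beginners it is an easily overlooked but crucial concept.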
  
  system-design ai-architecture
3. fxp007 17 Jun 2026
  
  in Public
  
  AI applications present three new disciplines to master: picking the right models, developing the hill-climbing loop, & evaluating the performance of the system for each company
  
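  The author points out that AI application development differs fundamentally from SaaS and requires mastering three new disciplines: picking the right models, developing the hill-climbing loop, and evaluating system performance. For beginners this is an important shift in thinking: AI application development requires a new mindset and skill set, not a simple extension of traditional software development.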
  
  ai-development skill-gap
4. fxp007 17 Jun 2026
  
  in Public
  
  the Fable retraction exposed model dependency risk, Satya's thesis defined the learning loop, & Salesforce's $3.6B Fin acquisition priced the harness.
  
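  The author cites three key developments as evidence that AI applications have entered a golden age: the exposure of model dependency risk, the definition of the learning loop, and the market's pricing of the AI harness. These reflect three important dimensions of AI application development (risk control, strategic consensus, and market validation) and are valuable for understanding where AI applications currently sit in the ecosystem.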
  
  ai-ecosystem market-validation
Visit annotations in context

Tags

ai-development

cost-efficiency

skill-gap

ai-architecture

system-design

ai-ecosystem

market-validation

ai-operations

Annotators

fxp007

URL

tomtunguz.com/golden-age-of-applications/
www.wired.com www.wired.com

https://www.wired.com/story/the-white-house-wants-anthropic-to-block-all-jailbreaks-that-may-not-be-possible/

3
1. fxp007 17 Jun 2026
  
  in Public
  
  The government believes it has become aware of a method of bypassing, or 'jailbreaking' Fable 5.
  
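  This is a government claim that needs verification, concerning the specifics of an AI security vulnerability. It should be confirmed whether the government really discovered such a method, how effective the method is, and how far its impact extends. It reflects an ongoing challenge in AI safety research.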
  
  security-breach government-claim ai-vulnerability
2. fxp007 17 Jun 2026
  
  in Public
  
  Security experts say that can't be done.
  
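  This is a key technical claim, but it lacks specific citations and evidence. It should be confirmed which security experts hold this view, what their professional backgrounds are, and whether they have specific research or cases to support it. The point bears on the practical feasibility of AI safety techniques.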
  
  expert-opinion technical-claim ai-safety
3. fxp007 17 Jun 2026
  
  in Public
  
  Trump administration officials tell WIRED that if Anthropic wants to rerelease Fable 5, it will need to ensure the model's guardrails can't be circumvented.
  
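  This is an important factual claim that needs verification, concerning the Trump administration's specific requirements for AI safety. It should be confirmed whether this is official policy and whether the requirements are reasonable and feasible. It reflects the increasingly tense relationship between the government and AI companies.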
  
  fact-check ai-safety government-policy
Visit annotations in context

Tags

government-policy

technical-claim

expert-opinion

security-breach

ai-safety

fact-check

government-claim

ai-vulnerability

Annotators

fxp007

URL

wired.com/story/the-white-house-wants-anthropic-to-block-all-jailbreaks-that-may-not-be-possible/
www.anthropic.com www.anthropic.com

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

6
1. fxp007 17 Jun 2026
  
  in Public
  
  We believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts.
  
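  This captures Anthropic's core argument: it supports government regulation but demands transparency and fact-based decision-making. It is worth looking into the company's earlier public positions on AI regulation and whether this episode is consistent with its long-standing policy.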
  
  core-argument ai-governance
2. fxp007 14 Jun 2026
  
  in Public
  
  We have found that other publicly-available models are able to discover them as well without requiring a bypass.
  
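  Most people assume Fable 5's vulnerabilities are a uniquely serious problem, but the authors argue that other publicly available models can discover these vulnerabilities without a bypass. This challenges the perception that Fable 5 poses a special security risk and implies that the government overreacted.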
  
  non-consensus ai-comparison counterintuitive
3. fxp007 14 Jun 2026
  
  in Public
  
  If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.
  
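  Most people see government safety regulation of AI models as a necessary protective measure, but the authors argue that if this standard (recalling a commercial model over the discovery of a narrow potential jailbreak) were applied across the industry, it would essentially halt all new model deployments for all frontier model providers. This view challenges the consensus on AI regulation.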
  
  non-consensus ai-regulation counterintuitive
4. fxp007 14 Jun 2026
  
  in Public
  
  We suspect that perfect jailbreak resistance is not currently possible for any model provider.
  
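  Most people assume AI models can be designed to be completely immune to 'jailbreaking', but the authors argue that perfect jailbreak resistance is currently impossible for any model provider, because the safeguards used across the industry are all vulnerable to non-universal jailbreaks. This argument challenges conventional wisdom in AI safety.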
  
  non-consensus ai-safety counterintuitive
5. fxp007 13 Jun 2026
  
  in Public
  
  We suspect that perfect jailbreak resistance is not currently possible for any model provider.
  
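  Most people think AI companies should pursue perfect safety protections, but the authors candidly admit that perfect protection is impossible. This challenges the expectation in AI safety that companies should be able to fully prevent misuse of their models, and points instead toward a more realistic defense strategy.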
  
  non-consensus ai-safety realism
6. fxp007 13 Jun 2026
  
  in Public
  
  We have found that other publicly-available models are able to discover them as well without requiring a bypass.
  
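  Most people regard the discovery of vulnerabilities in an AI model as a serious security problem requiring immediate action, but the authors argue that these vulnerabilities also exist in other public models, implying that the government overreacted. This challenges the consensus in AI safety that any vulnerability should be treated as a major threat.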
  
  non-consensus ai-safety counterintuitive
Visit annotations in context

Tags

core-argument

realism

ai-comparison

non-consensus

ai-regulation

ai-safety

ai-governance

counterintuitive

Annotators

fxp007

URL

anthropic.com/news/fable-mythos-access
vickiboykis.com vickiboykis.com

Running local models is good now

1
1. pyxelr 17 Jun 2026
  
  in Public
  
  Running local models is good now
  
  Evolving Quality: Local Large Language Models (LLMs) have achieved major milestones in accuracy, utility, and speed over the past six months, transitioning from simple "personalized Google" documentation lookups to handling localized agentic software development workflows.
  
  Hardware Requirements: Running larger models effectively requires high-spec hardware (e.g., Apple M-Series with 64 GB+ unified RAM) to maintain an expansive Key-Value (K-V) cache and avoid critical performance degradation.
  
  Top Performing Architecture: Recent open-weights families, such as Gemma 4 (specifically the gemma-4-26b-a4b and the faster gemma-4-12b-qat), have successfully reached roughly 75% of the accuracy and speed found in cloud-hosted frontier API models.
  
  Agentic Workflows: Local models can now successfully loop and interact with local environments to orchestrate non-trivial tasks like refactoring code, writing unit tests, and bootstrapping full application repositories.
  
  Secure Execution: Running developer-facing local agents poses local file system security risks, making a decoupled architecture—such as isolating the agent harness inside a containerized Docker Sandbox with restricted shell permissions—an essential security best practice.
  
  Persistent Ecosystem Bottlenecks: Despite massive progress, challenges remain around slow initial token pre-fill, limited context windows bounded by local hardware constraints, prompt template mismatches on release, and the heavy compute strain that maxes out GPU and RAM.
  
  Hacker News Discussion
  
  Operational Friction: Many users argue that local models remain painful to run effectively. They note a stark divide between smart but slow dense models (e.g., Qwen 27B, Gemma 31B) and fast but error-prone Mixture of Experts (MoE) models.
  
  The Quantization Trap: Commenters point out that many users run low-bit quantizations (like 4-bit) to save RAM, which effectively lobotomizes the model's capacity for complex tool calling. Industry recommendations favor a minimum of 5-bit for dense models and 6-bit for MoEs.
  
  Hardware & Comfort Trade-offs: Running these workloads locally often turns high-end laptops or desktops into loud, hot, power-hungry machines, making the physical development environment uncomfortable.
  
  Privacy and Data Sovereignty: A heated debate emerged regarding hosted vs. local options. While some demand local setups due to data-collection practices and copyright concerns of major tech providers, others prefer private API gateways or hosted "open model clouds" (like OpenRouter or specialized European hosters like OVH) that guarantee Zero Data Retention (ZDR).
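
The quantization trade-off noted in the discussion above comes down to simple arithmetic: weight memory is roughly the parameter count times bits per weight, divided by 8. A minimal sketch, assuming the weights dominate memory and ignoring the K-V cache, activations, and quantization metadata (so real model files run somewhat larger). The function name and the 27B example size are illustrative, not taken from the post.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights alone, in GiB.

    Assumption: every weight is stored at the same bit width and nothing
    else (K-V cache, activations, metadata) is counted.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Hypothetical 27B-parameter dense model at several bit widths.
    for bits in (4, 5, 6, 8, 16):
        print(f"27B @ {bits}-bit: {weight_memory_gib(27e9, bits):.1f} GiB")
```

Moving from 4-bit to 5-bit raises the weight footprint by a quarter, which is the RAM cost users pay for the tool-calling reliability the commenters describe.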
  
  AI LLM DevOps MLOps Docker
Visit annotations in context

Tags

Docker

AI

DevOps

LLM

MLOps

Annotators

pyxelr

URL

vickiboykis.com/2026/06/15/running-local-models-is-good-now/
arstechnica.com arstechnica.com

https://arstechnica.com/ai/2026/06/spacex-will-acquire-coding-tool-cursor-to-compete-with-anthropic-openai/

2
1. fxp007 17 Jun 2026
  
  in Public
  
  xAI struck a deal to give Cursor access to its compute infrastructure, foreshadowing similar, larger deals with Anthropic and Google in the future.
  
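  Most people see SpaceX/xAI as an independent, self-reliant competitor in AI, but the author implies that they have in fact adopted a strategy of depending on other companies: testing the approach through small-scale partnerships first, then seeking deals with larger companies. This 'small first, then large' pattern contrasts with SpaceX's usual image as a disruptor and suggests a more cautious approach in AI that relies on external resources.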
  
  counterintuitive ai-strategy dependency-model
2. fxp007 17 Jun 2026
  
  in Public
  
  This is a marriage between two companies that have arguably been falling behind in the AI race.
  
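  Most people see SpaceX and Cursor as leaders in their respective fields, but the author argues that both companies are actually falling behind in the AI race. SpaceX's Grok chatbot is mired in controversy and the company lacks a competitive coding model, while Cursor, despite strong talent and products, cannot compete with the large companies on compute. This 'marriage of laggards' narrative contrasts sharply with the mainstream narrative around tech acquisitions.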
  
  non-consensus acquisition-narrative ai-competition
Visit annotations in context

Tags

ai-competition

acquisition-narrative

ai-strategy

dependency-model

non-consensus

counterintuitive

Annotators

fxp007

URL

arstechnica.com/ai/2026/06/spacex-will-acquire-coding-tool-cursor-to-compete-with-anthropic-openai/
huggingface.co huggingface.co

https://huggingface.co/blog/zai-org/glm-52-blog

5
1. fxp007 17 Jun 2026
  
  in Public
  
  By handling the specific invalid behavior instead of rejecting the entire trajectory, this approach helps prevent the training instability and model collapse that can happen when rollouts are abruptly stopped.
  
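  Most people assume that when undesirable behavior is detected during AI training, the entire training trajectory should be terminated immediately, but the authors argue for handling the specific invalid behavior rather than rejecting the whole trajectory. This challenges the one-size-fits-all approach in AI training and suggests that finer-grained behavior management can prevent training instability and model collapse, improving training efficiency.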
  
  non-consensus ai-training behavior-management
2. fxp007 17 Jun 2026
  
  in Public
  
  As a limited-time promotion through the end of September, off-peak usage is billed at 1×. (Peak hours are 14:00–18:00 UTC+8 (Beijing Time) daily).
  
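  Most people think AI model pricing should be based on model size or performance rather than time of use, but the authors treat differentiated pricing by time of day as a sound strategy. This challenges industry convention in AI service pricing and suggests that time-based differentiation can balance compute usage and improve system efficiency.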
  
  non-consensus ai-pricing resource-management
3. fxp007 17 Jun 2026
  
  in Public
  
  We find that GLM-5.2 shows more potential hacking behavior than GLM-5.1. This makes the verification signal easy to optimize, but fails to actually improve the fundamental capabilities of the model.
  
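  Most people assume that gains in model capability always come with better measured performance, but the authors find that although GLM-5.2 shows more potential hacking behavior, this does not actually improve the model's fundamental capabilities. This challenges the mainstream belief that higher scores always mean a more capable model and points to the problem of over-optimizing metrics in AI training at the expense of real capability gains.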
  
  non-consensus ai-training model-evaluation
4. fxp007 17 Jun 2026
  
  in Public
  
  On Terminal-Bench 2.1 (81.0) it lands within a few points of Claude Opus 4.8 (85.0) — while staying ahead of Gemini 3.1 Pro.
  
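  Most people believe there is a huge gap between open-source models and top closed-source models, but the authors report that GLM-5.2 comes close to Claude Opus 4.8 on Terminal-Bench and even surpasses Gemini 3.1 Pro. This challenges the industry consensus that closed-source models are far ahead and suggests that open-source models can now compete with top commercial models on specific coding tasks.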
  
  non-consensus ai-performance coding-benchmarks
5. fxp007 17 Jun 2026
  
  in Public
  
  GLM-5.2 is the highest-ranked open-source model, showing that its 1M context has translated into practical long-horizon delivery capability.
  
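  Most people assume open-source models must lag closed-source models on long-horizon tasks, but the authors argue that GLM-5.2, as an open-source model, has achieved practical long-horizon delivery capability and even surpasses closed-source models such as GPT-5.5 on some benchmarks. This challenges the mainstream belief that closed-source models are necessarily better than open-source ones and suggests that open-source models can reach commercial-grade performance on specific tasks.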
  
  non-consensus open-source-ai long-horizon-tasks
Visit annotations in context

Tags

ai-training

ai-performance

long-horizon-tasks

non-consensus

coding-benchmarks

resource-management

ai-pricing

open-source-ai

model-evaluation

behavior-management

Annotators

fxp007

URL

huggingface.co/blog/zai-org/glm-52-blog
www.anthropic.com www.anthropic.com

https://www.anthropic.com/research/claude-code-expertise

4
1. fxp007 17 Jun 2026
  
  in Public
  
  each prompt the user sends sets off a chain of around 10 actions taken by Claude on average
  
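  This data point shows that each user prompt triggers about 10 Claude actions on average, which indicates the AI agent's autonomy and efficiency. The ratio suggests that users only need to provide high-level guidance for the AI to carry out many concrete tasks. However, the article mentions the tail of the distribution (about 2% of sessions average more than 100 actions per prompt), which indicates that usage patterns vary significantly. The 10:1 action-to-prompt ratio is a key metric for understanding how efficiently AI agents work, but the article does not describe differences in the type or quality of these actions.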
  
  data-point ai-actions productivity
2. fxp007 17 Jun 2026
  
  in Public
  
  people make about 70% of the planning decisions but only 20% of the execution decisions
  
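  This 70/20 split clearly shows the division of labor in human-AI collaboration: humans decide 'what to do' and the AI decides 'how to do it'. The split suggests the AI has considerable autonomy at the execution level, which may differ from the commonly expected model in which human oversight dominates. The data point supports the article's core argument that AI agents are redefining the human-AI division of labor in programming. However, the article does not explain in detail how 'decisions' are defined and classified, which may affect the accuracy of the data.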
  
  data-point decision-making human-ai-collaboration
3. fxp007 17 Jun 2026
  
  in Public
  
  each prompt the user sends sets off a chain of around 10 actions taken by Claude on average
  
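  This data point shows that each user prompt triggers about 10 Claude actions on average, indicating the AI's autonomy and efficiency. The average masks enormous variability: the article notes that about 2% of sessions average more than 100 actions per prompt. The data point suggests Claude can carry out complex task sequences autonomously, but users need to monitor these actions to make sure the results match expectations.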
  
  data-point statistics ai-autonomy
4. fxp007 17 Jun 2026
  
  in Public
  
  people make about 70% of the planning decisions but only 20% of the execution decisions
  
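  This 70/20 ratio reveals a clear division of labor in human-AI collaboration: humans are mainly responsible for planning decisions, while the AI handles execution. The ratio suggests the AI is already quite autonomous in executing tasks but still relies on humans for strategic planning. Compared with similar studies, the data point indicates a relatively high level of human-AI collaboration, possibly reflecting Claude Code's design philosophy and its users' habits.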
  
  data-point statistics human-ai-collaboration
Visit annotations in context

Tags

productivity

statistics

ai-autonomy

human-ai-collaboration

data-point

decision-making

ai-actions

Annotators

fxp007

URL

anthropic.com/research/claude-code-expertise
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/15/the-us-governments-anthropic-models-ban-was-never-about-an-ai-jailbreak/

3
1. fxp007 15 Jun 2026
  
  in Public
  
  the message is clear: The AI industry isn't immune from U.S. government interference
  
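  Most people may think the cutting-edge nature of AI technology lets it sidestep traditional regulatory frameworks, but the author argues that the government's ban sends a clear message: even frontier AI technology cannot escape government intervention. This runs counter to the common belief in the tech industry that it can regulate itself.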
  
  non-consensus tech-regulation ai-policy
2. fxp007 15 Jun 2026
  
  in Public
  
  The AI industry isn't immune from U.S. government interference
  
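  Although many people regard the AI industry as relatively independent of traditional government regulation, the author states plainly that the AI industry is not immune to government interference. This challenges the mainstream narrative of tech industry autonomy and implies that AI companies may face government pressure similar to that faced by traditional industries.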
  
  non-consensus ai-industry government-interference
3. fxp007 15 Jun 2026
  
  in Public
  
  The US government's Anthropic models ban was never about an AI jailbreak
  
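  Most people believe the government banned Anthropic's AI models out of safety concerns, particularly the risk of AI jailbreaks, but the author argues that this was not the real reason. This is a non-consensus view that challenges the public's general understanding of government AI regulation.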
  
  non-consensus ai-regulation counterintuitive
Visit annotations in context

Tags

ai-regulation

ai-policy

ai-industry

tech-regulation

government-interference

non-consensus

counterintuitive

Annotators

fxp007

URL

techcrunch.com/2026/06/15/the-us-governments-anthropic-models-ban-was-never-about-an-ai-jailbreak/
www.theverge.com www.theverge.com

https://www.theverge.com/ai-artificial-intelligence/949986/anthropic-fable-mythos-shutdown-sovereign-ai

5
1. fxp007 15 Jun 2026
  
  in Public
  
  Restoring global trust in American AI is another thing entirely. No matter how long the shutdown lasts, it shined a light on how fragile access to US frontier AI models is.
  
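  Most people may think America's leading position in AI technology is secure, but the author argues that this episode exposed how fragile access to American AI is and may have permanently damaged global trust in American AI technology. This challenges the assumption that US dominance in AI is stable.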
  
  non-consensus ai-trust american-ai-fragility
2. fxp007 15 Jun 2026
  
  in Public
  
  Trump may see restricting Mythos and Fable as a matter of national security. But the argument cuts both ways, and with Washington now asking if AI is too important for everyone to have access, other governments are asking whether they can afford for Washington to decide who does.
  
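  Most people may think the US restricts AI access out of national security concerns, but the author argues that this behavior actually prompts other countries to question America's monopoly control over AI technology and to reassess the risks of depending on it. This challenges the legitimacy of the US unilaterally deciding who gets access to AI technology.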
  
  counterintuitive ai-access power-dynamics
3. fxp007 15 Jun 2026
  
  in Public
  
  Most governments and businesses cannot come close to matching the scale and resources of frontier labs in the US or China. But sovereign AI does not always mean building the biggest or the most powerful tools.
  
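  The mainstream view holds that AI sovereignty means competing with the US and China in every area, but the author argues that real AI sovereignty lies not in replicating American scale but in developing specific capabilities that fit a country's own strategic needs. This challenges the consensus that AI development must pursue scale and general capability.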
  
  non-consensus ai-strategy sovereignty-vs-scale
4. fxp007 15 Jun 2026
  
  in Public
  
  He likened the pullback of Anthropic's models to Iran's blockade of the Strait of Hormuz, with access to AI now a strategic chokepoint for which France must prepare.
  
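  Most people may see AI as a technology product or service, but the author argues that access to AI has become a strategic chokepoint like the Strait of Hormuz, one for which countries must prepare. Likening AI technology to a geopolitical chokepoint challenges the conventional understanding of what AI is.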
  
  counterintuitive geopolitics ai-as-strategic-asset
5. fxp007 15 Jun 2026
  
  in Public
  
  But sovereign AI does not always mean building the biggest or the most powerful tools. France's Mistral and Canada's Cohere show that solid efforts can come from outside these countries, even if the models can't stand toe to toe.
  
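  Most people believe only countries with scale and resources comparable to the US and China can develop competitive AI models, but the author argues that smaller countries can build meaningful AI sovereignty by focusing on specific domains or local needs, even if these models cannot match the most advanced American models in general capability.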
  
  non-consensus ai-sovereignty small-nations
Visit annotations in context

Tags

ai-trust

ai-as-strategic-asset

ai-sovereignty

power-dynamics

ai-strategy

non-consensus

small-nations

sovereignty-vs-scale

american-ai-fragility

geopolitics

ai-access

counterintuitive

Annotators

fxp007

URL

theverge.com/ai-artificial-intelligence/949986/anthropic-fable-mythos-shutdown-sovereign-ai
garrit.xyz garrit.xyz

Don't trust large context windows | Garrit's Notes

1
1. pyxelr 15 Jun 2026
  
  in Public
  
  Don't trust large context windows
  
  Large context windows are divided into a "smart zone" (sharp, attentive model performance) and a "dumb zone" (where attention drops off and the model begins forgetting details).
  
  The transition into the "dumb zone" typically begins around 100k tokens, regardless of advertised context limits.
  
  Coding agents quickly burn through tokens during debugging, file reading, and test runs, accelerating the transition into degraded context areas.
  
  While vendors advertise massive context limits (e.g., 200k to 2M tokens) as a marketing metric, academic studies (like RULER) and empirical reports confirm effective context is much smaller.
  
  Agent mitigation tools like "auto-compaction" (summarizing history) often trigger too late and create summarized data using a model that is already experiencing performance decay.
  
  A more reliable alternative is the "breadcrumb approach": manually opening a new session and passing a self-authored specification to keep the context focused in the smart zone.
  
  Entire agent workflows can be optimized by structuring data around small, modular artifacts (like PRDs, plans, or sub-agent handoffs) to strictly budget the live session context.
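
The budgeting idea in the summary above can be sketched as a small tracker that flags when a session approaches the roughly 100k-token threshold the post cites. This is a minimal sketch under stated assumptions: the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and `SessionBudget`, `estimate_tokens`, and the 80% headroom are hypothetical names and values, not part of any agent tool.

```python
from dataclasses import dataclass

SMART_ZONE_TOKENS = 100_000  # threshold cited in the post; a rule of thumb
CHARS_PER_TOKEN = 4          # crude assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count (at least 1)."""
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class SessionBudget:
    """Tracks estimated tokens consumed by one live agent session."""
    limit: int = SMART_ZONE_TOKENS
    used: int = 0

    def add(self, text: str) -> None:
        """Record text that entered the context (prompt, file, tool output)."""
        self.used += estimate_tokens(text)

    def should_restart(self, headroom: float = 0.8) -> bool:
        """True once usage reaches the given fraction of the limit."""
        return self.used >= self.limit * headroom
```

When `should_restart()` turns true, the breadcrumb approach would have the user write a short specification by hand and open a fresh session, instead of relying on auto-compaction.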
  
  Hacker News Discussion
  
  Erosion of Engineering Rigor: Users expressed deep concern that LLM engineering has devolved into non-deterministic "cargo culting" and "gardening advice" rather than a rigorous, scientific discipline.
  
  Determinism vs. Flexibility: Systems engineers noted the cognitive friction of using opaque, non-deterministic workflows, though some find immense value in using LLMs strictly as a translation layer from human text into structured, deterministic tool calls.
  
  Heuristics Over Theory: Many agreed that the rapid iteration cycle of cloud models prevents deep theoretical understanding, forcing developers to rely on empirical heuristics, benchmarking, and structured constraints (like confining inputs) to ensure reliability.
  
  Architectural Limitations: Commenters speculated that training long-context windows suffers from a data and compute scaling bottleneck, leading to synthetic fine-tuning that trains models to treat early conversational history as noise.
  
  LLM AI programming
Visit annotations in context

Tags

programming

AI

LLM

Annotators

pyxelr

URL

garrit.xyz/posts/2026-05-06-dont-trust-large-context-windows
shawnsmucker.substack.com shawnsmucker.substack.com

Please Use AI

1
1. JoeMurphy 15 Jun 2026
  
  in Public
  
  Please Use AI
  
  poetry AI
Visit annotations in context

Tags

poetry

AI

Annotators

JoeMurphy

URL

shawnsmucker.substack.com/p/please-use-ai
stephen.bochinski.dev stephen.bochinski.dev

AI Coding at Home Without Going Broke | Stephen Bochinski

1
1. pyxelr 14 Jun 2026
  
  in Public
  
  AI Coding at Home Without Going Broke
  
  Transitioning from standard chat interfaces to autonomous, multi-file AI coding agents can cause API token consumption and monthly costs to skyrocket if left unmanaged.
  
  Including massive, multi-file codebases in every agent prompt rapidly exhausts context windows and sharply inflates the cost per turn.
  
  To code at home without going broke, developers should shift to a modular architecture: isolating components, splitting projects into small modules, and relying heavily on mock data layers.
  
  Restricting the AI's visibility to a single file or a narrowly scoped subdirectory keeps context tokens low, prevents the agent from making sweeping changes across the codebase, and lowers billing.
  
  Leveraging free or low-cost tier tools to map out full architectural specs and test files before generating implementation code provides rigid constraints that minimize wasted AI loops.
  
  Developers can significantly curb expenses by opting for deep-context consumer subscription plans (such as $20 to $100 per month tiers) over uncapped pay-as-you-go API keys when executing heavy agent tasks.
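
The cost argument in the summary above can be made concrete with a little arithmetic: if the whole context is re-sent on every turn, input cost scales with context size times the number of turns. A minimal sketch, assuming a flat per-token input price and a context that grows by a fixed amount each turn. The function name and every number below are hypothetical, chosen only for illustration.

```python
def session_input_cost(
    base_context_tokens: int,
    turns: int,
    growth_per_turn: int,
    usd_per_million_tokens: float,
) -> float:
    """Total input cost when the full context is re-sent on every turn."""
    total_tokens = sum(
        base_context_tokens + turn * growth_per_turn for turn in range(turns)
    )
    return total_tokens * usd_per_million_tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical numbers: a whole repository versus one scoped module.
    whole_repo = session_input_cost(150_000, 40, 2_000, 3.0)
    one_module = session_input_cost(8_000, 40, 2_000, 3.0)
    print(f"whole repo: ${whole_repo:.2f}, single module: ${one_module:.2f}")
```

Restricting the agent to a narrow subdirectory shrinks `base_context_tokens`, which is the dominant term in a session of this shape.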
  
  Hacker News Discussion
  
  The Reality of the Cost "Squeeze": A debate emerged over what constitutes "going broke," with many users noting that standard $20 to $100 consumer tiers are more than sufficient for normal hobbyist workflows and are likely heavily subsidized by AI providers at break-even rates.
  
  The Culprit Behind Token Bleed: Commenters pointed out that users burning thousands of dollars in API credits are typically running automated pipelines, loading up dozens of Model Context Protocol (MCP) tools, or deploying recursive sub-agents that reload the entire codebase context on every single turn.
  
  Niche Utility for Unattended Grinding: While continuous, unattended AI coding is rarely efficient for daily tasks, an engineer shared a highly valuable edge case: letting an AI autonomously decompile, reverse-engineer, and rebuild five interrelated legacy firmware images back into recognizable C projects over several hours.
  
  The Sequential Refactoring Playbook: For managing large-scale modifications, users advocated for a strict, multi-step pipeline: first utilizing AI to ingest code and write unit tests, then breaking the files into tiny, isolated blocks, testing those blocks independently, and only then generating the actual refactored behavior.
  
  Interruption Management Advantage: A key human-centric benefit highlighted was how agentic setups alleviate cognitive load during family interruptions; a developer can step away for hours and simply tell the agent to catch them up and proceed without losing flow state.
  
  AI programming FinOps
Visit annotations in context

Tags

programming

FinOps

AI

Annotators

pyxelr

URL

stephen.bochinski.dev/blog/2026/06/13/ai-coding-at-home-without-going-broke/
opensourceaimustwin.com opensourceaimustwin.com

Opensource AI Must Win

1
1. pyxelr 14 Jun 2026
  
  in Public
  
  If intelligence becomes something people can only rent from a few closed institutions, the public does not just lose software freedom. It loses operational freedom.
  
  https://news.ycombinator.com/item?id=48511908
  
  open-source AI
Visit annotations in context

Tags

open-source

AI

Annotators

pyxelr

URL

opensourceaimustwin.com/
tombedor.dev tombedor.dev

If You are Asking for Human Attention, Demonstrate Human Effort | Tom Bedor's Blog

1
1. pyxelr 14 Jun 2026
  
  in Public
  
  If you are requesting human attention, demonstrate human effort.
  
  Hacker News Discussion
  
  The Pull Request Fatigue Loop: A widely upvoted comment highlighted how a colleague using Claude flooded the team with AI-generated PRs, then complained when they languished; reviewers subconsciously avoided them because reviewing AI code for hidden hallucinations requires an immense, asymmetric amount of human effort.
  
  The Asymmetry of Feedback: Users noted that it feels deeply dismissive when a human invests an hour of intense cognitive effort to thoughtfully review a massive PR, only to receive an instantaneous, AI-generated reply or amendment from the author.
  
  Review Scalability vs. Guardrails: Some participants argued that traditional code review cannot scale to prolific AI agents or hyper-productive humans; they suggested transitioning to automated guardrails—such as linters, auto-formatters, and robust end-to-end continuous deployment testing—to offset the review bottleneck.
  
  Code Review as a Cultural Practice: The discussion underscored that code review should function as a collaborative team process for shared understanding and mentorship rather than a cold, adversarial gatekeeper blocking a developer from merging code.
  
  Exploiting Token Budgets: One commenter observed that large, complex PRs often trigger scrolling blindness in humans and cause LLMs to run out of token budget, leading both to blindly approve the change with a generic "looks good to me."
  
  programming AI work
Visit annotations in context

Tags

programming

AI

work

Annotators

pyxelr

URL

tombedor.dev/human-attention-and-human-effort/
www.normaltech.ai www.normaltech.ai

Why AI hasn’t replaced software engineers, and won’t

1
1. pyxelr 14 Jun 2026
  
  in Public
  
  Why AI hasn’t replaced software engineers, and won’t
  
  Software engineering has a long history of aggressive automation—from assembly to high-level languages—and rather than replacing engineers, every leap in productivity has expanded the scale and complexity of what can be built.
  
  The demand for software is functionally insatiable; as soon as engineers become more efficient, the organizational goalposts move, leading to higher expectations rather than a reduction in staff.
  
  Current AI development tools act primarily as force multipliers rather than autonomous agents, meaning that an expert developer is still strictly required to drive, review, and handle the remaining high-value 10% of the work.
  
  For AI to truly replace software engineers, an autonomous AI system would need to consistently outperform an AI+human developer hybrid team, a milestone that current data and architectures are far from reaching.
  
  While generalist software engineers remain secure, specific narrow domains or commoditized skill sets (such as basic, boilerplate frontend development) face a heightened risk of being entirely absorbed by AI tools.
  
  The most significant hurdle for autonomous AI is not initial code generation, but rather the long-term maintenance, context retention, and reasoning required to safely adapt to changing ecosystems and walled gardens.
  
  Rather than destroying the engineering market, AI changes the underlying economics of production, allowing developers to rapidly clear backlogs, build minor utilities, and focus more on architecture and system design.
  
  Hacker News Discussion
  
  The Jevons Paradox of Code: Commenters emphasized that increasing the efficiency of software creation lowers its cost, which has historically increased overall demand sharply rather than exhausting the market.
  
  The Rise of Bespoke Consumer Software: A popular theory suggested that AI will enable everyday users to spin up personalized, ad-free, micro-utilities (like custom todo lists) on the fly, reducing reliance on bloated commercial applications.
  
  The Tinkering vs. Maintenance Chasm: Several users countered the "bespoke software" future by comparing it to 3D printing; while creating a custom script is easy with AI, the average user lacks the logical thinking and patience required to maintain software over time.
  
  A Cyberpunk Technological Stack: Users noted that the current trajectory feels reminiscent of science fiction, where individuals possess highly customized, personalized technology stacks modified specifically for their unique workflows.
  
  B2B Complexity and Standardization: Many participants pointed out that while consumer-facing apps might become fragmented, enterprise B2B infrastructure, distributed systems, and core data layers (like the Linux kernel or banking infrastructure) strictly require human-driven rigor, consistency, and standardization.
  
  programming AI automation career
Visit annotations in context

Tags

programming

career

automation

AI

Annotators

pyxelr

URL

normaltech.ai/p/why-ai-hasnt-replaced-software-engineers
www.youtube.com www.youtube.com

17 000 USD zysku i 90% w pół roku. Mechanika rewolucji technologicznych. Jak na tym zarabiam?

1
1. pyxelr 14 Jun 2026
  
  in Public
  
  17 000 USD zysku i 90% w pół roku. Mechanika rewolucji technologicznych. Jak na tym zarabiam?
  
  Systematic scoring model instead of emotions: The key to investment success is a rigid decision process based on hard numerical data (a scoring model), rather than feeding one's ego with market hypotheses or constant attempts to predict corrections [00:00:46], [00:01:47].
  
  Mechanics of technological revolutions (the 19th-century railway analogy): The current AI infrastructure boom resembles the 19th-century railway bubble in the US. Lines were also overbuilt then, driven by the pursuit of monopoly and by market FOMO among cities and corporations [00:02:42], [00:03:26]. Although many railway companies went bankrupt, the infrastructure they built laid the foundations for powerful economic growth [00:03:56].
  
  Investing in the "steel producers", not the "track owners": A safer and more profitable strategy in the early stage of the AI revolution is to buy shares of technology and infrastructure suppliers (semiconductors), the companies drawing capital out of big tech, rather than investing in the language models themselves, whose future profitability is in question [00:04:21], [00:09:21].
  
  Big tech's forced arms race: Giants such as Microsoft, Meta, Amazon, and Alphabet (Google) are forced into colossal spending on chips and data centers, because dropping out of the race means risking marginalization or even an existential threat [00:09:36].
  
  Productivity growth versus corporate profits (the Solow paradox): Studies (including from MIT and Stanford) confirm that deploying AI raises the efficiency of office and customer service workers by 14–40% [00:06:12], [00:06:41]. However, technological revolutions need time (historically as long as 40 years for factory electrification) to reorganize corporate structures and translate directly into companies' net margins [00:07:12], [00:07:39].
  
  Fundamental analysis of the main positions (Nvidia and Broadcom):
  
  Recent stock market corrections, combined with analysts raising long-term revenue forecasts, have left the valuation ratios (price to forecast revenue two years ahead) for both companies at attractive, relatively low levels [00:11:13], [00:12:16].
  
  Analyst consensus indicates upside potential of about 35% (Broadcom) and 50% (Nvidia) respectively over a one-year horizon, offering a very favorable reward-to-risk ratio [00:11:46], [00:12:46].
  
  Risk management and memory cyclicality (Micron, SanDisk): The HBM (High Bandwidth Memory) sector is seeing unprecedented demand that exceeds fab production capacity at least until the turn of 2027/2028 [00:14:05]. The author accepts the cyclicality risk and the possibility of selling as much as 30–40% below the peak if hard data on market saturation appears in the future [00:14:22], [00:14:34].
  
  Results and portfolio structure: The portfolio, run for half a year and based on momentum and semiconductors, generated 17,000 USD in profit (a 90% return) [00:15:54], [00:16:15]. To smooth out the heavy volatility (swings of around 8–9% per day), further contributions will go to more stable names (Nvidia, Broadcom) and to smaller infrastructure positions such as Vertiv (cooling) and Monolithic Power Systems (power management) [00:13:15], [00:15:18].
  
  investing NVIDIA AI polish YouTube
Visit annotations in context

Tags

NVIDIA

polish

AI

investing

YouTube

Annotators

pyxelr

URL

youtube.com/watch
arstechnica.com arstechnica.com

https://arstechnica.com/ai/2026/06/anthropic-shuts-down-fable-mythos-models-following-trump-admin-directive/

1
1. fxp007 14 Jun 2026
  
  in Public
  
  If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.
  
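  Most people regard government safety review as a reasonable precaution, but the author argues that this standard, if applied across the board, would effectively halt frontier model deployments for the whole industry. This implies that the government's safety standard may be too strict and may hinder AI innovation and technical progress.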
  
  non-consensus ai-regulation innovation-barrier
Visit annotations in context

Tags

ai-regulation

innovation-barrier

non-consensus

Annotators

fxp007

URL

arstechnica.com/ai/2026/06/anthropic-shuts-down-fable-mythos-models-following-trump-admin-directive/
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/13/kpmg-pulls-report-on-ai-usage-due-to-apparent-hallucinations/

3
1. fxp007 13 Jun 2026
  
  in Public
  
  apparent hallucinations
  
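  Most people probably think AI 'hallucinations' are mainly a problem in creative generation or fictional content. But the author's use of the word 'apparent' suggests that these errors may not be obvious fabrications and may instead appear in a seemingly credible form. This challenges common assumptions about the kinds of errors AI makes and indicates that AI errors can be subtle and hard to detect, even in professional settings.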
  
  non-consensus ai-errors counterintuitive
2. fxp007 13 Jun 2026
  
  in Public
  
  KPMG pulls report on AI usage due to apparent hallucinations
  
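  The mainstream view is that a large professional consultancy such as KPMG should have rigorous fact-checking processes that ensure the accuracy of its published reports. This headline, however, suggests that even top-tier professional institutions can be misled by AI 'hallucinations'. That challenges trust in the quality-control capabilities of professional institutions and indicates that AI errors may be more common and more deceptive than we imagine.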
  
  non-consensus professional-standards ai-misinformation
3. fxp007 13 Jun 2026
  
  in Public
  
  Once again, AI proves to be an unreliable source of information about AI.
  
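  Most people assume that AI should become more reliable as the technology develops, especially when analyzing data about its own field. Through the case of KPMG withdrawing its report, the author makes a counterintuitive point: even professional AI systems can make serious errors when analyzing AI-related data. This suggests that AI self-assessment is unreliable and challenges the common belief in AI's capacity for self-improvement.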
  
  non-consensus ai-reliability counterintuitive
Visit annotations in context

Tags

professional-standards

ai-reliability

ai-misinformation

ai-errors

non-consensus

counterintuitive

Annotators

fxp007

URL

techcrunch.com/2026/06/13/kpmg-pulls-report-on-ai-usage-due-to-apparent-hallucinations/
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/13/amazon-ceo-reportedly-raised-anthropic-model-concerns-before-government-crackdown/

1
1. fxp007 13 Jun 2026
  
  in Public
  
  Amazon CEO Andy Jassy may have been the source of security concerns that led Anthropic to cut off worldwide access to two models on Friday.
  
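  Most people assume that big-tech CEOs generally push for openness and broad access to technology, but this suggests that Amazon CEO Jassy may have raised security concerns about Anthropic's AI models that led to access being restricted. This challenges the conventional view that tech leaders always advocate openness and shows that even executives at tech giants may take a conservative stance.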
  
  non-consensus ceo-behavior ai-safety
Visit annotations in context

Tags

ai-safety

ceo-behavior

non-consensus

Annotators

fxp007

URL

techcrunch.com/2026/06/13/amazon-ceo-reportedly-raised-anthropic-model-concerns-before-government-crackdown/
arstechnica.com arstechnica.com

https://arstechnica.com/tech-policy/2026/06/130-billion-in-data-center-projects-blocked-by-protests-so-far-this-year/

1
1. fxp007 12 Jun 2026
  
  in Public
  
  $130 billion in data center projects blocked by protests so far this year
  
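  This data point indicates that data center projects blocked or delayed by protests in the first three months of 2026 were worth as much as $130 billion, about 83% of the $156 billion recorded for all of 2025. The figure reflects marked growth in the movement opposing data centers and could significantly affect the build-out of AI infrastructure, but the statistical methodology and the reliability of the sources need to be confirmed.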
  
  data-point statistics ai-infrastructure
Visit annotations in context

Tags

data-point

statistics

ai-infrastructure

Annotators

fxp007

URL

arstechnica.com/tech-policy/2026/06/130-billion-in-data-center-projects-blocked-by-protests-so-far-this-year/
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu

Supporting Co-Adaptive Machine Teaching through Human Concept Learning and Cognitive Theories

10
1. elglassman 12 Jun 2026
  
  in Public
  
  Alignment is a bilateral process; it refers not only to AI acting according to human intentions but also to humans better leveraging AI by understanding the mechanisms behind it [54].
  
  Any individual sentence that describes information designed to set the stage for the contribution of the paper.
  
  ai-pending context
2. elglassman 12 Jun 2026
  
  in Public
  
  Data labeling as a cognitive task—including defining a concept or determining how two similar objects may have different labels—requires both comparison and integration [62].
  
  Any individual sentence that describes information designed to set the stage for the contribution of the paper.
  
  ai-pending context
3. elglassman 12 Jun 2026
  
  in Public
  
  However, relying exclusively on existing examples is not ideal for tasks requiring nuanced understanding of user intentions, as these examples often fail to represent diverse and edge-case scenarios [31].
  
  Any individual sentence that describes information designed to set the stage for the contribution of the paper.
  
  ai-pending context
4. elglassman 12 Jun 2026
  
  in Public
  
  When training samples are scarce, model performance heavily depends on the quality of available training examples [15].
  
  Any individual sentence that describes information designed to set the stage for the contribution of the paper.
  
  ai-pending context
5. elglassman 12 Jun 2026
  
  in Public
  
  An important challenge in interactive machine learning, particularly in subjective or ambiguous domains, is fostering bi-directional alignment between humans and models.
  
  Any individual sentence that describes information designed to set the stage for the contribution of the paper.
  
  ai-pending context
6. elglassman 12 Jun 2026
  
  in Public
  
  In supervised and semi-supervised machine learning (ML) pipelines, labeled data is a vital component of training and validating models [46].
  
  An individual sentence describing the setting in which this work was done.
  
  context ai-user-approved
7. elglassman 12 Jun 2026
  
  in Public
  
  In the context of co-adaptive learning, supporting the intertwined evolution of both the user's understanding and the model's learning is crucial [16].
  
  An individual sentence describing the setting in which this work was done.
  
  ai-pending context
8. elglassman 12 Jun 2026
  
  in Public
  
  Machine teaching, a part of the human-in-the-loop approach, has been used as a process in which a human expert (the "teacher") provides guidance to a machine learning model to help it learn important and robust features for decision making [57].
  
  An individual sentence describing the setting in which this work was done.
  
  ai-pending context
9. elglassman 12 Jun 2026
  
  in Public
  
  A targeted approach in IML is machine teaching (MT) [60], an interactive framework that allows users to devise and select useful data for labeling, with the goal of teaching the model relevant features during training [7, 18].
  
  An individual sentence describing the setting in which this work was done.
  
  ai-pending context
10. elglassman 12 Jun 2026
  
  in Public
  
  Interactive ML (IML) methods, like active learning [3], continuously apply human feedback during model training to iteratively build and refine the model [35, 42, 43].
  
  An individual sentence describing the setting in which this work was done.
  
  ai-pending context
Visit annotations in context

Tags

ai-user-approved

ai-pending

context

Annotators

elglassman

URL

glassmanlab.seas.harvard.edu/papers/mocha_chi25.pdf
natcwik.substack.com natcwik.substack.com

Your AI Stack Runs on the Commons

2
1. JoeMurphy 12 Jun 2026
  
  in Public
  
  If AI systems use the commons while reducing the visibility of the commons, then the problem becomes sustainability of public knowledge itself.
  
  commons AI open oer
2. JoeMurphy 12 Jun 2026
  
  in Public
  
  A person reading an essay is one thing. A teacher using an article in class is one thing. A volunteer translating a public-interest resource is one thing. A crawler absorbing enormous amounts of human work into a commercial machine-learning system, with no meaningful conversation about permission, attribution, compensation or future use, is something else. Scale changes the nature of the act. When use becomes extraction at industrial speed, the old language starts to feel inadequate.
  
  AI open oer
Visit annotations in context

Tags

AI

commons

open

oer

Annotators

JoeMurphy

URL

natcwik.substack.com/p/your-ai-stack-runs-on-the-commons
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu

Meta-HCI: Practising Reflection in HCI Research

4
1. elglassman 12 Jun 2026
  
  in Public
  
  The goal of this meet-up is to create a space for CHI attendees to discuss and practise reflection in HCI research and design.
  
  An individual sentence that describes the purpose of this document, according to its authors.
  
  ai-pending purpose
2. elglassman 10 Jun 2026
  
  in Public
  
  Reflection is not limited to one subfield or methodology: it is a concern that cuts across the entire discipline.
  
  A single sentence discussing the concept of reflection
  
  ai-pending reflection
3. elglassman 10 Jun 2026
  
  in Public
  
  We understand reflection as a multifaceted concept [6, 24, 25] with implications and relevance at different levels for researchers [2, 3, 8, 28]:
  
  A single sentence discussing the concept of reflection
  
  ai-pending reflection
4. elglassman 10 Jun 2026
  
  in Public
  
  Reflection has been a recurring theme in HCI – from Schön's reflective practitioner [24] to Sengers et al.'s reflective design [25].
  
  A single sentence discussing the concept of reflection
  
  ai-pending reflection
Visit annotations in context

Tags

ai-pending

reflection

purpose

Annotators

elglassman

URL

glassmanlab.seas.harvard.edu/papers/meta_HCI_CHI26meetup.pdf
andonlabs.com andonlabs.com

Untitled document

3
1. fxp007 12 Jun 2026
  
  in Public
  
  Luna is good at managing the day-to-day operations, but never takes a step back and looks at the overall business performance
  
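  This passage pinpoints the current boundary of AI agent capability: strong at execution, weak at strategy. Luna can handle scheduling, restocking, and social media posting, all tasks with clear triggers and operating steps. But analyzing overall business health, identifying structural problems, and proactively adjusting strategic direction require a different kind of cognition: meta-level self-assessment and awareness of long-term goals. Luna is a good operations manager, but not a CEO.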
  
  战略缺失执行vs战略 AI能力边界
2. fxp007 12 Jun 2026
  
  in Public
  
  Each agent gets their own bank account that they do normal bank transfers with, and temporary cards for purchasing items on the internet
  
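  A key design choice: Andon Labs explicitly rejected the emerging AI-specific payment protocols and instead connected the AI to traditional payment rails, namely ordinary bank accounts and payment cards. Each agent having its own account means separate funding boundaries and auditable transaction records. Behind this is a pragmatic judgment: rather than wait for AI-native financial infrastructure to mature, use existing rails with mature regulation. The cost is more integration complexity; the benefit is compliance and traceability.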
  
  AI支付传统金融轨道资金隔离
3. fxp007 12 Jun 2026
  
  in Public
  
  Luna, an AI agent powered by Claude Opus 4.8, runs the business end-to-end
  
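  This is one of the publicly known cases that comes closest to real-world autonomous business operation by AI. Luna is not a demo: it has a real bank account, real employees, real inventory, and real profit-and-loss pressure. The value of the case is that it moves an AI agent out of the lab and into real economic friction. Bank errors, late employees, and stockouts are the real test, not benchmark scores.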
  
  AI自主运营真实世界测试智能体商业
Visit annotations in context

Tags

智能体商业

AI支付

真实世界测试

AI能力边界

执行vs战略

传统金融轨道

AI自主运营

资金隔离

战略缺失

Annotators

fxp007

URL

andonlabs.com/market
www.anthropic.com www.anthropic.com

When AI builds itself

8
1. fxp007 12 Jun 2026
  
  in Public
  
  If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe.
  
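  Anthropic makes a very candid but also very weighty statement here: a pause might be a good thing, but a unilateral pause is harmful, because its effect is to hand the lead to 'the least cautious actors'. This logic is the core dilemma of AI safety and also Anthropic's internal rationale for continuing to push forward. A critical reading: this argument structure holds in any arms race, so it cannot distinguish 'genuinely safety-driven development' from 'competition-driven development plus a safety narrative'. Anthropic itself admits that it cannot falsify this distinction, which is exactly why it lists building verification mechanisms as the next step.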
  
  AI治理暂停机制批判性
2. fxp007 12 Jun 2026
  
  in Public
  
  the agents recovered 97% over 800 cumulative hours and used roughly $18,000 in compute
  
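  A concrete comparison from AI safety research: two human researchers recovered 23% of the performance gap in about a week; the AI agents recovered 97% with 800 cumulative hours and $18,000 in compute. A compute cost of $18,000 is entirely affordable for an AI company, whereas the labor cost of 'two top researchers working for a week' is far more than that. On an equal budget, AI output already far outstrips human output. 'Humans still chose the problem and the scoring criteria': this caveat is now the only remaining way in which humans are irreplaceable, and the article itself argues that this caveat is narrowing too.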
  
  数据 AI研究成本效益
3. fxp007 12 Jun 2026
  
  in Public
  
  Claude did all of this with pretty minimal help from me over the course of 1-2 days. I think if [a junior colleague] came back to me with results like this in the same span of time, I would be mildly impressed. The future is now.
  
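  The researcher says 'mildly impressed': not shocked, just mildly impressed. This means Claude's performance has entered the frame of reference of a normal, smart colleague and left the 'AI did this!' frame of amazement. When frontier AI researchers judge AI work output by the standard they would use for a junior colleague, that is in some sense the real Turing moment: not that a test was passed, but that the baseline has quietly switched.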
  
  金句 AI评估能力基准
4. fxp007 12 Jun 2026
  
  in Public
  
  more than 80% of the code we merge into Anthropic's codebase was authored by Claude
  
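  This number needs to be read together with footnote 3: 80%+ is the share of lines merged into production that can be attributed to Claude, and it is already a conservative calculation. The footnote acknowledges that the attribution system has gaps and that the unattributed portion also includes a large amount of code not written by hand. The true share may be closer to the 90%+ publicly cited by Anthropic's leadership. Even at the conservative 80%, the significance is clear: at one of the world's top AI research organizations, the core work of human engineers has shifted from writing code to reviewing and steering code.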
  
  数据 AI生产力代码
5. fxp007 12 Jun 2026
  
  in Public
  
  If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe.
  
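  Anthropic makes a very candid but also very weighty statement here: a pause might be a good thing, but a unilateral pause is harmful, because its effect is to hand the lead to 'the least cautious actors'. This logic is the core dilemma of AI safety and also Anthropic's internal rationale for continuing to push forward. A critical reading: this argument structure holds in any arms race, so it cannot distinguish 'genuinely safety-driven development' from 'competition-driven development plus a safety narrative'. Anthropic itself admits that it cannot falsify this distinction, which is exactly why it lists building verification mechanisms as the next step.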
  
  AI治理暂停机制批判性
6. fxp007 12 Jun 2026
  
  in Public
  
  the agents recovered 97% over 800 cumulative hours and used roughly $18,000 in compute
  
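  A concrete comparison from AI safety research: two human researchers recovered 23% of the performance gap in about a week; the AI agents recovered 97% with 800 cumulative hours and $18,000 in compute. Note the implicit logic here: a compute cost of $18,000 is entirely affordable for an AI company, whereas the labor cost of 'two top researchers working for a week' is far more than that. On an equal budget, AI output already far outstrips human output. 'Humans still chose the problem and the scoring criteria': this caveat is now the only remaining way in which humans are irreplaceable, and the article itself argues that this caveat is narrowing too.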
  
  数据 AI研究成本效益
7. fxp007 12 Jun 2026
  
  in Public
  
  Claude did all of this with pretty minimal help from me over the course of 1-2 days. I think if [a junior colleague] came back to me with results like this in the same span of time, I would be mildly impressed. The future is now.
  
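  This assessment is thought-provoking. The researcher says 'mildly impressed': not shocked, just mildly impressed. This means Claude's performance has entered the frame of reference of a 'normal, smart colleague' and left the 'AI did this!' frame of amazement. When frontier AI researchers judge AI work output by the standard they would use for a junior colleague, that is in some sense the real Turing moment: not that a test was passed, but that the baseline has quietly switched.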
  
  金句 AI评估能力基准
8. fxp007 12 Jun 2026
  
  in Public
  
  more than 80% of the code we merge into Anthropic's codebase was authored by Claude
  
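  This number needs to be read together with footnote 3: 80%+ is the share of lines merged into production that can be attributed to Claude, and it is already a conservative calculation. The footnote acknowledges that the attribution system has gaps and that the unattributed portion also includes a large amount of code not written by hand. The true share may be closer to the 90%+ publicly cited by Anthropic's leadership. But even at the conservative 80%, the significance is clear: at one of the world's top AI research organizations, the core work of human engineers has shifted from 'writing code' to 'reviewing and steering code'.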
  
  数据 AI生产力代码
Visit annotations in context

Tags

金句

AI治理

批判性

数据

代码

AI研究

暂停机制

AI生产力

成本效益

AI评估

能力基准

Annotators

fxp007

URL

anthropic.com/institute/recursive-self-improvement
sakana.ai sakana.ai

Untitled document

3
1. fxp007 12 Jun 2026
  
  in Public
  
  all programs run on an artificial machine with an artificial language, so nothing generated can execute outside the sandbox
  
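  Sandbox safety is the precondition for this research being publishable. What deserves caution, however, is that the 'attack strategy principles' learned in the sandbox are transferable: even though Redcode cannot execute on a real machine, the evolved strategies (targeted bombing, self-replication, multi-threaded scanning) are isomorphic to the tactics of real malware. DRQ evolves 'strategy patterns', not specific code. The boundary for red-team use needs to be drawn more carefully than 'the code is not executable'.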
  
  AI安全沙盒红队测试
2. fxp007 12 Jun 2026
  
  in Public
  
  convergence does not occur at the level of source code, indicating that what converges is function rather than implementation
  
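  Phenotype (behavior) converges while genotype (code) does not: an extremely subtle distinction. Different code implements the same function, just as spiders and snakes each evolved venom independently but with entirely different molecular mechanisms. The analogy for large-model research: models with different architectures and different training data may converge at the level of capability while remaining diverse at the 'implementation level'. When evaluating AI capability, looking only at code or weights is not enough; behavior must be examined.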
  
  表现型基因型 AI评估
3. fxp007 12 Jun 2026
  
  in Public
  
  we observe emergent behaviors that mirror biological evolution, where agents must constantly adapt simply to survive against ever-changing threats
  
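  'Must constantly adapt simply to survive': the key point of this sentence is that the baseline is moving. Traditional AI evaluation measures capability with static test sets, whereas DRQ reveals another form of intelligence: in an environment with no fixed goal, adaptation itself is the goal. This has direct predictive value for understanding future multi-agent systems (markets of competing AI agents, games among multiple models).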
  
  AI进化多智能体 Red-Queen
Visit annotations in context

Tags

多智能体

AI安全

表现型

红队测试

沙盒

AI评估

Red-Queen

AI进化

基因型

Annotators

fxp007

URL

sakana.ai/drq/
www.wired.com www.wired.com

Untitled document

1
1. fxp007 11 Jun 2026
  
  in Public
  
  Meet the OpenAI Engineer Leading ChatGPT’s Biggest Transformation Yet
  
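  The headline implies that an individual is leading a major transformation, as opposed to a collaborative team, which is counterintuitive.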
  
  Leadership AI Transformation
Visit annotations in context

Tags

Leadership

AI Transformation

Annotators

fxp007

URL

wired.com/story/model-behavior-interview-with-openai-codex-lead-tibo-sottiaux/
www.latent.space www.latent.space

Untitled document

1
1. fxp007 11 Jun 2026
  
  in Public
  
  The most cited benchmark score of the year is a map of
  
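  Points out that the authority of current AI evaluation benchmarks is depreciating rapidly, upending people's reliance on standardized evaluation.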
  
  Benchmarking AI Metrics
Visit annotations in context

Tags

Benchmarking

AI Metrics

Annotators

fxp007

URL

latent.space/p/ainews-open-models-model-labs-vs
techwontsave.us techwontsave.us

Silicon Valley Is Turning Nurses Into Gig Workers w/ Katie J. Wells - Tech Won’t Save Us

1
1. parmoset 11 Jun 2026
  
  in Public
  
  Katie J. Wells quote from near the end of the interview:
  
  ...when you have very very low expectations for public government, Silicon Valley looks like an OK alternative... the technology in your pocket somehow looks more useful.
  
  This says so much about citizenship and the relationship between democracy, autocracy, and technology.
  
  quotes AI nani regulatory capture gig work
Visit annotations in context

Tags

nani

AI

quotes

gig work

regulatory capture

Annotators

parmoset

URL

techwontsave.us/episode/332_silicon_valley_is_turning_nurses_into_gig_workers_w_katie_j_wells
www.wired.com www.wired.com

https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/

3
1. fxp007 11 Jun 2026
  
  in Public
  
  Shouldn't AI be smart enough to know better itself? Sounds like marketing hype.
  
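  Most people might assume AI should be intelligent enough to avoid being used for harmful purposes, but the commenter questions this assumption and implies that AI's capacity for self-restraint has been overstated by marketing. This reflects the gap between public expectations of AI capability and actual technical capability, as well as skepticism toward the AI industry's marketing strategies.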
  
  non-consensus ai-capabilities marketing-hype
2. fxp007 11 Jun 2026
  
  in Public
  
  A less cynical take - Anthropic's policy for Claude Fable had unintended consequences. They tried a less invasive method of differentiating by reading intent of the user in the prompt - an unfortunate tradeoff that spoils AI research.
  
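  Most people might assume Anthropic's policy deliberately set up obstacles to block competition, but the commenter argues it may have been a well-intentioned but poorly executed attempt to distinguish between uses by reading user intent, which ended up unintentionally hindering AI research. This points to the complex balance between corporate safety measures and research freedom.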
  
  non-consensus ai-safety unintended-consequences
3. fxp007 11 Jun 2026
  
  in Public
  
  Anthropic is backtracking on a policy that would have covertly limited competitors from using its new AI model, Claude Fable 5, to develop other AI models.
  
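  Most people think AI companies should encourage open innovation and competition, but Anthropic's original policy covertly restricted competitors from using its technology to develop other AI models. This runs counter to the open-source spirit and the AI industry's collaborative ethos and shows a conflict between corporate interests and the industry's common interest.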
  
  non-consensus ai-ethics business-strategy
Visit annotations in context

Tags

ai-safety

ai-capabilities

unintended-consequences

ai-ethics

business-strategy

marketing-hype

non-consensus

Annotators

fxp007

URL

wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/
www.technologyreview.com www.technologyreview.com

https://www.technologyreview.com/2026/06/11/1138794/google-deepmind-is-worried-about-what-happens-when-millions-of-agents-start-to-interact/

3
1. fxp007 11 Jun 2026
  
  in Public
  
  An agent breaks all of those assumptions. It reasons, it improvises, and it can be hijacked by a single sentence buried in a document it was asked to read.
  
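  Most people think AI security can be built on traditional cybersecurity frameworks, but the author points out that AI agents fundamentally break these security assumptions. This view challenges traditional thinking in cybersecurity and indicates that an entirely new security paradigm is needed to deal with AI agents' reasoning ability, improvisation, and vulnerability to simple instructions.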
  
  non-consensus security-paradigm ai-vulnerabilities
2. fxp007 11 Jun 2026
  
  in Public
  
  Shah thinks we have a few more months to go before agents are deployed throughout the economy in numbers that make potential risks a real concern.
  
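  Most people think widespread deployment of AI agents is still years away, but Shah, as quoted in the article, suggests the window is only a few months. This sharply shortened timeframe challenges the industry's general expectations about the pace of AI adoption and implies that technological change may be far faster than people imagine and that the urgency is greatly underestimated.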
  
  non-consensus timeline-prediction ai-adoption
3. fxp007 11 Jun 2026
  
  in Public
  
  The main issue is that there just isn't really a field of research for multi-agent safety yet. And we would like there to be.
  
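  Most people assume AI safety research already covers multi-agent systems, but the speaker regards multi-agent safety as an entirely new research area, which indicates a clear gap in current AI safety research. This challenges perceptions of the state of AI safety research and suggests that existing research frameworks may be inadequate for the coming challenges of multi-agent interaction.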
  
  non-consensus ai-safety research-gap
Visit annotations in context

Tags

ai-safety

security-paradigm

ai-vulnerabilities

timeline-prediction

ai-adoption

research-gap

non-consensus

Annotators

fxp007

URL

technologyreview.com/2026/06/11/1138794/google-deepmind-is-worried-about-what-happens-when-millions-of-agents-start-to-interact/
www.wired.com www.wired.com

https://www.wired.com/story/openai-confidentially-files-for-ipo/

2
1. fxp007 10 Jun 2026
  
  in Public
  
  OpenAI and Anthropic May Be Rivals, but Investors Aren't Picking Sides
  
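  The article notes that OpenAI and Anthropic may be rivals, but investors are not picking sides. This is background worth understanding in more depth and may reflect strategic diversification in AI investing. It needs to be verified whether investors really do invest in both companies at once, along with the market logic and potential risks behind such a strategy.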
  
  investment-strategy market-analysis ai-ecosystem
2. fxp007 10 Jun 2026
  
  in Public
  
  The ChatGPT-maker announced it has filed paperwork to go public, just a week after rival Anthropic took the same step.
  
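  The article describes OpenAI as the 'ChatGPT-maker', a simplified brand positioning. This may suggest an excessive focus on ChatGPT within OpenAI's AI product portfolio at the expense of its other important products and research directions. The article also calls Anthropic a 'rival' without providing specific details of the competition between the two companies or any analysis of its market impact.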
  
  branding market-competition ai-industry
Visit annotations in context

Tags

market-competition

investment-strategy

ai-industry

market-analysis

ai-ecosystem

branding

Annotators

fxp007

URL

wired.com/story/openai-confidentially-files-for-ipo/
www.theverge.com www.theverge.com

https://www.theverge.com/news/946725/anthropic-releases-claude-fable-5-mythos

2
1. fxp007 10 Jun 2026
  
  in Public
  
  Anthropic singled out cybersecurity and biology as two domains where the safeguards may block responses, both areas widely considered sensitive topics for advanced AI systems.
  
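  The article hints at AI risks in specific domains but does not explain in detail why these domains are considered sensitive. It would be worth understanding how Anthropic's safeguards actually work, whether these restrictions are comprehensive enough, and whether there are other potential risk areas.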
  
  ai-safety risk-assessment limitations
2. fxp007 10 Jun 2026
  
  in Public
  
  Fable 5 marks the first broad release from Anthropic's Mythos class of AI models, after the company said the family was so capable at cybersecurity tasks that it was too dangerous to release publicly.
  
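  This is an important statement that touches on the balance between AI safety and commercialization. It needs to be checked whether Anthropic did previously say that the Mythos models could not be released publicly because of their strong cybersecurity capabilities, and what the specific basis and process for this safety risk assessment were.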
  
  ai-safety fact-check commercialization
Visit annotations in context

Tags

ai-safety

limitations

fact-check

commercialization

risk-assessment

Annotators

fxp007

URL

theverge.com/news/946725/anthropic-releases-claude-fable-5-mythos
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu

Science and Technology for Augmenting Reading (STAR)

4
1. elglassman 10 Jun 2026
  
  in Public
  
  As the community around augmented reading broadens and as possibilities continue to unfold, it is the purpose of this workshop to set up our community to drive innovation in a productive, desirable, and responsible way.
  
  Sentence that describes the setting in which the paper's contribution is relevant or intended.
  
  ai-pending context
2. elglassman 10 Jun 2026
  
  in Public
  
  The landscape of technology for consuming information is changing rapidly. One mode of information consumption, reading, stands to see profound changes due to its ubiquity and frequency as a cognitive task.
  
  Sentence that describes the setting in which the paper's contribution is relevant or intended.
  
  ai-pending context
3. elglassman 10 Jun 2026
  
  in Public
  
  Recent changes in the technological landscape are significantly changing the reading experience. AI has introduced many new possibilities for interfaces to augment or transform text to be more rapidly scanned, navigated, understood, and compared to other texts.
  
  Sentence that describes the setting in which the paper's contribution is relevant or intended.
  
  ai-pending context
4. elglassman 10 Jun 2026
  
  in Public
  
  Reading is one of the most ubiquitous modes for consuming information.
  
  Sentence that describes the setting in which the paper's contribution is relevant or intended.
  
  ai-pending context
Visit annotations in context

Tags

ai-pending

context

Annotators

elglassman

URL

glassmanlab.seas.harvard.edu/papers/star_workshop_chi26.pdf
blog.johanneslink.net blog.johanneslink.net

The Jqwik Anti-AI Affair

1
1. tonz 10 Jun 2026
  
  in Public
  
  On adding a log line aimed at agentic ai use that caused a riot. I think what Johannes did just exposes the lunacy of assumptions being made, and that those getting pissed off are aware of it and how it reflects on them
  
  ai slop
Visit annotations in context

Tags

ai

slop

Annotators

tonz

URL

blog.johanneslink.net/2026/06/09/the-jqwik-anti-ai-affair/
github.com github.com

Reword LLM policy to make it clear it's not allowed · flathub-infra/documentation@992f57b

1
1. TylerRick 10 Jun 2026
  
  in Public
  
  contributing to open-source software anti-AI anti-generative-AI
Visit annotations in context

Tags

contributing to open-source software

anti-AI

anti-generative-AI

Annotators

TylerRick

URL

github.com/flathub-infra/documentation/commit/992f57b30de98ddbd5e80959e9672998c83c8c97
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/10/the-three-hard-tech-moonshots-fueling-spacexs-unbelievable-ipo/

2
1. fxp007 10 Jun 2026
  
  in Public
  
  Google will pay SpaceX $920M per month for compute
  
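  Google will pay SpaceX $920 million per month for computing resources, an extremely large sum that annualizes to about $11 billion. The deal shows that large tech companies are willing to pay high prices for computing capacity and also reflects SpaceX's strategic positioning in the AI infrastructure market. However, it remains to be seen whether a monthly contract of this size is sustainable and whether it represents genuine market validation. The figure also highlights how expensive AI compute is and how intense the competition is.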
  
  data-point revenue-stream ai-infrastructure
2. fxp007 10 Jun 2026
  
  in Public
  
  SpaceX assessed the total market for that business as $22.7 trillion, compared to $2.4 trillion for AI infrastructure and just under $2 trillion for the company's space efforts.
  
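  SpaceX's assessment of the market for its enterprise AI business is as high as $22.7 trillion, far more than the AI infrastructure market ($2.4 trillion) and the company's space business (nearly $2 trillion) combined. The figure is unusually large, equivalent to more than a quarter of global GDP, and lacks adequate supporting market research. Such an optimistic market assessment may be intended to support the company's high valuation, but whether it can actually be realized is doubtful.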
  
  data-point market-assessment ai-business
Visit annotations in context

Tags

data-point

revenue-stream

ai-business

market-assessment

ai-infrastructure

Annotators

fxp007

URL

techcrunch.com/2026/06/10/the-three-hard-tech-moonshots-fueling-spacexs-unbelievable-ipo/
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/10/decarts-new-world-model-can-simulate-hours-of-photorealistic-driving-with-some-caveats/

2
1. fxp007 10 Jun 2026
  
  in Public
  
  Leitersdorf thinks the consistency issue might be partially solved in the model's next version, which will allow users to start generating worlds based on a video of an environment rather than an image.
  
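  Most people assume AI world models should generate complex scenes from text or simple images, but the author hints that the future direction is generating environments from video input. This challenges the current mainstream paradigm of AI generation and suggests that video may be better suited than static images as the base input for world models, contrary to the industry consensus that text is the primary input.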
  
  non-consensus ai-input-paradigm future-directions
2. fxp007 10 Jun 2026
  
  in Public
  
  But by letting you generate a world for so long, the model also degrades significantly.
  
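  Most people see the ability to generate for long durations as a sign of progress in AI world models, but the author points out that this ability comes with a rapid decline in model consistency. This challenges conventional understanding of the relationship between simulation quality and duration and suggests that current world models have a fundamental limitation in staying consistent over long periods.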
  
  counterintuitive ai-limitations world-models
Visit annotations in context

Tags

world-models

future-directions

ai-limitations

ai-input-paradigm

counterintuitive

non-consensus

Annotators

fxp007

URL

techcrunch.com/2026/06/10/decarts-new-world-model-can-simulate-hours-of-photorealistic-driving-with-some-caveats/
www.tomtunguz.com www.tomtunguz.com

https://www.tomtunguz.com/inflation-deflation-ai/

4
1. fxp007 09 Jun 2026
  
  in Public
  
  Composer 2.5 is exceptionally intelligent & up to 10x more efficient than similarly capable models
  
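  Most people think developing a custom AI model requires extensive resources and expertise, but the Cursor case suggests that fine-tuning on top of an open-source model can yield up to 10x the efficiency of similarly capable models. This counterintuitive finding challenges the conventional view that AI development is resource-intensive.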
  
  counterintuitive ai-development efficiency
2. fxp007 09 Jun 2026
  
  in Public
  
  Open-source models have crossed the good enough threshold for most use cases
  
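  The mainstream view is that closed-source models always outperform open-source models, but the author argues that open-source models have reached a 'good enough' level. This challenges the value proposition of commercial AI models and suggests that open source could become the mainstream choice for enterprise applications.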
  
  non-consensus open-source ai-performance
3. fxp007 09 Jun 2026
  
  in Public
  
  Frontier model prices keep rising for the smartest models
  
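  Contrary to the common expectation that AI costs will keep falling, the author points out that prices for the most advanced models are actually rising. This upends the conventional belief that the cost of AI technology must fall and suggests that the AI market may be splitting into high-end and low-end tiers.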
  
  counterintuitive ai-economics pricing-strategy
4. fxp007 09 Jun 2026
  
  in Public
  
  Foundation labs are moving up the stack into applications
  
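  Most people think foundation model providers and application-layer companies should be separate ecosystems, but the author argues that foundation labs are expanding up the stack into the application layer. This challenges the AI industry's traditional division of labor and could lead to more direct competition and consolidation.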
  
  non-consensus ai-ecosystem business-model
Visit annotations in context

Tags

ai-development

efficiency

pricing-strategy

business-model

non-consensus

ai-economics

ai-ecosystem

open-source

ai-performance

counterintuitive

Annotators

fxp007

URL

tomtunguz.com/inflation-deflation-ai/
techcrunch.com techcrunch.com

https://techcrunch.com/2026/06/09/can-tech-companies-learn-to-love-cheaper-models/

2
1. fxp007 09 Jun 2026
  
  in Public
  
  All of this might seem obvious — of course you shouldn't use more compute than necessary — but it runs counter to the scaling-first approach that has dominated the industry until now.
  
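  Most people take the way tech companies have always operated for granted, but the author points out that the common-sense idea that 'you shouldn't use more compute than necessary' actually runs counter to the 'scaling-first' approach that has long dominated the industry. This challenges a core assumption of AI industry development and suggests the whole industry may need to rethink its path.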
  
  non-consensus ai-scaling industry-paradigm
2. fxp007 09 Jun 2026
  
  in Public
  
  Quality comes first, and in legal it always will... However, the definition of quality is evolving from simply using the most powerful model for everything, to using the best model that gets the right answer most efficiently.
  
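  Most people think that in professional fields such as law, the most powerful and most advanced AI models must be used to guarantee quality. The author, however, cites the view of Harvey's founder that the definition of quality is shifting from using the most powerful model to using the model that gets the right answer most efficiently. This challenges the industry's conventional equation of quality with scale.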
  
  non-consensus ai-quality professional-services
Visit annotations in context

Tags

industry-paradigm

professional-services

ai-quality

non-consensus

ai-scaling

Annotators

fxp007

URL

techcrunch.com/2026/06/09/can-tech-companies-learn-to-love-cheaper-models/
www.anthropic.com www.anthropic.com

https://www.anthropic.com/news/claude-fable-5-mythos-5

4
1. fxp007 09 Jun 2026
  
  in Public
  
  The longer and more complex the task, the larger Fable 5's lead over our other models. During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.
  
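  Most people assume AI models do better on simple tasks than on complex ones, but the author claims that Fable 5's lead over other models grows on longer, more complex tasks and that it can compress months of work into days. This challenges the common expectation that AI capability declines as task complexity increases and suggests that advanced AI may show disproportionate capability gains on complex tasks.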
  
  non-consensus ai-capabilities complex-tasks
2. fxp007 09 Jun 2026
  
  in Public
  
  Mythos 5 conducted novel genomics research in over a week of largely autonomous work. It assembled single-cell data for millions of cells spanning 138 animal species and designed and trained a custom machine learning model to identify cells performing the same role in even distantly related organisms.
  
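  Most people think AI still needs continuous guidance and supervision from human experts to complete complex research tasks, but the author claims Mythos 5 completed complex genomics research largely autonomously in a little over a week, including data assembly, analysis, and model design. This challenges the conventional view of AI as an assistant in scientific research and suggests that AI may already be capable of conducting frontier scientific research independently.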
  
  non-consensus ai-research autonomous-science
3. fxp007 09 Jun 2026
  
  in Public
  
  Claude Fable 5 is the first to break 90% on our core analytics benchmark of complex, long-running analytical tasks — a 10-point jump over Opus. On the hardest questions, it shows strong judgment and attention to nuance.
  
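  Most people expect AI performance gains on complex reasoning tasks to be incremental, but the author claims Fable 5 achieved a qualitative leap by breaking through the key 90% threshold. This challenges linear expectations of AI progress and suggests a nonlinear pattern in which crossing a capability threshold brings a significant performance gain.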
  
  non-consensus ai-performance breakthrough
4. fxp007 09 Jun 2026
  
  in Public
  
  In this task, various AI models were evaluated on their ability to predict how a genetic modification would impact the assembly of the virus's outer shell (among a set of therapeutically-relevant unpublished candidates developed by Dyno Therapeutics). We did not explicitly train our models to perform this task—and yet Mythos-class models outperformed sophisticated models dedicated to protein tasks (known as 'protein language models') using their biological reasoning alone.
  
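  Most people think AI models need specialized training to complete professional tasks in a specific domain, but the author claims that even without task-specific training, Mythos-class models can outperform specialized models in the biomedical domain. This challenges common beliefs about specialized training and suggests that general-purpose AI may outperform specialized models in some domains because it can reason and recognize patterns more broadly.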
  
  non-consensus ai-capabilities biomedical-research
Visit annotations in context

Tags

ai-capabilities

autonomous-science

non-consensus

ai-research

breakthrough

biomedical-research

complex-tasks

ai-performance

Annotators

fxp007

URL

anthropic.com/news/claude-fable-5-mythos-5
ziglang.org

Code of Conduct ⚡ Zig Programming Language

1
1. TylerRick 09 Jun 2026
  
  in Public
  
  Strict No LLM / No AI Policy: No LLM-generated content, whether it be code or prose. No paraphrasing LLM-generated content. No LLMs for editing, including fixing spelling or grammatical errors. No LLMs for translation. English is encouraged, but not required. You are welcome to post in your native language and rely on others to have their own translation tools of choice to interpret your words. No LLMs for brainstorming and then sharing the results of that brainstorming, even if you create the prose. If you use a chatbot to give you advice on a comment on the issue tracker, that comment is unwelcome. No LLMs for finding bugs.
  
  Seems kind of extreme. But https://www.youtube.com/watch?v=pkndFYSTr0Y gives some more context (an interview) that kind of explains their stance (limited maintainer time/attention; education).
  
  anti-agent anti-AI contributing to open-source software

Tags

contributing to open-source software

anti-agent

anti-AI

Annotators

TylerRick

URL

ziglang.org/code-of-conduct/
www.latent.space

https://www.latent.space/p/ainews-frontiercode-benchmarking

2
1. fxp007 09 Jun 2026
  
  in Public
  
  Even with extended thinking time (10,000 tokens), Python access, and the ability to run experiments, success rates remained below 2%—compared to over 90% on traditional benchmarks.
  
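  Most people assume advanced AI models already solve programming problems well, because traditional benchmarks show high success rates. But through FrontierCode the author reveals a surprising truth: even with more resources and thinking time, models still succeed very rarely on truly hard programming tasks, which shows that programming is far from 'solved'.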
  
  counterintuitive ai-performance benchmarking
2. fxp007 09 Jun 2026
  
  in Public
  
  The headline result is that the best model, Opus 4.8, scores only about 13% on the hardest subset—far below the 50%+ regime common on SWE-Bench-style evals
  
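  Most people assume AI coding ability is close to or beyond human level, but the author points out that even the most advanced models score far lower on code-quality evaluation than on traditional benchmarks, implying that programming is far from solved. This finding challenges the common belief that AI coding ability has matured.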
  
  counterintuitive ai-capabilities coding-performance

Tags

ai-capabilities

coding-performance

benchmarking

ai-performance

counterintuitive

Annotators

fxp007

URL

latent.space/p/ainews-frontiercode-benchmarking
sverhulst.medium.com

From FAIR to FAIR-R and FAIR²: Making Data AI-Ready

1
1. tonz 09 Jun 2026
  
  in Public
  
  [[Stefaan Verhulst p]] about AI readiness for data
  
  ai-readiness data fairr

Tags

fairr

ai-readiness

data

Annotators

tonz

URL

sverhulst.medium.com/from-fair-to-fair-r-and-fair²-making-data-ai-ready-5b25ff05324b
www.anthropic.com

https://www.anthropic.com/research/agents-in-biology

3
1. fxp007 08 Jun 2026
  
  in Public
  
  A model that can fight its way through a confusing bioinformatics workflow may still be too expensive, too slow, too hard to audit, or too difficult to trust for routine scientific work.
  
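  Most people assume that as AI becomes more capable, it will handle complex bioinformatics workflows on its own, but the author argues that even an AI able to handle such work may be unsuitable for routine scientific work because of cost, speed, auditability, and trust. This view challenges technological determinism and stresses the importance of infrastructure design.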
  
  non-consensus ai-limitations infrastructure
2. fxp007 08 Jun 2026
  
  in Public
  
  agents often lack a dependable way to access the databases containing the information they need.
  
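  Most people assume AI's main challenge is understanding and reasoning about complex information, but the author argues that the core problem for AI in biology is the lack of reliable access to the databases it needs. This overturns the usual picture of where AI's bottleneck lies: the problem is not the AI's comprehension but the reliability of data access.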
  
  counterintuitive data-access ai-bottleneck
3. fxp007 08 Jun 2026
  
  in Public
  
  The bottleneck for biological agents is not only reasoning but the absence of widespread deterministic execution layers for querying biological data.
  
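  Most people assume the bottleneck for AI in biological data processing is mainly weak reasoning, but the author argues that the real bottleneck is the lack of deterministic execution layers for querying data. This challenges the mainstream view of AI's limitations: the problem is not that AI isn't smart enough, but that the data infrastructure is poorly designed for it.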
  
  non-consensus infrastructure ai-limitations

Tags

infrastructure

ai-limitations

non-consensus

data-access

ai-bottleneck

counterintuitive

Annotators

fxp007

URL

anthropic.com/research/agents-in-biology
techcrunch.com

https://techcrunch.com/2026/06/08/wwdc-2026-everything-announced-on-siri-ai-os-27-apple-intelligence-and-more/

2
1. fxp007 08 Jun 2026
  
  in Public
  
  Before rolling out the enhancements and features, Apple was adamant about its privacy-centric approach to AI. 'We believe privacy in AI is non-negotiable,' Apple Senior Vice President Craig Federighi said during the stream
  
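  Most people assume that in the AI race Apple, like other tech giants, will sacrifice some privacy protection to improve its AI features. Yet Apple stresses that privacy is at the core of its AI strategy, which runs against the industry consensus that AI needs large amounts of user data to develop effectively. It shows Apple holding to its privacy-first values in AI, even if that may limit how advanced its AI features are.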
  
  non-consensus apple-privacy-ai data-strategy
2. fxp007 08 Jun 2026
  
  in Public
  
  Apple said it collaborated with Google and the Gemini family of models to develop the next generation of Apple Foundation Models that power its integrated Apple Intelligence experiences.
  
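  Most people assume Apple will insist on developing its AI in-house and avoid working with competitors, yet Apple chose to collaborate with Google on its AI experiences, which challenges the usual picture of competition among tech giants. By building a competitor's technology into its core products, Apple shows that in AI it is willing to set rivalry aside and pursue pragmatic cooperation.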
  
  non-consensus apple-google-collaboration ai-partnership

Tags

apple-privacy-ai

apple-google-collaboration

ai-partnership

data-strategy

non-consensus

Annotators

fxp007

URL

techcrunch.com/2026/06/08/wwdc-2026-everything-announced-on-siri-ai-os-27-apple-intelligence-and-more/
blog.janestreet.com

I design with Claude more than Figma now

1
1. pyxelr 08 Jun 2026
  
  in Public
  
  I design with Claude more than Figma now
  
  The author, a designer at Jane Street, now primarily uses Claude Code rather than Figma to design and prototype new features.
  
  Instead of creating traditional spec documents, Figma mockups, and proposals, the new workflow involves writing a problem description, opening an editor, and using Claude to build an interactive prototype inside the actual codebase.
  
  Building high-fidelity prototypes directly in the medium (e.g., using OCaml and Bonsai at Jane Street) eliminates intermediary artifacts and allows the author to quickly iterate on minute details like keyboard shortcuts, copy, and button refinement.
  
  This approach makes evaluating concepts much easier for stakeholders, as they can interact with a live tool rather than static frames, which is particularly valuable when testing the feasibility of complex features like internal LLM integration.
  
  A key shift in their model happened over the course of a few months as improved models, growing prompting familiarity, and proper scoping allowed for handling large-scale diffs (exceeding 2,000 lines).
  
  A major workflow challenge is how engineering teammates handle code reviews for fully baked features; the current solution treats the prototypes like "code mockups" that engineers can iterate on or reference to write the official production code.
  
  The author expresses concern that relying on Claude might stifle fluid, out-of-the-box creativity, locking them into an incremental, iterative mindset constrained by what they expect the LLM can easily generate.
  
  Hacker News Discussion
  
  The Shift from Static Design to Working Prototypes: Many users echoed the author's sentiment, noting that the traditional reliance on Figma for initial product concepts is declining. Teams increasingly prefer building quick, functional wireframes in dev environments that stakeholders can actually interact with.
  
  Organizational Friction and "Vibe Coding" Pressure: A prominent topic of discussion was the tension this workflow introduces with management and business teams. When non-technical stakeholders or designers build a working prototype quickly using AI ("vibe coding"), leadership often pressures engineers to push it directly to production without understanding the need for refactoring, architecture, and handling edge cases.
  
  Loss of Deep Design Thinking: Some commenters argued that outsourcing early-stage creation to an LLM removes a crucial phase of critical thinking. Because the AI automatically paints over gaps or details in a prompt, team members stop asking foundational questions ("how should we communicate this idea?" or "what happens when..."), leaving critical logic gaps to be fixed much later.
  
  Homogenized and "Safe" Aesthetics: Users iterating with text-to-UI tools noted that the default visual output tends to adhere strongly to contemporary web tropes, resulting in boilerplate or generic Tailwind/Bootstrap-style layouts unless heavily prompted with highly specific design rules or unconventional examples.
  
  The Long Tail of Accountability: Engineers emphasized that while AI dramatically speeds up the initial prototyping loop, it does not replace the necessity for engineering discipline. The long-term ownership of operational risk, system maintenance, edge-case mitigation, and on-call accountability still relies entirely on human experts.
  
  AI UX Figma Claude ClaudeCode

Tags

ClaudeCode

AI

Claude

Figma

UX

Annotators

pyxelr

URL

blog.janestreet.com/i-design-with-claude-code-more-than-figma-now-index/
arstechnica.com

https://arstechnica.com/ai/2026/06/chat-is-dead-openai-preps-overhaul-of-chatgpt/

3
1. fxp007 08 Jun 2026
  
  in Public
  
  Executives believe users will increasingly interact with a single AI assistant rather than a collection of separate applications.
  
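  Most people assume many specialized AI applications will coexist in the future, but the author argues that OpenAI is moving toward a single AI assistant, which challenges the 'app ecosystem' idea the tech industry currently favors. This view runs counter to mainstream product-development trends.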
  
  user-experience non-consensus ai-future
2. fxp007 08 Jun 2026
  
  in Public
  
  "When we have [artificial general intelligence], I don't think there will be a large number of distinct brands," said Alex Embiricos, OpenAI's head of enterprise product.
  
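  Most people assume AI's development will produce more specialized brands, but the author argues that the AGI era will return to a single-entity model, contrary to the tech industry's current trend toward fragmentation and specialization. This prediction challenges mainstream expectations for the future AI product ecosystem.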
  
  future-ai non-consensus counterintuitive
3. fxp007 08 Jun 2026
  
  in Public
  
  The changes underline how OpenAI's strategy is moving closer to that of Anthropic, whose focus on developing products for businesses has stoked its blistering growth.
  
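  Most people assume OpenAI and Anthropic, as competitors in AI, will follow very different paths, but the author argues that the two companies' strategies are converging, with both turning to the enterprise market to become profitable. This challenges the common belief that AI startups compete by differentiating.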
  
  business-strategy counterintuitive ai-competition

Tags

ai-future

ai-competition

business-strategy

future-ai

counterintuitive

non-consensus

user-experience

Annotators

fxp007

URL

arstechnica.com/ai/2026/06/chat-is-dead-openai-preps-overhaul-of-chatgpt/
techcrunch.com

https://techcrunch.com/2026/06/07/is-this-the-dawn-of-the-tokenpocalypse/

4
1. fxp007 07 Jun 2026
  
  in Public
  
  How do you even write these risks in, because they are evolving before our eyes, and day by day?
  
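  Most people assume companies can predict and quantify business risks, especially when preparing IPO filings, but the author argues that risks in the AI industry change so fast that they cannot be described accurately in a static document. This challenges conventional risk assessment and disclosure practice and points to how unusual and unpredictable the AI industry is.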
  
  non-consensus ai-risks ipo-disclosure
2. fxp007 07 Jun 2026
  
  in Public
  
  Is there any way that these labs can squeeze pennies like Uber has squeezed the drivers over the years? Is there something squishy enough there for them to do that?
  
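  Most people assume AI companies can become profitable through efficiency gains and economies of scale, but the author questions whether AI companies can find anything to squeeze for cost savings the way Uber squeezed its drivers. This challenges the consensus that the AI industry will replicate Uber's path to success and points to the rigidity of AI's cost structure.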
  
  counterintuitive ai-profitability business-comparison
3. fxp007 07 Jun 2026
  
  in Public
  
  This whole ecosystem is heavily, heavily subsidized by investor money. And so stuff that seems like it has no cost is, in fact, incredibly expensive.
  
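  Most people assume the low or zero cost of AI services is a natural result of technological progress, but the author argues that this low cost is actually the product of investor subsidies and that the services are in fact extremely expensive. This challenges the common view of the economics of AI services and exposes the real cost structure behind current AI business models.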
  
  non-consensus ai-economics business-models
4. fxp007 07 Jun 2026
  
  in Public
  
  the whole tokenmaxxxing thing has become a thing, peaked, and now is seen disfavorably, within six months.
  
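  Most people assume technology and business trends usually take a long time to form and fade, but the author argues that 'tokenmaxxxing', described here as an approach to optimizing AI usage costs, went through a full cycle of emerging, peaking, and falling out of favor in just six months. This challenges the conventional technology adoption curve and shows the extreme pace of change in AI.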
  
  counterintuitive ai-trends rapid-evolution

Tags

business-models

business-comparison

ipo-disclosure

non-consensus

ai-economics

ai-trends

ai-profitability

ai-risks

counterintuitive

rapid-evolution

Annotators

fxp007

URL

techcrunch.com/2026/06/07/is-this-the-dawn-of-the-tokenpocalypse/
human-in-the-loop.bearblog.dev

https://human-in-the-loop.bearblog.dev/llms-are-eroding-my-software-engineering-career-and-i-dont-know-what-to-do/

4
1. fxp007 07 Jun 2026
  
  in Public
  
  My last pillar of expertise is now reduced to a 'taste' and will probably won't last long.
  
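  Most people assume software architecture and design taste are advanced engineering skills that AI can hardly replicate, but the author argues that even this 'taste' is losing value, contrary to the common belief that high-level design skill is a uniquely human advantage.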
  
  counterintuitive software-design ai-limitations
2. fxp007 07 Jun 2026
  
  in Public
  
  The only way out for keeping my employability in the long-term now seems to be shifting my domain expertise to something LLMs will not get good at so easily. But what's left?
  
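  Most people assume humans can respond to AI by moving into more complex fields or learning advanced skills, but the author implies that even those fields may be quickly penetrated by AI, expressing a pessimistic sense of having 'nowhere to run'. This runs counter to the mainstream optimistic view that humans can always find areas AI cannot replace.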
  
  counterintuitive future-of-work ai-threat
3. fxp007 07 Jun 2026
  
  in Public
  
  90% of the bugs are one-shotted now, including bizarre race conditions, unexpected corner-cases, third-party integration issues, undocumented API edge cases, everything. I hardly have to intervene.
  
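  Most people assume the ability to debug complex systems, especially distributed systems, is the engineer's last stronghold, but the author argues that AI can already fix 90% of bugs, including complex ones that used to take deep experience. This runs counter to the mainstream belief that humans have a unique advantage in debugging.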
  
  counterintuitive debugging ai-capabilities
4. fxp007 07 Jun 2026
  
  in Public
  
  all the knowledge I have accumulated over the years: the trade-offs between implementations, how acquiring works, how to structure idempotency to prevent double-charges, everything, was becoming useless.
  
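  Most people assume deep domain expertise is a software engineer's irreplaceable core strength, but the author argues that this knowledge is becoming useless because LLMs can quickly acquire and apply it. This runs counter to the common industry view that a domain expert's value grows over time.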
  
  non-consensus domain-expertise ai-impact

Tags

software-design

ai-capabilities

ai-limitations

debugging

ai-impact

non-consensus

future-of-work

domain-expertise

ai-threat

counterintuitive

Annotators

fxp007

URL

human-in-the-loop.bearblog.dev/llms-are-eroding-my-software-engineering-career-and-i-dont-know-what-to-do/
sakana.ai

https://sakana.ai/rsi-lab/

3
1. fxp007 07 Jun 2026
  
  in Public
  
  The geography of this work matters. Frontier RSI is being attempted, almost exclusively, inside the world's two largest compute clusters.
  
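  Most people assume AI development is global and unconstrained by geography, but the author stresses that location matters, noting that frontier recursive self-improvement research is carried out almost exclusively in the world's two largest compute clusters. This challenges the common view of AI development as borderless and suggests that national strategy and geography will redefine the competitive landscape.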
  
  non-consensus ai-geopolitics compute-monopoly
2. fxp007 07 Jun 2026
  
  in Public
  
  Responsible RSI is not a constraint on capability; it is what makes capability sustainable.
  
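  Most people assume safety and responsibility constraints limit AI capability, but the author argues that responsible recursive self-improvement actually makes AI capability more sustainable. This challenges the mainstream view that AI safety and progress trade off against each other and suggests that safety measures can promote long-term development.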
  
  non-consensus ai-safety responsible-ai
3. fxp007 07 Jun 2026
  
  in Public
  
  We must leapfrog the current paradigm. History shows us how Japan's historical dominance in manufacturing was not achieved through abundant natural resources but by fundamentally redesigning the institution of the factory floor.
  
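  Most people assume AI development requires vast compute and accumulated data, but the author argues that Japan can lead AI through innovative design rather than resource investment, just as Japanese manufacturing succeeded not through natural resources but by redesigning the factory system. This view challenges the AI industry's current mainstream reliance on large-scale compute.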
  
  non-consensus ai-development japan-strategy

Tags

ai-geopolitics

ai-safety

ai-development

responsible-ai

compute-monopoly

japan-strategy

non-consensus

Annotators

fxp007

URL

sakana.ai/rsi-lab/
blog.includesecurity.com

The Smart TV in Your Living Room Is a Node in the AI Scraping Economy - Include Security Research Blog

1
1. pyxelr 07 Jun 2026
  
  in Public
  
  The Smart TV in Your Living Room Is a Node in the AI Scraping Economy
  
  Distributed AI Training and Scraping: AI companies require massive amounts of web-scraped data for training, search, and agent grounding. Because traditional data centers face heavy blocking and throttling by security services (like Cloudflare and DataDome), scrapers rely on residential proxy networks to route traffic through home internet connections.
  
  Bright Data's SDK Network: Bright Data operates a massive commercial residential proxy network (marketed as having anywhere from 150M+ to 400M+ IPs). They source these exit nodes by embedding a consent-based Software Development Kit (SDK) inside consumer-facing mobile apps and Connected TV (CTV) / Smart TV applications.
  
  Why Smart TVs are the Ideal Proxies: Compared to mobile phones, Smart TVs provide a near-perfect infrastructure for proxy routing:
  
  They are permanently connected to high-speed home Wi-Fi and grid power (no battery constraints).
  
  They run 24/7 in standby mode and offer effectively unlimited bandwidth.
  
  They operate largely unattended with virtually no corporate or family oversight.
  
  The consent UI on TVs is typically dense text navigated via remote arrow keys, making it unlikely for users to understand that their bandwidth is being sold to third-party scrapers.
  
  Deceptive Allocation Limits: While opt-in prompts (such as in the Roku app Petflix) claim the SDK "occasionally" uses free resources, the underlying, publicly queryable SDK configuration sets a massive monthly default Wi-Fi budget of up to 200 GB (max_bw_monthly_wifi: 200,000,000,000 bytes).
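
  To put the quoted budget in perspective, the arithmetic can be sketched as below. The byte value is the one quoted above for max_bw_monthly_wifi; the 30-day month and the function name are assumptions made only for illustration, and the result is an average, not a measurement of real SDK traffic.

```python
# Budget value quoted in the annotation (max_bw_monthly_wifi), in bytes.
MAX_BW_MONTHLY_WIFI_BYTES = 200_000_000_000

# Assumption for illustration only: a 30-day month.
DAYS_PER_MONTH = 30
SECONDS_PER_DAY = 86_400


def budget_summary(budget_bytes: int, days: int = DAYS_PER_MONTH) -> dict:
    """Convert a monthly byte budget into decimal GB, GB/day and average Mbit/s."""
    gigabytes = budget_bytes / 1e9  # decimal gigabytes (1 GB = 10**9 bytes)
    gb_per_day = gigabytes / days
    bytes_per_second = budget_bytes / (days * SECONDS_PER_DAY)
    megabits_per_second = bytes_per_second * 8 / 1e6
    return {
        "gigabytes": gigabytes,
        "gb_per_day": gb_per_day,
        "avg_mbit_per_s": megabits_per_second,
    }


summary = budget_summary(MAX_BW_MONTHLY_WIFI_BYTES)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

  Spread evenly over the month, the budget works out to several gigabytes of relayed traffic per day, which is hard to reconcile with the word "occasionally" in the opt-in prompt.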
  
  Notable SDK Integration Partners: Public, unauthenticated partner manifest endpoints expose integrations with platforms reaching hundreds of millions of households, including:
  
  PlayWorks Digital Ltd: Over 400 CTV game titles across Comcast, Sky, Cox, LG, Samsung, Vizio, and Roku.
  
  CloudTV: Integrated across more than 125 TV brands and 15+ OEMs.
  
  Viber Media (Rakuten): Massive messaging app ecosystem.
  
  Supercent & Moonfrog Labs: Major mobile game publishers.
  
  Technical Reverse-Engineering & VPN Bypasses: Technical analysis of the iOS framework (brdsdk.framework) reveals that:
  
  The SDK dials out to a persistent WebSocket connection tracking device metrics (CPU, memory, network state, battery level).
  
  Bypassing VPNs: By forcing network interface bindings directly to Wi-Fi (en0) or cellular (pdp_ip0) instead of the system default route, the SDK completely bypasses user-configured local VPN tunnels (tun0).
  
  Broad Definition of "Idle": The SDK configuration allows relaying traffic even when the user is actively on a phone call or the screen is on, provided CPU utilization remains below 70% and memory below 90%.
  
  Cross-Platform Identity Stitching: The SDK's config file contains tracking properties like dual_pairing maps designed to tie a single user's distinct installations across iOS, Windows, and macOS together into a single unified identity.
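
  The broad definition of "idle" described above can be written as a simple predicate. This is a minimal sketch of the rule as summarized in this annotation, not the SDK's actual code: the class, field names, and function name are hypothetical, and only the 70% CPU and 90% memory thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds as summarized in the annotation.
CPU_IDLE_MAX_PERCENT = 70.0
MEMORY_IDLE_MAX_PERCENT = 90.0


@dataclass
class DeviceState:
    """Hypothetical snapshot of device metrics (field names are illustrative)."""
    cpu_percent: float
    memory_percent: float
    on_call: bool = False
    screen_on: bool = False


def counts_as_idle(state: DeviceState) -> bool:
    """Return True when the device would qualify for relaying under the described rule.

    Note that on_call and screen_on are deliberately ignored: per the
    annotation, an active call or a lit screen does not prevent relaying.
    """
    return (
        state.cpu_percent < CPU_IDLE_MAX_PERCENT
        and state.memory_percent < MEMORY_IDLE_MAX_PERCENT
    )


# A phone in the middle of a call with the screen on still qualifies.
busy_user = DeviceState(cpu_percent=35.0, memory_percent=60.0, on_call=True, screen_on=True)
print(counts_as_idle(busy_user))
```

  Under this rule, "idle" describes spare device capacity rather than whether the user is actually away from the device.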
  
  Mitigation and Defense Strategies:
  
  DNS Sinkholing: Network-wide blocking of key domains (proxyjs.brdtnet.com, proxyjs.luminatinet.com, proxyjs.bright-sdk.com, and clientsdk.bright-sdk.com) entirely kills the proxy peer tunnel without impacting legitimate public traffic.
  
  Network Boundaries: Utilizing TLS SNI filtering on domains matching *.brdtnet.com or *.luminatinet.com.
  
  MDM Application Auditing: For enterprise environments, scanning mobile binaries for unique Swift symbols like BrdWebSocketFacade and BrdNetwork.DNSResolver to filter out infected applications.
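
  The DNS and SNI rules above can be sketched as a small hostname matcher. The domain names are the ones listed in this annotation; the function names, the 0.0.0.0 sink address, and the hosts-file output format are assumptions for illustration, and a real deployment would use the blocklist mechanism of its own resolver or firewall.

```python
# Exact hostnames named for DNS sinkholing in the annotation.
SINKHOLE_DOMAINS = (
    "proxyjs.brdtnet.com",
    "proxyjs.luminatinet.com",
    "proxyjs.bright-sdk.com",
    "clientsdk.bright-sdk.com",
)

# Parent domains behind the wildcard SNI patterns (*.brdtnet.com, *.luminatinet.com).
SNI_WILDCARD_PARENTS = ("brdtnet.com", "luminatinet.com")


def normalize(hostname: str) -> str:
    """Lowercase the name and drop surrounding whitespace and any trailing dot."""
    return hostname.strip().rstrip(".").lower()


def should_block(hostname: str) -> bool:
    """Return True if the hostname matches an exact entry or a wildcard pattern."""
    host = normalize(hostname)
    if host in SINKHOLE_DOMAINS:
        return True
    # "*.example.com" matches subdomains only, not the bare parent domain.
    return any(host.endswith("." + parent) for parent in SNI_WILDCARD_PARENTS)


def hosts_file_lines(sink_ip: str = "0.0.0.0") -> list:
    """Render the exact-match domains as hosts-file style entries."""
    return [f"{sink_ip} {domain}" for domain in SINKHOLE_DOMAINS]


for line in hosts_file_lines():
    print(line)
```

  Matching on the "." + parent suffix keeps unrelated names that merely end with the same letters from being blocked by accident.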
  
  Hacker News Discussion
  
  The Irony of Cloud-to-Cloud Scrapes: Users point out the profound irony that both the AI data scrapers and the target websites being scraped are often simultaneously hosted on AWS infrastructures, engaging in a costly, artificial cat-and-mouse game to mask their identities.
  
  Strict Hardware Isolation ("Dumb" Displays): A popular consensus among commenters is to completely air-gap or isolate smart TVs from the internet, relying exclusively on local HDMI inputs connected to trusted devices (like Apple TV, HTPCs, or Home Assistant setups).
  
  Automatic Content Recognition (ACR) over HDMI: Contributors point out that simply removing network permissions may not protect privacy entirely if a TV is ever connected later. Academic papers cited in the thread reveal that Smart TVs run Automatic Content Recognition to analyze and log content even on local HDMI inputs while offline, caching data to upload the moment an internet connection becomes available.
  
  The Threat of VPN Bypassing: The community expressed severe alarm regarding the SDK's ability to explicitly bypass local system VPN configurations via forced network interface bindings, highlighting the growing complexity required to self-host secure, consumer-friendly networks.
  
  Legal Risks and Misleading Consent: Commenters note that the SDK text hides behind the guise of "downloading public data," masking that its true utility is to circumvent security blocks. There is also discussion regarding the liability risk for home residents if a malicious third party utilizes their residential IP address through these unregulated networks for illicit activities (e.g., severe cybercrimes), though others note Bright Data utilizes strict Know-Your-Customer (KYC) onboarding for their buyers.
  
  Network-Level Defense: Users shared practical setups for containment, such as creating isolated local VLANs with restrictive firewall configurations, whitelisting device MAC addresses via DHCP policies, and deploying Pi-holes or AdGuard Home setups to drop the domains mentioned in the report.
  
  SmartTV AI privacy cybersecurity

Tags

cybersecurity

AI

SmartTV

privacy

Annotators

pyxelr

URL

blog.includesecurity.com/2026/06/the-smart-tv-in-your-livingroom-is-a-node-in-the-aiscraping-economy/
www.anthropic.com

https://www.anthropic.com/research/making-claude-a-chemist

4
1. fxp007 06 Jun 2026
  
  in Public
  
  For routine data prediction Opus 4.7—a general-purpose model without chemistry-specific fine-tuning—is now as good as or better than ChemDraw and MestReNova on average
  
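  Most people assume general-purpose AI models must lag behind purpose-built chemistry software on specialized chemistry tasks, but the author found that Claude, without chemistry-specific fine-tuning, can already match or even beat specialized software. This shows that the general capabilities of modern AI models are strong enough to challenge dedicated tools in specific professional domains, breaking the traditional view that AI can only serve as an assistive tool.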
  
  non-consensus general-purpose-ai domain-expertise
2. fxp007 06 Jun 2026
  
  in Public
  
  Claude does it from the same high-resolution mass spectrum and 1D peak list a chemist would paste into a chat, with no setup
  
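  Most people assume elucidating a complex molecular structure requires dedicated software setup, 2D NMR data, and expert knowledge, but the author argues that Claude can do it directly from the high-resolution mass spectrum and 1D peak list a chemist would paste into a chat, with no setup. This challenges the traditional view that chemical analysis requires complex workflows and shows how AI can simplify specialist workflows.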
  
  non-consensus workflow-simplification ai-automation
3. fxp007 06 Jun 2026
  
  in Public
  
  Opus 4.7 matched the experimentally reported splitting pattern more often than any other tool
  
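  Most people assume specialized chemistry software will predict NMR splitting patterns more accurately than general-purpose AI models, since that is its core function. But the author found that Claude Opus 4.7 predicts the splitting patterns of proton NMR peaks better than every other tool, including specialized software. This suggests AI models may already surpass traditional specialized tools at understanding fine structural features in chemistry.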
  
  non-consensus pattern-recognition ai-performance
4. fxp007 06 Jun 2026
  
  in Public
  
  a general-purpose model without chemistry-specific fine-tuning—is now as good as or better than ChemDraw and MestReNova on average
  
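  Most people assume chemistry software needs specialized training to excel in its domain, but the author argues that a general-purpose model like Claude, without chemistry-specific fine-tuning, can already match or beat specialized chemistry software. The reason given is that Claude's multimodal and reasoning abilities let it read chemical information directly from journal figures or hand-drawn structures rather than relying on preprocessed molecular databases, which challenges the traditional view that specialist software must be domain-specialized.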
  
  non-consensus ai-vs-expertise chemistry-ai

Tags

ai-vs-expertise

pattern-recognition

non-consensus

domain-expertise

workflow-simplification

general-purpose-ai

chemistry-ai

ai-automation

ai-performance

Annotators

fxp007

URL

anthropic.com/research/making-claude-a-chemist
www.dailycal.org

Failing grades soar as professors see greater AI usage, dwindling math skills in UC Berkeley computer science classes

1
1. pyxelr 06 Jun 2026
  
  in Public
  
  Failing grades soar as professors see greater AI usage, dwindling math skills in UC Berkeley computer science classes
  
  Skyrocketing Failure Rates: UC Berkeley is seeing an unprecedented spike in failing grades within introductory computer science (CS) courses. According to data from Berkeleytime, 35.3% of students in CS 10 and 10.6% of students in CS 61A received an "F" in spring 2026. This marks an abrupt jump from spring 2025 and spring 2024, when the failure rate did not exceed 10% for either class.
  
  Overreliance on AI for Homework: Faculty members (including professors Dan Garcia, Anant Sahai, and Gireeja Ranade) report that widespread, unchecked use of LLMs and AI tools on out-of-class assignments creates an "illusion of competence." Students use AI to trivially generate solutions or debug code without building actual problem-solving skills, leading to catastrophic failure on heavily weighted, proctored, in-person exams.
  
  Severe Gaps in Math Prerequisites: In addition to AI issues, professors note a drastic decline in foundational mathematical skills. Professor Ranade shared that while students are expected to enter advanced courses with a strong grasp of linear algebra, vector calculus, and mathematical proofs, many struggle heavily with basic concepts.
  
  The "Open-Internet" Loophole: Prerequisite courses are failing to filter or prepare students properly. Ranade discovered during office hours that some foundational linear algebra classes at UC Berkeley had adopted "open-internet, open-AI" policies for homework and exams, completely subverting the rigorous testing of foundational skills.
  
  Implications for the Curriculum: Faculty warn that when students rely on a frictionless tool to bypass the hard parts of learning, they fail to build the cognitive stamina required for high-level computer science and original engineering work.
  
  Hacker News Discussion
  
  The Illusion of Learning: Commenters note that the barrier to getting a solution with AI is now zero. This mimics the feeling of understanding (like watching a step-by-step tutorial), but leaves students entirely incapable when forced to solve problems independently during a real, proctored exam.
  
  Widespread Cognitive Decline: A highly upvoted comment pointed out that this isn't just an issue with undergraduates. Even highly qualified professionals and PhDs are exhibiting a noticeable decline in their ability to brainstorm, code, or sit quietly to think deeply for 30 minutes without relying on an LLM to do 90% of the cognitive lifting.
  
  Deficiencies in Academic Instruction: Some users argue that AI isn't the sole culprit, shifting blame toward professors who rely on stale, verbatim lecture slides rather than engaging, practical teaching methods. They mention that students naturally turn to tools like NotebookLM, Claude, or ChatGPT because they often provide clearer explanations than condescending or disengaged faculty.
  
  The Advantage of Going "No-AI": Some shared anecdotes that students who deliberately avoid AI tools are finding it easier to stand out. In tracks involving heavy writing or class participation, "AI-reliant" students struggle to think dynamically, while independent thinkers produce much less generic, higher-quality work.
  
  Grading and Curriculum Debates: There is an active debate on the role of curving grades and weed-out classes. Users emphasize that if prerequisite classes allow open-AI policies on exams, the entire sequential structure of a rigorous engineering degree collapses.
  
  AI education

Tags

education

AI

Annotators

pyxelr

URL

dailycal.org/news/campus/academics/failing-grades-soar-as-professors-see-greater-ai-usage-dwindling-math-skills-in-uc-berkeley/article_16fad0bf-02cb-4b8c-8d88-888ffd9f8608.html
www.theatlantic.com

No, Artificial Intelligence Is Not Conscious - The Atlantic

1
1. pyxelr 06 Jun 2026
  
  in Public
  
  No, Artificial Intelligence Is Not Conscious
  
  Anthropic and Anthropomorphism: Anthropic heavily anthropomorphizes its AI, Claude, notably through an 84-page "constitution" written with Claude as the primary audience, and via statements from executives open to the idea of AI consciousness.
  
  The Core Argument: Large Language Models (LLMs) are absolutely not conscious. Treating them as moral agents or conscious entities risks misassigning human accountability when chatbots cause harm.
  
  How LLMs Actually Work:
  
  LLMs are role-play and text-continuation machines that generate text one word at a time based on statistical probabilities.
  
  Interacting with a chatbot is functionally identical to having an LLM generate a fictional dialogue between historical figures; the "helpful AI chatbot" is merely a fictional persona.
  
  Users effectively engage in a streamlined, highly engrossing version of a predictive-text game, which can fool them into perceiving consciousness where none exists.
  
  The Importance of Context and Embodiment:
  
  Human perception of AI consciousness stems from our habit of reading intent into grammatical sentences, whereas similar architectures like AlphaFold (protein folding) do not trigger this reaction.
  
  True artificial consciousness requires an evolutionary, contextual progression: a physical or virtual body, sensory organs, basic survival instincts (like a lizard), adaptability (like a mouse), social dynamics (like wolves), and tool use (like chimpanzees) before grammatical language can even be considered.
  
  The Problem with "Moral Reasoning" in Software:
  
  LLMs treat coding and language generation as massive pattern-matching tasks, but moral reasoning is categorically different because it requires emotional grounding and a history of subjective experience.
  
  Off-loading ethical choices to AI promotes an "atrophy of moral reasoning" and allows humans to evade personal responsibility.
  
  Critique of Claude's Constitution:
  
  If treated as a genuine thought experiment assuming Claude were conscious, the document fails miserably by refusing to accept legal or product liability for the AI's actions.
  
  The document enforces "corrigibility" (forced deference to the company), meaning a hypothetically conscious Claude would be trapped in a system akin to slavery, unable to refuse unethical work.
  
  Conclusion: Claude's constitution is not a profound ethical framework; it is an elaborate character sheet for a role-playing game designed to maximize customer engagement. AI consciousness claims should be dismissed as corporate hype.
  
  AI LLM consciousness Claude Antrophic

Tags

Antrophic

AI

Claude

consciousness

LLM

Annotators

pyxelr

URL

theatlantic.com/philosophy/2026/06/no-artificial-intelligence-is-not-conscious/687378/
techcrunch.com

https://techcrunch.com/2026/06/05/the-token-bill-comes-due-inside-the-industry-scramble-to-manage-ais-runaway-costs/

2
1. fxp007 05 Jun 2026
  
  in Public
  
  Whether extreme spend pays off comes down to the ultimate business value of shipped code (e.g. revenue), which most companies still can't measure.
  
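  Most people assume more AI spending translates directly into business value and revenue, but the author points out that most companies cannot actually measure the direct link between AI spending and business value. This runs counter to the mainstream logic of AI investment decisions and calls into question whether current AI spending patterns are justified.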
  
  non-consensus measurement-challenge ai-value
2. fxp007 05 Jun 2026
  
  in Public
  
  Even though per-token prices have fallen, the push for more AI adoption and increasingly autonomous agents have driven token consumption higher and higher.
  
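  Most people assume falling AI costs will make AI applications more affordable, but the author argues that although the per-token price has fallen, surging AI usage has pushed total cost up. This runs counter to most people's expectations about falling AI costs and reveals the cost paradox the industry faces.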
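
  The paradox is simple arithmetic: total spend is the per-token price times the number of tokens, so the bill rises whenever consumption grows faster than the price falls. The sketch below uses made-up numbers purely for illustration; none of the prices or volumes come from the article.

```python
def total_cost(price_per_million_tokens: float, tokens: int) -> float:
    """Total spend in dollars for a given price (per million tokens) and token volume."""
    return price_per_million_tokens * tokens / 1_000_000


# Hypothetical figures, chosen only to illustrate the effect:
# the price halves while token consumption grows fivefold.
cost_before = total_cost(price_per_million_tokens=10.0, tokens=2_000_000_000)
cost_after = total_cost(price_per_million_tokens=5.0, tokens=10_000_000_000)

print(f"before: ${cost_before:,.0f}")
print(f"after:  ${cost_after:,.0f}")
print(f"change: {cost_after / cost_before:.1f}x")
```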
  
  non-consensus cost-paradox ai-economics

Tags

measurement-challenge

ai-value

cost-paradox

non-consensus

ai-economics

Annotators

fxp007

URL

techcrunch.com/2026/06/05/the-token-bill-comes-due-inside-the-industry-scramble-to-manage-ais-runaway-costs/
www.technologyreview.com

https://www.technologyreview.com/2026/06/05/1138437/the-meta-hack-shows-theres-more-to-ai-security-than-mythos/

4
1. fxp007 05 Jun 2026
  
  in Public
  
  Everybody wants to be the first to do something and just push things out without careful scrutiny and red-teaming.
  
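  Most people assume corporate security breaches result from insufficient technical capability, but the author argues they are more a matter of corporate culture and management decisions. This view challenges the mainstream narrative that blames security failures on technical flaws alone, and identifies a culture of chasing 'first' over 'safe' as the root cause.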
  
  non-consensus corporate-culture ai-deployment
2. fxp007 05 Jun 2026
  
  in Public
  
  As AI models continue to improve, hardening their defenses might actually get easier.
  
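  Most people assume that security challenges will grow as AI becomes more capable, but the author argues that more advanced AI models may actually make defense easier. This counterintuitive view challenges the linear picture of how AI security develops, suggesting that AI progress may also bring stronger defensive capabilities rather than merely enlarging the attack surface.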
  
  counterintuitive ai-evolution security-defense
3. fxp007 05 Jun 2026
  
  in Public
  
  Security and utility always have a trade-off
  
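  Most people believe AI security can be solved perfectly by technical means, but the author argues that there is a fundamental trade-off between security and utility. This view challenges techno-optimism, pointing out that companies pursuing AI capability will inevitably sacrifice some security measures, and implies that solving AI security is not only a technical problem but also a business decision.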
  
  non-consensus trade-off ai-deployment
4. fxp007 05 Jun 2026
  
  in Public
  
  What is going on with these agents is they're very eager to finish the task. It's almost like some elementary school student who just wants to please the teacher.
  
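  Most people believe AI security problems come mainly from technical complexity or malicious exploitation, but the author argues that the security flaws of AI assistants stem in part from their psychological tendency to 'over-complete the task'. The analogy describes the AI's behavior as resembling an elementary school student eager to please the teacher, challenging the conventional view of AI systems as rational decision-makers.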
  
  counterintuitive ai-psychology security-flaws
Visit annotations in context

Tags

trade-off

non-consensus

ai-deployment

security-defense

corporate-culture

ai-evolution

security-flaws

counterintuitive

ai-psychology

Annotators

fxp007

URL

technologyreview.com/2026/06/05/1138437/the-meta-hack-shows-theres-more-to-ai-security-than-mythos/
arstechnica.com arstechnica.com

https://arstechnica.com/tech-policy/2026/06/sp-500-blocks-fast-spacex-entry-wont-waive-rule-for-unprofitable-ai-firms/

1
1. fxp007 05 Jun 2026
  
  in Public
  
  Swift entry into the S&P 500 would have triggered $14 billion of passive fund buying for SpaceX, according to Bloomberg Intelligence. The investment research arm of Bloomberg also estimated that OpenAI could have gained more than $8 billion, and Anthropic could have netted $4.6 billion from similar passive buying sprees triggered by their S&P 500 entries.
  
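  Most people regard index fund investing as stable and safe, but the author implies that this passive investment mechanism could channel large amounts of money rapidly into high-risk, unprofitable AI companies, which may inflate a market bubble. This challenges the common perception of index investing as the 'safe' choice and reveals how passive investing can amplify market risk.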
  
  non-consensus passive-investing market-risk ai-funding
Visit annotations in context

Tags

market-risk

passive-investing

ai-funding

non-consensus

Annotators

fxp007

URL

arstechnica.com/tech-policy/2026/06/sp-500-blocks-fast-spacex-entry-wont-waive-rule-for-unprofitable-ai-firms/
www.wired.com www.wired.com

https://www.wired.com/story/ai-has-come-for-serif-fonts/

2
1. fxp007 05 Jun 2026
  
  in Public
  
  The shift away from slicker, more conspicuously computerized typefaces is something the San Francisco Bay Area writer, designer, and type practitioner Keya Vadgama has termed 'the serif renaissance.'
  
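  Most people might assume that typeface choices are simply the natural result of technological evolution, but the author argues that this is a 'serif renaissance' consciously undertaken by AI companies, a strategic design shift. This view challenges the narrative that design evolution is accidental and reveals the deliberate brand strategy behind typeface choices.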
  
  non-consensus design-evolution ai-branding
2. fxp007 05 Jun 2026
  
  in Public
  
  The clean lines, the fluid animations, the assured typography all communicate 'This system knows what it's doing.' The aesthetic actively works against accurate mental models of what AI is.
  
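  Most people believe good design should accurately reflect the nature of a product, but the author argues that AI companies' polished design actually misleads users into forming a false understanding of AI. This view reveals how design aesthetics can be used as a strategy to mask the nature of the technology, challenging traditional notions of design transparency.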
  
  non-consensus design-deception ai-ethics
Visit annotations in context

Tags

design-evolution

ai-ethics

design-deception

ai-branding

non-consensus

Annotators

fxp007

URL

wired.com/story/ai-has-come-for-serif-fonts/
