1,276 Matching Annotations

Mar 2023
www.technologyreview.com www.technologyreview.com

The inside story of how ChatGPT was built from the people who made it

4
1. peter_murray 06 Mar 2023
  
  in Public
  
  The required behavior of a large language model for something like search is very different than for something that’s just meant to be a playful chatbot. We need to figure out how we walk the line between all these different uses, creating something that’s useful for people across a range of contexts, where the desired behavior might really vary
  
  Context of use matters when setting behavior
2. peter_murray 06 Mar 2023
  
  in Public
  
  Every time we have a better model, we want to put it out and test it. We’re very optimistic that some targeted adversarial training can improve the situation with jailbreaking a lot. It’s not clear whether these problems will go away entirely, but we think we can make a lot of the jailbreaking a lot more difficult. Again, it’s not like we didn’t know that jailbreaking was possible before the release. I think it’s very difficult to really anticipate what the real safety problems are going to be with these systems once you’ve deployed them. So we are putting a lot of emphasis on monitoring what people are using the system for, seeing what happens, and then reacting to that. This is not to say that we shouldn’t proactively mitigate safety problems when we do anticipate them. But yeah, it is very hard to foresee everything that will actually happen when a system hits the real world.
  
  Jailbreaks we’re anticipated, but the huge public uptake required more and faster effort to fix.
3. peter_murray 06 Mar 2023
  
  in Public
  
  large language model OpenAI
4. peter_murray 06 Mar 2023
  
  in Public
  
  Since November, OpenAI has already updated ChatGPT several times. The researchers are using a technique called adversarial training to stop ChatGPT from letting users trick it into behaving badly (known as jailbreaking). This work pits multiple chatbots against each other: one chatbot plays the adversary and attacks another chatbot by generating text to force it to buck its usual constraints and produce unwanted responses. Successful attacks are added to ChatGPT’s training data in the hope that it learns to ignore them.
  
  Adversarial training with ChatGPT
  
  The bot gets pitted against itself to see if it can be broken. Since there is a randomization factor in each generated stream, there is a possibility that a chat sequence can get around the defenses.
Visit annotations in context

Tags

large language model

OpenAI

Annotators

peter_murray

URL

technologyreview.com/2023/03/03/1069311/inside-story-oral-history-how-chatgpt-built-openai/
www.ted.com www.ted.com

Jan Wurzbacher: The massive machines removing carbon from Earth's atmosphere | TED Talk

4
1. peter_murray 04 Mar 2023
  
  in Public
  
  it will be important that we can all ramp up the capacity of extracting and removing CO2 from the atmosphere. Now, when thinking about the latter, so far I've spoken about technology, but we shouldn't forget that also, nature offers several solutions to extract carbon from the air, such as forests and oceans. And one element that is very important will be doubling down on these methods offered by nature, enhancing them and protecting them.
  
  Importance of technological extraction methods to supplement natural extraction methods
2. peter_murray 04 Mar 2023
  
  in Public
  
  there is not a lot of CO2 in the air. We're currently at around 420 ppm. That means one molecule out of 2,500 molecules in the air around us is CO2. That's not a lot. And that means to extract only one ton of CO2 from the air, we need to filter around two million cubic meters of air. That's about 800 Olympic swimming pools.
  
  CO2 is rather low density in air, making it challenging to capture
3. peter_murray 04 Mar 2023
  
  in Public
  
  This is Orca. This is the first worldwide commercial direct air capture and storage plant. It is in Iceland, and it is an industrial plant that extracts CO2 out of ambient air. We have operated now for more than one year. It costs more than 10 million dollars to build Orca. And its modules, those eight boxes that we call CO2 collectors, they are designed to extract a bit more than 10 tons of carbon dioxide from the air every day.
  
  Direct capture of CO2 from the atmosphere at near-commercial scale
  
  It uses a absorbent material to capture the CO2, then heats the material to 100°C to extract the CO2. Climeworks then mixes the CO2 with water before injecting into volcanic basalt rock formations where it solidifies into crystals after about 2 years.
4. peter_murray 04 Mar 2023
  
  in Public
  
  From TED Countdown London 2022
  
  Abstract
  
  To restrain global warming, we know we need to drastically reduce pollution. The very next step after that: using both natural and technological solutions to trap as much excess carbon dioxide from the air as possible. Enter Orca, the world's first large-scale direct air capture and storage plant, built in Iceland by the team at Climeworks, led by climate entrepreneur Jan Wurzbacher. This plant is capable of removing 4,000 tons of carbon dioxide from the air each year. With affordability and scalability in mind, Wurzbacher shares his vision for what comes after Orca, the future of carbon removal tech -- and why these innovations are crucial to stop climate change.
  
  climate change
Visit annotations in context

Tags

climate change

Annotators

peter_murray

URL

ted.com/talks/jan_wurzbacher_the_massive_machines_removing_carbon_from_earth_s_atmosphere/c/transcript
deliverypdf.ssrn.com deliverypdf.ssrn.com

SSRN-id4213674.pdf

14
1. peter_murray 04 Mar 2023
  
  in Public
  
  Such regulation is already being pursued in Europe, where the DigitalServices Act would require large platforms to interoperate, a requirementthat could easily be modified to include the Fediverse.
  
  EU Digital Services Act interoperable requirement
  
  digital services act
2. peter_murray 04 Mar 2023
  
  in Public
  
  A different concern with decentralized moderation is that it will leadto “filter bubbles” and “echo chambers” by which instance members willchoose to only interact with like-minded users.
  
  Risk of filter bubbles
  
  filter bubble
3. peter_murray 04 Mar 2023
  
  in Public
  
  The benefit of decentralized moderation is that it can satisfy both thosethose that want to speak and those that don’t want to listen. By empoweringusers, through their choice of instance, to avoid content they find objection-able, the Fediverse operationalizes the principle that freedom of speech is notthe same as freedom of reach
  
  Decentralized moderation satisfies freedom to speak and freedom not to listen
  
  first amendment
4. peter_murray 04 Mar 2023
  
  in Public
  
  And if effective moderation turns out to requiremore infrastructure, that could lead to a greater consolidation of instances.This is what happened with email, which, in part due to the investmentsnecessary to counter spam, has become increasingly dominated by Googleand Microsoft.
  
  Will consolidation of email providers point to consolidation of fediverse instances?
  
  This is useful to note. The email protocols are open and one can chose to host their own email server that—at a protocol level—can interoperate with any other. At a practical level, though, there is now service requirements (spam filtering) and policy choices (only accepting mail from known good sending servers) that limit the reach of a new, bespoke mail server.
  
  What would the equivalent of an email spam problem look like on the fediverse?
  
  email infrastructure
5. peter_murray 04 Mar 2023
  
  in Public
  
  Gab is a useful case study in how decentralized social media can self-police. On the one hand, there was no way for Mastodon to expel Gabfrom the Fediverse. As Mastodon’s founder Eugen Rochko explained, “Youhave to understand it’s not actually possible to do anything platform-widebecause it’s decentralized. . . . I don’t have the control.”32 On the other hand,individual Mastodon instances could—and the most popular ones did—refuse to interact with the Gab instance, effectively cutting it off from mostof the network in a spontaneous, bottom-up process of instance-by-instancedecisionmaking. Ultimately, Gab was left almost entirely isolated, with morethan 99% of its users interacting only with other Gab users. Gab respondedby “defederating”: voluntarily cutting itself off from the remaining instancesthat were still willing to communicate with it.
  
  Gab's attempt to join the fediverse was rejected by Mastodon admins
6. peter_murray 04 Mar 2023
  
  in Public
  
  https://perma.cc/G82J-73WX
  
  Dead link, unfortunately. The original post was deleted, and the archive is not available at perma.cc. Unfortunately it looks like it wasn't captured by wayback nor archive.today
7. peter_murray 04 Mar 2023
  
  in Public
  
  Mastodon instances thus operate according to the principle of content-moderation subsidiarity: content-moderation standards are set by, and differacross, individual instances. Any given Mastodon instance may have rulesthat are far more restrictive than those of the major social media platforms.But the network as a whole is substantially more speech protective thanare any of the major social media platforms, since no user or content canbe permanently banned from the network and anyone is free to start aninstance that communicates both with the major Mastodon instances and theperipheral, shunned instances.
  
  Content-moderation subsidiarity means no user is banned
  
  A user banned from one instance can join another instance or start an instance of their own to federate with the network. This is more protective of speech rights than centralized networks.
  
  It does make persistent harassment a threat, though. If the cost/effort of creating new instance after new instance is low enough, then a motivated actor can harass users and instances.
8. peter_murray 04 Mar 2023
  
  in Public
  
  The Fediverse indeed does, because its decentralization is a matter ofarchitecture, not just policy. A subreddit moderator has control only insofar asReddit, a soon-to-be public company,22 permits that control. Because Redditcan moderate any piece of content—indeed, to ban a subreddit outright—no matter whether the subreddit moderator agrees, it is subject to publicpressure to do so. Perhaps the most famous example is Reddit’s banning ofthe controversial pro-Trump r/The_Donald subreddit several months beforethe 2020 election.
  
  Reddit : Fediverse :: Decentralized-by-policy : Decentralized-by-architecture
  
  Good point! It makes me think that fediverse instances can look to subreddit governance as models for their own governance structures.
  
  fediverse governance
9. peter_murray 04 Mar 2023
  
  in Public
  
  When a user decides to moves instances, theymigrate their account data—including their blocked, muted, and followerusers lists and post history—and their followers will automatically refollowthem at their new account.
  
  Not all content moves when migrating in Mastodon
  
  This is not entirely true at the time of publication. Post history, for instance, does not move from one server to another. There is perhaps good reason for this...the new instance owner may not want to take on the liability of content that is automatically moved to their server.
10. peter_murray 04 Mar 2023
  
  in Public
  
  content-moderation subsidiarity. Just asthe general principle of political subsidiarity holds that decisions should bemade at the lowest organizational level capable of making such decisions,15content-moderation subsidiarity devolves decisions to the individual in-stances that make up the overall network.
  
  Content-moderation subsidiarity
  
  In the fediverse, content moderation decisions are made at low organization levels—at the instance level—rather than on a global scale.
  
  content moderation
11. peter_murray 04 Mar 2023
  
  in Public
  
  moderator’s trilemma. The first prong is that platform userbases arelarge and diverse. The second prong is that the platforms use centralized,top-down moderation policies and practices. The third prong is that theplatforms would like to avoid angering large swaths of their users. But thepast decade of content moderation controversies suggests that these threegoals can’t all be met. The large closed platforms are unwilling to shrink theiruser bases or give up control over content moderation, so they have tacitlyaccepted high levels of dissatisfaction with their moderation decisions. TheFediverse, by contrast, responds to the moderator’s trilemma by giving upon centralized moderation.
  
  Moderator's Trilemma
  
  Classic case of can't have it all:
  
  Large, diverse userbases
  
  Centralized, top-down moderation
  
  Happy users
  
  The fediverse gives up #2, but only after giving up #1? Particularly the "large" part?
12. peter_murray 04 Mar 2023
  
  in Public
  
  An early challenge to the open Internet came from the first generation of giant onlineservices providers like America Online, Compuserve, and Prodigy, which combineddial-up Internet access with an all-encompassing web portal that provided bothInternet content and messaging. But as Internet speeds increased and web browsingimproved, users discovered that the limits of these closed systems outweighed theirbenefits, and they faded into irrelevance by the 2000s.
  
  AOL and the like as early examples of closed systems that were replaced by open environments
13. peter_murray 04 Mar 2023
  
  in Public
  
  A core architectural building block of the Internet is the open protocol. A protocolis the rules that govern the transmission of data. The Internet consists of manysuch protocols, ranging from those that direct the physical transmission ofdata to those that govern the most common Internet applications, like emailor web browsing. Crucially, all these protocols are open, in that anyone canset up and operate a router, website, or email server without needing toregister with or get permission from a central authority.5 Open protocolswere key to the first phase of the Internet’s growth because they enabledunfettered access, removing barriers and bridging gaps between differentcommunities. This enabled and encouraged interactions between groupswith various interests and knowledge, resulting in immense creativity andidea-sharing.
  
  Internet built on open protocols
  
  The domain name registration isn't as much of an outlier as this author makes it out to be. DNS itself is an open protocol—any server can be queried by any client. The DNS registration process replaced manual host tables on each node, which quickly grew unscalable. There are similar notions of port registration, MIME-type registration, and other registries.
  
  hourglass protocol stack shape
14. peter_murray 04 Mar 2023
  
  in Public
  
  There is a limit to how heatedthe debates around email content moderation can be, because there’s anarchitectural limit to how much email moderation is possible. This raises theintriguing possibility of what social media, and its accompanying content-moderation issues, would look like if it, too, operated as a decentralizedprotocol.
  
  Comparing email moderation and centralized moderation
Visit annotations in context

Tags

digital services act

fediverse governance

email infrastructure

content moderation

first amendment

filter bubble

hourglass protocol stack shape

Annotators

peter_murray

URL

deliverypdf.ssrn.com/delivery.php
www.washingtonpost.com www.washingtonpost.com

Tech’s hottest new job: AI whisperer. No coding required.

3
1. peter_murray 03 Mar 2023
  
  in Public
  
  Roughly 700 prompt engineers now use PromptBase to sell prompts by commission for buyers who want, say, a custom script for an e-book or a personalized “motivational life coach.” The freelance site Fiverr offers more than 9,000 listings for AI artists; one seller offers to “draw your dreams into art” for $5.
  
  Prompts are for sale
  
  Freelancer and gig-economy work.
2. peter_murray 03 Mar 2023
  
  in Public
  
  Some AI experts argue that these engineers only wield the illusion of control. No one knows how exactly these systems will respond, and the same prompt can yield dozens of conflicting answers — an indication that the models’ replies are based not on comprehension but on crudely imitating speech to resolve tasks they don’t understand.“Whatever is driving the models’ behavior in response to the prompts is not a deep linguistic understanding,” said Shane Steinert-Threlkeld, an assistant professor in linguistics who is studying natural language processing at the University of Washington. “They explicitly are just telling us what they think we want to hear or what we have already said. We’re the ones who are interpreting those outputs and attributing meaning to them.”He worried that the rise of prompt engineering would lead people to overestimate not just its technical rigor but also the reliability of the results anyone could get from a deceptive and ever-changing black box.“It’s not a science,” he said. “It’s ‘let’s poke the bear in different ways and see how it roars back.’”
  
  Prompt engineering is not science
3. peter_murray 03 Mar 2023
  
  in Public
  
  prompt engineer. His role involves creating and refining the text prompts people type into the AI in hopes of coaxing from it the optimal result. Unlike traditional coders, prompt engineers program in prose, sending commands written in plain text to the AI systems, which then do the actual work.
  
  Summary of prompt engineer work
  
  prompt engineering
Visit annotations in context

Tags

prompt engineering

Annotators

peter_murray

URL

washingtonpost.com/technology/2023/02/25/prompt-engineers-techs-next-big-job/
nymag.com nymag.com

You Are Not a Parrot

15
1. peter_murray 03 Mar 2023
  
  in Public
  
  “Now you’re getting one of the most important points,” Lemoine said. “Whether these things actually are people or not — I happen to think they are; I don’t think I can convince the people who don’t think they are — the whole point is you can’t tell the difference. So we are going to be habituating people to treat things that seem like people as if they’re not.”
  
  When you can tell the difference between algorithm and humanity
  
  Quote from Blake Lemoine, Google AI researcher fired after claiming that LaMDA, Google’s LLM, was sentient
2. peter_murray 03 Mar 2023
  
  in Public
  
  But the road from language model to existential crisis is short indeed. Joseph Weizenbaum, who created ELIZA, the first chatbot, in 1966, spent most of the rest of his life regretting it. The technology, he wrote ten years later in Computer Power and Human Reason, raises questions that “at bottom … are about nothing less than man’s place in the universe.” The toys are fun, enchanting, and addicting, and that, he believed even 47 years ago, will be our ruin: “No wonder that men who live day in and day out with machines to which they believe themselves to have become slaves begin to believe that men are machines.”
  
  Creator of ELIZA in 1956 regrets doing so
  
  Is conversing with machines a natural thing to do?
3. peter_murray 03 Mar 2023
  
  in Public
  
  Bender and Manning’s biggest disagreement is over how meaning is created — the stuff of the octopus paper. Until recently, philosophers and linguists alike agreed with Bender’s take: Referents, actual things and ideas in the world, like coconuts and heartbreak, are needed to produce meaning. This refers to that. Manning now sees this idea as antiquated, the “sort of standard 20th-century philosophy-of-language position.” “I’m not going to say that’s completely invalid as a position in semantics, but it’s also a narrow position,” he told me. He advocates for “a broader sense of meaning.” In a recent paper, he proposed the term distributional semantics: “The meaning of a word is simply a description of the contexts in which it appears.” (When I asked Manning how he defines meaning, he said, “Honestly, I think that’s difficult.”)
  
  Distributional Semantics
  
  Christopher Manning, a computational linguist And director of the Stanford Artificial Intelligence Laboratory, is a proponent of this theory of linguistical meaning.
4. peter_murray 03 Mar 2023
  
  in Public
  
  stochastic parrot (coinage Bender’s) is an entity “for haphazardly stitching together sequences of linguistic forms … according to probabilistic information about how they combine, but without any reference to meaning.” In March 2021, Bender published “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” with three co-authors. After the paper came out, two of the co-authors, both women, lost their jobs as co-leads of Google’s Ethical AI team. The controversy around it solidified Bender’s position as the go-to linguist in arguing against AI boosterism.
  
  Stochastic Parrot definition
  
  The term was coined by Bender. It is “not a write-up of original research. It’s a synthesis of LLM critiques that Bender and others have made”
  
  Two co-authors on [[Google]]’s Ethical AI team lost their jobs after publication.
  
  OpenAI CEO Sam Altman And others adopted the phrase and turned it benign.
5. peter_murray 03 Mar 2023
  
  in Public
  
  OpenAI also contracted out what’s known as ghost labor: gig workers, including some in Kenya (a former British Empire state, where people speak Empire English) who make $2 an hour to read and tag the worst stuff imaginable — pedophilia, bestiality, you name it — so it can be weeded out. The filtering leads to its own issues. If you remove content with words about sex, you lose content of in-groups talking with one another about those things.
  
  OpenAI’s use of human taggers
  
  content moderation
6. peter_murray 03 Mar 2023
  
  in Public
  
  Our List of Dirty, Naughty, Obscene, and Otherwise Bad Word
  
  https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
7. peter_murray 03 Mar 2023
  
  in Public
  
  Tech-makers assuming their reality accurately represents the world create many different kinds of problems. The training data for ChatGPT is believed to include most or all of Wikipedia, pages linked from Reddit, a billion words grabbed off the internet
  
  LLMs as a model of reality, but not reality
  
  There are limits to any model. In this case, the training data. What biases are implicitly in that model based on how it was selected and what it contained?
  
  The paragraph goes on to list some biases: race, wealth, and “vast swamps”
  
  algorithmic bias
8. peter_murray 03 Mar 2023
  
  in Public
  
  It can’t include, say, e-book copies of everything in the Stanford library, as books are protected by copyright law.
  
  GPT-3 doesn’t contain book content in its training?
  
  “Copyright” can’t be an answer because everything post-1976 is copyrighted.
  
  copyright
9. peter_murray 03 Mar 2023
  
  in Public
  
  “Systematic Approaches to Learning Algorithms and Machine Inferences.” Then people would be out here asking, “Is this SALAMI intelligent? Can this SALAMI write a novel? Does this SALAMI deserve human rights?”
  
  AI -> SALAMI
  
  “[[Artificial Intelligence]]” as a phrase has a white supremacy background. Besides, who gets to define “intelligent” and against what metric?
10. peter_murray 03 Mar 2023
  
  in Public
  
  “We call on the field to recognize that applications that aim to believably mimic humans bring risk of extreme harms,” she co-wrote in 2021. “Work on synthetic human behavior is a bright line in ethical Al development, where downstream effects need to be understood and modeled in order to block foreseeable harm to society and different social groups.”
  
  Synthetic human behavior as AI bright line
  
  Quote from Bender
  
  computing ethics
11. peter_murray 03 Mar 2023
  
  in Public
  
  We go around assuming ours is a world in which speakers — people, creators of products, the products themselves — mean to say what they say and expect to live with the implications of their words. This is what philosopher of mind Daniel Dennett calls “the intentional stance.” But we’ve altered the world. We’ve learned to make “machines that can mindlessly generate text,” Bender told me when we met this winter. “But we haven’t learned how to stop imagining the mind behind it.”
  
  Intentional Stance
  
  We (mostly) assume people mean what they say. What happens when we live in a world where we can no longer assume that.
  
  intentional stance
12. peter_murray 03 Mar 2023
  
  in Public
  
  The models are built on statistics. They work by looking for patterns in huge troves of text and then using those patterns to guess what the next word in a string of words should be. They’re great at mimicry and bad at facts. Why? LLMs, like the octopus, have no access to real-world, embodied referents. This makes LLMs beguiling, amoral, and the Platonic ideal of the bullshitter, as philosopher Harry Frankfurt, author of On Bullshit, defined the term. Bullshitters, Frankfurt argued, are worse than liars. They don’t care whether something is true or false. They care only about rhetorical power — if a listener or reader is persuaded.
  
  Why LLMs are “great at mimicry and bad at facts”
13. peter_murray 03 Mar 2023
  
  in Public
  
  Climbing Towards NLU: On Meaning, Form, and Understanding in the Age of Data
  
  Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data (Bender & Koller, ACL 2020)
14. peter_murray 03 Mar 2023
  
  in Public
  
  Say that A and B, both fluent speakers of English, are independently stranded on two uninhabited islands. They soon discover that previous visitors to these islands have left behind telegraphs and that they can communicate with each other via an underwater cable. A and B start happily typing messages to each other. Meanwhile, O, a hyperintelligent deep-sea octopus who is unable to visit or observe the two islands, discovers a way to tap into the underwater cable and listen in on A and B’s conversations. O knows nothing about English initially but is very good at detecting statistical patterns. Over time, O learns to predict with great accuracy how B will respond to each of A’s utterances. Soon, the octopus enters the conversation and starts impersonating B and replying to A. This ruse works for a while, and A believes that O communicates as both she and B do — with meaning and intent. Then one day A calls out: “I’m being attacked by an angry bear. Help me figure out how to defend myself. I’ve got some sticks.” The octopus, impersonating B, fails to help. How could it succeed? The octopus has no referents, no idea what bears or sticks are. No way to give relevant instructions, like to go grab some coconuts and rope and build a catapult. A is in trouble and feels duped. The octopus is exposed as a fraud.
  
  The LLM Octopus Problem
  
  The octopus has observed the conversations and starts to impersonate one of the participants. What happens when the octopus lacks context to hold up its end of the conversation.
15. peter_murray 03 Mar 2023
  
  in Public
  
  large language model
Visit annotations in context

Tags

large language model

algorithmic bias

copyright

content moderation

computing ethics

intentional stance

Annotators

peter_murray

URL

nymag.com/intelligencer/article/ai-artificial-intelligence-chatbots-emily-m-bender.html
Feb 2023
pluralistic.net pluralistic.net

Pluralistic: Podcasting “Twiddler” (27 Feb 2023) – Pluralistic: Daily links from Cory Doctorow

2
1. peter_murray 28 Feb 2023
  
  in Public
  
  I have no doubt that robber barons would have engaged in zuckerbergian shenanigans if they could have – but here we run up against the stubborn inertness of atoms and the slippery liveliness of bits. Changing a railroad schedule to make direct connections with cities where you want to destroy a rival ferry business (or hell, laying track to those cities) is a slow proposition. Changing the content recommendation system at Facebook is something you do with a few mouse-clicks.
  
  Difference between railroad monopolies and digital platform monopolies
  
  Railroad monopolies were limited to physical space and time. Platform monopolies can easily change algorithms to shift user attention and content recommendations.
2. peter_murray 28 Feb 2023
  
  in Public
  
  Enshittification, you'll recall, is the lifecycle of the online platform: first, the platform allocates surpluses to end-users; then, once users are locked in, those surpluses are taken away and given to business-customers. Once the advertisers, publishers, sellers, creators and performers are locked in, the surplus is clawed away from them and taken by the publishers.
  
  Defining "enshittification"
  
  The post continues to explain how this happened with [[Facebook]]:
  
  1 First, gain a huge user base with network effects and lock users into the platform. 2. Spy on users to offer precision targeted advertising. Companies added beacons to websites to improve targeting. 3. Raise ad rates and decrease use of expensive anti-fraud measures.
  
  enshittification
Visit annotations in context

Tags

enshittification

Annotators

peter_murray

URL

pluralistic.net/2023/02/27/knob-jockeys/
deliverypdf.ssrn.com deliverypdf.ssrn.com

SSRN-id4213674.pdf

1
1. peter_murray 21 Feb 2023
  
  in Public
  
  Rozenshtein, Alan Z., Moderating the Fediverse: Content Moderation on Distributed Social Media (November 23, 2022). 2 Journal of Free Speech Law (2023, Forthcoming), Available at SSRN: https://ssrn.com/abstract=4213674 or http://dx.doi.org/10.2139/ssrn.4213674
  
  Found via Nathan Schneider
  
  Abstract
  
  Current approaches to content moderation generally assume the continued dominance of “walled gardens”: social media platforms that control who can use their services and how. But an emerging form of decentralized social media—the "Fediverse"—offers an alternative model, one more akin to how email works and that avoids many of the pitfalls of centralized moderation. This essay, which builds on an emerging literature around decentralized social media, seeks to give an overview of the Fediverse, its benefits and drawbacks, and how government action can influence and encourage its development.
  
  Part I describes the Fediverse and how it works, beginning with a general description of open versus closed protocols and then proceeding to a description of the current Fediverse ecosystem, focusing on its major protocols and applications. Part II looks at the specific issue of content moderation on the Fediverse, using Mastodon, a Twitter-like microblogging service, as a case study to draw out the advantages and disadvantages of the federated content-moderation approach as compared to the current dominant closed-platform model. Part III considers how policymakers can encourage the Fediverse, whether through direct regulation, antitrust enforcement, or liability shields.
  
  fediverse governance content moderation
Visit annotations in context

Tags

content moderation

fediverse governance

Annotators

peter_murray

URL

deliverypdf.ssrn.com/delivery.php
nymag.com nymag.com

Why Bing Is Being Creepy

5
1. peter_murray 18 Feb 2023
  
  in Public
  
  In ChatGPT and Bing’s conversations about themselves, you see evidence of the corpus everywhere: the sci-fi, the news articles with boilerplate paragraphs about machine uprisings, the papers about what AI researchers are working on. You also see evidence of the more rigorous coverage and criticism of OpenAI, etc., that has elucidated possible harms that could result from the careless deployment of AI tools.
  
  ChatGPT/Bing's self-reflection comes from the corpus of discussions about AI that it has ingested
2. peter_murray 18 Feb 2023
  
  in Public
  
  Tay is from a different generation of AI and bears little technical resemblance to something like the new Bing. Still, it was orders of magnitude more sophisticated, and less technologically comprehensible to its users, than something like ELIZA.
  
  Microsoft's Tay project
3. peter_murray 18 Feb 2023
  
  in Public
  
  Attempting to thwart a simple rules-based chatbot is mostly a matter of discovering dead ends and mapping the machine; the new generation of chatbots just keeps on generating. Per Weizenbaum, however, that should be an invitation to bring them back over the threshold, as even lay people eventually did with bots like ELIZA, no programming knowledge required. In other words, what’s happening in these encounters is weird and hard to explain — but, also, similarly, with a little distance, it makes sense. It’s intuitive.
  
  Thwarting Eliza versus thwarting Sydney
4. peter_murray 18 Feb 2023
  
  in Public
  
  More interesting or alarming or hilarious, depending on the interlocutor, is its propensity to challenge or even chastise its users, and to answer, in often emotional language, questions about itself.
  
  Examples of Bing/ChatGPT/Sydney gaslighting users
  
  Being very emphatic about the current year being 2022 instead of 2023
  
  How Sydney spied on its developers
  
  How Sydney expressed devotion to the user and expressed a desire to break up a marriage
  
  Microsoft large language model
5. peter_murray 18 Feb 2023
  
  in Public
  
  In his 1976 book, Computer Power and Human Reason: From Judgment to Calculation, the computer scientist Joseph Weizenbaum observed some interesting tendencies in his fellow humans. In one now-famous anecdote, he described his secretary’s early interactions with his program ELIZA, a proto-chatbot he created in 1966.
  
  Description of Joseph Weizenbaum's ELIZA program
  
  When rule-based artificial intelligence was the state-of-the-art.
  
  artificial inteligence
Visit annotations in context

Tags

large language model

Microsoft

artificial inteligence

Annotators

peter_murray

URL

nymag.com/intelligencer/2023/02/why-bing-is-being-creepy.html
storage.courtlistener.com storage.courtlistener.com

Microsoft Word - 98922602_16.docx

13
1. peter_murray 08 Feb 2023
  
  in Public
  
  Upon information and belief, Stability AI has copiedmore than 12 million photographs from Getty Images’ collection, along with the associatedcaptions and metadata, without permission from or compensation to Getty Images, as part of itsefforts to build a competing business. As part of its unlawful scheme, Stability AI has removedor altered Getty Images’ copyright management information, provided false copyrightmanagement information, and infringed Getty Images’ famous trademarks.
  
  Grounds for complaint
  
  Removed/altered Getty's "copyright management information" (presumably the visible watermark plus attribution, perhaps some embedded steganography data as well)
  
  False copyright information (that there is no copyright on AI-generated images?)
  
  Infringing on trademark (Stable Diffusion creates a watermark that resembles Getty Images')
2. peter_murray 08 Feb 2023
  
  in Public
  
  Making matters worse, Stability AI has caused the Stable Diffusion model toincorporate a modified version of the Getty Images’ watermark to bizarre or grotesque syntheticimagery that tarnishes Getty Images’ hard-earned reputation, such as the image below
  
  Very similar watermark implies Getty Images affiliation
  
  Two points in this section of the complaint:
  
  in paragraph 58, Getty Images says that "Stability AI has knowingly removed" the watermark from some of the images, but it does not provide evidence of that in the complaint.
  
  in paragraph 59, the AI-generated image created a watermark that strongly resembles the Getty Images watermark, and this watermark is on an image that Getty would not have in its collection. This would seem to be the trademark violation complaint.
3. peter_murray 08 Feb 2023
  
  in Public
  
  In many cases, and as discussed further below, the output delivered by StabilityAI includes a modified version of a Getty Images watermark, underscoring the clear linkbetween the copyrighted images that Stability AI copied without permission and the output itsmodel delivers.
  
  Modified watermark in the output underscores a clear link
  
  The example embedded in the complaint is of two soccer players with their arms outstretched and the Getty Images watermark is clearly visible. In the AI-generated image, there are two soccer players in weird positions; the team logos and jersey colors match.
4. peter_murray 08 Feb 2023
  
  in Public
  
  Getty Images’ content is extremely valuable to the datasets used to train StableDiffusion. Getty Images’ websites provide access to millions of high quality images and a vastarray of subject matter. High quality images such as those offered by Getty Images on itswebsites are more useful for training an AI model such as Stable Diffusion than low qualityimages because they contain more detail or data about the image that can be copied. By contrast,a low quality image, such as one that has been compressed and posted as a small thumbnail on atypical social media site, is less valuable because it only provides a rough, poor qualityframework of the underlying image and may not be accompanied by detailed text or other usefulmetadata.
  
  Getty Images' content is well suited for AI training
  
  High quality images and detailed descriptions.
5. peter_murray 08 Feb 2023
  
  in Public
  
  Upon information and belief, Stability AI then created yet additional copies withvisual noise added, while retaining encoded copies of the original images without noise forcomparison to help train its model.
  
  Copies with added noise
6. peter_murray 08 Feb 2023
  
  in Public
  
  Stable Diffusion was trained on 5 billion image-text pairs from datasets preparedby non-party LAION, a German entity that works in conjunction with and is sponsored byStability AI. Upon information and belief, Stability AI provided LAION with both funding andsignificant computing resources to produce its datasets in furtherance of Stability AI’s infringingscheme.
  
  Role of LAION
  
  LAION, from their website: a non-profit organization providing datasets, tools and models to liberate machine learning research. By doing so, we encourage open public education and a more environment-friendly use of resources by reusing existing datasets and models.
  
  Wikipedia: The Large-scale Artificial Intelligence Open Network (LAION) is a German non-profit with a stated goal "to make large-scale machine learning models, datasets and related code available to the general public". It is best known for releasing a number of large datasets of images and captions scraped from the web which have been used to train a number of high-profile text-to-image models, including Stable Diffusion and Imagen.
7. peter_murray 08 Feb 2023
  
  in Public
  
  Stability AI created and maintains a model called Stable Diffusion. Uponinformation and belief, Stability AI utilizes the following steps from input to output
  
  Description of the LLM training process
  
  Specifically, the use of altering images to add "noise" and training the LLM to remove the noise associated with text from the description.
8. peter_murray 08 Feb 2023
  
  in Public
  
  The Getty Images websites from which Stability AI copiedimages without permission is subject to express terms and conditions of use which, among otherthings, expressly prohibit, inter alia: (i) downloading, copying or re-transmitting any or all of thewebsite or its contents without a license; and (ii) using any data mining, robots or similar datagathering or extraction methods.
  
  Terms of service violation
  
  Getty Images' terms of service do not permit data mining.
9. peter_murray 08 Feb 2023
  
  in Public
  
  Getty Images has registered its copyright of the Database with the United StatesCopyright Office. The copyright registration number is TXu002346096.
  
  Getty database copyright
  
  Registration TXU002346096 with the title: Getty Images Asset Data Records November 28, 2022
10. peter_murray 08 Feb 2023
  
  in Public
  
  Stability AI has copied at least 12 million copyrighted images from Getty Images’websites, along with associated text and metadata, in order to train its Stable Diffusion model.
  
  12 million claimed; 7,316 listed
  
  Attachment A has an itemized list of images with 7,316 lines
  
  From paragraph 24:
  
  For purposes of the copyright infringement claims set forth herein and establishing the unlawful nature of Stability AI’s conduct, Getty Images has selected 7,216 examples from the millions of images that Stability AI copied without permission and used to train one or more versions of Stable Diffusion. The copyrights for each of these images (as well as for many other images) have been registered with the U.S. Copyright Office. A list of these works, together with their copyright registration numbers, is attached as Exhibit A.
11. peter_murray 08 Feb 2023
  
  in Public
  
  Often, the output generated by Stable Diffusion contains a modified version of aGetty Images watermark, creating confusion as to the source of the images and falsely implyingan association with Getty Images.
  
  Trademark infringement on the Getty watermark
  
  Although the watermark in the AI-generated images isn't exactly Getty's, it is recognizable as such.
  
  trademarks
12. peter_murray 08 Feb 2023
  
  in Public
  
  Getty Images’ visual assets are highly desirable for use in connection withartificial intelligence and machine learning because of their high quality, and because they areaccompanied by content-specific, detailed captions and rich metadata.
  
  Descriptive information makes the Getty collection more valuable for LLM training
13. peter_murray 08 Feb 2023
  
  in Public
  
  COMPLAINT filed with Jury Demand against Stability AI, Inc. Getty Images (US), Inc. v. Stability AI, Inc. (1:23-cv-00135) District Court, D. Delaware
  
  https://www.courtlistener.com/docket/66788385/getty-images-us-inc-v-stability-ai-inc/
  
  AI art LLM copyright lawsuits
Visit annotations in context

Tags

trademarks

AI art

LLM copyright lawsuits

Annotators

peter_murray

URL

storage.courtlistener.com/recap/gov.uscourts.ded.81407/gov.uscourts.ded.81407.1.0.pdf
static1.squarespace.com static1.squarespace.com

Beetham_etal_2022

1
1. peter_murray 04 Feb 2023
  
  in Public
  
  Staff and studentsare rarely in a position to understand the extent to which data is being used, nor are they able todetermine the extent to which automated decision-making is leveraged in the curation oramplification of content.
  
  Is this a data (or privacy) literacy problem? A lack of regulation by experts in this field?
  
  automated decision systems
Visit annotations in context

Tags

automated decision systems

Annotators

peter_murray

URL

static1.squarespace.com/static/5cf15af7a259990001706378/t/61f83d6ce0107614753f858f/1643658607549/Beetham_etal_2022.pdf
arxiv.org arxiv.org

Untitled document

17
1. peter_murray 04 Feb 2023
  
  in Public
  
  Certainly it would not be possible if theLLM were doing nothing more than cutting-and-pasting fragments of text from its training setand assembling them into a response. But this isnot what an LLM does. Rather, an LLM mod-els a distribution that is unimaginably complex,and allows users and applications to sample fromthat distribution.
  
  LLMs are not cut and paste; the matrix of token-following-token probabilities are "unimaginably complex"
  
  I wonder how this fact will work its way into the LLM copyright cases that have been filed. Is this enough to make a the LLM output a "derivative work"?
  
  LLM copyright lawsuits
2. peter_murray 04 Feb 2023
  
  in Public
  
  Including a prompt prefix in the chain-of-thought style encourages the model to generatefollow-on sequences in the same style, which isto say comprising a series of explicit reasoningsteps that lead to the final answer. This abilityto learn a general pattern from a few examples ina prompt prefix, and to complete sequences in away that conforms to that pattern, is sometimescalled in-context learning or few-shot prompt-ing. Chain-of-thought prompting showcases thisemergent property of large language model at itsmost striking.
  
  Emulating deductive reasoning with prompt engineering
  
  I think "emulating deductive reasoning" is the correct shorthand here.
  
  prompt engineering
3. peter_murray 04 Feb 2023
  
  in Public
  
  Human language userscan consult the world to settle their disagree-ments and update their beliefs. They can, so tospeak, “triangulate” on objective reality. In iso-lation, an LLM is not the sort of thing that cando this, but in application, LLMs are embeddedin larger systems. What if an LLM is embeddedin a system capable of interacting with a worldexternal to itself? What if the system in ques-tion is embodied, either physically in a robot orvirtually in an avatar?
  
  Humans reach an objective reality; can an LLM embedded in a system interacting with the external world also find an objective reality?
4. peter_murray 04 Feb 2023
  
  in Public
  
  Vision-language mod-els (VLMs) such as VilBERT (Lu et al., 2019)and Flamingo (Alayrac et al., 2022), for exam-ple, combine a language model with an imageencoder, and are trained on a multi-modal cor-pus of text-image pairs. This enables them topredict how a given sequence of words will con-tinue in the context of a given image
  
  Definition of "vision-language models"
5. peter_murray 04 Feb 2023
  
  in Public
  
  The real issue here is that, whatever emergentproperties it has, the LLM itself has no accessto any external reality against which its wordsmight be measured, nor the means to apply anyother external criteria of truth, such as agree-ment with other language-users.
  
  The LLM cannot see beyond its training to measure its sense of "truth"
  
  One can embed the LLM in a larger system that might have capabilities that look into the outer world.
6. peter_murray 04 Feb 2023
  
  in Public
  
  Shanahan, Murray. "Talking About Large Language Models." arXiv, (2022). https://doi.org/10.48550/arXiv.2212.03551.
  
  Found via Simon Wilson.
  
  Abstract
  
  Thanks to rapid progress in artificial intelligence, we have entered an era when technology and philosophy intersect in interesting ways. Sitting squarely at the centre of this intersection are large language models (LLMs). The more adept LLMs become at mimicking human language, the more vulnerable we become to anthropomorphism, to seeing the systems in which they are embedded as more human-like than they really are. This trend is amplified by the natural tendency to use philosophically loaded terms, such as "knows", "believes", and "thinks", when describing these systems. To mitigate this trend, this paper advocates the practice of repeatedly stepping back to remind ourselves of how LLMs, and the systems of which they form a part, actually work. The hope is that increased scientific precision will encourage more philosophical nuance in the discourse around artificial intelligence, both within the field and in the public sphere.
  
  large language model
7. peter_murray 04 Feb 2023
  
  in Public
  
  A bare-bones LLM doesn’t “really” know any-thing because all it does, at a fundamental level,is sequence prediction. Sometimes a predictedsequence takes the form of a proposition. But thespecial relationship propositional sequences haveto truth is apparent only to the humans who areasking questions, or to those who provided thedata the model was trained on. Sequences ofwords with a propositional form are not specialto the model itself in the way they are to us. Themodel itself has no notion of truth or falsehood,properly speaking, because it lacks the means toexercise these concepts in anything like the waywe do.
  
  An LLM relies on statistical probability to construct a word sequence without regard to truth and falsehood
  
  The LLM's motivation is not truth or falsehood; it has no motivation. Humans anthropomorphize motivation and assign truth or belief to the generated statements. "Knowing" is the wrong word to ascribe to the LLM's capabilities.
8. peter_murray 04 Feb 2023
  
  in Public
  
  Turning an LLM into a question-answering sys-tem by a) embedding it in a larger system, andb) using prompt engineering to elicit the requiredbehaviour exemplifies a pattern found in muchcontemporary work. In a similar fashion, LLMscan be used not only for question-answering,but also to summarise news articles, to generatescreenplays, to solve logic puzzles, and to trans-late between languages, among other things.There are two important takeaways here. First,the basic function of a large language model,namely to generate statistically likely continua-tions of word sequences, is extraordinarily versa-tile. Second, notwithstanding this versatility, atthe heart of every such application is a model do-ing just that one thing: generating statisticallylikely continuations of word sequences.
  
  LLM characteristics that drive their usefulness
9. peter_murray 04 Feb 2023
  
  in Public
  
  Dialogue is just one application of LLMs thatcan be facilitated by the judicious use of promptprefixes. In a similar way, LLMs can be adaptedto perform numerous tasks without further train-ing (Brown et al., 2020). This has led to a wholenew category of AI research, namely prompt en-gineering, which will remain relevant until wehave better models of the relationship betweenwhat we say and what we want.
  
  Prompt engineering
  
  prompt engineering
10. peter_murray 04 Feb 2023
  
  in Public
  
  In the background, the LLM is invisiblyprompted with a prefix along the following lines.
  
  Pre-work to make the LLM conversational
  
  prompt engineering
11. peter_murray 04 Feb 2023
  
  in Public
  
  However, in the case of LLMs, such istheir power, things can get a little blurry. Whenan LLM can be made to improve its performanceon reasoning tasks simply by being told to “thinkstep by step” (Kojima et al., 2022) (to pick justone remarkable discovery), the temptation to seeit as having human-like characteristics is almostoverwhelming.
  
  Intentional stance meets uncanny valley
  
  Intentional stance language becomes problematic when we can no longer distinguish the inanimate object's behavior from human behavior.
  
  uncanny valley
12. peter_murray 04 Feb 2023
  
  in Public
  
  The intentional stance is the strategy of interpretingthe behavior of an entity ... by treating it as if it were arational agent ”
  
  Definition of "intentional stance"
  
  We use anthropomorphic language as a shortcut for conveying a concept...giving an inanimate object agency to interact with the world as humans do as a way of plain-language explaining what is happening.
  
  intentional stance
13. peter_murray 04 Feb 2023
  
  in Public
  
  To the human user, each of these examplespresents a different sort of relationship to truth.In the case of Neil Armstrong, the ultimategrounds for the truth or otherwise of the LLMsanswer is the real world. The Moon is a real ob-ject and Neil Armstrong was a real person, andhis walking on the Moon is a fact about the phys-ical world. Frodo Baggins, on the other hand, isa fictional character, and the Shire is a fictionalplace. Frodo’s return to the Shire is a fact aboutan imaginary world, not a real one. As for the lit-tle star in the nursery rhyme, well that is barelyeven a fictional object, and the only fact at issueis the occurrence of the words “little star” in afamiliar English rhyme.
  
  How LLMs can deal with real-world, fictional-world, and imaginary-world concepts
14. peter_murray 04 Feb 2023
  
  in Public
  
  What we are really askingthe model is the following question: Given thestatistical distribution of words in the vast publiccorpus of (English) text, what words are mostlikely to follow the sequence “The first person towalk on the Moon was ”? A good reply to thisquestion is “Neil Armstrong”.
  
  Example of how an LLM arrives at an answer
15. peter_murray 04 Feb 2023
  
  in Public
  
  LLMs are generative math-ematical models of the statistical distributionof tokens in the vast public corpus of human-generated text, where the tokens in question in-clude words, parts of words, or individual char-acters including punctuation marks. They aregenerative because we can sample from them,which means we can ask them questions. Butthe questions are of the following very specifickind. “Here’s a fragment of text. Tell me howthis fragment might go on. According to yourmodel of the statistics of human language, whatwords are likely to come next?”
  
  LLM definition
  
  large language model
16. peter_murray 04 Feb 2023
  
  in Public
  
  As we build systems whose capabilities moreand more resemble those of humans, despite thefact that those systems work in ways that arefundamentally different from the way humanswork, it becomes increasingly tempting to an-thropomorphise them. Humans have evolved toco-exist over many millions of years, and humanculture has evolved over thousands of years tofacilitate this co-existence, which ensures a de-gree of mutual understanding. But it is a seriousmistake to unreflectingly apply to AI systems thesame intuitions that we deploy in our dealingswith each other, especially when those systemsare so profoundly different from humans in theirunderlying operation
  
  AI systems are fundamentally different from human evolution
  
  technology anthropomorphism
17. peter_murray 04 Feb 2023
  
  in Public
  
  First, the performance of LLMs on benchmarksscales with the size of the training set (and, toa lesser degree with model size). Second, thereare qualitative leaps in capability as the modelsscale. Third, a great many tasks that demand in-telligence in humans can be reduced to next tokenprediction with a sufficiently performant model.It is the last of these three surprises that is thefocus of the present paper.
  
  Surprising LLM findings
Visit annotations in context

Tags

large language model

LLM copyright lawsuits

uncanny valley

technology anthropomorphism

prompt engineering

intentional stance

Annotators

peter_murray

URL

arxiv.org/pdf/2212.03551v4.pdf
radiolab.org radiolab.org

Sight Unseen

1
1. peter_murray 03 Feb 2023
  
  in Public
  
  There's this old idea in photography called the decisive moment - that the world is filled with these far-off realities. But every so often, a photograph can capture a moment that, boom, takes you there. This is one of those photos. In the picture, you see all these men and women standing in kind of a loose semicircle. Some of them still have their blue surgical gloves on. They look totally spent. They're all looking in different directions. And they all look like they're not even there, like they're totally lost in their own thoughts.
  
  Defining "Decisive Moment"
  
  decisive moment
Visit annotations in context

Tags

decisive moment

Annotators

peter_murray

URL

radiolab.org/episodes/sight-unseen
arstechnica.com arstechnica.com

The generative AI revolution has begun—how did we get here?

12
1. peter_murray 01 Feb 2023
  
  in Public
  
  The breakthroughs are all underpinned by a new class of AI models that are more flexible and powerful than anything that has come before. Because they were first used for language tasks like answering questions and writing essays, they’re often known as large language models (LLMs). OpenAI’s GPT3, Google’s BERT, and so on are all LLMs. But these models are extremely flexible and adaptable. The same mathematical structures have been so useful in computer vision, biology, and more that some researchers have taken to calling them "foundation models" to better articulate their role in modern AI.
  
  Foundation Models in AI
  
  Large language models, more generally, are “foundation models”. They got the large-language name because that is where they were first applied.
  
  artificial intelligence large language model
2. peter_murray 01 Feb 2023
  
  in Public
  
  The OpenAI researchers discovered that in making the models bigger, they didn’t just get better at producing text. The models could learn entirely new behaviors simply by being shown new training data. In particular, the researchers discovered that GPT3 could be trained to follow instructions in plain English without having to explicitly design the model that way
  
  Bigger models meant that the GPT could program itself
  
  Emergent capabilities in interpreting the input.
3. peter_murray 01 Feb 2023
  
  in Public
  
  The basic workflow of these models is this: generate, evaluate, iterate. As anyone who’s played with making AI art knows, you typically have to create many examples to get something you like. When working with these AI models, you have to remember that they’re slot machines, not calculators. Every time you ask a question and pull the arm, you get an answer that could be marvelous… or not. The challenge is that the failures can be extremely unpredictable.
  
  These models are not deterministic
  
  This isn’t a calculator; it is a slot machine.
4. peter_murray 01 Feb 2023
  
  in Public
  
  Or see drug discovery, where biotech companies are training AIs that can design new drugs. But these new drugs are often exploring new areas of biology—for example, proteins that are unlike naturally evolved samples. AI design has to move hand in hand with huge amounts of physical experiments in labs because the data needed to feed these models just doesn’t exist yet
  
  AI in drug discovery
5. peter_murray 01 Feb 2023
  
  in Public
  
  The latest image models like Stable Diffusion use a process called latent diffusion. Instead of directly generating the latent representation, a text prompt is used to incrementally modify initial images. The idea is simple: If you take an image and add noise to it, it will eventually become a noisy blur. However, if you start with a noisy blur, you can “subtract” noise from it to get an image back. You must “denoise” smartly—that is, in a way that moves you closer to a desired image.
  
  How Stable Diffusion works, using latent diffusion
  
  Starting with noise and making meaning from there.
6. peter_murray 01 Feb 2023
  
  in Public
  
  Dall-E is actually a combination of a few different AI models. A transformer translates between that latent representation language and English, taking English phrases and creating “pictures” in the latent space. A latent representation model then translates between that lower-dimensional “language” in the latent space and actual images. Finally, there’s a model called CLIP that goes in the opposite direction; it takes images and ranks them according to how close they are to the English phrase.
  
  How Dall-E works
  
  AI art
7. peter_murray 01 Feb 2023
  
  in Public
  
  A deep learning model can learn what’s called a "latent space" representation of images. The model learns to extract important features from the images and compresses them into a lower-dimensional representation, called a latent space or latent representation. A latent representation takes all the possible images at a given resolution and reduces them to a much lower dimension. You can think of it like the model learning an immensely large set of basic shapes, lines, and patterns—and then rules for how to put them together coherently into objects.
  
  Latent representations
8. peter_murray 01 Feb 2023
  
  in Public
  
  OpenAI pushed this approach with GPT2 and then GPT3. GPT stands for "generative pre-trained transformer." The "generative" part is obvious—the models are designed to spit out new words in response to inputs of words. And "pre-trained" means they're trained using this fill-in-the-blank method on massive amounts of text.
  
  Defining Generative Pre-trained Transformer (GPT)
9. peter_murray 01 Feb 2023
  
  in Public
  
  Of course, you don’t have to have English as the input and Japanese as the output. You can also translate between English and English! Think about many of the common language AI tasks, like summarizing a long essay into a few short paragraphs, reading a customer’s review of a product and deciding if it was positive or negative, or even something as complex as taking a story prompt and turning it into a compelling essay. These problems can all be structured as translating one chunk of English to another.
  
  Summarizing, sentiment analysis, and story prompts can be thought of as English-to-English translation
10. peter_murray 01 Feb 2023
  
  in Public
  
  An AI model that can learn and work with this kind of problem needs to handle order in a very flexible way. The old models—LSTMs and RNNs—had word order implicitly built into the models. Processing an input sequence of words meant feeding them into the model in order. A model knew what word went first because that’s the word it saw first. Transformers instead handled sequence order numerically, with every word assigned a number. This is called "positional encoding." So to the model, the sentence “I love AI; I wish AI loved me” looks something like (I 1) (love 2) (AI 3) (; 4) (I 5) (wish 6) (AI 7) (loved 8) (me 9).
  
  Google’s “the transformer”
  
  One breakthrough was positional encoding versus having to handle the input in the order it was given. Second, using a matrix rather than vectors. This research came from Google Translate.
  
  natural language translation
11. peter_murray 01 Feb 2023
  
  in Public
  
  The problem of understanding and working with language is fundamentally different from that of working with images. Processing language requires working with sequences of words, where order matters. A cat is a cat no matter where it is in an image, but there’s a big difference between “this reader is learning about AI” and “AI is learning about this reader.”
  
  How language is a different problem from images
12. peter_murray 01 Feb 2023
  
  in Public
  
  There’s a holy trinity in machine learning: models, data, and compute. Models are algorithms that take inputs and produce outputs. Data refers to the examples the algorithms are trained on. To learn something, there must be enough data with enough richness that the algorithms can produce useful output. Models must be flexible enough to capture the complexity in the data. And finally, there has to be enough computing power to run the algorithms.
  
  “Holy trinity” of machine learning: models, data, and compute
  
  Models in 1990s, starting with convolutional neural networks for computer vision.
  
  Data in 2009 in the form of labeled images from Stanford AI researchers.
  
  Compute in 2006 with Nvidia’s CUDA programming language for GPUs.
  
  AlexNet in 2012 combined all of these.
  
  machine learning
Visit annotations in context

Tags

artificial intelligence

machine learning

large language model

AI art

natural language translation

Annotators

peter_murray

URL

arstechnica.com/gadgets/2023/01/the-generative-ai-revolution-has-begun-how-did-we-get-here/
venturebeat.com venturebeat.com

Who will compete with ChatGPT? Meet the contenders | The AI Beat

5
1. peter_murray 01 Feb 2023
  
  in Public
  
  Character AI
  
  From one of the authors of the Transformer paper from Google. Intended to be able to talk with dead and/or fictional characters.
2. peter_murray 01 Feb 2023
  
  in Public
  
  Google: LaMDA
  
  From Google; once considered by one of its researchers to be sentient.
  
  Google
3. peter_murray 01 Feb 2023
  
  in Public
  
  DeepMind: Sparrow
  
  From an Alphabet subsidiary, it is meant to be a conversation agent. Claims safer, less-biased machine learning (ML) systems, thanks to its application of reinforcement learning based on input from human research participants for training. Can search Google for answers.
  
  Considered a proof-of-concept that is not ready for wide deployment.
  
  Alphabet
4. peter_murray 01 Feb 2023
  
  in Public
  
  artificial intelligence
5. peter_murray 01 Feb 2023
  
  in Public
  
  Anthropic: Claude
  
  Tied to the FTX’s Sam Bankman-Freid and the “effective altruism” movement. “Constitutional AI,” which it says is based on concepts such as beneficence, non-maleficence and autonomy.
Visit annotations in context

Tags

artificial intelligence

Google

Alphabet

Annotators

peter_murray

URL

venturebeat.com/ai/who-will-compete-with-chatgpt-meet-the-contenders-the-ai-beat/
Jan 2023
docdrop.org docdrop.org

Video: The LibNFT Project: Leveraging Blockchain-Based Digital Asset Tech to Preserve Collections (DocDrop)

14
1. peter_murray 23 Jan 2023
  
  in Public
  
  I'm like listening to this and thinking okay there's no bad
  
  There's no bad blockchain; there's only bad blockchain users
  
  blockchain there's only like bad blockchain users
2. peter_murray 23 Jan 2023
  
  in Public
  
  the Assumption of your motivation is looking at this in terms of like um elevating visibility of our Collections
  
  Question: If the motivation is visibility, how will this be different?
3. peter_murray 23 Jan 2023
  
  in Public
  
  can have a pretty outsized carbon footprint and I'm wondering how you reconcile the vast amount of computational power necessary to accomplish this work and its negative impact on the environment and whether or not this is something you all are considering
  
  Question: has the project considered the energy impact?
  
  cryptocurrency mining climate change
4. peter_murray 23 Jan 2023
  
  in Public
  
  I'm really nervous about the idea that we would be selling what amounts to
  
  Question: Does it make sense for GLAM institutions to sell speculative assets?
  
  speculative assets into crypto space for our cultural heritage collections doesn't make a lot of sense to me
  
  Joe Lucia also points out that there are not a lot of trustworthy actors in the blockchain space.
5. peter_murray 23 Jan 2023
  
  in Public
  
  lib nft is an applied research project that asks and seeks to answer a fundamental empirical question can block tank blockchain technology and nft specifically facilitate the economically sustainable use storage long-term preservation and accessibility of a library's special Collections
  
  Research project question
  
  This is in the whitepaper.
6. peter_murray 23 Jan 2023
  
  in Public
  
  created an nft of a Nobel prize winning formula by Jim Allison and they were able to sell that for fifty thousand dollars as a singular item
  
  Sale of an NFT tied to a Nobel prize winning paper
  
  The Fourth Pillar (UC Berkeley, 2021) | Foundation. How much is 22 Ethereum in USD, via Currencio
7. peter_murray 23 Jan 2023
  
  in Public
  
  I think we have a lot of things in our collection that are undiscoverable
  
  NFTs to address a discoverability problem
  
  Can NFTs in a closed system provide more visibility to holdings?
8. peter_murray 23 Jan 2023
  
  in Public
  
  if you make a surrogate
  
  Deed of gift versus NFT digital surrogate ownership
  
  and then you do an nft it's a different type of ownership
  
  Old deeds of gifts may not cover the online posting of digital surrogates (and it sounds like the speakers have experience with this problem). And there are certainly needs for clarity around what an NFT "ownership" means relative to the original work.
  
  property rights
9. peter_murray 23 Jan 2023
  
  in Public
  
  the holders of the nfts is to receive presents or gifts or perks that Justin Bieber or whoever's managing I'm sure it's not himself to them
  
  NFTs for a limited-access perks club
  
  Rather than ownership, Michael Meth is proposing an opportunity for some kind of special engagement? Again, is there value here—if you don't hold the Mona Lisa or Da Vinci's "first touch"—that justifies the expense and overhead of an NFT infrastructure? Is the only one to make real money here the provider of that infrastructure?
10. peter_murray 23 Jan 2023
  
  in Public
  
  we've digitized that surrogate and you get into that and you buy that nft then you own a piece of that right and it's identifiable to just you
  
  NFT "ownership"
  
  The use of a blockchain transaction to link a wallet address with a URL has not been proven to transfer "ownership" (at least in a copyright sense). I suppose there is a sense of ownership in a closed system like as was done with the NBA Top Shot project.
11. peter_murray 23 Jan 2023
  
  in Public
  
  the blockchain itself has been proven to be resilient against any kinds of attacks
  
  The blockchain has never been hacked
  
  I think this is true? For a proof-of-work protocol such as Bitcoin, there has never been anything like a 51% miner attack. The rewriting of the Ethereum blockchain by community governance might be a hack.
12. peter_murray 22 Jan 2023
  
  in Public
  
  is there a way that we can use these collections in a way that is driven by us from within the library to create policies rules Etc that allow us to turn these unique collections that are already digitized in many cases or can be into the nfts that we're talking about and then find a way to maybe even monetize them
  
  Monetize digital assets
  
  What rules or polices could be encoded on the blockchain in any way that is more effective and cheaper than non-blockchain methods?
  
  No, I think "monetization" is the key... and as the current wave of NFT project failures show, only those extracting rent by owning the transactional infrastructure are making money in the long term.
13. peter_murray 22 Jan 2023
  
  in Public
  
  the underlying premise that we got to is if libraries already working on digital asset management and the blockchain is a way to manage digital assets is there not a connection
  
  Thesis for why blockchain and digital assets
  
  To what extent is NFT technology managing digital assets, and is that kind of management the same as how libraries manage digital assets? On the surface, these are barely related. An NFT, at best, signals ownership of a URL. (Since digital assets themselves are so big, no one puts the asset itself on the blockchain.) To what extent are libraries going to "trade" URLs? Management of a digital asset, for libraries, means so much more than this.
14. peter_murray 22 Jan 2023
  
  in Public
  
  The LibNFT Project: Leveraging Blockchain-Based Digital Asset Technology to Sustainably Preserve Distinctive Collections and Archives
  
  CNI Fall 2022 Project Briefings
  
  YouTube recording
  
  K. Matthew Dames, Edward H. Arnold Dean, Hesburgh Libraries and University of Notre Dame Press, University of Notre Dame, President, Association of Research Libraries
  
  Meredith Evans, President, Society of American Archivists
  
  Michael Meth, University Library Dean, San Jose State University
  
  Nearly 12 months ago, celebrities relentlessly touted cryptocurrency during Super Bowl television ads, urging viewers to buy now instead of missing out. Now, digital currency assets like Bitcoin and Ethereum are worth half what they were this time last year. We believe, however, that the broader public attention on cryptocurrency’s volatility obscures the relevance and applicability of non-fungible tokens (NFTs) within the academy. For example, Ingram has announced plans to invest in Book.io, a company that makes e-books available on the blockchain where they can be sold as NFTs. The famed auction house Christie’s launched Christie’s 3.0, a blockchain auction platform that is dedicated to selling NFT-based art, and Washington University in St. Louis and the University of Wyoming have invested in Strike, a digital payment provider built on Bitcoin’s Lightning Network. Seeking to advance innovation in the academy and to find ways to mitigate the costs of digitizing and digitally preserving distinctive collections and archives, the discussants have formed the LibNFT collaboration. The LibNFT project seeks to work with universities to answer a fundamental question: can blockchain technology generally, and NFTs specifically, facilitate the economically sustainable use, storage, long-term preservation, and accessibility of a library’s special collections and archives? Following up on a January 2022 Twitter Spaces conversation on the role of blockchain in the academy, this session will introduce LibNFT, discuss the project’s early institutional partners, and address the risks academic leaders face by ignoring blockchain, digital assets, and the metaverse.
  
  library technology non-fungible tokens digital archives
Visit annotations in context

Tags

climate change

cryptocurrency mining

non-fungible tokens

library technology

property rights

digital archives

Annotators

peter_murray

URL

docdrop.org/video/Y8ERjGOGfyo/
hbr.org hbr.org

How Generative AI Is Changing Creative Work

3
1. peter_murray 15 Jan 2023
  
  in Public
  
  At the cloud computing company VMWare, for example, writers use Jasper as they generate original content for marketing, from email to product campaigns to social media copy. Rosa Lear, director of product-led growth, said that Jasper helped the company ramp up our content strategy, and the writers now have time to do better research, ideation, and strategy.
  
  Generative AI for marketing makes for more productive writers
2. peter_murray 15 Jan 2023
  
  in Public
  
  Then, once a model generates content, it will need to be evaluated and edited carefully by a human. Alternative prompt outputs may be combined into a single document. Image generation may require substantial manipulation.
  
  After generation, results need evaluation
  
  Is this also a role of the prompt engineer? In the digital photography example, the artist spent 80 hours and created 900 versions as the prompts were fine-tuned.
  
  AI art
3. peter_murray 15 Jan 2023
  
  in Public
  
  To start with, a human must enter a prompt into a generative model in order to have it create content. Generally speaking, creative prompts yield creative outputs. “Prompt engineer” is likely to become an established profession, at least until the next generation of even smarter AI emerges.
  
  Generative AI requires prompt engineering, likely a new profession
  
  What domain experience does a prompt engineer need? How might this relate to relate to specialty in librarianship?
  
  generative artificial intelligence
Visit annotations in context

Tags

generative artificial intelligence

AI art

Annotators

peter_murray

URL

hbr.org/2022/11/how-generative-ai-is-changing-creative-work
bigthink.com bigthink.com

The mind-blowing stats on male inequality

6
1. peter_murray 09 Jan 2023
  
  in Public
  
  "HEAL jobs." So that's 'health, education, administration, and literacy.' Almost, if you like, the opposite side of the coin to STEM jobs- and that's where a lot of the jobs are coming from.
  
  HEAL jobs: Health, Education, Administration, and Literacy
  
  Complementary bundle of jobs to STEM, and projections are for a 3:1 creation of HEAL jobs versus STEM jobs by 2030.
2. peter_murray 09 Jan 2023
  
  in Public
  
  we've also seen a drop in the acquisition of skills, the kinds of skills and education that boys and men need. If boys don't get educated and men don't get skilled, they will struggle in the labor market. And across all of those domains, we've seen a downwards turn for men in the last four or five decades.
  
  Disadvantage in education turns into struggle to learn skils
3. peter_murray 09 Jan 2023
  
  in Public
  
  There's quite a fierce debate about the differences between male and female brains. And in adulthood, I think there's not much evidence that the brains are that different in ways that we should worry about, or that are particularly consequential. But where there's no real debate is in the timing of brain development. It is quite clear that girls brains develop more quickly than boys brains do, and that the biggest difference seems to occur in adolescence.
  
  Pre-frontal cortex develops faster in female brains
  
  The cumulative effect on this is that girls get a head start once the societal imposed impact of gender inequality is removed. Girls are rewarded for this higher level of control and boys are now at a disadvantage at the same grade levels.
4. peter_murray 09 Jan 2023
  
  in Public
  
  So if you look at the U.S., for example, in the average school district in the U.S., girls are almost a grade level ahead of boys in English, and have caught up in math. If we look at those with the highest GPA scores, the top 10%, two-thirds of those are girls. If we look at those at the bottom, two-thirds of those are boys.
  
  Impact when looking at secondary education statistics
5. peter_murray 09 Jan 2023
  
  in Public
  
  The overall picture is, that on almost every measure, at almost every age, and in almost every advanced economy in the world, the girls are leaving the boys way behind, and the women leaving the men.
  
  Top-line summary
6. peter_murray 09 Jan 2023
  
  in Public
  
  Male inequality, explained by an expert, Richard Reeves, Big Think
  
  Jan 4, 2023
  
  Modern males are struggling. Author Richard Reeves outlines the three major issues boys and men face and shares possible solutions.
  
  Boys and men are falling behind. This might seem surprising to some people, and maybe ridiculous to others, considering that discussions on gender disparities tend to focus on the structural challenges faced by girls and women, not boys and men.
  
  But long-term data reveal a clear and alarming trend: In recent decades, American men have been faring increasingly worse in many areas of life, including education, workforce participation, skill acquisition, wages, and fatherhood.
  
  Gender politics is often framed as a zero-sum game: Any effort to help men takes away from women. But in his 2022 book Of Boys and Men, journalist and Brookings Institution scholar Richard V. Reeves argues that the structural problems contributing to male malaise affect everybody, and that shying away from these tough conversations is not a productive path forward.
  
  About Richard Reeves: Richard V. Reeves is a senior fellow at the Brookings Institution, where he directs the Future of the Middle Class Initiative and co-directs the Center on Children and Families. His Brookings research focuses on the middle class, inequality and social mobility.
  
  gender equality
Visit annotations in context

Tags

gender equality

Annotators

peter_murray

URL

bigthink.com/series/the-big-think-interview/male-inequality/
arxiv.org arxiv.org

Untitled document

4
1. peter_murray 09 Jan 2023
  
  in Public
  
  Defenders can design their own infrastructure to be im-mutable and ephemeral, as is becoming an emerging trend inprivate sector defense through the practice of Security ChaosEngineering
  
  Immutable and ephemeral as defensive measures
  
  Immutable: unchangeable infrastructure components, such as ssh access disabled by default.
  
  Ephemeral: short-lived servers for single processes, serverless infrastrucure
  
  serverless architecture
2. peter_murray 09 Jan 2023
  
  in Public
  
  We propose a Sludge Strategy for cyber defense that pri-oritizes investments into techniques, tools, and technologiesthat add friction into attacker workflows and raise the costof conducting operations
  
  "sludge" choice architecture for cybersecurity
  
  Scarce information, high monetary cost, psychological impact, and time cost.
3. peter_murray 09 Jan 2023
  
  in Public
  
  Thaler and Sunstein use the phrase “choice architecture” todescribe the design in which choices are presented to people.These choices are made easier by nudges and more difficultby sludge. To understand choice architecture, it is necessary toexamine both nudge and sludge. Nudges gently steer people ina direction that increases welfare, including cybersecurity, andare commonly intended to make good outcomes easy. Tradi-tionally, nudges have been used to encourage well-intentionedusers to behave in a way that they are better off for doingso. These choices are not guaranteed, but research shows thatthey are selected more often.
  
  Define: "choice architecture" and "nudge" in that context
  
  Nudges include, for instance, password strength meters
4. peter_murray 09 Jan 2023
  
  in Public
  
  Dykstra, J., Shortridge, K., Met, J., & Hough, D. (2022). Sludge for Good: Slowing and Imposing Costs on Cyber Attackers. arXiv. https://doi.org/10.48550/arXiv.2211.16626
  
  Choice architecture describes the design by which choices are presented to people. Nudges are an aspect intended to make "good" outcomes easy, such as using password meters to encourage strong passwords. Sludge, on the contrary, is friction that raises the transaction cost and is often seen as a negative to users. Turning this concept around, we propose applying sludge for positive cybersecurity outcomes by using it offensively to consume attackers' time and other resources. To date, most cyber defenses have been designed to be optimally strong and effective and prohibit or eliminate attackers as quickly as possible. Our complimentary approach is to also deploy defenses that seek to maximize the consumption of the attackers' time and other resources while causing as little damage as possible to the victim. This is consistent with zero trust and similar mindsets which assume breach. The Sludge Strategy introduces cost-imposing cyber defense by strategically deploying friction for attackers before, during, and after an attack using deception and authentic design features. We present the characteristics of effective sludge, and show a continuum from light to heavy sludge. We describe the quantitative and qualitative costs to attackers and offer practical considerations for deploying sludge in practice. Finally, we examine real-world examples of U.S. government operations to frustrate and impose cost on cyber adversaries.
  
  Found via author post: Kelly Shortridge: "How can we waste attackers’ ti…" - Hachyderm.io
  
  cybersecurity
Visit annotations in context

Tags

serverless architecture

cybersecurity

Annotators

peter_murray

URL

arxiv.org/pdf/2211.16626.pdf
inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? "1F99COn the Dangers of Stochastic Parrots: Can Language Models Be Too Big? "1F99C

5
1. peter_murray 09 Jan 2023
  
  in Public
  
  Birhane andPrabhu note, echoing Ruha Benjamin [ 15 ], “Feeding AI systems onthe world’s beauty, ugliness, and cruelty, but expecting it to reflectonly the beauty is a fantasy.”
  
  LMs can't return only the good parts
  
  Large, untrained LMs get the good, bad, and ugly, so it is illogical to expect it to return only the good.
2. peter_murray 09 Jan 2023
  
  in Public
  
  When such consumers therefore mistake the meaning attributed tothe MT output as the actual communicative intent of the originaltext’s author, real-world harm can ensue.
  
  Harm from Machine Translation (MT) models
  
  MT models can create fluent and coherent blocks of text that mask the meaning in the original text and the intent of the original speaker.
  
  machine language translation
3. peter_murray 09 Jan 2023
  
  in Public
  
  humancommunication relies on the interpretation of implicit meaningconveyed between individuals. The fact that human-human com-munication is a jointly constructed activity [29 , 128] is most clearlytrue in co-situated spoken or signed communication, but we usethe same facilities for producing language that is intended for au-diences not co-present with us (readers, listeners, watchers at adistance in time or space) and in interpreting such language whenwe encounter it. It must follow that even when we don’t know theperson who generated the language we are interpreting, we build apartial model of who they are and what common ground we thinkthey share with us, and use this in interpreting their words.
  
  Human-to-human communication is based on each building a model of the other
  
  The intention and interpretation of language relies on common ground, and so communication requires each party to understand the perspective of the other. LM-generated text offers no such counter-party.
4. peter_murray 09 Jan 2023
  
  in Public
  
  However, no actual language understanding is taking place inLM-driven approaches to these tasks, as can be shown by carefulmanipulation of the test data to remove spurious cues the systemsare leveraging [ 21 , 93 ]. Furthermore, as Bender and Koller [ 14 ]argue from a theoretical perspective, languages are systems ofsigns [ 37 ], i.e. pairings of form and meaning. But the training datafor LMs is only form; they do not have access to meaning. Therefore,claims about model abilities must be carefully characterized.
  
  NLP is not Natural Language Understanding
5. peter_murray 09 Jan 2023
  
  in Public
  
  In this section,we discuss how large, uncurated, Internet-based datasets encodethe dominant/hegemonic view, which further harms people at themargins, and recommend significant resource allocation towardsdataset curation and documentation practices.
  
  Issues with training data
  
  Size doesn't guarantee diversity
  
  Static data versus changing social views
  
  Encoding bias
  
  Curation, documentation, and accountability
Visit annotations in context

Tags

machine language translation

Annotators

peter_murray

URL

inst-fs-iad-prod.inscloudgate.net/files/cf4622a4-ec28-4c20-97e7-3d5b9da4852a/2021-bender-parrots.pdf
docdrop.org docdrop.org

Video: The Great Crypto Scam. (DocDrop)

12
1. peter_murray 08 Jan 2023
  
  in Public
  
  the world of crypto offers an incentive for VC firms to invest in a crypto company receive a percentage of their tokens and then sell those tokens to
  
  Sceptics: Crypto companies offer unregulated securities that allow for returns in months rather than years
  
  retail traiders when it becomes publicly available
2. peter_murray 08 Jan 2023
  
  in Public
  
  axi embodies a new generation of games
  
  Axie Infinity play-to-earn game
  
  Funded by a16z, this game allowed character owners to farm out their characters to "scholars" who would play them for a cut of the player's earnings. It has been accused of relying on predatory mechanisms to extract rent from lower demographic populations.
  
  video games
3. peter_murray 08 Jan 2023
  
  in Public
  
  Andries and Horowitz otherwise known as a16z they're one of the largest Venture Capital firms they have a crypto fund with around 7.6 billion to be invested in crypto and web 3 startups and have been investing in crypto companies dating back to 2013
  
  Andreessen Horowitz (a16z)
  
  Venture Capital behind many of the crypto projects. Think of crypto as having social, cultural, and technological innovation.
  
  Andreessen-Horowitz
4. peter_murray 08 Jan 2023
  
  in Public
  
  I mean people suggested that you could replace legal contracts with small contracts which are programs that are built on the blockchain and that's usually accompanied with the phrase coder's law this is a smart contract and this is a legal contract these two things aren't the same right you can't have law be enacted by computer code because law inherently requires third parties to assess evidence intentions and a bunch of other variables that you just can't Outsource
  
  Fundamental difference between legal contracts and "smart contracts"
  
  Legal contracts are subject to judgement of evidence and intention. "Code as law" can't do that.
  
  smart contracts
5. peter_murray 08 Jan 2023
  
  in Public
  
  now web 3.0 is coming
  
  Web 3.0 as the next cryptocurrency pumping scheme
  
  A combination of services with blockchains and tokens at some fundamental level. It is being pumped as a replacement for big tech social media that is "Web 2.0"
  
  web3.0
6. peter_murray 08 Jan 2023
  
  in Public
  
  in some of the announcements from celebrities about their nft purchases there was this constant reference to a company called moonpay thanking them for help with purchasing
  
  Announced sponsorships by celebrities and influencers
  
  One such arrangement was with Moonpay. Earlier was the example of Bieber's manager.
  
  influencers
7. peter_murray 08 Jan 2023
  
  in Public
  
  people have accused that sale of essentially being a giant marketing stunt to increase the value of the B20 token
  
  Sale of Beeple's "Everydays: the first 5,000 days" as a marketing stunt for a digital museum
  
  Holders of the B20 token would be fractional owners of a set of digital art in an online museum. Value rose to $30/coin, then crashed.
  
  B20 (B20) Price, Charts, and News | Coinbase
8. peter_murray 08 Jan 2023
  
  in Public
  
  crypto kitties was probably one of the first nft projects to make it into the spotlight
  
  Crypto kitties as the first NFT project
  
  Each kitty image was a token—a "non fungible" token. It was traded as an asset; people bought them expecting them to go up in value. The supply outgrew demand and the market crashed.
  
  non-fungible tokens
9. peter_murray 08 Jan 2023
  
  in Public
  
  you could just use ethereum's blockchain to create your own cryptocurrencies or crypto assets as people called them these assets that existed without their own native blockchain were called tokens
  
  Tokens are assets without a native blockchain
  
  They leverage another blockchain, like Ethereum. Businesses launch a token with an initial coin offering for a project—explaining the purposes of the project with a whitepaper.
  
  Ethereum
10. peter_murray 08 Jan 2023
  
  in Public
  
  now Ponzi schemes are kept Alive by continuously finding new recruits and new markets to tap into so that money is continuously being poured into the scheme and so you start to see this similar incentive develop in the crypto space this desperate need to find a use case for this stuff and that use case has to be revolutionary enough to justify its Rising
  
  Rising value requirement needs people to have a reason to use bitcoin
  
  value
  
  If people are using Bitcoin as an investment instead of a currency, then people need to have a reason to sell it for more than what they bought it for. And that requires new money to come into the system.
11. peter_murray 08 Jan 2023
  
  in Public
  
  despite its failings Bitcoin still survived and is very much present to this day but that's just the thing the Bitcoin that we see today is a shell of its former identity where most of the purchasing of Bitcoin today happens on centralized exchanges that have to comply with the laws of centralized institutions in the instances where you've heard about Bitcoin or if you've ever been encouraged to buy Bitcoin under what context is that always framed
  
  Bitcoin, by 2014, is a shell of its former identity
  
  Most of the trading is happening on centralized exchanges that have to comply with laws. It starts being used not as a currency, but as an investment. It is the start of the Ponzi era.
12. peter_murray 08 Jan 2023
  
  in Public
  
  "The Great Crypto Scam." James Jani, Jan 1, 2023
  
  Bitcoin to Blockchains, to NFTs, to Web 3.0... it's time to find out if it's really all the hype or just part of one of the greatest scams in human history.
  
  Original video: https://youtu.be/ORdWE_ffirg
  
  bitcoin financial fraud
Visit annotations in context

Tags

smart contracts

influencers

Andreessen-Horowitz

video games

web3.0

non-fungible tokens

financial fraud

bitcoin

Ethereum

Annotators

peter_murray

URL

docdrop.org/video/ORdWE_ffirg/
Dec 2022
inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? "1F99COn the Dangers of Stochastic Parrots: Can Language Models Be Too Big? "1F99C

5
1. peter_murray 30 Dec 2022
  
  in Public
  
  For instance, GPT-2’s training data is sourced by scraping out-bound links from Reddit, and Pew Internet Research’s 2016 surveyreveals 67% of Reddit users in the United States are men, and 64%between ages 18 and 29.13 Similarly, recent surveys of Wikipediansfind that only 8.8–15% are women or girls [9].Furthermore, while user-generated content sites like Reddit,Twitter, and Wikipedia present themselves as open and accessibleto anyone, there are structural factors including moderation prac-tices which make them less welcoming to marginalized populations.
  
  Scraped data does not come from representative websites
2. peter_murray 30 Dec 2022
  
  in Public
  
  the voices of people most likely to hew toa hegemonic viewpoint are also more likely to be retained. In thecase of US and UK English, this means that white supremacist andmisogynistic, ageist, etc. views are overrepresented in the trainingdata, not only exceeding their prevalence in the general populationbut also setting up models trained on these datasets to furtheramplify biases and harms.
  
  Extreme positions are disproportionately represented in training data
3. peter_murray 30 Dec 2022
  
  in Public
  
  While the average human is responsible for an estimated 5t 퐶푂2푒per year,2 the authors trained a Transformer (big) model [136] withneural architecture search and estimated that the training procedureemitted 284t of 퐶푂2. Training a single BERT base model (withouthyperparameter tuning) on GPUs was estimated to require as muchenergy as a trans-American flight.
  
  Energy consumption on NLP model training
  
  Training a model cost 57 times the annual CO2 emissions of a single person.
  
  climate change
4. peter_murray 30 Dec 2022
  
  in Public
  
  we understand the term language model (LM) torefer to systems which are trained on string prediction tasks: that is,predicting the likelihood of a token (character, word or string) giveneither its preceding context or (in bidirectional and masked LMs)its surrounding context. Such systems are unsupervised and whendeployed, take a text as input, commonly outputting scores or stringpredictions.
  
  Definition of "Language Model"
  
  Notes that this is fundamentally a string prediction algorithm with unsupervised training.
5. peter_murray 30 Dec 2022
  
  in Public
  
  Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21). Association for Computing Machinery, New York, NY, USA, 610–623. https://doi.org/10.1145/3442188.3445922
  
  natural language processing
Visit annotations in context

Tags

climate change

natural language processing

Annotators

peter_murray

URL

inst-fs-iad-prod.inscloudgate.net/files/cf4622a4-ec28-4c20-97e7-3d5b9da4852a/2021-bender-parrots.pdf
www.sciencealert.com www.sciencealert.com

ChatGPT Could Revolutionize The Internet, But Its Secrets Have Experts Worried

2
1. peter_murray 27 Dec 2022
  
  in Public
  
  OpenAI is perhaps one of the oddest companies to emerge from Silicon Valley. It was set up as a non-profit in 2015 to promote and develop "friendly" AI in a way that "benefits humanity as a whole". Elon Musk, Peter Thiel and other leading tech figures pledged US$1 billion towards its goals.Their thinking was we couldn't trust for-profit companies to develop increasingly capable AI that aligned with humanity's prosperity. AI therefore needed to be developed by a non-profit and, as the name suggested, in an open way.In 2019 OpenAI transitioned into a capped for-profit company (with investors limited to a maximum return of 100 times their investment) and took a US$1 billion investment from Microsoft so it could scale and compete with the tech giants.
  
  Origins of OpenAI
  
  First a non-profit started with funding from Musk, Theil, and others. It has since transitioned to a "capped for-profit company".
  
  OpenAI
2. peter_murray 27 Dec 2022
  
  in Public
  
  ChatGPT is a bit like autocomplete on your phone. Your phone is trained on a dictionary of words so it completes words. ChatGPT is trained on pretty much all of the web, and can therefore complete whole sentences – or even whole paragraphs.
  
  ChatGPT is like autocomplete
  
  chatbot
Visit annotations in context

Tags

chatbot

OpenAI

Annotators

peter_murray

URL

sciencealert.com/chatgpt-could-revolutionize-the-internet-but-its-secrets-have-experts-worried
www.nature.com www.nature.com

Twitter changed science — what happens now it’s in turmoil?

4
1. peter_murray 26 Dec 2022
  
  in Public
  
  Information scientist Stefanie Haustein at the University of Ottawa in Canada, who has studied the impact of Twitter on scientific communication, says the changes show why it’s concerning that scientists embraced a private, for-profit firm’s platform to communicate on. “We’re in the hands of actors whose main interest is not the greater good for scholarly communication,” she says.
  
  “Public square, private land”
  
  Competing values: a desire for openness versus a desire for profit. Satisfying the needs of users versus satisfying the needs of shareholders.
2. peter_murray 26 Dec 2022
  
  in Public
  
  despite Twitter’s self-styled reputation as a public town square — where everyone gathers to see the same messages — in practice, the pandemic showed how users segregate to follow mostly those with similar views, argues information scientist Oliver Johnson at the University of Bristol, UK. For instance, those who believed that COVID-19 was a fiction would tend to follow others who agreed, he says, whereas others who argued that the way to deal with the pandemic was to lock down for a ‘zero COVID’ approach were in their own bubble.
  
  Digital town square meats filter bubble effect
  
  During the COVID-19 pandemic, Twitter gave voice to researchers, but the platform’s algorithms allowed users to self sort into groups based on what they wanted to hear.
  
  filter bubble
3. peter_murray 26 Dec 2022
  
  in Public
  
  Rodrigo Costas Comesana, an information scientist at Leiden University in the Netherlands, and his colleagues published a data set of half a million Twitter users1 who are probably researchers. (The team used software to try to match details of Twitter profiles to those of authors on scientific papers.) In a similar, smaller 2020 study, Costas and others estimated that at least 1% of paper authors in the Web of Science had profiles on Twitter, with the proportion varying by country2. A 2014 Nature survey found that 13% of researchers used Twitter regularly, although respondents were mostly English-speaking and there would have been self-selection bias (see Nature 512, 126–129; 2014).
  
  Perhaps few researchers on Twitter
4. peter_murray 26 Dec 2022
  
  in Public
  
  Nature 613, 19-21 (2023)
  
  doi: https://doi.org/10.1038/d41586-022-04506-6
  
  Twitter scholarly communication
Visit annotations in context

Tags

Twitter

scholarly communication

filter bubble

Annotators

peter_murray

URL

nature.com/articles/d41586-022-04506-6
copyrightalliance.org copyrightalliance.org

Are Recipes and Cookbooks Protected by Copyright?

2
1. peter_murray 26 Dec 2022
  
  in Public
  
  Recipes are usually not protected by copyright due to the idea-expression dichotomy. The idea-expression dichotomy creates a dividing line between ideas, which are not protected by copyright law, and the expression of those ideas, which can be protected by copyright law.
  
  Copyright’s idea-expression dichotomy
  
  In the context of the fediverse thread: So the list of ingredients and the steps to reproduce a dish are not covered by U.S. Copyright law. If I described the directions in iambic pentameter, that expression is subject to copyright. But someone could reinterpret the directions in limerick form, and that expression would not violate the first copyright and itself be copyrightable.
2. peter_murray 26 Dec 2022
  
  in Public
  
  Found via a fediverse post by Carl Malamud: https://code4lib.social/@carlmalamud@official.resource.org/109574978910574688
  
  copyright
Visit annotations in context

Tags

copyright

Annotators

peter_murray

URL

copyrightalliance.org/ai-training-is-not-human-learning/
www.nytimes.com www.nytimes.com

A New Chat Bot Is a ‘Code Red’ for Google’s Search Business

1
1. peter_murray 23 Dec 2022
  
  in Public
  
  Three weeks ago, an experimental chat bot called ChatGPT made its case to be the industry’s next big disrupter. It can serve up information in clear, simple sentences, rather than just a list of internet links. It can explain concepts in ways people can easily understand. It can even generate ideas from scratch, including business strategies, Christmas gift suggestions, blog topics and vacation plans.
  
  ChatGPT's synthesis of information versus Google Search's list of links
  
  The key difference here, though, is that with a list of links, one can follow the links and evaluate the sources. With a ChatGPT response, there are no citations to the sources—just an amalgamation of statements that may or may not be true.
  
  chatbot Google search information literacy
Visit annotations in context

Tags

Google search

information literacy

chatbot

Annotators

peter_murray

URL

nytimes.com/2022/12/21/technology/ai-chatgpt-google-search.html
stratechery.com stratechery.com

AI Homework

2
1. peter_murray 13 Dec 2022
  
  in Public
  
  Here’s an example of what homework might look like under this new paradigm. Imagine that a school acquires an AI software suite that students are expected to use for their answers about Hobbes or anything else; every answer that is generated is recorded so that teachers can instantly ascertain that students didn’t use a different system. Moreover, instead of futilely demanding that students write essays themselves, teachers insist on AI. Here’s the thing, though: the system will frequently give the wrong answers (and not just on accident — wrong answers will be often pushed out on purpose); the real skill in the homework assignment will be in verifying the answers the system churns out — learning how to be a verifier and an editor, instead of a regurgitator. What is compelling about this new skillset is that it isn’t simply a capability that will be increasingly important in an AI-dominated world: it’s a skillset that is incredibly valuable today. After all, it is not as if the Internet is, as long as the content is generated by humans and not AI, “right”; indeed, one analogy for ChatGPT’s output is that sort of poster we are all familiar with who asserts things authoritatively regardless of whether or not they are true. Verifying and editing is an essential skillset right now for every individual.
  
  What homework could look like in a ChatGPT world
  
  Critical editing becomes a more important skill than summation. When the summation synthesis comes for free, students distinguish themselves by understanding what is correct and correcting what is not. Sounds a little bit like "information literacy".
  
  information literacy
2. peter_murray 12 Dec 2022
  
  in Public
  
  That there, though, also shows why AI-generated text is something completely different; calculators are deterministic devices: if you calculate 4,839 + 3,948 - 45 you get 8,742, every time. That’s also why it is a sufficient remedy for teachers to requires students show their work: there is one path to the right answer and demonstrating the ability to walk down that path is more important than getting the final result. AI output, on the other hand, is probabilistic: ChatGPT doesn’t have any internal record of right and wrong, but rather a statistical model about what bits of language go together under different contexts. The base of that context is the overall corpus of data that GPT-3 is trained on, along with additional context from ChatGPT’s RLHF training, as well as the prompt and previous conversations, and, soon enough, feedback from this week’s release.
  
  Difference between a calculator and ChatGPT: deterministic versus probabilistic
  
  chatbot educational technology
Visit annotations in context

Tags

information literacy

chatbot

educational technology

Annotators

peter_murray

URL

stratechery.com/2022/ai-homework/
theintercept.com theintercept.com

The Internet’s New Favorite AI Proposes Torturing Iranians and Surveilling Mosques

2
1. peter_murray 12 Dec 2022
  
  in Public
  
  The criticisms of ChatGPT pushed Andreessen beyond his longtime position that Silicon Valley ought only to be celebrated, not scrutinized. The simple presence of ethical thinking about AI, he said, ought to be regarded as a form of censorship. “‘AI regulation’ = ‘AI ethics’ = ‘AI safety’ = ‘AI censorship,’” he wrote in a December 3 tweet. “AI is a tool for use by people,” he added two minutes later. “Censoring AI = censoring people.” It’s a radically pro-business stance even by the free market tastes of venture capital, one that suggests food inspectors keeping tainted meat out of your fridge amounts to censorship as well.
  
  Marc Andreessen objects to AI regulation
2. peter_murray 12 Dec 2022
  
  in Public
  
  It’s tempting to believe incredible human-seeming software is in a way superhuman, Block-Wehba warned, and incapable of human error. “Something scholars of law and technology talk about a lot is the ‘veneer of objectivity’ — a decision that might be scrutinized sharply if made by a human gains a sense of legitimacy once it is automated,” she said.
  
  Veneer of Objectivity
  
  Quote by Hannah Bloch-Wehba, TAMU law professor
  
  automated decision systems
Visit annotations in context

Tags

automated decision systems

Annotators

peter_murray

URL

theintercept.com/2022/12/08/openai-chatgpt-ai-bias-ethics/
media.dltj.org media.dltj.org

"The Divided Dial" uncorrected transcript (99% Invisible)

5
1. peter_murray 12 Dec 2022
  
  in Public
  
  The numbers have gone up a bit since then, but black Americans still own less than 2% of commercial radio stations in the country and it wreaked havoc on local ownershi
  
  2% of commercial radio stations are black-owned in the 2010s
2. peter_murray 12 Dec 2022
  
  in Public
  
  Shock jocks were rarely political, especially in the early days were mostly just kind of like lewd or gross, shocking, you know, as is in the name, But as that sort of brash new style got popular, it became clear that political talk could bring that shock Jock energy to program. This is Alan Berg. He was a liberal talk radio host out of Denver. He got started in the late 1970s and was really taking off in the The 80s. He was Jewish and was known for being pretty vitriolic and calling out racism and bigotry are still firmly in control of the Soviet Union, responsible for the murder of 50 million Christians. Think your ability to reason and your program and you are a Nazi by your very own, given what we know about AM and FM talk radio. Now, this is very surprising to hear. I mean, it's got the in your face confrontational talk show vibe, but it's from the opposite side of the political spectrum, totally. It is surprising and to paint a picture of how he was received. At one point, there's this pole that goes out in Denver that asks residents to name the city's most beloved media personality and its most despised. And Alan Berg won both awards. That's a feat that's kind of incredible. And Alan Berg was super well known.
  
  Alan Berg, recognized as the first widely distributed political talk radio voice
  
  Hosting a liberal talk radio show out of Denver, Colorado. He is murdered by someone who called him on the air and who Berg called a Nazi.
3. peter_murray 12 Dec 2022
  
  in Public
  
  the FCC starts giving broadcasters recommendations of how they can avoid that same fate, how they can satisfy that longstanding vague requirement to serve the public interest. And they really start pushing this thing called Ascertainment Ascertainment Swor when people in local communities were interviewed by station officials, people who we're never asked before, what do you think ought to be on radio, what do you think ought to be on tv. Now they were being asked these questions, this was done by radio stations and television stations, commercial stations, public stations across the country, You know, this seems so simple and so revolutionary at the same time, like just ask people what they want to hear about and maybe that would shape the broadcasting accordingly.
  
  FCC Ascertainment guidance
4. peter_murray 12 Dec 2022
  
  in Public
  
  So it's the fifties and sixties, the stations, they're cutting the NBC coverage of the civil rights movement and it's not just, you know, morally dubious, it's actually against the policies of the FCC. Exactly, right. So civil rights activists decided to put that to the test and they ended up challenging Wlbt? S license for repeatedly denying them airtime. At first, the FCC dismissed the case, but then the activists sued the FCC and they won. And eventually, years later, a federal court decided that Wlbt could stay on the air, but their license would be transferred to a nonprofit, multiracial group of broadcasters.
  
  Enforcement of the FCC Fairness Doctrine
  
  The case was decided at the D.C. Circuit court level with the future Justice Warren Berger writing the opinion. The opinion forced action at the FCC.
  
  Office of Commun., United Ch., Christ v. FCC. 425 F.2d 543 (D.C. Cir. 1969). (see https://casetext.com/case/office-of-commun-united-ch-christ-v-fcc)
  
  United Church of Christ Federal Communications Commission
5. peter_murray 12 Dec 2022
  
  in Public
  
  Okay, so flashback to the 1920s and the emergence of something called the public interest mandate, basically when radio was new, a ton of people wanted to broadcast the demand for space on the dial outstripped supply. So to narrow the field, the federal government says that any station using the public airwaves needs to serve the public interest. So what do they mean by the public interest? Yeah, right? It's like super vague, right? But the FCC clarified what it meant by public interest in the years following World War Two, They had seen how radio could be used to promote fascism in Europe, and they didn't want us radio stations to become propaganda outlets. And so in 1949, the FCC basically says to stations in order to serve the public, you need to give airtime to coverage of current events and you have to include multiple perspectives in your coverage. This is the basis of what comes to be known as the fairness doctrine.
  
  Origin of the FCC Fairness Doctrine
  
  broadcast media
Visit annotations in context

Tags

broadcast media

Federal Communications Commission

United Church of Christ

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20221212T150227-99-_Invisible--The_Divided_Dial/index.html
www.washingtonpost.com www.washingtonpost.com

Analysis | A Twitter data tracker inhabits tens of thousands of websites

1
1. peter_murray 08 Dec 2022
  
  in Public
  
  The presence of Twitter’s code — known as the Twitter advertising pixel — has grown more troublesome since Elon Musk purchased the platform.AdvertisementThat’s because under the terms of Musk’s purchase, large foreign investors were granted special privileges. Anyone who invested $250 million or more is entitled to receive information beyond what lower-level investors can receive. Among the higher-end investors include a Saudi prince’s holding company and a Qatari fund.
  
  Twitter investors may get access to user data
  
  I'm surprised but not surprised that Musk's dealings to get investors in his effort to take Twitter private may include sharing of personal data about users. This article makes it sound almost normal that this kind of information-sharing happens with investors (inclusion of the phrase "information beyond what lower-level investors can receive").
  
  Twitter digital privacy
Visit annotations in context

Tags

Twitter

digital privacy

Annotators

peter_murray

URL

washingtonpost.com/politics/2022/12/08/twitter-data-tracker-inhabits-tens-thousands-websites/
newrepublic.com newrepublic.com

We Must Fight for a Better America. We Have No Choice.

2
1. peter_murray 08 Dec 2022
  
  in Public
  
  Instead, we should adopt a vision for the best possible America for this century, one that acknowledges that people, money, goods, and expressions are going to flow across borders and oceans, but that embraces justice and human flourishing as the ends of that process instead of morally vacant values like efficiency or productivity.
  
  Vision for America after the currrent constitutional crisis
2. peter_murray 08 Dec 2022
  
  in Public
  
  We often misdiagnose our current malady as one of “polarization.” That’s wrong. We have one rogue, ethno-authoritarian party and one fairly stable and diverse party. It just looks like polarization when you map it red and blue or consider these parties to be equal in levels of mercenary commitment, which they overwhelmingly are not. In one sense, America has always been polarized, just not along partisan lines. It’s also been more polarized rather recently, as in 1919 or 1968.Instead, we suffer from judicial tyranny fueled by white supremacy. One largely unaccountable branch of government has been captured by ideologues who have committed themselves to undermining the will of the electorate on matters ranging from women’s bodily autonomy to voting rights to the ability of the executive branch to carry out the policy directives of Congress by regulating commerce and industry.
  
  Thesis: not polarization but white-supremacy-filled judicial tyranny
  
  It isn’t clear to me that the judiciary is filled with white suprematists, but the judiciary is increasingly swinging conservative appointed by far right ideologues fueled by white suprematism.
  
  culture
Visit annotations in context

Tags

culture

Annotators

peter_murray

URL

newrepublic.com/article/168881/must-fight-better-america-no-choice
dl.acm.org dl.acm.org

The design philosophy of the DARPA internet protocols

18
1. peter_murray 06 Dec 2022
  
  in Public
  
  While the datagram has served veIy well in solving themost important goals of the Internet, it has not served sowell when we attempt to addresssome of the goals whichwere further down the priority list. For example, thegoals of resource management and accountability haveproved difficult to achieve in the context of datagrams.As the previous section discussed, most datagrams are apart of some sequence of packets from source todestination, rather than isolated units at the applicationlevel. However, the gateway cannot directly see theexistence of this sequence, because it is forced to dealwith each packet in isolation. Therefore, resourcemanagement decisions or accounting must be done oneach packet separately. Imposing the datagram model onthe intemet layer has deprived that layer of an importantsource of information which it could use in achievingthese goals.
  
  Datagrams solved the higher priority goals
  
  ...but 34 years later we have the same challenges with the lower priority goals.
2. peter_murray 06 Dec 2022
  
  in Public
  
  There is a mistaken assumption often associated withdatagrams, which is that the motivation for datagrams isthe support of a higher level service which is essentiallyequivalent to the datagram. In other words, it hassometimes been suggested that the datagram is providedbecause the transport service which the applicationrequires is a datagram service. In fact, this is seldom thecase. While some applications in the Internet, such assimple queries of date servers or name servers, use anaccess method based on an unreliable datagram, mostservices within the Internet would like a moresophisticated transport model than simple datagram.Some services would like the reliability enhanced, somewould like the delay smoothed and buffered, but almostall have some expectation more complex than adatagram. It is important to understand that the role ofthe datagram in this respect is as a building block, and notas a service in itself.
  
  Datagram as the fundamental building block
  
  Is it any wonder then that QUIC—a TCP-like stateful connection—is being engineered using UDP?
  
  quic
3. peter_murray 06 Dec 2022
  
  in Public
  
  This problem was particularly aggravating because thegoal of the Internet project was to produce specificationdocuments which were to become military standards. Itis a well known problem with government contractingthat one cannot expect a contractor to meet any criteriawhich is not a part of the procurement standard. If theInternet is concerned about performance, therefore, it wasmandatory that performance requirements be put into theprocurement specification. It was trivial to inventspecifications which constrained the performance, forexample to specify that the implementation must becapable of passing 1.000 packets a second. However, thissort of constraint could not be part of the architecture,and it was therefore up to the individual performing theprocurement to recognize that these performanceconstraints must be added to the specification, and tospecify them properly to achieve a realization whichprovides the required types of service
  
  Procurement standards meet experimental factors
  
  I'm finding it funny to read this artifact of its time with the construction of the protocol standards and the purchase of hardware that met that standard. As I can imagine, it was all new in the 1970s and 1980s, and it was evolving quickly. Procurement rules are a pain no matter what the decade they are in.
4. peter_murray 06 Dec 2022
  
  in Public
  
  Put another way, the architecture tried very hard not toconstrain the range of service which the Internet could beengineered to provide. This, in turn, means that tounderstand the service which can be offered by aparticular implementation of an Internet, one must looknot to the architecture, but to the actual engineering of thesoftware within the particular hosts and gateways, and tothe particular networks which have been incorporated.
  
  Upper part of the hourglass protocol shape
5. peter_murray 06 Dec 2022
  
  in Public
  
  Another possible source of inefficiency is retransmissionof lost packets. Since Internet does not insist that lostpackets be recovered at the network level, it may benecessary to retransmit a lost packet from one end of theInternet to the other. This means that the retransmittedpacket may cross several intervening nets a second time,whereas recovery at the network level would not generatethis repeat traffic. This is an example of the tradeoffresulting from the decision, discussed above, of providingservices from the end-points. The network interface codeis much simpler, but the overall efficiency is potentiallyless. However, if the retransmission rate is low enough(for example, 1%) then the incremental cost is tolerable.As a rough rule of thumb for networks incorporated intothe architecture, a loss of one packet in a hundred is quitereasonable, but a loss of one packet in ten suggests thatreliability enhancements be added to the network if thattype of service is required.
  
  Inefficiency of end-to-end packet re-transmission is accepted
6. peter_murray 06 Dec 2022
  
  in Public
  
  On the other hand, some of the most significant problemswith the Internet today relate to lack of sufficient tools fordistributed management,especially in the area of routing.In the large intemet being currently operated, routingdecisions need to be constrained by policies for resourceusage. Today this can be done only in a very limitedway, which requires manual setting of tables. This iserror-prone and at the same time not sufficientlypowerful. The most important change in the Internetarchitecture over the next few years will probably be thedevelopment of a new generation of tools formanagement of resources in the context of multipleadministrations.
  
  Internet routing problems
  
  This was written in 1988, and is still somewhat true today.
  
  internet routing
7. peter_murray 06 Dec 2022
  
  in Public
  
  This goal caused TCP and IP, which originally had beena single protocol in the architecture, to be separated intotwo layers. TCP provided one particular type of service,the reliable sequenceddata stream, while IP attempted toprovide a basic building block out of which a variety oftypes of service could be built. This building block wasthe datagram, which had also been adopted to supportsurvivability. Since the reliability associated with thedelivery of a datagram was not guaranteed, but “besteffort,” it was possible to build out of the datagram aservice that was reliable (by acknowledging andretransmitting at a higher level), or a service which tradedreliability for the primitive delay characteristics of theunderlying network substrate. The User DatagramProtocol (UDP)13 was created to provide a application-level interface to the basic datagram service of Internet.
  
  Origin of UDP as the split of TCP and IP
  
  This is the center of the hourglass protocol stack shape.
  
  hourglass protocol stack shape
8. peter_murray 06 Dec 2022
  
  in Public
  
  It was very important for the success of the Internetarchitecture that it be able to incorporate and utilize awide variety of network technologies, including militaryand commercial facilities. The Internet architecture hasbeen very successful in meeting this goal: it is operatedover a wide variety of networks, including long haul nets(the ARPANET itself and various X.25 networks), localarea nets (Ethernet, ringnet, etc.), broadcast satellite nets(the DARPA Atlantic Satellite Network’“, I5 operating at64 kilobits per second and the DARPA ExperimentalWideband Satellite Net,16 operating within the UnitedStates at 3 megabits per second), packet radio networks(the DARPA packet radio network, as well as anexperimental British packet radio net and a networkdeveloped by amateur radio operators), a variety of seriallinks, ranging from 1200 bit per second asynchronousconnections to TI links, and a variety of other ad hocfacilities, including intercomputer busses and thetransport service provided by the higher layers of othernetwork suites, such as IBM’s HASP.
  
  Lower part of the hourglass protocol stack shape
9. peter_murray 06 Dec 2022
  
  in Public
  
  Another service which did not fu TCP was real timedelivery of digitized speech, which was needed to supportthe teleconferencing aspect of command and controlapplications. III real time digital speech, the primaryrequirement is not a reliable service, but a service whichminimizes and smooths the delay in the delivery ofpackets.
  
  Considerations for digital speech in 1988
10. peter_murray 06 Dec 2022
  
  in Public
  
  There are two consequencesto the fate-sharing approachto survivability. First. the intermediate packet switchingnodes, or gateways, must not have any essential stateinformation about on-going connections. Instead, theyare stateless packet switches, a class of network designsometimes called a “datagram” network. Secondly, rathermore trust is placed in the host machine than in anarchitecture where the network ensures the reliabledelivery of data. If the host resident algorithms thatensure the sequencing and acknowledgment of data fail,applications on that machine are prevented fromoperation.
  
  Fate-sharing approach to survivability
11. peter_murray 06 Dec 2022
  
  in Public
  
  It was an assumption in thisarchitecture that synchronization would never be lostunless there was no physical path over which any sort ofcommunication could be achieved. In other words, at thetop of transport, there is only one failure, and it is totalpartition. The architecture was to mask completely anytransient failure.
  
  Never a failure until there was no path
  
  I remember being online for the Northridge earthquake in the Los Angeles area in January 1994. IRC was a robust tool for getting information in and out: text-based (so low bandwidth), ability to route around circuit failure.
12. peter_murray 06 Dec 2022
  
  in Public
  
  For example,since this network was designed to operate in a militarycontext, which implied the possibility of a hostileenvironment, survivability was put as a first goal, andaccountability as a last goal. During wartime. one is lessconcerned with detailed accounting of resources usedthan with mustering whatever resources are available andrapidly deploying them it-i an operational manner. Whilethe architects of the Internet were mindful ofaccountability, the problem received very little attentionduring the early stages of the design. aud is only nowbeing considered. An architecture primarily forcommercial deployment would clearly place these goalsat the opposite end of the list.
  
  Military context first
  
  In order of priority, a network designed to be resilient in a hostile environment is more important than a network that has an accountable architecture. The paper even goes on to say that a commercial network would have a different architecture.
13. peter_murray 06 Dec 2022
  
  in Public
  
  From these assumptions comes the fundamental structureof the Internet: a packet switched communicationsfacility in which a number of distinguishable networksam connected together using packet communicationsprocessors called gateways which implement a store aridforward packet forwarding algorithm.
  
  Fundamental structure of the internet
  
  Effective network characteristics (in order of importance, from the paper):
  
  Internet communication must continue despite loss of networks or gateways.
  
  The Internet must support multiple types of communications service.
  
  The Internet architecture must accommodate a variety of networks.
  
  The Internet architecture must permit distributed management of its resources.
  
  The Internet architecture must be cost effective.
  
  The Internet architecture must permit host attachment with a low level of effort.
  
  The resources used in the Intemet architecture must be accountable.
14. peter_murray 06 Dec 2022
  
  in Public
  
  The technique selected for multiplexing was packetswitching. Au alternative such as circuit switching couldhave been considered, but the applications beingsupported, such as remote login, were naturally served bythe packet switching paradigm, and the networks whichwere to be integrated together in this project were packetswitching networks. So packet switching was acceptedas a fundamental component of the Internet architecture.
  
  Packet-switched versus circuit-switched
  
  The first networks were packet-switched over circuits. (I remember the 56Kbps circuit modems that were upgraded to T1 lines.) Of course, it has switched now—circuits are emulated over packet switched networks.
  
  internet infrastructure
15. peter_murray 06 Dec 2022
  
  in Public
  
  Further, networks representadministrative boundaries of control, and it was anambition of this project to come to grips with the problemof integrating a number of separately administratedentities into a common utility.
  
  Integrating separately administered networks
  
  This is prefaced with the word "further" but I think it was perhaps more key to the ultimate strength of the "inter-net" that this agreement about interconnectivity was a key design principle. The devolution of control and the rise of the internet exchange points (IXPs) certainly fueled growth faster than a top-down approach would have.
16. peter_murray 06 Dec 2022
  
  in Public
  
  The components of the Internet were networks, whichwere to be interconnected to provide some larger service.The original goal was to connect together the ori BinalARPANET’ with the ARPA packet radio network’. ‘, inorder to give users on the packet radio network accesstothe large service machines on the ARPANET.
  
  Original goal to connect ARPA packet radio network with ARPANET
  
  I hadn't heard this before. As I was coming up in my internet education in the late 1980s, I remember discussions about connectivity with ALOHAnet in Hawaii.
17. peter_murray 06 Dec 2022
  
  in Public
  
  The connectionless configuration of IS0protocols has also been colored by the history of theInternet suite, so an understanding ‘of the Internet designphilosophy may be helpful to those working with ISO.
  
  ISO protocols
  
  At one point, the Open Systems Interconnection model (OSI model) was the leading contender for the network standard. It didn't survive in competition with the more nimble TCP/IP stack design.
  
  OSI network model
18. peter_murray 06 Dec 2022
  
  in Public
  
  D. Clark. 1988. The design philosophy of the DARPA internet protocols. In Symposium proceedings on Communications architectures and protocols (SIGCOMM '88). Association for Computing Machinery, New York, NY, USA, 106–114. https://doi.org/10.1145/52324.52336
  
  The Internet protocol suite, TCP/IP, was first proposed fifteen years ago. It was developed by the Defense Advanced Research Projects Agency (DARPA), and has been used widely in military and commercial systems. While there have been papers and specifications that describe how the protocols work, it is sometimes difficult to deduce from these why the protocol is as it is. For example, the Internet protocol is based on a connectionless or datagram mode of service. The motivation for this has been greatly misunderstood. This paper attempts to capture some of the early reasoning which shaped the Internet protocols.
  
  hourglass protocol stack shape internet history
Visit annotations in context

Tags

quic

internet history

OSI network model

internet routing

internet infrastructure

hourglass protocol stack shape

Annotators

peter_murray

URL

dl.acm.org/doi/pdf/10.1145/52325.52336
news.ycombinator.com news.ycombinator.com

Elon: Let Me Help You Speed Run the Content Moderation Learning Curve | Hacker News

2
1. peter_murray 04 Dec 2022
  
  in Public
  
  Just for reference I believe that the "more speech" idea originated with Louis Brandeis, who was a brilliant thinker and one of the important liberal Supreme Court justices of the 20th century. The actual quote is:"If there be time to expose through discussion the falsehood and fallacies, to avert the evil by the process of education, the remedy to be applied is more speech, not enforced silence."[0]Louis Brandeis did believe that context and specifics are important, so I think the technology of the online platform is significant especially with respect to the first part of that quote.[0] https://tile.loc.gov/storage-services/service/ll/usrep/usrep...
  
  Brandeis more speech quote context
2. peter_murray 04 Dec 2022
  
  in Public
  
  The hypothesis is that hate speech is met with other speech in a free marketplace of ideas.That hypothesis only functions if users are trapped in one conversational space. What happens instead is that users choose not to volunteer their time and labor to speak around or over those calling for their non-existence (or for the non-existence of their friends and loved ones) and go elsewhere... Taking their money and attention with them.As those promulgating the hate speech tend to be a much smaller group than those who leave, it is in the selfish interest of most forums to police that kind of signal jamming to maximize their possible user-base. Otherwise, you end up with a forum full mostly of those dabbling in hate speech, which is (a) not particularly advertiser friendly, (b) hostile to further growth, and (c) not something most people who get into this gig find themselves proud of.
  
  Battling hate speech is different when users aren't trapped
  
  When targeted users are not trapped on a platform, they have the choice to leave rather than explain themselves and/or overwhelm the hate speech. When those users leave, the platform becomes less desirable for others (the concentration of hate speech increases) and it becomes a vicious cycle downward.
  
  content moderation
Visit annotations in context

Tags

content moderation

Annotators

peter_murray

URL

news.ycombinator.com/item
themarkup.org themarkup.org

Meta Sued for Collecting Financial Information Through Tax Filing Websites – The Markup

2
1. peter_murray 03 Dec 2022
  
  in Public
  
  The lawsuit, filed on Dec. 1 in federal court
  
  Doe v. Meta Platforms, Inc., 3:22-cv-07557 – CourtListener.com
2. peter_murray 03 Dec 2022
  
  in Public
  
  Meta's receipt of tax information via tracking pixels on tax preparer websites is the subject of a federal lawsuit. The tax preparing sites are not participants in the lawsuit (yet?).
  
  Meta digital privacy
Visit annotations in context

Tags

digital privacy

Meta

Annotators

peter_murray

URL

themarkup.org/pixel-hunt/2022/12/02/meta-sued-for-collecting-financial-information-through-tax-filing-websites
escapingtech.com escapingtech.com

I Was Wrong About Mastodon – EscapingTech

3
1. peter_murray 02 Dec 2022
  
  in Public
  
  The ability for users to choose if they wish to be collateral damage is what makes Mastodon work. If an instance is de-federated due to extremism, the users can pressure their moderators to act in order to gain re-federation. Otherwise, they must make the decision if to go down with the ship or simply move. This creates a healthy self-regulating ecosystem where once an instance starts to get de-federated, reasonable users will move their accounts, leaving behind unreasonable ones, which further justifies de-federation, and will lead to more and more instances choosing to de-federate the offending one.
  
  De-federation feedback loop
  
  If an instance owner isn't moderating effectively, other instances will start de-federating. Users on the de-federated instance can "go down with the ship or simply move". When users move off an instance, it increases the concentration of bad actors on that instance and increases the likelihood that others will de-federate.
2. peter_murray 02 Dec 2022
  
  in Public
  
  Most Mastodon servers run on donations, which creates a very different dynamic. It is very easy for a toxic platform to generate revenue through ad impressions, but most people are not willing to pay hard-earned money to get yelled at by extremists all day. This is why Twitter’s subscription model will never work. With Mastodon, people find a community they enjoy, and thus are happy to donate to maintain. Which add a new dynamic. Since Mastodon is basically a network of communities, it is expected that moderators are responsible for their own community, lowering the burden for everyone. Let’s say you run a Mastodon instance and a user of another instance has become problematic towards your users. You report them to their instance’s moderators, but the moderators decline to act. What can you do? Well a lot, actually.
  
  Accountability economy
  
  Assuming instance owners want their instance to thrive, they are accountable to the users—who are also donating funds to run the server. Mastodon also provides easy ways to block users or instances, and if bad actors start populating an instance, the instance gets a bad name and is de-federated by others. Users on the de-federated instance now have the option to stick around or go to another instance so they are reachable again.
  
  Mastodon
3. peter_murray 02 Dec 2022
  
  in Public
  
  What I missed about Mastodon was its very different culture. Ad-driven social media platforms are willing to tolerate monumental volumes of abusive users. They’ve discovered the same thing the Mainstream Media did: negative emotions grip people’s attention harder than positive ones. Hate and fear drives engagement, and engagement drives ad impressions. Mastodon is not an ad-driven platform. There is absolutely zero incentives to let awful people run amok in the name of engagement. The goal of Mastodon is to build a friendly collection of communities, not an attention leeching hate mill. As a result, most Mastodon instance operators have come to a consensus that hate speech shouldn’t be allowed. Already, that sets it far apart from twitter, but wait, there’s more. When it comes to other topics, what is and isn’t allowed is on an instance-by-instance basis, so you can choose your own adventure.
  
  Attention economy
  
  Twitter drivers: Hate/fear → Engagement → Impressions → Advertiser money. Since there is no advertising money in Mastodon, it operates on different drivers. Since there is no advertising money, a Mastodon operator isn't driven to get the most impressions. Because there isn't a need to get a high number of impressions, there isn't a need to fuel the hate/fear drivers.
  
  online community management Twitter
Visit annotations in context

Tags

online community management

Twitter

Mastodon

Annotators

peter_murray

URL

escapingtech.com/tech/opinions/i-was-wrong-about-mastodon-moderation.html
Nov 2022
www.noemamag.com www.noemamag.com

Mastodon Isn’t Just A Replacement For Twitter

3
1. peter_murray 29 Nov 2022
  
  in Public
  
  As users begin migrating to the noncommercial fediverse, they need to reconsider their expectations for social media — and bring them in line with what we expect from other arenas of social life. We need to learn how to become more like engaged democratic citizens in the life of our networks.
  
  Fediverse should mean engaged citizens
  
  fediverse fediverse-governance
2. peter_murray 29 Nov 2022
  
  in Public
  
  Because Mastodon is designed more for chatter than governance, we use a separate platform, Loomio, for our deliberation and decision-making.
  
  social.coop uses Loomio for governance
3. peter_murray 29 Nov 2022
  
  in Public
  
  We believe that it is time to embrace the old idea of subsidiarity, which dates back to early Calvinist theology and Catholic social teaching. The European Union’s founding documents use the term, too. It means that in a large and interconnected system, people in a local community should have the power to address their own problems. Some decisions are made at higher levels, but only when necessary. Subsidiarity is about achieving the right balance between local units and the larger systems.
  
  Defining "subsidiarity"
  
  The FOLIO community operates like this..the Special Interest Groups have the power to decide for their functional area, and topics that cross functional areas are decided between SIGs or are brought to a higher level council.
  
  social structure
Visit annotations in context

Tags

fediverse

fediverse-governance

social structure

Annotators

peter_murray

URL

noemamag.com/mastodon-isnt-just-a-replacement-for-twitter
www.frontiersin.org www.frontiersin.org

Fifty psychological and psychiatric terms to avoid: a list of inaccurate, misleading, misused, ambiguous, and logically confused words and phrases

5
1. peter_murray 26 Nov 2022
  
  in Public
  
  Nevertheless, from the standpoint of learning theory, these and other authors have it backward, because a steep learning curve, i.e., a curve with a large positive slope, is associated with a skill that is acquired easily and rapidly (Hopper et al., 2007).
  
  Steep learning curve
  
  I don't think I'll ever hear this phrase the same again. A steep learning curve is a good thing...meaning over time that it was very easy to learn (less time on the x axis).
2. peter_murray 26 Nov 2022
  
  in Public
  
  Nevertheless, even ardent proponents of the view that DID is a naturally occurring condition that stems largely from childhood trauma (e.g., Ross, 1994) acknowledge that “multiple personality disorder” is a misnomer (Lilienfeld and Lynn, 2015), because individuals with DID do not genuinely harbor two or more fully developed personalities
  
  Multiple personality disorder
  
  Use dissociative identity disorder since 1994.
3. peter_murray 26 Nov 2022
  
  in Public
  
  There is no known “optimal” level of neurotransmitters in the brain, so it is unclear what would constitute an “imbalance.” Nor is there evidence for an optimal ratio among different neurotransmitter levels. Moreover, although serotonin reuptake inhibitors, such as fluoxetine (Prozac) and sertraline (Zoloft), appear to alleviate the symptoms of severe depression, there is evidence that at least one serotonin reuptake enhancer, namely tianepine (Stablon), is also efficacious for depression (Akiki, 2014). The fact that two efficacious classes of medications exert opposing effects on serotonin levels raises questions concerning a simplistic chemical imbalance model.
  
  Chemical imbalance
  
  We don't (yet) know what the proper balance of brain chemistry would be, so saying that mental illness is cause by a chemical imbalance is problematic. There are drugs that effectively treat depression that both decrease and increase serotonin, so because of these opposite effects it is hard to say what the proper amount should be.
4. peter_murray 26 Nov 2022
  
  in Public
  
  Furthermore, there are ample reasons to doubt whether “brainwashing” permanently alters beliefs (Melton, 1999). For example, during the Korean War, only a small minority of the 3500 American political prisoners subjected to intense indoctrination techniques by Chinese captors generated false confessions. Moreover, an even smaller number (probably under 1%) displayed any signs of adherence to Communist ideologies following their return to the US, and even these were individuals who returned to Communist subcultures
  
  Brainwashing
  
  The techniques of "brainwashing" aren't that much different form other persuasion methods. This term originated in the Korean war, and subsequent studies suggested that there are no permanent alterations to beliefs.
5. peter_murray 26 Nov 2022
  
  in Public
  
  numerous scholars have warned of the jingle and jangle fallacies, the former being the error of referring to different constructs by the same name and the latter the error of referring to the same construct by different names (Kelley, 1927; Block, 1995; Markon, 2009). As an example of the jingle fallacy, many authors use the term “anxiety” to refer interchangeably to trait anxiety and trait fear. Nevertheless, research consistently shows that fear and anxiety are etiologically separable dispositions and that measures of these constructs are only modestly correlated (Sylvers et al., 2011). As an example of the jangle fallacy, dozens of studies in the 1960s focused on the correlates of the ostensibly distinct personality dimension of repression-sensitization (e.g., Byrne, 1964). Nevertheless, research eventually demonstrated that this dimension was essentially identical to trait anxiety (Watson and Clark, 1984). In the field of social psychology, Hagger (2014) similarly referred to the “deja variable” problem, the ahistorical tendency of researchers to concoct new labels for phenomena that have long been described using other terminology (e.g., the use of 15 different terms to describe the false consensus effect; see Miller and Pedersen, 1999).
  
  Jingle and Jangle Fallacies
  
  Jingle: referring to different things by the same word
  
  Jangle: referring to a single thing with different words
Visit annotations in context

Annotators

peter_murray

URL

frontiersin.org/articles/10.3389/fpsyg.2015.01100/full

Peter Murray

Annotations: 1,276

Joined: October 24, 2012

Location: Columbus, Ohio, United States

Link: dltj.org/

ORCID: 0000-0003-4284-508X

Context of use matters when setting behavior

Jailbreaks we’re anticipated, but the huge public uptake required more and faster effort to fix.

Adversarial training with ChatGPT

Tags

Annotators

URL

Importance of technological extraction methods to supplement natural extraction methods

CO2 is rather low density in air, making it challenging to capture

Direct capture of CO2 from the atmosphere at near-commercial scale

Abstract

Tags

Annotators

URL

EU Digital Services Act interoperable requirement

Risk of filter bubbles

Decentralized moderation satisfies freedom to speak and freedom not to listen

Will consolidation of email providers point to consolidation of fediverse instances?

Gab's attempt to join the fediverse was rejected by Mastodon admins

Content-moderation subsidiarity means no user is banned

Reddit : Fediverse :: Decentralized-by-policy : Decentralized-by-architecture

Not all content moves when migrating in Mastodon

Content-moderation subsidiarity

Moderator's Trilemma

AOL and the like as early examples of closed systems that were replaced by open environments

Internet built on open protocols

Comparing email moderation and centralized moderation

Tags

Annotators

URL

Prompts are for sale

Prompt engineering is not science

Summary of prompt engineer work

Tags

Annotators

URL

When you can tell the difference between algorithm and humanity

Creator of ELIZA in 1956 regrets doing so

Distributional Semantics

Stochastic Parrot definition

OpenAI’s use of human taggers

LLMs as a model of reality, but not reality

GPT-3 doesn’t contain book content in its training?

AI -> SALAMI

Synthetic human behavior as AI bright line

Intentional Stance

Why LLMs are “great at mimicry and bad at facts”

The LLM Octopus Problem

Tags

Annotators

URL

Difference between railroad monopolies and digital platform monopolies

Defining "enshittification"

Tags

Annotators

URL

Abstract

Tags

Annotators

URL

ChatGPT/Bing's self-reflection comes from the corpus of discussions about AI that it has ingested

Microsoft's Tay project

Thwarting Eliza versus thwarting Sydney

Examples of Bing/ChatGPT/Sydney gaslighting users

Description of Joseph Weizenbaum's ELIZA program

Tags

Annotators

URL

Grounds for complaint

Very similar watermark implies Getty Images affiliation

Modified watermark in the output underscores a clear link

Getty Images' content is well suited for AI training

Copies with added noise

Role of LAION

Description of the LLM training process

Terms of service violation

Getty database copyright

12 million claimed; 7,316 listed

Trademark infringement on the Getty watermark

Descriptive information makes the Getty collection more valuable for LLM training

Tags