15 Matching Annotations
  1. Dec 2024
    1. As many are trying to get women into programming, so that they aren’t cut out of profitable and important fields, Amy Nguyen warns that men might just decide that programming is low status again (as has happened before in many fields):

      Especially with the rise of AI in the programming industry, this poses a serious threat. There was already a significant status divide within the industry between prestigious, higher-level programming jobs and lower-level grunt work like web dev and IT, with those jobs often treated as a "lower class" of programmer. AI has only increased the degree to which already established programmers can delegitimize these roles, allowing male-dominated establishments to (consciously or not) maintain workplaces where women are not hired for the roles deemed "too complex."

    1. Meta’s way of making profits fits in a category called Surveillance Capitalism.

      The industry of surveillance is the driving force of current trends in social media. Collecting and selling the data of users is the explicit business model of just about every social media platform, with a huge amount of the remaining tech sector using this data to maximize their margins. However, it is worth noting that, to a certain degree, the excessive collection of data is not necessarily an inherent evil, or even tied to capitalism. Regardless of economic structure, data collection can be used to improve the efficiency of development. Capitalism simply introduces a profit incentive which rewards companies for performing this collection as broadly as possible and with as few moral guardrails as possible.

  2. Nov 2024
    1. The Nazi crimes, it seems to me, explode the limits of the law; and that is precisely what constitutes their monstrousness. For these crimes, no punishment is severe enough. It may well be essential to hang Göring, but it is totally inadequate.

      This seems like a very apt comparison to me, as a common reason for "cancellation" is influencers collaborating with, or revealing themselves to be, neo-Nazis. It is also apt in that, in the end, very few of the perpetrators were actually put to death. Judging by the continued presence of Nazi ideology in the world, it seems that shame itself was not, in fact, a strong enough tool to actually affect the direction of society, in much the same way that supposed cancellations very rarely end a career.

    1. The consequences for being “canceled” can range from simply the experience of being criticized, to loss of job or criminal charges.

      I've never really liked the phrase "cancel culture" because it implies that there is something larger going on than people whose careers depend on their public image suffering the consequences of damaging their own image. "Cancellation" is not a real thing. Overwhelmingly, figures who are "cancelled" continue their careers unfazed, having received a huge amount of free press. While we definitely should be mindful of the way we view and utilize shame as a weapon, I think that public shaming is often the only recourse that victims have.

    1. This small percentage of people doing most of the work in some areas is not a new phenomenon. In many aspects of our lives, some tasks have been done by a small group of people with specialization or resources. Their work is then shared with others. This goes back many thousands of years with activities such as collecting obsidian and making jewelry, to more modern activities like writing books, building cars, reporting on news, and making movies.

      I think that this statistic becomes much less interesting the more you think about it. It makes sense that the more enfranchised users would use sites more, especially because they are the ones more likely to be receiving some sort of payment for their contributions. There is also a considerable barrier of anxiety that comes with posting personal content on social media (on Instagram it could be concerns about one's physical appearance or social standing; on Stack Overflow it could be concerns about being wrong or misleading) that becomes increasingly negligible the more experienced someone is with posting. I have friends who will try to workshop their tweets with me to make them funny, meanwhile I will tweet without a second thought because I've been doing it for years and don't feel the same pressure to make my tweets perfect.

    1. You probably already have some ideas of how crowds can work together on things like editing articles on a site like Wikipedia or answer questions on a site like Quora, but let’s look at some other examples of how crowds can work together.

      An interesting topic that isn't mentioned here is covert crowdsourcing. Huge quantities of data are created through crowdsourced tasks like CAPTCHAs, which often go unnoticed. This is closely tied to data mining, and it is an increasingly common practice as companies discover just how valuable these datasets are. I feel that this chapter frames crowdsourcing in a purely positive light, without discussing the ethical concerns that many implementations of it raise. (See the sketch below for how this covert labeling can work.)
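
      As a toy sketch of the covert-crowdsourcing idea (loosely modeled on the old reCAPTCHA scheme that digitized books), one known-answer image verifies the user while an unknown image quietly collects a crowd label. All names and data structures here are illustrative, not any real system's code.

      ```python
      # One image has a known answer (used for verification); the other is
      # unlabeled, and the user's answer is harvested as free training data.
      known = {"img_001": "house"}          # ground-truth answers
      unknown_votes = {"img_002": []}       # crowd labels collected covertly

      def submit(answers):
          """Verify the user on the known image; harvest a label for the unknown one."""
          for image_id, answer in answers.items():
              if image_id in known:
                  if answer != known[image_id]:
                      return False          # failed verification
              else:
                  unknown_votes[image_id].append(answer)  # crowdsourced label
          return True

      submit({"img_001": "house", "img_002": "bridge"})
      print(unknown_votes)                  # {'img_002': ['bridge']}
      ```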

  3. Oct 2024
    1. Inferred Data: Sometimes information that doesn’t directly exist can be inferred through data mining (as we saw last chapter), and the creation of that new information could be a privacy violation. This includes the creation of Shadow Profiles, which are information about the user that the user didn’t provide or consent to

      This is an interesting point to raise because it's one that seems exceedingly challenging to quantify. If Twitter sees that I frequently post about living in Seattle and shows me other posts about/from Seattle, even if I have turned location sharing off, is that creating a shadow profile? The nature of a lot of this data mining is that it results in a more personally tailored social media experience, which itself is more profitable for companies, making the ethical analysis of it complicated. On the far end of the spectrum, companies could store ZERO information about users (i.e. there are no accounts and everyone is anonymous), but this tends to result in much less usable platforms. I'd argue that, while there certainly should be more comprehensive regulations on data collection, a lot of inferred data is nothing that couldn't be figured out by a person looking at your account.
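
      To make the inference point concrete, here is a toy sketch of guessing a user's city purely from post text, with location sharing off. The keyword lists and function are illustrative inventions; real platforms draw on far richer signals (IP addresses, follow graphs, timing).

      ```python
      from collections import Counter

      # Hypothetical keyword lists standing in for real inference signals.
      CITY_KEYWORDS = {
          "Seattle": ["seattle", "puget sound", "space needle"],
          "Portland": ["portland", "pdx"],
      }

      def infer_city(posts):
          """Guess a likely home city from keyword mentions across posts."""
          hits = Counter()
          for post in posts:
              lowered = post.lower()
              for city, keywords in CITY_KEYWORDS.items():
                  hits[city] += sum(lowered.count(kw) for kw in keywords)
          if not hits:
              return None
          city, count = hits.most_common(1)[0]
          return city if count > 0 else None

      # Location sharing is off, but the text gives the answer away anyway.
      print(infer_city(["Rainy again in Seattle today",
                        "Best coffee near the Space Needle?"]))  # Seattle
      ```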

    1. But while that is the proper security practice for storing passwords, some companies fail to follow it. For example, Facebook stored millions of Instagram passwords in plain text, meaning the passwords weren’t encrypted and anyone with access to the database could simply read everyone’s passwords. And Adobe encrypted their passwords improperly, and then hackers leaked their password database of 153 million users.

      This is a hard problem to solve, because companies can generally benefit greatly from scraping all of our private messages. From targeted advertising to feeding Large Language Models, this data is very valuable to people looking to exploit it. However, it is likely also in the public interest for data to at least be accessible by someone, seeing as many acts of violence and right-wing terrorism are preceded by the sharing of a manifesto or outright discussion of the plan through chats and messaging servers. A solution could involve the FCC mandating data encryption techniques for which it holds all the keys, although this introduces many new problems and would never happen.
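
      For contrast with the plain-text storage described in the quote, here is a minimal sketch of the proper approach, salted password hashing, using only Python's standard library (PBKDF2 via hashlib). The function names are illustrative, not any platform's actual code.

      ```python
      import hashlib
      import hmac
      import os

      def hash_password(password):
          """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random salt."""
          salt = os.urandom(16)
          digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
          return salt, digest

      def verify_password(password, salt, digest):
          """Recompute the hash and compare in constant time."""
          candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
          return hmac.compare_digest(candidate, digest)

      salt, digest = hash_password("hunter2")
      print(verify_password("hunter2", salt, digest))   # True
      print(verify_password("wrong", salt, digest))     # False
      ```

      The point of the design is that the database never holds anything a leak or a curious employee could read back as a password, only salts and one-way digests.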

    1. Data can be poisoned intentionally as well. For example, in 2021, workers at Kellogg’s were upset at their working conditions, so they agreed to go on strike, and not work until Kellogg’s agreed to improve their work conditions. Kellogg’s announced that they would hire new workers to replace the striking workers:

      Another example of this sort of motivated data poisoning is the design of Nightshade and other anti-AI "washes": programs which add invisible overlays, aberrations, and metadata that corrupt what a generative image model learns from the affected image. These were largely introduced to help artists protect their own work and style from being scraped into AI training datasets without consent.
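
      As a rough illustration of the idea (not Nightshade's actual algorithm, which computes targeted, optimized perturbations), here is a toy sketch that adds low-amplitude random noise to an image: nearly invisible to a human, but it changes every pixel value a model would train on. Assumes numpy and Pillow are installed.

      ```python
      import numpy as np
      from PIL import Image

      def add_perturbation(path, out_path, epsilon=3):
          """Toy poisoning: add imperceptible uniform noise to an image."""
          img = np.asarray(Image.open(path).convert("RGB"), dtype=np.int16)
          noise = np.random.randint(-epsilon, epsilon + 1,
                                    size=img.shape, dtype=np.int16)
          # Clamp back to valid pixel range before saving.
          poisoned = np.clip(img + noise, 0, 255).astype(np.uint8)
          Image.fromarray(poisoned).save(out_path)

      # add_perturbation("artwork.png", "artwork_protected.png")
      ```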

    2. Additionally, spam and output from Large Language Models like ChatGPT can flood information spaces (e.g., email, Wikipedia) with nonsense, useless, or false content, making them hard to use or useless.

      In recent years, this has become an increasingly large concern for internet stability and longevity. Generative AI output has flooded nearly every corner of the internet, damaging search, the discoverability of human creators, and, ironically, the growth of AI itself. Because of this torrent of generated content, LLMs are now training on considerably lower-quality datasets. To preserve the models, training datasets must be rigorously maintained; to preserve a usable internet, LLM output must be identified and pruned quickly.
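
      As a minimal sketch of what "rigorously maintained" could mean in practice, here is a toy curation pass that drops documents flagged by a detector. `looks_machine_generated` is a hypothetical placeholder; real pipelines rely on trained classifiers, provenance metadata, and deduplication rather than phrase matching.

      ```python
      # Hypothetical detector: flags suspiciously templated phrasing.
      def looks_machine_generated(text):
          boilerplate = ("as an ai language model",
                         "in conclusion, it is important to note")
          lowered = text.lower()
          return any(phrase in lowered for phrase in boilerplate)

      def curate(corpus):
          """Keep only documents that pass the (hypothetical) detector."""
          return [doc for doc in corpus if not looks_machine_generated(doc)]

      docs = ["My trip to the coast, in photos.",
              "As an AI language model, I cannot..."]
      print(curate(docs))  # only the human-written document survives
      ```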

    1. And they also talked about the advertising motives behind supporting social causes (even if some employees do indeed support them), and the advertising motivation behind tweeting openly about how they are basing their decisions based on advertising.

      This can be a deceptive strategy which obfuscates authenticity. Brand accounts will tend to align themselves with trends and social causes which they believe are popular and uncontroversial. This can serve as an authentic metric for societal trends while also usually being an inauthentic representation of the company and its corporate "morals." As with many things on the internet, authentic patterns may emerge from a collection of independent, potentially inauthentic behavior.

    1. Does anonymity discourage authenticity and encourage inauthentic behavior?

      This is an interesting question that should be answered in two parts. In most cases, it seems that anonymity inspires authenticity among users. Anonymity gives users the freedom to express views and opinions that they may not be able to in the real world, which, in my anecdotal experience, is a much more common use of anonymity than pretending to hold other beliefs (i.e. a type of trolling). However, if the metric of authenticity is truth, then anonymity may do harm. When there is no person to blame for the spread of hateful rhetoric and/or conspiratorial ideation, it quickly becomes challenging to combat the spread of misinformation.

    1. Before this centralization of media in the 1900s, newspapers and pamphlets were full of rumors and conspiracy theories. And now as the internet and social media have taken off in the early 2000s, we are again in a world full of rumors and conspiracy theories.

      While it's reasonable to claim that a decentralized media system results in the proliferation of conspiracy theories, I think it's worth noting that conspiratorial press is also very common within centralized media systems. News networks and daytime TV alike have a tendency to report shocking and dubiously truthful (if not outright false and dangerous) news, often manufacturing outrage just the same.

    1. Friction is anything that gets in the way of a user performing an action. For example, if you have to open and navigate through several menus to find the privacy settings, that is significant friction. Or if one of the buttons has a bug and doesn’t work when you press it, so you have to find another way of performing that action, which is significant friction.

      It's very interesting to think about friction as an intentional design choice, which seems counterintuitive at first. On further thought, though, it seems like MORE friction should be added to social media. Services like Meta's Reels and TikTok have very low friction (even ads can be swiftly swiped past), which causes many people, including me, to get sucked into an endless scroll for dopamine.
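
      As a toy sketch of what deliberately added friction could look like, here is a simulated feed that interrupts the scroll every few items and demands an explicit choice to continue. The feed and all names are illustrative, not any platform's actual mechanism.

      ```python
      import itertools

      FRICTION_INTERVAL = 10  # items between "keep scrolling?" prompts

      def feed():
          """Simulate an endless feed of posts."""
          for i in itertools.count(1):
              yield f"post #{i}"

      def scroll(feed_items):
          for count, post in enumerate(feed_items, start=1):
              print(post)
              # Deliberate friction: break the automatic flow periodically.
              if count % FRICTION_INTERVAL == 0:
                  answer = input("Keep scrolling? [y/N] ")
                  if answer.strip().lower() != "y":
                      break

      if __name__ == "__main__":
          scroll(feed())
      ```

      The design choice is simply to make continuing an active decision rather than the default, the opposite of infinite scroll.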

    1. Metadata is information about some data. So we often think about a dataset as consisting of the main pieces of data (whatever those are in a specific situation), and whatever other information we have about that data (metadata).

      While metadata is information about data, I feel it is worth considering that the metadata itself is often interpreted by the audience as an equal part of the data. Alt text on a picture, the status of the author, and the popularity of the content are not only information about the content, but a part of the experience of it. This is often important for understanding social trends, as well as developing accessibility features.
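
      A minimal sketch of this framing: a post structure where the metadata travels with, and is experienced alongside, the main data. The field names here are illustrative, not any platform's actual schema.

      ```python
      from dataclasses import dataclass, field

      @dataclass
      class Post:
          image_url: str                     # the "main" data
          alt_text: str = ""                 # metadata that is also accessibility data
          author: str = ""                   # metadata that shapes how the post is read
          likes: int = 0                     # metadata that signals popularity
          tags: list = field(default_factory=list)

      post = Post(
          image_url="https://example.com/cat.jpg",
          alt_text="A gray cat asleep on a windowsill",
          author="@cat_pics_daily",
          likes=12_043,
          tags=["cats", "cozy"],
      )
      ```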