13 Matching Annotations
  1. Jan 2026
    1. We must work together to create a more rigorous understanding of what these technologies do (and don’t do) rather than developing value statements (and laws) that buy into corporate fiction.

      Great conclusion -- it's important to recognize when a business tries to sell you something, showcasing it as a product: magic you never had in your life before...

    2. These will not be fixed by scraping larger, equally poorly vetted datasets from the same biased sources — unfortunately, hate scales.

      That actually connects to the point from the previous lecture that larger datasets might bring more problems.

    3. However, the tool is not creative, nor does the use of a tool infuse a person with creativity by automating the outcome of a creative process.

      The tool is just trained on human artworks to replicate them :)

    4. LLM responses are statistically likely rather than factually accurate. Sometimes these things correspond, but often they do not.

      It helps to know the basics of how LLMs answer questions: each word is analyzed as a token (imagine a set of numbers), and the model tries to find the most statistically related next words. That's why LLMs can claim something as factual data, but when you tell them they are wrong, they admit it and deeply apologize as part of their politeness training.
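
The "statistically likely rather than factually accurate" point can be sketched with a toy bigram model. This is only an illustration, not how a real LLM works (real models use neural networks over token embeddings); the tiny corpus is invented and deliberately repeats a popular misconception to show that frequency in the training data, not truth, drives the output.

```python
from collections import Counter

# Invented corpus: the misconception ("sydney") appears more often
# than the actual fact ("canberra"), as it might on the open web.
corpus = (
    "the capital of australia is sydney . "
    "the capital of australia is sydney . "
    "the capital of australia is canberra . "
).split()

# Count how often each word follows each other word.
bigrams = Counter(zip(corpus, corpus[1:]))

def most_likely_next(word):
    """Return the statistically most frequent continuation of `word`."""
    candidates = {nxt: n for (w, nxt), n in bigrams.items() if w == word}
    return max(candidates, key=candidates.get)

print(most_likely_next("is"))  # "sydney" -- frequent in the data, but wrong
```

The model confidently outputs the most common continuation; it has no notion of whether that continuation is true.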

    5. writing a letter on behalf of his daughter to an inspiring Olympic athlete.

      From a parenting and educational perspective, the daughter would write an awesome and authentic letter to the Olympic athlete herself. However, the real problem isn't that the letter needs to be perfect, but that the father is prioritizing efficiency over dedicating time to helping her learn and express herself.

    1. non-profit include the incentives of the organization, the transparency of the model, and ethical considerations for LLMs

      It was interesting to learn about non-profit LLMs that are accessible to anyone. I had never heard of them before.

    2. was taken down after producing convincing but false scientific articles

      A great connection to the LAION-5B statement that the data isn't filtered, and that it might take over 100 years (forgive me if I'm wrong) to filter all of the data that gets scraped while training the models.

    3. “tedious and time-consuming editing tasks”

      Love this idea of equity in research communities; however, help with editing and other time-consuming tasks can be used by everyone. I mean that there is no specific "marginalized" group that uniquely needs this. All researchers, regardless of language, level of expertise, and gender, use LLMs for tedious and time-consuming editing.

    4. Based on self-reporting, we found that researchers who are non-White, non-native English speaking, and junior researchers both use LLMs more frequently and also perceive higher benefits and lower risks. As people with these demographics traditionally tend to face certain structural barriers, our findings suggest that LLMs can help with improving research equity.

      I think this might be due to the simple reason that most researchers nowadays are non-White and non-native English speakers. However, I can't reason out why they perceive higher benefits and lower risks. Maybe simply because they have more experience with LLMs and prompting?

    1. Datasets have human-oriented stories behind them and implicit within them, and the stories of how and why data was created ought to be integrally connected to the datasets themselves.

      Reinforcing the three focuses of the information sciences: people, technology, and information.

    2. each dataset’s construction, history, quirks, flaws, and strengths from different humanistic perspectives.

      It is a big help for data novices because a lot of the time these details aren't provided.

    3. students to use datasets that they find on websites like Kaggle

      Just today I watched a video lecture by Professor Wickes on her research about STEM education for non-STEM people. Basically, she raised the same problem about Kaggle: it's a place full of perfect datasets of a kind you rarely see in real work, for example at a company. Learning with Kaggle omits essential parts of computing and data education: data cleaning and preparation.
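
A minimal sketch of the cleaning and preparation step that curated Kaggle-style datasets let you skip. All rows, column names, and values here are invented for illustration; the point is only the kinds of problems real data arrives with: inconsistent casing, stray whitespace, missing values coded as text, numbers stored as strings, and duplicates.

```python
# Hypothetical messy "real-world" rows, as they might arrive at a company.
raw = [
    {"city": "Chicago",  "sales": "100"},
    {"city": "chicago ", "sales": "100"},   # casing + whitespace + duplicate
    {"city": "Boston",   "sales": "n/a"},   # missing value coded as text
    {"city": None,       "sales": "250"},   # missing city entirely
]

def clean(rows):
    seen, out = set(), []
    for row in rows:
        if row["city"] is None:             # drop rows with no city
            continue
        city = row["city"].strip().title()  # normalize whitespace and casing
        try:
            sales = float(row["sales"])     # numbers arrive as strings
        except ValueError:
            sales = None                    # "n/a" becomes a real missing value
        key = (city, sales)
        if key in seen:                     # drop exact duplicates
            continue
        seen.add(key)
        out.append({"city": city, "sales": sales})
    return out

print(clean(raw))  # two rows survive: Chicago/100.0 and Boston/None
```

On a Kaggle dataset this function would have nothing to do, which is exactly the gap in the learning experience.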