7 Matching Annotations
  1. Last 7 days
    1. Anthropic, who accused DeepSeek, Moonshot, and MiniMax of distilling from Claude's outputs.

      Anthropic publicly accused DeepSeek, Moonshot AI, and MiniMax of distilling data from Claude's outputs, a striking business-ethics incident. The deeper implication is that these Chinese companies have been pushed into a "parasitic catch-up" strategy, using Claude as a "free teacher" to compress training costs. This is both a portrait of the technical reality and a hint at the competitive logic of lacking a compute advantage: when you cannot afford to train a better model, you borrow one someone else has already trained.

  2. Mar 2025
    1. method and madness by [[Alan Jacobs]]

      via In which I describe my writing “methods.” by [[Alan Jacobs]]

      reply:

      @ayjay Thanks for sharing this. My method is often very much like yours: lots of internal distillation, slowly over time. I remember hearing a story that Mozart wrote music "like a cow pees" (in one giant and immediate flood, and then done). I suspect that large works of writing, composing, etc. springing forth, as if fully formed from the head of Zeus, are more common than is acknowledged. Cory Doctorow hints at a similar sort of method in his own work in The Memex Method. I'm also reminded of what neuroscientist Barbara Oakley calls "diffuse thinking" and of a more internalized version of Michael Ondaatje's "thinkering" described in The English Patient.

  3. Oct 2024
  4. Sep 2024
  5. Jul 2023
  6. Apr 2022
    1. On Zettel 9/8a2 he called the Zettelkasten "eine Klärgrube" or a "septic tank" (perhaps even a "cesspool"). Waste goes in, and gets separated from the clearer stuff.

      Niklas Luhmann analogized his zettelkasten to a septic tank. You put in a lot of material, much of it seemingly waste, and the system allows a process of settling and filtering that separates out the waste and lets the clearer material distill into something useful.

  7. Nov 2018
    1. Dataset Distillation

      A new application of knowledge distillation. Distillation itself usually aims at "model compression", transferring knowledge so that a small model can be trained quickly; this paper instead targets "data compression", distilling a dataset down while still achieving good distillation results. A sketch of the standard loss it builds on follows below.
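
      For context on the contrast the note draws, here is a minimal sketch of the standard knowledge-distillation loss (Hinton et al., 2015) that dataset distillation repurposes toward data rather than models. The function name, temperature `T`, and weight `alpha` are illustrative assumptions, not details from the annotated paper.

      ```python
      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
          """Hinton-style knowledge-distillation loss (sketch).

          Blends a soft-target KL term (the teacher's temperature-softened
          distribution) with ordinary hard-label cross-entropy. T and alpha
          are illustrative hyperparameters, not values from the paper.
          """
          # Soft targets: KL divergence between the softened student and
          # teacher distributions; the T**2 factor keeps this term's
          # gradients at a magnitude comparable to the hard-label term.
          soft = F.kl_div(
              F.log_softmax(student_logits / T, dim=-1),
              F.softmax(teacher_logits / T, dim=-1),
              reduction="batchmean",
          ) * (T * T)
          # Hard targets: standard cross-entropy against the true labels.
          hard = F.cross_entropy(student_logits, labels)
          return alpha * soft + (1 - alpha) * hard

      # Illustrative usage with random logits for a 10-class problem.
      student_logits = torch.randn(8, 10)
      teacher_logits = torch.randn(8, 10)
      labels = torch.randint(0, 10, (8,))
      loss = distillation_loss(student_logits, teacher_logits, labels)
      ```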