Hypothesis

3 Matching Annotations

Last 7 days
x.com

https://x.com/shao__meng/status/2042016334574448858

1
1. fxp007 16 Apr 2026
  
  in Public
  
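  31B QLoRA: runs in 22GB of VRAM
  
  Surprisingly, a 31-billion-parameter model now needs only 22GB of VRAM to run. That is far less than traditional methods require, and it makes running large language models on consumer-grade hardware possible, democratizing AI access.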
  
  surprising model-compression democratization-ai
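
To put the 22GB figure in context, here is a rough sketch of how much memory the weights alone would occupy at different precisions. It assumes the base model is stored at 4 bits per parameter, which is how QLoRA is usually configured; the annotated post does not state the bit width here, and `weight_memory_gb` is a hypothetical helper, not part of any library.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 31e9  # "31B" from the annotated post

# Bit widths are illustrative; 4-bit is an assumption about the QLoRA setup.
PRECISIONS = [("fp32", 32), ("fp16/bf16", 16), ("int8", 8), ("4-bit (assumed)", 4)]

for label, bits in PRECISIONS:
    print(f"{label:>16}: {weight_memory_gb(NUM_PARAMS, bits):6.1f} GB")
```

Under that assumption the 4-bit weights take about 15.5 GB, so the rest of the 22GB budget would presumably go to adapter weights, optimizer state, and activations. That split is a guess, not something the post reports.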

x.com

https://x.com/i/web/status/2044553427448172741

1
1. fxp007 16 Apr 2026
  
  in Public
  
  The era of 1-bit LLMs is here — now with WebGPU acceleration!
  
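  Surprisingly, the arrival of the 1-bit LLM era means each parameter needs only 1 bit of storage. Compared with the traditional 32-bit floating-point representation, this is a major breakthrough in model compression, and combined with WebGPU acceleration it raises AI compute efficiency by tens of times.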
  
  surprising model-compression webgpu
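
As a toy illustration of what "1 bit per parameter" means for storage, the sketch below keeps only the sign of each weight and packs eight signs into one byte. It is a minimal example under that assumption, not the scheme used by the model in the post; `pack_signs` and `unpack_signs` are hypothetical helpers, and real 1-bit models also store scaling factors that are omitted here.

```python
def pack_signs(weights):
    """Pack the sign of each weight into one bit (1 = non-negative, 0 = negative)."""
    packed = bytearray((len(weights) + 7) // 8)
    for i, w in enumerate(weights):
        if w >= 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def unpack_signs(packed, n):
    """Recover +1.0 / -1.0 values from the first n packed bits."""
    return [1.0 if (packed[i // 8] >> (i % 8)) & 1 else -1.0 for i in range(n)]

weights = [0.7, -1.2, 0.1, -0.3, 2.0, -0.5, 0.0, 0.9, -0.4, 0.6]
packed = pack_signs(weights)

fp32_bytes = 4 * len(weights)  # 32 bits per weight
print(f"fp32: {fp32_bytes} bytes, packed signs: {len(packed)} bytes")
print(unpack_signs(packed, len(weights)))
```

For long weight vectors the packed size approaches 1/32 of the 32-bit size, which is where the storage saving in the annotation comes from.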

www.tomtunguz.com

https://www.tomtunguz.com/gemma-4-vs-gpt-4o/

1
1. fxp007 16 Apr 2026
  
  in Public
  
  In 23 months, the same capability that needed 1.8 trillion parameters now fits in 4 billion parameters. A 450x compression.
  
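  Surprisingly, AI model parameter counts shrank 450x in just 23 months, which means powerful models that once required a supercomputer can now run entirely on a phone. This pace of progress far outstrips Moore's Law and shows striking advances in algorithmic optimization and model compression.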
  
  surprising ai-compression model-optimization
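
The 450x figure and the Moore's Law comparison can be checked with a few lines of arithmetic. The parameter counts and the 23-month span are taken from the quoted passage; the 24-month doubling period used for Moore's Law is an assumption (the commonly cited figure), and the variable names are illustrative.

```python
import math

LARGE_PARAMS = 1_800_000_000_000  # "1.8 trillion parameters" as quoted
SMALL_PARAMS = 4_000_000_000      # "4 billion parameters" as quoted
MONTHS = 23

# Ratio of parameter counts claimed for the same capability.
ratio = LARGE_PARAMS / SMALL_PARAMS

# How often the parameter count would have to halve to reach that ratio.
halving_months = MONTHS / math.log2(ratio)

# Assumption: Moore's Law as a doubling every 24 months.
moore_factor = 2 ** (MONTHS / 24)

print(f"parameter ratio: {ratio:.0f}x")
print(f"implied halving time: {halving_months:.1f} months")
print(f"Moore's Law factor over {MONTHS} months: {moore_factor:.2f}x")
```

On these numbers the parameter count halves every few months, while a 24-month doubling cadence yields a bit under 2x over the same period.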

