The release includes DeepSeek-V4-Pro (1.6T total / 49B active) and DeepSeek-V4-Flash (284B total / 13B active), both trained natively at 1M context length.
The sheer scale of DeepSeek V4 is striking, and it signals notable progress in long-context processing.