Hypothesis

4 Matching Annotations

Feb 2019
iphysresearch.github.io

A Paper A Day

1
1. Herb 26 Feb 2019
  
  in Public
  
  Interplay Between Optimization and Generalization of Stochastic Gradient Descent with Covariance Noise
  
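  An interesting fact: batch size affects both training convergence and model generalization. The larger the batch size, the better the convergence, but the worse the generalization...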
  
  batch-size
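One commonly cited reason for the trade-off noted above is that the mini-batch gradient is a noisy estimate of the full gradient, and that noise shrinks as the batch grows. The sketch below is a toy illustration only, not the paper's method: the quadratic loss, the Gaussian data, and the function name `minibatch_gradient_std` are all assumptions made for the example.

```python
import random
import statistics


def minibatch_gradient_std(batch_size, n_samples=2000, n_trials=500, seed=0):
    """Spread of the mini-batch gradient for the toy loss 0.5 * (w - x_i)^2 at w = 0."""
    rng = random.Random(seed)
    # Hypothetical dataset: scalar samples drawn from a Gaussian.
    data = [rng.gauss(1.0, 2.0) for _ in range(n_samples)]
    w = 0.0
    estimates = []
    for _ in range(n_trials):
        batch = rng.sample(data, batch_size)
        # Per-sample gradient of 0.5 * (w - x)^2 with respect to w is (w - x).
        grad = sum(w - x for x in batch) / batch_size
        estimates.append(grad)
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    for b in (1, 4, 16, 64):
        print(b, round(minibatch_gradient_std(b), 3))
```

In this toy setting the spread falls roughly as one over the square root of the batch size, so small batches inject more noise into each update. Whether that noise helps generalization is the claim the annotated paper studies; this sketch does not test it.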

Tags

batch-size

Annotators

Herb

URL

iphysresearch.github.io/paper_summary/APaperADay.html
Nov 2018
iphysresearch.github.io

A Paper A Day

2
1. Herb 21 Nov 2018
  
  in Public
  
  Revisiting Small Batch Training for Deep Neural Networks
  
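  In short, this paper argues that it may be better to keep mini-batch sizes as small as possible. I glanced at the paper I am writing right now and felt a small shiver: next time I had better pick a smaller batch size... [nose-picking emoji]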
  
  Optimization batch-size
2. Herb 20 Nov 2018
  
  in Public
  
  Don't Use Large Mini-Batches, Use Local SGD
  
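  Recently (August 2018), at a mathematics and systems science talk on advances in non-convex optimization, Dr. Li remarked that varying the learning rate is no longer much in favor. Instead, moving step by step from SGD to MGD and then to GD, that is, raising the batch size, gives better optimization results!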
  
  Optimization batch-size
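The schedule described in the second note (fixed learning rate, batch size grown from single-sample SGD toward full-batch GD) can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's or either paper's procedure: the 1-D linear regression task, the quadrupling factor, the learning rate, and the name `train_with_growing_batch` are all hypothetical.

```python
import random


def train_with_growing_batch(n_samples=256, epochs=8, lr=0.1, seed=0):
    """Fit y = w * x with a constant learning rate and a growing batch size."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]  # true slope is 3.0
    w = 0.0
    batch_size = 1  # start as pure SGD
    history = []
    for _ in range(epochs):
        order = list(range(n_samples))
        rng.shuffle(order)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            # Gradient of the mean of 0.5 * (w * x - y)^2 over the batch.
            grad = sum((w * xs[i] - ys[i]) * xs[i] for i in idx) / len(idx)
            w -= lr * grad  # learning rate stays fixed throughout
        history.append((batch_size, w))
        # Assumed schedule: quadruple the batch each epoch, capped at full batch (GD).
        batch_size = min(batch_size * 4, n_samples)
    return history


if __name__ == "__main__":
    for batch_size, w in train_with_growing_batch():
        print(batch_size, round(w, 4))
```

Growing the batch reduces update noise over time, much as a decaying learning rate would, while leaving the step size alone. The growth factor here is arbitrary and would need tuning on a real problem.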

Tags

Optimization

batch-size

Annotators

Herb

URL

iphysresearch.github.io/paper_summary/APaperADay.html
Oct 2018
iphysresearch.github.io

A Paper A Day

1
1. Herb 18 Oct 2018
  
  in Public
  
  Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
  
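  Characterising the training (convergence/generalization performance) of deep neural networks with an approximate Fisher information matrix, which can automatically optimize the mini-batch size/learning rate
  
  An interesting paper. It derives a new quantity from the Fisher matrix to measure model behavior during training and uses it to optimize mini-batch sizes and learning rates | The figures in the paper are also very nicely drawn | The authors argue that the conventional wisdom of gradually increasing batch sizes is only partially true: gradually decreasing the batch size may also improve a model's convergence and generalization.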
  
  Generalization batch-size learning rate fisher matrix model evaluation
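The note does not say how the paper builds its quantity from the Fisher matrix, so the sketch below shows only the generic textbook ingredient: a diagonal empirical Fisher estimate, computed as the mean of squared per-sample log-likelihood gradients. The logistic regression model, the synthetic data, and the name `empirical_fisher_diag` are assumptions for illustration, and the result should not be read as the paper's measure.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def empirical_fisher_diag(weights, features, labels):
    """Diagonal empirical Fisher: mean of squared per-sample gradients."""
    dim = len(weights)
    fisher = [0.0] * dim
    for x, y in zip(features, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        # For logistic regression, d/dw_j log p(y | x, w) = (y - p) * x_j.
        for j in range(dim):
            g = (y - p) * x[j]
            fisher[j] += g * g
    return [f / len(features) for f in fisher]


# Hypothetical synthetic data: two Gaussian features and a linear decision rule.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
labels = [1 if x[0] + 0.5 * x[1] > 0 else 0 for x in features]
diag = empirical_fisher_diag([0.0, 0.0], features, labels)

if __name__ == "__main__":
    print(diag, sum(diag))  # the sum is the trace of the diagonal estimate
```

A scalar summary such as the trace could be tracked over training steps for different batch sizes and learning rates, which is the kind of monitoring the note alludes to. The exact statistic the paper uses is not given here.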

Tags

learning rate

batch-size

model evaluation

fisher matrix

Generalization

Annotators

Herb

URL

iphysresearch.github.io/paper_summary/APaperADay.html
