215 Matching Annotations
  1. Sep 2018
    1. Generative models vs. discriminative models

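      Overall, when there are enough samples, a discriminative model achieves higher accuracy than a generative model.

      The biggest difference between generative and discriminative models is that a generative model makes many assumptions up front, for example that the data come from a Gaussian or Bernoulli distribution, or that the naive Bayes independence assumption holds. This amounts to fixing the hypothesis function set in advance; only on that basis can the parameters of the probability distribution be estimated.

      In other words, a generative model fills in a lot from its own assumptions. That may not sound like a good thing, but when you have too little data, the model needs some ability to fill in the gaps.

      A discriminative model depends heavily on the samples; it is conventional and rigid. A generative model is more imaginative: it can "imagine" samples that do not exist in the current sample set, so it depends less on the samples.

      For an example of imagining samples that do not exist in the current sample set, see the lecturer's example at 40:00 in this course.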
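
      The "imagination" point can be made concrete with a small naive Bayes calculation. This is only a sketch: the toy dataset below is an assumption chosen for illustration and may not match the lecturer's example, and the function names are hypothetical. The sample (1, 1) appears only in Class 1, yet the independence assumption lets the model treat (1, 1) as a plausible but unobserved Class 2 sample.

```python
from fractions import Fraction

# Hypothetical toy training set (an assumption, not taken from the notes):
# two binary features, one Class 1 sample and twelve Class 2 samples.
class1 = [(1, 1)]
class2 = [(1, 0)] * 4 + [(0, 1)] * 4 + [(0, 0)] * 4


def feature_prob(samples, index, value):
    """Maximum-likelihood estimate of P(x[index] == value | class)."""
    matches = sum(1 for s in samples if s[index] == value)
    return Fraction(matches, len(samples))


def naive_bayes_posterior_class1(x):
    """P(Class 1 | x) under the naive Bayes independence assumption."""
    total = len(class1) + len(class2)
    prior1 = Fraction(len(class1), total)
    prior2 = Fraction(len(class2), total)
    like1 = feature_prob(class1, 0, x[0]) * feature_prob(class1, 1, x[1])
    like2 = feature_prob(class2, 0, x[0]) * feature_prob(class2, 1, x[1])
    return prior1 * like1 / (prior1 * like1 + prior2 * like2)


posterior = naive_bayes_posterior_class1((1, 1))
print(posterior)  # 3/7
```

      The posterior for Class 1 is 3/7, below 0.5, so this model assigns (1, 1) to Class 2 even though no Class 2 sample looks like it. A model that relied only on the observed samples would probably not do that.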

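      A generative model does better than a discriminative model in the following situations:

      1. When the amount of data is small.
      2. When the data are noisy, i.e. the labels contain noise.
      3. When the prior probability and the class-dependent probability can be estimated from different sources.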

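      Clarifying the third advantage: the lecturer's example is speech recognition. The recognition component is a DNN, which is a discriminative model, yet the system as a whole is a generative model, and the DNN is only one piece of it. Why? Because you still need a prior probability, namely the probability that a given sentence is spoken, and estimating it does not require the samples to be audio: crawling a large amount of text dialogue from the web is enough. Only the class-dependent probability requires paired audio and text, and that is where the discriminative model, the DNN, comes in.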
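
      The split into two separately estimated terms can be written with Bayes' rule. This is a sketch in my own notation (not the lecture's), with $X$ the audio and $W$ a candidate sentence:

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \frac{P(X \mid W)\, P(W)}{P(X)}
        = \arg\max_{W} P(X \mid W)\, P(W)
```

      $P(X)$ does not depend on $W$, so it drops out of the maximization. $P(W)$ is the prior, which can be estimated from text alone; $P(X \mid W)$ is the class-dependent term, which needs paired audio and text.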

  2. Nov 2017
  3. Oct 2017
  4. Jun 2017
  5. May 2017
    1. Precision: It is a measure of the correctness of positive predictions, i.e. of the observations predicted as positive, how many are actually positive. Precision = TP / (TP + FP) Recall: It is a measure of how many actual positive observations are predicted correctly, i.e. of the observations in the positive class, how many are labeled positive. It is also known as ‘Sensitivity’. Recall = TP / (TP + FN)

      Example: In cancer research you may want higher recall, since you want all actual positive observations to be classified as true positives. A lower precision may be all right, because healthy people misclassified as cancerous can be identified later.
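
      The two formulas can be checked on a small example. This is a minimal sketch: the labels below are made-up screening results (1 = cancer, 0 = healthy), and `precision_recall` is a hypothetical helper, not a library function.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when there are no positive predictions/labels.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up data: every real case is caught, at the cost of two false alarms.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.6 1.0
```

      Here TP = 3, FP = 2 and FN = 0, so recall is perfect while precision drops to 0.6, which is the trade-off the cancer example describes.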

  6. Apr 2017
    1. if your goal is word representation learning, you should consider both NCE and negative sampling

      Wonder if anyone has compared these two approaches

  7. Feb 2017
  8. Dec 2015
    1. some of the deep learning libraries we may look at later in the class

      What C++ libraries are used?

  9. Apr 2015
    1. Same as above but the pre-trained vectors are fine-tuned for each task.

      How?

      Backpropagating to the input layer, changing the vector representation with the training examples?
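
      One common reading of "fine-tuned" is what the question guesses: the word vectors are treated as trainable parameters and receive gradients like any other weight. The quoted source does not confirm this, so the following is only a sketch under that assumption. The model (a logistic classifier over averaged word vectors), the numbers, and every name are made up for illustration.

```python
import math

# Hypothetical 2-dimensional "pre-trained" vectors (made-up numbers).
embeddings = {
    "good": [0.5, -0.2],
    "bad": [-0.4, 0.3],
    "movie": [0.1, 0.1],
}
weights = [0.3, -0.1]  # classifier weights, also made up
learning_rate = 0.5


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(words, label):
    """One SGD step on the logistic loss; updates the classifier and the word vectors."""
    n = len(words)
    # Forward pass: average the word vectors, then apply a logistic classifier.
    avg = [sum(embeddings[w][d] for w in words) / n for d in range(2)]
    pred = sigmoid(sum(wt * a for wt, a in zip(weights, avg)))
    error = pred - label  # derivative of the logistic loss w.r.t. the pre-sigmoid score
    # Backward pass: the gradient w.r.t. the averaged input is error * weights,
    # computed before the weights themselves are updated.
    grad_avg = [error * wt for wt in weights]
    for d in range(2):
        weights[d] -= learning_rate * error * avg[d]
    # Each word in the sentence receives 1/n of the input gradient.
    for w in words:
        for d in range(2):
            embeddings[w][d] -= learning_rate * grad_avg[d] / n


before = {w: list(v) for w, v in embeddings.items()}
train_step(["good", "movie"], 1)
print(embeddings["good"] != before["good"])  # True: a word in the example moved
print(embeddings["bad"] == before["bad"])    # True: an unseen word is untouched
```

      Only the vectors of words that occur in the training example change. Under this reading, keeping the vectors fixed would simply mean skipping the final loop.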

  10. Sep 2014