15 Matching Annotations
  1. Jul 2023
  2. Dec 2022
    1. The attention distribution is usually generated with content-based attention. The attending RNN generates a query describing what it wants to focus on. Each item is dot-producted with the query to produce a score, describing how well it matches the query. The scores are fed into a softmax to create the attention distribution.

      This is the Key, Value, Query, yes?
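
      Roughly, yes: in this content-based scheme the attended items act as both the keys and the values, while the attending RNN's state supplies the query. A minimal numpy sketch of the dot-product-plus-softmax scoring described above (function and variable names are illustrative, not from the article):

      ```python
      import numpy as np

      def content_based_attention(query, keys, values):
          # Score each item against the query with a dot product.
          scores = keys @ query                    # shape: (num_items,)
          # A softmax turns the scores into an attention distribution.
          weights = np.exp(scores - scores.max())
          weights = weights / weights.sum()
          # Read out a weighted sum of the values under that distribution.
          return weights @ values, weights

      # Toy example: 4 items with 3-dimensional keys and values.
      rng = np.random.default_rng(0)
      keys = rng.normal(size=(4, 3))
      values = rng.normal(size=(4, 3))
      query = rng.normal(size=3)
      context, attn = content_based_attention(query, keys, values)
      ```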

  3. Nov 2021
    1. To review, the Forget gate decides what is relevant to keep from prior steps. The input gate decides what information is relevant to add from the current step. The output gate determines what the next hidden state should be. Code Demo: For those of you who understand better through seeing the code, here is an example using Python pseudo code.
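
      The post's own Python pseudo-code demo is not reproduced here; below is a minimal numpy sketch of one LSTM cell step organised around those three gates (biases omitted for brevity; `lstm_cell_step` and the weight names are illustrative, not the post's code).

      ```python
      import numpy as np

      def sigmoid(x):
          return 1.0 / (1.0 + np.exp(-x))

      def lstm_cell_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o):
          # Look at the previous hidden state and the current input together.
          z = np.concatenate([h_prev, x_t])
          f = sigmoid(W_f @ z)        # forget gate: what to keep from prior steps
          i = sigmoid(W_i @ z)        # input gate: what to add from the current step
          c_cand = np.tanh(W_c @ z)   # candidate cell state
          c_t = f * c_prev + i * c_cand
          o = sigmoid(W_o @ z)        # output gate: what the next hidden state should be
          h_t = o * np.tanh(c_t)
          return h_t, c_t

      # One step with a 3-dimensional input and a 2-dimensional hidden/cell state.
      rng = np.random.default_rng(0)
      W = lambda: rng.normal(size=(2, 5))   # hidden (2) + input (3) = 5 columns
      h, c = lstm_cell_step(rng.normal(size=3), np.zeros(2), np.zeros(2), W(), W(), W(), W())
      ```
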
  4. Sep 2021
    1. These results nonetheless show that it could be feasible to develop recurrent neural network models able to infer input-output behaviours of real biological systems, enabling researchers to advance their understanding of these systems even in the absence of detailed level of connectivity.

      Too strong a claim?

  5. Aug 2021
  6. Jan 2021
  7. May 2020
    1. Teacher forcing is a training technique that is applicable to RNNs that have connections from their output to their hidden states at the next time step. (Left) At train time, we feed the correct output y(t) drawn from the train set as input to h(t+1). (Right) When the model is deployed, the true output is generally not known. In this case, we approximate the correct output y(t) with the model's output o(t), and feed the output back into the model.

      Teacher forcing: a strategy to parallelize training; see the sketch below.
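
      A minimal numpy sketch of the train-time vs. deploy-time distinction in the quoted passage, assuming a network whose only recurrence is from the output back into the next hidden state (names such as `unroll` are illustrative):

      ```python
      import numpy as np

      def unroll(xs, ys, W_xh, W_oh, W_hy, teacher_forcing=True):
          # Output-to-hidden RNN: the signal fed into h(t+1) is the true target
          # y(t) at train time (teacher forcing) or the model's own output o(t)
          # at deploy time, when the true output is not known.
          feedback, outputs = np.zeros_like(ys[0]), []
          for x_t, y_t in zip(xs, ys):
              h_t = np.tanh(W_xh @ x_t + W_oh @ feedback)   # hidden state
              o_t = W_hy @ h_t                              # output at this step
              outputs.append(o_t)
              feedback = y_t if teacher_forcing else o_t
          return outputs
      ```

      With `teacher_forcing=True` the feedback at every step comes from the training set rather than from the model, so for this kind of network the time steps no longer depend on one another and can in principle be computed in parallel, which is the point of the note above.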

    2. Parameter sharing makes it possible to extend and apply the model to examples of different forms (different lengths, here) and generalize across them. If we had separate parameters for each value of the time index, we could not generalize to sequence lengths not seen during training, nor share statistical strength across different sequence lengths and across different positions in time. Such sharing is particularly important when a specific piece of information can occur at multiple positions within the sequence.

      RNNs have the same parameters at every time step. This lets the model generalize the inferred "meaning" even when it is inferred at different steps; a short sketch follows below.
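
      A minimal numpy sketch of that sharing, assuming a plain tanh RNN: one set of weights applied at every time step, so the same function handles sequence lengths it never saw during training (names are illustrative):

      ```python
      import numpy as np

      def rnn_forward(xs, h0, W_xh, W_hh):
          # The same W_xh and W_hh are reused at every position, so any
          # sequence length is accepted, including lengths unseen in training.
          h = h0
          for x_t in xs:
              h = np.tanh(W_xh @ x_t + W_hh @ h)
          return h

      # Identical parameters process a length-3 and a length-7 sequence.
      rng = np.random.default_rng(0)
      W_xh, W_hh = rng.normal(size=(5, 4)), rng.normal(size=(5, 5))
      h_short = rnn_forward(rng.normal(size=(3, 4)), np.zeros(5), W_xh, W_hh)
      h_long = rnn_forward(rng.normal(size=(7, 4)), np.zeros(5), W_xh, W_hh)
      ```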

  8. Jan 2019
    1. MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

      This paper is quite impressive: it uses a GAN for anomaly detection on time-series data. The most striking part is that both G and D are built purely from LSTM-RNNs! It deserves my attention not only for that, but also because the model can inspire my own thinking about "non-template gravitational-wave detection"!

  9. Dec 2018
    1. Seeing in the dark with recurrent convolutional neural networks

      At a glance, some of the results look very close to my own paper, and for me this paper has a great deal worth borrowing! It also shows that recurrency (recurrent-memory-like units) has a genuinely necessary role to play in pattern recognition.

  10. Jul 2015