35 Matching Annotations
  1. Sep 2025
  2. Aug 2025
    1. The best way to do this is by first using tesseract to get OCR text in whatever languages you might feel are in there, using langdetect to find what languages are included in the OCR text and then run OCR again with the languages found.

      how about the accuracy?
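The two-pass procedure in the highlight can be sketched as below. This is a minimal sketch, not the original author's code: `two_pass_ocr`, `tesseract_backends`, `candidate_langs`, and `code_map` are hypothetical names, and the sketch assumes `pytesseract` and `langdetect` are installed if the real backends are used. Tesseract expects its own language codes (for example `eng`, `fra`) while langdetect returns ISO 639-1 codes (for example `en`, `fr`), so a mapping between the two is needed. The sketch says nothing about accuracy; measuring that would require ground-truth text to compare against.

```python
def two_pass_ocr(image, ocr, detect, candidate_langs, code_map):
    """Run OCR with all candidate languages, detect which ones actually
    appear, then run OCR again restricted to those languages.

    ocr(image, langs) -> str      langs is a '+'-joined Tesseract code string
    detect(text) -> iterable      of langdetect-style codes such as 'en'
    code_map                      langdetect code -> Tesseract code (assumed mapping)
    """
    first_pass = ocr(image, "+".join(candidate_langs))
    if not first_pass.strip():
        # Nothing recognised, so there is nothing to run detection on.
        return first_pass

    found = []
    for code in detect(first_pass):
        lang = code_map.get(code)
        if lang is not None and lang not in found:
            found.append(lang)

    if not found or set(found) == set(candidate_langs):
        # No usable detection, or no narrowing: a second pass would not help.
        return first_pass
    return ocr(image, "+".join(found))


def tesseract_backends():
    """Wire in the real libraries (imported lazily; both must be installed)."""
    import pytesseract
    from langdetect import detect_langs

    def ocr(image, langs):
        return pytesseract.image_to_string(image, lang=langs)

    def detect(text):
        # detect_langs returns candidates with .lang and .prob attributes.
        return [candidate.lang for candidate in detect_langs(text)]

    return ocr, detect
```

With the real backends, a call would look like `ocr, detect = tesseract_backends()` followed by `two_pass_ocr(img, ocr, detect, ["eng", "fra", "deu"], {"en": "eng", "fr": "fra", "de": "deu"})`, where `img` is a PIL image and the language list is whatever you suspect the page contains.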

    1. As is observed in the above results, after an nn.Sequential instance is scripted using the torch.jit.script function, computing performance is improved through the use of symbolic programming.

      But it takes more time?

    1. The photorealistic text-to-image examples in Fig. 11.9.5 suggest that the T5 encoder alone may effectively represent text even without fine-tuning.

      Shouldn't there be another network between T5 and the output?

    1. Note that $h$ heads can be computed in parallel if we set the number of outputs of linear transformations for the query, key, and value to $p_q h = p_k h = p_v h = p_o$.

      Can't they be computed in parallel if these are not equal?
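One way to read the condition in the highlight: when every head has the same width, a single linear projection of width p_o can be computed once and then cut into h equal slices, one per head. The sketch below illustrates only that reshaping step with plain lists; `linear`, `split_heads`, and the toy numbers are hypothetical, and it is not a claim that unequal head sizes could never be parallelised by other means.

```python
def linear(x, weight):
    """Multiply vector x (length d) by a weight matrix stored as d rows of p_o columns."""
    return [
        sum(x[i] * weight[i][j] for i in range(len(x)))
        for j in range(len(weight[0]))
    ]


def split_heads(projected, num_heads):
    """Cut one projected vector of length p_o into num_heads equal slices."""
    p_o = len(projected)
    if p_o % num_heads != 0:
        raise ValueError("p_o must be divisible by the number of heads")
    head_dim = p_o // num_heads  # plays the role of p_q = p_o / h
    return [projected[i * head_dim:(i + 1) * head_dim] for i in range(num_heads)]


# Toy numbers (assumed for illustration): d = 2 inputs, p_o = 4 outputs, h = 2 heads.
x = [1.0, 2.0]
W_q = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
]
queries = split_heads(linear(x, W_q), num_heads=2)  # one query slice per head
```

Because all slices have the same length, the per-head computations have identical shapes and can be stacked into one batched operation.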

    1. In the case of a (scalar) regression with observations $(\mathbf{x}_i, y_i)$ for features and labels respectively, $\mathbf{v}_i = y_i$ are scalars, $\mathbf{k}_i = \mathbf{x}_i$ are vectors, and the query $\mathbf{q}$ denotes the new location where $f$ should be evaluated.

      Are x_i and q equal?
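A small sketch may help with the question in the note. In kernel regression read as attention, the keys are the training features x_i, the values are the labels y_i, and the query q is any point where a prediction is wanted; it does not have to coincide with an x_i. The Gaussian kernel, the scalar features, and the data below are assumptions made for illustration (the highlight allows vector-valued keys).

```python
import math


def gaussian_kernel(q, k):
    """Similarity between a query and a key; larger when they are closer."""
    return math.exp(-0.5 * (q - k) ** 2)


def kernel_regression(q, keys, values):
    """Estimate f(q) as a similarity-weighted average of the values."""
    weights = [gaussian_kernel(q, k) for k in keys]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


keys = [0.0, 1.0, 2.0]    # k_i = x_i, the observed features
values = [0.0, 1.0, 4.0]  # v_i = y_i, the observed labels
estimate = kernel_regression(1.5, keys, values)  # q = 1.5 is not one of the x_i
```

Here the query sits between two training points, so the estimate is pulled mostly toward their labels, with a smaller contribution from the more distant point.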

    1. Using word-level tokenization, the vocabulary size will be significantly larger than that using character-level tokenization, but the sequence lengths will be much shorter.

      the sequence lengths?
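"Sequence length" here means the number of tokens a piece of text turns into. The sketch below tokenizes one sentence both ways; `tokenize` and the sample sentence are hypothetical. On a single short sentence the word vocabulary can be smaller than the character vocabulary, so the vocabulary-size claim only shows up on a realistic corpus; the sequence-length difference is visible immediately.

```python
def tokenize(text, level):
    """Split text into word tokens (on whitespace) or character tokens."""
    if level == "word":
        return text.split()
    if level == "char":
        return list(text)
    raise ValueError("level must be 'word' or 'char'")


text = "the time machine by h g wells the time traveller"

word_tokens = tokenize(text, "word")
char_tokens = tokenize(text, "char")

# Sequence length: how many tokens the same text becomes.
word_seq_len = len(word_tokens)
char_seq_len = len(char_tokens)

# Vocabulary: the set of distinct tokens seen.
word_vocab = set(word_tokens)
char_vocab = set(char_tokens)

print(f"word level: {word_seq_len} tokens, {len(word_vocab)} distinct")
print(f"char level: {char_seq_len} tokens, {len(char_vocab)} distinct")
```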

    1. While we can use the chain rule to compute $\partial h_t/\partial w_h$ recursively, this chain can get very long whenever $t$ is large. Let’s discuss a number of strategies for dealing with this problem.

      I don't understand why this substitution is allowed.
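If the substitution in question is the step that unrolls the recursive gradient, the following sketch may help. It assumes the hidden state is defined as h_t = f(x_t, h_{t-1}, w_h), which the highlighted passage does not restate, and the shorthand a_t, b_t, c_t is introduced here only to make the recurrence easier to see.

```latex
% Chain rule applied to h_t = f(x_t, h_{t-1}, w_h): w_h enters directly
% and also indirectly through h_{t-1}.
\frac{\partial h_t}{\partial w_h}
  = \frac{\partial f(x_t, h_{t-1}, w_h)}{\partial w_h}
  + \frac{\partial f(x_t, h_{t-1}, w_h)}{\partial h_{t-1}}
    \frac{\partial h_{t-1}}{\partial w_h}

% Shorthand: a_t is the left-hand side, b_t the direct term,
% c_t the factor multiplying the previous gradient.
a_t = b_t + c_t \, a_{t-1}, \qquad a_0 = 0

% Substituting the recurrence into itself repeatedly:
a_t = b_t + \sum_{i=1}^{t-1} \Bigl( \prod_{j=i+1}^{t} c_j \Bigr) b_i
```

For t = 3 this gives a_3 = b_3 + c_3 b_2 + c_3 c_2 b_1, which is what you get by plugging a_2 and then a_1 into the recurrence by hand. The product of c_j factors is the part that grows long when t is large.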

    1. Having a small value for this upper bound might be viewed as good or bad. On the downside, we are limiting the speed at which we can reduce the value of the objective. On the bright side, this limits by just how much we can go wrong in any one gradient step.
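For reference, a sketch of the kind of bound this passage appears to discuss. It assumes the objective f is Lipschitz continuous with constant L and that one update moves the parameters by η g (learning rate η, gradient g); none of these symbols appear in the highlighted text.

```latex
% Change in the objective after one gradient step of size \eta:
|f(\mathbf{x}) - f(\mathbf{x} - \eta \mathbf{g})| \leq L \, \eta \, \|\mathbf{g}\|
```

Under these assumptions the right-hand side is the upper bound in question: it caps both the progress and the damage a single step can cause.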