84 Matching Annotations
  1. Nov 2024
    1. Bytecode Is Smaller The bytecode generated by SQLite is usually smaller than the corresponding AST coming out of the parser. During initial processing of SQL text (during the call to sqlite3_prepare() and similar) both the AST and the bytecode exist in memory at the same time, so more memory is used then. But that is a transient state. The AST is quickly discarded and its memory recycled.

      Does SQLite even need to construct an AST? It's just SQL. Can't it just emit the bytecode directly?
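
The bytecode the quoted passage refers to can be inspected with SQLite's `EXPLAIN` prefix, which returns the prepared statement's program as rows. A minimal sketch using Python's standard `sqlite3` module; the table `t` and the query are made up for illustration, and the exact opcodes differ between SQLite versions:

```python
import sqlite3

def explain(sql):
    """Return the bytecode listing for `sql` as (addr, opcode) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, only here so the query has something to compile against.
        conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
        rows = conn.execute("EXPLAIN " + sql).fetchall()
        # EXPLAIN rows start with the instruction address and the opcode name.
        return [(row[0], row[1]) for row in rows]
    finally:
        conn.close()

if __name__ == "__main__":
    for addr, opcode in explain("SELECT b FROM t WHERE a > 1"):
        print(addr, opcode)
```

This only shows the end product of `sqlite3_prepare()`; it says nothing either way about whether the intermediate AST could be skipped.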

  2. Sep 2024
  3. Jan 2024
    1. You can do this with recursive descent, but it’s a chore.

      Jonathan Blow recently revisited this topic with Casey Muratori. (They last talked about this 3 years ago.)

      What's a little absurd is that (a) the original discussion is something like 3–6 hours long and doesn't use recursive descent (instead they descended into some madness about trying to work out from first principles how to special-case operator precedence), (b) they start out in this video pooh-poohing people who speak about "recursive descent", saying that it's just a really obnoxious way to say "writing ordinary code" (again, all this after they went out of their way three years ago to not "just" write "normal" code), and (c) they do this while launching into yet another 3+ hour discussion about how to do it right, in a better, less confusing way this time, with Jon explaining that he spent "6 or 7 hours" working through this "like 5 days ago". Another really perverse thing is that when he talks about Bob's other post (Parsing Expressions) that ended up in the Crafting Interpreters book, he calls it stupid because it's doing "a lot" for something so simple. Again: this is to justify spending 12 hours working out the vagaries of precedence levels and reviewing a bunch of papers instead of just spending, I dunno, 5 or 10 minutes or so doing it with recursive descent (the cost of which mostly comes down to just typing it in).

      So which one is the real chore? Doing it the straightforward, fast way, or going off and indulging an unrestrained impulse to special-case arithmetic expressions (and a handful of other types of operations), as if someone were going to throw you off a building if you didn't treat them differently from all your other ("normal") code?

      Major blind spots all over.
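
For a sense of how much typing "just doing it with recursive descent" involves, here is a minimal sketch of an arithmetic evaluator with one function per precedence level. The grammar, the class name `Parser`, and the helper names are my own choices for illustration, not anything from the video or from Bob's posts, and it evaluates as it goes instead of building a tree:

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split the input into integer and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    # expr := term (('+' | '-') term)*      lowest precedence, left-associative
    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    # term := factor (('*' | '/') factor)*  binds tighter than + and -
    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    # factor := NUMBER | '-' factor | '(' expr ')'
    def factor(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "-":
            return -self.factor()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Precedence falls out of which function calls which; adding a level means adding one more function of the same shape.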

  4. Dec 2023
    1. Python is both a compiled and interpreted language

      The CPython interpreter really is an interpreter. But it also is a compiler. Python must go through a few stages before ever running the first line of code:

      1. scanning
      2. parsing

      Older versions of Python added an additional stage:

      1. scanning
      2. parsing
      3. checking for valid assignment targets

      Let’s compare this to the stages of compiling a C program:

      1. ~~preprocessing~~
      2. lexical analysis (another term for “scanning”)
      3. syntactic analysis (another term for “parsing”)
      4. ~~semantic analysis~~
      5. ~~linking~~
    2. next stage is parsing (also known as syntactic analysis) and the parser reports the first error in the source code. Parsing the whole file happens before running the first line of code which means that Python does not even see the error on line 1 and reports the syntax error on line 2.
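
The behaviour described in that highlight can be reproduced with the built-in `compile()`, which parses the whole source before anything runs. A small sketch; the two-line `SOURCE` string and the helper `first_error` are made up for illustration:

```python
# Line 1 would fail at run time (division by zero); line 2 fails to parse.
SOURCE = "1 / 0\nx = = 2\n"

def first_error(source):
    """Return (exception name, line number or None) for the first error hit."""
    try:
        # Parsing covers the whole source, so a syntax error anywhere wins.
        code = compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return ("SyntaxError", exc.lineno)
    try:
        exec(code, {})
    except Exception as exc:
        return (type(exc).__name__, None)
    return (None, None)

if __name__ == "__main__":
    print(first_error(SOURCE))
```

With the syntax error on line 2 removed, the same helper reports the `ZeroDivisionError` from line 1 instead.
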
    1. Recap

      In this article you started implementing your own version of Python. To do so, you needed to create four main components:

      A tokenizer:

      • accepts strings as input (supposedly, source code);
      • chunks the input into atomic pieces called tokens;
      • produces tokens regardless of whether their sequence makes sense.

      A parser:

      • accepts tokens as input;
      • consumes the tokens one at a time, while making sure they come in an order that makes sense;
      • produces a tree that represents the syntax of the original code.

      A compiler:

      • accepts a tree as input;
      • traverses the tree to produce bytecode operations.

      An interpreter:

      • accepts bytecode as input;
      • traverses the bytecode and performs the operation that each one represents;
      • uses a stack to help with the computations.

    2. The parser is the part of our program that accepts a stream of tokens and makes sure they make sense.
    3. The four parts of our program
      • Tokenizer takes source code as input and produces tokens;
      • Parser takes tokens as input and produces an AST;
      • Compiler takes an AST as input and produces bytecode;
      • Interpreter takes bytecode as input and produces program results.
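
The four parts listed above can be sketched end to end for a toy language of integer addition and subtraction. This is my own minimal illustration, not the article's code; the token kinds, tuple-shaped AST, and the `PUSH`/`ADD`/`SUB` operation names are all assumptions:

```python
def tokenize(source):
    """Tokenizer: source text -> list of (kind, value) tokens."""
    tokens, i = [], 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("INT", int(source[i:j])))
            i = j
        elif ch in "+-":
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens

def parse(tokens):
    """Parser: tokens -> AST of nested tuples, e.g. ("+", ("int", 1), ("int", 2))."""
    def number(pos):
        if pos >= len(tokens) or tokens[pos][0] != "INT":
            raise SyntaxError("expected an integer")
        return ("int", tokens[pos][1]), pos + 1

    tree, pos = number(0)
    while pos < len(tokens):
        kind, op = tokens[pos]
        if kind != "OP":
            raise SyntaxError("expected an operator")
        right, pos = number(pos + 1)
        tree = (op, tree, right)  # left-associative
    return tree

def compile_tree(tree):
    """Compiler: AST -> flat list of bytecode operations (post-order walk)."""
    if tree[0] == "int":
        return [("PUSH", tree[1])]
    op, left, right = tree
    opcode = "ADD" if op == "+" else "SUB"
    return compile_tree(left) + compile_tree(right) + [(opcode, None)]

def interpret(bytecode):
    """Interpreter: run the bytecode on a stack and return the result."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left - right)
    return stack.pop()

def run(source):
    return interpret(compile_tree(parse(tokenize(source))))
```

Note that `tokenize("+ +")` succeeds, since the tokenizer does not care whether the sequence makes sense; it is `parse` that rejects it.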
  5. Nov 2023
    1. A more efficient but more complicated way to simulate perfect guessing is to guess both options simultaneously

      NB: Russ is talking here about flattening the NFA into a DFA that has enough synthesized states to represent, e.g., "in either state A or state B". He's not talking about CPU-level concurrency. But what if he were?
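
A minimal sketch of "guessing both options simultaneously": carry a set of NFA states forward one input character at a time, so each set plays the role of one synthesized DFA state. The transition table below is a hand-built, hypothetical NFA for `(a|b)*abb`, not something taken from Russ's article:

```python
# Hypothetical NFA for the regular expression (a|b)*abb.
# State 0 on "a" may either stay in 0 or move to 1: that is the "guess".
TRANSITIONS = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START, ACCEPT = 0, 3

def matches(text):
    """Simulate the NFA by tracking every state it could be in at once."""
    current = {START}
    for ch in text:
        following = set()
        for state in current:
            following |= TRANSITIONS.get((state, ch), set())
        current = following
        if not current:
            return False  # no guess survives this character
    return ACCEPT in current
```

The work per character is bounded by the number of NFA states, and nothing here runs in parallel: the "simultaneity" is just set membership.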

  6. Oct 2023
  7. Sep 2023
  8. Jul 2023
    1. ```js
      /*
       * twitter-entities.js
       * This function converts a tweet with "entity" metadata
       * from plain text to linkified HTML.
       *
       * See the documentation here: http://dev.twitter.com/pages/tweet_entities
       * Basically, add ?include_entities=true to your timeline call
       *
       * Copyright 2010, Wade Simmons
       * Licensed under the MIT license
       * http://wades.im/mons
       *
       * Requires jQuery
       */

      function escapeHTML(text) {
          return $('<div/>').text(text).html()
      }

      function linkify_entities(tweet) {
          if (!(tweet.entities)) {
              return escapeHTML(tweet.text)
          }

          // This is very naive, should find a better way to parse this
          var index_map = {}

          $.each(tweet.entities.urls, function(i,entry) {
              index_map[entry.indices[0]] = [entry.indices[1], function(text) {return "<a href='"+escapeHTML(entry.url)+"'>"+escapeHTML(text)+"</a>"}]
          })

          $.each(tweet.entities.hashtags, function(i,entry) {
              index_map[entry.indices[0]] = [entry.indices[1], function(text) {return "<a href='http://twitter.com/search?q="+escape("#"+entry.text)+"'>"+escapeHTML(text)+"</a>"}]
          })

          $.each(tweet.entities.user_mentions, function(i,entry) {
              index_map[entry.indices[0]] = [entry.indices[1], function(text) {return "<a title='"+escapeHTML(entry.name)+"' href='http://twitter.com/"+escapeHTML(entry.screen_name)+"'>"+escapeHTML(text)+"</a>"}]
          })

          var result = ""
          var last_i = 0
          var i = 0

          // iterate through the string looking for matches in the index_map
          for (i=0; i < tweet.text.length; ++i) {
              var ind = index_map[i]
              if (ind) {
                  var end = ind[0]
                  var func = ind[1]
                  if (i > last_i) {
                      result += escapeHTML(tweet.text.substring(last_i, i))
                  }
                  result += func(tweet.text.substring(i, end))
                  i = end - 1
                  last_i = end
              }
          }

          if (i > last_i) {
              result += escapeHTML(tweet.text.substring(last_i, i))
          }

          return result
      }
      ```

  9. May 2023
  10. Apr 2023
  11. Mar 2023
    1. absolute gem of a book, I use it for my compilers class: https://grugbrain.dev/#grug-on-parsing

      I didn't realize recursive descent was part of the standard grugbrain catechism, too, but it makes sense. Grugbrain gets it right again.

      Not unrelated—I always liked Bob's justification for using Java:

      I won't do anything revolutionary[...] I'll be coding in Java, the vulgar Latin of programming languages. I figure if you can write it in Java, you can write it in anything.

      https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/

  12. Feb 2023
  13. Jan 2023
    1. The usefulness of JSON is that while both systems still need to agree on a custom protocol, it gives you an implementation for half of that custom protocol - ubiquitous libraries to parse and generate the format, so the application needs only to handle the semantics of a particular field.

      To be clear: when PeterisP says parse the format, they really mean lex the format (and do some minimal checks concerning e.g. balanced parentheses). To "handle the semantics of a particular field" is a parsing concern.
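
A small sketch of that split, assuming a made-up message with a `retry_after` field: `json.loads` takes care of the format, and everything after it is the application deciding what a particular field means. The field name and its rules are hypothetical:

```python
import json

def read_retry_after(payload):
    """Return the retry delay in seconds from a JSON message."""
    # The library's half: text -> dicts, lists, strings, numbers.
    message = json.loads(payload)

    # The application's half: what a well-formed message actually means.
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    value = message.get("retry_after")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("retry_after must be a non-negative integer")
    return value
```

Both `{"retry_after": 30}` and `{"retry_after": "soon"}` get through `json.loads` without complaint; only the second half of the function tells them apart.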

  14. Dec 2022
  15. Nov 2022
  16. Oct 2022
  17. Sep 2022
  18. Aug 2022
  19. Jun 2022
  20. Dec 2021
  21. Nov 2021
  22. Jun 2021
    1. while (( "$#" )); do
         case "$1" in
           -a|--my-boolean-flag)
             MY_FLAG=0
             shift
             ;;
           -b|--my-flag-with-argument)
             if [ -n "$2" ] && [ ${2:0:1} != "-" ]; then
               MY_FLAG_ARG=$2
               shift 2
             else
               echo "Error: Argument for $1 is missing" >&2
               exit 1
             fi
             ;;
           -*|--*=) # unsupported flags
             echo "Error: Unsupported flag $1" >&2
             exit 1
             ;;
           *) # preserve positional arguments
             PARAMS="$PARAMS $1"
             shift
             ;;
         esac
       done
       # set positional arguments in their proper place
       eval set -- "$PARAMS"
  23. Mar 2021
  24. Dec 2020
    1. A good library for parsing and for automating tests of "markup with JS"

  25. Oct 2020
    1. Parsing HTML has significant overhead. Being able to parse HTML statically, ahead of time can speed up rendering to be about twice as fast.
  26. Aug 2020
  27. Jul 2020
    1. JSON parsing is always a pain in the ass. If the input is not as expected, it throws an error and crashes whatever you are doing. You can use the following tiny function to safely parse your input. It always returns an object, even if the input is not valid or is already an object, which is better for most cases.

      It would be nicer if the parse method provided an option to do it safely and always fall back to returning an object instead of raising an exception if it couldn't parse the input.
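
The "tiny function" itself isn't part of the highlight. A rough Python analogue of what it describes might look like the sketch below; the name `safe_parse` and the choice to treat any non-object result (lists, numbers, `null`) as a failure are my assumptions:

```python
import json

def safe_parse(value):
    """Always return a dict: the parsed object, the input itself, or {}."""
    if isinstance(value, dict):
        return value  # already an object, nothing to parse
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        # TypeError: input is not str/bytes; ValueError: malformed JSON.
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

The trade-off is that the caller can no longer tell "empty object" from "garbage input", which is presumably why the standard parsers raise instead.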

    1. It does, however, provide the --porcelain option: git status --porcelain prints its output in an easy-to-parse format for scripts, and that format will remain stable across Git versions and regardless of user configuration.
    2. Parsing the output of git status is a bad idea because the output is intended to be human readable, not machine-readable. There's no guarantee that the output will remain the same in future versions of Git or in differently configured environments.
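
A minimal sketch of consuming the porcelain (version 1) format, where each line is a two-character status code, a space, and a path. The function names are hypothetical, and this deliberately ignores the harder cases (renames shown as `old -> new`, quoted paths), for which the `-z` variant is the safer input:

```python
import subprocess

def parse_porcelain(output):
    """Turn `git status --porcelain` text into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        status, path = line[:2], line[3:]
        entries.append((status, path))
    return entries

def git_status(repo_dir="."):
    """Run git in `repo_dir` and parse its porcelain output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)

if __name__ == "__main__":
    sample = " M README.md\n?? new.txt\nA  src/a.py\n"
    for status, path in parse_porcelain(sample):
        print(repr(status), path)
```
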
  28. May 2020
  29. Feb 2020
  30. Nov 2019
  31. Sep 2019
  32. Aug 2019
  33. Dec 2016
    1. ἀνιάτῳ

      This word is composed of the alpha privative and the root for "healing," which is where my name comes from! ἰάομαι (pres mid) takes the future in the active, the participle of which is ἰάσων, meaning something like "being about to heal." Therefore, ἀνίατος is something that is incurable. It may refer to wickedness that cannot be atoned for, to a person who is unforgivably evil.