738 Matching Annotations
  1. Jan 2017
    1. The Websocket URLs provided by rtm.start are single-use and are only valid for 30 seconds, so make sure to connect quickly. If you connect successfully the first event received will be a hello:

      The temporary WebSocket URLs solve several problems. One is that they work around the fact that clients cannot supply Authorization headers directly using the browser's WebSocket API.

    1. However, remember that Click is Unicode-based, so the string will always be a Unicode value
    1. Income for services performed outside the United States is foreign-source and not subject to U.S. tax for a nonresident alien, even if paid by a U.S. employer or payer.
    2. If the recipient is a nonresident alien and the compensation payment is U.S.-source income, you must withhold 30 percent (called “NRA withholding”) unless an exception applies.
    1. One of the biggest questions I have from students as they use Hypothesis Groups is how to save annotations to the group, as opposed to Public/Private annotations

      As a developer working on Hypothesis, this is useful to know. Is there anything you think we could do to make this easier?

    1. C. Scott - I think the fuzzy matching is the part that's not in annotator.js

      The process of mapping quotes to DOM Ranges is referred to in the Hypothesis client as "anchoring". The code for anchoring quotes with fuzzy matching has been split out into a separate library with minimal dependencies - https://github.com/tilgovi/dom-anchor-text-quote

    2. DOM tree and UX muddled together in the code base.

      I presume "DOM tree" means "the logic which links annotations to the corresponding section of the page and renders highlights"? If so then we're currently discussing how to separate that out from the code which provides the sidebar.

      If there is anything we can do to make the frontend (or parts of the frontend) of Hypothesis useful for Wikipedia projects, please let us know here or on the dev mailing list (https://groups.google.com/a/list.hypothes.is/forum/#!forum/dev )

    1. TL;DR If window.opener is set, a page can trigger a navigation in the opener regardless of security origin.
    1. Automating your deployment helps reduce the frictions and delays that crop up in between getting the software "done" and getting it to realize its value

      A good summary of the value of automated deployment.

    1. H - Happiness

      E - Engagement

      A - Adoption

      R - Retention

      T - Task success

      More resources can be found on the site of the original paper's primary author: http://www.rodden.org/kerry/heart/

    1. Thank you for your query. I'm afraid that while improving the accessibility (particularly keyboard navigability) of Hypothesis is high on our list of priorities, it's still a little rough in places, and we don't have an accessibility report at the moment.

      I'm commenting here because the ticket is closed (and consequently locked), but we do have an accessibility report from August 2016 focused only on the client at https://docs.google.com/document/d/1aV4yOqR-rbBjy0t4z3cbmDfz4zKmJMt_YEXjYthGODQ/edit?ts=5745a68c

  2. Dec 2016
    1. There are Elm plugins for at least the following editors:

      For Vim users, in addition to the plugin listed, there is also https://github.com/ElmCast/elm-vim , which provides autocompletion, integration with elm-format (for automatic formatting), elm-oracle (for code completion) and more.

    1. A useful guide on how to implement new Effect Managers. It was linked from elm-dev and presumably written for Elm 0.17.

      I haven't checked how much of this is still applicable to Elm 0.18.

    1. Description of a technique applicable to most (all?) modern browsers for achieving high performance animation of an element from an initial state to some destination state.

    1. <img src="https://media.giphy.com/media/hDSy8w6rGHeTe/giphy.gif" ng-click="alert('foo')">

      Image with Angular attribute directive.

    2. <img src="https://media.giphy.com/media/hDSy8w6rGHeTe/giphy.gif" onclick="alert('foo')">

      Image tag with inline event handler

    3. <script>alert('foo')</script>

      Script tag

    4. <style>body { display: none }</style>

      Style tag

    5. <img src="https://media.giphy.com/media/OUjcFvpzMzlGU/giphy.gif">

      Image tag.

    6. <i>Hello World</i>

      Italic text

    7. <b>Hello World</b>

      Bold text

    1. The action IDL attribute must reflect the content attribute of the same name, except that on getting, when the content attribute is missing or its value is the empty string, the element's node document's URL must be returned instead

      If my understanding of this is correct, a <form> with a missing "action" attribute should return the document URL when form.action is read. This does not appear to be the case in current versions of Chrome and Firefox, both of which return undefined.

      When the "action" attribute is present but empty, however, they do return the current document URL. Internet Explorer instead returns the result of resolving an empty string against the document base URL (eg. if the page URL is /account/settings then form.action would return /account if the action attribute is present but empty).

    1. This describes an interesting hack using iframes and document.write to enable incremental parsing and rendering of HTML loaded via an AJAX request.

    1. This is somewhat true, but TypeScript suffers from the exact same problem Java does - it is not null safe.

      In other words, null and undefined are assignable to every type. This was fixed in TypeScript 2.0 via the strictNullChecks option.

  3. Nov 2016
    1. On PDFs, positions are expressed in relation to a physical printed document. Each position is measured from the bottom left corner of a page and expressed in “points” (one of which equals 1/72 of an inch on a printed page). Conversely, positions in the viewport are measured from the top left of the viewer’s viewport and expressed in pixels.

      In other words, the Y axis values start at 0 at the bottom of the page and increase going upwards to the top of the page - the opposite of most familiar graphics systems.
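
      A minimal sketch of the axis flip described above. The function name, the `scale` parameter (pixels per point) and the decision to ignore the page's offset within the viewport are all assumptions made for illustration, not part of any particular PDF viewer's API.

```python
def pdf_to_viewport(x_pt, y_pt, page_height_pt, scale=1.0):
    """Map a PDF position to viewport pixels.

    PDF space: origin at the bottom-left of the page, Y grows upwards,
    units are points (1/72 inch).
    Viewport space: origin at the top-left, Y grows downwards, units are
    pixels. `scale` is a hypothetical pixels-per-point zoom factor.
    """
    x_px = x_pt * scale
    # Flip the Y axis: distance from the top is page height minus
    # distance from the bottom.
    y_px = (page_height_pt - y_pt) * scale
    return x_px, y_px


# A point one inch from the left and one inch below the top edge of a
# US Letter page (612 x 792 points).
print(pdf_to_viewport(72, 720, page_height_pt=792))
```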

    1. Firefox still has pixel-rounding errors in SVG, though the icon font had the same issue
    1. You can see we’ve landed on directly injecting the SVGs directly in our page markup

      We copied this approach for the Hypothesis web service by implementing a Jinja extension

    1. Interesting dive into how string slicing with String#substring is implemented in the Dart and V8 VMs and the performance consequences of that. This investigation was prompted by poor performance of a port of the less.js lexer to Dart vs. the original JS implementation.

      The article ends with benchmarks showing the cost of trying to match sequences of characters in a lexer using a regex vs. manually.

    2. A person with a bit more insight into RegExp features might come up with the following optimization:

      Neat trick for matching regular expressions within a string starting at a fixed position using pre-ES6 features:

      1. Create a regex with the global flag set which matches pattern|() where () is an irrefutable pattern which is guaranteed to match
      2. Set regex.lastIndex to the position you want to match at
      3. Use regex.exec(str)
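
      The note above is about JavaScript. As a rough sketch of the same idea in Python, where Pattern.search(text, pos) stands in for setting lastIndex and calling exec, the always-matching empty alternative stops the search from scanning forward past the start position. The helper name match_at is made up for this example. Python also offers Pattern.match(text, pos), so the trick is not needed there.

```python
import re


def match_at(pattern, text, pos):
    """Return the text matched by `pattern` exactly at `pos`, or None."""
    # Append an empty group as a fallback alternative. It always matches
    # at `pos`, so search() never moves on to later positions.
    anchored = re.compile("(?:" + pattern + ")|()")
    m = anchored.search(text, pos)
    # The fallback is the last group. If it took part in the match, the
    # real pattern failed at `pos`.
    if m.group(anchored.groups) is not None:
        return None
    return m.group(0)


print(match_at(r"\d+", "abc123", 3))
print(match_at(r"\d+", "abc123", 0))
```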
    3. match can be easily implemented in any modern JavaScript interpreter that supports sticky RegExp flag introduced in ES6:

      Notes on how to match a regex starting at a given position in a string, making use of the sticky flag introduced in ES6.

    1. Summary: Displaying faceted-search controls on mobile devices in a ‘tray’ overlay is a new effective solution to the challenge of showing both results and filters on small screens.
    1. During mobile e-commerce usability study Baymard Institute observed that more than 50% of users tried to “search within” their currently navigated category path, in an attempt to “filter the product list on my screen with a search query”. However, 94% of mobile e-commerce sites and apps do not support such behavior.
    1. This was referenced by Google's Alex Russell no less when berating web developers for stuffing their apps/sites with too much JS and making users on slow networks/slower devices suffer from bad user experiences - https://twitter.com/slightlylate/status/799276604912377858

    1. Vorlon.JS is a tool for remotely debugging JavaScript on any device. Use involves running a Vorlon debugging server on the machine hosting the web service and adding a script tag containing the client runtime to the page that you want to debug.

    1. > raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info) E TransportError: TransportError(503, u'ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/2/no master];]')

      Any idea what happened here?

  4. calteches.library.caltech.edu
    1. This is a criticism by Richard Feynman of maths textbooks that he reviewed while on the California State Curriculum Commission.

      I came across it from the Pedagogy section of the Wikipedia article on Feynman.

      The complaints he makes are:

      • That the books teach only very specific approaches to solving specific problems, which does not encourage the freedom of thought necessary for making use of maths in the real world.
      • That the books unnecessarily phrase problems and explanations using the precise language of pure mathematics, when the problem could and should be stated in a way that laymen can understand
      • That technical terms are introduced without actually explaining the associated concepts and facts.
  5. Oct 2016
    1. Minimum Image Size The minimum image size is 200 x 200 pixels. If you try to use an image smaller than this you will see an error in the Sharing Debugger.
    1. Lessons learned:

      • The web platform provides a set of features which are useful for implementing modal overlays which appear above other content, trap focus etc.
      • CSS outline properties can be used to create borders around elements that do not take up space but which do prevent clicks going through to elements underneath
    1. A set of specs that allow mixins to be created natively in CSS. I learned about this from a talk by developers from Vaadin at the 2016 Polymer Summit

    1. This formula says that delay (D), the wait time for service or the time in queue, is a function of the utilization and the speed of the servicing entity.

      As utilization approaches 100%, the delay approaches infinity. Ignoring the Ts scaling factor the graph looks like this:
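
      The formula is not quoted in the highlight. Assuming it is the common single-server form D = Ts * U / (1 - U), with Ts the service time and U the utilization (an assumption, not confirmed by the text above), the shape of the curve can be tabulated with a short sketch:

```python
def queue_delay(utilization, service_time=1.0):
    """Average delay under the ASSUMED formula D = Ts * U / (1 - U).

    `utilization` is a fraction in [0, 1); `service_time` is Ts.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time * utilization / (1 - utilization)


# Delay grows slowly at first, then very steeply as U nears 100%.
for u in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"U = {u:.2f}  D = {queue_delay(u):6.1f} x Ts")
```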

    2. For example, there is a formula that will predict the average time you will wait in a grocery checkout line if everyone waited in a single line that fed multiple cashiers - like the model used in banks.

      Possibly one of the motivations for the way queues leading to automated tills are arranged?

  6. Sep 2016
    1. Currently the word arsenal is filled with kinetic energy. When used in the sentence, "Jane has an arsenal of information,

      Look ma, I'm annotating the stream!

    1. One key difference here is that, in Haskell (not in all languages), if one or both arguments are negative, the results of mod will have the same sign as the divisor, while the result of rem will have the same sign as the dividend

      In other languages:

      • Python and Ruby: % is the modulus operator, same as mod in Haskell
      • JavaScript, Rust, Go: % is a remainder operator, so same as rem
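
      A small sketch of the difference in Python. The helper names are made up for this example. Python's % follows the divisor's sign, while math.fmod follows the dividend's sign, so the latter can stand in for rem.

```python
import math


def mod_like(a, b):
    """Result takes the sign of the divisor, like Haskell's mod."""
    return a % b


def rem_like(a, b):
    """Result takes the sign of the dividend, like Haskell's rem
    (and like % in JavaScript, Rust and Go)."""
    return int(math.fmod(a, b))


for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, mod_like(a, b), rem_like(a, b))
```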

    1. In this section, we're going to look at an example of how a type gets made, identified as a monad and then given the appropriate Monad instance.

      Summary of steps for making a Monad instance:

      1. Understand conceptually what it would mean to put a value in a default context and to chain two operations in a particular context
      2. For chaining operations, it can be helpful to take advantage of the law that m >>= f ≡ join (fmap f m) - in other words, the bind operator is the same as mapping the function over the values in the monadic context and then flattening the result
      3. Write an instance Monad T declaration which defines the return, >>= and (optionally) fail operations
      4. Check that the monadic laws hold:
        1. return x >>= f ≡ f x
        2. m >>= return ≡ m
        3. f <=< (g <=< h) ≡ (f <=< g) <=< h
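
      As a language-neutral sanity check, the laws can be spot-checked with Python lists standing in for Haskell's list monad. The names unit, bind, join, fmap and kleisli are stand-ins chosen for this sketch, and checking a few sample values is not a proof.

```python
def unit(x):
    """Stand-in for Haskell's return on lists."""
    return [x]


def bind(m, f):
    """Stand-in for >>= on lists."""
    return [y for x in m for y in f(x)]


def join(mm):
    """Flatten one level of nesting."""
    return [x for m in mm for x in m]


def fmap(f, m):
    return [f(x) for x in m]


def kleisli(f, g):
    """Stand-in for f <=< g: run g first, then f."""
    return lambda x: bind(g(x), f)


f = lambda x: [x, x + 1]
g = lambda x: [x * 2]
h = lambda x: [x, -x]
m = [1, 2, 3]

assert bind(m, f) == join(fmap(f, m))   # m >>= f is join (fmap f m)
assert bind(unit(5), f) == f(5)         # left identity
assert bind(m, unit) == m               # right identity
assert kleisli(f, kleisli(g, h))(3) == kleisli(kleisli(f, g), h)(3)  # associativity
```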
    1. Speaking about Brexit, Ms James warned "we have only just won a heat" in a "28-member state Olympic competition to leave the European Union".

      This is an annotation that has not been deleted

    1. The broad proposal here, making replies distinct things from annotations so we can optimize based on how we want to store/query/retrieve them differently from annotations, sounds pretty sensible.

      My suggestion for moving forwards would be to identify the restrictions that we might add (eg. replies must have the same access controls as their parent) and then introduce those artificially (eg. via UI changes in the client, validation in the API) as the first step. Then we can figure out if we're breaking anyone's workflow before we commit.

    2. would also considerably reduce the amount of work the client needs to do to render annotation threads.

      If by "work" you refer to number of API calls needed to get the relevant data, then yes - we want to make as few of those as possible.

      If by "work" you mean computation or code complexity, then at least in our client, the code that takes the result from the API and builds a conversation thread is pretty simple. The JWZ code/algorithm that we used to use just made it seem complex.

    3. This approach only delivers substantial operational benefit if we can assume that replies have the same access control restrictions as their thread root

      Can you clarify this? I don't immediately see why we couldn't switch to the proposed design and support private replies on public annotations for example. It would require us to filter the conversation thread that is returned to the user, and if we don't really need that feature we shouldn't build it, but I don't see it as an obviously severe obstacle.

    4. Or could we just restrict the size of a conversation for now?

      What is the largest conversation we currently have in our DB, in terms of total number of characters and number of threads? My guess is that we could get away with imposing an arbitrary limit on the size of a conversation for now.

    5. This seems like it’d be much easier if reply threads weren’t arbitrarily nested…

      My personal view on this is that arbitrary levels of nesting are primarily a problem from a UI design perspective - because any piece of UX has to accommodate this, whereas designing for a fixed maximum depth would be easier.

      Sheetal pointed out that Facebook (and I think at least one other platform) supported nested replies but only to a depth of two.

      If we wanted to experiment with imposing a limit on the thread depth without initially committing to it, one option is to change the thread-building algorithm in the client, eg. behind a flag.
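
      A rough sketch of what a depth cap in a thread builder could look like. The data model here ((item_id, parent_id) pairs, parents listed before children) and the function name are hypothetical. This is not the Hypothesis client's actual algorithm.

```python
def build_thread(items, max_depth=2):
    """Place each item in a thread, capping the displayed nesting depth.

    `items` is a list of (item_id, parent_id) pairs with parent_id None
    for the thread root, and parents listed before their children.
    Returns {item_id: (display_parent_id, depth)}.
    """
    placed = {}
    for item_id, parent_id in items:
        if parent_id is None:
            placed[item_id] = (None, 0)
            continue
        parent_display, parent_depth = placed[parent_id]
        if parent_depth < max_depth:
            placed[item_id] = (parent_id, parent_depth + 1)
        else:
            # The parent is already at the limit, so show this reply as
            # a sibling of its parent instead of nesting further.
            placed[item_id] = (parent_display, parent_depth)
    return placed


chain = [("a", None), ("r1", "a"), ("r2", "r1"), ("r3", "r2"), ("r4", "r3")]
print(build_thread(chain, max_depth=2))
```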

    6. Searching for threads that contain a given term (including in replies) is easy, but identifying which bit of the thread matched seems harder. I’m not yet sure how we’d do this

      To check I understand the requirements here, you want to be able to index a conversation thread (annotation + all replies) as one ES document, and then in response to a query, return a data structure which contains the IDs of matching conversations plus the IDs of matching items (annotation or original reply) within those conversations?

      So this is essentially the same problem as say, finding out which page matched if you were indexing multi-page documents?

      Presumably ES can store position information with indexed terms. In that case here is one possible approach: Take all of the original items in the thread and serialize them into a single string - which is indexed with positional information, and separately the offsets of each item within that string are stored as a non-indexed field.

      eg:

      "content" field: annotation content | first reply | second reply
      "offsets" field: <first reply ID>:<offset of first reply>,<second reply ID>:<offset of second reply>
      

      When a search query is received, an ES query is performed to find the matching documents and get the offsets of matches within the "content" field. These offsets are then looked up in the "offsets" field to get the IDs of the matching items within each thread.
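
      A minimal sketch of the offset bookkeeping in plain Python, with no Elasticsearch calls. It assumes the search backend can report the character offset of each match within the "content" field. The function names and the " | " separator are illustrative only.

```python
from bisect import bisect_right


def build_document(items, sep=" | "):
    """Serialize (item_id, text) pairs into one string.

    Returns (content, offsets), where offsets is a list of
    (start_position, item_id) sorted by start position.
    """
    parts, offsets, pos = [], [], 0
    for item_id, text in items:
        offsets.append((pos, item_id))
        parts.append(text)
        pos += len(text) + len(sep)
    return sep.join(parts), offsets


def item_at(offsets, match_pos):
    """Return the ID of the item whose text contains `match_pos`."""
    starts = [start for start, _ in offsets]
    index = bisect_right(starts, match_pos) - 1
    return offsets[index][1]


content, offsets = build_document([
    ("annotation", "annotation content"),
    ("reply-1", "first reply"),
    ("reply-2", "second reply"),
])
print(item_at(offsets, content.find("second")))
```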

  7. Aug 2016
    1. CharacterLimitController uses the "input" event which I believe is supported by all the browsers that we support

      Yes it is. IE 10 and up is the baseline. Since we are building the site in a way that we always have an HTML fallback, we can actually even build upgrades on a finer-grained basis - ie. have particular enhancements that don't apply unless a browser has certain features.

    1. It's the same as pure, only with a different name

      The Monad class now has a default implementation for return that uses pure, so the minimal definition is now just >>=

    2. Shouldn't there be a class constraint in there along the lines of class (Applicative m) => Monad m where so that a type has to be an applicative functor first before it can be made a monad?
    1. Isn't that just peachy? Now what if we wanted to make the tuple an instance of Functor in such a way that when we fmap a function over a tuple, it gets applied to the first component of the tuple?

      nb. fmap-ing a tuple applies the function to the second value of the pair because in order to make a 2-tuple the right kind to be used as a functor (* -> *), it has to be partially applied (ie. instance Functor ((,) x)) See http://stackoverflow.com/a/34604238/434243

    2. We can also use the deriving keyword with newtype just like we would with data. We can derive instances for Eq, Ord, Enum, Bounded, Show and Read.

      What is the syntax that a derived Read parses when reading a list?

    1. You can use the 'tm.manager_hook' setting to register a callback that provides a custom transaction manager, and h actually does this for some reason but we just use a lambda that returns the default transaction manager.

      git blame is your friend :) - Nick's commit message in a2dce7780acd60480006a5c594eb8267ae09c5c4 has a detailed explanation.

      This is done to force the use of a new TransactionManager instance for each request instead of using a thread-local instance.

    2. session.rollback() rolls back the transaction, and presumably starts a new transaction or one is automatically started on the next operation with the session, but rollback() presumably does not close the session or return the connection to the pool?

      According to http://docs.sqlalchemy.org/en/latest/orm/session_transaction.html , rollback() performs the same closure as commit(), which is what I would have expected.

    1. In this case, perhaps we’d send two queries – one to each service – which looked something like the following.

      If the client sends a query which excludes some annotations on the page, how will it then indicate how many other annotations there are to view, as it currently does when you perform a local search?

    2. Does this make sense? Can you see major problems with this approach?

      There are two concerns here: 1) What data do we fetch from the server and 2) What subset of that data does the client initially show?

      In the case where we direct-link to a single annotation for example, we fetch all annotations but only initially show the specified one, as opposed to fetching just the single annotation and then fetching more data if the user clicks 'Show all annotations'.

    3. Does this make sense? Can you see major problems with this approach?

      Is the query intended to be transparent to the client, in the sense that it can understand the query and parse out the annotation ID in order to set the selection? If so, this sounds sensible.

    4. This got us thinking about direct link URLs. Specifically, about whether we could encode an entire query, if necessary, in the URL fragment that represents a “direct link

      What about direct links to annotations on services other than Hypothesis? I think it would make sense for the hash fragment format to support that.

      Some strawman ideas:

      1. Add a service=<URL> parameter within the hypothesis query fragment. If omitted, this defaults to Hypothesis
      2. Use JSON and encode an object like { serviceA : { query args }, serviceB : { query args } } in the Hypothesis fragment. That seems too elaborate for the current requirements though.
    5. The idea we played around with was that this scope control is just another interface to interact with the underlying query that’s used to fetch annotations from each service.

      I like this idea. I can see some potential caveats in future if/when we decide to add offline support or local caching of annotations - but we can revisit that when we come to it.

    1. Several aspects of this article are out of date and do not reflect the final Shadow DOM v1 API. See https://developers.google.com/web/fundamentals/primers/shadowdom/ for a more up-to-date introduction

    2. CSS styles defined outside the Shadow Root won't affect the main page.

      I think this is meant to say "CSS styles defined outside the Shadow Root will not affect elements within the Shadow Root". In other words, styling in the containing document will not affect content within a Shadow DOM.

  8. Jul 2016
  9. developer.mozilla.org
    1. You can disable the glow using the following CSS, or completely override it to alter the appearance of invalid fields.
    1. Add authority arg to this method?

      What purposes is this used for other than login? Do we need to support the ability for Hypothesis users with 3rd-party accounts to log in via our website? If not, we could simply assume the default authority here.

    1. Library (originally from Mozilla) for building components based on the W3C Web Components specs

    1. A Web Component that can be used to pull fragments of HTML from the server and replace some placeholder content in the page once the fragment loads.

  10. youtube.github.io
    1. Framework for fast PE-navigation by updating just sections of a page that change during navigation, rather than reloading the whole page.

    1. I think that for an _activity_ page you want to do some sort of "rolling up" of similar activities that happened at similar times.

      The purpose of rolling up/faceting is to help a user navigate a large number of search results by exposing different ways in which they can narrow their matches. It seems just as relevant to me when presenting search results as when presenting activity.

    2. But you're not sorting the annotations by document.

      We're grouping annotations by document rather than sorting. AFAIK (and Conor can say more here) the reason for this is that it was a key piece of feedback from some of our current users about how they want to browse results.

    3. https://dl.dropboxusercontent.com/u/136038/image03.png

    4. If my use-case is "I want to scroll through all the annotations that match this search" then I have to click on every "show more" link.

      How painful or pleasant paging through results is depends very much on how the "Show more" link works, how fast the results appear and how hard it is to go back and forwards.

    5. If you do that, and my use-case is "I want to find a particular annotation", then even if the annotation I'm looking for matched my search it might be one of the ones that got truncated so I won't be able to find it except by clicking on the right "Show more" link.

      I'm not sure I agree. The same issue arises in any search engine where you are looking for a particular thing but the query matches many other things and the output of the result sorting does not show the thing you are looking for on the first page.

      The user's options are either to page through the results or to make the query more specific so that they match fewer annotations

    6. If what you show for the result is a list of documents, then the user can't see why each document matched the search

      The current proposed design searches annotations and groups by document. This is clear from the discussion. A risk is that if we get the interface design wrong, users won't realize this. If users aren't able to grok this, they are going to have a hard time using the interface.

    1. An experimental performance comparison of client and server-side templating on desktop and mobile, focusing on time to first paint and time to last paint metrics.

      The server is written in Go. The client is the simplest possible client-side templating you can do (using a <template> element and a few DOM API calls), so no frameworks involved.

      Some takeaways:

      • Everything on mobile is ~5x slower than desktop
      • For small amounts of data, there is little difference in time to first paint
      • Server-side rendering generates a modestly larger HTML payload vs. sending JSON down to the client
      • Time to first paint is faster for SSR as the client can render markup as it is streaming down, but this is only significant when there is a decent amount of data on the page
    1. Server-rendered markup can be progressively enhanced as element definitions are registered and upgraded by the browser.

      Question - How is the server-rendering done and in what language?

    1. I came across this from a post reflecting on the last Chrome summit.

      The splash pages, which appear to be basic static content with little interactivity, load a 1.5MB JS bundle (500KB gzipped). Flipping back and forwards between pages feels sluggish in Firefox. My initial hypothesis is that letting the client-side router tear down the DOM for the current route and build up the DOM for the new route might be slower than just relying on the browser's back/forwards cache as a set of boring static pages would do.

    2. The main motivation for rebuilding our splash pages was to allow us to more easily manage pages across different locales.

      Most of the solution here seems relatively unrelated to the actual goal of making managing splash pages across multiple locales easier.

    1. Despite the misgivings I have about the way the whole site works, the styling code is simple and uses a pretty similar approach to what has been proposed for Hypothesis

    1. Hardly a scientific survey, but the answers on Twitter and offline have been surprisingly consistent: server-side React for graceful degradation, jQuery and possibly shared templates (e.g. Mustache) for progressive enhancement
    1. The user agent should allow the user to manually trigger elements that have an activation behaviour, for instance using keyboard or voice input, or through mouse clicks. When the user triggers an element with a defined activation behaviour in a manner other than clicking it, the default action of the interaction event must be to run synthetic click activation steps on the element.
    1. There are two reasons for having no warnings. First, if it's worth complaining about, it's worth fixing in the code.
    1. Saw this recommended on a Quora answer whilst looking for book and article recommendations for a newcomer to Haskell

    1. If the origin is not a scheme/host/port triple, then return the string null (i.e., the code point sequence U+006E, U+0075, U+006C, U+006C) and abort these steps.
  11. Jun 2016
    1. Errors can be prevented by (cheap) checks in advance, whereas exceptions can only be handled after a risky action was run

      This is a conclusion that follows from the earlier comment that errors are mistakes in the program, because a check was forgotten, whereas exceptions are anticipated failures that must be considered.

    1. import System.Environment  import System.IO  import System.IO.Error 

      In the current version of Haskell, this example can be written as:

      import Control.Exception
      import System.Environment
      
      main :: IO ()
      main = toTry `catch` handler
      
      toTry :: IO ()
      toTry = do
        (fileName:_) <- getArgs
        contents <- readFile fileName
        print (length (lines contents))
      
      handler :: IOError -> IO ()
      handler _ = putStrLn "Failed to read file"
      
    2. To deal with this by using exceptions, we're going to take advantage of the catch function from System.IO.Error

      The catch function is deprecated, and when I tried this with ghci 7.10.3 I got an error that System.IO.Error does not export catch. The replacement is the catch function in Control.Exception.

    3. The bytestring version of : is called cons. It takes a byte and a bytestring and puts the byte at the beginning. It's lazy though, so it will make a new chunk even if the first chunk in the bytestring isn't full. That's why it's better to use the strict version of cons, cons' if you're going to be inserting a lot of bytes at the beginning of a bytestring.

      I initially did not see the difference in the output of B.cons vs B.cons' here, both output a plain string.

      I think this is because of the Show typeclass implementation for Data.ByteString.Lazy which is used by ghci when printing values.

      If on the other hand I use :force <expr> which fully evaluates <expr> and prints out the resulting structure then I do see the difference.

    4. through the documentation
    1. This will probably require changes to the client and to the server and will require some careful design for rollout on account of old(er) clients

      Do we need to support older clients for more than a few days or can we simply drop support for real-time updates for any client released more than N days ago?

    1. Client-side: disable forms, cookies(?)

      Slightly mad thought - could we serve the proxied page inside a sandbox iframe and get the browser to handle these security restrictions for us?

    2. Override history.pushState?

      It looks like we will need to do this to handle SPAs at the moment or things like GitHub where some links are handled client-side. I started a discussion on the WICG group about adding an event that code could listen to in order to detect in-page URL changes.

    3. Update all links to point to via

      By "links" you mean anchor tags? I think it is very useful to rewrite at least basic anchor tag links so that when annotating eg. a play spread over multiple pages, going from one section to the next at least stays "in Via".

    1. Probably the best beginner's introduction to what functional programming means "in the small" and why you would want to use this approach that I have read.

    2. I've seen several TDD'ers spin in circles about whether they should do black box or white box testing. The answer is, you ought to do black box testing - you ought to be able to ignore the implementation details - but if you allow side-effects, you can't. Side-effects close the door to black box testing, because you can't get to the inputs & outputs without cracking the box open and learning what's inside.

      So in other words, writing code in a more functional style enables black-box testing.

    1. If the exclude fragment flag is unset and url’s fragment is non-null, append "#", followed by url’s fragment, to output.
    2. If the given value is the empty string, set context object’s url’s fragment to null and terminate these steps
    1. If .ready() is called after the DOM has been initialized, the new handler passed in will be executed immediately.
    1. There's a little dance involving #ifdef's that can prevent a file being read twice, but it's usually done wrong in practice - the #ifdef's are in the file itself, not the file that includes it.  The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.

      AFAIK compilers usually have specific optimizations to recognize and skip include guards.

    2. I eschew embedded capital letters in names; to my prose-oriented eyes, they are too awkward to read comfortably.  They jangle like bad typography.

      Years later, the Go language, in whose design Pike was heavily involved, mandated an initial capital letter for exported identifiers such as public struct fields and functions.

    1. We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under l_p norm, based on p-stable distributions

      Annotation made on a copy from the first search result in Google Scholar

    1. For User Experience Design we can also create a hierarchy of needs. A user goes through the different states of motivation before caring for the next need.

      Maslow's hierarchy applied to user experience.

    1. What should a user journey contain?

      In short:

      • Context
      • Progression (getting from one step to the next)
      • Devices
      • Emotion (as in 'emotional state')
    1. PDF files may contain references to other PDF files (see 7.11, “File Specifications”). Simply storing a file name, however, even in a platform-independent format, does not guarantee that the file can be found. Even if the file still exists and its name has not been changed, different server software applications may identify it in different ways. Servers running on DOS platforms convert all file names to 8 characters and a 3-character extension. Different servers may use different strategies for converting longer file names to this format.

      This is the section of the PDF specification that defines "file identifiers". An identifier has two parts: the "permanent" identifier, which is generated when the PDF is first created and should not change on subsequent updates, and the "update" identifier, which may change when the PDF is modified.

      The "permanent identifier" is what PDF.js refers to as the "document fingerprint".

  12. May 2016
    1. runhaskell helloworld.hs

      runhaskell is an alias for the shorter runghc command and specifying the file extension is optional, so this can be shortened to:

      runghc helloworld
      
    1. This is the primary reason Genius was undermining the Content Security Policy

      I don't think this is accurate. The CSP needed to be modified at the very least in order to enable Genius' client-side code, that highlights annotations in the page and handles click events on them, to run. I assume they removed CSP entirely rather than restricting it because that was simply the technically easier thing to do.

    2. The only programming language a webpage can execute is JavaScript, and when JavaScript is executed in the form of a remotely hosted script — that’s the safer kind, less likely to be an XSS exploit — the language simply doesn’t know anything about the page that called it

      I must be missing something. External scripts can access the location of the page they are loaded into via document.location.

    1. For this reason, integration tests should always be kept separate from unit tests, in order to keep the unit tests running as quickly as they can.

      Many integration tests can function without any network, I/O etc. and still run as fast, or almost as fast as a pure unit test.

      Perhaps worth separating integration tests into those which are purely in-memory/local and those which hit the outside world?

    1. Some interesting notes on the development of the YouTube Gaming site and Polymer.

    1. Figure 6-1 shows how the font metrics apply to glyph dimensions, and Table 6-1 lists the method names that correlate with the metrics. See the various method descriptions for more specific information.

      Useful diagram illustrating the various metrics of fonts. Unfortunately many of these are not currently accessible using Web Platform APIs.

    1. Introduction to the algebraic effects system in Eff which is one of the inspirations behind React's new reconciler infrastructure. See https://github.com/reactjs/react-basic#algebraic-effects

    1. This language and its effect system have been described as one of the inspirations behind work on a new reconciler (or "diff-ing system") in React > 15.

    1. Can I use the Credential Management API inside an iframe? The API is restricted to top-level contexts. Calls to .get() or .store() in an iframe will resolve immediately without effect.

      This rules it out from usage in the current version of the H client.

    1. All things considered, I suspect that Google Maps's city reduction was an optimization for reading the maps on mobile devices, and that the new roads were added to make the maps look less empty (once the cities were removed). After all, a map with fewer labels is a map that's faster to read.

      This sounds quite plausible but I would love to read an insider's perspective on whether this is reasonably close to the truth or whether the motivations were quite different.

  13. Apr 2016
    1. You’ll find your components much easier to reuse and reason about if you divide them into two categories: Container and Presentational.

      Again, good advice regardless of the framework. To some degree, this is another variation on the view/controller split

    2. If you separate your logic and state from the UI, and the state is manipulated with only plain JS functions, you can easily test it separately.

      This is good advice regardless of the framework.

    3. Good update on the state of React circa April 2016

    1. Setting the end point above (higher in the document) the start point will throw an ERROR_ILLEGAL_VALUE DOMException.

      According to the DOM spec (see https://hyp.is/AVRXkaPCFtPwhMO7DXw2/dom.spec.whatwg.org/ ) this is wrong. If the end point occurs before the start point in the document, then both the start and end points are set to (endNode, endOffset) and the resulting range will be collapsed.

    1. If bp is before the range’s start, or if range’s root is not equal to node’s root, set range’s start to bp.
    1. Interesting article on dependency injection and combining FP and OOP. The central question explored is how a language might work if function parameters were split into two categories, data and services/environment.

    1. That’s probably the Dropbox server. Two million lines of code and counting, and it serves hundreds of millions of users.

      I wonder what the biggest pain points are in maintaining a code base of this size in Python besides typing and performance, which Dropbox are already addressing via mypy and Pyston respectively.

    2. I want Python to be more effective for large projects, without losing sight of its use for small projects and teaching. It’s quite a challenge; my current hope lies in PEP 484 and mypy, an approach to optional static typing (a.k.a. gradual typing). It’s super exciting. There are also other things happening in the community that make Python faster.
    1. I’ve personally seen two big benefits after adopting this strict separation in production apps. The first is that teams can more easily reuse the Presentation Components because they are nothing more than html/css. Over time this will allow the team to amass a great component library that is more composable allowing even greater code sharing across the organization. The second benefit is that teams can more easily upgrade the infrastructure that powers data flow throughout the application. Because this code is isolated in the Container Components we don’t need to untangle the plumbing code from the html/css that displays it. So you want to swap your global store for an ember service? No problem! We just modify the entry point in the Container Components and the data still flows down to any Presentational Components like it did before the refactor.
    2. This is a pretty good summary of the recent evolution of front-end architecture, described in a framework-independent way.

    1. Critically, these experiences were still often preferable to downloading a separate app on a data plan or spotty WiFi connection

      So the success of chat UIs in Asia is partially owed to their data efficiency. Interesting.

    1. Create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is cancelable, and has no default action. The data attribute must be set to the value passed as the message argument to the postMessage() method, the origin attribute must be set to the serialization of the origin of the script that invoked the method, the lastEventId attribute must be set to the empty string, and the source attribute must be set to the Window object of the default view of the browsing context for which the Document object with which the script is associated is the active document.

      With reference to https://github.com/hypothesis/h/pull/3233 this is where the spec defines the value of the origin attribute.
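
      Because origin is a serialized origin (scheme, host and any non-default port, with no path), a receiver can compare it exactly against a list of expected values. The sketch below is a minimal illustration under that assumption; isFromAllowedOrigin, OriginCarrier and the example origins are hypothetical and not taken from the pull request.

```typescript
// Only the field of MessageEvent that this sketch reads.
interface OriginCarrier {
  origin: string;
}

/** True if the message's origin exactly matches one of the allowed origins. */
function isFromAllowedOrigin(
  message: OriginCarrier,
  allowed: readonly string[],
): boolean {
  // Exact comparison: prefix or substring checks would also accept
  // look-alike hosts such as "https://example.com.evil.test".
  return allowed.indexOf(message.origin) !== -1;
}

const ALLOWED_ORIGINS = ["https://example.com"];

const trusted = isFromAllowedOrigin({ origin: "https://example.com" }, ALLOWED_ORIGINS);
const lookAlike = isFromAllowedOrigin(
  { origin: "https://example.com.evil.test" },
  ALLOWED_ORIGINS,
);

// The serialized form omits the path and the default port.
const serializedOrigin = new URL("https://example.com:443/embed").origin;
```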

    1. A common refrain in the JavaScript ecosystem is that opinionated software boxes you in, and that you're better off spending weeks picking the set of libraries that fit the particular needs of your application. But it has been our experience that building a strong community around a shared set of solutions leads to applications that are more maintainable over the long run, and don't leave you itching to throw away everything and rewrite it in a year or two.
    1. And if you were to change any redux related code (reducers or middleware) it'd prompt you to a page reload in the console

      This is at least partially wrong. The whole point of reducers being pure functions is that you can replace them at runtime because they are not stateful. Hot reloading for reducers works as follows:

      1. Call store.replaceReducer() to replace the root reducer with the new implementation
      2. Replay the recorded sequence of actions through the reducer to derive the new application state given the series of actions that have happened.

      Where this will break is if a code change causes existing actions created by a previous version to break.
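
      The two steps above can be sketched with a toy store that records every dispatched action. This is not Redux's implementation: RecordingStore and hotSwap are hypothetical names, and hotSwap combines the reducer replacement with the replay that, in the description above, happens as a separate step.

```typescript
type Action = { type: string };
type Reducer<S> = (state: S | undefined, action: Action) => S;

const INIT: Action = { type: "@@init" };

class RecordingStore<S> {
  private reducer: Reducer<S>;
  private state: S;
  private actions: Action[] = [];

  constructor(reducer: Reducer<S>) {
    this.reducer = reducer;
    this.state = reducer(undefined, INIT);
  }

  dispatch(action: Action): void {
    this.actions.push(action); // keep the history for later replay
    this.state = this.reducer(this.state, action);
  }

  getState(): S {
    return this.state;
  }

  // Swap in a new reducer, then rebuild the state by replaying every
  // recorded action through it, starting from its own initial state.
  hotSwap(next: Reducer<S>): void {
    this.reducer = next;
    this.state = this.actions.reduce<S>(
      (state, action) => next(state, action),
      next(undefined, INIT),
    );
  }
}

// Version 1 of the reducer counts in steps of 1, the edited version in steps of 10.
const countByOne: Reducer<number> = (state = 0, action) =>
  action.type === "increment" ? state + 1 : state;
const countByTen: Reducer<number> = (state = 0, action) =>
  action.type === "increment" ? state + 10 : state;

const counterStore = new RecordingStore(countByOne);
counterStore.dispatch({ type: "increment" });
counterStore.dispatch({ type: "increment" });
counterStore.dispatch({ type: "increment" });
const stateBeforeSwap = counterStore.getState();

counterStore.hotSwap(countByTen);
const stateAfterSwap = counterStore.getState();
```

      The replay only works because the reducers are pure: the same recorded actions always produce the same state, so the history can safely be re-run through new code.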

    1. Several content moderation experts point to Pinterest as an industry leader. Microsoft’s Tarleton Gillespie, author of the forthcoming Free Speech in the Age of Platform, says the company is likely doing the most of any social media company to bridge the divide between platform and user, private company and the public. The platform’s moderation staff is well-funded and supported, and Pinterest is reportedly breaking ground in making its processes transparent to users. For example, Pinterest posts visual examples to illustrate the site’s "acceptable use policy" in an effort to help users better understand the platform’s content guidelines and the decisions moderators make to uphold them.
    1. Would love, but doesn't seem possible

      Quote for Ember hot reloading talk? From the livereload repo

    1. Fun, not entirely serious post complaining about the verbosity of writing functional code in languages like early C# (2.x) which were not designed with it in mind.

    1. This is a useful repository illustrating how to structure an Angular app in a modern way, using new helpers from Angular 1.5.x

    1. Good article on progressive enhancement. Jake Archibald now works for Google, I'm not sure if that was the case at the time.

    1. Brief but useful contrast of Jest vs. Angular approaches to dependency injection.

      The argument is that most code uses one implementation in production and one mock in testing and that using require() as the seam for inserting test doubles makes code easier to write than Angular's approach of implementing its own module system and DI container.

    1. How is all this different from mainstream constructors? Because an instance is created by sending a message to an object, and not by some special construct like a constructor invocation, we can replace the receiver of that message with any object that responds to that message. It can be another class (say, an implementation based on polar coordinates), or it can be a factory object that isn’t a class at all.

      Question: Is this different in any way from say Python where objects are constructed using a function call?

    1. Selections made by users may be extensive and/or cross over internal boundaries in the representation, making it difficult to construct a single selector that robustly describes the correct content. A Range Selector can be used to identify the beginning and the end of the selection by using other Selectors.

      The model here appears to be different from what a 'RangeSelector' looks like in the Hypothesis model.

    2. In many cases it is important to understand the reasons why the Annotation was created, or why the Textual Body was included in the Annotation, not just the times and agents involved. These reasons are provided by declaring the motivation for the Annotation's creation or the purpose for the inclusion of the Body in the Annotation; the "why" rather than the "who" and "when" described in the previous sections.

      Annotations do not currently have a 'motivation' property. For annotations, this should be 'commenting' (?); for replies, it should be 'replying'.

    3. The agent responsible for generating the serialization of the Annotation. There MAY be 0 or more generator relationships per Annotation

      The JSON-LD output from H does not currently include a generator or generated property. Worth adding?

    4. The Target resource is always an External Web Resource
    5. Bodies or Targets which are External Web Resources MUST have exactly 1 id with the value of the resource's IRI.

      I don't see an id field in the target field in the output of the first-pass at a JSON-LD presenter

    1. Blog post discussing a more productive format for a meetup than the typical case where developers turn up, watch speaker(s) for 30-60 minutes, then leave.

  14. Mar 2016
    1. Programmatic cut and copy to the clipboard: It’s now possible to programmatically copy and cut text in response to a user gesture with document.execCommand('copy') and document.execCommand('cut'). Having this ability may eliminate some websites’ last need for the Flash plug-in.

      We'll be able to make use of this for direct linking

    1. Very useful brief explanation of the basics of matrix/vector math that is relevant to 3D graphics and explanation of the process of transforming model vertexes into rendered pixel positions on screen.

    1. define and document the interfaces between them

      Agreed. Your point here is about the fact that we should document them, not how, but I have some proposals around tools that would help us with this - on the front-end at least.

    2. 1. The Hypothesis client library (hypothesis.js)

      I would consider the sidebar an application rather than a library - since it is something that is fully usable by an end-user on its own.

      Making the client a distinct entity from the web service is an important first step, but given the goals outlined above, I think this is still quite monolithic and we already have use cases (e.g. EPUB) which will require dividing this into smaller pieces, with public APIs connecting them. This doesn't have to happen all at once, but we'll need to have it in mind. I think the initial division might look something like this:

      1. The code that lives in the web page or application and handles creation, anchoring and display of annotations in text documents. This is currently the code that lives in h/static/scripts/annotator. This would expose a public API that clients could use in order to display those annotations in a viewer (like our sidebar), attach information about the parent document, persist annotations to/from storage. This raises the question of whether we should use Annotator (2.x?)'s APIs as the public API or whether we should define our own and have Annotator be an implementation detail.
      2. The app that allows the user to view, edit, reply and otherwise interact with annotations on the content. This would connect with clients implementing the public API in (1). Initially this would include our HTML and PDF clients and I expect we would introduce one for EPUB as well.
      3. A library that serves as an SDK for the Hypothesis service and compatible web services. This would be used by (2) for retrieving/persisting annotations, subscribing to real-time updates etc.

      I would envision that (1) and (3) would have minimal external dependencies, so that they could be integrated into a variety of environments.

    3. it's about building a platform that will outlast Hypothesis the organisation.

      Providing a platform that other people can run is an important part of building something that can outlast the organization, but a big factor is providing a tool and service that proves the potential value of such a layer over the web.

      There are a few aspects to this which I think are relevant to the architecture of H:

      Enabling Integrations

      I think a lot of the potential value of H and annotation tools in general depends on making it convenient to integrate it with other services and web content. Areas of integration include:

      1. Identity/user profile and authentication.
      2. Custom document viewers
      3. Providers of machine-generated annotations

      (2) and (3) are relevant in terms of how we partition the client into libraries and what public APIs we expose.

      Enabling us to provide services on top of the data

      When talking about separating the client and web service, I think there has been the implicit assumption that the architecture would be like this:

      Hypothesis WS <==> Client <==> Third party identity/annotation providers
      

      Whilst we will clearly need to support that model, I think we could also satisfy many needs with a model where the client communicates only (or primarily) with our service, and our service then communicates with third parties:

      Client <==> Hypothesis Web Service <==> 3rd party providers
      

      The primary advantages of this approach, from my perspective, are more control over the quality and consistency of user experience and the ability to provide services around that third-party data.

    4. It is far too early to be talking about creating new projects and git repositories for most of the components described above.

      My instinct at the moment is that I would rather we didn't create a bunch of new projects and Git repositories. I completely agree that we want to restructure the source tree so that it mirrors the overall architecture. However, I think having a single repository (or a very small number) has advantages in terms of reducing the total volume of project config boilerplate, making developers feel freer to engage with any part of the codebase, keeping track of issues and PRs and supporting open source contributions etc.

    5. This library is semantically versioned and released independently, and is deployed as built assets to a CDN.

      We could deploy this via npm to make it convenient for 3rd-party consumers. If we split the application as I suggested above then each of the parts with a public API would be its own npm package.

      We could potentially leverage cdnjs as the CDN if we did this, although we could obviously quite easily host it ourselves as well.

    6. In particular, we are pursuing two goals which can at times be in direct tension

      They can also be mutually very supportive.

      In particular, having a clear architecture with well-defined interfaces between the components is something that will help us regardless of what other people do with it.