79 Matching Annotations
  1. May 2017
    1. the term concept to denote the underlying notion.

      Concepts according to section 4 of this paper:

      • Flexible Probabilities: compute prob on the fly by allowing prob to be part of an atom.
      • Distributional Clauses: A shorthand notation to assign a distribution to a random variable
      • Unknown Objects: Dropping the closed world assumption, and dropping the unique names assumption.
      • Stochastic Memoization: "If a random variable in Church is memoized, subsequent calls to it simply look up the result of the first call, similarly to tabling in logic programming" (see the sketch after this list)
      • Constraints
      • Negation as Failure
      • Second Order Predicates
      • Meta-Calls
      • Time and Dynamics
      • Generalized Labels for Facts and Queries
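
      A quick sketch of the memoization point from the bullet above, in plain Python rather than Church, with hypothetical names: an un-memoized random function is resampled on every call, while the memoized one fixes its value on first use, much like tabling.

      ```python
      import random
      from functools import cache

      # Sketch of stochastic memoization, not Church syntax: eye_colour is
      # resampled on every call; eye_colour_memo keeps its first sampled value.
      def eye_colour(person):
          return random.choice(["blue", "brown", "green"])

      @cache
      def eye_colour_memo(person):
          return random.choice(["blue", "brown", "green"])

      random.seed(1)
      print({eye_colour("ann") for _ in range(10)})       # usually several colours
      print({eye_colour_memo("ann") for _ in range(10)})  # always exactly one colour
      ```
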
    2. It is often assumed that probabilistic facts do not unify with other probabilistic facts or heads of rules

      I think this is saying that you can't follow a chain of reasoning without considering the probabilities.

      e.g. "all men are social" + "Socrates is a man" => "Socrates is social" would disregard the probabilities of "all men are social" and "Socrates is a man".

    3. Legal groundings of such facts can also be restricted by providing a domain, as in the following variant of our alarm example where all persons have the same probability of independently hearing the alarm

      Just a special case where several probabilistic facts have the same probability.

    4. we obtain the success probability of calls(mary), P(calls(mary)) = 0.196

      OK, but this is the marginal probability that Mary calls? Not conditioned on the facts we already know: that a burglary happened, that an earthquake did not, that Mary heard the alarm?
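
      To check that it really is the unconditional marginal, a brute-force sum over possible worlds reproduces 0.196 without conditioning on anything. The probabilities below are my guess at the running example's numbers (0.1::burglary, 0.2::earthquake, 0.7::hears_alarm(mary)):

      ```python
      from itertools import product

      # Brute-force the distribution semantics for the alarm example.
      # The fact probabilities are assumed, not copied from the paper verbatim.
      facts = {"burglary": 0.1, "earthquake": 0.2, "hears_alarm_mary": 0.7}

      def calls_mary(world):
          alarm = world["burglary"] or world["earthquake"]
          return alarm and world["hears_alarm_mary"]

      total = 0.0
      names = list(facts)
      for values in product([True, False], repeat=len(names)):
          world = dict(zip(names, values))
          weight = 1.0
          for name in names:
              weight *= facts[name] if world[name] else 1 - facts[name]
          if calls_mary(world):
              total += weight

      print(round(total, 3))  # 0.196 -- the marginal success probability
      ```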

    5. backward reasoning

      query: call

      It seems to me that we need to prove that 'call' happened.

      • For this we need to prove that calls(X) happened.
        • For this we need to prove that alarm, hears_alarm(X) happened.
          • For this we need to prove that either burglary or an earthquake happened.

      Since burglary is true, substituting X = mary proves calls(mary).
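
      A minimal sketch of that backward chaining, with X grounded to mary by hand (no unification) and probabilities deliberately ignored so only the proof search remains:

      ```python
      # Toy encoding of the alarm example: calls(mary) :- alarm, hears_alarm(mary).
      #                                    alarm :- burglary.   alarm :- earthquake.
      rules = {
          "calls(mary)": [["alarm", "hears_alarm(mary)"]],
          "alarm": [["burglary"], ["earthquake"]],
      }
      facts = {"burglary", "hears_alarm(mary)"}  # the facts assumed true here

      def prove(goal):
          """Backward chaining: a goal holds if it is a fact or some rule body does."""
          if goal in facts:
              return True
          return any(all(prove(sub) for sub in body) for body in rules.get(goal, []))

      print(prove("calls(mary)"))  # True: burglary -> alarm -> calls(mary)
      ```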

    6. The corresponding logic program obtained by adding the set of rules R to the set of facts, also called a possible world

      Notice that probabilities are gone.

    7. To summarize, the key contributions of this paper are
      • Identify core concepts used by various probabilistic languages
      • Discussion of the execution mechanism that they require
      • Positioning of state-of-the-art prob languages and implementations wrt these concepts
    8. Typical probabilistic programming languages, on the other hand, employ a variant of Sato's distribution semantics (Sato, 1995), in which random variables directly correspond to ground facts and a traditional program specifies how to deduce further knowledge from these facts

      Difference:

      SRL: Knowledge-based model construction, logic is used as a template for constructing a graphical model.

      Typical prob prog languages: Sato's distribution semantics, in which RVs directly correspond to ground facts. A traditional program specifies how to deduce further knowledge from these facts.

    9. By now, it is commonly accepted that the more interesting question is concerned with the underlying concepts that these languages employ and their effect on the inference mechanisms, as their expressive power is often very similar.

      Expressive power is similar across these languages, so that's not the interesting part. What is:

      • Underlying concepts employed
      • Effect on inference mechanisms
    10. To obtain a better understanding of probabilistic programming, we identify a number of core programming concepts underlying the primitives used by various probabilistic languages, discuss the execution mechanisms that they require and use these to position and survey state-of-the-art probabilistic languages and their implementation. While doing so, we focus on probabilistic extensions of logic programming languages such as Prolog, which have been considered for over 20 years.

      So: a survey of probabilistic programming concepts, restricted to probabilistic extensions of logic programming languages only.

  2. Apr 2017
    1. We performed experiments on two real data sets: the first is a proprietary address data set and the second is the publicly available DBLP data set. The address data set contains 1 million strings, each a concatenation of an organization name and address (street, city, zip, state). The DBLP data set contains around 0.5 million strings, each a concatenation of authors and title of a publication.

      Datasets

    2. The self-proclaimed be-all and end-all: it claims to do everything, and to do it better than everything else, like throwing a bunch of performance-metric magnets at the refrigerator and seeing that everything sticks.

      Challenge: Find holes in the supremacy claims.

    1. The motivation of the paper is articulated in the conclusion: We have divided the problem into two aspects: the black-box functions that match and merge records, and the ER algorithm that invokes these functions. In our opinion, this division has two important advantages: (i) it yields generic ER algorithms that can be used, with well-defined semantics, in many applications, and (ii) it lets us focus on our performance measure, the number of black-box invocations.
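
      A rough sketch of what that division buys: a generic ER loop that only ever talks to black-box match and merge functions. This is just the shape of the idea (naive pairwise resolution with made-up records), not R-Swoosh itself:

      ```python
      # Generic ER skeleton: the algorithm never looks inside records itself;
      # it only invokes the black-box match() and merge() functions.
      def naive_er(records, match, merge):
          records = list(records)
          changed = True
          while changed:
              changed = False
              for i in range(len(records)):
                  for j in range(i + 1, len(records)):
                      if match(records[i], records[j]):
                          merged = merge(records[i], records[j])
                          records = [r for k, r in enumerate(records) if k not in (i, j)]
                          records.append(merged)
                          changed = True
                          break
                  if changed:
                      break
          return records

      # Hypothetical black boxes: match on equal name, merge by field union.
      match = lambda a, b: a["name"].lower() == b["name"].lower()
      merge = lambda a, b: {**a, **b}
      print(naive_er([{"name": "ACME", "zip": "94301"},
                      {"name": "acme", "city": "Palo Alto"}], match, merge))
      ```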

    2. In summary, ER is an inherently expensive process, so that only relatively small sets of records can be handled. Thus, large data sets need to be partitioned into smaller sets that can be resolved in detail. How large a data set can be exhaustively resolved depends on the application. It is also possible to distribute the ER computations across multiple processors, in order to handle larger data sets. In [7] we study various strategies for distributing the work done by R-Swoosh.

      Worth a look, methinks.

    3. Because record matching is inherently expensive, large sets of input records are often divided into “buckets” using application knowledge, and then ER is run on each bucket.

      Blocking. They're talking about blocking.
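
      A tiny sketch of the idea (hypothetical records and blocking key): candidate pairs are only generated inside a bucket, so cross-bucket comparisons are never made.

      ```python
      from collections import defaultdict

      # Blocking/bucketing sketch: group records by a cheap key (here a made-up
      # zip-code prefix) and only compare records within the same bucket.
      def block(records, key):
          buckets = defaultdict(list)
          for r in records:
              buckets[key(r)].append(r)
          return buckets

      def candidate_pairs(buckets):
          for bucket in buckets.values():
              for i in range(len(bucket)):
                  for j in range(i + 1, len(bucket)):
                      yield bucket[i], bucket[j]

      records = [{"name": "ACME", "zip": "94301"},
                 {"name": "Acme Corp", "zip": "94305"},
                 {"name": "Initech", "zip": "10001"}]
      pairs = list(candidate_pairs(block(records, key=lambda r: r["zip"][:3])))
      print(len(pairs))  # 1 -- only the two records sharing a zip prefix get compared
      ```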

    1. One key challenge for the classification-based approach involves the selection of training examples from which to learn the similarity classifiers. Ideally, we want our model to correctly predict the similarity of every document to every other document (or every centroid, based on the modeling choice described above) in the dataset. However, creating a training example for each document (or document-centroid) pair results in a skewed label distribution, since a large majority of pairs in the training dataset do not belong to the same event

      Sound familiar?

    2. Radiohead

      Never thought I'd read a paper with "Radiohead" and "my favorite band" in the same sentence. Most certainly not one that wasn't my own anyway.

    1. [6] M. Hernandez and S. Stolfo. Real-world data is dirty: data cleansing and the merge/purge problem. Journal of Data Mining and Knowledge Discovery, 1(2), 1998.

      Say the folks that didn't really run experiments on real-world data :/

    2. The experimental comparisons should be extended to non-DBGen data sets to investigate dependencies on data sources and their error characteristics.

      Yeah. Tell me about it.

  3. Oct 2016
    1. Had this model failed to identify a bug, we could subsequently have enriched the sketched specifications.

      This works backwards, saying, well, we know there's a bug, and we will change the sketched spec until MOLLY is able to find one...

    1. Let C be a bivalent configuration of P, and let e = (p, m) be an event that is applicable to C. Let 𝒞 be the set of configurations reachable from C without applying e, and let 𝒟 = e(𝒞) = {e(E) | E ∈ 𝒞 and e is applicable to E}. Then, 𝒟 contains a bivalent configuration.

      This is the setup for the second part:

      "Second, we construct an admissible run that avoids ever taking a step that would commit the system to a particular decision."

      As I understand it, what the authors say is that for an arbitrary applicable event e and an arbitrary bivalent configuration C, there is always a bivalent configuration reachable from C (in the set 𝒟 here). And reaching a decision means reaching a univalent configuration.

    2. assumed partial correctness.

      It is a valid assumption because total correctness, which subsumes partial correctness, is assumed.

      "First, we argue that there is some initial configuration in which the decision is not already predetermined." - this is the proof of this part.

      But why is a proof of this needed, when partial correctness already assumes that the initial configuration is bivalent (given that every run of P begins with an initial configuration)?

    3. The basic idea is to show circumstances under which the protocol remains forever indecisive. This involves two steps. First, we argue that there is some initial configuration in which the decision is not already predetermined. Second, we construct an admissible run that avoids ever taking a step that would commit the system to a particular decision.

      simple enough, right?

    4. Thus, the message system acts nondeterministically, subject only to the condition that if receive(p) is performed infinitely many times, then every message (p, m) in the message buffer is eventually delivered. In particular, the message system is allowed to return ∅ a finite number of times in response to receive(p), even though a message (p, m) is present in the buffer

      Hard to follow. Explain please?

      Also, why is this particular assumption made, that if receive is performed infinitely often then every message is eventually delivered? How do things change if we don't make this assumption?

    5. Assumptions:

      1. No Byzantine failures
      2. Reliable message system
      3. No access to sync clocks (timeout based algos can't be used)
      4. If one non-faulty process receives a message, all non-faulty processes will and the sender knows this.

      No assumptions on:

      1. Relative speeds of processes
      2. Delay time in delivering a message
      3. The ability to detect the death of a process
    1. In one atomic step, a process can attempt to receive a message, perform local computation on the basis of

      One atomic step:

      1. Attempt to receive a message
      2. Perform local computation on the basis of whether or not a message was delivered to it
      3. Send an arbitrary but finite set of messages to other processes
    1. some application developers may not want to write their own conflict resolution mechanisms and choose to push it down to the data store

      Of course they will, it's the job of the data store, right?

    2. give services control over their system properties, such as durability and consistency, and to let services make their own tradeoffs between functionality, performance and cost-effectiveness

      Interesting... need to unpack this

    3. An SLA stated in terms of mean or median response times will not address the performance of this important customer segment.

      Important insight
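
      A toy illustration of why (made-up latencies): a small heavy tail barely moves the mean but dominates the 99.9th percentile, which is what a p99.9 SLA actually captures.

      ```python
      import random

      # Made-up latency distribution: mostly ~10 ms, with 0.2% slow requests.
      random.seed(0)
      latencies = [random.gauss(10, 2) for _ in range(99_800)]
      latencies += [random.uniform(200, 500) for _ in range(200)]

      latencies.sort()
      mean = sum(latencies) / len(latencies)
      p999 = latencies[int(0.999 * len(latencies))]
      print(f"mean ~= {mean:.1f} ms, p99.9 ~= {p999:.1f} ms")
      # mean stays near 10 ms; p99.9 lands in the hundreds of ms
      ```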

    4. non-hostile and there are no security related requirements such as authentication and authorization

      Oooh, that's a pretty useful assumption

    5. In the past year, Dynamo has been the underlying storage technology for a number of the core services in Amazon’s e-commerce platform. It was able to scale to extreme peak loads efficiently without any downtime during the busy holiday shopping season. For example, the service that maintains shopping cart (Shopping Cart Service) served tens of millions requests that resulted in well over 3 million checkouts in a single day and the service that manages session state handled hundreds of thousands of concurrently active sessions.

      Much ads

  4. Sep 2016
    1. Exception Handling

      The exceptions relating to the distributed nature of the system (network partitions etc.) should have been part of this section, but unfortunately they aren't. They are reduced to a 'call failed' exception, without much detail about how to handle it.

    2. Having read the Waldo et al. paper first, I instinctively feel the need to defend this paper, which is kind of unprofessional. But I haven't enough time to shake off the emotions, so here goes.

      The problem this paper has, and also what sort of shields it from the Waldo et al. note, is that it doesn't really try too hard to be a seminal contribution to the area of DS as we know it today. All it claims to do is provide an efficient, secure, programmer-of-that-day-friendly way for more than one computer to communicate. They do back these claims up with their evaluations. And taking them at their word, it is clear that in their evaluation they did not come across problems that are fundamental to DS, such as network partitions and node failures.

      Surely this paper has important contributions, considering that the DS paradigm of choice today, MapReduce, makes use of some form of RPC to make calls to the map and reduce functions.

    3. Violation of this principle seemed likely to lead us into the complexities that have made previous communication packages and protocols difficult to use

      This line makes it pretty clear that "distributed systems" and the common problems associated with them were not the concern being addressed; what was truly being addressed was creating a distributed communication protocol.

    4. imbedded in our major language, Mesa

      It does seem to be a major theme that implementation language choices ARE affecting the design choices, as was correctly pointed out in the Waldo et al paper.

    5. use procedure calls as the paradigm for expressing control and data transfers. For example, message passing might be a plausible alternative.

      Could someone elaborate on the difference between the two mentioned paradigms? Don't procedure calls involve message passing?

    6. The primary purpose

      When I read this, having already read the criticism leveled against this paper in the "A Note on Distributed Computing" paper, I think I see where the problem lies: this paper was set in a different time, and the problems it was trying to solve were very different from the problems prevalent in 1994. It feels like, at this stage of inception, the very problem of making calls to remote machines was new territory, so various simplifying assumptions were made; performance was thought about, but not in the way one would think of it ten years later.

    7. these should make it easier to build distributed computations, and to get them right. Another is efficiency: procedure calls seem simple enough for the communication to be quite rapid. A third is generality: in single-machine computations, procedures are often the most important mechanism for communication between parts of the algorithm.

      It's very hard to buy in to these assumptions, to be honest. So I hope they have justified why they think so in the pages to come.

    8. these should make it easier to build distributed computations, and to get them right.

      Easier to build, sure. Get them right? How did they get that idea?

    1. that all current invocations or calls for system services will be eventually converted into calls that might be to an object residing on some other machine.

      Why couldn't this be true, if for example, TCP were used?

    2. for easily understandable reasons

      Are they implying the overhead of changing a lot of code? Or something more fundamental? I don't see the easily understandable reason.

    3. Unless the naming interface includes an operation to lock a naming context, there is no way that I can make this operation completely robust.

      But isn't this true even today, and is just inherently a problem of distributed systems rather than any particular assumptions in a particular paradigm?

    4. This is not to argue against pleasant programming interfaces.

      There seems to be an inherent assumption that programming interfaces that address DS problems have to be unpleasant, and I don't exactly see why this should be true.

    5. The hard problems have to do with differences in memory access paradigms between local and distributed entities.

      How so? In any way that's fundamentally different than process discovery?

    6. The hard problems in distributed computing concern dealing with partial failure and the lack of a central resource manager.

      Obviously...

    7. A less optimistic explanation of the failure of each attempt at unification holds that any such attempt will fail for the simple reason that programming distributed applications is not the same as programming non-distributed applications

      Did the development of BOOM, BLOOM and BUD take the view that this explanation takes?

    8. whatever programming model is currently in vogue

      What is it in the 2000s and 2010s? Have we converged to a unified model? How have things changed?