44 Matching Annotations
  1. Nov 2020
    1. (Topalidou & Rougier, 2015) Our initial motivation and the main reason for replicating the model is that we needed it in order to collaborate with our neurobiologist colleagues. When we arrived in our new lab, the model had just been published (2013), but the original author had left the lab a few months before our arrival. There was no public repository and no version control, and the paper describing the model was incomplete and partly inaccurate. We managed to get our hands on the original sources (6,000 lines of Delphi), only to realize that we could not compile them. It took us three months to replicate it using 250 lines of Python. But at that time, there was no place to publish this kind of replication and share the new code with colleagues. Since then, we have refined the model and made new predictions that have been confirmed. Our initial replication effort really gave the model a second life.

      An example of the situations that motivate work published in ReScience.

  2. Nov 2019
  3. alexandrehocquet.perso.univ-lorraine.fr
    1. universities' "technology transfer" policies

      That was fashionable for a long time, but open science has changed things, and today universities (and labs...) increasingly encourage researchers to contribute to free software.

    2. is not rewarded by publication

      There have been many changes in recent years. On one side, there are journals specializing in scientific software (for example JOSS, https://joss.theoj.org/) that make it possible to turn programming into publications (that is their stated goal; the articles themselves are of no interest to a potential reader), and on the other side, evaluation bodies all over the world are beginning to take software into account.

    3. statistical validity depends greatly on the domain to which it is applied.

      That seems obvious to me - are there people who claim the opposite?

    4. it would concern only physics

      Err... who says that kind of thing? Experimental reproducibility is a major topic in biology, for example, but nobody there speaks of a crisis.

    5. reducing the computational to data processing

      This cuts both ways: simulation researchers are much less interested in reproducibility than data analysts are.

    6. a tendency to render it invisible

      This holds more broadly for software in research. Outside of primarily computational research, it remains normal not to mention the software one uses. Which also explains why it is so difficult to find funding for the development of scientific software.

    7. even though we know very well that it is more complicated, or even illusory

      Today, yes. When I did my DEA and PhD in physics (1988-1992), reproducibility was real and verifiable, and sometimes verified, to make sure that no typos had been made when entering the parameters. I have watched computational reproducibility slowly evaporate over the course of my career. Software complexity is the main culprit. But everything was simpler back then: no collaborations around the globe, all computations were done on the single computer one had access to, etc.

      Another point: "we know very well that..." does not hold for everyone. I know few researchers who would have imagined that a computation could be non-reproducible before it happened to them personally. I suppose this element of surprise contributed to the idea of a crisis.

    8. the Science Code Manifesto in 2011

      How time flies... I remember it well, I signed it, but I have not heard of it in a long time. Today the site http://sciencecodemanifesto.org/ no longer works. What would historians of recent science do without the Internet Archive?

  4. alexandrehocquet.perso.univ-lorraine.fr
    1. It is well known that scientists in general have (in general) no motivation to reproduce other people's experiments

      There is one exception that rarely gets discussed: many PhD students redo experiments or computations of a more senior lab member, to get hands-on experience with the techniques. This is not systematic reproduction, and the results rarely leave the lab. It was one motivation for the creation of ReScience.

  5. alexandrehocquet.perso.univ-lorraine.fr
    1. It is likely that using the Web of Science biases the results against scholarly journals specializing in the humanities, but it is still striking that the scientific community supposed to know the subject best is either silent or inaudible.

      Not to be forgotten: the barrier of access to the literature across disciplines. I have a nice collection of references in philosophy of science that I plan to look at some day, when I find myself near a library that might hold these books. For journals there is Sci-Hub, but for books, nothing.

  6. Sep 2017
    1. Any third-party review of data and code assets is challenging.

      You might be interested in how this is handled at ReScience, where code review is the most important aspect of reviewing submissions. (Disclaimer: I am one of ReScience's founding editors)

    2. current authoring tools such as Microsoft Word are popular for a reason

      Popular with some people, hated by others. No user interface will be universally appreciated. Please make sure that the underlying data formats are open AND designed to accommodate different types of user interfaces.

    3. some scripts and data are too complex to sit within a reproducible document

      What would be "too complex" for a reproducible document? I can see cases of "too large", in particular for datasets, or for entire virtual machines, but complexity shouldn't be an obstacle.

    4. The diversity of programming environments used in research must be supported

      Yes, that's an important point. But it is also desirable that programming environments evolve to facilitate integration with a framework such as the one you want to set up. For example, it ought to be possible for one subdocument, say a Jupyter notebook, to refer to code/data/explanation in another subdocument.

    5. creation of an open standard

      I have mixed feelings about this "open standard" approach. I do agree with all you say in favor of it. But good standards require a thorough understanding of the requirements, working habits, and technology of the domain of standardization. For reproducible computational research, all these are still evolving rapidly. The two possible negative outcomes of premature standardization are (1) the standard becoming irrelevant and (2) the standard becoming a constraint to progress.

  7. Jan 2017
    1. in an isolated system (one that is not taking in energy)

      This is the key point that invalidates most of the following arguments. Life does not happen in equilibrium. Applying the Second Law to social phenomena such as poverty, or to perceptual phenomena such as misfortune, is at best an analogy, not a proper use of a scientific concept.

  8. Feb 2016
    1. For open source projects momentum is king. The guides, tools, and advice created in this project will be opinionated. This steers users towards the recommended open-source tools. These tools might not be perfect but having more users generates momentum for those tools which results in improvement of those tools. This is better than fragmentation which occurs if individuals go off to build the missing feature in a new project.

      This is true, but it is as much a problem as it is a solution to other problems. Momentum can take you down a dead-end road before you notice. Much of the mess we have to live with was created in this way. And I suspect that the jury judging our proposal is aware of this.

      I'd prefer to re-phrase this with an emphasis on open source and open communities as a mechanism for building a consensus through incremental improvements. Being opinionated and guiding other people's choices is better left to a later time when our choices have been validated by experience.

    2. tools available

      We should also list other initiatives aimed at creating executable papers, if only to show that we are aware of them and willing to learn from them. Plus references to related ideas such as the "transitive credit" idea.

    3. two

      Three. We must also define and document the file formats, APIs, etc. that people need to know in order to reuse parts of an executable paper. Otherwise people would be limited to the functionality of the Web tool, creating a new layer of technological lock-in.

      The "executable paper" is first and foremost an electronic document. The document matters more in the long run than that tool that was used to generate it: https://khinsen.wordpress.com/2015/09/03/beyond-jupyter-whats-in-a-notebook/

    4. Executable papers can directly produce or be retrofitted to produce their key results in a format easily ingested by projects like the Contentmine.

      Re-phrase this as openness for content mining in general, citing a specific initiative only as an example.

    5. The dataset of one Large Electron Positron collider experiment used to be stored on a distributed system but now can be easily stored on a SSD and analysed in its entirety on a laptop. In 100 years undergraduates will routinely rediscover the Higgs boson

      That's a valid point but I wonder if we can find an illustration from the life sciences, which is what the Open Science Prize focuses on.

    6. Each of their projects exists in a separate, customisable environment. They can not interact with each other.

      That's actually a problem as much as it is a solution to other problems. It was and still is my #1 unsolved issue with ActivePapers. In practice you often do have dependencies between projects running in parallel, and managing them is difficult.

      This is not the place to discuss the technicalities, but I'd be careful about selling isolation as an advantage.

    7. higher citation rates, increased reputation, and reduced effort at later stages of a research project.

      Can we back up these claims with any evidence? Given that executable papers barely exist today, that seems difficult. Maybe we should widen the scope of this paragraph to cover "reproducible research".

    8. After the completion of a project a research article is submitted for review to an academic journal and after several iterations either accepted or rejected by the journal. Editors choose the reviewers who volunteer their time and expertise. They receive no credit for their work as their identity is only known to the editor.

      The defects of the reviewing system are outside the scope of this proposal, so I wouldn't talk about them at all. It's sufficient to say that print publications have led to limitations in the kind and amount of information that can be shared, and that we want to break those limits.

    9. how often work is reused

      That actually requires a fourth point on the to-do list: a technique to reference items inside a published executable paper. That's not as simple as it may seem to be, as I learned during my work on ActivePapers.

    10. solved problems

      I'd agree with "mostly solved problems", but there is still some unexplored territory. Which might well be the topic of a competing proposal, so let's be careful with strong claims.

    11. believe that research progresses by building on previous research

      A good place for Newton's famous line about "standing on the shoulders of giants".

    1. a single technical solution

      I see two distinct categories in these use cases. They can probably be handled by a single technical solution, but I think it's worth pointing out the distinction because it matters for users:

      1. Plain text annotation, for human readers
      2. Metadata annotation, in a formal language, for computational processing.
    2. server-side software to support any annotation engine

      That's again a bit mysterious. "Any annotation engine" presumably means something other than hypothes.is. Fine. And the software runs on some server. But what does it do???

    3. an analysis server

      Will you also provide such an analysis server? Or does it already exist? Or do you hope that someone else will create one? It seems that the analysis server is doing all the hard work, so this is not a minor question.

    4. relies on

      "relies on" sounds strange. Until today, I thought that hypothes.is' main reason for existence is to support manual annotation. Why should that suddenly be a problem?

    5. server-side functionality

      It isn't very clear whether the server mentioned here is the hypothes.is server, the server hosting the document to be commented on, or yet another server.

  9. Dec 2015
    1. A more active approach would be, for example, to introduce different types for temperatures and masses.

      Assuming that the type system provides for this. And assuming that it doesn't penalize you for doing this, for example by no longer allowing you to multiply both temperatures and masses by a plain scalar factor.
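
      To make the idea concrete, here is a minimal Haskell sketch (illustrative code, not from any library) of such dedicated types, including the penalty mentioned above: a plain scalar factor no longer applies directly, so each type needs its own scaling function.

      ```haskell
      -- Minimal sketch: distinct types for physical quantities.
      newtype Temperature = Temperature Double deriving (Eq, Show)
      newtype Mass        = Mass Double        deriving (Eq, Show)

      -- The compiler now rejects mixing the two: an expression like
      --   Temperature 300 + Mass 5
      -- is a type error, as desired.

      -- The price: scaling by a plain number must be defined
      -- separately for each type (or via a type class).
      scaleTemperature :: Double -> Temperature -> Temperature
      scaleTemperature k (Temperature t) = Temperature (k * t)

      scaleMass :: Double -> Mass -> Mass
      scaleMass k (Mass m) = Mass (k * m)
      ```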

    2. Haskell has a much more expressive type system than less formal languages, and yet the Haskell community is looking for ways to make the type system even more expressive.

      Again I think the problem is that type systems want to be universal, one size fits all. Even Haskell's type system (at least at the level of the '98 standard, I don't dare make a statement about the hundreds of extensions) is not sufficient to implement dimensional analysis. And yet, implementing dimensional analysis on its own is quite straightforward.
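
      To back up that last claim, here is a sketch (illustrative code, not a real library) of stand-alone dimensional analysis in plain Haskell '98, with the dimension check done at run time. Doing the same check statically would require type-level integers, which is precisely what the '98 type system lacks.

      ```haskell
      -- Dimensions as exponent vectors (length, mass, time).
      data Dim = Dim Int Int Int deriving (Eq, Show)

      data Quantity = Quantity Double Dim deriving (Show)

      -- Addition requires equal dimensions.
      qAdd :: Quantity -> Quantity -> Quantity
      qAdd (Quantity x d1) (Quantity y d2)
        | d1 == d2  = Quantity (x + y) d1
        | otherwise = error "dimension mismatch"

      -- Multiplication adds the exponents.
      qMul :: Quantity -> Quantity -> Quantity
      qMul (Quantity x (Dim l1 m1 t1)) (Quantity y (Dim l2 m2 t2)) =
        Quantity (x * y) (Dim (l1 + l2) (m1 + m2) (t1 + t2))

      -- Base units: qAdd metre metre works, while
      -- qAdd metre second fails at run time.
      metre, kilogram, second :: Quantity
      metre    = Quantity 1 (Dim 1 0 0)
      kilogram = Quantity 1 (Dim 0 1 0)
      second   = Quantity 1 (Dim 0 0 1)
      ```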

    3. Although I mostly agree with this argument, it leaves out cost. Stronger type checking catches more errors, but at what cost?

      Dimensional analysis is indeed a simple protocol with a well-defined application domain. Type systems in programming languages want to be universal and, in most cases, compulsory for all of a program. Gradual typing is more flexible, but still assumes that a single type system is good for everything. Could we have type systems as libraries and use them where appropriate?

  10. Nov 2015
    1. What might such a tool look like?

      It's not just the tool that needs to be invented, it's also a good digital representation for models and facts. The current state (summarized earlier in the essay) is far from good enough. Would an authoring tool support Mathematica AND Simulink AND Modelica etc.? Read data from Excel files, SQL databases, and whatever else people have invented? How can all the elements of an interactive document be brought together? It all comes down to intelligent interfacing, something that the computing industry hasn't been good at until now. Perhaps because the incentives in a competitive market encourage the opposite.

    2. The importance of models may need to be underscored in this age of “big data” and “data mining”.

      Another part that made my day!

    3. What if there were some way Tesla could reveal their open problems?

      This seems more of a social than a technological problem. If Tesla really wanted to discuss their open problems and accept help from outside, they could simply post a description on a Web site. If everyone did that, we'd probably need better technology for organizing and channeling that information, but that's not the immediate issue at hand.

    4. Modelica is a programming language, but it is not a language for software development!

      This section made my day! I have been arguing this point for a while (here for example), but I have the impression that hardly anyone really understands what I am writing about.

    5. I’m also happy to endorse Julia because, well, it’s just about the only example of well-grounded academic research in technical computing.

      Sad but true. I am not so convinced that Julia will make a big difference to scientific computing because it does not even try to address the most important problems that we have in this field (weak replicability, big obstacles to validation, lack of support for the "scientific innovation lifecycle" from exploring ideas to applying tried-and-trusted methods). But yes, it has the merit of stirring up the frozen landscape of scientific languages.

    6. The very concept of a “programming language” originated with languages for scientists — now such languages aren’t even part of the discussion! Yet they remain the tools by which humanity understands the world and builds a better one.

      I have made the same observation and I blame it on two main factors. First, scientific computing has become a niche market because it hasn't grown nearly as much since the 1960s as most other application domains of computing. Second, most scientists and engineers today take computing technology as imposed from outside, not as something they can influence themselves. Why should computer scientists work on new languages for scientific computing if its users are apparently happy with what they have?

    7. The electric grid would collapse.

      One reason for this is that voltage and frequency are maintained within very strict limits in the power grid. But is that still so important? We all have AC adapters that work on pretty much any voltage. For devices that draw more power this is less obvious, but it's worth a thought.

  11. Oct 2015
    1. When you examine the history of innovation, you find, again and again, that scientific breakthroughs are the effect, not the cause, of technological change.

      In complex processes such as research and technology, linear cause-effect relationships are rare. When some people say that A causes B and others reply that B causes A, they are usually both wrong in the sense of arbitrarily focusing on one aspect and neglecting the other one.

      A better pattern to describe that relation is the yin-yang pattern from Asian philosophical traditions. In the case of science and technology, the yin part is scientific research, and the yang part is technological innovation (or engineering, for those who prefer a traditional term to a buzzword). Both depend on each other, and each particular case of a research finding or a new technology can be traced back to an alternation of both aspects.