127 Matching Annotations
  1. Aug 2017
    1. PeerJ review #1:

      Basic reporting

      The paper has clear language, significantly improved since the first review. The dataset has been augmented with extra material and is referenced properly from Figshare with https://doi.org/10.6084/m9.figshare.3980463.v5

      The argumentation is well-structured and well-founded, although a couple of citations or examples are missing, e.g. for the claim that HTML allows ambiguous structures, or for the novel (and unnecessary) use of RDF/XML in a script tag.

      See https://via.hypothes.is/https://essepuntato.github.io/papers/rash-peerj2016/2017-07-06.html#rash-eval for my detailed review per section of this version.

      Experimental design

      The paper describes the motivation and design of the RASH framework well, while also giving an extensive and up-to-date review of comparable technologies and approaches. The paper also explains the challenges and peculiarities encountered in its implementation.

      RASH is a well-designed subset of HTML that emphasizes document structure and semantic annotations. I think it could also be argued that unlike "any HTML5" this design also improves longevity for articles published in RASH HTML.

      My only slight concerns are the extension of WAI-ARIA roles (e.g. "doc-endnotes"), for which I could not find any citation on whether it is allowed (or not) within HTML5, and the novel use of RDF/XML in an HTML script tag.

      Validity of the findings

      The survey part provides valuable insight into the uptake potential of RASH-like technology - although this should be taken with a grain of salt, as the relatively low number of survey participants means the data is (as the authors point out) NOT statistically significant. The paper does, however, provide a good qualitative analysis of the findings, which warrants their inclusion.

      The authors provide well-reasoned conclusions. While my previous review identified some speculative language, this has now been improved.

      Comments for the author

      I am Stian Soiland-Reyes http://orcid.org/0000-0001-9842-9718 and believe in open reviews.

      This review is licensed under a Creative Commons Attribution 4.0 International License http://creativecommons.org/licenses/by/4.0/

      The authors present the RASH framework, a subset of XHTML for academic publishing, along with software tools for its validation and conversion. The paper reviews the state of the art in academic HTML publishing, motivates and details the design of the framework, and evaluates its uptake and future challenges.

      While personally I would have welcomed a more visionary/revolutionary approach for changing academic publishing for the Web, the authors take a more conservative approach with emphasis on pragmatic tooling to support existing authoring workflows (e.g. support for LaTeX and MS Word). From this, RASH can provide a valuable stepping stone for more structured and accessible Web publication workflows for academic publishing.

      I think this is a solid article that presents an important contribution to the further development of web-based scholarly communication.

      I recommend this article as "Accept" - although I have left some annotations in https://via.hypothes.is/https://essepuntato.github.io/papers/rash-peerj2016/2017-07-06.html#rash-eval that I hope the authors will consider (along with this review) if a revision nevertheless takes place.

    2. For instance, several of the papers written by Semantic Web experts do not include any RDF statements other than those enforced by RASH.

      "Eat your own dog food" is still not common practice in the SW community.

    3. SAVE-SD 2015 Survey

      Good qualitative summary of the surveys. We can't conclude too much from the quantitative side (Table 3), as the low number of participants (6+7 authors, 4+3 reviewers) doesn't provide much statistical significance on its own.

    4. rmat, i.e. interactiveness, accessibility and easiness to be processed by machine. In addition, RAJE uses the GitHub API so as to allow authors to store their articles online, to keep track of changes by means of the GitHub services, and to share the articles with others.

      Very cool!

    5. with the attribute type set to application/rdf+xml for adding plain RDF/XML content [19].

      Would RDF/XML actually be valid within XHTML's script tag? Do you mean within a CDATA block? This custom script block sounds dangerous to me without specifying clearly exactly how it should be used and escaped, as (unlike JSON-LD and Turtle) the RDF/XML specifications do not define how to use it inside <script> within HTML/XHTML (perhaps deliberately).

      IMHO no-one should encourage RDF/XML anymore, so I would prefer if RASH never even recommended that :)
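
      To make the escaping concern concrete, here is a minimal sketch using only the Python standard library. The document skeleton, the sample RDF/XML and the helper name script_view are my own illustrative assumptions, not taken from RASH. It shows that an XML parser treats unescaped RDF/XML inside an XHTML script element as child elements of the document, whereas a CDATA section keeps it as opaque text.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative RDF/XML payload (an assumption, not taken from a RASH paper).
RDF_XML = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<rdf:Description rdf:about=""><dc:title>Example</dc:title></rdf:Description>'
    '</rdf:RDF>'
)


def script_view(script_body):
    """Parse a tiny XHTML document as XML and report how the parser sees the
    script element: (its text content, its number of child elements)."""
    document = (
        f'<html xmlns="{XHTML_NS}"><head>'
        f'<script type="application/rdf+xml">{script_body}</script>'
        '</head><body/></html>'
    )
    root = ET.fromstring(document)
    script = root.find(f".//{{{XHTML_NS}}}script")
    return (script.text or "", len(list(script)))


# Unescaped: the RDF/XML becomes part of the XHTML element tree.
raw_text, raw_children = script_view(RDF_XML)

# Wrapped in CDATA: the RDF/XML stays plain text inside the script element.
cdata_text, cdata_children = script_view("<![CDATA[" + RDF_XML + "]]>")
```

      An HTML5 (non-XML) parser treats script content as raw text in both cases, so the same file can be interpreted differently depending on how it is served. That difference is why I would like the expected usage to be spelled out.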

    6. We applied these guidelines for the definition of RASH. We restricted HTML, which does not use the aforementioned patterns in a systematic way, allowing the creation of arbitrary and, sometimes, quite ambiguous structures by selecting a good subset of elements expressive enough to capture the typical components of a scholarly article while being also well-designed, easy to reuse and robust.

      --> HTML does not use the aforementioned patterns in a systematic way, as it allows the creation of arbitrary and, sometimes, quite ambiguous structures.

      To apply the structural pattern guidelines for RASH, we restricted HTML by selecting a good subset of elements expressive enough to capture the typical components of a scholarly article while being also well-designed, easy to reuse and robust.

    7. rendered by browsers or other software readers

      Longevity-wise there's also an issue in that future browsers might not render any "advanced" HTML usage in the same way. Secondly, having the full range of HTML available could encourage the use of active content, which, although it enriches the reading experience today, carries a higher risk that the external content and JavaScript technologies stop working just a few years later.

      While we don't know what future browsers would expect, a restricted HTML subset would presumably run at a lower risk of such degradation - particularly if it largely overlaps with elements that have survived earlier HTML versions.

    8. As of August 10, 2016, the online documentation is mainly a fork of the Scholarly HTML specification proposed by science.ai discussed above.

      How compatible is RASH with this new Scholarly HTML approach?

    9. Of course, publishers, conferences, and workshop organisers, would like to manage new formats in the same way as for the formats they already support

      I don't think this should be taken as a given - we must also push publishers and workshop organisers along into the desired "future".

    10. is expected to be produced from MS Word, ODT and LaTeX sources

      So RASH fits into the existing paper production workflow as an output format -- but should it not also be the target for authoring Web-first papers in HTML?

  2. Nov 2016
    1. The language of the article is OK, but I'm afraid it needs some work in several places to improve clarity, e.g. by rephrasing or simplification.

      Some of the data (the CSV files) has been shared on Figshare and cited as such, but I am missing the raw data of the extracted RDF annotations as well as the scripts used for extraction.

      The HTML file of the article in RASH format has for some reason not been submitted as an additional file (only cited by URL) -- perhaps this was not supported by PeerJ's submission system?

      The RASH framework and associated software are referenced by GitHub URLs, but without using versioning. For archival purposes and future availability I would appreciate a Zenodo or Figshare archive of a tagged version of the software, cited using a DOI.

    2. This article presents RASH, an HTML-based format for authoring, exchanging and publishing academic articles, arguing that this allows a "Web-first" approach to authoring with a focus on content; but with facilities for semantic annotations. The associated RASH software allows conversion to more traditional article styles (for PDF), as well as conversion from a traditional word processor to RASH HTML.

      While the authors argue that we should aim for a "Web first" publishing model with no conversion to traditional PDFs, as championed by the perhaps more visionary linkedresearch.org movement, here it is proposed that RASH gives a pragmatic approach that requires smaller adaptation to existing co-authoring and publishing workflows.

      The authors have performed a kind of usability survey for RASH users at two consecutive workshops, which gives validity to the claims of its purpose, but also (as recognized by the authors) highlights the current gap in tooling and skills to produce the underlying HTML and RDF annotations.

      I think RASH can be seen as an important element of modernizing academic publishing; and while it can be argued that a restricted HTML template like RASH can limit academic authors from publishing articles augmented with state of the art web technology (for instance for interactive data rendering), this model is also a stepping-stone with a stronger focus on portability and longevity that lowers the barriers to get existing publishers on board.

    3. The survey data is robust, but the sample size is perhaps a bit too small to be statistically sound. This is however helped by the fact that the survey was run over two consecutive years.

      The paper makes several unfounded claims using words like "guarantees" - I believe this is more of an English language/grammar issue than actual claims. See detailed comments in-line.

    4. It allows authors to use RDFa annotations [31] within any element of the language5. In addition to RDFa, RASH makes available another way to add RDF statements to the document, i.e., the use of an element script (with the attribute type set to application/rdf+xml, text/turtle or to application/ld+json) within the element head for adding plain RDF/XML [13], Turtle [28] or JSON-LD content [32].

      I would remove these RDF details as this is too early/scary here - they are also explained more later. Perhaps just "Allows authors to use embedded RDF annotations"

    5. Note that accepting HTML as format for submissions in conferences/workshops is a totally different issue, since this choice is normally taken by the organisers.

      I've found that EasyChair had problems accepting RASH HTML files, so I instead had to make my own PDF from the browser, which made it difficult to ensure that the hyperlinks still worked.

    6. For instance, several of the papers written by Semantic Web experts do not include any RDF statement in addition to those annotations that are enforced by RASH.

      Perhaps the authors were not too concerned with adding RDF if the conference call did not make any "carrot" promises of what any RDF statements would be used for, e.g. being added to a SPARQL-queryable triple store across all submitted papers, or being recognized/rendered by the publisher's system.

    7. e W3C RDFa 1.1 Distiller service on e

      Note that of the three <script> options mentioned above, this Distiller only supports RDFa and embedded Turtle, not embedded JSON-LD or the custom embedded RDF/XML.
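
      As a rough sketch of how an author could check this before relying on the Distiller, the following standard-library Python lists the embedded RDF script blocks in a document and flags the ones that, per my note above, would be ignored. The sample document, the constant names and the helper unsupported_script_types are illustrative assumptions, and the supported set reflects only my observation at the time of this review.

```python
from html.parser import HTMLParser

# Media types the paper names for embedding RDF in <script> elements.
EMBEDDED_RDF_TYPES = {"application/rdf+xml", "text/turtle", "application/ld+json"}

# Assumption from the note above: of these, only embedded Turtle is picked up.
DISTILLER_SCRIPT_TYPES = {"text/turtle"}


class ScriptTypeCollector(HTMLParser):
    """Collects the type attribute of every embedded-RDF <script> element."""

    def __init__(self):
        super().__init__()
        self.types = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        media_type = (dict(attrs).get("type") or "").strip().lower()
        if media_type in EMBEDDED_RDF_TYPES:
            self.types.append(media_type)


def unsupported_script_types(html_text):
    """Return embedded RDF script types that would not be extracted."""
    collector = ScriptTypeCollector()
    collector.feed(html_text)
    return [t for t in collector.types if t not in DISTILLER_SCRIPT_TYPES]


# Hypothetical document head with two RDF blocks and one ordinary script.
SAMPLE = """<html><head>
<script type="text/turtle">@prefix dc: <http://purl.org/dc/elements/1.1/> .</script>
<script type="application/ld+json">{"@id": ""}</script>
<script type="text/javascript">var x = 1;</script>
</head><body></body></html>"""

flagged = unsupported_script_types(SAMPLE)
```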

    8. Some models are already available under the terms of the Apache Licence at http://opennlp.sourceforge.net/models-1.5/. [back]

      Remove footnote, just make the sourceforge URL be the citation.

    9. Since, the program committee, the reviews and the editors will also have access to a LaTeX or a PDF version of the paper, the RASH file is an addition that does not preclude any current workflows

      Rephrase: "As the program committee, reviewers and editors also have access ..."

    10. automatically

      How is it done automatically? Is there an OpenOffice extension for this?

      Do you mean "by executing the converter script"?

      ODT documents are ZIP files, so how can an XSLT script access the XML inside them without first unpacking? Presumably this must be coordinated from a second script. Code citation?
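
      To illustrate the kind of coordinating step I have in mind, here is a minimal sketch in standard-library Python. It is not the RASH converter; the function names and the toy archive are my own assumptions. It opens the ODT as a ZIP archive and hands the content.xml member to an XML parser, which is the point where an XSLT processor could take over.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_toy_odt(paragraphs):
    """Build a stripped-down, ODT-like ZIP in memory (for illustration only;
    a real ODT has further members such as mimetype and styles.xml).
    Paragraphs must be plain text without XML special characters."""
    body = "".join(f"<text:p>{p}</text:p>" for p in paragraphs)
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
    return buffer.getvalue()


def extract_paragraphs(odt_bytes):
    """Unpack content.xml from the archive and return its paragraph texts."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return [p.text or "" for p in root.iter(f"{{{TEXT_NS}}}p")]


paragraphs = extract_paragraphs(make_toy_odt(["Introduction", "Related work"]))
```

      Whatever wrapper the authors actually use for this unpacking step is what I would like to see cited.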

    11. Among other things above just using the RASH grammar only, this script adds relatively sophisticated checking of the datatype microsyntaxes of attribute values.

      Simplify sentence: This script also checks datatype microsyntaxes.

    12. This ensures that RASH users get alerted to more potential mistakes in their documents so that they can easily fix them.

      Rephrase, something like: This will hopefully help RASH users to discover and fix any mistakes in their documents early.

    13. The whole visualisation of this document (as any other RASH document)

      As this paper, if accepted, is transferred to other forms (e.g. PDF), this sentence would no longer be true. Rephrase to "The visualizations of RASH documents are rendered by the browser .."

    14. The full project is available at https://github.com/essepuntato/rash/. Please use the hashtag #rashfwk for referring to any of the items defined in the RASH Framework via Twitter or other social platforms.

      This footnote is appropriate for a website, but not for a paper. I would remove it and just use a normal citation for https://github.com/essepuntato/rash/

      I would also prefer a software citation using a Zenodo archived version of the GitHub repository.

    15. It is worth noticing that, excepting three properties from schema.org for defining author's metadata (see Section 2 of the RASH documentation for additional details), RASH does not constrain any particular vocabulary for introducing RDF statements

      Rephrase: It is worth noticing that RASH does not constrain any particular vocabulary for introducing RDF statements, except for three properties from schema.org for defining author's metadata (see Section 2 of the RASH documentation for additional details).

    16. https://cdn.mathjax.org

      If RASH papers are using such external scripts, which, like here, refer to /latest/ and can change (or disappear) over time - is there not a potentially large preservability problem in that RASH documents can't be archived directly?

      I think you can argue that RASH is mainly a "live" format for interchange between co-authors, reviewers and publishers, and that once a paper is to be published, the document is "rendered" from RASH to PDF and/or publication-specific HTML; where such HTML would be modified to use snapshots/archived external resources for embedded images and scripts. Perhaps this could be a new small paragraph or part of Discussion?
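
      As a sketch of how such dependencies could be found before archiving, the following standard-library Python lists the scripts, stylesheets and images that a document loads from another host. The sample markup, the cdn.example.org URL and the helper names are hypothetical, and a real check would also need to cover CSS imports, fonts and similar indirect references.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Which attribute carries the resource URL for each element of interest.
URL_ATTRIBUTES = {"script": "src", "img": "src", "link": "href"}


class ExternalResourceCollector(HTMLParser):
    """Collects (tag, url) pairs for resources served from another host."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        attribute = URL_ATTRIBUTES.get(tag)
        if attribute is None:
            return
        url = dict(attrs).get(attribute)
        # A non-empty network location means the resource is not local
        # (this also catches protocol-relative URLs such as //host/path).
        if url and urlparse(url).netloc:
            self.external.append((tag, url))


def external_resources(html_text):
    collector = ExternalResourceCollector()
    collector.feed(html_text)
    return collector.external


# Hypothetical article head and body; the CDN URL is a placeholder.
SAMPLE = """<html><head>
<link rel="stylesheet" href="css/rash.css"/>
<script src="https://cdn.example.org/mathjax/latest/MathJax.js"></script>
</head><body><img src="img/figure1.png" alt="Figure 1"/></body></html>"""

found = external_resources(SAMPLE)
```

      Anything reported this way would have to be snapshotted and rewritten to a local or archived copy for the document to be preservable on its own.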

    17. The first, is a container-based behaviour, also suggested by JATS [21] by means of the element fn-group, that allows one to specify footnotes (through the element ft) by using a tag that is totally separated from the main text from which it is referenced (usually through XML attributes), as shown in the following excerpt:

      It's worth pointing out that fn-group and ft are JATS-specific elements that are not part of HTML5.

    18. A different discourse can be done for the pattern popup, which is used for any structure that, while still not allowing text content inside itself, is nonetheless found in elements with a mixed content context [+t+s], and it is meant to represent complex substructures that interrupt but do not break the main flow of the text,

      I'm afraid I didn't understand these sentences well - could they be rephrased or shortened?

    19. Formulas have been taken in particular consideration, since different ways are possible so as to implement them

      Rephrase to avoid "possible so as" and use active language: "We have given particular consideration to formulas, since there are different ways to implement them".

    20. which is not pattern-based at all

      This claim is unfounded - HTML, and in particular XHTML, has a detailed specification of which elements are allowed where -- however, these are not fully the structural patterns you mention above.

    21. However, leaving the user (i.e., the author) the freedom of using, potentially, the whole HTML specification may affect, in some way, the whole writing and publishing process of articles.

      Full HTML would also need serious consideration for archival and future accessibility purposes. RASH has some of the same problems, which are not addressed here - for instance, how do you make sure the CSS and images of the article are carried along with the HTML file?

      Perhaps the article needs to be archived in a Research Object or archived using archive.org.

    22. The rest of the paper is structured as follows. In Section 2 we introduce some of the most relevant related works in the area, providing a functional comparison of the various works. In Section 3 we introduce the rationale for the creation of a new Web-first format for scholarly publication, discussing the importance of minimality. In Section 4 and Section 5 we introduce the theoretical background of RASH, and then provide an introduction of the language and the main tools included in its Framework. In Section 6, as a case study, we discuss the use of RASH as one of the formats for submitting papers to the SAVE-SD 2015 and SAVE-SD 2016 workshops. Finally, in Section 7 we conclude the paper sketching out some future developments.

      "Section 2" etc. here and elsewhere don't show up correctly in the PDF submitted to PeerJ - the numbers are missing.

    23. is meant to be produced from MS Word, ODT and LaTeX sources

      Yet the surveys in this paper describe authors mainly writing RASH by hand, not in Word/ODT/LaTeX.

      Rephrase to something like "the goal of RASH is to be produced from .." ?