136 Matching Annotations
  1. Nov 2023
  2. Oct 2023
    1. In a journal article or manuscript a sample identified by IGSN SSH000SUA may look like this (tagged IGSN): IGSN:SSH000SUA

      Manuscript tagging
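
A minimal sketch of how tagged IGSNs like the one quoted above could be picked out of manuscript text. It assumes the tag is the literal prefix "IGSN:" followed by uppercase letters and digits, as in the example; the function name and the sample sentence are hypothetical.

```python
import re

# Assumption: an IGSN value is a run of uppercase letters and digits, tagged
# in running text with the literal prefix "IGSN:" as in the quoted example.
IGSN_TAG = re.compile(r"\bIGSN:\s*([A-Z0-9]+)\b")


def find_igsn_tags(text: str) -> list[str]:
    """Return every IGSN value tagged as 'IGSN:<value>' in the text."""
    return IGSN_TAG.findall(text)


# Hypothetical sentence built around the IGSN from the quoted example.
sample = "Zircons were separated from sample IGSN:SSH000SUA before dating."
print(find_igsn_tags(sample))
```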

    2. Unlike many other persistent identifiers, an IGSN is not only used by machines but also needs to be handled by humans.

      Human-readable

    1. DOIs have a business model. LSIDs currently do not. Without a business model (read funding) we should stick to something that doesn’t have the implementation/adoption impediment of LSIDs and make the best of it (i.e. just have a usage policy for HTTP URIs).
    2. Without some kind of persistence mechanism the only advantage of LSIDs is that they look like they are supposed to be persistent. Unfortunately, because many people are using UUIDs as their object identifiers LSIDs actually look like something you wouldn’t want to look at let alone expose to a user! CoL actually hide them because they look like this: urn:lsid:catalogueoflife.org:taxon:d755ba3e-29c1-102b-9a4a-00304854f820:ac2009
    3. The act of minting an LSID indicates that you intend to try to make it permanent or at least never re-use it for another resource.
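
The Catalogue of Life LSID quoted above can be split into its parts, which shows why a UUID-based object identifier makes the string unfriendly to read. This is a sketch under the assumption that an LSID has the colon-separated form urn:lsid:authority:namespace:object with an optional revision; the class and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(lsid: str) -> Lsid:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    authority, namespace, object_id = parts[2:5]
    # The revision is optional, so it is only present in the six-part form.
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


col = parse_lsid(
    "urn:lsid:catalogueoflife.org:taxon:d755ba3e-29c1-102b-9a4a-00304854f820:ac2009"
)
print(col.authority, col.namespace, col.object_id, col.revision)
```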
    1. The preferred PID scheme: In consideration of the foregoing, the strongest option across the studied major dimensions of the available Handle System PID schemes and operational modes is for DiSSCo to use DOIs to identify Digital Specimens. The case for choosing DOI comes out slightly more strongly than choosing ePIC for reasons related to the substantial achievements, operational experience and reputation of DOI/IDF to date. Operating under another Handle-system prefix than those used by IDF and ePIC is substantially the weakest option because of the difficulties associated with introducing an identifier that is not perceived to be a DOI. The term ‘DOI’ is trademarked by the IDF and thus not available for describing other identifiers. The practical and sensible avenues to explore further are the options to establish and become an RA member of the DOI Foundation (option A5) and to enter a strategic alliance at the level of the DOI Foundation (option A1). These options are likely most effective when actioned in combination.

      Preferred PID Scheme

    2. When digitized, each resulting ‘Digital Specimen’ must be persistently and unambiguously identified. Subsequent events or transactions associated with the Digital Specimen, such as annotation and/or modification by a scientist must be recorded, stored and also unambiguously identified.

      Workflows

    3. Persistent identifiers (PID) to identify digital representations of physical specimens in natural science collections (i.e., digital specimens) unambiguously and uniquely on the Internet are one of the mechanisms for digitally transforming collections-based science.

      Use case

    4. Information quality is the strongest factor to influence organizational benefits through perceived usefulness and user satisfaction

      Information quality=precision

    5. Digital Specimen

      Entity

    6. Appropriate identifiers Requirement: PIDs appropriate to the digital object type being persistently identified.

      Appropriate

    7. Governance

      Governance

    8. Persistence

      Persistence

    9. Trust

      Trust

    10. Available Handle-based PID schemes

      Topology of Handle-based schemes

    11. Scalability

      Characteristics

    12. Collectively the IDF and its RA members assume the long-term responsibility to maintain and sustain the DOI PID scheme for everyone.

      Characteristics

    1. Compact identifiers are a longstanding informal convention in bioinformatics. To be used as globally unique, persistent, web-resolvable identifiers, they require a commonly agreed namespace registry with maintenance rules and clear governance; a set of redirection rules for converting namespace prefixes, provider codes and local identifiers to resolution URLs; and deployed production-quality resolvers with long-term sustainability.

      Characteristics

    1. Wittenburg, P., Hellström, M., Zwölf, C.-M., Abroshan, H., Asmi, A., Di Bernardo, G., Couvreur, D., Gaizer, T., Holub, P., Hooft, R., Häggström, I., Kohler, M., Koureas, D., Kuchinke, W., Milanesi, L., Padfield, J., Rosato, A., Staiger, C., van Uytvanck, D., & Weigel, T. (2017). Persistent identifiers: Consolidated assertions. Status of November, 2017. Zenodo. https://doi.org/10.5281/zenodo.1116189

      Characteristics

    1. Over time the risk grows that the document is no longer accessible at the location given as reference. Web servers that follow the HTTP protocol then give the notorious reply: ‘404 not found’. This resembles the situation of a book in a (very large) library that is not on the shelf at the position indicated in the catalogue. How is it to be found?

      PID Issues

    2. In the mid 1990s, a number of schemes were developed that, rather than relying on the precise address of a document, introduced the idea of name spaces for recording the names and locations of documents.

      History

    1. Archives. The Member shall use best efforts to contract with a third-party archive or other content host (an "Archive") (a list of which can be found here) for such Archive to preserve the Member's Content and, in the event that the Member ceases to host the Member's Content, to make such Content available for persistent linking.

      Characteristics

    2. Maintaining and Updating Metadata.

      Characteristics

    1. ePIC PID service, and an organisation, that provides a PID service and is not an ePIC member, can ask for a certification, that it provides its PID service along the lines of the ePIC rules and policies.

      Characteristics

    2. ePIC has rules and policies on how to provide PID services and how to ensure the reliability that is necessary for the persistence of access to digital objects via ePIC PIDs.

      Characteristics

    3. Registered production PIDs will not be deleted. They are used as a kind of tombstone even if the underlying data is not available anymore.

      Characteristics

    4. Persistent identifiers (PIDs) are an abstraction layer that arbitrates between the reference of a digital object and its location.

      Definition

    1. The developer and administrator of the DOI system is the International DOI Foundation (IDF), which introduced it in 2000

      Persistence Maximum

    1. 5.2. Key ? was(DESCRIPTION) when(DATE) resync This "metadata" command form provides nothing more than a way to carry a Key along with its description. The form is a "no-op" (except when "resync" is present) in the sense that the Key is treated as an adorned URL (as if no THUMP request were present). This form is designed as a passive data structure that pairs a hyperlink with its metadata so that a formatted description might be surfaced by a client-side trigger event such as a "mouse-over". It is passive in the sense that selecting ("clicking on") the URL should result in ordinary access via the Key-as-pure-link as if no THUMP request were present. The form is effectively a metadata cache, and the DATE of last extraction tells how fresh it is. The "was" pseudo-command takes multiple arguments separated by "|", the first argument identifying the kind of DESCRIPTION that follows, e.g,

      ARK Kernel Metadata Query

    1. To resolve a Compact ARK (ie, an ARK beginning "ark:") it must initially be promoted to a Mapping ARK so that it becomes actionable. On the web, this means finding a suitable web Resolver Service to prepend to the compact form of the identifier in order to convert it to a URL (cf [CURIE]). (This is more or less true for any type of identifier not already in URL form.)

      Characteristics
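
The promotion step described above amounts to prepending a resolver to the compact form. The sketch below assumes n2t.net as the Resolver Service purely for illustration (any ARK-aware resolver could be substituted), and the ARK itself is a made-up example value; the function name is hypothetical.

```python
# Assumption: n2t.net stands in for "a suitable web Resolver Service";
# swap in another resolver base URL as needed.
DEFAULT_RESOLVER = "https://n2t.net/"


def promote_ark(compact_ark: str, resolver: str = DEFAULT_RESOLVER) -> str:
    """Turn a Compact ARK ('ark:...') into an actionable URL."""
    if not compact_ark.startswith("ark:"):
        raise ValueError(f"not a Compact ARK: {compact_ark!r}")
    # Normalise the resolver so exactly one slash separates it from the ARK.
    return resolver.rstrip("/") + "/" + compact_ark


# Hypothetical ARK used only to show the shape of the result.
print(promote_ark("ark:/12345/x6np1wh8k"))
```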

    1. CLARIN: European Research Infrastructure for Language Resources and Technology; CNIC: Computer Network Information Center, Chinese Academy of Sciences, China; CSC: IT Center for Science; CSCS: Swiss National Supercomputing Centre; DKRZ: Deutsches Klimarechenzentrum; GRNET: Greek Research and Technology Network; GWDG: Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen; SND: Swedish National Data Center; SURF: SURF is the collaborative ICT organization for higher education and research in the Netherlands

      ePIC Members

    1. When content underlying a DOI is updated, we recommend updating the DOI metadata and, for major changes, assigning a new DOI. For minor content changes, the same DOI may be used with updated metadata. A new DOI is not required. For major content changes, we recommend assigning a new DOI and linking the new DOI to the previous DOI with related identifiers.

      Characteristics

    2. To enable easy usability for both humans and machines, a DOI should resolve to a landing page that contains information about the DOI being resolved. It is the responsibility of the entity creating the DOI to provide such a landing page. The following are best practices for creating well-formed DOI landing pages.

      Characteristics

    3. there may be infrequent cases where it is not desirable for the item described by a DOI to be available publicly, such as in the case of research retraction. In these cases, it is best practice to still provide a "tombstone page", which is a special type of landing page describing the item that has been removed.

      Characteristics

    1. certification: If certified, acronym for certification organization or standard (e.g., TRAC, TDR, DSA) and year of certification.

      Certification This potentially includes many of the features of PIDs already listed

    2. succession: The plan for dealing with sudden loss of provider viability, including set-aside funding and length of time that operations would be able to support continued operation while a successor provider is found to keep references intact.

      Succession

    3. mission: One sentence mission statement of the organization.

      Mission

    4. business model: For profit (FP) or not for profit (NP).

      Characteristics

    5. name: Full name of the provider organization. identifier: Unique identifier for the organization.

      Provider Identity

    6. check character (CC): A check character is incorporated in the assigned identifier to guard against common transcription errors.

      Mitigation
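
To make the idea of a check character concrete, here is an illustrative position-weighted scheme. It is an assumption made for this sketch and not the algorithm of any particular PID scheme: the alphabet, the weighting, and the function names are all hypothetical.

```python
# Illustrative alphabet of 29 characters (digits plus consonants). A prime
# alphabet size is what makes the guarantees below hold.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(body: str) -> str:
    """Compute a check character as a position-weighted sum modulo 29."""
    # ALPHABET.index raises ValueError for characters outside the alphabet.
    total = sum(pos * ALPHABET.index(ch) for pos, ch in enumerate(body, start=1))
    return ALPHABET[total % len(ALPHABET)]


def add_check(body: str) -> str:
    """Append the check character to an identifier body."""
    return body + check_char(body)


def is_valid(identifier: str) -> bool:
    """True if the last character matches the check computed from the rest."""
    if len(identifier) < 2:
        return False
    try:
        return identifier[-1] == check_char(identifier[:-1])
    except ValueError:
        return False


tagged = add_check("x6np1wh8")
print(tagged, is_valid(tagged))
```

For bodies shorter than 29 characters, this weighting catches any single mistyped character and any swap of two adjacent, different characters, which are the common transcription errors the definition refers to.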

    7. inflection: a change to the ending of an object’s id string in order to obtain a reference to content related to the originally referenced content.

      Expectations A form of content negotiation

    8. landing: content intended mostly for human consumption, such as an object description and links to primary information (e.g., an image file or a spreadsheet), to alternate versions and formats, and to related information; from “landing page”, this is intended to support a browsing experience of an abstract overall view of the object.

      Expectations

    9. plunging: content intended as primary object information, often required or directly usable by software; from “below the landing page”, this is intended to support an immersive object experience that bypasses any browsing step.

      Expectations

    10. introversioned: a kind of intraversioned content for which the version identifier (within the object identifier) is opaque, e.g., “http://doi.org/10.2345/678”, which happens to be version 4.

      Expectations

    11. intraversioned: a version identifier is part of the id string, e.g., “http://doi.org/10.2345/67.V4”.

      Expectations

    12. extraversioned: a version identifier is separate from the id string, so that the actionable id does not lead to specific version without human intervention, e.g., “http://doi.org/10.2345/67”, Version 4.

      Expectations
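
The difference between the three versioning terms can be seen by trying to read a version out of the id string. The sketch assumes the version is written as a trailing ".V<number>", which is only the convention used in the quoted example; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: an intraversioned id string ends in ".V<number>", as in the
# quoted example; real providers may use other conventions.
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.V(?P<version>\d+)$")


def split_intraversioned(id_string: str) -> Tuple[str, Optional[int]]:
    """Return (base id string, version), or (id string, None) if no version is visible."""
    match = _VERSION_SUFFIX.match(id_string)
    if match is None:
        # Introversioned and extraversioned ids both end up here: the string
        # alone does not reveal which version it refers to.
        return id_string, None
    return match.group("base"), int(match.group("version"))


print(split_intraversioned("http://doi.org/10.2345/67.V4"))
print(split_intraversioned("http://doi.org/10.2345/678"))
```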

    13. Precisely when such assignment will be triggered depends on policy that will differ across objects, collections, and providers.

      Trigger

    14. waxing: change that is limited to appending content in a way that does not in itself disrupt or displace previously recorded content. Examples of waxing objects include live sensor-based data feeds, citation databases, and serial publications.

      Expectations Dynamic Citation

    15. subinfinite: due to succession arrangements, the object is expected to be available beyond the provider organization’s lifetime.

      Expectations

    16. lifetime: the object is expected to be available as long as the provider exists.

      Expectations

    17. indefinite: the provider has no particular commitment to the object.

      Expectations

    18. finite: availability is expected to end on or around a given date (e.g., limited support for software versions not marked “long term stable”) or trigger event (e.g., single-use link).

      Expectations

    19. finite: availability is expected to end on or around a given date (e.g., limited support for software versions not marked “long term stable”) or trigger event (e.g., single-use link). indefinite: the provider has no particular commitment to the object. lifetime: the object is expected to be available as long as the provider exists. subinfinite: due to succession arrangements, the object is expected to be available beyond the provider organization’s lifetime.

      Expectations

      'Indefinite' should rather be 'Undefined'

    20. We define content variance to be a description of the ways in which provider policy or practice anticipates how an object’s content will change over time. Approaches to content variance differ depending on the object, version, service, and provider.

      Expectations

    21. molting: Previously recorded content may be entirely overwritten at any time with content that preserves thematic continuity. For example, an organization’s homepage may be completely reworked while continuing to be its homepage, and a weather or financial service page may reflect dramatic changes in conditions several times a day.

      Expectations

    22. rising: Previously recorded content may be improved at any time, for example, with better metadata (datasets), new features (software), or new insights (pre- and post-prints). This encompasses any change under “fixing”

      Expectations

    23. fixing: Previously recorded content may be corrected at any time, in addition to any change under “keeping”

      Expectations

    24. keeping: Previously recorded content will not change, but character, compression, and markup encodings may change during a format migration, and high-priority security concerns will be acted upon (e.g., software virus decontamination, security patching).

      Expectations

    25. frozen: The bit stream representing previously recorded content will not change

      Expectations

    26. id string: the sequence of characters that is the identifier string itself, possibly modified by adding a well-known prefix (often starting with http://) in order to turn it into a URL. identifier: an association between an id string and a thing; e.g., an identifier “breaks” when the association breaks, but to act on an identifier requires its id string. actionable identifier: an identifier whose id string may be acted upon by widely available software systems such as web browsers; e.g., URLs are actionable identifiers.

      Classes of identifier

    27. On the other hand, the universal numeric fingerprint (Altman & King 2007) is a PID that supports citation of numeric data in a way that is largely immune to the syntactic formatting and packaging of the data

      Versioning

    28. By contrast, repositories such as figshare (figshare 2016) and Merritt (Abrams et al. 2011) tolerate changes to metadata under the PID assigned originally, but create a new “versioned” PID if the object title or a component file changes, and in the latter case, the original non-versioned PID always references the latest version

      Versioning

    29. The DataONE federated data network (Michener et al. 2011) assigns a PID to immutable data objects and a “series identifier” that resolves to the latest version of an object (DataONE 2015).

      Versioning

    30. Hey, T, Tansley, S and Tolle, K (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery. Redmond, Washington: Microsoft Research.

      Citation stability

    31. At a minimum it implies a prediction about an archive’s commitment and capacity to provide some specific kind of long-term functionality

      Persistence

    32. persistence is purely a matter of service

      Persistence

    1. Datafiles can be published with a suitable embargo period, for instance to allow completion of publications or research based on the dataset, or to respect contracts made by the depositor with third parties concerning intellectual property rights. DANS encourages embargo periods of 6 months or less.

      Expectations

    2. If a published dataset is improved by amendments to the data files of the dataset, a major version increment is created with a record of changes. In cases where it is necessary to disable access to earlier versions, these can be deaccessioned

      Expectations

    3. Data Management: Provenance, Versioning, and Reliable Identification

      Versioning

    1. Content drift describes the case where the resource identified by its URI changes over time and hence, as time goes by, the request returns content that becomes less and less representative of what was originally referenced.

      Content Drift
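
One rough way to put a number on content drift is to compare the content that was originally referenced with what the URI returns now. The sketch uses the standard library's difflib as a stand-in similarity measure; the choice of measure, the function name, and the sample strings are assumptions for illustration, not a method taken from the quoted source.

```python
from difflib import SequenceMatcher


def drift_score(referenced: str, current: str) -> float:
    """Return 0.0 when the content is unchanged and 1.0 when nothing matches."""
    return 1.0 - SequenceMatcher(None, referenced, current).ratio()


# Hypothetical snapshots of the same URI at two points in time.
then = "Persistent identifiers for digital specimens"
now = "Persistent identifiers for physical samples"
print(round(drift_score(then, now), 2))
```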

    1. The Handle System was first implemented in autumn 1994, and was administered and operated by CNRI until December 2015, when a new "multi-primary administrator" (MPA) mode of operation was introduced

      Handle system introduction

    1. These findings provide strong indicators that scholarly content providers reply to DOI requests differently, depending on the request method, the originating network environment, and institutional subscription levels

      PID Resolution factors

    1. In addition, PIDs may be local to an individual organization (e.g. identifiers in an internal human resources system), national (e.g. the DAI – Digital Author Identifier, used in the Netherlands), or global (all the examples in the paragraph above).

      PID Scope

    2. identifiers for research objects and outputs, for example, DOIs (digital object identifiers), Archival Resource Key identifiers (ARKs), handles and IGSNs (International Geo Sample Number).

      PID Entities - research outputs

    3. identifiers for organizations, including GRID (Global Research Identifier Database), Ringgold IDs, ISNIs (International Standard Name Identifiers), LEIs (legal entity identifiers) and the identifiers that will be provided by the recently announced Research Organization Registry

      PID Entities - organisations

    4. identifiers for researchers, such as ORCID iDs, ResearcherIDs and Scopus IDs

      PID Entities - Researchers

    1. ARK systems such as Noid and N2T can record and provide metadata about any resource with an ARK.  That metadata becomes available via APIs, and can be seen when you add “?” to the end of an ARK URL. (See “Inflections” below) ARK metadata is very flexible, with no initial required metadata, but with support for multiple metadata schemas.  This flexibility is intentional: ARKs are designed to support a full digital object workflow, including the earliest stages before a resource is well-understood or described.

      ARK Metadata

    1. The ARK Alliance maintains a complete registry of all assigned NAANs, currently at the California Digital Library. The registry is mirrored at the (U.S.) National Library of Medicine and the National Library of France.

      PID Registry

  3. Sep 2023
    1. PIDs are exemplary implementations of FAIR data in their own right, but they also help to provide FAIR access to research entities like articles and datasets.

      Benefits

    2. DOIs are a great solution for the problem of URIs that change over time, but this approach does depend on journal publishers, repositories, libraries, and other major hosting organizations to be responsible for maintaining current link information within the DOI records that they have created

      Integrity

    3. Ultimately the knowledge graph will permit a much clearer understanding of global research networks, research impact, and the ways in which knowledge is created in a highly interconnected world.

      Use Case

    4. PIDs infrastructure promises much more accurate and timely reporting for key metrics including the number of publications produced at an institution in a given year, the total number of grants, and the amount of grant funding received.

      Use Case

    5. They can also much more easily see whether researchers have met mandated obligations for open access publishing and open data sharing.

      Use Case

    6. The global knowledge graph created by the interlinking of PIDs can help funders to much more easily identify the publications, patents, collaborations, and open knowledge resources that are generated through their various granting programs.

      Use Case

    7. Benefits to Researchers

      Time-saving: reduction in administrative burden

    8. identifiers will continue to resolve indefinitely.

      Resolvability

    9. All PID Registration Agencies must have highly redundant storage and hosting infrastructure in order to ensure that services are globally available 24-7

      Redundancy

    10. Persistent

      Persistent

    11. Machine-Readable

      {Machine-Readable}

    12. Globally Unique Names

      {Globally Unique}

    1. Systems such as DOI can thus support resolution mechanisms that are likely to be able to maintain the resolution of identifiers regardless of changes in technology or to one particular system.

      {Protocol Independence}

    1. Brown, Josh, Jones, Phill, Meadows, Alice, & Murphy, Fiona. (2022). Incentives to invest in identifiers: A cost-benefit analysis of persistent identifiers in Australian research systems. Zenodo. https://doi.org/10.5281/zenodo.7100578

      P1: Benefits of PIDs

    1. PIDs for research data; PIDs for instruments; PIDs for academic events; PIDs for cultural objects and their contexts; PIDs for organizations and projects; PIDs for researchers and contributors; PIDs for physical objects; PIDs for open-access publishing services and current research information systems (CRIS); PIDs for software; PIDs for text publications

      PID Use Case Elements, entities

    1. Although the DOIs assigned to relatively large aggregations of datasets are well suited for citation and acknowledgment purposes, they are not issued at fine enough granularity to meet the scientific imperative that published results should be traceable and verifiable

      Reproducibility

    2. Persistent identifiers for acknowledgment and citation

      PID Use Cases

    3. One key element is to generate a dataset-centric rather than system-centric focus, with an aim to making the infrastructure less prone to systemic failure.

      PID Motivations

    4. scientific reproducibility and accountability

      PID Motivations

    1. To reuse and/or reproduce research it is desirable that research output be available with sufficient context and details for both humans and machines to be able to interpret the data as described in the FAIR principles

      Reusability and reproducibility of research output

    2. Registration of research output is necessary to report to funders like NWO, ZonMW, SIA, etc. for monitoring and evaluation of research (e.g. according to SEP or BKO protocols). Persistent identifiers can be applied to ease the administrative burden. This results in better reporting, better information management and in the end better research information.

      Registering and reporting research

    3. 1. Registration and reporting research; 2. Reusability and reproducibility of research; 3. Evaluation and recognition of research; 4. Grant application; 5. Researcher profiling; 6. Journal rankings

      PID Use Cases

    1. Deduplication of researchers; Linkage with awards; Authoritative attribution of affiliation and works | ORCID iD | Recommended
       Identification of datasets, software and other types of research outputs | DataCite DOI | Recommended
       Identification of organisations | GRID/ROR | Recommended
       Identification of organisations in NZRIS | NZBN | Required for data providers

      PID Use Cases

    1. Function PID type (Examples:) Recommended or required?

      PID Use Cases

    2. The progress and impact of the project will be measured and monitored through the collection of quantitative indicators. The different systems of the project partners as well as ORCID Inc. and ROR will be queried. If possible, indicators for all 10 PID use cases should be measured. These include for example the following indicators:
       ● Number of registered DataCite DOIs by scientific institutions in Germany.
       ● Number of registered DataCite-DOIs that have a link to further resources via a related-IDentifier relationship.
       ● Number of ROR implementations at scientific institutions in Germany.
       ● Number of GND records that have an ORCID iD or a ROR ID.
       ● Etc.

      PID Use Cases

    1. Key features
       ● KISTI’s mission is to curate, collect, consolidate, and provide scientific information to Korean researchers and institutions. It includes but is not limited to:
         ■ Curating Korean R&D outputs. Curate them to a higher state of identification for better curation, tracking research impact, analysing research outcomes.
         ■ DOI RA management. Issuing DOIs to Korean research outputs, intellectual properties, research data.
         ■ Support Korean societies to stimulate better visibility of their journal articles around the world.
         ■ Collaborate for better curation (identification and interlinking) with domestic and global scientific information management institutions, publishers and identifier managing agencies.

      PID Use Cases

    1. Name of infrastructure | Key purpose | List of integrated PIDs
       Fairdata.fi | Research data publication, metadata hub and preservation service | DOI, URN, ORCID (upda
       Research.fi | National research data hub. | Current draft: ADSbibcode - Astrophysics Data System - Bibliographic Reference Code (en); ARK - Archival Resource Key (en); arXiv - arXiv identifier scheme (en); BusinessID - Y-tunnus (fi) (en); Crossref_funders - Crossref Funder Registry (en); DOI - Digital Object Identifier (en)
       (Case Study: FINLAND)

      PID Use Cases

    1. Name of infrastructure | Key purpose | List of integrated PIDs
       e-infra | This large infrastructure will build the National Repository Platform in the upcoming years. That should greatly facilitate adoption of PIDs. | TBD
       National CRIS - IS VaVaI (R&D Information System) | National research information system. We plan on working with Research, Development and Innovation Council (in charge of IS VaVaI) on integrating global PIDs into their submission processes as required. Nowadays it uses mostly local identifiers. | TBD
       Institutional CRIS systems | Various institutional CRIS systems at Czech RPOs. OBD (Personal Bibliographic Database) application is an outstanding case of an institutional CRIS system in the Czech Republic developed locally by a Czech company DERS. An ORCID integration for OBD is currently in development. | TBD, OBD ORCID in process
       Institutional or subject repositories | There are several repositories in the Czech Republic collecting different objects, some are already using PIDs but there is still enough room to improve and really integrate those PIDs, not only allow their evidence. | Handle, DOI, maybe other
       Major research funders | Grant application processes | TBD
       Local publishers | Content submission processes | TBD

      PID Use Cases

    2. TARGET INSTITUTIONS:
       ● Public research performing organisations (RPOs): Higher Education Institutions and Research organizations
       ● Research funding organizations (RFOs): Ministry of Education, Youth and Sports, Czech Science Foundation, Technology Agency of the Czech Republic etc.
       ● Policymakers: Ministry of Education, Youth and Sports; Research, Development and Innovation Council (R&D&I Council)
       ● Libraries: National library, National Library of Technology, academic libraries
       ● Publishers based in Czechia
       ● Service providers, research infrastructures
       TARGET GROUPS:
       ● Researchers
       ● Librarians
       ● Open Science/Open Access managers/coordinators
       ● CRIS system managers
       ● Repository managers
       ● Other research support positions, e.g. data stewards, data curators

      PID Stakeholders and Target Groups

    3. Function PID type Recommended or required?

      PID Use Cases

    1. Function PID type Recommended or required?

      PID Use Cases in the Netherlands

    2. 1. Registration and reporting research; 2. Reusability and reproducibility of research; 3. Evaluation and recognition of research; 4. Grant application; 5. Researcher profiling; 6. Journal rankings

      PID Use Cases in the Netherlands

    1. PIDs comparison table
       Case study | Function | PID type
       Finland | Researchers, persons | ORCID; ISNI
        | Organisations | VAT-number (not resolvable yet); ROR; ISNI
       (Pathways to National PID Strategies: Guide and Checklist to facilitate uptake and alignment)

      PID usage by country