36 Matching Annotations
  1. Apr 2021
    1. code the addresses of sender and receiver within <opener> or <closer>

      In my opinion this would stretch the semantics of <opener> and <closer> too much. The distinction should be clear between what is part of the letter text and what isn't. The addresses, typically very clearly separated from the letter body because they are intended for mail logistics, are not really part of the letter text. Some encoders (myself included) would even prefer to put the transcribed addresses inside <front> rather than <body>, arguing that correspondence addresses are not part of the textual body.

    2. <address> directly inside a <div>

      I strongly support this proposal and would like to push it even further, allowing for the use of <address> not only directly inside of <div> but also directly in <body>, <front> or <back>, making it a "chunk level" element of its own. While grouping several addresses inside a <div> could be useful, using <div> should not be required for one single address.

  2. Jan 2020
    1. pre-printed

      These proposed values should not constitute a closed list because other types might be needed, e.g. for stamps or text printed afterwards on a previously handwritten document.

    2. the <ref> element

      In my opinion, using the "ref" element would imply that the text segment it encloses actually is a reference (e.g. a hypertextual link or an intratextual pointer). But this is not the case here because the letterhead is not referring to its own layout description in the metadata.

    3. Any text contained within <figure> is being understood as "commentary on the figure in the source"

      This is not what the quoted passage in the Guidelines says, in my understanding. The "head" element may e.g. capture the real caption accompanying a figure in the source, without being a "commentary" on it.

    4. <fw> being an inline element

      The "fw" element is a member of "model.milestoneLike". This membership certainly makes it possible to position the element in an "inline" manner in a TEI document, but it does not restrict it to "inline" use, in my opinion.

    1. The descriptions of <dateline>, <salute> and <signed> imply semantic and structural meanings at the same time. The TEI Guidelines determine specific opportunities and restrict the use of those elements: Informally, all three function as block-level elements or so-called chunks

      Most of the presumed problems this proposal intends to resolve originate from the assumption that semantic TEI elements like "p" (paragraph) are (also) defined by layout.

      That is why the code examples do not use "lb" at the beginning of such elements - there is the underlying assumption that a "p" implies a line beginning at its own start, and similarly all the other mentioned elements like "dateline", "salute", "signed", "opener", "closer", "address", "addrLine".

      I strongly disagree with this view. In such an encoding, it becomes difficult to determine the physical lines. If a documentation of physical lines is desired (which is not true for many editorial projects), it should, in my opinion, always be done explicitly, even at the beginning of elements like "p", by inserting the "lb" element ("line beginning").

      Similarly, the encoding of a verse by the "l" element (indeed very misleadingly labeled "verse line") does in fact not imply that the verse starts on a new line (nor that it ends at a physical line end).

      So some of the proposed solutions in the following text - such as @rend="inline" proposed for "salute", "signed" and "dateline" where the text does not start on a new line - would be unnecessary if the "lb" element were used consistently for all line beginnings.

      While many "chunk" elements may often coincide with layout features like a new zone, a new line or an indentation, they are independent of these features and are not defined by them. A new paragraph (conceived semantically) can very well start in the middle of a line, and it is really up to the editor to determine where a textual section like a paragraph starts (or ends).

      It is my firm conviction that a clear conceptual separation of semantic elements (and most of the TEI elements are semantic elements) from the documentation of physical layout would clarify many problems which are dealt with in this proposal.
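
      The point about explicit "lb" elements can be illustrated with a minimal Python sketch. It assumes a hypothetical letter fragment (invented text, TEI namespace omitted for brevity) in which the "salute" and the start of the "p" share one physical line. The helper `physical_lines` is an illustrative name, not part of any TEI tool; it splits the text only at "lb" and ignores element boundaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: every physical line start carries an explicit <lb/>.
# The salute and the beginning of the paragraph sit on the same physical line 2.
SAMPLE = (
    '<div>'
    '<dateline><lb n="1"/>Heidelberg, 3 May 1820</dateline>'
    '<salute><lb n="2"/>Dear friend,</salute>'
    '<p> thank you for <lb n="3"/>your letter.</p>'
    '</div>'
)

def physical_lines(root):
    """Return {line number: text}, splitting only at <lb/>, never at element boundaries."""
    lines = {}
    current = None  # @n of the physical line currently being filled

    def add(text):
        if text and current is not None:
            lines[current] = lines.get(current, "") + text

    def walk(elem):
        nonlocal current
        if elem.tag == "lb":
            current = elem.get("n")
            lines.setdefault(current, "")
        else:
            add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)  # text following the child still belongs to the open line

    walk(root)
    # Normalise whitespace for display
    return {n: " ".join(t.split()) for n, t in lines.items()}

for n, text in physical_lines(ET.fromstring(SAMPLE)).items():
    print(n, text)
```

      Under these assumptions, line 2 is reconstructed as "Dear friend, thank you for" although it spans a "salute" and a "p", so no @rend="inline" marker is needed to recover the layout.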

    2. are all defined not only by their semantic meaning but also by layout specifics in the text

      In my opinion, they are defined by their semantic meaning and are not determined by layout phenomena. A whole piece of correspondence could, in an extreme case, very well be written on one single line and still contain all the semantic parts like "salute", "signed" etc. The constitution of such semantic segmentation is entirely the interpretative task of an editor.

    3. We suggest to allow <dateline> within <p>

      I strongly advise against it because it would obscure the primarily semantic character of "dateline", which really is not defined by a physical line but by its textual content. And this textual content is conceptually different from running prose in a "paragraph" (again conceived semantically). Mentions of dates and places inside running prose can always be marked up by "date" or "placeName", but they do not constitute a "dateline".

    4. <salute> should become a member of the attribute class att.typed

      This would, of course, enable @subtype as well, which is not mentioned but probably intended.

    5. intended to be used for the encoding of visually distinctive closing sections

      I disagree with this interpretation of the following Guidelines definition. There is no mention of anything "visual" in the Guidelines quote. "appearing as a final group at the end of a division" can be understood purely semantically, especially if the "division" itself is a semantically determined unit.

    6. the attribute @rend with values "paragraph" and "closer"

      Do the @rend values "paragraph" or "closer" really say anything about the visual rendition? Here again, my point of view is that "paragraph" or "closer" are semantic units, not determined by (but often coinciding with) specific layout phenomena.

    7. For example, it would be inadequate in the context of a diplomatic transcription to render the embedded saluting phrase offset the paragraph.

      Here again, the underlying assumption is that a "paragraph" is defined by the layout and that starting it after the "salute" would violate the principles of a diplomatic transcription. But from my point of view, this is not the case because the start of a "p" element does not say anything about the line segmentation (which would be expressed by "lb"). Accordingly, in the following solution (2) the "inline" statement would be unnecessary if you could rely on the information that there is no line beginning where there is no "lb" element.

    8. <address>

      This placement of "address" does not seem to be valid; in fact, "address" is not allowed as a (direct) child of "div" at all. One possible solution would be wrapping "address" in an "ab".

    9. However, the encoding above includes only 5 <addrLine>, but one of them additionally contains a linebreak element (<lb>). This combination of <addrLine> and <lb> provides a possibility to group basic units (e.g. the title or the location) which are distributed over several lines within one element.

      This is one example of the unfortunate confusion between semantic elements and elements documenting the layout. On the one hand, you are recognizing "addrLine" as a semantic "basic unit" and you explicitly mark the line beginning in the middle of this unit, but at the same time you tacitly assume that the beginning of "addrLine" automatically marks a new physical line, which it does not, in my opinion.

    10. <div>

      This is of course just one encoding example, but it should perhaps be pointed out that a letter does not necessarily have to be encoded inside of a "div". If one TEI document contains exactly one letter, the letter could very well be placed directly inside of "body", and some parts could even be conveniently allocated in "front" or "back" (e.g. the "address", possibly wrapped in an "ab").

    11. example 6 below

      Sorry for asking and for my bad reading skills, but is the photo labeled "Example 6" really correct? I cannot find the text you mention in the photo.

    12. As a member of model.global it can appear at any point within a TEI text.

      The "address" element does not seem to be a member of "model.global" (however, it uses this model in its content model). Therefore, its placement is much more restricted than this text suggests. E.g. it is not allowed as a child of "div".

    1. an attribute @new or an attribute set @from/@to

      It is not clear to me how the attributes @new or @from/@to would actually be used according to this proposal.

    2. new milestone element <placeShift>

      I think that such a new element - if introduced - should be compatible with the Parallel Transcription approach. Such shifts between zones are similar to the "cb" element that can point by @facs to a zone declaration inside of "facsimile". Perhaps "cb" itself would be sufficient if it did not stretch the semantics of "column" and of "beginning" in cases where a text zone were not really a typical "column" and where the text would leap into the inside of a text zone (i.e. not in its first line).

      The approach adopted at the Heidelberg University Library for such cases is:

      1) Declaring all content zones of a page with "zone" elements inside of "surface" in "facsimile" (currently even without coordinates) and assigning an @xml:id to each "zone" element.

      2) Ordering the text inside of "text" meaningfully, not necessarily respecting the physical layout.

      3) Annotating the position where the running text switches its placement from one zone to another either by "cb" (if it really is a beginning of a real column) or by "milestone" elements belonging to one of the two categories "zone beginning" or "zone shift" (we use @ana and URIs for the categorization, but it is essentially a typing mechanism). The "cb" or "milestone" points by @facs to the appropriate "zone" declaration in "facsimile".

      4) Each physical line, regardless of whether it is in a main text column or in a marginal zone, is annotated at its beginning with an explicit "lb" and a number. Thanks to the "cb"s and the other two "beginning" and "shift" milestones, each "lb" knows to which zone it belongs, and for a diplomatic visualisation of a page's content, the texts can be rearranged into correct physical lines ordered accordingly into their physical layout zones.
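
      The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual Heidelberg tooling: the page content, the zone identifiers `z_main` and `z_margin`, the @ana value `#zoneShift` and the function name `lines_by_zone` are all hypothetical, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: a main column plus a margin zone (zones declared without coordinates).
SAMPLE = """
<TEI>
  <facsimile>
    <surface>
      <zone xml:id="z_main"/>
      <zone xml:id="z_margin"/>
    </surface>
  </facsimile>
  <text><body><p>
    <cb facs="#z_main"/><lb n="1"/>First line of the letter
    <lb n="2"/>second line of the letter.
  </p>
  <postscript><p>
    <milestone unit="zone" ana="#zoneShift" facs="#z_margin"/><lb n="1"/>Added in
    <lb n="2"/>the margin.
  </p></postscript></body></text>
</TEI>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def lines_by_zone(root):
    """Group numbered physical lines under the zone named by the preceding cb/milestone."""
    zones = {z.get(XML_ID): [] for z in root.iter("zone")}
    current_zone = None  # id of the zone the running text is currently in
    current_line = None  # mutable [n, text] record of the open physical line

    def add(text):
        if text and current_line is not None:
            current_line[1] += text

    def walk(elem):
        nonlocal current_zone, current_line
        if elem.tag in ("cb", "milestone") and elem.get("facs"):
            # Zone beginning or zone shift: following lines belong to this zone
            current_zone = elem.get("facs").lstrip("#")
            current_line = None
        elif elem.tag == "lb":
            current_line = [elem.get("n"), ""]
            if current_zone is not None:
                zones.setdefault(current_zone, []).append(current_line)
        else:
            add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)

    walk(root.find("text"))
    return {z: [(n, " ".join(t.split())) for n, t in ls] for z, ls in zones.items()}

for zone, lines in lines_by_zone(ET.fromstring(SAMPLE)).items():
    print(zone, lines)
```

      The "text" tree keeps the meaningful reading order (letter body, then postscript), while the physical arrangement is recovered from the "facs" pointers alone.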

    3. ground rules

      I would just like to point out that postscripts placed in the margin are also arranged in lines which could be encoded by "lb"s and that without a clear milestone mechanism for the "switches" between zones this could become very confusing.

      On top of this, rules like "left margin -> in front of the line" would suggest that there is a correlation between a certain margin and the meaningful text order, which there isn't, at least not always and per se.

    4. Furthermore, the <postscript> element should be supplied with an attribute @place to indicate the place where the postscript is entered on the page.

      This is really unnecessary if a milestone solution for documenting the shifts between zones were adopted, something like the "placeShift" element proposed below.

      Together with a "facsimile" declaration (where @place perhaps could be introduced on the "zone" element), this approach would state clearly where the postscripts are physically placed, without the need to insert them forcibly inside paragraphs where they would interrupt the running text at non-meaningful positions (but if the positions were meaningful, then perhaps the marginal texts would rather be classified as notes and not as postscripts).

    5. <postscript> should become a possible subelement of <p>.

      I strongly advise against it, because then the semantics of "postscript" would be confused with notes or glosses.

      A "note" element can be placed inside of a "p" in cases where it is assumed that the note refers to a text segment in the paragraph. If this were the case - as with a marginal note written later in the margin as a comment on the main text - then it is a note and not a postscript and should be tagged with "note".

      But if it is just an addition to a letter as a whole which is written in the margin without a reason other than because there is free space for it, this would be a "postscript" to be encoded after the body of the letter. The diplomatic perspective (documenting correctly the physical placement) should again not be confused with the meaningful reading order of the text.

    6. The former would mean, page break elements have to be linked among each other and would result in a higher number of <pb> elements than there are pages in the source document.

      I think "page beginning" means that a page starts only once. That's why there should not be more than one "pb" element for one page in a TEI document.

      But I agree that for such shifts where the running text leaps from one zone or page to another, an appropriate milestone element is needed. The proposed "pageshift" element goes in this direction (at the Heidelberg University Library, we are using the generic "milestone" element further specified by an attribute).

      And there should be an understanding that every "lb" between such milestones belongs to a certain zone (typically column) on a page, and this zone is being pointed at by the first one of the two "shift" milestones. In other words: All lines marked by "lb"s after a "shift" milestone belong to the zone referenced by the milestone, until the next "shift" milestone is reached.

      I think the physical units we are talking about are essentially zones on pages, not just the pages. But of course zones are assigned to their surfaces (typically pages), as it is intended in the structure of the "facsimile" element. So in cases where a postscript leaps from one page to another it should be sufficient to encode "zone shifts" by appropriate milestones, and "pb"s once used should not be repeated again.

    7. As stated above the postscript can be found at locations other than the bottom of a letter.

      In my opinion, the main ordering structure (the ordered tree) of the element "text" in a TEI document should reflect a semantically meaningful reading of the text and not the physical location of its different parts. So basically, the postscripts should always be encoded at the end (of a "div" or "body" representing the letter), regardless of their physical placement.

      There are other mechanisms for documenting the physical layout and placement of text segments: milestone elements linked with surface and zone declarations as described in the Parallel Transcription section of the chapter "Representation of Primary Sources" of the Guidelines.

      The meaningful reading order of the "text" tree structure should in my view not be "polluted" by attempts to reconcile it with potentially deviating physical orderings of content zones on pages. The physical allocation should be documented in a different layer of the editorial document.

    1. cmif:copy-by-sender

      This and the following two URIs deviate from the camel case scheme (adopted here otherwise) by using dashes and lower case letters ("kebab case").

    2. element ref instead of relation

      The "type" attribute used on the "ref" element in the following example has again (like the "name" attribute on the "relation" element) the datatype "teidata.enumerated". So it would not really be a true URI statement of the RDF predicate, at least not in the explicit sense of the datatype "teidata.pointer" as available on the "target" attribute used here for the RDF object.

    3. The predicate is also defined as a machine-readable URI.

      The attribute "name" used for the RDF predicate in the following code example has the datatype "teidata.enumerated", not "teidata.pointer" as the other two attributes used for the RDF subject and object. It is therefore not really suitable for a URI statement. See the TEI Guidelines.

    4. that the @active attribute always has the same URI - namely that of the edited letter.

      The repetition of the triple subjects (if they were always the same) could equally be avoided simply by omitting the "active" attribute, which is optional on the "relation" element.

    5. A coding would then look like this:

      It should be noted that in this and all the other following proposed encodings the authority file (the origin of the authority data) is not stated explicitly but must be deduced from the URI. Using "idno" as a child element instead of only the "ref" attribute would make it possible to state the type and origin of the authority URI, e.g. VIAF, by the "type" and "source" attributes.
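
      As a rough sketch of the "idno" alternative, the following Python snippet moves a "ref" attribute into an "idno" child and states the authority file explicitly in its "type" attribute. The host-to-authority mapping, the function name `ref_to_idno`, the person name and the identifier `0000000` are all assumptions made for illustration (the URI is a placeholder, not a real record); only "type" is shown here, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumed mapping from URI host to authority file label; extend as needed.
AUTHORITIES = {"viaf.org": "VIAF", "d-nb.info": "GND"}

def ref_to_idno(pers_name):
    """Replace @ref on a name element with an <idno> child carrying an explicit @type."""
    uri = pers_name.attrib.pop("ref")
    host = urlparse(uri).netloc
    # Fall back to the generic label "URI" when the host is not in the mapping
    idno = ET.SubElement(pers_name, "idno", {"type": AUTHORITIES.get(host, "URI")})
    idno.text = uri
    return pers_name

# Placeholder name and identifier, not a real authority record
elem = ET.fromstring('<persName ref="http://viaf.org/viaf/0000000">Example Person</persName>')
print(ET.tostring(ref_to_idno(elem), encoding="unicode"))
```

      With the authority named in the markup, a consumer no longer has to deduce it from the shape of the URI.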

    6. TEI conformity in the further development of the format. According to the definitions of the TEI guidelines, this includes, on the one hand, that the CMIF is validated against TEI All

      According to the TEI Guidelines (https://tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#CFVL) a TEI-conformant document "must validate against a schema file that has been derived from the published TEI Guidelines". This, however, is not the same as validity against "TEI All". E.g. a customized TEI schema can include extensions like elements or attributes in their own namespace and still be conformant with the TEI, being derived from the TEI Guidelines by valid ODD means.