diff --git a/docbook/T1/WG3/XLIFF-EM-BP.xml b/docbook/T1/WG3/XLIFF-EM-BP.xml index 8861783..ae8e4b1 100644 --- a/docbook/T1/WG3/XLIFF-EM-BP.xml +++ b/docbook/T1/WG3/XLIFF-EM-BP.xml @@ -62,8 +62,9 @@ This prose specification is one component of a Work Product that also includes: - Extraction and merging examples from - &this-loc;/extraction_examples/readme.md + Extraction and merging examples from &this-locArts;extraction_examples/ + + An unstable editorial version of the examples might exist at &EDArts;extraction_examples/ diff --git a/docbook/T1/WG3/XLIFF-EM-BP.xml_xslt b/docbook/T1/WG3/XLIFF-EM-BP.xml_xslt new file mode 100644 index 0000000..41c31df --- /dev/null +++ b/docbook/T1/WG3/XLIFF-EM-BP.xml_xslt @@ -0,0 +1,91 @@ +XLIFF 2 Extraction and Merging Best Practice, Version 1.0David Filip, Ján Husarčík, Rodolfo M. Raya, and Andreas GalambosXLIFF, Extraction, Merging, Best PracticeDocBook XSL Stylesheets with Apache FOPXLIFF 2 Extraction and Merging Best Practice, Version 1.0Table of ContentsTerminology and ConceptsIntroductionSpecificationInline CodesRepresenting Spanning CodesOutermost Tag PairsIncomplete Extraction of Inline CodesRepresenting Multiple Subsequent CodesTarget Content in Extracted XLIFFInserting unmodified source content into <target>Inserting possible translation into <target>State MachineEditing and Context HintsNon-deletable Inline CodesPreserving Order of CodesControlling SegmentationProviding ContextConsiderations for Using Spanning CodesXLIFF StructureFile StructureRole of <unit>MiscellaneousValue of attribute idWhitespace HandlingProtecting Non-localizable ContentMerging Translated ContentSelecting Language TagsValidation of Extracted ContentXLIFF ValidationsSummaryReferencesXLIFF 2 Extraction and Merging Best Practice, Version 1.0XLIFF 2 Extraction and Merging Best Practice, Version 1.0XLIFF 2 Extraction and Merging Best Practice, Version 1.0Edited by David Filip and Ján HusarčíkRodolfo M. 
RayaAndreas GalambosTAPICC T1/WG3Copyright © 2018 GALATAPICC. All rights reserved.Additional artifactsThis prose specification is one component of a Work Product that also includes:Extraction and merging examples from https://galaglobal.github.io/TAPICC/T1/WG3/wd01/extraction_examples/ + An unstable editorial version of the examples might exist at https://galaglobal.github.io/TAPICC/T1/WG3/extraction_examples/Related workThis note provides informative best practice for XLIFF 2 Specifications:XLIFF Version 2.1 [[XLIFF-2.1]]XLIFF Version 2.0 [[XLIFF-2.0]]ISO 21720:2017 [[ISO XLIFF]]StatusThis Informational Best Practice was last revised by TAPICC T1/WG3 or the TAPICC Steering + Committee on the above date. The level of approval is also listed above. Check the “Latest + version” location noted above for possible later revisions of this document.Contributions to this deliverable or subsequent versions of this deliverable can be made + via the GALA TAPICC GitHub + Repository [https://github.com/GALAglobal/TAPICC] subject to signing the TAPICC Legal + Agreement [https://www.gala-global.org/tapicc-legal-agreement].Citation formatWhen referencing this specification the following citation format should be used:[XLIFF-EM-BP]XLIFF 2 Extraction and Merging Best Practice, Version 1.0 + Edited by David Filip and Ján Husarčík. 24 January 2018. Working Draft 01. https://galaglobal.github.io/TAPICC/T1/WG3/wd01/XLIFF-EM-BP-V1.0-wd01.html. + Latest version: N/A.html.NoticesCopyright © GALATAPICC 2018. All rights reserved. The Translation API Class and Cases (TAPICC) initiative is a collaborative, + community-driven, open-source project to advance API standards in the localization industry. 
+ The overall purpose of this project is to provide a metadata and API framework on which + users can base their integration, automation and interoperability efforts.The usage of all deliverables of this initiative - including this specification - is + subject to open source license terms expressed in the BSD-3-Clause License and CC-BY 2.0 + License, the declared applicable licenses when the project was chartered. The 3-Clause BSD License (BSD-3 Clause): https://opensource.org/licenses/BSD-3-ClauseCreative Commons Legal Code (CC-BY 2.0): https://creativecommons.org/licenses/by/2.0/legalcode24 January 2018AbstractThis Informational Best Practice specification targets designers of XLIFF Extracting and + Merging Tools for content owners. It gathers common problems that are prone to appear when + Extracting + XLIFF Documents from HTML, generic XML, or MarkDown. This + specification shows why some Extraction approaches will cause issues + during an XLIFF Roundtrip. This best practice guidance provides + better thought through alternatives and shows how to use many of advanced XLIFF features for + lossless Localization roundtrip of HTML and XML based content.Table of ContentsTerminology and ConceptsIntroductionSpecificationInline CodesTarget Content in Extracted XLIFFEditing and Context HintsXLIFF StructureMiscellaneousXLIFF ValidationsSummaryReferencesTerminology and ConceptsContext hintsXLIFFattributes on structural or inline elements providing additional contexts, such + as disp [http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#disp] or equiv [http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#equiv].Inline codesmarkerIntroductionIntroductionThis specification targets designers of XLIFF Extracting and Merging Tools for content + owners. 
XLIFF Roundtrip designers of all kinds will benefit, no matter if they design their + XLIFF Extractor/Merger for corporate or blog use.Extraction and merging behavior is out of the normative scope of OASIS XLIFF + Specifications. Although those specifications do provide some guidance for Extractor and + Merger Agents, XLIFF TC did not attempt to prescribe how exactly to use XLIFF to represent + native content. This is mostly because XLIFF is a native format agnostic Localization + interchange Format.This Informational Best Practice targets designers of XLIFF Extracting and + Merging Tools for content owners. XLIFF + Roundtrip designers of all kinds will benefit, no matter if they design their + XLIFF Extractor/Merger for corporate or blog use.Extraction and Merging behavior is out of + the normative scope of OASIS XLIFF Specifications. Although those specifications do provide + some guidance for Extractor and Merger Agents, + XLIFF TC did not attempt to prescribe how exactly to use XLIFF to represent native content. + This is mostly because XLIFF is a native format agnostic Localization Interchange + Format.This specification gathers common problems that are prone to appear when Extracting XLIFF + Documents from HTML, generic XML, or MarkDown. This specification shows why some + Extraction approaches will cause issues during an XLIFF + Roundtrip, issues often so severe that Merging back of + target content will not be possible without costly postprocessing or could fail utterly. This + best practice guidance provides better thought through alternatives and shows how to use many + of advanced XLIFF features for lossless Localization roundtrip of HTML and XML based content. 
+ Most of the times there are no ultimate prescribed solutions, rather possible design goals are + described and best methods how to achieve them proposed.SpecificationSpecificationInline CodesInline CodesRepresenting Spanning CodesSpanning codes in the original format are created by opening code, content and closing + code. In HTML that can be <bold>text</bold>, in RTF \b text + \b0.In XLIFF2 such code can be represented using <sc /> [http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#sc]/<ec /> [http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#ec] pair universally, + or by <pc></pc> [http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#pc] in case of + well formed spanning code.Ideally the original format is documented enough to instruct Extractor about role of + each inline code. For example XMLschema allows to declare elements using keyword EMPTY. + This way all elements, which are not declared EMPTY, can be represented as described + above. To further help the extraction process the following recommendation could be + implemented in original XML format: “For interoperability, the empty-element tag + SHOULD be used, and SHOULD only be used, for elements which are declared + EMPTY.”[[XML]]This concept is illustrated by spanning_as_ph [https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/spanning_as_ph]•[spanning_as_ph] + https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/spanning_as_ph + •Extractor could use knowledge of schema and only use does not use <ph> for codes that + are declared as EMPTY. To further help the extraction process, following W3C + recommendation could be followed: „The empty-element tag SHOULD be used, and SHOULD only be used, + for elements which are declared EMPTY.“ (https://www.w3.org/TR/REC-xml/#sec-starttags), + e.g. even <span> without content would use <span></span> as compared to <br + />. 
•https://issues.oasis-open.org/browse/XLIFF-14 + http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#phOutermost Tag Pairs •[outermost_inline_excluded] + https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/outermost_inline_excluded + •Both functional and formatting inline codes provide additional context for translator and + could be linguistically significant. •If they are important enough to be in native format, + they should be present in extracted content. Incomplete Extraction of Inline Codes •[CDATA] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/cdata + •[inline_codes_plain_text] + https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/inline_codes_plain_text + •http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#d0e8112 + •https://www.w3.org/TR/xml-i18n-bp/#AuthCDATA •Not using native XLIFF representation + leaves inline codes unprotected and increases risk of roundtrip corrupting them. Representing Multiple Subsequent Codes •[multiple_codes_represented_as_single] + https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/multiple_codes_represented_as_single + •Grouping several independent inline codes into single representation could prove + challenging with negative impact on •Translation quality •Fluency •Functionality + •Automated actions •Validation •Some codes needs to be removed, copied, added or + reordered. •If any of the above actions is to be prevented, it can be controlled using + editing hints with finer granularity. 
Target Content in Extracted XLIFFTarget Content in Extracted XLIFFInserting unmodified source content into <target> Inserting possible translation into <target> State MachineEditing and Context HintsEditing and Context HintsNon-deletable Inline Codes Preserving Order of Codes Controlling SegmentationProviding ContextContext hintsConsiderations for Using Spanning CodesXLIFF StructureXLIFF StructureFile StructureRole of <unit>MiscellaneousMiscellaneousValue of attribute idWhitespace HandlingProtecting Non-localizable ContentMerging Translated ContentSelecting Language TagsValidation of Extracted ContentXLIFF ValidationsXLIFFValidationsSummarySummaryReferencesNormative references[XML] W3C: Extensible Markup Language (XML) + 1.026 November 2008https://www.w3.org/TR/xml/[XLIFF-2.1] + Edited by David Filip, Tom Comerford, Soroush Saadatfar, Felix + Sasaki, and Yves Savourel: XLIFF Version 2.112 October 2017 + http://docs.oasis-open.org/xliff/xliff-core/v2.1/cos01/xliff-core-v2.1-cos01.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html + [XLIFF-2.0] + Edited by Tom Comerford, David Filip, Rodolfo M. Raya, and Yves + Savourel: XLIFF Version 2.004 August 2014 + http://docs.oasis-open.org/xliff/xliff-core/v2.0/os/xliff-core-v2.0-os.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.0/xliff-core-v2.0.html + [ISO XLIFF] + Edited by Tom Comerford, David Filip, Rodolfo M. 
Raya, and Yves + Savourel: ISO 21720:2017 - XLIFF (XML Localisation interchange file + format)November 2017 + https://www.iso.org/standard/71490.html + Non-Normative ReferencesError: no bibliography entry: d5e263 found in http://cdn.docbook.org/release/xsl/bibliography/bibliography.xml \ No newline at end of file diff --git a/docbook/T1/WG3/dbgenent.mod b/docbook/T1/WG3/dbgenent.mod index f980493..09c98d9 100644 --- a/docbook/T1/WG3/dbgenent.mod +++ b/docbook/T1/WG3/dbgenent.mod @@ -76,6 +76,18 @@ + + + + + + + + + + diff --git a/docbook/dbgenent.mod b/docbook/dbgenent.mod index d7f0083..94bd5c9 100644 --- a/docbook/dbgenent.mod +++ b/docbook/dbgenent.mod @@ -79,6 +79,15 @@ + + + + + + + diff --git a/docs/T1/WG3/XLIFF-EM-BP-ED.html b/docs/T1/WG3/XLIFF-EM-BP-ED.html index 3bd5a2a..cf0ff35 100644 --- a/docs/T1/WG3/XLIFF-EM-BP-ED.html +++ b/docs/T1/WG3/XLIFF-EM-BP-ED.html @@ -1,14 +1,14 @@ - XLIFF 2 Extraction and Merging Best Practice, Version 1.0

XLIFF 2 Extraction and Merging Best Practice, Version 1.0

Edited by

David Filip

ADAPT Centre

Ján Husarčík

Moravia

Rodolfo M. Raya

Andreas Galambos

TAPICC T1/WG3

Additional artifacts

This prose specification is one component of a Work Product that also includes:

  • Extraction and merging examples from - https://galaglobal.github.io/TAPICC/T1/WG3/wd01/XLIFF-EM-BP-V1.0-wd01/extraction_examples/readme.md

Related work

This note provides informative best practice for XLIFF 2 Specifications:

  • XLIFF Version 2.1 [[XLIFF-2.1]]

  • XLIFF Version 2.0 [[XLIFF-2.0]]

  • ISO 21720:2017 [[ISO XLIFF]]

Status

This Informational Best Practice was last revised by TAPICC T1/WG3 or the TAPICC Steering + XLIFF 2 Extraction and Merging Best Practice, Version 1.0

XLIFF 2 Extraction and Merging Best Practice, Version 1.0

Edited by

David Filip

ADAPT Centre

Ján Husarčík

Moravia

Rodolfo M. Raya

Andreas Galambos

TAPICC T1/WG3

Additional artifacts

This prose specification is one component of a Work Product that also includes:

Related work

This note provides informative best practice for XLIFF 2 Specifications:

  • XLIFF Version 2.1 [[XLIFF-2.1]]

  • XLIFF Version 2.0 [[XLIFF-2.0]]

  • ISO 21720:2017 [[ISO XLIFF]]

Status

This Informational Best Practice was last revised by TAPICC T1/WG3 or the TAPICC Steering Committee on the above date. The level of approval is also listed above. Check the “Latest version” location noted above for possible later revisions of this document.

Contributions to this deliverable or subsequent versions of this deliverable can be made via the GALA TAPICC GitHub Repository subject to signing the TAPICC Legal - Agreement.

Citation format

When referencing this specification the following citation format should be used:

[XLIFF-EM-BP]

XLIFF 2 Extraction and Merging Best Practice, Version 1.0 + Agreement.

Citation format

When referencing this specification the following citation format should be used:

[XLIFF-EM-BP]

XLIFF 2 Extraction and Merging Best Practice, Version 1.0 Edited by David Filip and Ján Husarčík. 24 January 2018. Working Draft 01. https://galaglobal.github.io/TAPICC/T1/WG3/wd01/XLIFF-EM-BP-V1.0-wd01.html. - Latest version: N/A.html.

Notices

Copyright © GALA TAPICC 2018. All rights reserved.

The Translation API Class and Cases (TAPICC) initiative is a collaborative, + Latest version: N/A.html.

Notices

Copyright © GALA TAPICC 2018. All rights reserved.

The Translation API Class and Cases (TAPICC) initiative is a collaborative, community-driven, open-source project to advance API standards in the localization industry. The overall purpose of this project is to provide a metadata and API framework on which users can base their integration, automation and interoperability efforts.

The usage of all deliverables of this initiative - including this specification - is @@ -20,7 +20,7 @@ specification shows why some Extraction approaches will cause issues during an XLIFF Roundtrip. This best practice guidance provides better thought through alternatives and shows how to use many of advanced XLIFF features for - lossless Localization roundtrip of HTML and XML based content.


Terminology and Concepts

Context hints

XLIFF attributes on structural or inline elements providing additional contexts, such + lossless Localization roundtrip of HTML and XML based content.


Terminology and Concepts

Context hints

XLIFF attributes on structural or inline elements providing additional contexts, such as disp or equiv.

Inline codes

marker

Introduction

This specification targets designers of XLIFF Extracting and Merging Tools for content owners. XLIFF Roundtrip designers of all kinds will benefit, no matter if they design their XLIFF Extractor/Merger for corporate or blog use.

Extraction and merging behavior is out of the normative scope of OASIS XLIFF @@ -43,7 +43,7 @@ best practice guidance provides better thought through alternatives and shows how to use many of advanced XLIFF features for lossless Localization roundtrip of HTML and XML based content. Most of the times there are no ultimate prescribed solutions, rather possible design goals are - described and best methods how to achieve them proposed.

Specification

Inline Codes

Representing Spanning Codes

Spanning codes in the original format are created by opening code, content and closing + described and best methods how to achieve them proposed.

Specification

Inline Codes

Representing Spanning Codes

Spanning codes in the original format consist of an opening code, content, and a closing code. In HTML this can be <b>text</b>; in RTF, \b text \b0.
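A minimal sketch of how an Extractor might map a well-formed spanning code to XLIFF 2 inline elements, either as <pc> or as an <sc/>/<ec/> pair. The function name extract_inline, the d1/d2 data identifiers, and the regular expression are assumptions made for illustration only. The expression handles just non-nested tag pairs without attributes, and a real Extractor would store the returned native codes in <originalData>/<data> elements of the enclosing unit.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical pattern: one non-nested, attribute-free tag pair such as <b>text</b>.
PAIR = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def extract_inline(native: str, use_pc: bool = True):
    """Return (xliff_inline_content, original_data) for a native string.

    original_data maps a data id to the native code it stands for.
    """
    original_data = {}
    parts = []
    pos = 0
    for counter, match in enumerate(PAIR.finditer(native), start=1):
        name, text = match.group(1), match.group(2)
        start_ref = f"d{2 * counter - 1}"
        end_ref = f"d{2 * counter}"
        original_data[start_ref] = f"<{name}>"
        original_data[end_ref] = f"</{name}>"
        # Plain text before the code is escaped so it stays valid XML.
        parts.append(escape(native[pos:match.start()]))
        if use_pc:
            # Well-formed spanning code represented as a single <pc> element.
            parts.append(
                f'<pc id="{counter}" dataRefStart="{start_ref}" '
                f'dataRefEnd="{end_ref}">{escape(text)}</pc>'
            )
        else:
            # The same code represented as an <sc/>/<ec/> pair.
            parts.append(
                f'<sc id="{counter}" dataRef="{start_ref}"/>{escape(text)}'
                f'<ec startRef="{counter}" dataRef="{end_ref}"/>'
            )
        pos = match.end()
    parts.append(escape(native[pos:]))
    return "".join(parts), original_data


content, data = extract_inline("Click <b>Save</b> now")
print(content)
print(data)
```

Calling extract_inline with use_pc=False yields the <sc/>/<ec/> form of the same content, which also remains usable when the opening and closing codes end up in different segments.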

In XLIFF2 such code can be represented using <sc />/<ec /> pair universally, or by <pc></pc> in case of @@ -61,33 +61,33 @@ for elements which are declared EMPTY.“ (https://www.w3.org/TR/REC-xml/#sec-starttags), e.g. even <span> without content would use <span></span> as compared to <br />. •https://issues.oasis-open.org/browse/XLIFF-14 - http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#ph

Outermost Tag Pairs

•[outermost_inline_excluded] + http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#ph

Outermost Tag Pairs

•[outermost_inline_excluded] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/outermost_inline_excluded •Both functional and formatting inline codes provide additional context for the translator and could be linguistically significant. •If they are important enough to be in the native format, - they should be present in extracted content.

Incomplete Extraction of Inline Codes

•[CDATA] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/cdata + they should be present in extracted content.

Incomplete Extraction of Inline Codes

•[CDATA] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/cdata •[inline_codes_plain_text] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/inline_codes_plain_text •http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#d0e8112 •https://www.w3.org/TR/xml-i18n-bp/#AuthCDATA •Not using native XLIFF representation - leaves inline codes unprotected and increases risk of roundtrip corrupting them.

Representing Multiple Subsequent Codes

•[multiple_codes_represented_as_single] + leaves inline codes unprotected and increases risk of roundtrip corrupting them.

Representing Multiple Subsequent Codes

•[multiple_codes_represented_as_single] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/multiple_codes_represented_as_single •Grouping several independent inline codes into a single representation could prove challenging, with negative impact on •Translation quality •Fluency •Functionality •Automated actions •Validation •Some codes need to be removed, copied, added or reordered. •If any of the above actions is to be prevented, it can be controlled using - editing hints with finer granularity.

Target Content in Extracted XLIFF

Inserting unmodified source content into <target>

Inserting possible translation into <target>

State Machine

Editing and Context Hints

Non-deletable Inline Codes

Preserving Order of Codes

Controlling Segmentation

Providing Context

Context hints

Considerations for Using Spanning Codes

XLIFF Structure

File Structure

Role of <unit>

Miscellaneous

Value of attribute id

Whitespace Handling

Protecting Non-localizable Content

Merging Translated Content

Selecting Language Tags

Validation of Extracted Content

XLIFF Validations

Summary

References

Normative references

[XML] W3C: Extensible Markup Language (XML) - 1.026 November 2008https://www.w3.org/TR/xml/

[XLIFF-2.1] + editing hints with finer granularity.

Target Content in Extracted XLIFF

Inserting unmodified source content into <target>

Inserting possible translation into <target>

State Machine

Editing and Context Hints

Non-deletable Inline Codes

Preserving Order of Codes

Controlling Segmentation

Providing Context

Context hints

Considerations for Using Spanning Codes

XLIFF Structure

File Structure

Role of <unit>

Miscellaneous

Value of attribute id

Whitespace Handling

Protecting Non-localizable Content

Merging Translated Content

Selecting Language Tags

Validation of Extracted Content

XLIFF Validations

Summary

References

Normative references

[XML] W3C: Extensible Markup Language (XML) + 1.026 November 2008https://www.w3.org/TR/xml/

[XLIFF-2.1] Edited by David Filip, Tom Comerford, Soroush Saadatfar, Felix Sasaki, and Yves Savourel: XLIFF Version 2.112 October 2017 http://docs.oasis-open.org/xliff/xliff-core/v2.1/cos01/xliff-core-v2.1-cos01.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html -

[XLIFF-2.0] +

[XLIFF-2.0] Edited by Tom Comerford, David Filip, Rodolfo M. Raya, and Yves Savourel: XLIFF Version 2.004 August 2014 http://docs.oasis-open.org/xliff/xliff-core/v2.0/os/xliff-core-v2.0-os.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.0/xliff-core-v2.0.html -

[ISO XLIFF] +

[ISO XLIFF] Edited by Tom Comerford, David Filip, Rodolfo M. Raya, and Yves Savourel: ISO 21720:2017 - XLIFF (XML Localisation interchange file format)November 2017 https://www.iso.org/standard/71490.html -

Non-Normative References

[] Error: no bibliography entry: d5e260 found in http://cdn.docbook.org/release/xsl/bibliography/bibliography.xml

\ No newline at end of file +

Non-Normative References

[] Error: no bibliography entry: d5e263 found in http://cdn.docbook.org/release/xsl/bibliography/bibliography.xml

\ No newline at end of file diff --git a/docs/T1/WG3/XLIFF-EM-BP-ED.pdf b/docs/T1/WG3/XLIFF-EM-BP-ED.pdf index dbf171b..adb9e0a 100644 Binary files a/docs/T1/WG3/XLIFF-EM-BP-ED.pdf and b/docs/T1/WG3/XLIFF-EM-BP-ED.pdf differ diff --git a/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.html b/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.html index 3bd5a2a..cf0ff35 100644 --- a/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.html +++ b/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.html @@ -1,14 +1,14 @@ - XLIFF 2 Extraction and Merging Best Practice, Version 1.0

XLIFF 2 Extraction and Merging Best Practice, Version 1.0

Edited by

David Filip

ADAPT Centre

Ján Husarčík

Moravia

Rodolfo M. Raya

Andreas Galambos

TAPICC T1/WG3

Additional artifacts

This prose specification is one component of a Work Product that also includes:

  • Extraction and merging examples from - https://galaglobal.github.io/TAPICC/T1/WG3/wd01/XLIFF-EM-BP-V1.0-wd01/extraction_examples/readme.md

Related work

This note provides informative best practice for XLIFF 2 Specifications:

  • XLIFF Version 2.1 [[XLIFF-2.1]]

  • XLIFF Version 2.0 [[XLIFF-2.0]]

  • ISO 21720:2017 [[ISO XLIFF]]

Status

This Informational Best Practice was last revised by TAPICC T1/WG3 or the TAPICC Steering + XLIFF 2 Extraction and Merging Best Practice, Version 1.0

XLIFF 2 Extraction and Merging Best Practice, Version 1.0

Edited by

David Filip

ADAPT Centre

Ján Husarčík

Moravia

Rodolfo M. Raya

Andreas Galambos

TAPICC T1/WG3

Additional artifacts

This prose specification is one component of a Work Product that also includes:

Related work

This note provides informative best practice for XLIFF 2 Specifications:

  • XLIFF Version 2.1 [[XLIFF-2.1]]

  • XLIFF Version 2.0 [[XLIFF-2.0]]

  • ISO 21720:2017 [[ISO XLIFF]]

Status

This Informational Best Practice was last revised by TAPICC T1/WG3 or the TAPICC Steering Committee on the above date. The level of approval is also listed above. Check the “Latest version” location noted above for possible later revisions of this document.

Contributions to this deliverable or subsequent versions of this deliverable can be made via the GALA TAPICC GitHub Repository subject to signing the TAPICC Legal - Agreement.

Citation format

When referencing this specification the following citation format should be used:

[XLIFF-EM-BP]

XLIFF 2 Extraction and Merging Best Practice, Version 1.0 + Agreement.

Citation format

When referencing this specification the following citation format should be used:

[XLIFF-EM-BP]

XLIFF 2 Extraction and Merging Best Practice, Version 1.0 Edited by David Filip and Ján Husarčík. 24 January 2018. Working Draft 01. https://galaglobal.github.io/TAPICC/T1/WG3/wd01/XLIFF-EM-BP-V1.0-wd01.html. - Latest version: N/A.html.

Notices

Copyright © GALA TAPICC 2018. All rights reserved.

The Translation API Class and Cases (TAPICC) initiative is a collaborative, + Latest version: N/A.html.

Notices

Copyright © GALA TAPICC 2018. All rights reserved.

The Translation API Class and Cases (TAPICC) initiative is a collaborative, community-driven, open-source project to advance API standards in the localization industry. The overall purpose of this project is to provide a metadata and API framework on which users can base their integration, automation and interoperability efforts.

The usage of all deliverables of this initiative - including this specification - is @@ -20,7 +20,7 @@ specification shows why some Extraction approaches will cause issues during an XLIFF Roundtrip. This best practice guidance provides better thought through alternatives and shows how to use many of advanced XLIFF features for - lossless Localization roundtrip of HTML and XML based content.


Terminology and Concepts

Context hints

XLIFF attributes on structural or inline elements providing additional contexts, such + lossless Localization roundtrip of HTML and XML based content.


Terminology and Concepts

Context hints

XLIFF attributes on structural or inline elements providing additional contexts, such as disp or equiv.

Inline codes

marker

Introduction

This specification targets designers of XLIFF Extracting and Merging Tools for content owners. XLIFF Roundtrip designers of all kinds will benefit, no matter if they design their XLIFF Extractor/Merger for corporate or blog use.

Extraction and merging behavior is out of the normative scope of OASIS XLIFF @@ -43,7 +43,7 @@ best practice guidance provides better thought through alternatives and shows how to use many of advanced XLIFF features for lossless Localization roundtrip of HTML and XML based content. Most of the times there are no ultimate prescribed solutions, rather possible design goals are - described and best methods how to achieve them proposed.

Specification

Inline Codes

Representing Spanning Codes

Spanning codes in the original format are created by opening code, content and closing + described and best methods how to achieve them proposed.

Specification

Inline Codes

Representing Spanning Codes

Spanning codes in the original format consist of an opening code, content, and a closing code. In HTML this can be <b>text</b>; in RTF, \b text \b0.

In XLIFF2 such code can be represented using <sc />/<ec /> pair universally, or by <pc></pc> in case of @@ -61,33 +61,33 @@ for elements which are declared EMPTY.“ (https://www.w3.org/TR/REC-xml/#sec-starttags), e.g. even <span> without content would use <span></span> as compared to <br />. •https://issues.oasis-open.org/browse/XLIFF-14 - http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#ph

Outermost Tag Pairs

•[outermost_inline_excluded] + http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#ph

Outermost Tag Pairs

•[outermost_inline_excluded] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/outermost_inline_excluded •Both functional and formatting inline codes provide additional context for the translator and could be linguistically significant. •If they are important enough to be in the native format, - they should be present in extracted content.

Incomplete Extraction of Inline Codes

•[CDATA] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/cdata + they should be present in extracted content.

Incomplete Extraction of Inline Codes

•[CDATA] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/cdata •[inline_codes_plain_text] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/inline_codes_plain_text •http://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#d0e8112 •https://www.w3.org/TR/xml-i18n-bp/#AuthCDATA •Not using native XLIFF representation - leaves inline codes unprotected and increases risk of roundtrip corrupting them.

Representing Multiple Subsequent Codes

•[multiple_codes_represented_as_single] + leaves inline codes unprotected and increases risk of roundtrip corrupting them.

Representing Multiple Subsequent Codes

•[multiple_codes_represented_as_single] https://github.com/GALAglobal/TAPICC/tree/master/extraction_examples/multiple_codes_represented_as_single •Grouping several independent inline codes into a single representation could prove challenging, with negative impact on •Translation quality •Fluency •Functionality •Automated actions •Validation •Some codes need to be removed, copied, added or reordered. •If any of the above actions is to be prevented, it can be controlled using - editing hints with finer granularity.

Target Content in Extracted XLIFF

Inserting unmodified source content into <target>

Inserting possible translation into <target>

State Machine

Editing and Context Hints

Non-deletable Inline Codes

Preserving Order of Codes

Controlling Segmentation

Providing Context

Context hints

Considerations for Using Spanning Codes

XLIFF Structure

File Structure

Role of <unit>

Miscellaneous

Value of attribute id

Whitespace Handling

Protecting Non-localizable Content

Merging Translated Content

Selecting Language Tags

Validation of Extracted Content

XLIFF Validations

Summary

References

Normative references

[XML] W3C: Extensible Markup Language (XML) - 1.026 November 2008https://www.w3.org/TR/xml/

[XLIFF-2.1] + editing hints with finer granularity.

Target Content in Extracted XLIFF

Inserting unmodified source content into <target>

Inserting possible translation into <target>

State Machine

Editing and Context Hints

Non-deletable Inline Codes

Preserving Order of Codes

Controlling Segmentation

Providing Context

Context hints

Considerations for Using Spanning Codes

XLIFF Structure

File Structure

Role of <unit>

Miscellaneous

Value of attribute id

Whitespace Handling

Protecting Non-localizable Content

Merging Translated Content

Selecting Language Tags

Validation of Extracted Content

XLIFF Validations

Summary

References

Normative references

[XML] W3C: Extensible Markup Language (XML) + 1.026 November 2008https://www.w3.org/TR/xml/

[XLIFF-2.1] Edited by David Filip, Tom Comerford, Soroush Saadatfar, Felix Sasaki, and Yves Savourel: XLIFF Version 2.112 October 2017 http://docs.oasis-open.org/xliff/xliff-core/v2.1/cos01/xliff-core-v2.1-cos01.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html -

[XLIFF-2.0] +

[XLIFF-2.0] Edited by Tom Comerford, David Filip, Rodolfo M. Raya, and Yves Savourel: XLIFF Version 2.004 August 2014 http://docs.oasis-open.org/xliff/xliff-core/v2.0/os/xliff-core-v2.0-os.htmlhttp://docs.oasis-open.org/xliff/xliff-core/v2.0/xliff-core-v2.0.html -

[ISO XLIFF] +

[ISO XLIFF] Edited by Tom Comerford, David Filip, Rodolfo M. Raya, and Yves Savourel: ISO 21720:2017 - XLIFF (XML Localisation interchange file format)November 2017 https://www.iso.org/standard/71490.html -

Non-Normative References

[] Error: no bibliography entry: d5e260 found in http://cdn.docbook.org/release/xsl/bibliography/bibliography.xml

\ No newline at end of file +

Non-Normative References

[] Error: no bibliography entry: d5e263 found in http://cdn.docbook.org/release/xsl/bibliography/bibliography.xml

\ No newline at end of file diff --git a/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.pdf b/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.pdf index d728d4a..986afa1 100644 Binary files a/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.pdf and b/docs/T1/WG3/wd01/XLIFF-EM-BP-wd01.pdf differ