#10, #11: docs/eng-Latn/hxltm.adoc: use cases
fititnt committed Nov 29, 2021
1 parent e9362f4 commit 48460d4
Showing 1 changed file with 82 additions and 15 deletions.
97 changes: 82 additions & 15 deletions docs/eng-Latn/hxltm.adoc
@@ -30,7 +30,11 @@ General experience with terminology, even as an user of https://iate.europa.eu/f
https://unterm.un.org/[UNTERM] or an end-user interface with a similar purpose,
is helpful to understand how HXLTM uses these levels.

The `4. _Fourth-level_` (not used with this nomenclature on other standards) means arbitrary data related to entire dataset _knows_ about itself:
The **`4. _Fourth-level_`**
(a name not used by other standards), when used in HXLTM documentation, means arbitrary data that the entire dataset _knows_ about itself and that does not fit as _Abstract_ of any of the 3 levels (not even the `1. **Concept-level**`).
This can be used, for example, as a base to store **in every data row** the title or description of a TBX.

////
for example the relationship between linguistic datasets,
information about how it is processed, etc.
// It can also be used to save on HXLTM tabular format what would be on metadata from XML containers with one issue:
@@ -39,16 +43,35 @@ information about how it is processed, etc.
TIP: If you are _only_ an end user,
you can ignore references to the `4. _Fourth-level_`.
But the idea of _Concrete vs Abstract_ is relevant as it can affect how you label data.
////

[#item-meta]
==== Concrete vs Abstract
The way `1. Concept-level`, `2. Language-level` and `3. Term-level` expressions used on HXLTM also have two options of base hashtag which could be explained as making the data either concrete (like the main objective) or abstract (like metadata).
=== Concrete vs Abstract
The `1. Concept-level`, `2. Language-level` and `3. Term-level` expressions used in HXLTM each have two options of base hashtag, which make the data either concrete
(the main data, intended to always be used)
or abstract (generic metadata, or data such as very new `2. Language-level` / `3. Term-level` columns that are not ready yet).

////
Note that most terminology formats are designed to only export final data.
By default, when HXLTM tools import from them, the terms are saved with HXL hashtags that are "concrete".
////

////
The optimized use case of HXLTM is focused on emergency response **and** multilingual content:
There is special care for languages that could not be worked on in places like Europe's IATE or other humanitarian online terminology frontends because they are not prioritized.
////

////
While most examples of HXLTM made by HXP-CPLP are already published as CSVs, XLSXs and Google Sheets,
the HXLTM tooling can be used to
////
////
This distinction is made to allow ad-hoc differentiation when parsing HXL directly,
without HXLTM-aware tools,
by simply changing the base tag.
TIP: For example, you may be doing a collaborative translation, but the tools that fetch your data and publish it may be configured not to export entire columns (like new translations) that are marked as abstract.
////

////
NOTE: tools parsing HXLTM tables directly should understand
@@ -60,11 +83,8 @@ if a data source needs to be processed both by old and new tools,
this feature can be explored
////

=== Base tags used when HXLTM on tabular container
=== HXLTM on tabular container

Compared to the HXLStandard,
while the HXLTM reference tools will allow mix with other HXL tags,
most optimized operations for formats that are not tabular HXLTM will work with only `#item` and `#meta` *and* require an extra base HXL attribute.
// Such extra attribute also match the `1. Concept-level`, `2. Language-level` and `3. Term-level` idea.
The baseline HXL hashtags _(when using Latin script)_ are the following:

@@ -80,7 +100,42 @@ The baseline HXL hashtags _(when using Latin script)_ are the following:
4. _Fourth-level_
** `#x_meta`

== HXL base hashtags for HXLTM
Trivia: Compared to the HXLStandard,
while the HXLTM reference tools allow mixing with other generic HXL tags (for example, `#date`),
the most optimized operations for formats that are not tabular HXLTM work only with `#item` and `#meta` *and* require an extra base HXL attribute.
Without this extra attribute, HXLTM tools will assume you are mixing in generic HXL.
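
The rule described in the paragraph above can be sketched in Python. This is only an illustration, not the HXLTM reference implementation: the name `classify_hashtag` is hypothetical, and the sketch assumes that `#item` marks concrete data, that `#meta` marks abstract data, and that the extra base attribute is one of `+conceptum`, `+linguam` or `+terminum`. The `4. _Fourth-level_` hashtag is deliberately not covered.

```python
# Hypothetical helper for illustration; not part of the HXLTM tools.
LEVEL_ATTRIBUTES = ("conceptum", "linguam", "terminum")
BASE_KINDS = {"#item": "concrete", "#meta": "abstract"}  # assumption


def classify_hashtag(hashtag):
    """Return (kind, level) for an HXLTM-style hashtag, else ("generic", None)."""
    base, *attributes = hashtag.strip().split("+")
    kind = BASE_KINDS.get(base)
    if kind is None:
        # Any other base hashtag (for example #date) is treated as generic HXL.
        return ("generic", None)
    for level in LEVEL_ATTRIBUTES:
        if level in attributes:
            return (kind, level)
    # #item or #meta without the extra base attribute: assumed to be generic HXL.
    return ("generic", None)
```

With this sketch, `classify_hashtag("#item+conceptum+codicem")` is classified as concrete Concept-level data, while a bare `#item` or a `#date` column falls back to generic HXL.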

=== Use with not typical linguistic content

* https://tools.ietf.org/search/bcp47
** https://en.wikipedia.org/wiki/ISO_15924
** https://en.wikipedia.org/wiki/ISO_639-3

==== One non typical language

In addition to allowing linguistic content to be mixed with other data
(for example, extra metadata, codes, etc.),
it is also possible to reuse HXLTM tools for content that is not linguistic at all:
you just need to _create_ your own private language code.
Since HXLTM operates using BCP47,
the most generic base to use is ISO 15924 `Zyyy` and ISO 639-3 `zxx`:
`zxx-Zyyy` (or `+i_zxx+is_Zyyy`)

==== Several non typical languages
Either BCP47 with one or more private tags,
`zxx-Zyyy-x-privatum` (or `+i_zxx+is_Zyyy+ix_privatum`),
or other language codes and language scripts,
like `qaa-Zyyy` (or `+i_qaa+is_Zyyy`),
can be used.
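
The pairs of BCP47 tags and HXL attributes shown above follow a visible pattern, which the Python sketch below tries to capture. It is inferred only from those examples and is not the HXLTM implementation: the function name `bcp47_to_hxl_attributes` is hypothetical, only language, script and private-use subtags are handled, and emitting one `+ix_` attribute per private subtag is an assumption.

```python
import re

# Hypothetical helper for illustration; not part of the HXLTM tools.
_LANGUAGE = re.compile(r"^[a-z]{2,3}$")
_SCRIPT = re.compile(r"^[A-Z][a-z]{3}$")


def bcp47_to_hxl_attributes(tag):
    """Convert e.g. 'zxx-Zyyy-x-privatum' to '+i_zxx+is_Zyyy+ix_privatum'."""
    subtags = tag.split("-")
    language = subtags.pop(0)
    if not _LANGUAGE.match(language):
        raise ValueError(f"unsupported language subtag: {language!r}")
    result = f"+i_{language}"
    # Optional four-letter script subtag (ISO 15924), such as Zyyy or Latn.
    if subtags and _SCRIPT.match(subtags[0]):
        result += f"+is_{subtags.pop(0)}"
    if subtags:
        # Only a private-use section ("x" followed by subtags) is supported here.
        if subtags[0] != "x" or len(subtags) < 2:
            raise ValueError(f"unsupported subtags: {subtags!r}")
        # Assumption: each private-use subtag becomes its own +ix_ attribute.
        result += "".join(f"+ix_{private}" for private in subtags[1:])
    return result
```

Anything the sketch does not recognise (regions, variants, extensions) raises `ValueError` instead of guessing an attribute name.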

==== Text descriptions for non typical languages

When using HXLTM to encode either one or several non-typical languages,
for example quick "hello world" examples in programming languages,
you can write the human descriptions as definitions in a real natural language.

== HXL base hashtag for HXLTM
When working with HXLTM on a tabular container, it is necessary to specify a base HXL hashtag.

=== `+#item+`

@@ -99,14 +154,14 @@ Datasets with valid HXL base hashtags
(but not explicitly known as part of HXLTM, like your user-configurable Ontologia)
can be used when creating more generic exporters from tabular formats.

NOTE: operations related to transpose data (see <<#__linguam__>>),
NOTE: operations that transpose data (see <<#linguam>>),
which are already very advanced in order to simplify things for the end user,
carry no explicit promise that they will keep working.
If you have generic HXL tags that you want to transpose,
the more reliable way is to attach them explicitly to one of the
<<#conceptum-linguam-terminum>>.

=== Behavior for columns without HXL hashtags (but tabular dataset already is HXLated)
==== Behavior for columns without HXL hashtags (but tabular dataset already is HXLated)
HXLTM tools will not create **new** columns on HXLTM tabular datasets without HXL hashtags.
But they _MAY_ re-export columns without HXL headings when no advanced transposition is done, and _MAY_ allow exporters to specify the exact column order of the original dataset.

@@ -129,19 +184,29 @@ to add the tags used by HXLTM.

HXL attribute for **Concept-level** representation (See <<#conceptum-linguam-terminum>>).

==== `+conceptum+codicem`

=== `+linguam`

HXL attribute required for **Language-level** representation (See <<#conceptum-linguam-terminum>>).

Required: <<#__linguam__>>
Required additional attribute: <<#linguam>>

=== `+linguam+definitionem`

While each language can have several terms, the textual definition should be defined at language level.

NOTE: HXLTM intentionally does **NOT** allow setting a textual definition at Concept-level.

Required additional attribute: <<#linguam>>
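
One way to picture this rule is as a nested data structure, sketched below in Python. The class names `Concept` and `LanguageEntry` are hypothetical and are not how the HXLTM tools store data; the sketch only shows that a definition belongs to a language entry (which may hold several terms), while the concept itself has no definition field. The sample values are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageEntry:
    """Language-level data: one textual definition, any number of terms."""
    definition: str = ""
    terms: list = field(default_factory=list)


@dataclass
class Concept:
    """Concept-level data: a code plus language entries, but no definition."""
    code: str
    languages: dict = field(default_factory=dict)

    def add_term(self, language, term):
        self.languages.setdefault(language, LanguageEntry()).terms.append(term)

    def set_definition(self, language, definition):
        self.languages.setdefault(language, LanguageEntry()).definition = definition


# Illustrative values only.
concept = Concept(code="C1")
concept.add_term("eng-Latn", "hello world")
concept.add_term("eng-Latn", "hello, world")
concept.set_definition("eng-Latn", "A conventional greeting used in examples.")
```

Because `Concept` has no `definition` attribute, a definition can only be attached through a specific language.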

=== `+terminum`

HXL attribute required for **Term-level** representation (See <<#conceptum-linguam-terminum>>).

Required: <<#__linguam__>>
Required additional attribute: <<#linguam>>

[#__linguam__]
[#linguam]
=== `+__linguam__+`
Both the user documentation and the ontologia file use `+__linguam__+` to represent an unlimited (but predictable) number of HXL attributes that express the idea of language (often a language code).

@@ -211,8 +276,7 @@ hxltmdexml --agendum-linguam lat-Latn,arb-Arab testum/hxltm-salve-mundi.hxltm.xm
include::../testum/resultatum/hxltm-salve-mundi.tm.hxl.csv[]
----

> TODO: make it work with new format
> `hxltmcli hxltm-exemplum-glossarium-minimum.tm.hxl.csv --objectivum-TMX`
_TODO: make it work with new format `hxltmcli hxltm-exemplum-glossarium-minimum.tm.hxl.csv --objectivum-TMX`_

////
== Drafts
@@ -274,4 +338,7 @@ Did you know that UTX is public domain? That's fantastic!
[#TBX]
=== TermBase eXchange (TBX) (the creative commons licensed)

* https://www.tbxinfo.net/wp-content/uploads/2016/10/tbx_oscar.pdf
* http://www.terminorgs.net/downloads/TBX_Basic_Version_3.1.pdf

_TODO: add more information here_
