Reads JSON config file output by CollectionSpace application.
Gets field definitions (field_defs), including repeatability, data type, value source, XML field name and parents, etc.
Gets fields as defined for use in forms (form_fields), including the panel in which the field is included, and UI hierarchy.
Gets messages assigned to fields, panels, and input tables from field_defs and the messages hash under the profile and record types. It is assumed that messages set at the profile level override those at lower levels.
For a given profile, matches each form_field to its corresponding field_def and creates a field object that combines all info for the field. If a form_field represents a field group populated with structured date fields, the individual structured date fields are provided from the extension, and the original form_field is treated as the parent UI grouping.
Note: there may be field_defs in a profile which do not match any form_fields. Field objects are not created/reported for these, because if a field has not been made available for viewing/editing in a form, it is not considered included in the profile.
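The matching step described above can be sketched roughly as follows. This is an illustration only, not the Untangler's actual code: the FieldDef and FormField structs, the combine_fields method, the ids, and the message values are all made up for the example, and it assumes form_fields and field_defs share an id.

```ruby
# Hypothetical, simplified stand-ins for the Untangler's internal objects.
FieldDef = Struct.new(:id, :data_type, :repeats, :message, keyword_init: true)
FormField = Struct.new(:id, :panel, keyword_init: true)

# Builds one combined hash per form_field. field_defs with no matching
# form_field are never visited, so they are not reported.
# profile_messages (id => label) wins over the message on the field_def.
def combine_fields(form_fields, field_defs, profile_messages = {})
  defs_by_id = field_defs.to_h { |fd| [fd.id, fd] }
  form_fields.map do |ff|
    fd = defs_by_id[ff.id] # nil when no field_def matches this form_field
    {
      id: ff.id,
      panel: ff.panel,
      data_type: fd&.data_type,
      repeats: fd&.repeats,
      label: profile_messages.fetch(ff.id) { fd&.message }
    }
  end
end

field_defs = [
  FieldDef.new(id: 'ns_a.title', data_type: 'string', repeats: false, message: 'Title'),
  FieldDef.new(id: 'ns_a.unused', data_type: 'string', repeats: true, message: 'Unused')
]
form_fields = [FormField.new(id: 'ns_a.title', panel: 'info')]

fields = combine_fields(form_fields, field_defs, { 'ns_a.title' => 'Profile title' })
fields.each { |f| puts f.inspect }
```

In this sketch, a form_field with no matching field_def still produces a field, but with an empty data_type.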
- Tested with Ruby 2.7.4
- Do bundle --version
- If the version of Bundler is lower than 2.2.29, do gem update bundler
- Bundler should come standard with Ruby 2.7.0, but may be an older version. If you get an error that you don’t have Bundler installed when you try to check the version, do gem install bundler
- Clone this repo
- cd into cloned directory
- bundle install
- Download your configs into the appropriate data/configs directory or directories
- Configure your settings in lib/cspace_config_untangler.rb.
The benefit of this is that you can run ccu from the command line anywhere to interact with the application. If you don’t do this, you can still use the tool, but must cd into the cloned repository directory and use exe/ccu when entering a command in your terminal.
The way you do this differs depending on your operating system, terminal configuration, and whether you want it to be permanent or not, so search for instructions specific to your setup.
Once the setup is done, you should be able to cd into the cloned directory and type exe/ccu (or just ccu if you have installed as a gem) at the command prompt to get the list of available functions with their brief descriptions.
💡 The best source of info on what each function does and how to use it is the documentation available from the command line interface (CLI). For the top-level command groups:
For an overview of the specific commands inside a group (using the profiles group as an example):
For details on usage of a specific command (using the profiles compare command as an example):
There are detailed instructions for some common tasks in the doc directory.
❗ This tool can only be used confidently with configs from CollectionSpace 6.1 and newer.
- For 5.2 configs, data source values are not consistently supplied for structured date fields. This is because configuration of the structured date fields was not written out to the JSON config in a standard way until 6.0.
- The 6.1 release further refined the JSON config output, allowing the full functionality of this tool.
- Does not currently report on fields in the ns2:collectionspace_core namespace
- Does not currently report on fields in the rel:relations-common-list namespace because the way this data is defined in the config is very different from the rest
- contact and blob get reported/treated as extensions within the tool, rather than sub-records
- Does not support fields in custom namespaces added to contact or blob
- Do exe/ccu fields csv -p all and check whether the data_type column has any blank values. If so, your profile has probably configured some fields from extensions in an unexpected manner. This can cause forms/default/props/subpath values (used to create form_field ids) to not match the fields/document/…/{fieldname}/[config]/messages/name/id values (used to create field_def ids) for some fields. The Untangler is then unable to match up form_field info with field_def info to generate the combined field info required for fully-populated fields CSV, CSV template, and RecordMapper output. You’ll need to do some hard-coding somewhere in the code to get a match.
- Do you have fields with the same name in different namespaces in the same record type? Use exe/ccu fields nonunique to generate a listing of any such fields.
  - The code tries to fix this automatically, but if any non-unique field names are sneaking through, you may need to hard-code something to fix this. Otherwise, you will get two columns in your CSV template with the same header and it won’t be clear which field that data should be imported into.
- If you have record types with (a) no required field; or (b) multiple required fields, you will need to hard-code identifier_field values in record_mapper.rb’s get_id_field method.
- If you have created any form templates that include fields that are not included in your default template, your Untangler output will not include those fields. The assumption is that the default form contains all possible fields for a record type.
- RECOMMENDED: add your profile name and the last version of that profile that should be handled with fancy column/fieldname style. If you do not configure this for your profile, you will get warnings on the screen and in your log file, and data exported from CollectionSpace for round-tripping with the CSV importer may not be importable without fixing some column headers. See Other topics > Column styles for more explanation.
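The blank data_type check from the first item in the list above can be scripted instead of eyeballed. The following is a minimal sketch using Ruby's standard csv library. Only the data_type column name comes from this README; the other column names, the sample rows, and the rows_missing_data_type method are assumptions made up for illustration.

```ruby
require 'csv'

# Returns the rows whose data_type cell is missing or blank.
def rows_missing_data_type(csv_text)
  CSV.parse(csv_text, headers: true).select do |row|
    row['data_type'].to_s.strip.empty?
  end
end

# Made-up sample data standing in for the fields CSV output.
sample = <<~CSV
  profile,field_name,data_type
  demo,title,string
  demo,mystery,
CSV

missing = rows_missing_data_type(sample)
missing.each { |row| puts "#{row['profile']}: #{row['field_name']}" }
```

To check real output, pass the contents of your generated fields CSV to the method, for example with File.read and the path where you saved it.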
Since there is no way to programmatically grab the JSON config, this currently requires you to manually download the JSON config files from the following links. The JSON files should be saved as {profilename}.json in the data/configs directory.
❗ You must follow the config naming conventions specified below in order for the Untangler to properly identify profile name and version!
And for the latest dev versions of profiles:
Set the CCU.const_set('MAINPROFILE') value in lib/cspace_config_untangler.rb.
Config file name must contain the profile name and profile version.
Use _ (underscore) to separate the profile name and profile version sections of the name.
Use - (hyphen) to separate words/numbers within a section.
Examples:
anthro_4-1-2.json
my-custom-config_2-0.json
This allows the Untangler to split the config file name on _ and unambiguously determine profile name vs. profile version.
Output files follow the same convention, adding the recordtype section:
anthro_4-1-2_concept-associated.json
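To show why the convention matters, here is a minimal sketch of splitting such a file name on the underscore. The parse_config_name method is hypothetical and written for this example; it is not necessarily how the Untangler does it internally.

```ruby
# Splits a config or output file name into its sections.
# Assumes the naming convention above: sections separated by "_",
# words/numbers within a section separated by "-".
def parse_config_name(filename)
  base = File.basename(filename, '.json')
  profile, version, rectype = base.split('_')
  { profile: profile, version: version, rectype: rectype }
end

config = parse_config_name('anthro_4-1-2.json')
custom = parse_config_name('data/configs/my-custom-config_2-0.json')
output = parse_config_name('anthro_4-1-2_concept-associated.json')

[config, custom, output].each { |parts| puts parts.inspect }
```

A profile name containing an underscore would shift every section by one position, which is why hyphens are required inside a section.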
This is related to:
- the field names/column headers in CSVs exported from CollectionSpace
- the field names/column headers in the CSV templates generated by this tool, and for which mapping instructions are generated for CSV import
💡 You can pretty much ignore this if:
If you are annoyed by warnings about it on the screen and in your logs, you can configure it, but it won’t really matter what you enter as the last fancy column version.
This mainly affects fields which may be populated with terms from multiple authorities, where several columns of CSV data map into one CollectionSpace data field.
Prior to CollectionSpace 7.0, CollectionSpace export and this tool both tried to create shorter, less redundant column names using a more "fancy" algorithm, but the two tools ended up creating columns with slightly different names. We realized this, and the fact that it would require more data prep for roundtripping, while building 7.0.
In CollectionSpace 7.0 and beyond, the column names are longer and sometimes a bit internally redundant, but they are consistent with each other for both export and import.
For the community profiles, we increment the profile version with each CollectionSpace release, so the version used with 6.1 is entered in the settings as the last fancy version for each profile.
If this affects you, add a line for your profile to the default_last_fancy_column_versions hash, and include the version of your profile that was used with CollectionSpace 6.1.
❗ If you do not configure this for your profile, the consistent column naming style will be used. If you are on 6.1 and configure this correctly, you will get fancy column headers. You may still have to fix some column names for import (the pre-processing step of the import will warn you about them). You would have to fix a lot more column names if you are exporting from 6.1 (fancy export column names), but using the consistent headers in your CSV import data.
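The rule described above can be sketched as follows. This is an illustration under stated assumptions, not the Untangler's own code: it assumes the settings hash maps a profile name to a hyphen-separated version string, and the LAST_FANCY constant, its values, and the column_style method are made up for the example.

```ruby
# Hypothetical stand-in for the default_last_fancy_column_versions hash.
# The versions here are placeholders, not the real 6.1 profile versions.
LAST_FANCY = {
  'anthro' => '4-1-2',
  'my-custom-config' => '2-0'
}.freeze

# Converts a hyphen-separated version such as "4-1-2" into a comparable object.
def to_version(str)
  Gem::Version.new(str.tr('-', '.'))
end

# Returns :fancy for versions up to and including the configured last fancy
# version, and :consistent otherwise (including for unconfigured profiles).
def column_style(profile, version, last_fancy = LAST_FANCY)
  last = last_fancy[profile]
  return :consistent if last.nil?

  to_version(version) <= to_version(last) ? :fancy : :consistent
end

puts column_style('anthro', '4-1-2')
puts column_style('anthro', '5-0')
puts column_style('not-configured', '1-0')
```

Comparing with Gem::Version rather than plain strings keeps a version like 10-0 from sorting before 9-0.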