Changelog

Planned for a future release

Breaking

  • Removal of sep parameter in Deduplicate::GroupedFieldValues (deprecated in 3.3.0)

  • Removal of multival parameter in Cspace::NormalizeForId (deprecated in 3.3.0)

Unreleased

These changes are merged into the main branch, but have not been released. After merging pull requests (PRs) that are not immediately released into main, a tag is added appending the PR# to the current release. For example, if the release version/tag is 3.2.1, and PR# 107 is merged without a new release, the state of the codebase after that merge will be tagged as 3.2.1.107.
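As a minimal sketch of the tagging scheme just described, the helper below appends a PR number to a release tag. The method name `interim_tag` is hypothetical and is not part of kiba-extend; it only illustrates the arithmetic of the convention.

```ruby
# Hypothetical helper (not part of kiba-extend): builds the interim tag
# by appending the PR number to the current release tag.
def interim_tag(release, pr_number)
  unless release.match?(/\A\d+\.\d+\.\d+\z/)
    raise ArgumentError, "expected MAJOR.MINOR.PATCH, got #{release.inspect}"
  end

  "#{release}.#{Integer(pr_number)}"
end

puts interim_tag("3.2.1", 107) # prints 3.2.1.107
```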

Breaking

Bugfixes

  • Role term and subordinate body subfields for meeting names fixed in default config.

  • IterativeCleanup now automatically extends its extending module with Dry::Configurable prior to defining settings that depend on Dry::Configurable. (PR#192)

  • Kiba::Extend::Job.output? no longer fails if the given job returns nil (PR#194)

  • Reshape::FieldsToFieldGroupWithConstant constant value is no longer added to rows with no values in the renamed/remapped value fields, when fieldmap length == 1. (PR#195)
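The IterativeCleanup fix above relies on a general Ruby pattern: a mixin can use its `extended` hook to add a prerequisite module before it defines anything that depends on it. The sketch below shows that pattern only. `SettingsDSL` is a stand-in for Dry::Configurable so the example runs without the gem, and `CleanupLike`, `Places`, and the `:base_name` setting are all hypothetical names.

```ruby
# Stand-in for Dry::Configurable so the sketch runs without the gem.
# It only provides a `setting` macro and a `settings` store.
module SettingsDSL
  def setting(name, default: nil)
    settings[name] = default
  end

  def settings
    @settings ||= {}
  end
end

# Hypothetical mixin that needs `setting` to exist on whatever extends it.
module CleanupLike
  def self.extended(mod)
    # Add the DSL first if the extending module does not already have it
    mod.extend(SettingsDSL) unless mod.singleton_class.include?(SettingsDSL)
    mod.setting(:base_name, default: mod.name.to_s.downcase)
  end
end

module Places
  extend CleanupLike # no separate `extend SettingsDSL` is needed
end

puts Places.settings[:base_name] # prints places
```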

Added

  • MARC::LanguageCodeLookup transform

  • Ability to pass find argument to Clean::RegexpFindReplaceFieldVals as a Regexp object. Not sure why this was not the default initial behavior, but here we are! (PR#196)

  • Ability to pass delim argument to Append::ToFieldValue to trigger multi-value treatment (PR#200)
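To illustrate what multi-value treatment triggered by a delim argument means, here is a plain-Ruby sketch. It is not the Append::ToFieldValue implementation: `append_to_field_value` is a hypothetical helper, and leaving empty segments untouched is an assumption of this sketch.

```ruby
# Hypothetical stand-in, not the library's implementation.
def append_to_field_value(value, suffix, delim: nil)
  return value if value.nil? || value.empty?
  return "#{value}#{suffix}" unless delim

  # -1 keeps trailing empty segments so the number of values is preserved
  value.split(delim, -1)
    .map { |part| part.empty? ? part : "#{part}#{suffix}" } # assumption: skip empties
    .join(delim)
end

# Without delim, the whole field value is treated as one string
puts append_to_field_value("a|b", " (draft)") # prints a|b (draft)
# With delim, each value in the field gets the suffix
puts append_to_field_value("a|b", " (draft)", delim: "|") # prints a (draft)|b (draft)
```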

Changed

  • MARC name extraction transforms now supply "uniform title name" as a role term in fields having a $t value. This change supports the fact that some applications may not wish to treat these values as standalone names, and this makes it possible to filter out these values. (PR#199)

Deleted

Deprecated/Will break in a future version

Dev

  • Improve error handling for resolution of lookups for individual jobs (PR#191)

Releases

4.0.1 - 2023-09-13

  • Simplification of requirements for IterativeCleanup usage

  • Complete documentation for IterativeCleanup

  • Switch to kramdown Markdown conversion for YARD

4.0.0 - 2023-09-11

Breaking

  • Nested job.show_me, job.tell_me, and job.verbosity config settings have been removed. (They were deprecated in 3.2.0)

  • Kiba::Common::Sources and Kiba::Common::Destinations are no longer supported. To use an external source or destination class, subclass it in a Kiba::Extend source or destination class that extends Kiba::Extend::Sourceable or Kiba::Extend::Destinationable (PR#139)

  • Using a supplied file registry entry as the destination of a job raises an error, since the definition of a supplied entry is that it is not created by a job in the project. (PR#139)

Added

New destinations
  • Destinations::Marc (PR#138)

  • Destinations::Lambda (PR#139)

New sources
  • Sources::CSV (PR#139)

  • Sources::Enumerable (PR#139)

  • Sources::JsonDir (PR#140)

New job types
  • Jobs::JsonToCsvJob (PR#140)

New transforms
  • Clean::EnsureConsistentFields (PR#140)

  • Delete::FieldnamesStartingWith (PR#156)

  • Explode::RowsFromGroupedMultivalFields (PR#165)

  • Fingerprint::FlagChanged (PR#155)

  • Fingerprint::MergeCorrected (PR#157)

  • Marc::FilterRecords::ById (PR#138)

  • Marc::FilterRecords::WithLambda (PR#138)

  • Marc::ExtractMeetingNameData (PR#164)

  • Marc::ExtractOrgNameData (PR#137)

  • Marc::ExtractPersonNameData (PR#137)

  • Marc::ExtractNameData (PR#137)

  • Marc::ExtractSubfieldsFromField (PR#141)

  • Replace::NormWithMostFrequentlyUsedForm (PR#167)

  • Sort::ByFieldValue (PR#151)

  • Split::PublicationStatement transform (PR#142)

New Transforms::Helpers
  • OrgNameChecker (PR#148)

  • PersonNameChecker (PR#161)

New params/options
  • CombineValues::FromFieldWithDelimiter can now take sources: :all, and will provide space as a default delim if not provided (PR#147)

  • CombineValues::FromFieldWithDelimiter can now take delete_sources and prepend_source_field_name args (PR#147)

  • :mode parameter added to Jobs::BaseJob (PR#154, PR#157)

Other
  • Utility classes to clean ISBD trailing punctuation from name and role term values extracted from MARC data (PR#141)

  • Kiba::Extend::Job.output? convenience method (PR#150)

  • Job duration report (added to normal and verbose job run) (PR#154, PR#157)

  • IterativeCleanup mixin (PR#180)

Changed

  • Transforms that take an action argument now mix in the new ActionArgumentable module and validate the argument values in a consistent way (PR#138)

  • Name and role term values extracted from MARC data by subclasses of Transforms::Marc::ExtractBaseNameData are run through Utils::MarcNameCleaner and Utils::MarcRoleTermCleaner (PR#141)

  • Fingerprint::Add now passes in default delim: U+241F / E2 90 9F / Symbol for Unit Separator (PR#155)

  • Fingerprint::Decode now passes in default delim (U+241F / E2 90 9F / Symbol for Unit Separator), and default prefix (fp) (PR#155)

  • Fingerprint::FlagChanged can now be passed an ignore_fields parameter indicating fields included in the fingerprint, but which should not be compared to current values and flagged (PR#168)
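The sketch below illustrates the idea behind the fingerprint defaults listed above: field values are joined with U+241F, encoded, and later decoded back into prefixed fields. Base64 as the encoding step, the helper names, and the `fp_<field>` naming of decoded fields are assumptions of this sketch, not confirmed details of Fingerprint::Add or Fingerprint::Decode. It also shows why a decoded value needs to be relabeled as UTF-8: Base64 decoding in Ruby returns a binary (ASCII-8BIT) string.

```ruby
DELIM = "\u241F" # SYMBOL FOR UNIT SEPARATOR

# Hypothetical helpers; the Base64 step is an assumption of this sketch.
def add_fingerprint(row, fields)
  joined = fields.map { |field| row[field].to_s }.join(DELIM)
  # pack("m0") is strict Base64 (no line breaks), available in core Ruby
  row.merge(fingerprint: [joined].pack("m0"))
end

def decode_fingerprint(row, fields, prefix: "fp")
  decoded = row[:fingerprint].unpack1("m0")
  # The decoded string is tagged ASCII-8BIT; relabel it as UTF-8 so it can
  # be split on, and compared with, UTF-8 values
  values = decoded.force_encoding("UTF-8").split(DELIM, -1)
  fields.each_with_index do |field, idx|
    row["#{prefix}_#{field}".to_sym] = values[idx]
  end
  row
end

row = add_fingerprint({name: "Zo\u00EB", note: "x"}, %i[name note])
out = decode_fingerprint(row, %i[name note])
puts out[:fp_name] # prints Zoë
```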

Bugfixes

  • Fixes #46 - CombineValues::FullRecord with multi-sources can result in different values (PR#147)

  • Fixes issue where job registry entry with a Marc source and CSV destination could not be used as a source or lookup in jobs (PR#137)

  • Fixes issue in StringValue::ToArray transform where delim=nil was not correctly being calculated (PR#145)

  • Fixes #152: Fingerprint::Decode error: Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8 (PR#153)

  • Fixes #162: failure of Delete::EmptyFields transform when passed a source with no rows

  • Fixes #179: renaming field with same fieldname in from and to resulted in the field being deleted (PR#181)

Deprecated/Will break in a future version

  • sep parameter will be replaced by delim in CombineValues::FromFieldWithDelimiter and CombineValues::FullRecord (PR#147)

Dev

  • Adds Kiba::Extend::ErrMod module to be included into Kiba::Extend-specific error classes. This allows us to subclass each application-specific error to the semantically appropriate Ruby exception class, while retaining the ability to identify/scope/rescue only application-specific errors. (PR#138)

  • Add :info method to Kiba::Extend::ErrMod module, to print error type, message, and backtrace to STDOUT in a consistent way. (PR#141)

  • Set up standardrb linting, with Kristina’s standard (ha) minor overrides (PR#169)
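The ErrMod approach described above can be sketched in plain Ruby: a marker module is included into error classes that each inherit from the most fitting standard exception, and `rescue` on the module catches only those errors. All names here (`AppErrMod`, `BadActionError`, `MissingSourceError`, `risky`) are hypothetical, and the output format of `info` is an assumption.

```ruby
# Hypothetical names throughout; only the pattern is taken from the entry above.
module AppErrMod
  # Print error type, message, and backtrace in one consistent format
  def info
    puts "Error: #{self.class.name}"
    puts "Message: #{message}"
    puts "Backtrace:", (backtrace || []).first(5)
  end
end

# Each error subclasses the semantically appropriate Ruby exception...
class BadActionError < ArgumentError
  include AppErrMod
end

class MissingSourceError < IOError
  include AppErrMod
end

def risky(kind)
  raise BadActionError, "action must be :keep or :reject" if kind == :arg
  raise MissingSourceError, "source file not found" if kind == :io
  raise KeyError, "unrelated failure"
end

# ...while `rescue AppErrMod` still catches only application errors
%i[arg io].each do |kind|
  begin
    risky(kind)
  rescue AppErrMod => err
    err.info
  end
end
```

A plain `KeyError` raised by `risky(:other)` would not be caught by `rescue AppErrMod`, which is the scoping benefit the entry describes.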

3.3.0 - 2023-02-17

Added

  • StringValue::ToArray transform (PR#111)

  • Two mixin modules to Transforms to support deprecating non-preferred parameter signatures:

    • Transforms::SepDeprecatable (PR#124)

    • Transforms::MultivalPlusDelimDeprecatable (PR#124)

  • Utils::StringNormalizer service class (PR#124)

  • ignore_case and normalized parameters for Deduplicate::GroupedFieldValues (PR#124)

  • Options to Clean::StripFields transform (PR#129):

    • Ability to pass fields: :all to strip all fields in a table

    • Ability to turn on multivalue processing by passing in a delim value

  • More country mappings to Cspace::AddressCountry transform (PR#132)

  • Marc source, MarcJob, Kiba::Extend::Marc configuration module, Utils::MarcIdExtractor, and an initial Marc::Extract245Title transform (PR#134)

Bugfixes

  • Catch Merge::MultiRowLookup transform created with empty fieldmap and raise error on initialization, rather than letting it blow up Utils::Fieldset later (PR#127)

  • Fix #121 (PR#122)

Deprecated/Will break in a future version

  • sep parameter in Deduplicate::GroupedFieldValues (PR#124)

  • multival parameter in Cspace::NormalizeForId (PR#124)

Dev

  • Run RSpec in random order with seed (PR#124)

3.2.2 - 2022-09-23

Added

  • Fraction::ToDecimal transform (and supporting Utils::ExtractFractions and Data::ConvertibleFraction classes) (PR#108)

  • yardspec gem to support running YARD examples as RSpec tests (PR#107)

  • Branch coverage to simplecov setup (PR#107)

Changed

  • Tests for the Prepend::ToFieldValue transform converted to use yardspec (PR#107)

Bugfixes

  • No longer falls over when a project has nested job config settings (scope changes when used in a project, and the private :warn_unnested method couldn’t be called)

3.2.1 - 2022-09-21

Added

  • Config setting to control string used as registry namespace separator

Bugfixes

  • Require the kiba-common ShowMe extension so that option actually works when running jobs

Changed

  • Refactoring lib/kiba/extend.rb so inter-application require statements can be removed

3.2.0 - 2022-09-20

Added

  • Configurable pre-job task handling

  • Kiba::Extend::Registry::FileRegistry.finalize method

  • Unnested job_show_me, job_tell_me, and job_verbosity config settings.

Deprecated/Will break in a future version

  • Nested job.show_me, job.tell_me, and job.verbosity config settings.

3.1.0 - 2022-09-20

Added

  • Add publicly readable srcrows and outrows on Kiba::Extend::Jobs::BaseJob (inherited by all job types). This makes it possible to do things like this in client projects:

job = Kiba::Extend::Command::Run.job(:prep__objects)
puts "Some records omitted" if job.outrows < job.srcrows

These attributes were previously only accessible via:

job.context.instance_variable_get(:@srcrows)
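The difference between the two access styles can be seen in a self-contained sketch. `CountingJob` is a hypothetical stand-in, not Kiba::Extend::Jobs::BaseJob; it only shows public readers next to reflection on instance variables.

```ruby
# Hypothetical stand-in for a job object; only the reader pattern is shown.
class CountingJob
  attr_reader :srcrows, :outrows

  def initialize(rows)
    @srcrows = rows.length
    # Count the rows for which the caller's block returns a truthy value
    @outrows = rows.count { |row| yield(row) }
  end
end

job = CountingJob.new([{id: 1}, {id: nil}, {id: 3}]) { |row| row[:id] }

# Public readers are part of the object's interface
puts "Some records omitted" if job.outrows < job.srcrows

# Reflection reaches the same value but bypasses the public interface
puts job.instance_variable_get(:@srcrows) # prints 3
```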

Changed

  • Refactored Thor tasks. Moved nearly all logic/behavior into the Kiba::Extend::Command namespace, where it can be called by Thor tasks or directly by client projects. This leaves /lib/tasks more purely about defining the CLI interaction.

3.0.0 - 2022-08-26

Breaking

  • See the list of deleted transforms, helpers, and params below.

  • Split::IntoMultipleColumns transform: no longer removes spaces between split segments that end up collapsed left or right. This was a bug, but fixing it could cause jobs relying on that behavior (or introducing subsequent transforms to deal with it) to fail or generate unexpected results.

Added

Changed

  • Split::IntoMultipleColumns: If empty string is passed in as the value to be split, all newly created fields will be nil

Bugfixes

  • Split::IntoMultipleColumns no longer removes existing spaces between segments that get right/left collapsed

  • Fixes incorrect value splitting in Split::IntoMultipleColumns

  • Reshape::FieldsToFieldGroupWithConstant now works with single source fields (i.e. listed in fieldmap param) with nil values

Deleted

  • Transforms

    • Clean::DelimiterOnlyFields

    • CombineValues::AcrossFieldGroup

    • Reshape::CollapseMultipleFieldsToOneTypedFieldPair

    • FilterRows::FieldValueGreaterThan

  • Transform Helpers

    • Helpers.delim_only?

    • Helpers.field_values

  • Parameters

    • multival and sep parameters from Replace::FieldValueWithStaticMapping transform

2.9.0 - 2022-07-28

Breaking

  • Removes Hash conditions parameter and sep parameter from Merge::ConstantValueConditional transform, replacing with lambda Proc condition parameter. In PR#88

  • Removes the global Kiba::Extend::DELIM and Kiba::Extend::CSVOPT constants. They were finally removed from the final few places they were being used within kiba-extend, and they have been removed from the application setup. This is only relevant if you have called these constants from outside Kiba::Extend for some reason.

Added

  • New service object classes in Transforms::Helpers in PR#93:

    • DelimOnlyChecker

    • FieldValueGetter

    • RowFieldEvennessChecker

  • New transforms:

    • Clean::EvenFieldValues (in PR#93)

    • Collapse::FieldsToRepeatableFieldGroup (in PR#93)

    • Collapse::FieldsToTypedFieldPair (in PR#93)

    • Collapse::FieldsWithCustomFieldmap (in PR#93)

    • Deduplicate::FlagAll (in PR#93)

    • Delete::DelimiterOnlyFieldValues (in PR#93)

    • Delete::EmptyFieldGroups (in PR#93)

    • FilterRows::AllFieldsPopulated (in PR#85)

    • FilterRows::AnyFieldsPopulated (in PR#85)

    • FilterRows::WithLambda (in PR#85)

    • Merge::ConstantValues (in PR#84)

    • Replace::EmptyFieldValues (in PR#93)

    • Reshape::FieldsToFieldGroupWithConstant (in PR#93)

    • Warn::UnevenFields (in PR#93)

  • stripextra csv converter to do aggressive stripping of csv field values, without converting 'NULL' strings to nil values. In PR#91

  • ignore_case parameter to FilterRows::FieldMatchRegexp transform. Defaults to false for backward compatibility. In PR#85

Changed

  • BUGFIX: Utils::Lookup::RowSorter no longer fails if all rows given to sort have blank values. In PR#93

  • BUGFIX: Clean::EmptyFieldGroups was broken if sep = | and use_nullvalue = true. In PR#93

  • BUGFIX: No longer runs the same dependency job multiple times. In PR#90

  • In Merge::ConstantValueConditional transform, lambda Proc is passed in as condition, rather than conditions. In PR#88

  • If source data is an ISO 3166 code, Cspace::AddressCountry passes that value through to target. Adds some more lookup keys to support client data set. In PR#87

  • Merge::ConstantValue warns (once per transform) if target is an existing field containing any data. In PR#84

  • BUGFIX: RowSorter checks for presence of sortfield and raises error if it doesn’t exist, rather than trying to proceed and blowing up. In PR#83

Deleted

  • Removes Hash conditions parameter and sep parameter from Merge::ConstantValueConditional transform, replacing with lambda Proc condition parameter. In PR#88
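A lambda condition of the kind mentioned above can be sketched as follows. `ConstantValueIf` is a hypothetical class, not Merge::ConstantValueConditional; the sketch assumes the lambda receives the row and returns a truthy value when the constant should be merged in, and that the target is set to nil otherwise.

```ruby
# Hypothetical stand-in for a conditional constant-value merge.
class ConstantValueIf
  def initialize(target:, value:, condition:)
    @target = target
    @value = value
    @condition = condition
  end

  # Follows the Kiba transform convention of a `process(row)` method
  def process(row)
    row[@target] = @condition.call(row) ? @value : nil
    row
  end
end

xform = ConstantValueIf.new(
  target: :status,
  value: "needs review",
  condition: ->(row) { row[:title].to_s.empty? }
)

puts xform.process({title: ""})[:status].inspect # prints "needs review"
puts xform.process({title: "Atlas"})[:status].inspect # prints nil
```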

To be deprecated/Will break in a future version

  • Deprecates Helpers.delim_only?, replacing with Helpers::DelimOnlyChecker service class. In PR#93

  • Deprecates Clean::DelimiterOnlyFields, replacing with Delete::DelimiterOnlyFieldValues. In PR#93

  • Deprecates Reshape::CollapseMultipleFieldsToOneTypedFieldPair, replacing with Collapse::FieldsToTypedFieldPair. In PR#93

  • Deprecates CombineValues::AcrossFieldGroup, replacing with Collapse::FieldsWithCustomFieldmap. In PR#93

  • Deprecates FilterRows::FieldValueGreaterThan. In PR#86

2.8.0 - 2022-05-13

Breaking

  • Count::MatchingRowsInLookup previously returned Integers. Now it defaults to returning Strings, since many of the transforms assume all field values will be strings. If you were calling Count::MatchingRowsInLookup in a job and working with the integer result as an integer within that job, this will be a breaking change. In PR#69
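The practical effect of string counts is easiest to see in plain Ruby. The rows and the `:ct` field below are hypothetical, as is the helper `rows_with_count_over`; the point is only that string values compare lexically, so an explicit conversion is needed wherever a job relied on integer behavior.

```ruby
# Hypothetical helper: select rows whose count field, read as an Integer,
# exceeds a threshold.
def rows_with_count_over(rows, field, min)
  rows.select { |row| row[field].to_i > min }
end

rows = [{id: "a", ct: "0"}, {id: "b", ct: "12"}, {id: "c", ct: "3"}]

# Strings compare lexically, so "3" is the "largest" count here
puts rows.map { |row| row[:ct] }.max # prints 3
# Converting first gives the numeric answer
puts rows.map { |row| row[:ct].to_i }.max # prints 12
puts rows_with_count_over(rows, :ct, 2).map { |row| row[:id] }.join(",") # prints b,c
```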

Added

  • Lookup::RowSorter class and the ability to pass it in as an argument to Merge::MultiRowLookup transform to explicitly control the order in which matching rows are merged. In PR#82

  • Ability to pass in a Lambda as a conditions argument on transforms. This provides a more straightforward and infinitely flexible alternative to the horrible, poorly documented Hash expression of conditions. In PR#82

  • Add Rename::Fields transform. In PR#75

  • Add Name::SplitInverted and Name::ConvertInvertedToDirectForm transforms. In PR#74

  • Add Allable mixin module for transforms that accept fields: :all. In PR#73

  • Add Cspace::AddressCountry transform. In PR#72. Made more configurable in PR#75

  • Add null_placeholder parameter to Merge::MultiRowLookup, which will replace any blank values in merged field values with the given string. Useful for building repeating field groups in CollectionSpace migrations. In PR#70
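Why a null placeholder helps with repeating field groups can be shown with a small sketch. `merge_values` is a hypothetical helper, not Merge::MultiRowLookup, and dropping blank values when no placeholder is given is an assumption made here to show the alignment problem.

```ruby
# Hypothetical helper, not the library's code.
def merge_values(matches, field, delim: "|", null_placeholder: nil)
  values = matches.map { |match| match[field].to_s.strip }
  values = if null_placeholder
    values.map { |val| val.empty? ? null_placeholder : val }
  else
    values.reject(&:empty?) # assumption: blanks are dropped without a placeholder
  end
  values.join(delim)
end

matches = [
  {name: "Ann", role: "donor"},
  {name: "Bo", role: nil},
  {name: "Cy", role: "lender"}
]

puts merge_values(matches, :name) # prints Ann|Bo|Cy
# Two roles for three names: "lender" now lines up with "Bo"
puts merge_values(matches, :role) # prints donor|lender
# The placeholder keeps the second position occupied
puts merge_values(matches, :role, null_placeholder: "%NULLVALUE%") # prints donor|%NULLVALUE%|lender
```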

Changed

  • Raise LookupTypeError when Merge::MultiRowLookup is called with lookup parameter that is not a Hash. In PR#81

  • Improved exception handling when MissingDependencyError is raised. In PR#80

  • Improved error message for Copy::Field. In PR#78

  • Add improved error handling in jobs when a transform raises a Kiba::Extend::Error. In PR#77

  • Improved exception handling when KeyNotRegisteredError is raised, as per GH#64. In PR#79

  • More informative error message if you pass in a non-existent using hash when calling Deduplicate::Flag transform. In PR#76

  • Rename::Field now warns if the to field already exists and will be overwritten. In PR#75

  • Use zeitwerk for autoloading. In PR#75. Bugfix for use in projects implemented in PR#76 via eager autoload.

  • Make Delete::EmptyFieldValues Allable. In PR#73

  • If given an "existing" field that does not exist, Rename::Field transform will warn about it, but not throw an exception. This supports building reusable jobs where the data may be slightly different from use to use. In PR#71

  • BUGFIX: Clean::RegexpFindReplaceFieldVals now skips non-string field values instead of trying to call :gsub on them and failing with NoMethodError. In PR#68
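The last bugfix above comes down to a type guard before calling `gsub`. The sketch below shows that guard in isolation; `find_replace` is a hypothetical helper and not the transform's actual code.

```ruby
# Hypothetical stand-in showing the type guard only.
def find_replace(row, fields:, find:, replace:)
  fields.each do |field|
    val = row[field]
    # nil, Integer, and other non-String values pass through untouched,
    # since they do not respond to gsub
    next unless val.is_a?(String)

    row[field] = val.gsub(find, replace)
  end
  row
end

out = find_replace(
  {title: "A   B", count: 3, note: nil},
  fields: %i[title count note],
  find: /\s+/,
  replace: " "
)
puts out[:title] # prints A B
puts out[:count] # prints 3
```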

2.7.2 - 2022-04-05

Added

  • When setting up a file registry hash, creator may be a Hash if you need to pass keyword arguments to your job. See File registry entry reference for more info and examples. In PR#67

  • When setting up a file registry hash, creator may be a Module if the relevant job is a private instance method named with the configured default_job_method_name (The default is :job). See File registry entry reference for more info and examples. In PR#67

  • default_job_method_name config setting. In PR#67

  • Fingerprint::Add and Fingerprint::Decode transforms. In PR#65

  • override_app_delim_check param to Fingerprint::Add, for backward compatibility with a project in which I want to be able to use this transform. Defaults to false. In PR#66

Changed

  • Moves Merge::CompareFieldsFlag to Compare::FieldValues. Aliases the old transform to the new one for backward compatibility, but raises deprecation warning. In PR#62

  • Fingerprint::Decode forces field values to UTF-8, preventing CSV write errors. In PR#66

2.7.1 - 2022-03-10

Added

  • Kiba::Extend::Utils::MultiSourceNormalizer and Kiba::Extend::Jobs::MultiSourcePrepJob to handle normalization of fields across multiple sources to be used in a multiple-source job with a Kiba::Extend::Destinations::CSV destination (in PR#60)

  • explicit_no argument to Kiba::Extend::Transforms::Deduplicate::Flag. Defaults to true for backward compatibility (in PR#60)

  • amazing_print dependency (in PR#61)

2.6.1 - 2022-03-09

Breaking

  • mvdelim keyword argument removed from Prepend::ToFieldValue, and replaced by multival and delim

Added

  • Binstub for running rspec without bundler exec (given that you add kiba-extend/bin to your PATH) (in PR#59)

  • lookup_on to registry entry summary (in PR#59)

Changed

  • Explode::RowsFromMultivalField defaults to using Kiba::Extend.delim if no delim keyword argument is passed in (in PR#58)

  • Some documentation formatting fixed (Issue #53) (in PR#58)

  • Requires higher versions of Ruby, Bundler, and RSpec (in PR#59)

2.6.0 - 2022-02-24

Breaking

  • Changes to keyword argument names for Delete::FieldValueIfEqualsOtherField (in PR#57)

    • sep becomes delim

    • case_sensitive becomes casesensitive

Added

  • multival parameter added to Cspace::NormalizeForID transform (in PR#49)

  • new Count::FieldValues transform (in PR#50)

  • new Append::ConvertedValueAndUnit transform (in PR#51)

  • preparation of the file registry:

    • warns of any supplied files that do not exist (in PR#54)

    • creates any reference directories that do not exist (in PR#54)

  • test that Clean::RegexpFindReplaceFieldVals can replace \n (in PR#55)

  • Helpers.empty? method, which returns true/false for a given string value (without treating delimiter values as special) (in PR#57)

  • fields keyword argument to Delete::FieldsExcept, which should be used going forward instead of keepfields (in PR#57)

  • nullvalue setting to Kiba::Extend.config. Default value is '%NULLVALUE%' (in PR#57)

  • usenull keyword argument to Delete::EmptyFieldValues (in PR#57)

  • delim keyword argument to Delete::EmptyFieldValues, which should be used going forward instead of sep (in PR#57)

  • documentation for Delete transforms (in PR#57)

  • Delete::BlankFields transform (in PR#57)

Changed

  • move/alias Merge::CountOfMatchingRows to Count::MatchingRowsInLookup (in PR#50)

  • Delete::FieldsExcept can accept a single symbol as value for fields keyword argument (in PR#57)

  • Delete::EmptyFieldValues will default to Kiba::Extend.delim as delimiter if none given explicitly (in PR#57)

  • keyword argument names for Delete::FieldValueIfEqualsOtherField (in PR#57)

    • sep becomes delim

    • case_sensitive becomes casesensitive

Deleted

  • Removed JARD as development dependency (in PR#52)

  • Removed -t alias from jobs:tagged_and and jobs:tagged_or tasks, as they conflicted with the -t/--tell option (in PR#56)

To be deprecated/Will break in a future version

These will now give warnings if used.

  • Delete::FieldsExcept keepfields keyword parameter. Change to fields (in PR#57)

  • Delete::EmptyFieldValues sep keyword parameter. Change to delim (in PR#57)
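A deprecation of this kind is commonly handled with a small shim that accepts both keywords and warns on the old one. The sketch below shows one way to do that in plain Ruby. `EmptyValueDeleter` is a hypothetical class, the warning text is invented for illustration, and the `"|"` fallback is an assumption (the entries above indicate the library falls back to Kiba::Extend.delim).

```ruby
# Hypothetical transform illustrating a deprecated-keyword shim.
class EmptyValueDeleter
  def initialize(fields:, delim: nil, sep: nil)
    if sep
      warn("DEPRECATED: `sep` keyword; use `delim` instead")
      delim ||= sep # an explicit delim wins if both are given
    end
    @fields = Array(fields)
    @delim = delim || "|" # assumption: stand-in for an application default
  end

  def process(row)
    @fields.each do |field|
      val = row[field]
      next unless val.is_a?(String)

      row[field] = val.split(@delim).reject(&:empty?).join(@delim)
    end
    row
  end
end

old_style = EmptyValueDeleter.new(fields: :tags, sep: ";") # warns on stderr
new_style = EmptyValueDeleter.new(fields: :tags, delim: ";")
puts old_style.process({tags: "a;;b;"})[:tags] # prints a;b
puts new_style.process({tags: ";a;b"})[:tags] # prints a;b
```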