The following is a summary of changes and improvements to :mod:`eulfedora`.
- Fix UTF-8 issue introduced in 1.5 (for details, see discussion on PR #21)
- Updated for compatibility with requests 2.11
- Fix XmlDatastream options being overwritten when defaults are not specified #7
- Code cleanup based on landscape.io reports
- Several improvements and fixes for repo-cp and syncutil:

  - Refined datastream regular expression: more accurately grab the correct datastream id and checksum, avoiding spurious checksum errors; always grab the last match found, in case a section includes multiple datastream versions
  - Expose verify option in the repo-cp script options (MD5 decoded content)
  - Fix omit checksums regular expression to work under Python 3
  - Improve the explanation for the archive xml sync option
  - Update ReadableIterator to handle inaccurate sizes provided in the Fedora export datastream info for some objects
  - Add logging and expose it in repo-cp script via verbosity option

- Fix unit test imports so tests can be run without Django
- Updated syncrepo for Django 1.9+ compatibility
- Improved django-debug-toolbar integration and updated for current version.
- Configured continuous integration on travis-ci.
- Updated unit tests so they can be run with or without Django installed, and work for multiple versions of Django. Configured travis-ci to test against multiple versions of Django and without Django.
- Fixed missing django view documentation on readthedocs #20
- New custom django-debug-toolbar panel to view Fedora API requests used to generate a django page.
- Clarify confusing documentation for setting content on ``DatastreamObject`` and ``FileDatastreamObject``. Thanks to @bcail. #20, PR #21
- New Django exception filter ``eulfedora.util.SafeExceptionReporterFilter`` to suppress Fedora session password when an exception occurs within the API request
- Add retries option to :class:`eulfedora.server.Repository` to configure requests max retries when making API calls, in case of errors establishing the connection. (Defaults to 3; configurable in Django settings as FEDORA_CONNECTION_RETRIES)
- Bugfix: datastream isModified detection error in some cases when XML content is empty, resulting in errors attempting to save (especially when the datastream does not exist; cannot add with no content)
- Now Python 3 compatible, thanks in large part to Morgan Aubert (@ellmetha).
- New, more efficient version of :class:`eulfedora.views.RawDatastreamView` and :meth:`eulfedora.views.raw_datastream`. Passes response headers from Fedora, and takes advantage of support for HEAD and Range requests in Fedora 3.7+. NOTE that the method signature has changed. The previous implementation is still available as :class:`eulfedora.views.RawDatastreamViewOld` and :meth:`eulfedora.views.raw_datastream_old` for those who need the functionality.
- Updated functionality for synchronizing content between Fedora repositories: :mod:`eulfedora.syncutil` for programmatic access and repo-cp for command-line. Now supports Fedora archive export format and better handling for large objects.
- Upload API method (:meth:`eulfedora.api.REST_API.upload`) now supports iterable content with known size.
- Updated to require requests 2.9 or greater.
- New streaming option for :class:`eulfedora.views.RawDatastreamView` and :meth:`eulfedora.views.raw_datastream` to optionally return a :class:`django.http.StreamingHttpResponse` (intended for use with large datastream content).
- New repo-cp script (BETA) for synchronizing content between Fedora repositories (e.g., production to QA or development servers, for testing purposes).
- Require a version of python-requests earlier than 2.9 (2.9 includes change to upload behavior for file-like objects that breaks eulfedora api uploads as currently handled in eulfedora).
- Tutorial updated to be compatible with Django 1.8 thanks to jaska @chfw.
- New class-based view :class:`eulfedora.views.RawDatastreamView`, equivalent to :meth:`eulfedora.views.raw_datastream`.
- Access to historical versions of datastreams now available in :meth:`eulfedora.models.DigitalObject.getDatastreamObject` and :meth:`eulfedora.views.raw_datastream`.
- Change checksum handling to cue Fedora to auto-generate checksums on ingest.
- Recommended: Fedora 3.7+ for automatic checksum support on ingest

.. note::

   The checksum change in this release is a work-around for a Fedora bug present in 3.8 (at least, possibly 3.7), where passing a checksum type with no checksum value results in Fedora storing an empty checksum, where previously it would calculate and store a checksum. On ingest, if a checksum type but no checksum value is specified, no checksum information will be sent to Fedora (when checksum type and checksum value are both specified, they will be passed through to Fedora normally). If you have auto-checksumming configured in Fedora, then your checksums should be generated automatically. Note that auto-checksum functionality on ingest was broken until Fedora 3.7 (see https://jira.duraspace.org/browse/FCREPO-1047); if you are still using an older version of Fedora and need checksums generated at ingest, you should use eulfedora 1.1.

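
The ingest rule described in the note can be sketched as a small helper. This is an illustration only, not eulfedora's actual code: the function name ``ingest_checksum_params`` is hypothetical, the dictionary keys are assumed to mirror the Fedora REST API parameters ``checksumType`` and ``checksum``, and dropping a checksum value that has no type is an assumption the note does not cover.

```python
def ingest_checksum_params(checksum_type=None, checksum=None):
    """Return the checksum parameters to send to Fedora on ingest.

    Hypothetical helper, written only to illustrate the rule in the note.
    """
    if checksum_type and checksum:
        # Both supplied: pass them through to Fedora unchanged.
        return {"checksumType": checksum_type, "checksum": checksum}
    # Type without a value (or nothing at all): send no checksum
    # information, so Fedora's own auto-checksum configuration applies.
    # Assumption: a value without a type is dropped as well.
    return {}
```

With this sketch, ``ingest_checksum_params("MD5")`` yields an empty dictionary, so a bare checksum type never reaches Fedora.
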
- :class:`~eulfedora.models.ReverseRelation` now includes an option for specifying a property to be used for sorting resulting items. Can also be specified for reverse relations autogenerated by :class:`~eulfedora.models.Relation`.
- :mod:`unittest2` is now optional for testutils
- Use python :mod:`json` for :mod:`eulfedora.indexdata.views` instead of the simplejson that used to be included with Django
- Support for Fedora 3.8.
- Update :meth:`eulfedora.views.raw_datastream` to handle old Fedora datastreams with invalid content size.

.. note::

   Differentiating Fedora error messages in some versions of Fedora (somewhere after 3.4.x, applicable to at least 3.7 and 3.8, possibly earlier versions) requires that Fedora be configured to include the error message in the response, as described at https://groups.google.com/forum/#!topic/fedora-tech/PAv1LYWPW0k

- API methods have been overhauled to use python-requests and requests-toolbelt

.. note::

   API methods that previously returned a tuple of response content and the url now simply return the response object, which provides access to both content and url (among other information). Server and DigitalObject classes should behave as before, but API methods are not backward-compatible.

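
For calling code, the change amounts to reading attributes from one object instead of unpacking a tuple. The sketch below uses a stand-in object rather than a real API call, so it runs without a Fedora server; the URL and content are made up, and the attribute names ``content``, ``url`` and ``status_code`` are those of a python-requests response.

```python
from types import SimpleNamespace

# Stand-in for the response object returned by an API method
# (hypothetical values; no Fedora server is contacted).
response = SimpleNamespace(
    status_code=200,
    content=b"<foxml/>",
    url="http://localhost:8080/fedora/objects/demo:1/objectXML",
)

# Old style, no longer supported:  content, url = api_call(...)
# New style: read what you need from the single response object.
content, url = response.content, response.url
succeeded = response.status_code == 200
```
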

.. warning::

   The API upload method filesize is limited by the system maxint (2GB on 32-bit OSes) due to a limitation with the Python built-in ``len()`` function (possibly dependent on your Python implementation). If you need large file upload support on a 32-bit OS, you should use an earlier version of eulfedora.

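
The ``len()`` limitation can be reproduced without any upload. The sketch below assumes CPython 3, where ``sys.maxsize`` plays the role of the maxint mentioned above (``2**31 - 1`` on 32-bit builds); the class ``HugeContent`` and the helper ``reported_size`` are hypothetical names used only for this illustration.

```python
import sys


class HugeContent:
    """Hypothetical content wrapper reporting a size just beyond sys.maxsize."""

    def __len__(self):
        return sys.maxsize + 1


def reported_size(obj):
    """Return len(obj), or None when the size does not fit a native index."""
    try:
        return len(obj)
    except OverflowError:
        # len() refuses sizes larger than sys.maxsize.
        return None
```

Any object whose length exceeds ``sys.maxsize`` fails in this way, which is why the limit depends on the platform rather than on Fedora.
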
- New script upload-test.py for testing upload behavior on your platform; also provides an example of an upload callback method. (Found in the scripts directory, but not installed with the module.)
- bugfix: relationship methods on :class:`~eulfedora.models.DigitalObject` now recognize unicode as well as string pids as resources.
- Related objects accessed via :class:`~eulfedora.models.Relation` are now cached for efficiency, similar to the way datastreams are cached on :class:`~eulfedora.models.DigitalObject`.
- Methods :meth:`~eulfedora.models.DigitalObject.purge_relationship` and :meth:`~eulfedora.models.DigitalObject.modify_relationship` added to :class:`~eulfedora.models.DigitalObject`. Contributed by Graham Hukill @ghukill.
- bugfix: correction in detailed output for validate-checksum script when all versions are checked and at least one checksum is invalid
- bugfix: support HTTP Range requests in :meth:`eulfedora.views.raw_datastream` only when explicitly enabled
- A repository administrator can configure a script to periodically check content checksums in order to identify integrity issues so that they can be dealt with.
- A repository administrator will receive an email notification if the system encounters bad or missing checksums so that they can then resolve any integrity issues.
- A repository admin can view fixity check results for individual objects in the premis data stream (for objects where premis exists) in order to view a more detailed result and the history.
- Support for basic HTTP Range requests in :meth:`eulfedora.views.raw_datastream` (e.g., to allow audio/video seek in HTML5 media players)
- It is now possible to add new datastreams using :meth:`eulfedora.models.DigitalObject.getDatastreamObject` (in contrast to predefined datastreams on a subclass of :class:`~eulfedora.models.DigitalObject`). Adding new datastreams is supported when ingesting a new object as well as when saving an existing object. This method can also be used to update existing datastreams that are not predefined on a DigitalObject subclass.
- Development requirements can now be installed as an optional requirement of the eulfedora package (``pip install "eulfedora[dev]"``).
- Unit tests have been updated to use :mod:`nose`
- Provides a nose plugin to set up and tear down for a test Fedora Commons repository instance for tests, as an alternative to the custom test runners.
- Bugfix: don't auto-create an XML datastream at ingest when the xml content is empty (i.e., content consists of bootstrapped :class:`xmlmap.XmlObject` only)
- Bugfix: handle Fedora restriction of ownerId field length to 64 characters. When setting :attr:`~eulfedora.models.DigitalObject.owner`, will now warn and truncate the value to allow the object to be saved.
- New command-line script ``fedora-checksums`` for datastream checksums validation and repair. See :doc:`scripts` for more details.
- :class:`~eulfedora.models.DigitalObject` now provides access to the Fedora built-in audit trail; see :attr:`~eulfedora.models.DigitalObject.audit_trail`. Also provides:

  - :meth:`eulfedora.views.raw_audit_trail`: Django view to serve out audit trail XML, comparable to :meth:`eulfedora.views.raw_datastream`.
  - :class:`~eulfedora.models.DigitalObject` attribute :attr:`~eulfedora.models.DigitalObject.audit_trail_users`: set of all usernames listed in the audit trail (i.e., any users who have modified the object)
  - :class:`~eulfedora.models.DigitalObject` attribute :attr:`~eulfedora.models.DigitalObject.ingest_user`: username responsible for ingesting the object into Fedora if ingest is listed in the audit trail

- :class:`~eulfedora.models.Relation` now supports recursive relations via the option ``type="self"``.
- API wrappers have been updated to take advantage of all methods available in the REST API as of Fedora 3.4 which were unavailable in 3.2. This removes the need for any SOAP-based APIs and the dependency on :mod:`soaplib`.
- Minor API / unit test updates to support Fedora 3.5 in addition to 3.4.x.
- Bugfix: Default checksum type for :class:`~eulfedora.models.DatastreamObject` was previously ignored when creating a new datastream from scratch (e.g., when ingesting a new object). In certain versions of Fedora, this could result in datastreams with missing checksums (checksum type of 'DISABLED', checksum value of 'none').
- Exposed RIsearch ``count`` return option via :meth:`eulfedora.api.ResourceIndex.count_statements`
- :class:`~eulfedora.models.DatastreamObject` now supports setting datastream content by URI through the new :attr:`~eulfedora.models.DatastreamObject.ds_location` attribute (this is in addition to the previously-available :attr:`~eulfedora.models.DatastreamObject.content` attribute).
Previously, several of the REST API calls in :class:`eulfedora.api.REST_API` suppressed errors and only returned True or False for success or failure; this made it difficult to determine what went wrong when an API call failed. This version of :mod:`eulfedora` revises that logic so that all methods in :class:`eulfedora.api.REST_API` will raise exceptions when an exception-worthy error occurs (e.g., permission denied or object not found; anything that returns a 40x or 500 HTTP error response from Fedora). The affected REST methods are:
- :meth:`~eulfedora.api.REST_API.addDatastream`
- :meth:`~eulfedora.api.REST_API.modifyDatastream`
- :meth:`~eulfedora.api.REST_API.purgeDatastream`
- :meth:`~eulfedora.api.REST_API.modifyObject`
- :meth:`~eulfedora.api.REST_API.purgeObject`
- :meth:`~eulfedora.api.REST_API.setDatastreamState`
- :meth:`~eulfedora.api.REST_API.setDatastreamVersionable`
New custom Exception :class:`eulfedora.util.ChecksumMismatch`, which is a subclass of :class:`eulfedora.util.RequestFailed`. This exception will be raised if :meth:`~eulfedora.api.REST_API.addDatastream` or :meth:`~eulfedora.api.REST_API.modifyDatastream` is called with a checksum value that Fedora determines to be invalid.

.. note::

   If :meth:`~eulfedora.api.REST_API.addDatastream` is called with a checksum value but no checksum type, current versions of Fedora ignore the checksum value entirely; in particular, an invalid checksum with no type does not result in a :class:`~eulfedora.util.ChecksumMismatch` exception being raised. You should see a warning if your code attempts to do this.

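
Because the checksum exception is a subclass of the general request failure, code that wants to treat the two cases differently has to catch the subclass first. The sketch below uses local stand-in classes with the same names so that it runs without eulfedora or a Fedora server; the stand-ins derive from ``Exception`` as an assumption, and ``describe_failure`` and ``failing_call`` are hypothetical helpers.

```python
class RequestFailed(Exception):
    """Local stand-in for eulfedora.util.RequestFailed."""


class ChecksumMismatch(RequestFailed):
    """Local stand-in for eulfedora.util.ChecksumMismatch."""


def describe_failure(api_call):
    """Run an API call and report how it ended."""
    try:
        api_call()
    except ChecksumMismatch:
        # Must come first: ChecksumMismatch is a subclass of RequestFailed.
        return "checksum mismatch"
    except RequestFailed:
        # Any other error response from the repository.
        return "request failed"
    return "ok"


def failing_call(exc):
    """Build a hypothetical API call that raises the given exception."""
    def call():
        raise exc
    return call
```

Reversing the two ``except`` clauses would hide the checksum case, since the broader handler would match it first.
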
Added read-only access to :class:`~eulfedora.models.DigitalObject` owners as a list; changed default :meth:`eulfedora.models.DigitalObject.index_data` to make owner field a list.
Modified default :meth:`eulfedora.models.DigitalObject.index_data` and sample Solr schema to include a new field (dsids) with a list of datastream IDs available on the indexed object.
- Addition of :mod:`eulfedora.indexdata` to act as a generic webservice that can be used for the creation and updating of indexes such as SOLR; intended to be used with :mod:`eulindexer`.
- Split out fedora-specific components from :mod:`eulcore`; now depends on :mod:`eulxml`.