Initiating an Import
The Bulk Import Tool is implemented as a background job inside the Alfresco server application, and as such the native interface to the tool is a Java API (see Developers for more information on using that API). This native interface is then wrapped in a variety of higher-level, optional "convenience" mechanisms, including the following repository-tier Web Scripts. These Web Scripts allow imports to be initiated and monitored either manually or via scripting (e.g. using curl, wget or similar tools).
The UI Web Script is an HTTP GET Web Script located at service path /bulk/import that presents a simple HTML form to the user, containing (at least) the following form fields:
- Source - the type of source to use for the import. The Bulk Import Tool ships with only a single source (called "Default"), but your installation may have others developed by third parties.
- Target space - the target space in the repository to import the content into. This field has an autocomplete function: as you start typing, it offers suggestions of spaces in the repository with matching names.
- Replace - flag indicating whether to replace existing content in the repository, should it conflict with content being read from the source content set.
- Dry run - flag indicating that the import should be a "dry run". In this mode the tool logs everything it's doing to alfresco.log, but doesn't actually do any of it.
Depending on the source you're using, there may be additional Source Settings as well. The "Default" source (shipped with the Bulk Import Tool and selected by default) adds one extra field to the form:
- Source directory - the directory on the server from which to read the source content. This directory must be readable by the Alfresco server process.
The initiate Web Script is an HTTP POST Web Script located at service path /bulk/import/initiate that accepts the above form fields and initiates a bulk import with them. These form fields are:
- sourceBeanId - (optional) the Spring bean id of the source to use. Default is "bit.fs.source" (the Spring bean id of the "Default" import source).
- targetPath - (optional, though either this or targetNodeRef is required) the path of the target space, relative to Company Home, e.g. /Sites/mysite/documentLibrary/importFolder
- targetNodeRef - (optional, though either this or targetPath is required) the NodeRef of the target space.
- replaceExisting - (optional) flag (boolean) indicating whether to replace existing files or not (default is "false").
- dryRun - (optional) flag (boolean) indicating whether this is a dry run or not (default is "false").
Depending on the source you're using, there may be additional Source Settings as well. The "Default" source (shipped with the Bulk Import Tool and selected by default) adds one extra form field to the POST body:
- sourceDirectory - (mandatory) the directory on the server from which to read the source content.
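The form fields above map directly onto a scripted POST. The following is a minimal Python sketch rather than a definitive client: it only builds the request, and it assumes (this page does not confirm either point) that the Web Script accepts a URL-encoded form body and that HTTP Basic authentication is acceptable. The function name build_initiate_request, the base URL and the /data/to-import directory are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_initiate_request(base_url, source_directory, target_path=None,
                           target_node_ref=None, replace_existing=False,
                           dry_run=False, source_bean_id=None,
                           username=None, password=None):
    """Build (but do not send) a POST request for /bulk/import/initiate."""
    # At least one of targetPath / targetNodeRef must identify the target space.
    if target_path is None and target_node_ref is None:
        raise ValueError("supply target_path or target_node_ref")

    fields = {
        "sourceDirectory": source_directory,  # mandatory for the "Default" source
        "replaceExisting": "true" if replace_existing else "false",
        "dryRun": "true" if dry_run else "false",
    }
    if target_path is not None:
        fields["targetPath"] = target_path
    if target_node_ref is not None:
        fields["targetNodeRef"] = target_node_ref
    if source_bean_id is not None:
        # Omitted by default so the server falls back to "bit.fs.source".
        fields["sourceBeanId"] = source_bean_id

    # Assumption: the Web Script accepts a URL-encoded form body.
    body = urllib.parse.urlencode(fields).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/bulk/import/initiate",
        data=body,
        method="POST",
    )
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    if username is not None:
        # Assumption: the server accepts HTTP Basic authentication.
        token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
        request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


# Hypothetical values: a dry run against a local default installation.
request = build_initiate_request(
    "http://localhost:8080/alfresco/s",
    source_directory="/data/to-import",
    target_path="/Sites/mysite/documentLibrary/importFolder",
    dry_run=True,
)
# Passing the request to urllib.request.urlopen() would send it to the server.
```

Starting with dryRun set to "true" lets you check the log output before committing to a real import.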
The status Web Script is an HTTP GET Web Script located at service path /bulk/import/status that returns status information about the current (or previous, if an import isn't active) bulk import. This Web Script supports two response formats:
- HTML - useful for human monitoring of imports (default)
- JSON - useful for scripted monitoring of imports
Formats are chosen in the usual way for Web Scripts, i.e. by appending the desired extension to the URL (e.g. http://alfrescohost:alfrescoport/alfresco/s/bulk/import/status.json).
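For scripted monitoring, the format extension can be appended programmatically. The sketch below is illustrative only: status_url and fetch_status are hypothetical helper names, and fetch_status omits authentication, which a real server will most likely require.

```python
import json
import urllib.request


def status_url(base_url, fmt=None):
    """Return the status Web Script URL, optionally with a format extension."""
    url = base_url.rstrip("/") + "/bulk/import/status"
    return url + "." + fmt if fmt else url


def fetch_status(base_url, timeout=10):
    """Fetch the JSON status document and parse it into Python objects."""
    # Assumption: no authentication is needed; add credentials for a real server.
    with urllib.request.urlopen(status_url(base_url, "json"), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling status_url with no format yields the default HTML view, while fetch_status always requests the JSON variant.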
The Bulk Import Tool is capable of tracking a variable number of statistics, both for the repository (the so-called "target counters") and the source system that's providing content (the so-called "source counters"). It is important to understand that the number and names of these counters can differ from version to version of the tool, and (most especially) between different types of import source. As a result, any scripting that uses these statistics needs to be designed in such a way that the number and names of both the target and source counters are not hardcoded (or, if they are, that appropriate checks are made to ensure they still exist).
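One way to follow this advice in a script is to treat the counters as an open-ended mapping. The sketch below assumes the counters have already been extracted from the status response into a Python dict of name to value; the layout of the status JSON is not described on this page, and the counter names in the sample are hypothetical.

```python
def counter_value(counters, name, default=None):
    """Look up a counter by name without assuming that it exists."""
    return counters.get(name, default)


def summarize(counters):
    """Render every counter that is present, whatever its name."""
    return [f"{name}: {value}" for name, value in sorted(counters.items())]


# Hypothetical counters; real names depend on the tool version and the source.
sample_target_counters = {"Nodes created": 120, "Bytes written": 4096}

# Iterating reports whatever counters exist, with nothing hardcoded.
for line in summarize(sample_target_counters):
    print(line)

# A named lookup falls back to a default when the counter is absent.
print(counter_value(sample_target_counters, "Nodes deleted", default=0))
```

Iterating over the mapping keeps a report working when counters are added or renamed, while the defaulted lookup covers the case where a script does depend on a specific name.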
The service paths shown above are not the full URLs used to access these Web Scripts. As with any Alfresco Web Script, you need to prefix them with the protocol (HTTP or HTTPS), host, port, Alfresco webapp context and Web Script servlet path (/s in the examples on this page) before you'll have a URL you can actually use.
For example if you have the tool installed into a default Alfresco instance running on your local machine, the following URL will take you to the "UI" Web Script:
http://localhost:8080/alfresco/s/bulk/import
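The composition of that URL can be sketched in Python as follows. The function name web_script_url and its parameter names are hypothetical; the defaults simply mirror the local example above, including the /s segment that precedes the service path.

```python
def web_script_url(service_path, protocol="http", host="localhost", port=8080,
                   context="alfresco", servlet_prefix="s"):
    """Compose a full Web Script URL from its parts."""
    # Normalize slashes so callers may pass "/bulk/import" or "bulk/import".
    parts = [context.strip("/"), servlet_prefix.strip("/"), service_path.strip("/")]
    return f"{protocol}://{host}:{port}/" + "/".join(parts)


print(web_script_url("/bulk/import"))
print(web_script_url("/bulk/import/status.json", protocol="https",
                     host="alfrescohost", port=8443))
```

The second call uses a hypothetical HTTPS host and port to show the same service paths against a non-default installation.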
Back to usage.
Copyright © Peter Monks. Licensed under the Apache 2.0 License.