Troubleshooting
Before reporting issues with the tool or posting to the mailing list, please make sure you've downloaded, installed, and are using one of the builds provided here - this is the original, and only actively maintained, edition of the tool.
The fork embedded and shipped with Alfresco is ancient, buggy, and slow, and its use is STRONGLY discouraged!
To determine whether you're using the embedded fork or the original (this) edition of the tool, look at the URL of the page you use to initiate an import. If you see a path ending with bulkfsimport (for example http://localhost:8080/alfresco/s/bulkfsimport) you're using the embedded fork, and should cease immediately (did I mention it's ancient, buggy, and slow?).
The path of the original / correct edition ends with bulk/import (for example http://localhost:8080/alfresco/s/bulk/import).
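The URL check can also be scripted, for instance when auditing several servers. The following is a minimal sketch rather than part of the tool: the function name `identify_edition` and its return labels are made up for illustration, and it relies only on the two path suffixes described above.

```python
from urllib.parse import urlparse


def identify_edition(url: str) -> str:
    """Classify an import page URL by the path suffixes described above.

    The name and return labels are hypothetical, for illustration only.
    """
    # Ignore the query string and any trailing slash before comparing suffixes.
    path = urlparse(url).path.rstrip("/")
    if path.endswith("/bulkfsimport"):
        return "embedded fork"
    if path.endswith("/bulk/import"):
        return "original"
    return "unknown"


if __name__ == "__main__":
    for candidate in (
        "http://localhost:8080/alfresco/s/bulkfsimport",
        "http://localhost:8080/alfresco/s/bulk/import",
    ):
        print(candidate, "->", identify_edition(candidate))
```

Anything reported as "unknown" is simply a URL that matches neither suffix, so it says nothing about which edition is installed.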
Before reporting issues with the tool, it is highly recommended to run the Environment Validation Tool (EVT) and resolve all of the issues it identifies. Alfresco will not function well if environment validation doesn't pass, and bulk imports are a particularly heavyweight operation for the repository.
One environment-related issue that has been observed is bulk import performance trailing off as the import proceeds (often ending up at a fraction of the initial import rate), due to insufficient memory allocated to the Alfresco JVM. Alfresco includes a number of auto-sized caches, and when Alfresco's heap is small these caches also end up small. One explanation for the observed drop in performance is that as the amount of imported content grows, the effectiveness of these caches tails off, and if they're small to begin with there's a double impact (small cache + large dataset = cache thrashing). The recommended solution is to ensure that the Alfresco JVM has as much heap allocated to it as possible - at least 2GB (as suggested by the EVT), but ideally 4GB or more.
Note that as of Alfresco v5.0, the default heap allocation (at least if Alfresco is installed using the official installer) is already 4GB.
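To compare a server's configured heap against these figures, one option is to inspect the JVM arguments for the standard `-Xmx` flag. This is only a sketch: the helper names `parse_xmx` and `heap_advice` are hypothetical, where the JVM arguments are set depends on how Alfresco was installed, and the code assumes that when `-Xmx` appears more than once the last occurrence takes effect.

```python
import re

GIB = 1024 ** 3


def parse_xmx(jvm_args: str):
    """Return the maximum heap in bytes from the last -Xmx flag, or None if absent."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)", jvm_args)
    if not matches:
        return None
    # Assumption: if -Xmx is repeated, the last occurrence is the one in effect.
    number, unit = matches[-1]
    multiplier = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}[unit.lower()]
    return int(number) * multiplier


def heap_advice(jvm_args: str) -> str:
    """Compare the configured heap with the 2GB minimum and 4GB recommendation."""
    heap = parse_xmx(jvm_args)
    if heap is None:
        return "no -Xmx flag found"
    if heap < 2 * GIB:
        return "below the 2GB minimum"
    if heap < 4 * GIB:
        return "meets the minimum; 4GB or more is preferable"
    return "meets the 4GB recommendation"


if __name__ == "__main__":
    print(heap_advice("-Xms512m -Xmx1024m"))
```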
Although by default the tool is fairly terse in its logging output, it does in fact produce a lot of detailed logging output at debug and trace levels, and this output can be helpful in troubleshooting issues with the tool.
To enable detailed logging:
- add the following lines to log4j.properties (or, if you're an Alfresco Enterprise customer, use JMX to add a new Log4J category if you'd prefer):
log4j.logger.org.alfresco.extension.bulkimport=debug
# or for the highest level of logging:
#log4j.logger.org.alfresco.extension.bulkimport=trace
- restart Alfresco
- watch alfresco.log for additional logging output from the tool
All messages logged by the tool are prefixed with the text BULKIMPORT, making it easier to grep the log for relevant messages (e.g. tail -f alfresco.log | grep BULKIMPORT).
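Where grep isn't available, or the matching lines need further processing, the same filter is easy to reproduce in a script. The sketch below assumes only what is stated above (the tool's messages contain the text BULKIMPORT); the function name `bulk_import_lines` is hypothetical, and the log path is taken from the command line.

```python
import sys
from typing import Iterable, Iterator


def bulk_import_lines(lines: Iterable[str], marker: str = "BULKIMPORT") -> Iterator[str]:
    """Yield only the lines containing the marker, like `grep BULKIMPORT`."""
    for line in lines:
        if marker in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Usage: python filter_log.py /path/to/alfresco.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for message in bulk_import_lines(log):
            print(message)
```

Unlike tail -f, this reads the file once and exits, so it suits reviewing a log after an import has finished rather than watching one in progress.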
Note that enabling this level of logging output will slow down the Bulk Import Tool, and will result in the alfresco.log file consuming substantially more disk space as well. It should not be left at this setting for long!
If you're seeing errors from the repository related to custom metadata properties, it's worth running the Data Dictionary Web Script to confirm that the custom model has been correctly deployed to the repository. This Web Script displays a simple HTML representation of the entire Data Dictionary that's registered with the running Alfresco server - if a custom content model isn't visible on that page it has not been deployed correctly.
In Alfresco 6.2 and newer, a Content Property Restriction Interceptor prevents the Content property from being set through the NodeService. When this happens, an error like the following is logged:
The node's content can't be updated via NodeService#addProperties directly:
node: workspace://SpacesStore/a8e2f27d-b34c-483c-8dc6-b329b06be0ad
property name: content
org.alfresco.service.cmr.dictionary.InvalidTypeException: 00220051 The node's content can't be updated via NodeService#addProperties directly:
node: workspace://SpacesStore/a8e2f27d-b34c-483c-8dc6-b329b06be0ad
property name: content
at org.alfresco.repo.node.ContentPropertyRestrictionInterceptor.invoke(ContentPropertyRestrictionInterceptor.java:147)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.alfresco.repo.audit.DisableAuditableBehaviourInterceptor.invoke(DisableAuditableBehaviourInterceptor.java:120)
Please visit the Configuration page for more details on how to configure Alfresco to add the Bulk Import Tool to the "whitelist" of classes allowed to bypass this security feature.
There are also some issues and mailing list discussions that describe symptoms that have been seen with v1.x of the tool, and (usually) their resolution. Some of these may be relevant to v2.x as well:
- Google Code Issue #87 - on Linux, files with accented characters in their filenames won't load if the LC_ALL and LANG environment variables aren't set appropriately
- Google Code Issue #57 / Google Code Issue #88 - misunderstandings about how tags are specified in shadow metadata files
- Google Code Issue #97 - describes the behaviour when two files end up with the same name (via metadata)
- Question about metadata files - Java is very picky about XML properties files - they must have exactly the right DOCTYPE declaration to be read without error
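For the last item above, the DOCTYPE that Java's `java.util.Properties` XML format expects is `<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">`. A quick pre-flight check of metadata files can catch a missing or mistyped declaration before an import is started. The following is a rough sketch with a hypothetical function name, `has_properties_doctype`; it checks the declaration only and does not validate the rest of the file.

```python
import re

# DOCTYPE declaration used by the java.util.Properties XML format.
EXPECTED_DOCTYPE = '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">'


def has_properties_doctype(xml_text: str) -> bool:
    """Return True if the first DOCTYPE declaration matches the expected one."""
    match = re.search(r"<!DOCTYPE[^>]*>", xml_text)
    if match is None:
        return False
    # Collapse whitespace runs so a line break between tokens is tolerated.
    normalized = " ".join(match.group(0).split())
    return normalized == EXPECTED_DOCTYPE


if __name__ == "__main__":
    import sys

    # Usage: python check_doctype.py file1.xml file2.xml ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            verdict = "ok" if has_properties_doctype(handle.read()) else "bad DOCTYPE"
        print(path, verdict)
```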
If you're having trouble with the tool, or think you've found a bug, please post a message on the mailing list first.
Copyright © Peter Monks. Licensed under the Apache 2.0 License.