This repository has been archived by the owner on Sep 25, 2022. It is now read-only.

Troubleshooting

Sean edited this page Jan 24, 2020 · 20 revisions

Embedded Fork

Before reporting issues with the tool or posting to the mailing list, please make sure you've downloaded, installed, and are using one of the builds provided here - this is the original, and only actively maintained, edition of the tool.

The fork embedded and shipped with Alfresco is ancient, buggy, and slow, and its use is STRONGLY discouraged!

To determine whether you're using the embedded fork or the original (this) edition of the tool, look at the URL of the page you use to initiate an import. If you see a path ending with bulkfsimport (for example http://localhost:8080/alfresco/s/bulkfsimport) you're using the embedded fork, and should cease immediately (did I mention it's ancient, buggy, and slow?).

The path of the original / correct edition ends with bulk/import (for example http://localhost:8080/alfresco/s/bulk/import).
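The URL check described above can be sketched as a small shell helper. This is only an illustration: `bulk_import_edition` is a hypothetical name, not something shipped with the tool, and it simply matches the two path endings mentioned in the text.

```shell
# Classify an import page URL by the path endings described above.
# bulk_import_edition is a hypothetical helper name, not part of the tool.
bulk_import_edition() {
    url="${1%%[?]*}"   # drop any query string
    url="${url%/}"     # drop a single trailing slash, if present
    case "$url" in
        */bulkfsimport) echo "embedded" ;;   # the fork shipped with Alfresco
        */bulk/import)  echo "original" ;;   # this edition of the tool
        *)              echo "unknown" ;;
    esac
}
```

For example, `bulk_import_edition http://localhost:8080/alfresco/s/bulkfsimport` prints `embedded`.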

Environment Validation

Before reporting issues with the tool, it is highly recommended to run the Environment Validation Tool (EVT) and resolve all of the issues it identifies. Alfresco will not function well if environment validation doesn't pass, and bulk imports are a particularly heavyweight operation for the repository.

One environment-related issue that has been observed is bulk import performance trailing off as the import proceeds (often ending up at a fraction of the initial import rate), due to insufficient memory allocated to the Alfresco JVM. Alfresco includes a number of auto-sized caches, and when Alfresco's heap is small these caches also end up small. One explanation for the observed drop in performance is that as the amount of imported content grows, the effectiveness of these caches tails off, and if they're small to begin with there's a double impact (small cache + large dataset = cache thrashing). The recommended solution is to ensure that the Alfresco JVM has as much heap allocated to it as possible - at least 2GB (as suggested by the EVT), but ideally 4GB or more.

Note that as of Alfresco v5.0, the default heap allocation (at least if Alfresco is installed using the official installer) is already 4GB.
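As a rough sketch of the heap recommendation, the fragment below sets a 4GB heap using the standard JVM flags `-Xms` (initial heap) and `-Xmx` (maximum heap). It assumes a Tomcat-based install where JVM options are passed through a `JAVA_OPTS` variable in a startup script such as `setenv.sh`; where this actually lives depends on how Alfresco was installed, so treat the location as an assumption.

```shell
# Hypothetical startup-script fragment (e.g. Tomcat's setenv.sh); the file
# location is an assumption and varies by installation method.
# -Xms / -Xmx are the standard JVM flags for initial and maximum heap size.
JAVA_OPTS="${JAVA_OPTS:-} -Xms4g -Xmx4g"
export JAVA_OPTS
```

Setting the initial and maximum heap to the same value is a common choice to avoid heap resizing at runtime, but it is not required.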

Enabling Debug/Trace Logging

Although by default the tool is fairly terse in its logging output, it does in fact produce a lot of detailed logging output at debug and trace levels, and this output can be helpful in troubleshooting issues with the tool.

To enable detailed logging:

  1. add the following lines to log4j.properties (or, if you're an Alfresco Enterprise customer, use JMX to add a new Log4J category if you'd prefer):

     log4j.logger.org.alfresco.extension.bulkimport=debug
     # or for the highest level of logging:
     #log4j.logger.org.alfresco.extension.bulkimport=trace

  2. restart Alfresco
  3. watch alfresco.log for additional logging output from the tool

All messages logged by the tool are prefixed with the text BULKIMPORT, making it easier to grep the log for relevant messages (e.g. tail -f alfresco.log | grep BULKIMPORT).

Note that enabling this level of logging output will slow down the Bulk Import Tool, and will result in the alfresco.log file consuming substantially more disk space as well. It should not be left at this setting for long!
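Because alfresco.log grows quickly at these levels, it can help to copy just the tool's messages into a separate file before sharing them. A minimal sketch, relying only on the BULKIMPORT prefix mentioned above (`bulkimport_messages` is a hypothetical helper name):

```shell
# Print only the Bulk Import Tool's messages from a log file.
# bulkimport_messages is a hypothetical helper name; it relies solely on the
# BULKIMPORT prefix described above.
bulkimport_messages() {
    grep -F 'BULKIMPORT' "$1"   # -F: match the prefix as a fixed string
}
```

For example, `bulkimport_messages alfresco.log > bulkimport-only.log` writes the filtered messages to a file of your choosing.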

Validating that Custom Content Models have been Deployed to the Repository

If you're seeing errors from the repository related to custom metadata properties, it's worth running the Data Dictionary Web Script to confirm that the custom model has been correctly deployed to the repository. This Web Script displays a simple HTML representation of the entire Data Dictionary that's registered with the running Alfresco server - if a custom content model isn't visible on that page it has not been deployed correctly.

Alfresco version 6.2 or newer

As of Alfresco 6.2, a Content Property Restriction Interceptor prevents the content property from being set through the NodeService. When the tool is blocked by this interceptor, the error looks like this:

The node's content can't be updated via NodeService#addProperties directly: 
    node: workspace://SpacesStore/a8e2f27d-b34c-483c-8dc6-b329b06be0ad
    property name: content
  org.alfresco.service.cmr.dictionary.InvalidTypeException: 00220051 The node's content can't be updated via NodeService#addProperties directly: 
    node: workspace://SpacesStore/a8e2f27d-b34c-483c-8dc6-b329b06be0ad
    property name: content
 	at org.alfresco.repo.node.ContentPropertyRestrictionInterceptor.invoke(ContentPropertyRestrictionInterceptor.java:147)
 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
 	at org.alfresco.repo.audit.DisableAuditableBehaviourInterceptor.invoke(DisableAuditableBehaviourInterceptor.java:120)

Please visit the Configuration page for more details on how to configure Alfresco to add the Bulk Import Tool to the "whitelist" of classes allowed to bypass this security feature.
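To check quickly whether an existing log contains this error, you can search for the interceptor class name that appears in the stack trace above. A minimal sketch (`has_content_restriction_error` is a hypothetical helper name):

```shell
# Succeed (exit status 0) if the given log file mentions the interceptor
# class shown in the stack trace above.
# has_content_restriction_error is a hypothetical helper name.
has_content_restriction_error() {
    grep -q -F 'ContentPropertyRestrictionInterceptor' "$1"
}
```

Used as `has_content_restriction_error alfresco.log && echo "whitelist configuration needed"`, this only reports that the class name appears in the log; it does not confirm the cause.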

Commonly Seen Symptoms

There are also some issues and mailing list discussions that describe symptoms that have been seen with v1.x of the tool, and (usually) their resolution. Some of these may be relevant to v2.x as well:

Feedback

If you're having trouble with the tool, or think you've found a bug, please post a message on the mailing list first.


Back to wiki home.