CHANGES.txt

Version 0.3.13

    - fs-c parse allows now to specify multiple handler types in a single run.
    - New command fs-c gpimport to import FS-C trace data into the Greenplum database.
    - fs-c trace has new option --salt to add a salt to the chunk fingerprint calculation
    - fs-c trace has new option --privacy-mode to allow a directory-and privacy preserving filename handling.
      If --privacy-mode=flat-sha1 is used, the full filepath is fingerprinted using sha-1. If
      --privacy-mode=directory-sha1 is used, each filename of the full filepath is fingerprinted individually.
      This preserves the general directory structure, while still providing sufficient privacy in most situations.
    - Input files are now provided as parameters instead of -f options. 
      The -f options are still supported, but are deprecated. 
	- Add standard parse handler to extract file metadata details and statistics from a trace file
	- Multiple bug fixes
	  - Fix a Java 7 resource leak
      - Fix handling of named pipes (no only directories and regular files are processed at all)
      - Fix an output error if sample size is not large enough for Harnik's method
      - Single threaded mode is now really single threaded
       
Version 0.3.12

	- Implement Harnik's estimation method (-t harniks). 
	  It differs from the original approach by using reservoir sampling because "n" is not known before the first pass.
	  Provides an good (error usually < 1%) estimation of the deduplication ratio without the need to do 
	  a full (often Hadoop-based) analysis.
	- Updated pig scripts.
	  The Hadoop-based analysis is now faster and intermediate results are stored to avoid duplicate
	  calculations
	- fs-c/import is now multi-threaded for higher import throughput
	- Update to Protobuf 2.4.1 from 2.3.0, gives performance improvement from up to 20% during the parsing.

Version 0.3.10

	- Fix a file descriptor leak under Linux
	- Option to import a full file fingerprint during fs-c/import runs
	- Uses Java 7 directory walker (if available, Java 5/6 is still supported as target platform)

Version 0.3.8

    - Fix a bug in fs-c import

Version 0.3.7

    - Fix a bug that causes a trace file corruptions
    - Add command fs-c validate, which checks if a given trace file is valid

Version 0.3.6

    - Import xml.minidom in fs-c only if actually needed. Not all Python installations have xml support available
    - Use a fs-c directory in the installation package as most software has it.
    - Fix the github download
    
Version 0.3.5

    - Overall performance improvements
    - Improvement for very large directories (> 10.000 files)
    - Updated libraries and build system (now using Scala 2.9 and sbt)
    - Automatic FS-C cluster: Multiple machines cooperate crawling a (parallel) file system
    
Version 0.3.2
    
    - Fix that cdc8 is not used as default chunker
    - Add partial file trace data (to better handle large files)
    - Add the --progress-file parameter to which all fully processed files are written

Version 0.2.3

 - Add Visual Basic Script (VBS) for make the usage on windows easier

Version 0.2.0

 - Digest type (default: SHA-1) can be exchanged by every digest method supported by the Java Security
   framework
 - Digest length (default: 20 Byte for SHA-1) can be reduced to decrease the size of trace files
 - Switch from Actor-based Concurrency during Tracing to Executor-based Concurrency due to the better
   conjection control
 - Update to Google Protocol Buffer 2.1.0
 - Import to Hadoop DFS
 - Contrib: PIG Scripts to analyze large datasets using Hadoop MapReduce
 - Different trace format support: Currently legacy (old binary format), protobuf

Version 0.1.0

 - Initial Version