
CybOX 3.0: HashType Refactoring

Ivan Kirillov edited this page Nov 5, 2015 · 26 revisions

Issue Description

There are several issues with the current structure for characterizing cryptographic hashes in CybOX, the HashType:

  • The structure is overly verbose and heavyweight for capturing and parsing ubiquitous hash types such as MD5, SHA1, and SHA256, which are arguably by far the most prevalent in cyber threat characterization today. Currently, users must first select the correct value from the default HashNameVocab vocabulary, populate the Type field with that value, set the field's xsi:type to point to the vocabulary, and finally populate the Simple_Hash_Value field with the actual hash value:
  <Type xsi:type="HashNameVocab-1.0">MD5</Type>
  <Simple_Hash_Value>3773a88f65a5e780c8dff9cdc3a056f3</Simple_Hash_Value>
  • The structure has separate fields for capturing simple and fuzzy hash values (Simple_Hash_Value and Fuzzy_Hash_Value, respectively), both fundamentally string values. This seems an unnecessary distinction, as simply specifying the type of a hash (e.g., SSDeep) provides the necessary context for identifying it as simple or fuzzy.

  • Patterning against the structure is semantically confusing, since a pattern must be written against both the Type and *_Hash_Value fields.

Refactoring

To simplify the capture of hash values, we propose refactoring the existing HashType to contain only the following fields.

Field | Type | Description
type | CryptographicHashEnum | Specifies the name of a standard cryptographic hashing algorithm, as captured in the CryptographicHashEnum, used to generate the value captured in the hash_value field. This field OR the custom_type field must be used, but not both.
custom_type | string | Specifies the name of a custom cryptographic hashing algorithm, or one whose name is not found in the CryptographicHashEnum. This field OR the type field must be used, but not both.
hash_value | string | Specifies a single cryptographic hash value, of the type defined in the type or custom_type fields.
fuzzy_hash_structure | FuzzyHashStructureType | Enables the characterization of the key internal components of a fuzzy hash calculation with a given block size.

Accordingly, the new CryptographicHashEnum will take all of the values from the existing HashNameVocab-1.0, thus permitting the deprecation of this vocabulary:

Value Description
md5 string
md6 string
sha1 string
sha256 string
sha224 string
sha384 string
sha512 string
ssdeep string

Note that the structure of the existing HashListType will be unchanged, though it will make use of the updated HashType.

JSON Schema (notional)
{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "definitions":{ "CryptographicHashEnum":{"type":"string", "enum": ["md5", "md6", "sha1","sha256", "sha224", "sha384", "sha512", "ssdeep"]}},
  "type":"object",
  "properties":{"type":{"$ref":"#/definitions/CryptographicHashEnum"},
                "custom_type":{"type":"string"},
                "hash_value":{"type":"string"}},
  "required":["hash_value"],
  "oneOf":[{"required":["type"]},
           {"required":["custom_type"]}]
}
Example
{
  "file" : {"hashes" : [{"type":"md5",
                         "hash_value":"3773a88f65a5e780c8dff9cdc3a056f3"},
                        {"custom_type" : "superhash",        
                         "hash_value":"f49125dac3:352bb35ffrca2:a123dc4599245"}]
           }
}
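
The constraints expressed by the notional schema (hash_value is required, and exactly one of type or custom_type must be present) can also be checked by hand. The following is a minimal sketch in Python using only the standard library. The function name validate_hash and the constant CRYPTOGRAPHIC_HASH_ENUM are hypothetical helpers introduced for illustration; they are not part of CybOX or any CybOX tooling.

```python
# Hypothetical helper mirroring the notional schema above; not a CybOX API.
CRYPTOGRAPHIC_HASH_ENUM = ("md5", "md6", "sha1", "sha256",
                           "sha224", "sha384", "sha512", "ssdeep")


def validate_hash(entry):
    """Return a list of problems found in one hash entry (empty if valid)."""
    problems = []
    # hash_value is the only unconditionally required field.
    if not isinstance(entry.get("hash_value"), str):
        problems.append("hash_value is required and must be a string")
    has_type = "type" in entry
    has_custom = "custom_type" in entry
    # Mirrors the schema's oneOf: both present or both absent is an error.
    if has_type == has_custom:
        problems.append("exactly one of type or custom_type must be present")
    if has_type and entry["type"] not in CRYPTOGRAPHIC_HASH_ENUM:
        problems.append("type must be a CryptographicHashEnum value")
    if has_custom and not isinstance(entry["custom_type"], str):
        problems.append("custom_type must be a string")
    return problems


example = {
    "file": {"hashes": [
        {"type": "md5",
         "hash_value": "3773a88f65a5e780c8dff9cdc3a056f3"},
        {"custom_type": "superhash",
         "hash_value": "f49125dac3:352bb35ffrca2:a123dc4599245"},
    ]}
}

# Both entries from the example above yield an empty problem list.
for entry in example["file"]["hashes"]:
    print(validate_hash(entry))
```

An entry that sets both type and custom_type, or neither, is reported as invalid, which is the behavior the oneOf clause is meant to enforce.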

Impact

The biggest direct impact is that existing users of the Objects that make use of the HashType or HashListType will have to update their code and/or tooling to take this approach into account. The Objects themselves will not need to be directly updated on account of this change, as they will still use the HashListType and/or HashType from CybOX Common.
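
As a rough illustration of what such a tooling update might involve, the sketch below converts a legacy XML hash into the proposed form. It rests on several assumptions: the wrapping Hash element in the sample is illustrative, the function name convert_legacy_hash is hypothetical, and lowercasing the legacy vocabulary value (e.g., MD5 to md5) is assumed to be the intended mapping onto the CryptographicHashEnum. Names that do not match the enum are carried over as custom_type.

```python
import xml.etree.ElementTree as ET

# Values of the proposed CryptographicHashEnum.
STANDARD_NAMES = ("md5", "md6", "sha1", "sha256",
                  "sha224", "sha384", "sha512", "ssdeep")

# Illustrative legacy input; the <Hash> wrapper element is an assumption.
LEGACY = """<Hash xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Type xsi:type="HashNameVocab-1.0">MD5</Type>
  <Simple_Hash_Value>3773a88f65a5e780c8dff9cdc3a056f3</Simple_Hash_Value>
</Hash>"""


def convert_legacy_hash(hash_element):
    """Convert a legacy hash element into the proposed dict form (a sketch)."""
    name = value = None
    for child in hash_element:
        tag = child.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        text = (child.text or "").strip()
        if tag == "Type":
            name = text
        elif tag in ("Simple_Hash_Value", "Fuzzy_Hash_Value"):
            # Simple and fuzzy values collapse into the single hash_value field.
            value = text
    if name is None or value is None:
        raise ValueError("legacy hash needs a Type and a *_Hash_Value field")
    if name.lower() in STANDARD_NAMES:
        return {"type": name.lower(), "hash_value": value}
    return {"custom_type": name, "hash_value": value}


print(convert_legacy_hash(ET.fromstring(LEGACY)))
```

The conversion drops the xsi:type attribute entirely, since the proposed structure no longer points to a separate vocabulary.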

Object List

Object | Field
Artifact | Hashes
File | Hashes
Memory | Hashes
PDF File | Hashes
PDF File | Hashes
PDF File | Hashes
Win Executable File | Hashes
Win Executable File | Hashes
Win Executable File | Header_Hashes
Win Executable File | Data_Hashes
Win Executable File | Hashes
Win Executable File | Hashes
Win Executable File | Hashes
Win Service | Service_DLL_Hashes
Win Task | Exec_Program_Hashes