Parsers

CALDERA uses parsers to extract facts from command output. A common use case is to allow operations to take gathered information and feed it into future abilities and decisions - for example, a discovery ability that looks for sensitive files can output file paths, which will then be parsed into file path facts, and a subsequent ability can use those file paths to stage the sensitive files in a staging directory.

Parsers can also be used to create facts with relationships linked between them - this allows users to associate facts together, such as username and password facts.

Under the hood, parsers are python modules that get called when the agent sends command output to the CALDERA server and certain conditions are met:

  • If the corresponding ability has a specified parser associated with the command, the parser module will be loaded and used to parse out any facts from the output. This will occur even if the agent ran the command outside of an operation

  • If the agent ran the command as part of an operation, but the corresponding ability does not have any specified parsers associated with the command, CALDERA will check if the operation was configured to use default parsers. If so, any default parsers loaded within CALDERA will be used to parse out facts from the output. Otherwise, no parsing occurs.

  • If the agent ran the command outside of an operation, but the corresponding ability does not have any specified parsers associated with the command, CALDERA will use its default parsers to parse the output.

Non-default Parser python modules are typically stored in individual plugins, such as stockpile, in the plugin’s app/parsers/ directory. For instance, if you look in plugins/stockpile/app/parsers, you can see a variety of parsers that are provided out-of-the-box.

Default parsers are located in the core CALDERA repo, under app/learning. Two example modules are p_ip.py and p_path.py, which are used to parse IP addresses and file paths, respectively. Note that the default parsers have a different location due to their association with the learning service.

Linking Parsers to an Ability

To associate specific parsers to an ability command, use the parsers keyword in the yaml file within the executor section (see the below example).

    darwin:
      sh:
        command: |
          find /Users -name '*.#{file.sensitive.extension}' -type f -not -path '*/\.*' -size -500k 2>/dev/null | head -5
        parsers:
          plugins.stockpile.app.parsers.basic:
            - source: host.file.path
              edge: has_extension
              target: file.sensitive.extension

Note that the parsers value is a nested dictionary whose key is the Python module import path of the parser to reference; in this case, plugins.stockpile.app.parsers.basic for the Parser located in plugins/stockpile/app/parsers/basic.py. The value of this inner dict is a list of fact mappings that tell the Parser what facts and relationships to save based on the output. In this case, we only have one mapping in the list.

Each mapping consists of the following:

  • Source (required): A fact to create for any matches from the parser

  • Edge (optional): A relationship between the source and target. This should be a string.

  • Target (optional): A fact to create which the source connects to.

In the above example, the basic parser will take each line of output from the find command, save it as a host.file.path fact, and link it to the file.sensitive.extension fact used in the command with the has_extension edge. For instance, if the command was run using a file.sensitive.extension value of docx and the find command returned /path/to/mydoc.docx and /path/to/sensitive.docx, the parser would generate the following facts and relationships:

  • /path/to/mydoc.docx <- has_extension -> docx

  • /path/to/sensitive.docx <- has_extension -> docx

Note that only one parser can be linked to a command at a time, though a single parser can be used to generate multiple facts, as in our hypothetical example above. Also note that the parser only works for the associated command executor, so you can use different parsers for different executors and even different platforms.

The example below shows a more complicated parser - the katz parser in the stockpile plugin. This example has multiple fact mappings for a single parser, since we want to extract different types of information from the Mimikatz output - in particular, the password and password hash information.

  platforms:
    windows:
      psh:
        command: |
          Import-Module .\invoke-mimi.ps1;
          Invoke-Mimikatz -DumpCreds
        parsers:
          plugins.stockpile.app.parsers.katz:
          - source: domain.user.name
            edge: has_password
            target: domain.user.password
          - source: domain.user.name
            edge: has_hash
            target: domain.user.ntlm
          - source: domain.user.name
            edge: has_hash
            target: domain.user.sha1
        payloads:
        - invoke-mimi.ps1

This time, we are using plugins.stockpile.app.parsers.katz, which will take the output from the Invoke-Mimikatz -DumpCreds command and apply the 3 specified mappings when parsing the output. Note that in all 3 mappings, the source fact is the same: domain.user.name, but the relationship edges and target facts are all different, based on what kind of information we want to save. The resulting facts, assuming the command was successful and provided the desired information, will include the username, password, NTLM hash, and SHA1 hash, all linked together with the appropriate relationship edges.