Parsers
Caldera uses parsers to extract facts from command output. A common use case is to let operations feed gathered information into future abilities and decisions - for example, a discovery ability that looks for sensitive files can output file paths, which are then parsed into file path facts, and a subsequent ability can use those facts to stage the sensitive files in a staging directory.
Parsers can also be used to create facts linked together by relationships - this allows users to associate related facts, such as username and password facts.
Under the hood, parsers are Python modules that get called when the agent sends command output to the Caldera server and certain conditions are met:
If the corresponding ability has a specified parser associated with the command, the parser module will be loaded and used to parse out any facts from the output. This will occur even if the agent ran the command outside of an operation.
If the agent ran the command as part of an operation, but the corresponding ability does not have any specified parsers associated with the command, Caldera will check if the operation was configured to use default parsers. If so, any default parsers loaded within Caldera will be used to parse out facts from the output. Otherwise, no parsing occurs.
If the agent ran the command outside of an operation, but the corresponding ability does not have any specified parsers associated with the command, Caldera will use its default parsers to parse the output.
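To summarize the conditions above, here is a rough, illustrative sketch of the selection logic in Python - it is not Caldera's actual implementation, and the function and parameter names are made up for this example.

def select_parsers(ability_parsers, in_operation, operation_uses_default_parsers, default_parsers):
    # Illustrative only: mirrors the three conditions listed above.
    if ability_parsers:
        # Ability-specific parsers always apply, even outside of an operation.
        return ability_parsers
    if in_operation:
        # No ability parsers: fall back to default parsers only if the
        # operation was configured to use them.
        return default_parsers if operation_uses_default_parsers else []
    # Outside of an operation with no ability parsers: use the default parsers.
    return default_parsers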
Non-default parser Python modules are typically stored in individual plugins, such as stockpile, in the plugin’s app/parsers/ directory. For instance, if you look in plugins/stockpile/app/parsers, you can see a variety of parsers that are provided out-of-the-box.
Default parsers are located in the core Caldera repo, under app/learning. Two example modules are p_ip.py and p_path.py, which are used to parse IP addresses and file paths, respectively. Note that the default parsers have a different location due to their association with the learning service.
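For orientation, a custom parser module generally has the shape sketched below. This is a simplified, hypothetical example in the style of the stockpile parsers; the exact imports and base-class helpers (such as self.line and self.mappers) may differ between Caldera versions, so compare against an existing module like plugins/stockpile/app/parsers/basic.py before relying on it.

# Hypothetical module, e.g. plugins/myplugin/app/parsers/lines.py (illustrative only).
from app.objects.secondclass.c_fact import Fact
from app.objects.secondclass.c_relationship import Relationship
from app.utility.base_parser import BaseParser


class Parser(BaseParser):

    def parse(self, blob):
        relationships = []
        # self.line() splits the raw command output into lines, and self.mappers
        # holds the source/edge/target mappings declared in the ability YAML.
        for match in self.line(blob):
            for mp in self.mappers:
                # A real parser would usually resolve the target value as well,
                # e.g. from the facts that were substituted into the command.
                relationships.append(
                    Relationship(source=Fact(mp.source, match),
                                 edge=mp.edge,
                                 target=Fact(mp.target, None))
                )
        return relationships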
Linking Parsers to an Ability
To associate specific parsers with an ability command, use the parsers keyword in the YAML file within the executor section (see the example below).
darwin:
  sh:
    command: |
      find /Users -name '*.#{file.sensitive.extension}' -type f -not -path '*/\.*' -size -500k 2>/dev/null | head -5
    parsers:
      plugins.stockpile.app.parsers.basic:
        - source: host.file.path
          edge: has_extension
          target: file.sensitive.extension
Note that the parsers value is a nested dictionary whose key is the Python module import path of the parser to reference; in this case, plugins.stockpile.app.parsers.basic for the Parser located in plugins/stockpile/app/parsers/basic.py.
The value mapped to this key is a list of fact mappings that tell the parser what facts and relationships to save based on the output. In this case, we only have one mapping in the list.
Each mapping consists of the following:
Source (required): A fact to create for any matches from the parser.
Edge (optional): A relationship between the source and target. This should be a string.
Target (optional): A fact to create, to which the source connects.
In the above example, the basic parser will take each line of output from the find command, save it as a host.file.path fact, and link it to the file.sensitive.extension fact used in the command with the has_extension edge. For instance, if the command was run using a file.sensitive.extension value of docx and the find command returned /path/to/mydoc.docx and /path/to/sensitive.docx, the parser would generate the following facts and relationships:
/path/to/mydoc.docx <- has_extension -> docx
/path/to/sensitive.docx <- has_extension -> docx
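To make that concrete, the short, self-contained sketch below mimics what a basic-style parser does with this mapping. The Fact and Relationship classes here are stand-ins defined only for the example, not Caldera's own objects.

from dataclasses import dataclass

@dataclass
class Fact:          # stand-in for the example, not Caldera's Fact object
    trait: str
    value: str

@dataclass
class Relationship:  # stand-in for the example, not Caldera's Relationship object
    source: Fact
    edge: str
    target: Fact

# The mapping from the ability YAML above, plus the fact value used in the command.
mapping = {'source': 'host.file.path', 'edge': 'has_extension', 'target': 'file.sensitive.extension'}
used_extension = 'docx'

output = '/path/to/mydoc.docx\n/path/to/sensitive.docx'

relationships = [
    Relationship(source=Fact(mapping['source'], line),
                 edge=mapping['edge'],
                 target=Fact(mapping['target'], used_extension))
    for line in output.splitlines()
]

for r in relationships:
    print(f'{r.source.value} <- {r.edge} -> {r.target.value}')
# /path/to/mydoc.docx <- has_extension -> docx
# /path/to/sensitive.docx <- has_extension -> docx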
Note that only one parser can be linked to a command at a time, though a single parser can be used to generate multiple facts, as in our hypothetical example above. Also note that the parser only works for the associated command executor, so you can use different parsers for different executors and even different platforms.
The example below shows a more complicated parser - the katz parser in the stockpile plugin.
This example has multiple fact mappings for a single parser, since we want to extract different
types of information from the Mimikatz output - in particular, the password and password hash
information.
platforms:
  windows:
    psh:
      command: |
        Import-Module .\invoke-mimi.ps1;
        Invoke-Mimikatz -DumpCreds
      parsers:
        plugins.stockpile.app.parsers.katz:
          - source: domain.user.name
            edge: has_password
            target: domain.user.password
          - source: domain.user.name
            edge: has_hash
            target: domain.user.ntlm
          - source: domain.user.name
            edge: has_hash
            target: domain.user.sha1
      payloads:
        - invoke-mimi.ps1
This time, we are using plugins.stockpile.app.parsers.katz, which will take the output from the Invoke-Mimikatz -DumpCreds command and apply the three specified mappings when parsing the output. Note that in all three mappings the source fact is the same, domain.user.name, but the relationship edges and target facts are all different, based on what kind of information we want to save. The resulting facts, assuming the command was successful and provided the desired information, will include the username, password, NTLM hash, and SHA1 hash, all linked together with the appropriate relationship edges.
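As a rough picture of the outcome, a single credential parsed from the output would produce three relationships that all share the same domain.user.name source fact. The tuples and placeholder values below are purely illustrative and are not Caldera's internal representation.

# (source trait, source value, edge, target trait, target value) - placeholder values.
username = 'DOMAIN\\alice'
relationships = [
    ('domain.user.name', username, 'has_password', 'domain.user.password', '<plaintext password>'),
    ('domain.user.name', username, 'has_hash',     'domain.user.ntlm',     '<ntlm hash>'),
    ('domain.user.name', username, 'has_hash',     'domain.user.sha1',     '<sha1 hash>'),
]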