Custom Parser Syntax🔗

Important

The regular expression syntax supported by Taegis XDR Custom Parsers is the Golang variant.

Statements🔗

!SAMPLE=...🔗

A sample message. Everything to the right of the = is interpreted literally, all the way up to a newline. This field is optional, but strongly encouraged.

!SCHEMA=...🔗

This is the schema for this message type, for example scwx.nids, scwx.netflow, scwx.auth. If not specified, the schema from the parent or closest ancestor is used.

!CONFIRMWITH🔗

This is either PATTERN or EXPRESSION. This works in tandem with !CONFIRMSTRING to determine if a message matches this parser. If set to PATTERN, then CONFIRMSTRING is a regex pattern. If set to EXPRESSION, then CONFIRMSTRING is an expression that evaluates to True/False.

!CONFIRMSTRING=🔗

See !CONFIRMWITH.

!DISABLED=🔗

This disables the parser. The parser is completely removed from the runtime catalog. This is useful when you don’t yet know how to handle a message but want to capture minimal documentation of its existence.

!IMPORT=🔗

Import another parser into this parser at the current line. Variables are shared between the importing and imported parser. This allows repeating lines of parser code to be consolidated into one place.

!IMPORTONLY🔗

Indicates that this parser is only for import (via !IMPORT). With extremely rare exceptions, all imported parsers should be !IMPORTONLY. This flag exempts the parser from many validation rules; for example, it does not need a parent parser or a CONFIRMWITH/CONFIRMSTRING pair.

!TRIMALLOFF🔗

This disables the default behavior of running TRIM_ALL() for all parsers. In some cases, TRIM_ALL() causes problems because it removes leading or trailing braces and brackets ({ and }, [ and ]), which leads to incorrect data for JSON fields.

!SANITIZEALLOFF🔗

This disables the default behavior of running SANITIZE_ALL() for all parsers.

Regular Expression Capturing Groups🔗

Capturing groups can be used to extract values from an unstructured log message or portion of a log message.

The syntax for a capture group is {captureVariable} = {sourceString}|({regex pattern}). Resulting matches are stored in a list and can be referenced using the captureVariable and the array value, e.g. captureVariable[1].

Capture groups can also be named using {captureVariable} = {sourceString}|(?P<group_name>{regex pattern}). Resulting matches can be referenced using the captureVariable and the group name, e.g. captureVariable["group_name"].

Examples🔗

# The pattern is read unescaped to the end of the line.

jsonMatch = originalData$|(\{.*})$

# To find patterns such as an IP address

# originalData = Dec 10 16:49:10 10.10.70.10 Dec 10 10:49:10 dddd-aaabbb-01 dddd-aaabbb-01

queryCapture = originalData$|\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s

# queryCapture = 10.10.70.10

# To capture a value after a field name

# message = Source Network Address: 10.17.2.186   Source Port:  54692

srcIp = message|Source Network Address:\s+(\S+)\s+

# srcIp = 10.17.2.186

# An example where the '%' character isn't escaped, as in Golang regex

# message = %AAA-6-RADIUS_IN_GLOBAL_LIST: radius_db.c:481 RADIUS ACCT

topLevel = message|\s+%(\w+)-(\w+-)?(\d+)-(\w+):\s+(.+)

part = topLevel[1]

# part = AAA

# Named capture group example

# message = Aug 21 12:12:20 10.194.72.254 1 1566000000.000000000 IDOFDEVICE flows src=192.168.0.5 dst=192.168.0.255 mac=DE:ED:BE:EF:AB:AB protocol=udp sport=49154 dport=1128 pattern: deny (src 192.168.0.0/24)

partA = message|^(?P<prefix>0|1)\s+(?P<timestamp>\d+\.\d+)\s+(?P<idofdevice>\S+)\s+(?P<logtype>ip_flow|events|airmarshal_events|flows|security_event|ids-alerts|urls|.*firewall)\s+(?P<remainder>.*)

timestamp = partA["timestamp"]
device = partA["idofdevice"]
logtype = partA["logtype"]

# timestamp = 1566000000.000000000

# device = IDOFDEVICE

# logtype = flows

Indexing and Member Access🔗

Once a value has been captured (from a regex, SPLIT(), JSON(), or any other source), use the [index] operator to read individual elements out of it. The operator is polymorphic: its behavior depends on the type of the value being indexed.

Left Side	Index Type	Returns	Out-of-bounds / Missing
`LIST`	Number	Element at the zero-based index	`NULL`
`OBJECT` (map)	String	Value at the given key	`NULL`
Regex match	Number	Capture group at the index. `[0]` is the full match, `[1]` is the first group, and so on.	Error (out of range)
Regex match	String	Named capture group from `(?P<name>...)`	Error (name not found)
`JSON` value	JSONPath string	Element(s) matching the JSONPath	`NULL`

Note

Numbers and strings are not interchangeable as indexes. Use a number for list elements and positional regex groups. Use a string for object keys, named regex groups, and JSONPath expressions.

Indexing a Regex Match🔗

Capture groups are created with the pipe syntax {var} = {source}|({regex}). The pipe runs the regex against source and produces a regex match value, which is then indexed by capture-group number.

# message = "May 17 12:00:00 host login src=192.168.1.10 user=alice"

match = message|src=(\S+)\s+user=(\S+)

fullMatch = match[0]
# fullMatch: "src=192.168.1.10 user=alice"

srcIp = match[1]
# srcIp: "192.168.1.10"

userName = match[2]
# userName: "alice"

Named Capture Groups🔗

Use Golang's (?P<name>...) syntax to give a group a name, then index by string.

parts = message|^(?P<level>\w+):\s+(?P<msg>.+)$
level = parts["level"]
msg   = parts["msg"]
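
The indexing rules for regex matches line up with how Go's standard regexp package reports submatches. The following is a minimal sketch in plain Go, not the parser runtime: the helper names positionalGroups and namedGroups and the sample "ERROR: disk full" message are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// positionalGroups returns the full match at index 0 followed by each
// capture group, or nil when the pattern does not match.
func positionalGroups(pattern, s string) []string {
	return regexp.MustCompile(pattern).FindStringSubmatch(s)
}

// namedGroups returns the named capture groups as a map, or nil when the
// pattern does not match.
func namedGroups(pattern, s string) map[string]string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // unnamed groups (and index 0) have an empty name
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	msg := "May 17 12:00:00 host login src=192.168.1.10 user=alice"
	m := positionalGroups(`src=(\S+)\s+user=(\S+)`, msg)
	fmt.Println(m[0]) // src=192.168.1.10 user=alice
	fmt.Println(m[1]) // 192.168.1.10
	fmt.Println(m[2]) // alice

	g := namedGroups(`^(?P<level>\w+):\s+(?P<msg>.+)$`, "ERROR: disk full")
	fmt.Println(g["level"], "|", g["msg"]) // ERROR | disk full
}
```

Testing a pattern this way before placing it in a parser can catch group-numbering mistakes early, since both use the Golang regex variant.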

Indexing a List🔗

Lists are produced by functions such as SPLIT() and are accessed by zero-based numeric index. Reading past the end of a list returns NULL rather than raising an error, so guard with IF if a missing slot would change behavior.

parts = SPLIT("a,b,c,d", ",")
first = parts[0]    # "a"
third = parts[2]    # "c"
miss  = parts[99]   # NULL

Indexing an Object🔗

Objects (key/value maps) returned by functions such as SPLIT_NAME_VALUES() are accessed by string key. Missing keys return NULL.

dict = SPLIT_NAME_VALUES("user=alice,role=admin", ",", "=", "\\")
user = dict["user"]    # "alice"
miss = dict["email"]   # NULL

Indexing JSON with JSONPath🔗

JSON() values are indexed with a JSONPath expression as a string.

data = "{\"user\":{\"name\":\"alice\",\"id\":42}}"
obj  = JSON(data)
name = obj["$.user.name"]    # "alice"
miss = obj["$.user.email"]   # NULL

Preserve Structure flag for wildcards🔗

When a JSONPath uses a wildcard (*) that traverses a list, the default behavior drops slots where the requested field is missing. Append |preserve_structure to the JSONPath to keep one entry per slot, with NULL filling positions where the field is absent. This is useful when the resulting list must align positionally with another array in the input.

data = "{\"items\":[{\"id\":1,\"tag\":\"a\"},{\"id\":2},{\"id\":3,\"tag\":\"c\"}]}"
json = JSON(data)

tagsCompact = json["$.items.*.tag"]
# tagsCompact: ["a", "c"]

tagsAligned = json["$.items.*.tag|preserve_structure"]
# tagsAligned: ["a", NULL, "c"]
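
To make the difference between the two modes concrete, here is a rough Go sketch of the same idea using encoding/json. It is not how the parser evaluates JSONPath; collectField and parseItems are hypothetical helpers, and the sketch only handles a flat "items" list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseItems decodes the "items" array from a JSON document.
func parseItems(data string) ([]map[string]any, error) {
	var doc struct {
		Items []map[string]any `json:"items"`
	}
	if err := json.Unmarshal([]byte(data), &doc); err != nil {
		return nil, err
	}
	return doc.Items, nil
}

// collectField reads field from every item. With preserve=false, items
// that lack the field are skipped. With preserve=true, a nil placeholder
// keeps each result aligned with its source slot.
func collectField(items []map[string]any, field string, preserve bool) []any {
	out := []any{}
	for _, item := range items {
		v, ok := item[field]
		if ok {
			out = append(out, v)
		} else if preserve {
			out = append(out, nil)
		}
	}
	return out
}

func main() {
	data := `{"items":[{"id":1,"tag":"a"},{"id":2},{"id":3,"tag":"c"}]}`
	items, err := parseItems(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(collectField(items, "tag", false)) // [a c]
	fmt.Println(collectField(items, "tag", true))  // [a <nil> c]
}
```

With the aligned form, index i of the tag list always refers to the same item as index i of the id list, which the compact form cannot guarantee.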

Functions🔗

SPLIT(data, delimiter, makeGreedy)🔗

Splits data into tokens separated by delimiter. The optional makeGreedy argument controls how consecutive delimiters are handled: if it is TRUE, the data 0,,2 with a delimiter of , evaluates to [0,2] instead of [0,'',2].

Example🔗

data = "aaa,bbb,ccc,eee"
values1 = SPLIT(data, ",", FALSE)
OUTPUT1$ = values1[3]

#OUTPUT1$: eee (String)

data = "aaa,bbb,ccc,,eee"
values1 = SPLIT(data, ",", FALSE)
values2 = SPLIT(data, ",", TRUE)
OUTPUT1$ = values1[3]
OUTPUT2$ = values2[3]

#OUTPUT1$: NULL (null)
#OUTPUT2$: eee (String)
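
The greedy behavior can be approximated in Go as follows. This is only a sketch: the function name split is hypothetical, and it assumes that greedy mode simply drops every empty token, including leading and trailing ones, which the text above does not confirm.

```go
package main

import (
	"fmt"
	"strings"
)

// split mimics the documented makeGreedy behavior: when greedy is true,
// empty tokens produced by adjacent delimiters are dropped.
// Assumption: leading and trailing empty tokens are dropped as well.
func split(data, delimiter string, greedy bool) []string {
	tokens := strings.Split(data, delimiter)
	if !greedy {
		return tokens
	}
	kept := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if t != "" {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	data := "aaa,bbb,ccc,,eee"
	fmt.Printf("%q\n", split(data, ",", false)) // ["aaa" "bbb" "ccc" "" "eee"]
	fmt.Printf("%q\n", split(data, ",", true))  // ["aaa" "bbb" "ccc" "eee"]
}
```

In the non-greedy case index 3 holds the empty token, which is consistent with the NULL shown for OUTPUT1$ above once default sanitizing is applied.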

SPLIT_NAME_VALUES(data, delimiter, separator, quoteChar)🔗

Splits data into a collection of name/value pairs, where delimiter separates the pairs and separator separates each name from its value. quoteChar indicates the character used for quoting the value.

Example🔗

data = "User: Unknown, InitiatorPackets: 2, ResponderPackets: 1, InitiatorBytes: 120, ResponderBytes: 66"
dict = SPLIT_NAME_VALUES(data, ",", ":", "\\")
OUTPUT$ = dict["InitiatorBytes"]

# OUTPUT$: 120 (String)

JSON(data)🔗

Converts data into a JSON object that can be accessed with square brackets containing a JSONPath expression. See https://goessner.net/articles/JsonPath/ and https://github.com/ohler55/ojg.

Example🔗

data= "{ \"store\": { \"book\": [ { \"category\": \"reference\", \"author\": \"Nigel Rees\", \"title\": \"Sayings of the Century\", \"price\": 8.95 }, { \"category\": \"fiction\", \"author\": \"Evelyn Waugh\", \"title\": \"Sword of Honour\", \"price\": 12.99 }, { \"category\": \"fiction\", \"author\": \"Herman Melville\", \"title\": \"Moby Dick\", \"isbn\": \"0-553-21311-3\", \"price\": 8.99 }, { \"category\": \"fiction\", \"author\": \"J.R. R. Tolkien\", \"title\": \"The Lord of the Rings\", \"isbn\": \"0-395-19395-8\", \"price\": 22.99 } ], \"bicycle\": { \"color\": \"red\", \"price\": 19.95 } } }"
json= JSON(data)
OUTPUT$ = json["$.store.book[*].author"]

# OUTPUT$: "Nigel Rees","Evelyn Waugh","Herman Melville","J.R. R. Tolkien" (String)

Example usage for JSON keys that contain dots:

data= "{ \"store\": { \"book\": [ { \"id.category\": \"reference\" } ] } }"
json= JSON(data)
OUTPUT$ = json["$.store.book[0][\"id.category\"]"]

# OUTPUT$: reference (String)

CEF(data)🔗

Parses data as a CEF-formatted message. The header fields can be accessed with an integer and the named fields can be accessed by name.

Example🔗

!SAMPLE=Nov 6 07:49:03 10.42.0.1 %helloWorld: CEF:0|Check Point|VPN-1 & FireWall-1|Check Point|Log|Address spoofing|Unknown|act=Drop cs3Label=Protection Type cs3=IPS

values = CEF(originalData$)
OUTPUT1$= values[2]
OUTPUT2$= values["act"]
OUTPUT3$= values["Protection Type"]

# OUTPUT1$: VPN-1 & FireWall-1 (String)

# OUTPUT2$: Drop (String)

# OUTPUT3$: IPS (String)

LEEF(data, delimiterOverride)🔗

Parses data as a LEEF-formatted message. The header fields can be accessed with an integer and the named fields can be accessed by name. Optionally, a delimiter override may be specified. LEEF extensions should be either tab-separated or they should indicate an alternate delimiter in field 6 of the header. The override parameter should be used when you know that a device is not compliant with the standard.

DATETIME(data, fmt, handle2DigitYear)🔗

Converts a string to a time value for fields like EventTimeUsec$. The optional fmt argument accepts a Go time.Parse layout string. If handle2DigitYear is TRUE, an appropriate year is chosen: usually the current year, with an edge case around the new year.

Example🔗

data = "Sep 21 2018 17:35:54"
OUTPUT1$ = DATETIME(data, "Jan 02 2006 15:04:05")
OUTPUT2$ = data

# OUTPUT1$: 2018-09-21 17:35:54 +0000 UTC (time)

# OUTPUT2$: Sep 21 2018 17:35:54 (String)

IS_PRIVATE_IP(string)🔗

Returns a boolean indicating whether the passed-in IP address string is in a private IP range. Currently supports only IPv4 and tests against the private IP ranges defined in RFC 1918.

Example🔗

data1 = "10.0.0.1"
data2 = "11.0.0.1"
OUTPUT1$ = IS_PRIVATE_IP(data1)
OUTPUT2$ = IS_PRIVATE_IP(data2)

# OUTPUT1$: true (bool)

# OUTPUT2$: false (bool)
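
For reference, RFC 1918 defines three private IPv4 blocks: 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. A check of that kind might look roughly like the Go sketch below; isPrivateIPv4 and mustParseCIDRs are hypothetical names, and the actual implementation behind IS_PRIVATE_IP may differ.

```go
package main

import (
	"fmt"
	"net"
)

// rfc1918 holds the three private IPv4 ranges defined in RFC 1918.
var rfc1918 = mustParseCIDRs("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")

// mustParseCIDRs parses each CIDR string and panics on a malformed one.
func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

// isPrivateIPv4 reports whether s is an IPv4 address inside an RFC 1918 range.
// Invalid strings and IPv6 addresses return false.
func isPrivateIPv4(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil || ip.To4() == nil {
		return false
	}
	for _, n := range rfc1918 {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isPrivateIPv4("10.0.0.1"))   // true
	fmt.Println(isPrivateIPv4("11.0.0.1"))   // false
	fmt.Println(isPrivateIPv4("172.31.0.9")) // true
}
```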

IS_VALID_IP(string)🔗

Returns a boolean indicating whether the passed-in string is a valid IP address, leveraging net.ParseIP.

Example🔗

data1 = "10.0.0.1"
data2 = "999.255.255.255"
data3 = "2001:0db8:85a3:0000:0000:8a2e:0370:7334"
OUTPUT1$ = IS_VALID_IP(data1)
OUTPUT2$ = IS_VALID_IP(data2)
OUTPUT3$ = IS_VALID_IP(data3)

# OUTPUT1$: true (bool)

# OUTPUT2$: false (bool)

# OUTPUT3$: true (bool)

REPLACE(data, oldString, newString)🔗

Replaces all occurrences of oldString with newString.

Example🔗

data = "aaaBBBaaaCCC"
OUTPUT$ = REPLACE(data, "aaa", "zzz")

# OUTPUT$: zzzBBBzzzCCC (String)

REPLACE_REGEX(data, pattern, newString)🔗

Replaces all matches of the regular expression pattern with newString.

Example🔗

data = "aaaBBBaaaCCC"
OUTPUT$ = REPLACE_REGEX(data, "a+", "z")

# OUTPUT$: zBBBzCCC (String)

LENGTH(value [, "list" | "bytes" | "string" | "keys" | "object"])🔗

Returns the length of value. The result depends on the type of value and the optional second argument. LENGTH() is the recommended replacement for STRLEN() and supports lists, objects, and strings.

Default mode (no second argument):

Input type	Returns
`STRING`	UTF-8 byte length of the string
`LIST`	Number of elements
`OBJECT`	Number of top-level keys
`NULL`	`0`
Any other type	Error

Strict mode (second argument specifies the expected type, case-insensitive). When the value's type does not match the requested mode, LENGTH returns NULL instead of raising an error, so parser logic can branch with IF ... != NULL ....

Mode	Operates on	Notes
`"list"`	`LIST`, `STRING`, `NULL`	List length. A non-empty trimmed `STRING` returns `1` (helpful for unwrapping JSONPath singletons). Other types return `NULL`.
`"bytes"` or `"string"`	`STRING`, `NULL`	UTF-8 byte length. Other types return `NULL`.
`"keys"` or `"object"`	`OBJECT`, `NULL`	Number of top-level keys. Other types return `NULL`.

Note

For STRING values, LENGTH() returns the UTF-8 byte length, not the character (rune) count. Multi-byte characters such as accented letters or emoji count as more than one byte.
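
The byte-versus-character distinction is the same one Go makes between len() and utf8.RuneCountInString(). The short Go sketch below illustrates it; byteAndRuneLen is a hypothetical helper and the sample strings are made up for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen returns the UTF-8 byte length and the character (rune)
// count of s, in that order.
func byteAndRuneLen(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	samples := []string{
		"1234567890",  // ASCII only: bytes and runes agree
		"caf\u00e9",   // "café": the accented letter takes 2 bytes
		"\U0001F600",  // a single emoji takes 4 bytes
	}
	for _, s := range samples {
		b, r := byteAndRuneLen(s)
		fmt.Printf("%q bytes=%d runes=%d\n", s, b, r)
	}
}
```

For these samples the byte lengths are 10, 5, and 4, while the rune counts are 10, 4, and 1. Any length limit enforced with LENGTH() therefore applies to bytes, so non-ASCII values reach the limit sooner than their visible character count suggests.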

Examples🔗

# String byte length (default mode)
data = "1234567890"
OUTPUT1$ = LENGTH(data)
# OUTPUT1$: 10 (int)

# Element count of a list
parts = SPLIT("a,b,c,d", ",")
OUTPUT2$ = LENGTH(parts)
# OUTPUT2$: 4 (int)

# Object key count
obj = JSON("{\"a\":1,\"b\":2,\"c\":3}")
OUTPUT3$ = LENGTH(obj)
# OUTPUT3$: 3 (int)

# Strict mode — only return a length when the value is a list
listLen = LENGTH(parts, "list")
# listLen: 4 (int)

# Strict mode — type mismatch returns NULL (no error), so the parser can branch
notAString = SPLIT(data, ",")
result = LENGTH(notAString, "bytes")
# result: NULL

STRLEN(string)🔗

Deprecated

Use LENGTH(value) or LENGTH(value, "bytes") instead. STRLEN() is retained for backwards compatibility and may be removed in a future release. New parsers should not use this function.

Returns the UTF-8 byte length of the passed-in string. Accepts only STRING or NULL; passing any other type returns an error. If NULL is passed, returns 0.

Example🔗

data = "1234567890"
OUTPUT$ = STRLEN(data)

# OUTPUT$: 10 (int)

UPPERCASE(string)🔗

Returns the passed in string with all Unicode letters mapped to their upper case; just an interface/wrapper for strings.ToUpper().

Example🔗

data = "aaabbbccc acme"
OUTPUT$ = UPPERCASE(data)

# OUTPUT$: AAABBBCCC ACME (String)

LOWERCASE(string)🔗

Returns the passed in string with all Unicode letters mapped to their lower case; just an interface/wrapper for strings.ToLower().

Example🔗

data = "AAABBBCCC ACME"
OUTPUT$ = LOWERCASE(data)

# OUTPUT$: aaabbbccc acme (String)

SANITIZE_ALL()🔗

Cleans up null/empty values in event field variables. For example, all of these are set to null: " ", "N/A", "n/a", "null", "nil", "-". This function is run by default on all parsers unless disabled with !SANITIZEALLOFF.

Example🔗

data = "N/A"
OUTPUT$ = data

# OUTPUT$: NULL (null)

!SANITIZEALLOFF
data = "N/A"
OUTPUT$ = data

# OUTPUT$: N/A (String)

TRIM(data)🔗

Removes whitespace, quotes, braces etc.

Example🔗

data = " aaa bbb bcc "
OUTPUT1$ = "---" + data
OUTPUT2$ = "---" + TRIM(data)

# OUTPUT1$: --- aaa bbb bcc (String)

# OUTPUT2$: ---aaa bbb bcc (String)

TRIM_ALL()🔗

Removes whitespace from the beginning/end of all event field variables. This function is run by default on all parsers unless disabled with !TRIMALLOFF.

Example🔗

!TRIMALLOFF
data = " aaa bbb bcc "
OUTPUT2$ = data

# OUTPUT2$: aaa bbb bcc (String)

ADDFIELD(collection, fieldName, fieldValues)🔗

Adds a field to an array of objects. The values of the field for each object are specified by fieldValues (also an array). The name of the new field is specified by fieldName. If collection is NULL, a new array of objects is created, each with a single field (fieldName) with the provided values.

Example🔗

keys = ["httpSourceName", "httpSourceId"]
values = [json["$.httpSourceName"], json["$.httpSourceId"]]

eventMetadata$.record$ = ADDFIELD(NULL, "key$", keys)
eventMetadata$.record$ = ADDFIELD(eventMetadata$.record$, "value$", values)

# event_metadata = {

#     "httpSourceName": json["$.httpSourceName"]

#     "httpSourceId": json["$.httpSourceId"]

# }

URL_PARSE(url, silent)🔗

Parse a URL.

Tip

For more on working with parsing, see Creating, Editing, and Enabling a Custom Parser in XDR.

If silent is true, this does not throw an error in the case of the URL being invalid, and instead nulls all fields. For badly formatted URLs, it always attempts to extract as much as possible. The expected passed in URL format is one of:

scheme:opaque?query#fragment
scheme://userinfo@host/path?query#fragment

Examples🔗

http://user:password@192.1.1.1:8080/1/asdfasdfasdf.html?key=value&key2=value2#topOfTheMorning
hTtps://Example.com:443/here//is/path.html?a=1+6&x=%2f%2Fkey=%41%0Avalue&b=ddd#top
https://example.com/foo/bar/bar/../baz.html?a=1&b=2
example.com/foo/bar/bar/../baz.html?a=1&b=2

If the scheme is not provided (for example, example.com/index.html instead of http://example.com/index.html), then http is assumed and returned in the scheme value.

The resulting collection object contains the following values, if possible, given the URL:

scheme - normalized; the given scheme converted to lower case or http if not provided.
user - the passed in user, if provided.
host_raw - not normalized; the passed in host including the port, for example Example.com:443
host - normalized host; all lower case, and not including the port, for example example.com
port - the extracted port, if present
path_raw - not normalized; the passed in path, for example /foo/bar/bar/../baz.html. Does not include a trailing ?, even if there is a query part of the URI.
path - normalized path, for example /foo/bar/baz.html. Does not include a trailing ?, even if there is a query part of the URI. Normalizations done:
- Characters are URI decoded. (Single pass so %253D is %3D not =)
  - /fo%6F/bar.html → /foo/bar.html
- Multiple forward slashes are reduced to a single one.
  - /foo///bar.html → /foo/bar.html
- Directory traversal sequences are removed.
  - /foo/../bar/ → /bar/
  - /foo/./bar/ → /foo/bar/
query_raw - not normalized; the passed in query string, for example a=1+6&x=%2f%2Fkey=%41%0Avalue&b=ddd. Does not include a leading ? but does preserve order. (Single pass so %253D is %3D not =)
query - normalized query string. Does not include a leading ? but does preserve order. URI decoding done; if URI decoding fails, then the name-value pair where the decoding is unsuccessful is included, in the given order, with no normalization done to it.
- a=1&b=%44%57 → a=1&b=DW
raw_query - DEPRECATED, do not use. Same as query_raw and provided for legacy compatibility but going away as soon as the parsers are updated.
password - the passed in password, if provided.
fragment - the passed in fragment, if provided.

Examples🔗

data = "hTtps://Example.com:443/here//is/path.html?a=1+6&x=%2f%2Fkey=%41%0Avalue&b=ddd#top"
urlParts = URL_PARSE(data, FALSE)
OUTPUT$ = urlParts["path_raw"]

# OUTPUT$: /here//is/path.html (String)
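
The three path normalizations listed above (single-pass URI decoding, slash collapsing, and removal of traversal sequences) can be sketched in Go with the standard library. This is an approximation for illustration only: normalizePath is a hypothetical helper, and the order of the steps and the handling of trailing slashes are assumptions rather than the documented implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// normalizePath applies one pass of URI decoding, then collapses repeated
// slashes and resolves "." and ".." segments.
func normalizePath(raw string) (string, error) {
	decoded, err := url.PathUnescape(raw) // single pass: %253D becomes %3D
	if err != nil {
		return "", err
	}
	if decoded == "" {
		return "", nil
	}
	cleaned := path.Clean(decoded)
	// path.Clean drops a trailing slash; restore it so that
	// /foo/../bar/ normalizes to /bar/ as in the examples above.
	if strings.HasSuffix(decoded, "/") && cleaned != "/" {
		cleaned += "/"
	}
	return cleaned, nil
}

func main() {
	inputs := []string{
		"/fo%6F/bar.html",
		"/foo///bar.html",
		"/foo/../bar/",
		"/foo/./bar/",
		"/here//is/path.html",
		"/a%253D",
	}
	for _, p := range inputs {
		n, err := normalizePath(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", p, n)
	}
}
```

Under these assumptions, the path_raw value /here//is/path.html from the example would correspond to a normalized path of /here/is/path.html.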

CONTAINS(string, substring)🔗

Wraps Go's strings.Contains(string, substring) and returns a bool.

Example🔗

data     = "aaabbbccc acme"
OUTPUT1$ = CONTAINS(data, "roadrunner")
OUTPUT2$ = CONTAINS(data, "acme")

# OUTPUT1$: false    (bool)

# OUTPUT2$: true    (bool)

IDX_OF_TLD(string)🔗

Returns an int64 indicating the position of the top-level domain within the string, for use in indexOfTopPrivateDomain$. If -1 is returned, set IsTopPrivateDomainParsed$ to false; otherwise set IsTopPrivateDomainParsed$ to true.

Example🔗

OUTPUT0$ = IDX_OF_TLD("aaa http://example.com")
OUTPUT1$ = IDX_OF_TLD("http://example.com")
OUTPUT2$ = IDX_OF_TLD("")

# OUTPUT0$: 0    (int)

# OUTPUT1$: 0    (int)

# OUTPUT2$: -1    (int)

PARSE_ERROR(string,string)🔗

PARSE_ERROR explicitly raises an error in the parser. parameter[0] is the error text (errText), evaluated as a string via coercion.EvaluateAsString(). parameter[1] is an optional string that is cast into a boolean via ParserValue.BoolValue() to denote whether a generic event should be created; it defaults to true if not provided. The message does not normalize to any other schema.

Examples🔗

Creates a Generic Event🔗

test = IF someVal != "Expected_value" THEN PARSE_ERROR("bad data received") ELSE "ok"

Doesn’t Create a Generic Event🔗

tenantId$ = TENANT_LOOKUP("ngav_id", vals["Account"], PARSE_ERROR("Unable to find Taegis tenant id for Deep Armor account " + vals["Account"],"False"))

TENANT_LOOKUP(label, value, default)🔗

Looks up the tenant id in Taegis Tenant Manager based on a label and a value from the message. If no tenant is found, the specified default expression is evaluated. Note that if a tenant is found, the third parameter is not evaluated. This gives the caller the option to provide a default value or to use the PARSE_ERROR() function to raise an error.

Example🔗

tenantId$ = TENANT_LOOKUP("VendorName", messageValues["customerId"], PARSE_ERROR("Customer Id not on file"))

BASE64_DECODE(string)🔗

Returns plain text string of a base64 encoded string input.

Example🔗

OUTPUT$ = BASE64_DECODE("aG1lZXBcISBobWVlcFwh")

#OUTPUT$: hmeep\! hmeep\!    (String)
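
The same decoding can be reproduced in Go to check an encoded value before relying on it in a parser. The sketch assumes the standard (padded, non-URL-safe) base64 alphabet, which the description above does not state explicitly; decodeBase64 is a hypothetical wrapper.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeBase64 decodes s using the standard base64 alphabet and returns
// the result as a string.
func decodeBase64(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := decodeBase64("aG1lZXBcISBobWVlcFwh")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // hmeep\! hmeep\!
}
```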

INT(string, base)🔗

Returns the integer value of a numeric string interpreted in the specified base.

Example🔗

OUTPUT$ = INT("4e0", 16)

#OUTPUT$: 1248    (int)
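
The arithmetic behind the example is 4×16² + 14×16 + 0 = 1024 + 224 = 1248. Go's strconv.ParseInt performs the same kind of base-aware conversion, as the sketch below shows; parseInt is a hypothetical wrapper, and whether INT() delegates to this exact call is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseInt converts a numeric string in the given base to an int64.
func parseInt(s string, base int) (int64, error) {
	return strconv.ParseInt(s, base, 64)
}

func main() {
	n, err := parseInt("4e0", 16)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 1248
}
```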

STRING(valueType)🔗

Attempts to cast the variable input into a string representation.

# In some cases "key" can be a string, empty (NULL), an array, or even map.

key = json["$.requestParameters.key"]
# By calling STRING() you guarantee objectKey is set with a value.

objectKey$ = STRING(key)
# Note: ParserValue.StringValue() isn't used directly because addition logic breaks when appending two valuetype.OBJECT to make a list (addition operator).

# valuetype.LIST, valuetype.OBJECT, and valuetype.JSONDATA return the JSON string representation; all others are cast to their string analogs.

OBJKEYS(value)🔗

Will return a list of the keys of a map or json object.

Example🔗

 # Suppose the original json was:

 { 
    "values" : {
        "c" : "x", 
        "b" : "y", 
        "a" : "z"
    }
 }
keys = OBJKEYS(json["$.values"])
# keys is now an array of ["a", "b", "c"]

# NOTE: this function returns the keys in alphabetical order

OBJVALUES(value)🔗

Will return a list of the values of a map or json object.

Example🔗

 # Suppose the original json was:

 {
    "values" : {
        "c" : "x", 
        "b" : {
            "foo" : "bar"
        }, 
        "a" : "z"
    }
 }
vals = OBJVALS(json["$.values"])
# vals is now an array of ["z", "{ 'foo' : 'bar' }", "x"]

# NOTE: this function puts the values in alphabetical order by their key.  This assures that OBJKEYS and OBJVALS output their elements in the same order which is important when combining these functions with ADDFIELD().

FLATTEN(json, keyLabel, valueLabel)🔗

Converts arbitrary json to a list of objects.

Each object has two fields: a key and a value, both of type string. Parameters keyLabel and valueLabel are optional, with default values "key$" and "value$" respectively. This function is intended to provide a convenient way to put JSON data into the schema fields of type KeyValuePairsIndexed; for example, the tags field on the generic schema or the evidence.sourceData.record field of ThirdPartyAlert.

Example🔗

 # Suppose the original json was:

{
    "val" : { 
        "x": [
            "1",
            "2",
            "3"
        ] 
    }
}

# The output would be:

[
    {
        "key$": "val.x.0",
        "value$": "1"
    },
    {
        "key$": "val.x.1", 
        "value$": "2"
    },
    {
        "key$": "val.x.2", 
        "value$": "3"
    }
]
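
The flattening shown above can be approximated with a short recursive walk. The Go sketch below is illustrative only: the pair type and the flatten, join, and flattenJSON helpers are hypothetical, object keys are visited in sorted order as an assumption, and non-string leaves are rendered with fmt.Sprint, which may not match the parser's own string conversion.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// pair is one flattened entry (the "key$" and "value$" fields above).
type pair struct {
	Key   string
	Value string
}

// join appends part to prefix with a dot, omitting the dot at the root.
func join(prefix, part string) string {
	if prefix == "" {
		return part
	}
	return prefix + "." + part
}

// flatten walks v depth-first and appends one pair per leaf value,
// joining object keys and list indexes with dots.
func flatten(prefix string, v any, out []pair) []pair {
	switch t := v.(type) {
	case map[string]any:
		keys := make([]string, 0, len(t))
		for k := range t {
			keys = append(keys, k)
		}
		sort.Strings(keys) // assumption: deterministic, sorted key order
		for _, k := range keys {
			out = flatten(join(prefix, k), t[k], out)
		}
	case []any:
		for i, item := range t {
			out = flatten(join(prefix, strconv.Itoa(i)), item, out)
		}
	default:
		out = append(out, pair{Key: prefix, Value: fmt.Sprint(t)})
	}
	return out
}

// flattenJSON parses data and returns its flattened key/value pairs.
func flattenJSON(data string) ([]pair, error) {
	var doc any
	if err := json.Unmarshal([]byte(data), &doc); err != nil {
		return nil, err
	}
	return flatten("", doc, nil), nil
}

func main() {
	pairs, err := flattenJSON(`{"val":{"x":["1","2","3"]}}`)
	if err != nil {
		panic(err)
	}
	for _, p := range pairs {
		fmt.Println(p.Key, "=", p.Value)
	}
}
```

For the sample input this prints val.x.0 = 1, val.x.1 = 2, and val.x.2 = 3, one per line, matching the keys in the output above.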

VALIDATE_ALL_JSONPATHS_EXIST(inputStr, jsonPaths)🔗

Returns TRUE only when the JSON object embedded in inputStr contains every JSONPath listed in jsonPaths. Returns FALSE if any path is missing, inputStr does not end with a JSON object, or the JSON cannot be parsed.

Parameters:

Name	Type	Description
`inputStr`	`STRING`	A string that ends with a JSON object literal (`{ ... }`). Anything before the trailing JSON object is ignored.
`jsonPaths`	`STRING`	A comma-separated list of JSONPath expressions to validate.

Behavior:

The function extracts a JSON object from the end of inputStr using the pattern { ... }. Strings that do not end with a { ... } block return FALSE.
Each path in jsonPaths is evaluated against the parsed JSON. The function returns TRUE only if every path matches at least one element.
Any error (invalid JSON, malformed JSONPath, type coercion failure, etc.) is reported as FALSE rather than raising an error, which makes the function safe to use in !CONFIRMSTRING expressions.

Example🔗

# originalData$:
# May 17 12:00:00 host alert {"event_id":"123","actor":{"id":"u1"},"action":"login"}

ok = VALIDATE_ALL_JSONPATHS_EXIST(originalData$, "$.event_id,$.actor.id,$.action")
# ok: TRUE

incomplete = VALIDATE_ALL_JSONPATHS_EXIST(originalData$, "$.event_id,$.session.id")
# incomplete: FALSE   ($.session.id is not present)
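
The behavior described above (extract the trailing JSON object, check every path, and treat any failure as FALSE) can be sketched in Go as follows. This is a simplified illustration, not the real implementation: allPathsExist and pathExists are hypothetical, only plain "$.a.b" paths are supported, and the JSON object is assumed to start at the first { in the input.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pathExists reports whether a simple "$.a.b" path resolves inside doc.
func pathExists(doc map[string]any, p string) bool {
	if !strings.HasPrefix(p, "$.") {
		return false
	}
	var cur any = doc
	for _, key := range strings.Split(strings.TrimPrefix(p, "$."), ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return false
		}
		if cur, ok = obj[key]; !ok {
			return false
		}
	}
	return true
}

// allPathsExist reports whether the JSON object at the end of input
// contains every path in the comma-separated list. Every failure mode
// (no trailing object, invalid JSON, missing path) yields false.
func allPathsExist(input, paths string) bool {
	start := strings.Index(input, "{") // assumption: object starts at first "{"
	if start < 0 || !strings.HasSuffix(strings.TrimSpace(input), "}") {
		return false
	}
	var doc map[string]any
	if err := json.Unmarshal([]byte(input[start:]), &doc); err != nil {
		return false
	}
	for _, p := range strings.Split(paths, ",") {
		if !pathExists(doc, strings.TrimSpace(p)) {
			return false
		}
	}
	return true
}

func main() {
	msg := `May 17 12:00:00 host alert {"event_id":"123","actor":{"id":"u1"},"action":"login"}`
	fmt.Println(allPathsExist(msg, "$.event_id,$.actor.id,$.action")) // true
	fmt.Println(allPathsExist(msg, "$.event_id,$.session.id"))        // false
}
```

Returning false instead of raising an error on every failure path is what makes a check of this shape suitable for a confirm expression, where an error would otherwise interrupt parser matching.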

Common Use Case: Gating a Parser with `!CONFIRMSTRING`🔗

VALIDATE_ALL_JSONPATHS_EXIST is useful as an expression-mode confirm string: only match messages whose embedded JSON contains every required field, preventing partial-event mismatches against parsers that need those fields downstream.

!CONFIRMWITH=EXPRESSION
!CONFIRMSTRING=VALIDATE_ALL_JSONPATHS_EXIST(originalData$, "$.event_id,$.actor.id,$.action")