Pattern Matching Functions

SenseTalk's pattern language lets you define patterns in natural-language terms and use them to search for matching text. The following functions work with pattern language definitions.

For information about the pattern language, see SenseTalk Pattern Language Basics.

Match, Every Match Functions

Behavior: Use the match and every match functions to locate a pattern within text. The match function finds the first occurrence of the specified pattern, and every match finds every occurrence of the pattern in the source.

Parameters:

  • the match of pattern (required): Can be specified in the pattern language syntax within angle brackets ( < ... > ) or as a variable. For information about the pattern language syntax, see Pattern Language Syntax.
  • in source (required): Can be specified as a quoted string, in a variable, or as an expression that yields text.
  • after position (optional): A character position in the source. The search begins at the character following this position.
  • before position (optional): A character position in the source. The search ends at the character preceding this position.
  • caseSensitivity (optional): Any of the standard case sensitivity phrases (caseSensitive, with case, and so on), which determine whether the search is case sensitive. Default: caseInsensitive
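
As a rough illustration of the optional parameters above, the following sketch restricts a search by position and by case. The variable name sampleText is chosen only for this illustration, and the comments restate what the parameter descriptions imply rather than verified output.

```sensetalk
set sampleText to "1bc3 8472QX905"

-- Character 8 is the "7", so this search starts at character 9 ("2QX905")
put the match of <3 digits> in sampleText after position 8

-- This search ends at the character before position 12 ("1bc3 8472QX")
put the match of <3 digits> in sampleText before position 12

-- Literal text in a pattern is compared according to the case setting
put the match of <"qx"> in sampleText ignoring case
put the match of <"qx"> in sampleText considering case
```

Because searches are case insensitive by default, the ignoring case phrase in the third statement only makes the default explicit.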

Syntax:

{the} match of pattern [in | within] source { [before | after] [ {position | location} position | {the} end] } {considering case | ignoring case}

every match of pattern [in | within] source { [before | after] [ {position | location} position | {the} end] } {considering case | ignoring case}

match(pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

everyMatch(pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

Note: When using the match() or everyMatch() traditional function call syntax, the first two parameters are required and the remaining three are optional. The caseSensitive parameter in this case is a boolean (default: False) indicating whether the search is case sensitive. The treatPositionAsBefore parameter is a boolean (default: False) that specifies whether the search should occur before the location given by the position parameter rather than after.
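
The optional arguments in the function call form can be sketched as follows. The variable sampleText is a name chosen for this illustration, and the comments restate the parameter descriptions in the note above; they are not verified output.

```sensetalk
set sampleText to "1bc3 8472QX905"

-- position only: the search takes place after character 8
put match(<3 digits>, sampleText, 8)

-- position 12, caseSensitive false, treatPositionAsBefore true:
-- the search takes place before character 12 instead of after it
put match(<3 digits>, sampleText, 12, false, true)

-- everyMatch takes the same arguments in the same order
put everyMatch(<3 digits>, sampleText, 12, false, true)
```

Because the optional arguments are positional, a value for caseSensitive has to be supplied in order to reach treatPositionAsBefore.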

Returns: Property lists that describe where the pattern was found in the source text. The match function returns a single property list, and every match returns a list containing one property list for each match found. Each match property list contains at least two properties:

  • text: The full text that was matched.
  • text_range: The range of characters in the source where the matched text was located.

When the pattern contains one or more capture groups, the match property list also includes a pair of properties for each capture group, keyed by the name of the group:

  • name: The text matched by the capture group, stored under the capture group's name.
  • name_range: The range of characters in the source where the capture group was found.

Note: Because the full matched text is always returned with the property name text, you should not use the name text for any capture groups within a pattern.
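
A minimal sketch of reading capture group properties from a match follows. It assumes that capture groups are written as {name: pattern}; check Pattern Language Syntax for the exact form. The group names area and lineNumber and the variable phoneMatch are hypothetical names chosen for this illustration.

```sensetalk
-- Assumption: a capture group is written as {name: pattern}
set phoneMatch to the match of <{area: 3 digits}, "-", {lineNumber: 4 digits}> in "Call 555-0142 today"

put phoneMatch.area -- text captured by the "area" group
put phoneMatch.area_range -- where that text was found in the source
put phoneMatch.lineNumber -- text captured by the "lineNumber" group
```

Neither group is named text, so the full matched text remains available under the text property.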

Example:

put the match of <punctuation> within "Green 1: 112-14" --> {text:":", text_range:"8" to "8"}

Example:

put the match of <3 digits> in "1bc3 8472QX905" --> {text:"847", text_range:"6" to "8"}

Example:

put match(<3 digits>, "1bc3 8472QX905") --> {text:"847", text_range:"6" to "8"}

Example:

put every match of <3 digits> in "123456789" --> [{text:"123", text_range:"1" to "3"},{text:"456", text_range:"4" to "6"},{text:"789", text_range:"7" to "9"}]

Example:

put everyMatch(<3 digits>, "123456789") --> [{text:"123", text_range:"1" to "3"},{text:"456", text_range:"4" to "6"},{text:"789", text_range:"7" to "9"}]

Occurrence, Every Occurrence Functions

Behavior: The occurrence and every occurrence functions return the matched text for a defined pattern. The occurrence function returns the first match found, and every occurrence returns a list of every match found in the source.

Parameters:

  • the occurrence of pattern (required): Can be specified in the pattern language syntax within angle brackets ( < ... > ) or as a variable. For information about the pattern language syntax, see Pattern Language Syntax.
  • in source (required): Can be specified as a quoted string, in a variable, or as an expression that yields text.
  • after position (optional): A character position in the source. The search begins at the character following this position.
  • before position (optional): A character position in the source. The search ends at the character preceding this position.
  • caseSensitivity (optional): Any of the standard case sensitivity phrases (caseSensitive, with case, and so on), which determine whether the search is case sensitive. Default: caseInsensitive

Syntax:

{the} occurrence of pattern [in | within] source { [before | after] [ {position | location} position | {the} end] } {considering case | ignoring case}

every occurrence of pattern [in | within] source { [before | after] [ {position | location} position | {the} end] } {considering case | ignoring case}

occurrence(pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

everyOccurrence(pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

Note: When using the occurrence() or everyOccurrence() traditional function call syntax, the first two parameters are required and the remaining three are optional. The caseSensitive parameter in this case is a boolean (default: False) indicating whether the search is case sensitive. The treatPositionAsBefore parameter is a boolean (default: False) that specifies whether the search should occur before the location given by the position parameter rather than after.

Returns: Text that matches the pattern, if any; otherwise empty.
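
Because occurrence returns empty when nothing matches, a script can test the result before using it. The following is a minimal sketch; the variable names sampleText and orderNumber are chosen only for this illustration.

```sensetalk
set sampleText to "Order ABC shipped"
set orderNumber to the occurrence of <3 digits> in sampleText

-- The sample text contains no digits, so orderNumber is expected to be empty
if orderNumber is empty then
	put "No 3-digit number found"
else
	put "Found: " & orderNumber
end if
```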

Example:

put the occurrence of <3 digits> in "1bc3 8472QX905" --> "847"

Example:

put the occurrence of <"$", digits> in "$895" --> "$8"

Example:

put occurrence(<"$", digits>, "$895") --> "$8"

Example:

put every occurrence of <3 digits> in "123456789" --> [123,456,789]

Example:

put occurrence of <max digits> in "Issue #429 was resolved on 15-Jun-2018" --> 429

put the range of <max digits> in "Issue #429 was resolved on 15-Jun-2018" --> 8 to 10

put every instance of <max digits> in "Issue #429 was resolved on 15-Jun-2018" after position 10 --> [15,2018]

Example:

set KingQuotes to {{

Darkness cannot drive out darkness; only light can do that. Hate cannot drive out hate; only love can do that.

The ultimate measure of a man is not where he stands in moments of comfort and convenience, but where he stands at times of challenge and controversy.

Faith is taking the first step even when you don't see the whole staircase.

Our lives begin to end the day we become silent about things that matter.

Injustice anywhere is a threat to justice everywhere.

I look to a day when people will not be judged by the color of their skin, but by the content of their character.

I have decided to stick with love. Hate is too great a burden to bear.

The time is always right to do what is right.

Life's most persistent and urgent question is, "What are you doing for others?"

We must learn to live together as brothers or perish together as fools.

}}

set LWords to <word beginning with "L", chars, word break>

put every occurrence of LWords in KingQuotes

--> [light,love,lives,look,love,Life,learn,live]

set JWords to <"J" at start of a word, chars, end of word>

put every match of JWords in KingQuotes

--> [{text:"justice", text_range:"447" to "453"},{text:"judged", text_range:"507" to "512"}]

This topic was last updated on August 19, 2021, at 03:30:51 PM.
