6. Pattern Matching

Introduction

meadow.m!(I, "\bovines?\b", println("Here be sheep!"))

string = "good food"
string.subst!("o*", "e")                # => egood food ("o*" matches the empty string at position 0)

"ababacaca".m!(
    "((a|ba|b)+(a|ac)+)",
    s,_,_ -> s.println
)                                       # => ababa

s.m!(G, "(\d+)", e->println("Found number {e}"))
numbers = s.m(G, "(\d+)")
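
For comparison, the same introductory operations can be sketched in Python with the standard `re` module. This is an illustration only, not part of the listing above; the function names `has_sheep`, `first_sub`, `leftmost_match` and `find_numbers` are hypothetical.

```python
import re

def has_sheep(meadow):
    # Case-insensitive test for the whole word "ovine" or "ovines".
    return re.search(r"\bovines?\b", meadow, re.IGNORECASE) is not None

def first_sub(text):
    # Replace only the first match; "o*" can match the empty string,
    # so the replacement lands at position 0.
    return re.sub(r"o*", "e", text, count=1)

def leftmost_match(text):
    # Return group 1 of the leftmost match, or None when nothing matches.
    m = re.search(r"((a|ba|b)+(a|ac)+)", text)
    return m.group(1) if m else None

def find_numbers(s):
    # Collect every run of digits, as strings.
    return re.findall(r"\d+", s)
```

The first substitution inserts rather than replaces because the engine is satisfied with a zero-length match before the leading "g".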

Copying and Substituting Simultaneously

dst = src.subst("this", "that")

# Make All Words Title-Cased
capword = word.subst(G, "(\w+)", capitalize)

# /usr/man/man3/foo.1 changes to /usr/man/cat3/foo.1
catpage = manpage.subst("man(?=\d)", "cat")

bindirs = " /usr/bin /bin /usr/local/bin ".words
libdirs = bindirs.map(subst(, "bin", "lib"))
println(libdirs.join(" "))              # /usr/lib /lib /usr/local/lib
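
A rough Python counterpart of the copy-and-substitute idiom, given as a sketch: `re.sub` already returns a new string and leaves the source untouched. The helper names `title_case`, `man_to_cat` and `bin_to_lib` are assumptions made for this example.

```python
import re

def title_case(text):
    # Capitalize every word; the callback receives each match object.
    return re.sub(r"\w+", lambda m: m.group(0).capitalize(), text)

def man_to_cat(path):
    # Rewrite "man" only where a digit follows (lookahead consumes nothing).
    return re.sub(r"man(?=\d)", "cat", path)

def bin_to_lib(dirs):
    # Substitute in each element and return a new list.
    return [re.sub(r"bin", "lib", d, count=1) for d in dirs]

if __name__ == "__main__":
    bindirs = " /usr/bin /bin /usr/local/bin ".split()
    print(" ".join(bin_to_lib(bindirs)))
```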

Matching Letters

Matching Words

Commenting Regular Expressions

Finding the Nth Occurrence of a Match

s = "One fish two fish red fish blue fish"
want = 3
count = 0
s.m!(
    G|I,
    "(\w+)\s+fish\b",
    s ->
        count++
        if count == want then
            println("The third fish is a {s} one.") # => red
)
s.m!(
    I, "(?:\w+\s+fish\s+)\{2}(\w+)\s+fish",
    s -> println("The third fish is a {s} one.") # => red
)
colors = s.m(G|I, "(\w+)\s+fish\b")
println("The third fish is a {colors[2]} one.") # => red

pond = "One fish two fish red fish blue fish swim here."
color = pond.m(G|I, "\b(\w+)\s+fish\b").last
println("Last fish is {color}.")        # => Last fish is blue.
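
The list-indexing variant above translates almost directly to Python. The following is a minimal sketch; `nth_fish` and `last_fish` are hypothetical names, and `n` is taken to be 1-based.

```python
import re

FISH = re.compile(r"\b(\w+)\s+fish\b", re.IGNORECASE)

def nth_fish(text, n):
    # Collect every word preceding "fish", then pick the n-th (1-based).
    colors = FISH.findall(text)
    return colors[n - 1] if 1 <= n <= len(colors) else None

def last_fish(text):
    # The last match is simply the final element, if there is one.
    colors = FISH.findall(text)
    return colors[-1] if colors else None
```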

Matching Multiple Lines

Reading Records with a Pattern Separator

Extracting a Range of Lines

file.open.to_list.each_with_index(s,i -> println(s) if i.member?(15 .. 17))

r = Regexp::new_range(m?(, I, "<XMP>"), m?(, I, "</XMP>"))
l.each(s -> if s.member?(r) then print(s))

header = Regexp::new_range(_ -> True, == "")
body = Regexp::new_range(== "", _ -> True)
l.each(s ->
    in_header = s.member?(header)
    in_body = s.member?(body)
)

header = Regexp::new_range(m?(, I, "^From:?\s"), == "")
Sys::stdall.collect(s ->
    if s.member?(header) then
        c = "[^<>(),;\s]"
        s.m(G, "({c}+\@{c}+)")
    else []
).flatten.uniq.each(println)
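
Python has no built-in range operator over lines, so the idea has to be spelled out with a small state machine. The sketch below assumes an inclusive range whose end condition is also tested on the line that opens it; whether `Regexp::new_range` above behaves the same way is not stated in the source. `numbered_range` and `lines_between` are hypothetical names.

```python
import re

def numbered_range(lines, first, last):
    # Keep lines whose 1-based number falls within first..last.
    return [l for i, l in enumerate(lines, start=1) if first <= i <= last]

def lines_between(lines, start, stop):
    # Yield each line from one satisfying start() through the next one
    # satisfying stop(), inclusive; the range may open again later.
    inside = False
    for line in lines:
        if not inside:
            if not start(line):
                continue
            inside = True
        yield line
        if stop(line):
            inside = False

def xmp_blocks(lines):
    # Case-insensitive <XMP> ... </XMP> extraction.
    return list(lines_between(
        lines,
        lambda l: re.search(r"<XMP>", l, re.IGNORECASE),
        lambda l: re.search(r"</XMP>", l, re.IGNORECASE),
    ))
```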

Matching Shell Globs as Regular Expressions

s.glob2pat =
    patmap = {
        "*", ".*",
        "?", ".",
        "[", "[",
        "]", "]",
    }
    s.subst!(G, "(.)", (c -> patmap{c} or c.quotemeta))
    "^{s}$"
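
The same character-by-character translation can be sketched in Python. `glob2pat` here is a re-creation for illustration, not the code above; anything not in the map is escaped with `re.escape`.

```python
import re

def glob2pat(glob):
    # Map glob metacharacters to regexp equivalents; brackets pass through
    # unchanged so character classes keep working.
    patmap = {"*": ".*", "?": ".", "[": "[", "]": "]"}
    body = "".join(patmap.get(c, re.escape(c)) for c in glob)
    return "^" + body + "$"
```

Python's standard library also ships `fnmatch.translate`, which performs a similar conversion with somewhat different rules.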

Speeding Up Interpolated Matches

# no need

Testing for a Valid Pattern

# not possible? should runtime-constructed regexps be disallowed?

Honoring Locale Settings in Regular Expressions

Approximate Matching

Matching from Where the Last Pattern Left Off

Greedy and Non-Greedy Matches

Detecting Duplicate Words

Expressing AND, OR, and NOT in a Single Pattern

Matching Multiple-Byte Characters

Matching a Valid Mail Address

Matching Abbreviations

Program: urlify

Program: tcgrep

Regular Expression Grabbag