Regular Expressions

Introduction

Regular expressions are implemented in the Regex module.

Syntax

Although there are lower-level options, regular expressions are usually compiled using the ~r sigil.

hello_pattern = ~r/hello/
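As a minimal sketch of one lower-level route, assuming the pattern arrives as a plain string at runtime: `Regex.compile/2` returns a tagged tuple, while `Regex.compile!/2` raises on an invalid pattern. The variable names are illustrative.

```elixir
# Compile a pattern from a plain string at runtime.
{:ok, compiled_hello} = Regex.compile("hello")

# Options are given as a string of modifier letters; "i" means case insensitive.
compiled_insensitive = Regex.compile!("hello", "i")

# An invalid pattern yields an error tuple instead of raising.
{:error, _reason} = Regex.compile("(unclosed")

lower_matches = Regex.match?(compiled_hello, "hello world")
upper_matches = Regex.match?(compiled_insensitive, "HELLO world")
```

The sigil form is usually preferable when the pattern is known in advance, since a malformed pattern is then reported when the code is compiled rather than when it runs.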

The pattern can be made case insensitive by adding an i to the end:

insensitive_hello_pattern = ~r/hello/i

The pattern goes in between the two forward slashes. For instance, the following pattern matches most floats:

float_pattern = ~r/(\d*\.\d+|\d+)/

String interpolation is supported for dynamic pattern construction:

city = "Copenhagen"
country = "Denmark"
location_pattern = ~r/#{city} is in #{country}/
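Interpolated values are inserted into the pattern as written, so any regex metacharacters they contain keep their special meaning. The sketch below uses hypothetical inputs (not from the example above) to show how `Regex.escape/1` makes interpolated text match literally.

```elixir
# Hypothetical input containing a regex metacharacter (the dot).
city = "St. Louis"
country = "the United States"

# Without escaping, the "." in the city name matches any character.
unescaped_pattern = ~r/#{city} is in #{country}/

# Regex.escape/1 escapes metacharacters so the text is matched literally.
escaped_pattern = ~r/#{Regex.escape(city)} is in #{Regex.escape(country)}/

loose_hit = "Stx Louis is in the United States" =~ unescaped_pattern
strict_miss = "Stx Louis is in the United States" =~ escaped_pattern
strict_hit = "St. Louis is in the United States" =~ escaped_pattern
```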

Matching

Matching is a boolean operation that reports whether a pattern occurs anywhere in a string. It is implemented by the Regex.match?/2 function and by the =~ operator as a shorthand.
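A small self-contained sketch of the two forms side by side; both return a boolean, and the variable names are illustrative.

```elixir
hello_pattern = ~r/hello/

# The function form takes the pattern first, then the string.
via_function = Regex.match?(hello_pattern, "he said hello to the world")

# The operator form takes the string on the left and the pattern on the right.
via_operator = "he said hello to the world" =~ hello_pattern

# "Hello" starts with an uppercase H, so only the insensitive pattern matches.
sensitive = "Hello, world!" =~ hello_pattern
insensitive = "Hello, world!" =~ ~r/hello/i
```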

test_suite = [
  "Hello, world!",
  "he said hello to the world",
  "end of days"
]

contains_hello =
  test_suite
  |> Enum.map(fn test ->
    %{
      sensitive: test =~ hello_pattern,
      insensitive: test =~ insensitive_hello_pattern
    }
  end)

The float pattern can be checked against a second set of strings:

test_suite = [
  "0",
  "3.14",
  "there were 2 flowers",
  "regular expression",
  "2.25 was the price"
]

contains_float =
  test_suite
  |> Enum.map(fn test -> test =~ float_pattern end)

The interpolated location pattern can be checked sentence by sentence:

text =
  "The Danish Queen lives in Copenhagen. Copenhagen is in Denmark. It is a medium-sized capital."

states_origin =
  text
  |> String.split(".")
  |> Enum.any?(fn sentence -> sentence =~ location_pattern end)
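Because a match may occur anywhere in the string, a sentence that merely contains a number also matches the float pattern. The sketch below contrasts this with an anchored variant; `anchored_float_pattern` is an illustrative addition that uses `^` and `$` to require the whole string to be a number.

```elixir
float_pattern = ~r/(\d*\.\d+|\d+)/
# Illustrative variant: ^ and $ require the whole string to be a number.
anchored_float_pattern = ~r/^(\d*\.\d+|\d+)$/

inputs = ["0", "3.14", "there were 2 flowers", "regular expression", "2.25 was the price"]

unanchored = Enum.map(inputs, fn input -> input =~ float_pattern end)
# => [true, true, true, false, true]

anchored = Enum.map(inputs, fn input -> input =~ anchored_float_pattern end)
# => [true, true, false, false, false]
```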

Capturing Anonymous Fields

The full matched substring and the substring matched by each parenthesized group in the pattern are called fields. Given a match, Regex.run/2 returns these fields as a list: the full match first, then the groups in the order they appear in the pattern. When there is no match, it returns nil.

test_suite = [
  "a + b",
  "a * b",
  " a / b",
  "a+b"
]

test_suite
|> Enum.map(fn test -> Regex.run(~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/, test) end)
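To make the shape of the result concrete, here is a short sketch using the same pattern on individual inputs. The name `binary_op` is illustrative, and the `capture: :all_but_first` option is shown as one way to drop the full match from the list.

```elixir
binary_op = ~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/

plus = Regex.run(binary_op, "a + b")
# => ["a + b", "a", "+", "b"]

# The leading space is skipped: the match starts at "a".
divide = Regex.run(binary_op, " a / b")
# => ["a / b", "a", "/", "b"]

# No spaces around the operator, so the pattern does not match.
no_spaces = Regex.run(binary_op, "a+b")
# => nil

# Return only the parenthesized groups, without the full match.
groups_only = Regex.run(binary_op, "a * b", capture: :all_but_first)
# => ["a", "*", "b"]
```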

Capturing Named Fields

Each field can be named using the (?<name>...) syntax. Regex.named_captures/2 then returns a map from these names to the matched values, or nil when there is no match:

test_suite
|> Enum.map(fn test ->
  # The group names lhs, op, and rhs are illustrative.
  Regex.named_captures(~r/(?<lhs>[^ ]+) (?<op>\+|\-|\*|\/) (?<rhs>[^ ]+)/, test)
end)
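A self-contained sketch of what named captures return for single inputs; the group names `lhs`, `op`, and `rhs` are illustrative choices. Note that the keys of the resulting map are strings, not atoms.

```elixir
named_binary_op = ~r/(?<lhs>[^ ]+) (?<op>\+|\-|\*|\/) (?<rhs>[^ ]+)/

times = Regex.named_captures(named_binary_op, "a * b")
# => %{"lhs" => "a", "op" => "*", "rhs" => "b"}

# No match yields nil rather than an empty map.
unmatched = Regex.named_captures(named_binary_op, "a+b")
# => nil

# Regex.names/1 lists the group names defined in a pattern.
names = Regex.names(named_binary_op)
# => ["lhs", "op", "rhs"]
```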

Replacing Fields

Regex.replace/3 replaces each match with a string that may refer to the fields. Fields are referenced by index, where \\0 is the full match and \\1, \\2, and so on are the parenthesized groups:

test_suite
|> Enum.map(fn test ->
  Regex.replace(~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/, test, "\\3 \\2 \\1")
end)
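As a brief sketch of the results, the same replacement applied to individual strings (the name `binary_op` is illustrative). Text outside the match is left untouched, and a string without a match is returned unchanged.

```elixir
binary_op = ~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/

# The backslash is doubled because the replacement is an ordinary string.
swapped = Regex.replace(binary_op, "a + b", "\\3 \\2 \\1")
# => "b + a"

# The leading space lies outside the match and is preserved.
swapped_with_space = Regex.replace(binary_op, " a / b", "\\3 \\2 \\1")
# => " b / a"

# No match, so the input comes back as is.
untouched = Regex.replace(binary_op, "a+b", "\\3 \\2 \\1")
# => "a+b"
```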

But it can also be accomplished through an anonymous function that has access to the fields:

test_suite
|> Enum.map(fn test ->
  Regex.replace(~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/, test, fn _full, lhs, op, rhs ->
    "(#{lhs}) #{op} (#{rhs})"
  end)
end)
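A self-contained sketch of the function form on single inputs. The function receives the full match followed by one argument per group; the `String.upcase/1` call in the second example is an illustrative transformation, not part of the original.

```elixir
binary_op = ~r/([^ ]+) (\+|\-|\*|\/) ([^ ]+)/

parenthesized =
  Regex.replace(binary_op, "a + b", fn _full, lhs, op, rhs ->
    "(#{lhs}) #{op} (#{rhs})"
  end)
# => "(a) + (b)"

# A function can compute the replacement, which a template string cannot.
upcased =
  Regex.replace(binary_op, "a * b", fn _full, lhs, op, rhs ->
    "#{String.upcase(lhs)} #{op} #{String.upcase(rhs)}"
  end)
# => "A * B"
```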