Strings and binaries

Strings in Elixir are delimited by double quotes, and they are encoded in UTF-8:

A UTF-8 string:

"Cześć"

You can concatenate strings with the <> operator and interpolate values into them with #{}.

"hello" <> " " <> "world"

Interpolation works for any value that can be converted to a string, that is, any type that implements the String.Chars protocol:

number = 42
"I am #{number} years old!"
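
As a small sketch of what "conversion to string" covers, the snippet below interpolates an integer, a float, and an atom, all of which are built-in types that implement String.Chars. The variable names are illustrative only.

```elixir
# Each interpolated value is converted through the String.Chars protocol.
age = 42
height = 1.75
status = :ok

message = "age: #{age}, height: #{height}, status: #{status}"
# message is "age: 42, height: 1.75, status: ok"
```

Values without a String.Chars implementation, such as tuples, raise an error when interpolated directly; wrapping them in inspect/1 is the usual workaround.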

Finally, IO.puts/1 prints a string to the standard output:

IO.puts("hello world")

Let’s dive deeper into how strings are represented internally.

Strings as Binaries

Internally, Elixir represents a string as a binary: a contiguous sequence of bytes. This means functions that work on binaries, such as byte_size/1, also work on strings. String-specific functions live in the String module.

Strings are binaries:

is_binary("cześć")

In UTF-8, ‘ś’ and ‘ć’ take two bytes each.

# 💡 Try changing `byte_size` to `String.length/1`.
byte_size("cześć")
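
To make the difference between bytes and characters concrete, here is a minimal sketch that compares the two counts and lists the raw bytes behind the string. It uses :binary.bin_to_list/1 from the Erlang standard library; the variable names are illustrative.

```elixir
word = "cześć"

# Number of bytes in the underlying binary: 'ś' and 'ć' use two bytes each.
bytes = byte_size(word)
# bytes is 7

# Number of characters (graphemes), as a reader would count them.
chars = String.length(word)
# chars is 5

# The raw UTF-8 bytes: 197, 155 encode 'ś' and 196, 135 encode 'ć'.
raw = :binary.bin_to_list(word)
# raw is [99, 122, 101, 197, 155, 196, 135]
```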

Using the String module:

String.upcase("cześć")
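
Functions in the String module operate on characters rather than on individual bytes, so they handle non-ASCII text correctly. A short sketch with a few of them (the variable names are illustrative):

```elixir
word = "cześć"

# Upcasing is Unicode-aware, so the accented letters are converted too.
upper = String.upcase(word)
# upper is "CZEŚĆ"

# Reversing works per character, so multi-byte characters stay intact.
reversed = String.reverse(word)
# reversed is "ćśezc"

# Splitting into graphemes yields one element per visible character.
letters = String.graphemes(word)
# letters is ["c", "z", "e", "ś", "ć"]
```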