Lab: Explorer
Mix.install([
{:explorer, "~> 0.10.0"},
{:kino, "~> 0.14.2"},
{:kino_explorer, "~> 0.1.23"}
])
Setup
alias Explorer.DataFrame
alias Explorer.Series
Explorer.Series
Reading and Writing Data
df = Explorer.Datasets.fossil_fuels()
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
input = Kino.Input.text("Filename")
filename = Kino.Input.read(input)
DataFrame.to_csv(df, filename)
:ok
Working with Series
Expolorer
works up from the concept of a Series
. A DataFrame
can be thought of as a row-aligned map of Series
. These are like vectors
in R or series
in Pandas.
Series
can be constructed from Elixir basic types
s1 = Series.from_list([1, 2, 3])
#Explorer.Series<
Polars[3]
s64 [1, 2, 3]
>
s2 = Series.from_list(["a", "b", "c"])
#Explorer.Series<
Polars[3]
string ["a", "b", "c"]
>
s3 = Series.from_list([~D[2024-10-31], ~D[2024-11-01]])
#Explorer.Series<
Polars[2]
date [2024-10-31, 2024-11-01]
>
Series.dtype(s3)
:date
Series.size(s3)
2
Series
are also nullable
s = Series.from_list([1.0, 2.0, nil, nil, 5.0])
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, nil, nil, 5.0]
>
Missing values can be filled using one of the following strategies:
-
:forward
- replacenil
with the previous value -
:backward
- replacenil
with the next value -
:max
- replacenil
with the series maximum -
:min
- replacenil
with the series minimum -
:mean
- replacenil
with the series mean
Series.fill_missing(s, :forward)
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, 2.0, 2.0, 5.0]
>
Series.fill_missing(s, :backward)
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, 5.0, 5.0, 5.0]
>
Series.fill_missing(s, :max)
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, 5.0, 5.0, 5.0]
>
Series.fill_missing(s, :min)
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, 1.0, 1.0, 5.0]
>
Series.fill_missing(s, :mean)
#Explorer.Series<
Polars[5]
f64 [1.0, 2.0, 2.6666666666666665, 2.6666666666666665, 5.0]
>
A goal of Explorer
is useful error messages
Series.from_list([1, 2, 3, "a"])
Series
implement the Access
protocol. You can slice and dice a Series
in many ways:
s = 1..10 |> Enum.to_list() |> Series.from_list()
#Explorer.Series<
Polars[10]
s64 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>
s[1]
2
s[-1]
10
s[0..4]
#Explorer.Series<
Polars[5]
s64 [1, 2, 3, 4, 5]
>
s[[0, 4, 4]]
#Explorer.Series<
Polars[3]
s64 [1, 5, 5]
>
Explorer
comparisons return boolean series
s = 1..11 |> Enum.to_list() |> Series.from_list()
s1 = 11..1//-1 |> Enum.to_list() |> Series.from_list()
Series.equal(s, s1)
#Explorer.Series<
Polars[11]
boolean [false, false, false, false, false, true, false, false, false, false, false]
>
Series.equal(s, 5)
#Explorer.Series<
Polars[11]
boolean [false, false, false, false, true, false, false, false, false, false, false]
>
Series.not_equal(s, 10)
#Explorer.Series<
Polars[11]
boolean [true, true, true, true, true, true, true, true, true, false, true]
>
Series.greater_equal(s, 5)
#Explorer.Series<
Polars[11]
boolean [false, false, false, false, true, true, true, true, true, true, true]
>
Explorer
supports arithmetic
Series.add(s, s1)
#Explorer.Series<
Polars[11]
s64 [12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12]
>
Series.subtract(s, 4)
#Explorer.Series<
Polars[11]
s64 [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]
>
Series.multiply(s, s1)
#Explorer.Series<
Polars[11]
s64 [11, 20, 27, 32, 35, 36, 35, 32, 27, 20, 11]
>
Sorting a Series
is trivial as well
1..100
|> Enum.to_list()
|> Enum.shuffle()
|> Series.from_list()
|> Series.sort()
#Explorer.Series<
Polars[100]
s64 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, ...]
>
s =
1..100
|> Enum.to_list()
|> Enum.shuffle()
|> Series.from_list()
ids =
s
|> Series.argsort()
|> Series.to_list()
[7, 48, 70, 59, 64, 18, 62, 24, 60, 20, 45, 6, 38, 3, 41, 76, 72, 31, 92, 10, 5, 34, 42, 37, 91, 71,
69, 22, 82, 68, 80, 30, 88, 95, 56, 84, 14, 77, 26, 74, 81, 99, 16, 29, 85, 13, 55, 53, 9, 40, ...]
Series.slice(s, ids)
#Explorer.Series<
Polars[100]
s64 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, ...]
>
Calculating cumulative sumsn
s = 1..100 |> Enum.to_list() |> Series.from_list()
Series.cumulative_sum(s)
#Explorer.Series<
Polars[100]
s64 [1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231, 253, 276, 300, 325, 351, 378, 406, 435, 465, 496, 528, 561, 595, 630, 666, 703, 741, 780, 820, 861, 903, 946, 990, 1035, 1081, 1128, 1176, 1225, 1275, ...]
>
Calcuate rolling sum
Series.window_sum(s, 4)
#Explorer.Series<
Polars[100]
s64 [1, 3, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82, 86, 90, 94, 98, 102, 106, 110, 114, 118, 122, 126, 130, 134, 138, 142, 146, 150, 154, 158, 162, 166, 170, 174, 178, 182, 186, 190, 194, ...]
>
Counting and listing unique values
s = Series.from_list(["a", "b", "b", "c", "c", "c"])
Series.distinct(s)
#Explorer.Series<
Polars[3]
string ["a", "b", "c"]
>
Series.n_distinct(s)
3
Series.frequencies(s)
#Explorer.DataFrame<
Polars[3 x 2]
values string ["c", "b", "a"]
counts u32 [3, 2, 1]
>
Working with DataFrames
A DataFrame
is a collection of Series
of the same size, which is why you can create a DataFrame
from a Keyword
list
DataFrame.new(a: [1, 2, 3], b: ["a", "b", "c"])
#Explorer.DataFrame<
Polars[3 x 2]
a s64 [1, 2, 3]
b string ["a", "b", "c"]
>
DataFrame.names(df)
["year", "country", "total", "solid_fuel", "liquid_fuel", "gas_fuel", "cement", "gas_flaring",
"per_capita", "bunker_fuels"]
DataFrame.dtypes(df)
%{
"bunker_fuels" => {:s, 64},
"cement" => {:s, 64},
"country" => :string,
"gas_flaring" => {:s, 64},
"gas_fuel" => {:s, 64},
"liquid_fuel" => {:s, 64},
"per_capita" => {:f, 64},
"solid_fuel" => {:s, 64},
"total" => {:s, 64},
"year" => {:s, 64}
}
DataFrame.shape(df)
{1094, 10}
{DataFrame.n_rows(df), DataFrame.n_columns(df)}
{1094, 10}
DataFrame.head(df)
#Explorer.DataFrame<
Polars[5 x 10]
year s64 [2010, 2010, 2010, 2010, 2010]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA"]
total s64 [2308, 1254, 32500, 141, 7924]
solid_fuel s64 [627, 117, 332, 0, 0]
liquid_fuel s64 [1601, 953, 12381, 141, 3649]
gas_fuel s64 [74, 7, 14565, 0, 374]
cement s64 [5, 177, 2598, 0, 204]
gas_flaring s64 [0, 0, 2623, 0, 3697]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37]
bunker_fuels s64 [9, 7, 663, 0, 321]
>
DataFrame.tail(df)
#Explorer.DataFrame<
Polars[5 x 10]
year s64 [2014, 2014, 2014, 2014, 2014]
country string ["VIET NAM", "WALLIS AND FUTUNA ISLANDS", "YEMEN", "ZAMBIA", "ZIMBABWE"]
total s64 [45517, 6, 6190, 1228, 3278]
solid_fuel s64 [19246, 0, 137, 132, 2097]
liquid_fuel s64 [12694, 6, 5090, 797, 1005]
gas_fuel s64 [5349, 0, 581, 0, 0]
cement s64 [8229, 0, 381, 299, 177]
gas_flaring s64 [0, 0, 0, 0, 0]
per_capita f64 [0.49, 0.44, 0.24, 0.08, 0.22]
bunker_fuels s64 [761, 1, 153, 33, 9]
>
Verbs and Macros
In Explorer
, like in dplyr
, there are five main verbs to work with DataFrame
s
-
select
-
filter
-
mutate
-
sort
-
summarise
require DataFrame, as: DF
Explorer.DataFrame
Select
DF.select(df, ["year", "country"])
#Explorer.DataFrame<
Polars[1094 x 2]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
>
DF.select(df, &String.ends_with?(&1, "fuel"))
#Explorer.DataFrame<
Polars[1094 x 3]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
>
DF.discard(df, &String.ends_with?(&1, "fuel"))
#Explorer.DataFrame<
Polars[1094 x 7]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
Filter
DF.filter(df, country == "BRAZIL")
#Explorer.DataFrame<
Polars[5 x 10]
year s64 [2010, 2011, 2012, 2013, 2014]
country string ["BRAZIL", "BRAZIL", "BRAZIL", "BRAZIL", "BRAZIL"]
total s64 [114468, 119829, 128178, 137354, 144480]
solid_fuel s64 [15965, 17498, 17165, 18773, 20089]
liquid_fuel s64 [74689, 78849, 84409, 88898, 92454]
gas_fuel s64 [14372, 13778, 16328, 19399, 21297]
cement s64 [8040, 8717, 9428, 9517, 9691]
gas_flaring s64 [1402, 987, 848, 767, 949]
per_capita f64 [0.58, 0.6, 0.63, 0.67, 0.7]
bunker_fuels s64 [5101, 5516, 5168, 4895, 4895]
>
DF.filter(df, country == "ALGERIA" and year > 2012)
#Explorer.DataFrame<
Polars[2 x 10]
year s64 [2013, 2014]
country string ["ALGERIA", "ALGERIA"]
total s64 [36669, 39651]
solid_fuel s64 [198, 149]
liquid_fuel s64 [14170, 14422]
gas_fuel s64 [17863, 20151]
cement s64 [2516, 2856]
gas_flaring s64 [1922, 2073]
per_capita f64 [0.96, 1.02]
bunker_fuels s64 [687, 581]
>
DF.filter_with(df, fn ldf ->
ldf["country"]
|> Series.equal("ALGERIA")
|> Series.and(Series.greater(ldf["year"], 2012))
end)
#Explorer.DataFrame<
Polars[2 x 10]
year s64 [2013, 2014]
country string ["ALGERIA", "ALGERIA"]
total s64 [36669, 39651]
solid_fuel s64 [198, 149]
liquid_fuel s64 [14170, 14422]
gas_fuel s64 [17863, 20151]
cement s64 [2516, 2856]
gas_flaring s64 [1922, 2073]
per_capita f64 [0.96, 1.02]
bunker_fuels s64 [687, 581]
>
Mutate
DF.mutate(df, new_column: solid_fuel + cement)
#Explorer.DataFrame<
Polars[1094 x 11]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
new_column s64 [632, 294, 2930, 0, 204, ...]
>
DF.mutate(
df,
gas_fuel: Series.cast(gas_fuel, :float),
gas_and_liquid_fuel: gas_fuel + liquid_fuel
)
#Explorer.DataFrame<
Polars[1094 x 11]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel f64 [74.0, 7.0, 14565.0, 0.0, 374.0, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
gas_and_liquid_fuel s64 [1675, 960, 26946, 141, 4023, ...]
>
DF.mutate(
df,
%{"gas_fuel" => gas_fuel - 10}
)
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [64, -3, 14555, -10, 364, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
Sort
DF.sort_by(df, year)
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
DF.sort_by(df, asc: total, desc: year)
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2013, 2012, 2011, 2011, ...]
country string ["NIUE", "NIUE", "NIUE", "NIUE", "TUVALU", ...]
total s64 [1, 2, 2, 2, 2, ...]
solid_fuel s64 [0, 0, 0, 0, 0, ...]
liquid_fuel s64 [1, 2, 2, 2, 2, ...]
gas_fuel s64 [0, 0, 0, 0, 0, ...]
cement s64 [0, 0, 0, 0, 0, ...]
gas_flaring s64 [0, 0, 0, 0, 0, ...]
per_capita f64 [0.52, 1.04, 1.04, 1.04, 0.0, ...]
bunker_fuels s64 [0, 0, 0, 0, 0, ...]
>
DF.sort_by(df, asc: window_sum(total, 2))
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2011, 2012, 2010, 2011, ...]
country string ["FEDERATED STATES OF MICRONESIA", "FEDERATED STATES OF MICRONESIA", "FEDERATED STATES OF MICRONESIA", "TUVALU", "TUVALU", ...]
total s64 [31, 33, 37, 2, 2, ...]
solid_fuel s64 [0, 0, 0, 0, 0, ...]
liquid_fuel s64 [31, 33, 37, 2, 2, ...]
gas_fuel s64 [0, 0, 0, 0, 0, ...]
cement s64 [0, 0, 0, 0, 0, ...]
gas_flaring s64 [0, 0, 0, 0, 0, ...]
per_capita f64 [0.3, 0.32, 0.36, 0.0, 0.0, ...]
bunker_fuels s64 [1, 1, 1, 0, 0, ...]
>
Distinct
DF.distinct(df, ["year", "country"])
#Explorer.DataFrame<
Polars[1094 x 2]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
>
DF.distinct(df, ["country"], keep_all: true)
#Explorer.DataFrame<
Polars[222 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
DF.rename(df, year: "year_test")
#Explorer.DataFrame<
Polars[1094 x 10]
year_test s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
DF.rename_with(df, &(&1 <> "_test"))
#Explorer.DataFrame<
Polars[1094 x 10]
year_test s64 [2010, 2010, 2010, 2010, 2010, ...]
country_test string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total_test s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel_test s64 [627, 117, 332, 0, 0, ...]
liquid_fuel_test s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel_test s64 [74, 7, 14565, 0, 374, ...]
cement_test s64 [5, 177, 2598, 0, 204, ...]
gas_flaring_test s64 [0, 0, 2623, 0, 3697, ...]
per_capita_test f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels_test s64 [9, 7, 663, 0, 321, ...]
>
Dummies
DF.dummies(df, ["year"])
#Explorer.DataFrame<
Polars[1094 x 5]
year_2010 u8 [1, 1, 1, 1, 1, ...]
year_2011 u8 [0, 0, 0, 0, 0, ...]
year_2012 u8 [0, 0, 0, 0, 0, ...]
year_2013 u8 [0, 0, 0, 0, 0, ...]
year_2014 u8 [0, 0, 0, 0, 0, ...]
>
DF.dummies(df, ["country"])
#Explorer.DataFrame<
Polars[1094 x 222]
country_AFGHANISTAN u8 [1, 0, 0, 0, 0, ...]
country_ALBANIA u8 [0, 1, 0, 0, 0, ...]
country_ALGERIA u8 [0, 0, 1, 0, 0, ...]
country_ANDORRA u8 [0, 0, 0, 1, 0, ...]
country_ANGOLA u8 [0, 0, 0, 0, 1, ...]
country_ANGUILLA u8 [0, 0, 0, 0, 0, ...]
country_ANTIGUA & BARBUDA u8 [0, 0, 0, 0, 0, ...]
country_ARGENTINA u8 [0, 0, 0, 0, 0, ...]
country_ARMENIA u8 [0, 0, 0, 0, 0, ...]
country_ARUBA u8 [0, 0, 0, 0, 0, ...]
country_AUSTRALIA u8 [0, 0, 0, 0, 0, ...]
country_AUSTRIA u8 [0, 0, 0, 0, 0, ...]
country_AZERBAIJAN u8 [0, 0, 0, 0, 0, ...]
country_BAHAMAS u8 [0, 0, 0, 0, 0, ...]
country_BAHRAIN u8 [0, 0, 0, 0, 0, ...]
country_BANGLADESH u8 [0, 0, 0, 0, 0, ...]
country_BARBADOS u8 [0, 0, 0, 0, 0, ...]
country_BELARUS u8 [0, 0, 0, 0, 0, ...]
country_BELGIUM u8 [0, 0, 0, 0, 0, ...]
country_BELIZE u8 [0, 0, 0, 0, 0, ...]
country_BENIN u8 [0, 0, 0, 0, 0, ...]
country_BERMUDA u8 [0, 0, 0, 0, 0, ...]
country_BHUTAN u8 [0, 0, 0, 0, 0, ...]
country_BOSNIA & HERZEGOVINA u8 [0, 0, 0, 0, 0, ...]
country_BOTSWANA u8 [0, 0, 0, 0, 0, ...]
country_BRAZIL u8 [0, 0, 0, 0, 0, ...]
country_BRITISH VIRGIN ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_BRUNEI (DARUSSALAM) u8 [0, 0, 0, 0, 0, ...]
country_BULGARIA u8 [0, 0, 0, 0, 0, ...]
country_BURKINA FASO u8 [0, 0, 0, 0, 0, ...]
country_BURUNDI u8 [0, 0, 0, 0, 0, ...]
country_CAMBODIA u8 [0, 0, 0, 0, 0, ...]
country_CANADA u8 [0, 0, 0, 0, 0, ...]
country_CAPE VERDE u8 [0, 0, 0, 0, 0, ...]
country_CAYMAN ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_CENTRAL AFRICAN REPUBLIC u8 [0, 0, 0, 0, 0, ...]
country_CHAD u8 [0, 0, 0, 0, 0, ...]
country_CHILE u8 [0, 0, 0, 0, 0, ...]
country_CHINA (MAINLAND) u8 [0, 0, 0, 0, 0, ...]
country_COLOMBIA u8 [0, 0, 0, 0, 0, ...]
country_COMOROS u8 [0, 0, 0, 0, 0, ...]
country_CONGO u8 [0, 0, 0, 0, 0, ...]
country_COOK ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_COSTA RICA u8 [0, 0, 0, 0, 0, ...]
country_COTE D IVOIRE u8 [0, 0, 0, 0, 0, ...]
country_CROATIA u8 [0, 0, 0, 0, 0, ...]
country_CUBA u8 [0, 0, 0, 0, 0, ...]
country_CYPRUS u8 [0, 0, 0, 0, 0, ...]
country_CZECH REPUBLIC u8 [0, 0, 0, 0, 0, ...]
country_DEMOCRATIC PEOPLE S REPUBLIC OF KOREA u8 [0, 0, 0, 0, 0, ...]
country_DEMOCRATIC REPUBLIC OF THE CONGO (FORMERLY ZAIRE) u8 [0, 0, 0, 0, 0, ...]
country_DENMARK u8 [0, 0, 0, 0, 0, ...]
country_DJIBOUTI u8 [0, 0, 0, 0, 0, ...]
country_DOMINICA u8 [0, 0, 0, 0, 0, ...]
country_DOMINICAN REPUBLIC u8 [0, 0, 0, 0, 0, ...]
country_ECUADOR u8 [0, 0, 0, 0, 0, ...]
country_EGYPT u8 [0, 0, 0, 0, 0, ...]
country_EL SALVADOR u8 [0, 0, 0, 0, 0, ...]
country_EQUATORIAL GUINEA u8 [0, 0, 0, 0, 0, ...]
country_ERITREA u8 [0, 0, 0, 0, 0, ...]
country_ESTONIA u8 [0, 0, 0, 0, 0, ...]
country_ETHIOPIA u8 [0, 0, 0, 0, 0, ...]
country_FAEROE ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_FALKLAND ISLANDS (MALVINAS) u8 [0, 0, 0, 0, 0, ...]
country_FEDERATED STATES OF MICRONESIA u8 [0, 0, 0, 0, 0, ...]
country_FIJI u8 [0, 0, 0, 0, 0, ...]
country_FINLAND u8 [0, 0, 0, 0, 0, ...]
country_FRANCE (INCLUDING MONACO) u8 [0, 0, 0, 0, 0, ...]
country_FRENCH GUIANA u8 [0, 0, 0, 0, 0, ...]
country_FRENCH POLYNESIA u8 [0, 0, 0, 0, 0, ...]
country_GABON u8 [0, 0, 0, 0, 0, ...]
country_GAMBIA u8 [0, 0, 0, 0, 0, ...]
country_GEORGIA u8 [0, 0, 0, 0, 0, ...]
country_GERMANY u8 [0, 0, 0, 0, 0, ...]
country_GHANA u8 [0, 0, 0, 0, 0, ...]
country_GIBRALTAR u8 [0, 0, 0, 0, 0, ...]
country_GREECE u8 [0, 0, 0, 0, 0, ...]
country_GREENLAND u8 [0, 0, 0, 0, 0, ...]
country_GRENADA u8 [0, 0, 0, 0, 0, ...]
country_GUADELOUPE u8 [0, 0, 0, 0, 0, ...]
country_GUATEMALA u8 [0, 0, 0, 0, 0, ...]
country_GUINEA u8 [0, 0, 0, 0, 0, ...]
country_GUINEA BISSAU u8 [0, 0, 0, 0, 0, ...]
country_GUYANA u8 [0, 0, 0, 0, 0, ...]
country_HAITI u8 [0, 0, 0, 0, 0, ...]
country_HONDURAS u8 [0, 0, 0, 0, 0, ...]
country_HONG KONG SPECIAL ADMINSTRATIVE REGION OF CHINA u8 [0, 0, 0, 0, 0, ...]
country_HUNGARY u8 [0, 0, 0, 0, 0, ...]
country_ICELAND u8 [0, 0, 0, 0, 0, ...]
country_INDIA u8 [0, 0, 0, 0, 0, ...]
country_INDONESIA u8 [0, 0, 0, 0, 0, ...]
country_IRAQ u8 [0, 0, 0, 0, 0, ...]
country_IRELAND u8 [0, 0, 0, 0, 0, ...]
country_ISLAMIC REPUBLIC OF IRAN u8 [0, 0, 0, 0, 0, ...]
country_ISRAEL u8 [0, 0, 0, 0, 0, ...]
country_ITALY (INCLUDING SAN MARINO) u8 [0, 0, 0, 0, 0, ...]
country_JAMAICA u8 [0, 0, 0, 0, 0, ...]
country_JAPAN u8 [0, 0, 0, 0, 0, ...]
country_JORDAN u8 [0, 0, 0, 0, 0, ...]
country_KAZAKHSTAN u8 [0, 0, 0, 0, 0, ...]
country_KENYA u8 [0, 0, 0, 0, 0, ...]
country_KIRIBATI u8 [0, 0, 0, 0, 0, ...]
country_KUWAIT u8 [0, 0, 0, 0, 0, ...]
country_KYRGYZSTAN u8 [0, 0, 0, 0, 0, ...]
country_LAO PEOPLE S DEMOCRATIC REPUBLIC u8 [0, 0, 0, 0, 0, ...]
country_LATVIA u8 [0, 0, 0, 0, 0, ...]
country_LEBANON u8 [0, 0, 0, 0, 0, ...]
country_LESOTHO u8 [0, 0, 0, 0, 0, ...]
country_LIBERIA u8 [0, 0, 0, 0, 0, ...]
country_LIBYAN ARAB JAMAHIRIYAH u8 [0, 0, 0, 0, 0, ...]
country_LIECHTENSTEIN u8 [0, 0, 0, 0, 0, ...]
country_LITHUANIA u8 [0, 0, 0, 0, 0, ...]
country_LUXEMBOURG u8 [0, 0, 0, 0, 0, ...]
country_MACAU SPECIAL ADMINSTRATIVE REGION OF CHINA u8 [0, 0, 0, 0, 0, ...]
country_MACEDONIA u8 [0, 0, 0, 0, 0, ...]
country_MADAGASCAR u8 [0, 0, 0, 0, 0, ...]
country_MALAWI u8 [0, 0, 0, 0, 0, ...]
country_MALAYSIA u8 [0, 0, 0, 0, 0, ...]
country_MALDIVES u8 [0, 0, 0, 0, 0, ...]
country_MALI u8 [0, 0, 0, 0, 0, ...]
country_MALTA u8 [0, 0, 0, 0, 0, ...]
country_MARSHALL ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_MARTINIQUE u8 [0, 0, 0, 0, 0, ...]
country_MAURITANIA u8 [0, 0, 0, 0, 0, ...]
country_MAURITIUS u8 [0, 0, 0, 0, 0, ...]
country_MEXICO u8 [0, 0, 0, 0, 0, ...]
country_MONGOLIA u8 [0, 0, 0, 0, 0, ...]
country_MONTENEGRO u8 [0, 0, 0, 0, 0, ...]
country_MONTSERRAT u8 [0, 0, 0, 0, 0, ...]
country_MOROCCO u8 [0, 0, 0, 0, 0, ...]
country_MOZAMBIQUE u8 [0, 0, 0, 0, 0, ...]
country_MYANMAR (FORMERLY BURMA) u8 [0, 0, 0, 0, 0, ...]
country_NAMIBIA u8 [0, 0, 0, 0, 0, ...]
country_NAURU u8 [0, 0, 0, 0, 0, ...]
country_NEPAL u8 [0, 0, 0, 0, 0, ...]
country_NETHERLAND ANTILLES u8 [0, 0, 0, 0, 0, ...]
country_NETHERLANDS u8 [0, 0, 0, 0, 0, ...]
country_NEW CALEDONIA u8 [0, 0, 0, 0, 0, ...]
country_NEW ZEALAND u8 [0, 0, 0, 0, 0, ...]
country_NICARAGUA u8 [0, 0, 0, 0, 0, ...]
country_NIGER u8 [0, 0, 0, 0, 0, ...]
country_NIGERIA u8 [0, 0, 0, 0, 0, ...]
country_NIUE u8 [0, 0, 0, 0, 0, ...]
country_NORWAY u8 [0, 0, 0, 0, 0, ...]
country_OCCUPIED PALESTINIAN TERRITORY u8 [0, 0, 0, 0, 0, ...]
country_OMAN u8 [0, 0, 0, 0, 0, ...]
country_PAKISTAN u8 [0, 0, 0, 0, 0, ...]
country_PALAU u8 [0, 0, 0, 0, 0, ...]
country_PANAMA u8 [0, 0, 0, 0, 0, ...]
country_PAPUA NEW GUINEA u8 [0, 0, 0, 0, 0, ...]
country_PARAGUAY u8 [0, 0, 0, 0, 0, ...]
country_PERU u8 [0, 0, 0, 0, 0, ...]
country_PHILIPPINES u8 [0, 0, 0, 0, 0, ...]
country_PLURINATIONAL STATE OF BOLIVIA u8 [0, 0, 0, 0, 0, ...]
country_POLAND u8 [0, 0, 0, 0, 0, ...]
country_PORTUGAL u8 [0, 0, 0, 0, 0, ...]
country_QATAR u8 [0, 0, 0, 0, 0, ...]
country_REPUBLIC OF CAMEROON u8 [0, 0, 0, 0, 0, ...]
country_REPUBLIC OF KOREA u8 [0, 0, 0, 0, 0, ...]
country_REPUBLIC OF MOLDOVA u8 [0, 0, 0, 0, 0, ...]
country_REUNION u8 [0, 0, 0, 0, 0, ...]
country_ROMANIA u8 [0, 0, 0, 0, 0, ...]
country_RUSSIAN FEDERATION u8 [0, 0, 0, 0, 0, ...]
country_RWANDA u8 [0, 0, 0, 0, 0, ...]
country_SAINT HELENA u8 [0, 0, 0, 0, 0, ...]
country_SAINT LUCIA u8 [0, 0, 0, 0, 0, ...]
country_SAMOA u8 [0, 0, 0, 0, 0, ...]
country_SAO TOME & PRINCIPE u8 [0, 0, 0, 0, 0, ...]
country_SAUDI ARABIA u8 [0, 0, 0, 0, 0, ...]
country_SENEGAL u8 [0, 0, 0, 0, 0, ...]
country_SERBIA u8 [0, 0, 0, 0, 0, ...]
country_SEYCHELLES u8 [0, 0, 0, 0, 0, ...]
country_SIERRA LEONE u8 [0, 0, 0, 0, 0, ...]
country_SINGAPORE u8 [0, 0, 0, 0, 0, ...]
country_SLOVAKIA u8 [0, 0, 0, 0, 0, ...]
country_SLOVENIA u8 [0, 0, 0, 0, 0, ...]
country_SOLOMON ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_SOMALIA u8 [0, 0, 0, 0, 0, ...]
country_SOUTH AFRICA u8 [0, 0, 0, 0, 0, ...]
country_SPAIN u8 [0, 0, 0, 0, 0, ...]
country_SRI LANKA u8 [0, 0, 0, 0, 0, ...]
country_ST. KITTS-NEVIS u8 [0, 0, 0, 0, 0, ...]
country_ST. PIERRE & MIQUELON u8 [0, 0, 0, 0, 0, ...]
country_ST. VINCENT & THE GRENADINES u8 [0, 0, 0, 0, 0, ...]
country_SUDAN u8 [0, 0, 0, 0, 0, ...]
country_SURINAME u8 [0, 0, 0, 0, 0, ...]
country_SWAZILAND u8 [0, 0, 0, 0, 0, ...]
country_SWEDEN u8 [0, 0, 0, 0, 0, ...]
country_SWITZERLAND u8 [0, 0, 0, 0, 0, ...]
country_SYRIAN ARAB REPUBLIC u8 [0, 0, 0, 0, 0, ...]
country_TAIWAN u8 [0, 0, 0, 0, 0, ...]
country_TAJIKISTAN u8 [0, 0, 0, 0, 0, ...]
country_THAILAND u8 [0, 0, 0, 0, 0, ...]
country_TIMOR-LESTE (FORMERLY EAST TIMOR) u8 [0, 0, 0, 0, 0, ...]
country_TOGO u8 [0, 0, 0, 0, 0, ...]
country_TONGA u8 [0, 0, 0, 0, 0, ...]
country_TRINIDAD AND TOBAGO u8 [0, 0, 0, 0, 0, ...]
country_TUNISIA u8 [0, 0, 0, 0, 0, ...]
country_TURKEY u8 [0, 0, 0, 0, 0, ...]
country_TURKMENISTAN u8 [0, 0, 0, 0, 0, ...]
country_TURKS AND CAICOS ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_TUVALU u8 [0, 0, 0, 0, 0, ...]
country_UGANDA u8 [0, 0, 0, 0, 0, ...]
country_UKRAINE u8 [0, 0, 0, 0, 0, ...]
country_UNITED ARAB EMIRATES u8 [0, 0, 0, 0, 0, ...]
country_UNITED KINGDOM u8 [0, 0, 0, 0, 0, ...]
country_UNITED REPUBLIC OF TANZANIA u8 [0, 0, 0, 0, 0, ...]
country_UNITED STATES OF AMERICA u8 [0, 0, 0, 0, 0, ...]
country_URUGUAY u8 [0, 0, 0, 0, 0, ...]
country_UZBEKISTAN u8 [0, 0, 0, 0, 0, ...]
country_VANUATU u8 [0, 0, 0, 0, 0, ...]
country_VENEZUELA u8 [0, 0, 0, 0, 0, ...]
country_VIET NAM u8 [0, 0, 0, 0, 0, ...]
country_WALLIS AND FUTUNA ISLANDS u8 [0, 0, 0, 0, 0, ...]
country_YEMEN u8 [0, 0, 0, 0, 0, ...]
country_ZAMBIA u8 [0, 0, 0, 0, 0, ...]
country_ZIMBABWE u8 [0, 0, 0, 0, 0, ...]
country_BONAIRE, SAINT EUSTATIUS, AND SABA u8 [0, 0, 0, 0, 0, ...]
country_CURACAO u8 [0, 0, 0, 0, 0, ...]
country_REPUBLIC OF SOUTH SUDAN u8 [0, 0, 0, 0, 0, ...]
country_REPUBLIC OF SUDAN u8 [0, 0, 0, 0, 0, ...]
country_SAINT MARTIN (DUTCH PORTION) u8 [0, 0, 0, 0, 0, ...]
>
Sampling
DF.sample(df, 10)
#Explorer.DataFrame<
Polars[10 x 10]
year s64 [2010, 2013, 2013, 2012, 2010, ...]
country string ["MADAGASCAR", "TOGO", "COLOMBIA", "MYANMAR (FORMERLY BURMA)", "CYPRUS", ...]
total s64 [534, 725, 24441, 3019, 2102, ...]
solid_fuel s64 [29, 0, 3581, 301, 19, ...]
liquid_fuel s64 [483, 470, 11564, 1613, 1901, ...]
gas_fuel s64 [0, 11, 6306, 976, 0, ...]
cement s64 [22, 244, 1530, 125, 181, ...]
gas_flaring s64 [0, 0, 1459, 3, 0, ...]
per_capita f64 [0.03, 0.1, 0.52, 0.06, 1.9, ...]
bunker_fuels s64 [54, 77, 1547, 31, 391, ...]
>
DF.sample(df, 0.4)
#Explorer.DataFrame<
Polars[437 x 10]
year s64 [2013, 2013, 2010, 2012, 2010, ...]
country string ["PALAU", "NAURU", "NIUE", "GREECE", "MACEDONIA", ...]
total s64 [70, 12, 1, 21828, 2346, ...]
solid_fuel s64 [0, 0, 0, 8760, 1426, ...]
liquid_fuel s64 [70, 12, 1, 10099, 749, ...]
gas_fuel s64 [0, 0, 0, 2287, 60, ...]
cement s64 [0, 0, 0, 681, 112, ...]
gas_flaring s64 [0, 0, 0, 0, 0, ...]
per_capita f64 [3.36, 1.16, 0.52, 1.96, 1.14, ...]
bunker_fuels s64 [15, 4, 0, 2545, 6, ...]
>
DF.sample(df, 10000, replace: true)
#Explorer.DataFrame<
Polars[10000 x 10]
year s64 [2010, 2013, 2014, 2013, 2013, ...]
country string ["MALAYSIA", "LESOTHO", "UZBEKISTAN", "CROATIA", "ICELAND", ...]
total s64 [59579, 664, 28692, 4786, 518, ...]
solid_fuel s64 [15103, 500, 1677, 699, 101, ...]
liquid_fuel s64 [20006, 164, 2086, 2344, 417, ...]
gas_fuel s64 [16596, 0, 23929, 1425, 0, ...]
cement s64 [2688, 0, 1000, 317, 0, ...]
gas_flaring s64 [5186, 0, 0, 0, 0, ...]
per_capita f64 [2.12, 0.32, 0.97, 1.12, 1.59, ...]
bunker_fuels s64 [1432, 0, 0, 98, 156, ...]
>
Pull and Slice
Slicing and dicing can be done with the Access
protocol or with explicit pull/slice/take functions
df["year"]
#Explorer.Series<
Polars[1094]
s64 [2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, ...]
>
DF.pull(df, "year")
#Explorer.Series<
Polars[1094]
s64 [2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, ...]
>
df[["year", "country"]]
#Explorer.DataFrame<
Polars[1094 x 2]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
>
DF.slice(df, [1, 20, 50])
#Explorer.DataFrame<
Polars[3 x 10]
year s64 [2010, 2010, 2010]
country string ["ALBANIA", "BENIN", "DEMOCRATIC REPUBLIC OF THE CONGO (FORMERLY ZAIRE)"]
total s64 [1254, 1388, 551]
solid_fuel s64 [117, 0, 0]
liquid_fuel s64 [953, 1211, 471]
gas_fuel s64 [7, 0, 12]
cement s64 [177, 177, 67]
gas_flaring s64 [0, 0, 0]
per_capita f64 [0.43, 0.15, 0.01]
bunker_fuels s64 [7, 127, 126]
>
DF.slice(df, -10, 5)
#Explorer.DataFrame<
Polars[5 x 10]
year s64 [2014, 2014, 2014, 2014, 2014]
country string ["UNITED STATES OF AMERICA", "URUGUAY", "UZBEKISTAN", "VANUATU", "VENEZUELA"]
total s64 [1432855, 1840, 28692, 42, 50510]
solid_fuel s64 [450047, 2, 1677, 0, 204]
liquid_fuel s64 [576531, 1700, 2086, 42, 28445]
gas_fuel s64 [390719, 25, 23929, 0, 12731]
cement s64 [11314, 112, 1000, 0, 1088]
gas_flaring s64 [4244, 0, 0, 0, 8042]
per_capita f64 [4.43, 0.54, 0.97, 0.16, 1.65]
bunker_fuels s64 [30722, 251, 0, 10, 1256]
>
DF.slice(df, 10, 5)
#Explorer.DataFrame<
Polars[5 x 10]
year s64 [2010, 2010, 2010, 2010, 2010]
country string ["AUSTRALIA", "AUSTRIA", "AZERBAIJAN", "BAHAMAS", "BAHRAIN"]
total s64 [106589, 18408, 8366, 451, 7981]
solid_fuel s64 [56257, 3537, 6, 1, 0]
liquid_fuel s64 [31308, 9218, 2373, 450, 1123]
gas_fuel s64 [17763, 5073, 4904, 0, 6696]
cement s64 [1129, 579, 174, 0, 163]
gas_flaring s64 [132, 0, 909, 0, 0]
per_capita f64 [4.81, 2.19, 0.92, 1.25, 6.33]
bunker_fuels s64 [3307, 575, 398, 179, 545]
>
DF.slice(df, 12..42)
#Explorer.DataFrame<
Polars[31 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AZERBAIJAN", "BAHAMAS", "BAHRAIN", "BANGLADESH", "BARBADOS", ...]
total s64 [8366, 451, 7981, 16345, 403, ...]
solid_fuel s64 [6, 1, 0, 839, 0, ...]
liquid_fuel s64 [2373, 450, 1123, 2881, 363, ...]
gas_fuel s64 [4904, 0, 6696, 10753, 8, ...]
cement s64 [174, 0, 163, 1873, 31, ...]
gas_flaring s64 [909, 0, 0, 0, 1, ...]
per_capita f64 [0.92, 1.25, 6.33, 0.11, 1.44, ...]
bunker_fuels s64 [398, 179, 545, 313, 108, ...]
>
Joins
Joining is fast and easy. Specify the columns to join on and how to join. Polars even supports cartesian (cross) joins, so Explorer
does too
df1 = DF.select(df, ["year", "country", "total"])
df2 = DF.select(df, ["year", "country", "cement"])
DF.join(df1, df2)
#Explorer.DataFrame<
Polars[1094 x 4]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
>
df3 = df |> DF.select(["year", "cement"]) |> DF.slice(0, 500)
#Explorer.DataFrame<
Polars[500 x 2]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
>
Grouping
Explorer
supports groupby operations. These operations are limited based on what is possible in Polars, but they do most of what you need to do.
grouped = DF.group_by(df, ["country"])
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
DF.groups(grouped)
["country"]
DF.ungroup(grouped)
#Explorer.DataFrame<
Polars[1094 x 10]
year s64 [2010, 2010, 2010, 2010, 2010, ...]
country string ["AFGHANISTAN", "ALBANIA", "ALGERIA", "ANDORRA", "ANGOLA", ...]
total s64 [2308, 1254, 32500, 141, 7924, ...]
solid_fuel s64 [627, 117, 332, 0, 0, ...]
liquid_fuel s64 [1601, 953, 12381, 141, 3649, ...]
gas_fuel s64 [74, 7, 14565, 0, 374, ...]
cement s64 [5, 177, 2598, 0, 204, ...]
gas_flaring s64 [0, 0, 2623, 0, 3697, ...]
per_capita f64 [0.08, 0.43, 0.9, 1.68, 0.37, ...]
bunker_fuels s64 [9, 7, 663, 0, 321, ...]
>
What most really care about is aggregation. Which country has the max per_capita
value?
grouped
|> DF.summarise(max_per_capita: max(per_capita))
|> DF.sort_by(desc: max_per_capita)
#Explorer.DataFrame<
Polars[222 x 2]
country string ["QATAR", "CURACAO", "TRINIDAD AND TOBAGO", "KUWAIT", "NETHERLAND ANTILLES", ...]
max_per_capita f64 [13.54, 10.72, 9.84, 8.16, 7.45, ...]
>