State of the art Neural Networks in Elixir
alias AIPlayground.KinoUI
My Elixir History
- first came into contact with Elixir in 2014, while starting my Bachelor’s Thesis
- since then I have used Elixir for most of my personal and university projects
- used it for all my thesis projects (Bachelor’s, Master’s, PhD)
- first professional experience at 8x8, working on a large online communication tool
- currently working at Octoscreen, on a web project involving crawling and AI
Agenda
- Text generation
- Text classification
- Token classification
- Image classification
- Text to image
- Translations and summarization
- Live demo on Neural Network building
- Some models we cannot run on the BEAM yet
- Bonus model
Intro
- neural networks, models
- we are mainly talking about open source models from Huggingface
- possible ways to use ML models in Elixir:
- implement custom Neural Networks using Axon (on top of Nx)
- use Bumblebee (on top of Axon)
- use native code (especially Rust with rust-bert)
- export model from other language using ONNX
- using REST APIs
- call Python scripts using system commands
- apart from the live demo, we will not focus on code; this is less a technical presentation and more a demonstration
- there is a repo with everything related to this presentation
- after the presentation we can discuss things in more detail
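# Text-to-image pipeline (used in section 5): load the Stable Diffusion components
# from the Hugging Face Hub and combine them into an Nx.Serving.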
repository_id = "CompVis/stable-diffusion-v1-4"
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
{:ok, clip} =
Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"},
log_params_diff: false
)
{:ok, unet} =
Bumblebee.load_model({:hf, repository_id, subdir: "unet"},
params_filename: "diffusion_pytorch_model.bin",
log_params_diff: false
)
{:ok, vae} =
Bumblebee.load_model({:hf, repository_id, subdir: "vae"},
architecture: :decoder,
params_filename: "diffusion_pytorch_model.bin",
log_params_diff: false
)
{:ok, scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, repository_id, subdir: "feature_extractor"})
{:ok, safety_checker} =
Bumblebee.load_model({:hf, repository_id, subdir: "safety_checker"},
log_params_diff: false
)
serving =
Bumblebee.Diffusion.StableDiffusion.text_to_image(clip, unet, vae, tokenizer, scheduler,
num_steps: 10,
num_images_per_prompt: 1,
safety_checker: safety_checker,
safety_checker_featurizer: featurizer,
compile: [batch_size: 1, sequence_length: 50],
defn_options: [compiler: EXLA]
)
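# Simple Kino UI: a prompt form; on submit, run the serving and render the generated image(s).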
text_input =
Kino.Input.textarea("Text",
default: "numbat, forest, high quality, detailed, digital art"
)
form = Kino.Control.form([text: text_input], submit: "Run")
frame = Kino.Frame.new()
form
|> Kino.Control.stream()
|> Kino.listen(fn %{data: %{text: text}} ->
Kino.Frame.render(frame, Kino.Markdown.new("Running..."))
output = Nx.Serving.run(serving, text)
for result <- output.results do
Kino.Image.new(result.image)
end
|> Kino.Layout.grid(columns: 2)
|> then(&Kino.Frame.render(frame, &1))
end)
Kino.Layout.grid([form, frame], boxed: true, gap: 16)
1. Text generation using GPT2
- Generative Pre-trained Transformer
- an earlier, smaller member of the GPT family that ChatGPT builds on
- using the Bumblebee Elixir library
model_runner = &AIPlayground.run_gpt2/1
[
KinoUI.text_to_text("Good example", "What is the capital of Romania?", model_runner),
KinoUI.text_to_text("Missing ending", "I was driving and then", model_runner),
KinoUI.text_to_text("Same ending", "I was walking and", model_runner),
KinoUI.text_to_text("Just repeating", "I am done.", model_runner),
KinoUI.text_to_text("Again repeating", "I am 18 years old and", model_runner)
]
|> List.flatten()
|> KinoUI.build_grid_layout()
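For reference, a minimal sketch of what a helper like AIPlayground.run_gpt2/1 could look like with Bumblebee; the exact option and function names depend on the Bumblebee version, so treat this as an assumption rather than the demo’s actual code.

defmodule AIPlayground.GPT2Sketch do
  # Hypothetical helper, roughly what AIPlayground.run_gpt2/1 is assumed to do.
  def run(text) do
    {:ok, model_info} = Bumblebee.load_model({:hf, "gpt2"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})
    {:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})

    serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

    # Returns a map like %{results: [%{text: generated_text}]}
    Nx.Serving.run(serving, text)
  end
end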
2. Text classification using RoBERTa
- RoBERTa = A Robustly Optimized BERT Pretraining Approach - training recipe to improve BERT
- from Facebook AI
- BERT = Bidirectional Encoder Representations from Transformers
- a BERT variant; BERT-based models are used in Google’s Search Engine and Microsoft’s Bing Search
- using the Bumblebee Elixir library
model_runner = &AIPlayground.run_roberta/1
[
KinoUI.text_to_scored_list("Surprise", "Oh wow! I did not know that!", model_runner),
KinoUI.text_to_scored_list("Joy", "That is so nice, I like it!", model_runner),
KinoUI.text_to_scored_list("Sadness", "I am all alone", model_runner),
KinoUI.text_to_scored_list(
"Tricky joy",
"I'm reading a book on anti-gravity. It's impossible to put down!",
model_runner
),
KinoUI.text_to_scored_list(
"Tricky sad",
"Why don't some couples go to the gym? Because some relationships don't work out.",
model_runner
)
]
|> List.flatten()
|> KinoUI.build_grid_layout()
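For reference, a rough sketch of a helper like AIPlayground.run_roberta/1 built on Bumblebee’s text classification serving; the emotion checkpoint named here is an assumption, not necessarily the one used in the demo.

defmodule AIPlayground.EmotionSketch do
  # Hypothetical helper, roughly what AIPlayground.run_roberta/1 is assumed to do.
  def run(text) do
    # Assumed checkpoint; any RoBERTa-based emotion classifier works the same way.
    repo = {:hf, "j-hartmann/emotion-english-distilroberta-base"}

    {:ok, model_info} = Bumblebee.load_model(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

    serving = Bumblebee.Text.text_classification(model_info, tokenizer)

    # Returns a map like %{predictions: [%{label: label, score: score}, ...]}
    Nx.Serving.run(serving, text)
  end
end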
3. Token classification using BERT NER
- using the Bumblebee Elixir library
model_runner = &AIPlayground.run_bert_ner/1
[
KinoUI.text_to_highlighted_text(
"Good example",
"Rachel Green works at Ralph Lauren in New York City in the sitcom Friends.",
model_runner
),
KinoUI.text_to_highlighted_text(
"Another good example",
"Apple Inc., founded by Steve Jobs and Steve Wozniak, is headquartered in Cupertino, California.",
model_runner
),
KinoUI.text_to_highlighted_text(
"Different language",
"Albert Einstein wurde am 14. März 1879 in Ulm, Deutschland geboren.",
model_runner
)
]
|> List.flatten()
|> KinoUI.build_grid_layout()
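A possible shape for AIPlayground.run_bert_ner/1, sketched with Bumblebee’s token classification serving; the checkpoint name is an assumption.

defmodule AIPlayground.NerSketch do
  # Hypothetical helper, roughly what AIPlayground.run_bert_ner/1 is assumed to do.
  def run(text) do
    repo = {:hf, "dslim/bert-base-NER"}

    {:ok, model_info} = Bumblebee.load_model(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

    # :same merges sub-word tokens that belong to the same entity
    serving = Bumblebee.Text.token_classification(model_info, tokenizer, aggregation: :same)

    # Returns a map like %{entities: [%{label: ..., phrase: ..., score: ...}, ...]}
    Nx.Serving.run(serving, text)
  end
end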
4. Image classification using ResNet
- Residual Networks
- widely used as a backbone in computer vision, including face recognition systems
- using the Bumblebee Elixir library
model_runner = &AIPlayground.run_res_net/1
[
KinoUI.image_to_scored_list("Animals", model_runner)
]
|> List.flatten()
|> KinoUI.build_grid_layout()
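Similarly, a sketch of what AIPlayground.run_res_net/1 might do with Bumblebee’s image classification serving; the input is assumed to be an image already decoded into an Nx tensor.

defmodule AIPlayground.ResNetSketch do
  # Hypothetical helper, roughly what AIPlayground.run_res_net/1 is assumed to do.
  # `image` is an Nx tensor with shape {height, width, channels}.
  def run(image) do
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    serving = Bumblebee.Vision.image_classification(model_info, featurizer, top_k: 5)

    # Returns a map like %{predictions: [%{label: ..., score: ...}, ...]}
    Nx.Serving.run(serving, image)
  end
end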
5. Text to image using Stable Diffusion
- a relatively new model (from 2022), proposed as an open competitor to other generative image models
- its denoising network is built from residual (ResNet-style) blocks
- using the Bumblebee Elixir library
- needs GPU
"numbat, forest, high quality, detailed, digital art"
|> KinoUI.text_to_image_layout(&AIPlayground.run_stable_diffusion/1)
6. Text translation and summarization using T5
- we will use a NIF (Native Implemented Function) built in Rust
- using the rustler Elixir library to access the rust-bert library
- translate English to Romanian
translation_model_runner = &AIPlayground.translate_en_to_ro/1
summarization_model_runner = &AIPlayground.summarize/1
summarization_positive_example = """
After the sound and the fury, weeks of demonstrations and anguished calls for racial justice, the man whose death gave rise to an international movement, and whose last words — “I can’t breathe” — have been a rallying cry, will be laid to rest on Tuesday at a private funeral in Houston.
George Floyd, who was 46, will then be buried in a grave next to his mother's.
The service, scheduled to begin at 11 a.m. at the Fountain of Praise church, comes after five days of public memorials in Minneapolis, North Carolina and Houston and two weeks after a Minneapolis police officer was caught on video pressing his knee into Mr. Floyd’s neck for nearly nine minutes before Mr. Floyd died.
That officer, Derek Chauvin, has been charged with second-degree murder and second-degree manslaughter.
His bail was set at $1.25 million in a court appearance on Monday.
The outpouring of anger and outrage after Mr. Floyd's death — and the speed at which protests spread from tense, chaotic demonstrations in the city where he died to an international movement from Rome to Rio de Janeiro — has reflected the depth of frustration borne of years of watching black people die at the hands of the police or vigilantes while calls for change went unmet.
"""
[
KinoUI.text_to_text("Good Translation", "Good morning!", translation_model_runner),
KinoUI.text_to_text(
"Good summarization",
summarization_positive_example,
summarization_model_runner
),
KinoUI.text_to_text("Bad translation", "Aloha!", translation_model_runner),
KinoUI.text_to_text("Bad summarization", "Hello", summarization_model_runner)
]
|> List.flatten()
|> KinoUI.build_grid_layout()
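For reference, the Elixir side of such a NIF could look roughly like the Rustler module below; the module, crate, and function names are assumptions, and the actual translation/summarization logic lives in the Rust crate wrapping rust-bert.

defmodule AIPlayground.RustBert do
  # Hypothetical NIF module; the Rust crate (built on rust-bert) provides the real implementations.
  use Rustler, otp_app: :ai_playground, crate: "rustbert_nif"

  # These stubs are replaced by the native functions when the NIF library is loaded.
  def translate_en_to_ro(_text), do: :erlang.nif_error(:nif_not_loaded)
  def summarize(_text), do: :erlang.nif_error(:nif_not_loaded)
end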
7. Demo building a custom Deep Neural Network
- we will build an LSTM together using the Axon Elixir library (a rough sketch follows below)
- Axon lets us build neural networks in a similar manner to Keras in Python
- Axon is built on top of Nx (Numerical Elixir), which helps build efficient number-crunching systems
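A rough sketch of the kind of model we will build; the input shape, layer sizes, and training call are placeholders, not the exact code from the live demo.

# Hypothetical sequence classifier: 50 time steps, 1 feature per step
input = Axon.input("sequence", shape: {nil, 50, 1})

{sequence, _state} = Axon.lstm(input, 64)

model =
  sequence
  # keep only the last time step of the LSTM output
  |> Axon.nx(fn t -> t[[0..-1//1, -1]] end)
  |> Axon.dense(1, activation: :sigmoid)

# Training would look roughly like this, given a stream of {input, target} batches:
# model
# |> Axon.Loop.trainer(:binary_cross_entropy, :adam)
# |> Axon.Loop.run(train_data, %{}, epochs: 5, compiler: EXLA)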
8. Models we cannot use in Elixir yet
- Flair (https://huggingface.co/flair)
9. Bonus model
- chat with ChatGPT (GPT3.5) using the OpenAI REST API
- using the dvcrn/ex_openai Elixir library
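A rough sketch of such a call with ex_openai; the struct and function names come from the generated client and may differ between library versions, so check them against the library’s documentation.

# Hypothetical usage of dvcrn/ex_openai (API key configured for the :ex_openai app)
messages = [
  %ExOpenAI.Components.ChatCompletionRequestMessage{
    role: :user,
    content: "Explain the BEAM in one sentence."
  }
]

{:ok, response} = ExOpenAI.Chat.create_chat_completion(messages, "gpt-3.5-turbo")

# The assistant's reply is in the first choice of the response
[%{message: %{content: reply}} | _] = response.choices
reply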
Conclusions
- Neural networks are a good way to tackle many problems, especially ones humans can do well
- Even if the first version of a model seems hard to work with, later versions tend to improve dramatically (just look at the history of GPT)
- we can now choose Elixir for AI-related projects with reasonable confidence: there are many ways to run models, and to run them efficiently
Glossary
- NN - Neural Network - a network of many neurons; this is how the brain is structured and what Artificial Neural Networks try to mimic. In computer science they are essentially interconnected nodes that perform some mathematical calculation and pass data on to the next neuron
- AI - Artificial Intelligence -> algorithms that help tech mimic human intelligence
- ML - Machine Learning -> AI methods that follow the same basic workflow: train, validate, infer. You train the machine to understand your data, validate its performance on data not seen during training, and then use it to infer connections in data that humans do not yet understand. This is similar to how humans learn: think of lectures/homework as training, exams as validation, and putting knowledge into practice as inference
- ML model (or AI model) -> a specific ML algorithm, usually together with its trained parameters
- Huggingface - a platform hosting lots of ML models; it lets you test them, use them for inference, and helps you run them on your own device. Basically a place where you search for models to solve your problem using AI
- Nx - Numerical Elixir -> a library for Elixir developers who want to implement efficient, performant number-crunching operations (think numpy for Python); it can use Google’s XLA (via EXLA) or LibTorch (via Torchx) as a backend - see the short example after this glossary
- Axon -> a library in Elixir to work with neural networks (think like Keras for Python), it works on top of Nx
- Bumblebee -> a library with pre-built, pre-trained neural network models; it works on top of Axon
- ONNX - Open Neural Network Exchange -> an open ecosystem that empowers AI developers to choose the right tools as their project evolves, ONNX provides an open source format for AI models, both deep learning and traditional ML.
- LSTM - Long Short-Term Memory -> a type of neural network layer that lets the network retain information over time, i.e. it can take both short- and long-range context into account
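To make the Nx entry above concrete, a tiny (hypothetical) example of a numerically-defined function; defn compiles the function for the configured backend/compiler, for example EXLA.

defmodule MyMath do
  import Nx.Defn

  # numerically-defined function: runs on the configured Nx backend/compiler
  defn softmax(t) do
    exp = Nx.exp(t - Nx.reduce_max(t))
    exp / Nx.sum(exp)
  end
end

MyMath.softmax(Nx.tensor([1.0, 2.0, 3.0]))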