Monocular Depth Estimation: MiDaS v2.1

File.cd!(__DIR__)
# For Japanese-locale Windows: switch the console code page to UTF-8
System.shell("chcp 65001")
System.put_env("NNCOMPILED", "YES")

Mix.install([
  {:tfl_interp, path: ".."},
  {:cimg, "~> 0.1.19"},
  {:nx, "~> 0.4.0"},
  {:kino, "~> 0.7.0"}
])

0. Original work

Intelligent Systems Lab Org:

“Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer”

Thanks a lot!!!


Implementation with TflInterp in Elixir

1. Defining the inference module: Midas

defmodule Midas do
  @width 256
  @height 256

  alias TflInterp, as: NNInterp

  use NNInterp,
    model: "./model/model_opt.tflite",
    url: "https://github.com/isl-org/MiDaS/releases/download/v2_1/model_opt.tflite",
    inputs: [f32: {1, @height, @width, 3}],
    # single-channel depth map; apply/1 reshapes it to {@height, @width}
    outputs: [f32: {1, @height, @width, 1}]

  def apply(img) do
    # preprocess: resize to the model input size and scale pixels to -1.0..1.0
    input0 =
      CImg.builder(img)
      |> CImg.resize({@width, @height})
      |> CImg.to_binary(range: {-1.0, 1.0})

    # prediction
    output =
      session()
      |> NNInterp.set_input_tensor(0, input0)
      |> NNInterp.invoke()
      |> NNInterp.get_output_tensor(0)
      |> Nx.from_binary(:f32)
      |> Nx.reshape({@height, @width})

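    # postprocess: min-max normalize the depth map to 0.0..1.0,
    # then resize it back to the size of the original image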
    [min, max] =
      [Nx.window_min(output, {@height, @width}), Nx.window_max(output, {@height, @width})]
      |> Enum.map(&Nx.squeeze/1)
      |> Enum.map(&Nx.to_number/1)

    {w, h, _, _} = CImg.shape(img)

    Nx.subtract(output, min)
    |> Nx.divide(max - min)
    |> Nx.to_binary()
    |> CImg.from_binary(@width, @height, 1, 1)
    |> CImg.resize({w, h})
  end
end
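
The postprocess step in `Midas.apply/1` rescales the raw depth map to the range 0.0..1.0 with min-max normalization before turning it back into an image. The sketch below shows the same arithmetic on a plain list of floats, without Nx. The module name `MinMaxSketch` and the fallback for a flat map (where max equals min, so the divisor becomes zero) are illustrative assumptions, not part of the original notebook.

```elixir
defmodule MinMaxSketch do
  # Hypothetical helper: rescales a list of floats to the range 0.0..1.0.
  def normalize([]), do: []

  def normalize(values) do
    {min, max} = Enum.min_max(values)
    span = max - min

    if span == 0 do
      # Flat input: every value is equal, so map everything to 0.0
      # rather than divide by zero (an assumption of this sketch).
      Enum.map(values, fn _ -> 0.0 end)
    else
      Enum.map(values, fn v -> (v - min) / span end)
    end
  end
end

MinMaxSketch.normalize([2.0, 4.0, 6.0])
# => [0.0, 0.5, 1.0]
```

Because the scaling is relative to each image's own minimum and maximum, the resulting values are only comparable within one image, not across images.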

Launch Midas.

# TflInterp.stop(Midas)
Midas.start_link([])

Display the properties of the Midas model.

TflInterp.info(Midas)
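
As a quick sanity check on the shapes declared in `use NNInterp`, it can help to work out how many bytes each tensor should occupy. The sketch below computes that from a shape tuple for 32-bit floats. `TensorSize` is a hypothetical helper, and the single-channel output shape is an assumption inferred from the reshape to `{@height, @width}` in `Midas.apply/1`.

```elixir
defmodule TensorSize do
  # Hypothetical helper: bytes needed for a dense tensor of 32-bit floats.
  @f32_bytes 4

  def f32_bytes(shape) when is_tuple(shape) do
    shape
    |> Tuple.to_list()
    |> Enum.product()
    |> Kernel.*(@f32_bytes)
  end
end

# Input: one 256x256 RGB image.
TensorSize.f32_bytes({1, 256, 256, 3})
# => 786432

# Output: one 256x256 single-channel depth map (assumed).
TensorSize.f32_bytes({1, 256, 256, 1})
# => 262144
```

If the binary passed to `Nx.from_binary/2` has a different size than expected, the later `Nx.reshape/2` call will fail, which usually points to a mismatch between the declared and actual tensor shapes.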

2. Let’s try it

Load a photo and apply Midas to it.

img = CImg.load("sample.jpg")

result =
  Midas.apply(img)
  |> CImg.color_mapping(:jet)

Enum.map([img, result], &CImg.display_kino(&1, :jpeg))
|> Kino.Layout.grid(columns: 2)