Monocular Depth Estimation by MiDaS v2.1

MiDaS.livemd

0.Original work

Intelligent Systems Lab Org:
“Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer”

Thanks a lot!!!


Implementation for Elixir/Nerves using TflInterp

1.Helper module

Create a helper module that handles tasks such as downloading the model.

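defmodule Model do
  # File name of the model on the local device
  @model_file "midas_opt.tflite"

  # Download URL of the optimized MiDaS v2.1 TFLite model
  @wearhouse "https://github.com/isl-org/MiDaS/releases/download/v2_1/model_opt.tflite"
  @local "/data/#{@model_file}"

  # Path of the local model file
  def file() do
    @local
  end

  # Download the model and write it to @local
  def get() do
    Req.get!(@wearhouse).body
    |> then(fn x -> File.write(@local, x) end)
  end

  # Delete the local model file
  def rm() do
    File.rm(@local)
  end

  # Check whether the model has already been downloaded
  def exists?() do
    File.exists?(@local)
  end
end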

Get the tflite model from @wearhouse and store it in @local.

Model.get()
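Calling Model.get() downloads the file every time the cell runs. A minimal sketch of a download-once wrapper follows. The module name ModelCache and the function ensure/2 are hypothetical and not part of the notebook; the fetch argument stands in for the Req call so the sketch needs only the standard library.

```elixir
defmodule ModelCache do
  # Hypothetical helper, not part of the original notebook.
  # `fetch` is any zero-arity function that returns the file contents as a
  # binary, for example a function wrapping the Req call used in Model.get/0.
  def ensure(path, fetch) when is_function(fetch, 0) do
    if File.exists?(path) do
      # The file is already present, so `fetch` is never called.
      {:ok, :cached}
    else
      with :ok <- File.mkdir_p(Path.dirname(path)),
           :ok <- File.write(path, fetch.()) do
        {:ok, :downloaded}
      end
    end
  end
end
```

If creating the directory or writing the file fails, the `with` expression returns the `{:error, reason}` tuple unchanged, so the caller can decide how to react.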

2.Defining the inference module: Midas

  • Pre-processing:
    Resize the input image to @midas_shape and convert it to a float32 binary with pixel values normalized to the range -2.0 to 2.0.

  • Post-processing:
    Rescale the f32 depth map by its minimum and maximum values and map it to a 0-255 grayscale image.
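The arithmetic behind both steps can be sketched in plain Elixir, without CImg or Nx. This is an illustration only: the module DepthScale and its functions are hypothetical, and it assumes that the range option maps 8-bit pixel values linearly onto -2.0 to 2.0.

```elixir
defmodule DepthScale do
  # Hypothetical module for illustration; the notebook itself uses CImg and Nx.

  # Pre-processing: map 8-bit pixel values (0..255) linearly onto lo..hi
  # and pack them as little-endian float32.
  def pixels_to_f32(pixels, range \\ {-2.0, 2.0}) do
    {lo, hi} = range

    for p <- pixels, into: <<>> do
      <<(lo + (hi - lo) * p / 255)::float-32-little>>
    end
  end

  # Post-processing: decode little-endian float32 depths, rescale them by
  # their min and max, and quantize the result to 0..255 gray levels.
  def f32_to_gray(bin) do
    depths = for <<d::float-32-little <- bin>>, do: d
    {min, max} = Enum.min_max(depths)
    # Guard against a flat depth map, where max == min would divide by zero.
    span = if max == min, do: 1.0, else: max - min
    for d <- depths, do: round((d - min) / span * 255)
  end
end
```

For example, `DepthScale.f32_to_gray/1` applied to the three depths 1.0, 2.0 and 3.0 spreads them over the full gray range, with the smallest depth at 0 and the largest at 255.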

defmodule Midas do
  #use TflInterp, model: Model.file()
  use TflInterp

  @midas_shape {256, 256}

  def apply(img) do
    # preprocess
    bin =
      img
      |> CImg.resize(@midas_shape)
      |> CImg.to_binary(range: {-2.0, 2.0})

    # prediction
    outputs =
      __MODULE__
      |> TflInterp.set_input_tensor(0, bin)
      |> TflInterp.invoke()
      |> TflInterp.get_output_tensor(0)
      |> Nx.from_binary({:f, 32})
      |> Nx.reshape({256, 256})

    # postprocess: find the global min and max of the depth map
    min = outputs |> Nx.reduce_min() |> Nx.to_number()
    max = outputs |> Nx.reduce_max() |> Nx.to_number()

    _result =
      outputs
      |> Nx.subtract(min)
      |> Nx.divide(max - min)
      |> Nx.to_binary()
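      |> CImg.from_binary(256, 256, 1, 1, dtype: "<f4")
  end
end

3.Let's try it

Apply Midas to an input image and display the depth map as a jet heat map. This assumes the Midas interpreter has been started with the model at Model.file() and that img holds the input image.

img  # assumed variable: the input image as a CImg image
|> Midas.apply()
|> CImg.resize({320, 240})
|> CImg.color_mapping(:jet)
|> CImg.display_kino(:jpeg)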

4.TIL ;-)

Date: Feb. 5, 2022 / Nerves-livebook rpi3

Quantizing the depth image in post-processing takes a long time.

The heat map has only 256 levels, so fine depth details may not be visible.
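The effect of the 256-level scale can be checked with a little arithmetic. The sketch below is hypothetical (GrayLevels is not part of the notebook) and assumes the linear min-max mapping described in the post-processing step: depths closer together than one step can land on the same gray level.

```elixir
defmodule GrayLevels do
  # Hypothetical helper for illustration only.

  # Depth difference that corresponds to one gray level when the range
  # min..max is spread over `levels` values.
  def step(min, max, levels \\ 256), do: (max - min) / (levels - 1)

  # Gray level assigned to depth `d` under a linear min-max mapping.
  def quantize(d, min, max, levels \\ 256) do
    round((d - min) / (max - min) * (levels - 1))
  end
end
```

With a depth range of 0.0 to 1000.0, one gray level spans roughly 3.9 depth units, so depths of 120.0 and 122.0 receive the same level and become indistinguishable in the heat map.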

License

Copyright 2022 Shozo Fukuda. Apache License Version 2.0