
Hand Tracking by SSD MobileNet

HandTrac.livemd

0.Original work

Victor Dibia’s “Real-time Hand-Detection using Neural Networks (SSD) on Tensorflow”

https://github.com/victordibia/handtracking

From his github:

> Both examples above were run on a macbook pro CPU (i7, 2.5GHz, 16GB). Some fps numbers are:
>
> | FPS | Image Size | Device                         | Comments                                    |
> |-----|------------|--------------------------------|---------------------------------------------|
> | 21  | 320 * 240  | Macbook pro (i7, 2.5GHz, 16GB) | Run without visualizing results             |
> | 16  | 320 * 240  | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |
> | 11  | 640 * 480  | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |


Shubham Panchal’s “Hand Detection using TFLite in Android”

https://github.com/shubham0204/Hand_Detection_TFLite_Android

Thanks Victor and Shubham!!!


Implementation for Elixir/Nerves using TflInterp

1.Helper module

Create the module to assist with tasks such as downloading a model.

defmodule Model do
  @model_file "hand_trac.tflite"

  @warehouse "https://github.com/shoz-f/tinyML_livebook/releases/download/model/#{@model_file}"
  @local "/data/#{@model_file}"

  def file() do
    @local
  end

  def get() do
    Req.get!(@warehouse).body
    |> then(fn body -> File.write(@local, body) end)
  end

  def rm() do
    File.rm(@local)
  end

  def exists?() do
    File.exists?(@local)
  end
end

Get the tflite model from @warehouse and store it in @local.

Model.get()

2.Defining the inference module: HandTrac

  • Pre-processing:
    Resize the input image to the size of @handtrack_shape and create a Float32 binary sequence normalized to the range {-1.0, 1.0}.

  • Post-processing:
    Extract the BBOXes with scores that exceed the threshold @threshold from the inference results.
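Both steps can be sketched in plain Elixir on dummy data (the real pipeline uses CImg and Nx; `normalize` and the literal scores below are illustrative stand-ins, not part of the model):

```elixir
# Pre-processing: CImg.to_binary(range: {-1.0, 1.0}) maps each 0..255
# pixel value p linearly onto -1.0..1.0, i.e. p / 127.5 - 1.0.
normalize = fn p -> p / 127.5 - 1.0 end

normalize.(0)    # -1.0
normalize.(255)  # 1.0

# Post-processing: keep the indices of the scores above the threshold,
# here on a plain list instead of an Nx tensor.
threshold = 0.9
scores = [0.95, 0.10, 0.92, 0.30]

for {score, i} <- Enum.with_index(scores), score >= threshold, do: i
# => [0, 2]
```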

defmodule HandTrac do
  # TflInterp, model: Model.file()
  use TflInterp

  @handtrack_shape {300, 300}
  @threshold 0.9

  alias CImg.Builder

  def apply(jpeg) do
    # preprocess
    bin =
      CImg.from_binary(jpeg)
      |> CImg.resize(@handtrack_shape)
      |> CImg.to_binary(range: {-1.0, 1.0})

    # prediction
    __MODULE__
    |> TflInterp.set_input_tensor(0, bin)
    |> TflInterp.invoke()

    [bboxes, scores] =
      for i <- [0, 2] do
        TflInterp.get_output_tensor(__MODULE__, i)
        |> Nx.from_binary({:f, 32})
        |> Nx.reshape({10, :auto})
      end

    # postprocess: keep the indices of the boxes whose score exceeds @threshold
    index =
      for {score, index} <- Enum.with_index(Nx.to_flat_list(scores)),
          score >= @threshold,
          do: index

    if index == [] do
      []
    else
      bboxes
      |> Nx.take(Nx.tensor(index))
      |> Nx.to_batched_list(1)
    end
  end

  def draw_result(results, jpeg) do
    Enum.reduce(results, Builder.from_binary(jpeg), fn box, canvas ->
      [y1, x1, y2, x2] = Nx.to_flat_list(box)
      CImg.draw_rect(canvas, x1, y1, x2, y2, {255, 0, 0})
    end)
    |> Builder.runit()
    |> CImg.resize({640, 480})
    |> CImg.to_binary(:jpeg)
  end
end
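In draw_result/2, each box arrives as [y1, x1, y2, x2]. Assuming the usual SSD convention of coordinates normalized to 0.0..1.0, scaling one onto a 640x480 frame is plain arithmetic (the sample box below is made up for illustration):

```elixir
# Scale one normalized SSD-style box [ymin, xmin, ymax, xmax]
# to pixel coordinates on a 640x480 frame.
{w, h} = {640, 480}
[y1, x1, y2, x2] = [0.25, 0.5, 0.75, 1.0]

{round(x1 * w), round(y1 * h), round(x2 * w), round(y2 * h)}
# => {320, 120, 640, 360}
```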

Launch HandTrac.

HandTrac.start_link(model: Model.file())

Display the properties of the HandTrac model.

TflInterp.info(HandTrac)

3.Let’s try it

In one shot.

img = Picam.next_frame()

HandTrac.apply(img)
|> HandTrac.draw_result(img)
|> Kino.Image.new(:jpeg)

In continuous shooting.

Kino.animate(10, 0, fn i ->
  img = Picam.next_frame()

  res =
    HandTrac.apply(img)
    |> HandTrac.draw_result(img)
    |> Kino.Image.new(:jpeg)

  {:cont, res, i + 1}
  # return :halt instead to stop the animation
end)

4.TIL ;-)

Date: Feb. 3, 2022 / Nerves-livebook rpi3

It is still not at a practical level. The post-processing, which is computationally inexpensive, poses no problem so far. However, the pre-processing and inference stages, which handle large amounts of data, consume most of the processing time.

Problems to be solved:

  • Both the inference time (about 3 FPS) and the input data transfer time are too long.
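One way to confirm where the time goes is :timer.tc/1, which returns {microseconds, result}. A minimal sketch, with Enum.sum/1 as a dummy workload standing in for the CImg pre-processing and TflInterp.invoke/1 calls:

```elixir
# :timer.tc/1 runs the function and returns {elapsed_microseconds, result}.
# Wrapping each stage (preprocess, invoke, postprocess) like this shows
# which one dominates on the rpi3.
{usec, sum} = :timer.tc(fn -> Enum.sum(1..1_000) end)

IO.puts("stage took #{usec} us (result: #{sum})")
```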

License

Copyright 2022 Shozo Fukuda. Apache License Version 2.0