# Deepgram Listen (Speech-to-Text) Examples
```elixir
Mix.install([
  {:deepgram, "~> 0.1"},
  {:kino, "~> 0.9"},
  # HTTPoison is used below to download a sample audio file
  {:httpoison, "~> 2.0"}
])
```
## Introduction
This notebook demonstrates how to use Deepgram’s Speech-to-Text (Listen) API through the Elixir SDK. We’ll explore:
- Prerecorded audio transcription
- Live audio streaming
- Various transcription features (speaker diarization, smart formatting, etc.)
## Setup
First, let’s render a password input for our Deepgram API key:

```elixir
api_key_input = Kino.Input.password("Deepgram API Key")
```

Enter your key above, then evaluate the next cell to read it and create the client:

```elixir
api_key = Kino.Input.read(api_key_input)
client = Deepgram.new(api_key: api_key)
```
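If you’d rather not type the key on every run, you can fall back to an environment variable. This is optional, and the variable name `DEEPGRAM_API_KEY` is just a convention for this notebook, not something the SDK requires:

```elixir
# Optional: fall back to an environment variable when the input is empty
api_key =
  case Kino.Input.read(api_key_input) do
    "" -> System.get_env("DEEPGRAM_API_KEY")
    key -> key
  end

client = Deepgram.new(api_key: api_key)
```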
## Transcribing Audio from URL
```elixir
# Example public audio URL
audio_url = "https://static.deepgram.com/examples/interview_speech-analytics.wav"

# Basic transcription
{:ok, response} = Deepgram.Listen.transcribe_url(client, %{url: audio_url})

# Display the transcript
response["results"]["channels"]
|> Enum.at(0)
|> Map.get("alternatives")
|> Enum.at(0)
|> Map.get("transcript")
```
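Since we’ll extract the transcript the same way several times below, it’s convenient to wrap the lookup in a small helper module. This is just a notebook convenience that mirrors the pipeline above, not part of the SDK:

```elixir
defmodule TranscriptHelper do
  # Pulls the plain transcript out of a Deepgram response map,
  # following the same path as the pipeline above.
  def transcript(response) do
    response["results"]["channels"]
    |> Enum.at(0)
    |> Map.get("alternatives")
    |> Enum.at(0)
    |> Map.get("transcript")
  end
end

TranscriptHelper.transcript(response)
```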
## Enhanced Transcription Options
```elixir
# Request additional options for higher-quality transcription
{:ok, response} = Deepgram.Listen.transcribe_url(
  client,
  %{url: audio_url},
  %{
    model: "nova-2",        # A recent general-purpose model
    smart_format: true,     # Format numbers, dates, etc.
    punctuate: true,        # Add punctuation
    diarize: true,          # Identify different speakers
    paragraphs: true,       # Organize into paragraphs
    utterances: true,       # Split by speaker turns
    detect_language: true   # Detect the spoken language
  }
)

# Display the enhanced transcript with speakers
transcript_with_speakers =
  response["results"]["channels"]
  |> Enum.at(0)
  |> Map.get("alternatives")
  |> Enum.at(0)
  |> Map.get("paragraphs")
  |> Map.get("transcript")

transcript_with_speakers
```
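Because we passed `utterances: true`, the response should also include speaker-tagged utterances. As a sketch, assuming the usual Deepgram response shape with an `"utterances"` list under `"results"`, you can print one line per speaker turn:

```elixir
# Print each speaker turn on its own line, e.g. "Speaker 0: Hello there."
response["results"]["utterances"]
|> Enum.each(fn utterance ->
  IO.puts("Speaker #{utterance["speaker"]}: #{utterance["transcript"]}")
end)
```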
## Transcribing Audio Files
First, let’s download a sample file to work with:
```elixir
# Download a sample audio file
sample_file_path = "/tmp/sample_audio.wav"
sample_url = "https://static.deepgram.com/examples/sample1.wav"

{:ok, %HTTPoison.Response{body: file_content}} = HTTPoison.get(sample_url)
File.write!(sample_file_path, file_content)

# Now transcribe the file
{:ok, audio_data} = File.read(sample_file_path)

{:ok, file_response} = Deepgram.Listen.transcribe_file(
  client,
  audio_data,
  %{model: "nova-2"}
)

file_transcript =
  file_response["results"]["channels"]
  |> Enum.at(0)
  |> Map.get("alternatives")
  |> Enum.at(0)
  |> Map.get("transcript")

file_transcript
```
## Async Transcription with Callbacks
For longer audio files, you might want to use async transcription:
```elixir
# Note: You need a publicly accessible webhook URL for this to work
webhook_url = "https://webhook.site/your-unique-id"

{:ok, async_response} = Deepgram.Listen.transcribe_url_callback(
  client,
  %{url: "https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav"},
  webhook_url,
  %{model: "nova-2"}
)

# This will return a request_id that you can use to track the transcription
async_response
```
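When the transcription finishes, Deepgram delivers the results as a JSON POST to your callback URL. As one illustration of the receiving side, here is a minimal Plug router sketch. The route path, port, and the `plug`/`bandit`/`jason` dependencies are assumptions for this example, not part of this notebook’s setup:

```elixir
# Hypothetical webhook receiver; run it in a separate project with
# {:plug, "~> 1.15"}, {:bandit, "~> 1.0"}, and {:jason, "~> 1.4"} as deps.
defmodule DeepgramWebhook do
  use Plug.Router

  plug :match
  plug Plug.Parsers, parsers: [:json], json_decoder: Jason
  plug :dispatch

  post "/deepgram-callback" do
    # The callback body has the same shape as a synchronous response
    transcript =
      conn.body_params["results"]["channels"]
      |> Enum.at(0)
      |> Map.get("alternatives")
      |> Enum.at(0)
      |> Map.get("transcript")

    IO.puts("Received transcript: #{transcript}")
    send_resp(conn, 200, "ok")
  end

  match _ do
    send_resp(conn, 404, "not found")
  end
end

# Start it with, for example:
# Bandit.start_link(plug: DeepgramWebhook, port: 4000)
```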
## Live Audio Streaming
For live audio streaming, you’ll use WebSockets. Here’s how to set up a live transcription session:
```elixir
# This would typically be in a supervision tree in a real application.
# For demonstration purposes only.
{:ok, websocket} = Deepgram.Listen.live_transcription(
  client,
  %{
    model: "nova-2",
    interim_results: true,
    punctuate: true,
    encoding: "linear16",
    sample_rate: 16_000,
    channels: 1
  }
)

# In a real application, you would send audio data like this:
#
#   Deepgram.Listen.WebSocket.send_audio(websocket, audio_chunk)
#
# and handle messages in a receive block:
#
#   receive do
#     {:deepgram_result, result} ->
#       transcript = result["channel"]["alternatives"] |> hd() |> Map.get("transcript")
#       IO.puts("Transcript: #{transcript}")
#
#     {:deepgram_error, error} ->
#       IO.puts("Error: #{inspect(error)}")
#   end

# For this example, we'll just close the connection after a couple of seconds
Process.sleep(2000)
Deepgram.Listen.WebSocket.close(websocket)
```
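If you don’t have a microphone wired up, one way to exercise the live socket (before the `close/1` call above) is to replay a local audio file in small chunks, pacing the writes to roughly approximate real time. This is a sketch only: it reuses `send_audio/2` from the comments above and assumes the file contains raw audio matching the `encoding` and `sample_rate` options, which the WAV downloaded earlier, header and all, only approximates:

```elixir
# Hypothetical: replay a local file as if it were live microphone input
sample_file_path
|> File.stream!([], 4096)    # read the file in 4 KiB binary chunks
|> Enum.each(fn chunk ->
  Deepgram.Listen.WebSocket.send_audio(websocket, chunk)
  Process.sleep(100)         # crude pacing to mimic real-time capture
end)
```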
## Advanced Features
### Speech Recognition with Redaction
```elixir
{:ok, redacted_response} = Deepgram.Listen.transcribe_url(
  client,
  %{url: "https://static.deepgram.com/examples/sample_with_pii.wav"},
  %{
    model: "nova-2",
    redact: ["pci", "ssn", "pii"],  # Redact personally identifiable information
    redact_replace: "[REDACTED]"    # Replace redacted content with this text
  }
)

redacted_transcript =
  redacted_response["results"]["channels"]
  |> Enum.at(0)
  |> Map.get("alternatives")
  |> Enum.at(0)
  |> Map.get("transcript")

redacted_transcript
```
### Topic Detection in Audio
```elixir
{:ok, topics_response} = Deepgram.Listen.transcribe_url(
  client,
  %{url: "https://static.deepgram.com/examples/financial-call.wav"},
  %{
    model: "nova-2",
    detect_topics: true  # Identify topics in the audio
  }
)

# Extract topics
topics =
  topics_response["results"]["channels"]
  |> Enum.at(0)
  |> Map.get("alternatives")
  |> Enum.at(0)
  |> Map.get("topics")

topics
```
### Sentiment Analysis in Audio
```elixir
{:ok, sentiment_response} = Deepgram.Listen.transcribe_url(
  client,
  %{url: "https://static.deepgram.com/examples/positive-review.wav"},
  %{
    model: "nova-2",
    detect_sentiment: true  # Analyze sentiment in the audio
  }
)

# Extract sentiment
sentiment =
  sentiment_response["results"]["channels"]
  |> Enum.at(0)
  |> Map.get("alternatives")
  |> Enum.at(0)
  |> Map.get("sentiment")

sentiment
```
## Conclusion
These examples demonstrate the capabilities of Deepgram’s Speech-to-Text API through the Elixir SDK. You can combine different features to create powerful applications that understand spoken language.
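For example, several of the options shown above can be enabled in a single request. This sketch simply merges parameters already used in this notebook; whether a given model supports every combination is worth verifying against the API documentation:

```elixir
{:ok, combined} = Deepgram.Listen.transcribe_url(
  client,
  %{url: audio_url},
  %{
    model: "nova-2",
    smart_format: true,
    diarize: true,
    detect_topics: true,
    detect_sentiment: true
  }
)

TranscriptHelper.transcript(combined)
```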
For more information, refer to:

- The Deepgram API documentation: https://developers.deepgram.com/docs
- The `deepgram` package docs on HexDocs: https://hexdocs.pm/deepgram