Of course! sounddevice is a popular Python library for audio playback and recording. It provides a clean and powerful interface to PortAudio, the cross-platform audio I/O library.

Here’s a comprehensive guide covering what it is, how to install it, and how to use it for common tasks.
What is sounddevice?
sounddevice is a Python library for playing and recording audio streams. Its main features are:
- Cross-Platform: Works on Windows, macOS, and Linux.
- Simple API: The syntax is intuitive and easy to learn.
- Real-time Capabilities: Designed for low-latency audio input and output, making it great for live audio applications.
- NumPy Integration: It works seamlessly with NumPy arrays, which is the standard for numerical computing in Python. This makes it easy to process audio with libraries like SciPy and NumPy itself.
- Full-Duplex Support: You can record and play audio at the same time.
- Under the Hood: It's a Python wrapper around the PortAudio library.
Installation
First, you need to install the library. It's recommended to install it using pip.

pip install sounddevice
Important Prerequisite: PortAudio
sounddevice is just a wrapper, so it needs the underlying PortAudio library to be available on your system. The pip wheels for Windows and macOS bundle PortAudio; on Linux you usually install it with the system package manager.
- On macOS (using Homebrew, only needed if the bundled library is not picked up):
brew install portaudio
- On Debian/Ubuntu:
sudo apt-get update
sudo apt-get install libportaudio2
- On Windows: The wheel installed by pip usually handles this automatically, but if you encounter issues, you may need to download the PortAudio binaries and add them to your system's PATH.
Core Concepts: NumPy and Audio
sounddevice represents audio as NumPy arrays.
- A mono audio signal is a 1D NumPy array of shape (N,).
- A stereo audio signal is a 2D NumPy array of shape (N, 2), where each row contains the left and right channel samples for that time step.
- The data type of the array is important. Common types are float32 (range -1.0 to 1.0) and int16 (range -32768 to 32767). sounddevice defaults to float32.
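The array layouts and data types described above can be sketched with plain NumPy, without touching any audio hardware. The helper names float_to_int16 and int16_to_float are illustrative and not part of sounddevice; the scaling constants are one common convention, not the only one.

```python
import numpy as np

def float_to_int16(x):
    """Convert float samples in [-1.0, 1.0] to int16, clipping out-of-range values."""
    clipped = np.clip(x, -1.0, 1.0)
    return (clipped * 32767).astype(np.int16)

def int16_to_float(x):
    """Convert int16 samples to float32 in roughly [-1.0, 1.0)."""
    return x.astype(np.float32) / 32768.0

sample_rate = 44100
t = np.arange(sample_rate) / sample_rate     # one second of time stamps
mono = 0.5 * np.sin(2 * np.pi * 440 * t)     # 1D array, shape (44100,)
stereo = np.column_stack([mono, mono])       # 2D array, shape (44100, 2): left, right

pcm = float_to_int16(stereo)                 # same shape, dtype int16
print(mono.shape, stereo.shape, pcm.dtype)
```

Either representation can be handed to the playback functions shown later, as long as the dtype is one the library supports.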
Common Use Cases with Code Examples
Let's dive into the most common tasks.
A. Playing a Sound
You can play a NumPy array directly. To play a WAV file, load it into a NumPy array first.
Playing a NumPy Array (a simple sine wave)
This is the "Hello, World!" of audio programming.
import numpy as np
import sounddevice as sd
# 1. Define parameters
sample_rate = 44100 # Hertz
frequency = 440 # Hertz (A4 note)
duration = 3 # seconds
# 2. Generate the audio data (a sine wave)
# t is a time vector from 0 to duration
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
# The sine wave formula: A * sin(2 * pi * f * t)
amplitude = 0.5 # Keep amplitude below 1.0 to avoid clipping
audio_data = amplitude * np.sin(2 * np.pi * frequency * t)
# 3. Play the audio
print(f"Playing a {frequency}Hz tone for {duration} seconds...")
sd.play(audio_data, samplerate=sample_rate)
# 4. Wait for the playback to finish before the script ends
sd.wait()
print("Playback finished.")
Playing a WAV File
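sounddevice's play function accepts NumPy arrays, not filenames, so read the WAV file into an array first, for example with scipy.io.wavfile.
import sounddevice as sd
from scipy.io import wavfile
# The filename of your WAV file
filename = 'my_audio.wav'
# wavfile.read returns the sample rate and the samples as a NumPy array
sample_rate, audio_data = wavfile.read(filename)
print(f"Playing file: {filename}")
sd.play(audio_data, samplerate=sample_rate)
sd.wait()
print("Playback finished.")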
B. Recording Audio
Recording is just as straightforward. The sd.rec() function starts a recording and returns immediately. You must use sd.wait() to block until the recording is complete.
import numpy as np
import sounddevice as sd
# 1. Define parameters
duration = 5 # seconds
sample_rate = 44100
channels = 2 # for stereo recording
# 2. Start recording
# sd.rec() returns immediately, so we store the returned object
print("Recording started. Speak into your microphone...")
recording = sd.rec(int(duration * sample_rate),
                   samplerate=sample_rate,
                   channels=channels,
                   dtype='float32') # Use float32 for better processing
# 3. Wait for the recording to finish
sd.wait()
# 4. The recording is now a NumPy array
print("Recording finished.")
print(f"Recording shape: {recording.shape}") # Should be (duration * sample_rate, channels)
# You can now save or process the recording
# For example, save it to a WAV file
from scipy.io import wavfile
wavfile.write('my_recording.wav', sample_rate, recording)
print("Recording saved as 'my_recording.wav'")
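Before saving or processing a recording, it can help to check its level. The following is a minimal sketch, assuming float samples where 1.0 is full scale; peak_dbfs, is_clipped, and the 0.999 threshold are illustrative choices rather than anything sounddevice provides, and fake_recording stands in for the array returned by sd.rec().

```python
import numpy as np

def peak_dbfs(samples):
    """Return the peak level of float samples in dB relative to full scale (1.0)."""
    peak = np.max(np.abs(samples))
    if peak == 0:
        return float("-inf")  # digital silence
    return float(20 * np.log10(peak))

def is_clipped(samples, threshold=0.999):
    """Heuristic: treat any sample at or above the threshold as clipped."""
    return bool(np.any(np.abs(samples) >= threshold))

# Stand-in for a recording: half a second of a quiet stereo tone
sample_rate = 44100
t = np.arange(sample_rate // 2) / sample_rate
tone = 0.25 * np.sin(2 * np.pi * 220 * t)
fake_recording = np.column_stack([tone, tone]).astype("float32")

print(f"Peak level: {peak_dbfs(fake_recording):.1f} dBFS")
print(f"Clipped: {is_clipped(fake_recording)}")
```

A very low peak suggests the input gain is too low or the wrong device was selected; a clipped recording usually needs to be redone at a lower gain.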
C. Full-Duplex Audio (Simultaneous Recording and Playback)
This is a powerful feature. A classic example is an audio delay effect.
import time
import numpy as np
import sounddevice as sd
# Parameters
sample_rate = 44100
duration = 10 # seconds to run the stream (also the buffer length)
delay_seconds = 1.0
# Create a circular buffer to store recent input
buffer_size = int(duration * sample_rate)
delay_samples = int(delay_seconds * sample_rate)
audio_buffer = np.zeros((buffer_size, 2), dtype='float32') # Stereo buffer
write_idx = 0
# The callback function is called by PortAudio for each audio block
def callback(indata, outdata, frames, time_info, status):
    """
    This function is called in real-time.
    - indata: The input audio (from microphone).
    - outdata: The output audio (to speakers). We must fill this.
    """
    global write_idx
    if status:
        print(status) # Report input overflows / output underflows
    # 1. Compute the buffer positions for this block, wrapping at the end
    write_pos = (write_idx + np.arange(frames)) % buffer_size
    # 2. The delayed audio sits delay_samples behind the write positions
    read_pos = (write_pos - delay_samples) % buffer_size
    # 3. Store the new input, then fetch the delayed audio
    audio_buffer[write_pos] = indata
    delayed_audio = audio_buffer[read_pos]
    # 4. Mix the original input (indata) with the delayed audio
    # This creates the echo effect.
    outdata[:] = indata + 0.5 * delayed_audio # 0.5 is the volume of the echo
    write_idx = (write_idx + frames) % buffer_size
# Create the stream
# sd.Stream is full-duplex: its callback receives both indata and outdata.
# (An sd.InputStream callback has no outdata argument.)
# Pass device=(input_device_id, output_device_id) to pick devices, or omit it for the defaults
# Use headphones to avoid feedback between speakers and microphone
print("Starting full-duplex stream with a 1-second delay...")
with sd.Stream(callback=callback,
               samplerate=sample_rate,
               channels=2,
               blocksize=1024):
    # The 'with' statement keeps the stream open.
    # We just sleep for the desired duration.
    time.sleep(duration)
print("Stream closed.")
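The circular-buffer logic behind a delay effect like this is easy to get wrong at the wrap-around point, so it is worth checking offline before running it in a real-time callback. The sketch below is a hypothetical helper, apply_delay, that processes a NumPy array block by block the way a callback would; it uses modulo index arrays rather than slices so that a block can straddle the end of the buffer. The names and default values are assumptions made for illustration.

```python
import numpy as np

def apply_delay(signal, delay_samples, blocksize=1024, echo_gain=0.5):
    """Mix a delayed copy of `signal` into itself, one block at a time."""
    signal = np.asarray(signal, dtype=np.float32)
    # The buffer must hold at least the delay plus one block of new input
    buffer_size = delay_samples + blocksize
    buffer = np.zeros((buffer_size,) + signal.shape[1:], dtype=np.float32)
    out = np.empty_like(signal)
    write_idx = 0
    for start in range(0, len(signal), blocksize):
        block = signal[start:start + blocksize]
        frames = len(block)
        # Positions for this block, wrapping around the end of the buffer
        write_pos = (write_idx + np.arange(frames)) % buffer_size
        read_pos = (write_pos - delay_samples) % buffer_size
        buffer[write_pos] = block  # store the input first
        out[start:start + frames] = block + echo_gain * buffer[read_pos]
        write_idx = (write_idx + frames) % buffer_size
    return out

# Feed in a single impulse: the echo should appear exactly delay_samples later
impulse = np.zeros(5000, dtype=np.float32)
impulse[0] = 1.0
echoed = apply_delay(impulse, delay_samples=1500)
print(np.flatnonzero(echoed))
```

Testing with an impulse makes the delay directly visible as the distance between the two non-zero samples.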
D. Querying Audio Devices
You can list all available input and output devices to find their IDs, which you can then use to select a specific device (e.g., a USB microphone or headphones).
import sounddevice as sd
# Print a list of all devices
print("Available audio devices:")
print(sd.query_devices())
# You can also get a more compact list
print("\n--- Device List ---")
for i, device in enumerate(sd.query_devices()):
    print(f"ID: {i}, Name: '{device['name']}', Max Input Channels: {device['max_input_channels']}, Max Output Channels: {device['max_output_channels']}")
You would then use the device ID when creating a stream:
sd.InputStream(device=3, ...) to use the device with ID 3.
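Device IDs can change when hardware is plugged in or removed, so looking a device up by name is often more robust than hard-coding a number. The sketch below is a hypothetical helper, find_device, that searches a list of device dictionaries using the same keys shown in the loop above; example_devices is made-up data so the code runs without audio hardware.

```python
def find_device(devices, name_fragment, kind="input"):
    """Return the index of the first device whose name contains `name_fragment`
    and that has at least one channel of the requested kind, or None."""
    key = "max_input_channels" if kind == "input" else "max_output_channels"
    for index, device in enumerate(devices):
        if name_fragment.lower() in device["name"].lower() and device[key] > 0:
            return index
    return None

# Made-up device list using the keys found in sd.query_devices() entries
example_devices = [
    {"name": "Built-in Output", "max_input_channels": 0, "max_output_channels": 2},
    {"name": "USB Microphone", "max_input_channels": 1, "max_output_channels": 0},
]

print(find_device(example_devices, "usb"))                      # input device lookup
print(find_device(example_devices, "built-in", kind="output"))  # output device lookup
```

With real hardware, the same helper could be called as find_device(sd.query_devices(), "usb") and the result passed as the device argument of a stream.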
Summary of Key Functions
| Function | Description |
|---|---|
| sd.play(data, samplerate, ...) | Plays audio from a NumPy array. Returns immediately. |
| sd.rec(frames, samplerate, ...) | Starts recording audio into a NumPy array. Returns immediately. |
| sd.wait() | Blocks until playback or recording is finished. Use it after sd.play() or sd.rec() when the script must not continue (or exit) early. |
| sd.stop() | Stops the playback or recording started by sd.play() or sd.rec(). |
| sd.query_devices() | Returns a list of all available audio devices with their details. |
| sd.InputStream(...) | Creates an input-only stream for advanced, real-time control. |
| sd.OutputStream(...) | Creates an output-only stream for advanced, real-time control. |
| sd.Stream(...) | Creates a full-duplex stream (both input and output). |
Alternatives
While sounddevice is excellent, it's good to know other options:
- pyaudio: An older Python wrapper for PortAudio. It is lower-level and works with raw byte buffers rather than NumPy arrays, so it can be more complex to use, but it offers fine-grained control.
- simpleaudio / pydub: These are higher-level libraries focused on simplicity. simpleaudio is great for simple playback of WAV files, while pydub is excellent for manipulating audio files (e.g., converting formats, slicing, applying effects) but not for real-time streaming.
- librosa: A specialized library for music and audio analysis. It is built on NumPy/SciPy and is not an audio I/O library; it provides tools for feature extraction (like MFCCs), rhythm analysis, and more. If your goal is analysis, librosa is a better choice.
For most general-purpose audio I/O tasks in Python, sounddevice is the recommended choice due to its balance of power, simplicity, and performance.
