add sounddevice multiplatform client #111

Open
wants to merge 2 commits into
base: main
52 changes: 52 additions & 0 deletions sounddevice_client.py
@@ -0,0 +1,52 @@
"""
Simple (hopefully) multiplatform client mimicking the Linux command

arecord -f S16_LE -c1 -r 16000 -t raw -D default | nc localhost 43001

which streams audio data from the microphone to the server.
Tested on Mac Os.
just tested it :)

Suggested change
Tested on Mac Os.
Tested on Mac Os and Linux.

"""
import sys
import sounddevice as sd
import socket
import numpy as np
from argparse import ArgumentParser


parser = ArgumentParser(description=__doc__)
parser.add_argument('--host', type=str, default='localhost', help='Host name')
parser.add_argument('--port', type=int, default=43007, help='Port number')
parser.add_argument('--chunk', type=int, default=1000, help='Chunk size in ms')
Just a hint: there is another variable called CHUNK with a different meaning. Maybe we can rename this one to something more descriptive that makes clear it is in milliseconds?
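One way the rename could look, as a rough sketch only. The option name `--sleep-ms` and the helper `build_parser` are hypothetical and not part of the PR; the defaults are copied from the diff above. The value is only passed to `sd.sleep()` in the main loop, so the name here reflects that use.

```python
from argparse import ArgumentParser


def build_parser():
    """Build the client's argument parser (sketch, not the PR's code)."""
    parser = ArgumentParser(description="Stream microphone audio to a server")
    parser.add_argument('--host', type=str, default='localhost', help='Host name')
    parser.add_argument('--port', type=int, default=43007, help='Port number')
    # Hypothetical name: the unit (milliseconds) is part of the option itself,
    # so it cannot be confused with the frame count stored in CHUNK.
    parser.add_argument('--sleep-ms', type=int, default=1000,
                        help='Main-loop sleep interval in ms')
    return parser


# Parse an empty argument list so that only the defaults are used.
args = build_parser().parse_args([])
print(args.sleep_ms)
```

With this naming the main loop would call `sd.sleep(args.sleep_ms)`, which reads unambiguously as a duration.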

args = parser.parse_args()



# Audio configuration needed for whisper_online_server.py
SAMPLE_RATE = 16000
CHANNELS = 1
CHUNK = 1024
maybe rename it to

Suggested change
CHUNK = 1024
BLOCKSIZE = 1024

since it's only used for that purpose? =)
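To make the difference between the two "chunk" values concrete, here is a small worked calculation. It assumes the constants from the diff (16 kHz, mono, int16) and uses the suggested name `BLOCKSIZE`; the helper functions are hypothetical and only illustrate the arithmetic.

```python
SAMPLE_RATE = 16000   # Hz, as in the PR
CHANNELS = 1
BLOCKSIZE = 1024      # frames per callback (the suggested name for CHUNK)
BYTES_PER_SAMPLE = 2  # one int16 sample is 2 bytes


def block_duration_ms(blocksize, sample_rate):
    """Duration of one callback block in milliseconds."""
    return 1000.0 * blocksize / sample_rate


def block_size_bytes(blocksize, channels, bytes_per_sample):
    """Number of bytes sent over the socket per callback."""
    return blocksize * channels * bytes_per_sample


print(block_duration_ms(BLOCKSIZE, SAMPLE_RATE))                    # 64.0
print(block_size_bytes(BLOCKSIZE, CHANNELS, BYTES_PER_SAMPLE))      # 2048
```

So each callback delivers 64 ms of audio as 2048 bytes, independent of the `--chunk` argument, which only controls how long the main loop sleeps.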

# Signed 16-bit samples, matching arecord's S16_LE on little-endian hosts
DTYPE = np.int16
Gldkslfmsd marked this conversation as resolved.

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((args.host, args.port))

print("Recording and streaming audio...")

def callback(indata, frames, time, status):
    if status:
        print(status, file=sys.stderr)
    # Convert the audio data to bytes and send it over the socket
    sock.sendall(indata.tobytes())

try:
    # Open the audio stream
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=CHANNELS, dtype='int16', callback=callback, blocksize=CHUNK):
        print("Press Ctrl+C to stop the recording")
        while True:
            sd.sleep(args.chunk)
except KeyboardInterrupt:
    print("Stopping...")
finally:
    # Close the socket
    sock.close()
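
For trying the client without the Whisper server, a throwaway receiver along these lines might help. This is only a sketch using the standard library: `decode_s16le`, `receive_all`, and `serve_once` are hypothetical helpers, the port is the client's default from the diff, and the whole recording is kept in memory, which is only reasonable for short tests.

```python
import socket
import struct


def decode_s16le(payload):
    """Unpack raw signed 16-bit little-endian bytes into a list of ints."""
    count = len(payload) // 2  # ignore a trailing odd byte, if any
    return list(struct.unpack('<%dh' % count, payload[:count * 2]))


def receive_all(conn, bufsize=4096):
    """Read from a connected socket until the peer closes the connection."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:
            break
        chunks.append(data)
    return b''.join(chunks)


def serve_once(host='localhost', port=43007):
    """Accept a single client and return its samples once it disconnects."""
    with socket.create_server((host, port)) as server:
        conn, _ = server.accept()
        with conn:
            return decode_s16le(receive_all(conn))
```

Calling `serve_once()` in one terminal and starting the client in another should return the recorded samples after the client is stopped with Ctrl+C.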