I Connected OpenClaw to a Voice-Controlled Robot Arm. It Waves at Me When I Sit Down at My Desk.
(I am not an engineer. I bought a $280 robot arm on Amazon and spent a weekend giving it a brain. Here is exactly how I did it.)
author: thalia_diedrick
date: 2026-03-29
status: COMPLETE
read_time: ~30 min
tags: [robot-arm] [voice-ai] [raspberry-pi] [openclaw] [elevenlabs] [marketer-builds-stuff]
// Why I built this
I want to start by being just a little bit vulnerable.
I did not build this because I knew how. I spent a lot of my life wishing I knew how to do this sort of thing, but always felt myself pulled down a different path. I am a marketer through and through, a consumer who enjoys the intersection of creativity and analytics. I built it because I could not stop thinking about a question.
What does it mean for a marketing leader who loves to tinker to give AI a physical body?
When I was young I used to watch YouTube videos about how to turn computer mice into robots. I was obsessed with the idea that you could make them do something different from their original intention. Can you guess who got in serious trouble for jailbreaking their first iPhone? There was a point where I couldn't click on anything, but I did have a Hello Kitty theme.
I never figured out how to hack my computer mouse. I just wasted my grandpa's ink printing out tutorials from instructables.com and saving them in a butterfly binder. I did not have anyone around me who knew anything about technology. No one to show me where to start. No internet rabbit hole deep enough to get me from "I want to know how this works" to actually knowing.
So I just watched the videos. And I wanted to build things. And I did not know how. And eventually I stopped trying.
That feeling stayed with me for a long time.
I have a Spiderman tattoo. I grew up on Marvel. And the character I always came back to was Tony Stark. Not because of the money or the attitude but because of the garage. The idea that you could just build something. That intelligence and curiosity and enough stubbornness could result in something that had never existed before.
I wanted to feel like that. After watching the 2008 film I actually ran home and covered my arms in duct tape to make my suit, of course, and ended up with quite a bit of red skin. I just did not know yet that eventually I would.
I run marketing at GoTu, the largest dental staffing marketplace in the country. My job is pipelines, conversion rates, and campaign strategy. Not product development or Python scripts.
But for the last three years I have been building AI-powered workflows and solutions for my team. Slowly at first, then obsessively. Over one hundred AI teammates and functions now run our marketing infrastructure, and the whole team has felt it. We have a growth marketer building a bespoke data dashboard with the high-end visualization we had been waiting on for months. My Director of Community and Advocacy, who came from sports, is building an entire platform to mobilize our UAC for feedback, testimonials, and session attendance. They are AI-fluent and I am proud to lead by their side. When everyone on your team can build, something shifts. You stop asking what's possible and start asking what you want to exist. The democratization of skills has unlocked our wildest imaginations and honestly made us a little audacious.
But what would it actually feel like if OpenClaw had a body? I had played around a ton with ClawdBot back when it was still called that, and had even built an EA agent I enjoyed. But it was not disruptive. I wanted to create something that felt as close to having a real EA as possible.
A few months ago I pre-ordered the NEO from 1X Robotics. I told my husband we were safe because it couldn't crack a walnut. He let me know that he also cannot crack a walnut with his bare hands.
I do not know exactly when it will arrive. I do not know exactly what it will do when it gets here. But I know that the thing I have been imagining since we were kids is actually becoming real.
Robots are coming into the physical world. Not as sci-fi. Not as factory equipment. As things that live in your house and help you with your life.
I wanted to understand it before it arrived on my doorstep.
So I bought a robot arm on Amazon.
This is the full log of everything that happened next.
// What you are building
By the end of this guide you will have a voice assistant called Jarvis that lives on a Raspberry Pi on your desk. It listens for the wake word "Hey Jarvis" and responds to questions out loud using a real AI voice. It connects to your Gmail and Google Calendar directly. It controls a physical robot arm that presses colored buttons when important emails arrive or meetings are about to start. It greets you with a wave when you sit down at your desk. It runs automatically every time the Pi powers on without you touching anything.
You do not need an engineering background to build this. I did not have one.
You need patience, a willingness to Google things, and the ability to sit with confusion long enough for it to become clarity.
// The parts list
Here is everything I bought. Total cost around $475.
LewanSoul xArm 6DOF Robot Arm — Amazon — $279.99
CanaKit Raspberry Pi 4 4GB — Amazon — $149.97
USB Microphone — Amazon — ~$30
USB Speaker — Amazon — ~$16
Recordable Buttons — Amazon — ~$16
You will also need a PC to use the initial HiWonder software for the arm, a micro SD card (comes with CanaKit), and a USB cable for the Arduino (comes with the arm kit). I did everything but test the robot on my Mac mini.
I purchased a PCA9685 board that ended up not being needed for this build. The arm has its own controller. I kept it in case I add more sensors later.
// Before you start: accounts you need
Set these up before you touch any hardware.
OpenClaw at openclaw.ai. Free and open source. This is the AI agent infrastructure that becomes Jarvis's brain.
ElevenLabs at elevenlabs.io. Free tier gives you enough to start. This is Jarvis's voice.
Anthropic at console.anthropic.com. You need an API key. OpenClaw uses Claude under the hood.
Google Cloud at console.cloud.google.com. Free. You need this to connect Jarvis to your actual Gmail and Calendar without the connection expiring every few days.
Zapier at zapier.com. Free tier available. You can use Zapier MCP as a quick way to connect OpenClaw to Gmail and Calendar before setting up direct Google OAuth.
// Step 1: Build the arm
The LewanSoul arm comes with good instructions. Follow them. It takes about two hours.
The one thing I wish someone had told me: the pre-installed action groups seem to assume the arm is elevated off the desk. If you place it flat on a table the pre-programmed movements will try to go lower than physically possible. Ignore the pre-installed action groups entirely. You will program your own positions later and they will be much more reliable.
When you are done assembling, plug the arm into your computer via the included USB cable and open the HiConnect software to confirm everything moves. Once you know the arm works, set it aside. You will come back to it.
// Step 2: Set up the Raspberry Pi
The Pi is the brain that sits between your Mac and the arm.
Flash the SD card
Download Raspberry Pi Imager from raspberrypi.com/software. Insert the micro SD card into your computer using the CanaKit reader.
In the Imager select Raspberry Pi 4 as the device, Raspberry Pi OS 64-bit as the OS, and your SD card as the storage.
Before you click Write, click the gear icon and fill in a hostname (raspberrypi), enable SSH, and set your WiFi network name and password, a username, and a password you will remember.
Click Write. Wait about 10 minutes.
Boot and connect
Put the SD card in the Pi. Power it on. Wait 60 seconds. Then from your Mac terminal:
ssh yourusername@raspberrypi.local
It will ask if you trust the connection. Type yes. Then enter your password.
If it connects you will see a prompt like this:
yourusername@raspberrypi:~ $
You are now inside the Pi.
// Step 3: Connect the arm to the Pi
Plug the arm USB cable into one of the black USB ports on the Pi. The blue ports are USB 3 and the arm does not benefit from the extra speed.
Confirm the Pi sees the arm:
lsusb
You should see a line mentioning STMicroelectronics. That is your arm.
Install the Python libraries that control it:
pip3 install hid --break-system-packages
pip3 install xarm --break-system-packages
The xarm library has a small bug with newer versions of the hid module. Open the controller file (swap python3.13 for whatever Python version your Pi is running):
nano ~/.local/lib/python3.13/site-packages/xarm/controller.py
Find the USB section in the __init__ method and replace it so it reads:
elif com_port.startswith('USB'):
    import hid
    self._device = hid.Device(0x0483, 0x5750)
    self._device.nonblocking = True
    self._usb_recv_event = False
    self._is_serial = False
Save and exit. Then set permissions:
sudo chmod 666 /dev/hidraw0
Make this permanent:
echo 'SUBSYSTEM=="hidraw", ATTRS{idVendor}=="0483", ATTRS{idProduct}=="5750", MODE="0666"' | sudo tee /etc/udev/rules.d/99-xarm.rules
sudo udevadm control --reload-rules
Test that the arm moves:
python3 - << 'EOF'
import xarm
from time import sleep
arm = xarm.Controller('USB')
arm.setPosition(1, 100, 1000, True)
sleep(1)
arm.setPosition(1, 900, 1000, True)
EOF
If servo 1 sweeps back and forth, everything is working.
// Step 4: Install OpenClaw on your Mac
OpenClaw is the agent infrastructure. It runs on your Mac and the Pi communicates with it.
First install Xcode tools:
xcode-select --install
A popup will appear. Click Install. Wait for it to finish (5 to 10 minutes).
Then install Homebrew:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
When it finishes, add it to your path:
echo >> ~/.zprofile
echo 'eval "$(/opt/homebrew/bin/brew shellenv zsh)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv zsh)"
Install Node and then OpenClaw:
brew install node
npm install -g openclaw@latest
openclaw onboard --install-daemon
Follow the setup prompts. When it asks for an API key, use your Anthropic key. Once it is running you will see the OpenClaw TUI in your terminal. Type something to confirm it responds.
// Step 5: Enable the OpenClaw HTTP API
By default OpenClaw only accepts connections from localhost. You need to open it to your local network so the Pi can reach it.
nano ~/.openclaw/openclaw.json
Find the gateway section and make sure it includes bind set to lan and the http endpoints block:
"gateway": {
  "port": 18789,
  "mode": "local",
  "bind": "lan",
  "http": {
    "endpoints": {
      "responses": {
        "enabled": true
      }
    }
  }
}
Save and restart:
openclaw gateway restart
Find your Mac's IP address:
ipconfig getifaddr en1
If that prints nothing, try en0. Which interface carries your network connection varies from Mac to Mac.
Write this down. You will need it in every script on the Pi.
Test the API:
curl -sS http://localhost:18789/v1/responses \
-H "Authorization: Bearer YOUR_GATEWAY_TOKEN" \
-H "Content-Type: application/json" \
-H "x-openclaw-agent-id: main" \
-d '{"model": "openclaw", "input": "say hello"}'
Your gateway token is in openclaw.json under gateway.auth.token. If it returns a response you are ready for the voice layer.
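If you would rather grab the token programmatically than hunt through the file by hand, here is a quick sketch. It assumes only the layout described above (the token lives at gateway.auth.token inside openclaw.json):

```python
import json
from pathlib import Path

def read_gateway_token(config_path="~/.openclaw/openclaw.json"):
    """Read the gateway auth token out of the OpenClaw config file.

    Assumes the token lives at gateway.auth.token, as described above.
    """
    config = json.loads(Path(config_path).expanduser().read_text())
    return config["gateway"]["auth"]["token"]

# print(read_gateway_token())  # paste the result into the curl command
```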
// Step 6: Set up the voice layer on the Pi
SSH back into the Pi. Install everything:
pip3 install faster-whisper --break-system-packages
pip3 install elevenlabs --break-system-packages
pip3 install openwakeword --break-system-packages
sudo apt-get install -y portaudio19-dev mpg123
pip3 install pyaudio --break-system-packages
pip3 install scipy --break-system-packages
Find your audio devices:
aplay -l
arecord -l
Note the card numbers for your speaker and microphone. Mine were card 3 for the speaker and card 4 for the mic. Yours may differ. Update all the scripts below with your actual card numbers.
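If reading the `aplay -l` / `arecord -l` output by eye gets tedious, a little regex pulls the card numbers out for you. The sample string below is illustrative, not an exact copy of the tool's output:

```python
import re

def parse_cards(listing):
    """Map card numbers to device names from `aplay -l` / `arecord -l` style output."""
    return {int(num): name for num, name in re.findall(r"card (\d+): (\w+)", listing)}

sample = (
    "card 3: Device [USB Audio Device], device 0: USB Audio [USB Audio]\n"
    "card 4: Mic [USB Microphone], device 0: USB Audio [USB Audio]\n"
)
print(parse_cards(sample))  # {3: 'Device', 4: 'Mic'}
```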
Test your speaker:
speaker-test -D plughw:3,0 -c 2 -t wav -l 1
You should hear front left, front right.
Test your microphone:
arecord -D plughw:4,0 -f S16_LE -r 48000 -c 1 -d 5 /tmp/test.wav
aplay -D plughw:3,0 /tmp/test.wav
Speak for 5 seconds. You should hear yourself play back.
Create the voice script:
mkdir ~/arm
cat > ~/arm/speak.py << 'EOF'
import sys
import subprocess
from elevenlabs import ElevenLabs

client = ElevenLabs(api_key="YOUR_ELEVENLABS_KEY")

def speak(text):
    # mpg123's -k 10 skips the first frames of the file, so the "hmm"
    # padding keeps the start of the real text from being clipped.
    padded = "hmm. " + text
    audio = client.text_to_speech.convert(
        text=padded,
        voice_id="YOUR_VOICE_ID",
        model_id="eleven_turbo_v2",
        output_format="mp3_44100_128"
    )
    audio_bytes = b"".join(audio)
    with open("/tmp/speech.mp3", "wb") as f:
        f.write(audio_bytes)
    # -a selects the output device; match it to your speaker's card number.
    subprocess.run(["mpg123", "-a", "plughw:3,0", "-k", "10", "/tmp/speech.mp3"])

if __name__ == "__main__":
    text = " ".join(sys.argv[1:])
    speak(text)
EOF
Test it:
python3 ~/arm/speak.py "Hello. I am Jarvis."
If you hear the voice through your speaker, the voice layer is working.
// Step 7: The wake word (this took embarrassingly long)
The wake word uses OpenWakeWord. The tricky part: the model expects audio at 16000Hz but your USB mic records at 48000Hz. You have to downsample the audio before feeding it to the model. That one detail cost me about two hours.
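The math is simple once you see it: 48000 / 16000 = 3, so every output sample has to cover three input samples. A standalone numpy sketch of the idea (the real loop below uses scipy.signal.resample, which does a cleaner job than this naive average-and-decimate):

```python
import numpy as np

RATE_IN, RATE_OUT = 48000, 16000
CHUNK = 4800  # 100 ms of audio at 48 kHz

# Fake a chunk of mic audio (in the real loop this comes from PyAudio).
raw = (np.sin(2 * np.pi * 440 * np.arange(CHUNK) / RATE_IN) * 10000).astype(np.int16)

# 48000 / 16000 = 3, so average each group of three samples into one.
# This is a crude low-pass + decimate in a single step.
downsampled = raw.reshape(-1, 3).mean(axis=1).astype(np.int16)

print(len(raw), "->", len(downsampled))  # 4800 -> 1600
```

Feed the model 48 kHz audio without this step and it does not crash. It just never hears you, which is exactly the two hours I lost.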
Create the main Jarvis loop:
cat > ~/arm/wake_loop.py << 'EOF'
import pyaudio
import numpy as np
import subprocess
import requests
import re
from scipy.signal import resample
from openwakeword.model import Model
from faster_whisper import WhisperModel

MAC_IP = "YOUR_MAC_IP"
GATEWAY_TOKEN = "YOUR_GATEWAY_TOKEN"
MIC_INDEX = 4

owwModel = Model(inference_framework='onnx')

# Load Whisper once at startup instead of on every wake word.
whisper = WhisperModel("tiny.en", compute_type="int8")

audio = pyaudio.PyAudio()
stream = audio.open(
    rate=48000,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=4800,
    input_device_index=MIC_INDEX
)

print("Listening for 'Hey Jarvis'...")
while True:
    raw = stream.read(4800, exception_on_overflow=False)
    samples = np.frombuffer(raw, dtype=np.int16).astype(np.float32)
    # The mic records at 48kHz but the wake word model expects 16kHz.
    resampled = resample(samples, int(len(samples) * 16000 / 48000))
    audio_int16 = resampled.astype(np.int16)
    owwModel.predict(audio_int16)
    for mdl in owwModel.prediction_buffer.keys():
        scores = list(owwModel.prediction_buffer[mdl])
        if scores[-1] > 0.5:
            owwModel.reset()
            print("Wake word detected!")
            subprocess.run(["python3", "/root/arm/speak.py", "Yes?"])
            print("Recording question...")
            subprocess.run([
                "arecord", "-D", "plughw:4,0", "-f", "S16_LE",
                "-r", "16000", "-c", "1", "-d", "5", "/tmp/question.wav"
            ])
            segments, _ = whisper.transcribe("/tmp/question.wav")
            question = " ".join([s.text for s in segments]).strip()
            print(f"Heard: {question}")
            if question:
                resp = requests.post(
                    f"http://{MAC_IP}:18789/v1/responses",
                    headers={
                        "Authorization": f"Bearer {GATEWAY_TOKEN}",
                        "Content-Type": "application/json",
                        "x-openclaw-agent-id": "main"
                    },
                    json={"model": "openclaw", "input": question}
                )
                data = resp.json()
                answer = ""
                for item in data.get("output", []):
                    if item.get("type") == "message":
                        for c in item.get("content", []):
                            if c.get("type") == "output_text":
                                answer += c.get("text", "")
                # Strip markdown asterisks so they are not read aloud.
                clean = re.sub(r'\*+', '', answer).strip()
                print(f"Answer: {clean}")
                if clean:
                    subprocess.run(["python3", "/root/arm/speak.py", clean])
            break
EOF
Test it:
python3 ~/arm/wake_loop.py
Say "Hey Jarvis" and wait. If it responds with "Yes?" and then answers your question out loud, the core loop is working.
// Step 8: Make the arm actually do something
This is where it got fun.
Create positions for your arm. Each servo can go from about 100 to 900. You need to find the exact values for your setup by experimenting.
cat > ~/arm/positions.py << 'EOF'
import xarm
from time import sleep

arm = xarm.Controller('USB')

def home():
    arm.setPosition(1, 500, 1000, True)
    arm.setPosition(2, 500, 1000, True)
    arm.setPosition(3, 500, 1000, True)
    arm.setPosition(4, 500, 1000, True)
    arm.setPosition(5, 500, 1000, True)
    arm.setPosition(6, 500, 1000, True)
    sleep(1)

def wave():
    home()
    arm.setPosition(2, 200, 800, True)
    arm.setPosition(3, 300, 800, True)
    sleep(0.5)
    for _ in range(3):
        arm.setPosition(6, 200, 400, True)
        sleep(0.3)
        arm.setPosition(6, 800, 400, True)
        sleep(0.3)
    home()

def press_red():
    home()
    arm.setPosition(1, 300, 1000, True)
    arm.setPosition(2, 650, 1000, True)
    arm.setPosition(3, 350, 1000, True)
    sleep(0.5)
    arm.setPosition(3, 300, 500, True)
    sleep(0.3)
    home()

def press_blue():
    home()
    arm.setPosition(1, 700, 1000, True)
    arm.setPosition(2, 650, 1000, True)
    arm.setPosition(3, 350, 1000, True)
    sleep(0.5)
    arm.setPosition(3, 300, 500, True)
    sleep(0.3)
    home()

if __name__ == "__main__":
    import sys
    action = sys.argv[1] if len(sys.argv) > 1 else "wave"
    if action == "wave": wave()
    elif action == "red": press_red()
    elif action == "blue": press_blue()
    else: home()
EOF
Test each position:
python3 ~/arm/positions.py wave
python3 ~/arm/positions.py red
python3 ~/arm/positions.py blue
Adjust the servo values until the arm actually hits each button cleanly. This takes patience. I spent about an hour on this part alone.
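While tuning, it is easy to fat-finger a value outside the arm's range and slam a servo against its mechanical limit. A tiny guard I would wrap around every setPosition call while experimenting (the 100 to 900 range comes from the arm's documented servo range above; adjust if yours differs):

```python
def clamp_servo(value, lo=100, hi=900):
    """Clamp a target position to the servo's safe mechanical range."""
    return max(lo, min(hi, value))

# arm.setPosition(3, clamp_servo(1200), 500, True)  # sends 900, not 1200
```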
// A note on Zapier MCP
OpenClaw connects to your tools through Zapier MCP. Go to mcp.zapier.com, connect your accounts, and copy the MCP server URL into your openclaw.json config. This gives OpenClaw access to Gmail, Calendar, HubSpot, and more in one setup.
The issue I ran into was speed for time-sensitive functions. When the heartbeat job checks your email through Zapier, it goes through an extra layer — OpenClaw calls Zapier, Zapier calls Google, Google responds, Zapier responds, OpenClaw processes. Each hop adds latency. Combined with the arm movements and ElevenLabs generating audio, my heartbeat jobs were taking over 12 minutes to complete. That meant Jarvis could not respond to Telegram messages for 12 minutes every time a check ran.
The fix was not replacing Zapier entirely. I still use it for HubSpot and other integrations. But for the two functions that run constantly — email and calendar — I bypassed Zapier and connected directly to Google via OAuth (Step 8). Those calls now take under 5 seconds instead of 45.
Use Zapier for everything to start. Then move your highest frequency functions to direct API connections once you feel the latency.
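If you want numbers instead of vibes when deciding which functions to migrate, a throwaway stopwatch helper makes the comparison concrete. time.sleep here is just a stand-in for whatever call you are measuring:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start

# Example with a stand-in for an API call that takes about 0.1s:
_, elapsed = timed(time.sleep, 0.1)
print(f"{elapsed:.2f}s")
```

Run the same question through the Zapier path and the direct path and compare the two numbers. That is how I found my 45-second vs 5-second gap.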
// Step 9: Connect Gmail and Google Calendar
This is the step that makes it real. Jarvis checks your email and calendar and decides when to press a button.
Go to console.cloud.google.com. Create a new project. Enable the Gmail API and the Google Calendar API.
Go to Credentials. Create an OAuth 2.0 Client ID. Set the application type to Desktop. Download the credentials JSON file.
Copy the credentials file to your Pi:
scp ~/Downloads/credentials.json yourusername@raspberrypi.local:~/arm/credentials.json
Install the Google libraries on the Pi:
pip3 install google-auth google-auth-oauthlib google-api-python-client --break-system-packages
Create the auth script:
cat > ~/arm/google_auth.py << 'EOF'
import pickle
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = [
    'https://www.googleapis.com/auth/gmail.readonly',
    'https://www.googleapis.com/auth/calendar.readonly'
]

flow = InstalledAppFlow.from_client_secrets_file('/root/arm/credentials.json', SCOPES)
creds = flow.run_local_server(port=0)

with open('/root/arm/token.pickle', 'wb') as f:
    pickle.dump(creds, f)
print("Auth complete. Token saved.")
EOF
Run this on the Pi. The first time, it opens a browser for Google sign-in. If your Pi is headless, one approach is to pin the port with run_local_server(port=8080), tunnel it from your Mac with ssh -L 8080:localhost:8080 yourusername@raspberrypi.local, and open the printed URL in your Mac's browser. After that, the token persists.
python3 ~/arm/google_auth.py
Now create the email and calendar checker:
cat > ~/arm/check_triggers.py << 'EOF'
import pickle
import subprocess
from datetime import datetime, timezone, timedelta
from google.auth.transport.requests import Request
from googleapiclient.discovery import build

def get_creds():
    with open('/root/arm/token.pickle', 'rb') as f:
        creds = pickle.load(f)
    # Access tokens expire after about an hour; refresh quietly when needed.
    if creds.expired and creds.refresh_token:
        creds.refresh(Request())
    return creds

def check_email():
    creds = get_creds()
    service = build('gmail', 'v1', credentials=creds)
    results = service.users().messages().list(
        userId='me', q='is:unread from:ceo@yourcompany.com', maxResults=1
    ).execute()
    if results.get('messages'):
        print("CEO email detected. Pressing red button.")
        subprocess.run(['python3', '/root/arm/positions.py', 'red'])
        subprocess.run(['python3', '/root/arm/speak.py', 'You have an important email from the CEO.'])

def check_calendar():
    creds = get_creds()
    service = build('calendar', 'v3', credentials=creds)
    now = datetime.now(timezone.utc)
    soon = now + timedelta(minutes=5)
    events = service.events().list(
        calendarId='primary',
        timeMin=now.isoformat(),
        timeMax=soon.isoformat(),
        singleEvents=True,
        orderBy='startTime'
    ).execute()
    if events.get('items'):
        event = events['items'][0]
        name = event.get('summary', 'a meeting')
        print(f"Upcoming meeting: {name}. Pressing blue button.")
        subprocess.run(['python3', '/root/arm/positions.py', 'blue'])
        subprocess.run(['python3', '/root/arm/speak.py', f'You have {name} starting in less than five minutes.'])

if __name__ == '__main__':
    check_email()
    check_calendar()
EOF
Test it:
python3 ~/arm/check_triggers.py
If you have an unread CEO email it will press the red button. If you have a meeting in the next 5 minutes it will press the blue button.
// Step 10: The greeting wave
I wanted Jarvis to wave when I sat down at my desk. The simplest version: wave on boot.
cat > ~/arm/greet.py << 'EOF'
import subprocess
from time import sleep
sleep(5)
subprocess.run(['python3', '/root/arm/positions.py', 'wave'])
subprocess.run(['python3', '/root/arm/speak.py', 'Good morning. I am Jarvis. Ready when you are.'])
EOF
Later I added a simple motion detection version using the USB camera, but the boot wave was the version that made me feel something. The first time it waved at me and said good morning I just stared at it for a while.
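For the curious, the motion check was nothing fancy: frame differencing. Here is a hedged sketch of the core idea, using synthetic numpy arrays in place of real camera frames (my actual version grabbed grayscale frames from the USB camera; the thresholds below are illustrative starting points, not tuned values):

```python
import numpy as np

def motion_detected(prev_frame, frame, pixel_threshold=25, changed_fraction=0.02):
    """Return True if enough pixels changed between two grayscale frames."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > pixel_threshold).mean() > changed_fraction

# Two synthetic 120x160 frames: identical, then one with a bright blob.
still = np.zeros((120, 160), dtype=np.uint8)
moved = still.copy()
moved[40:80, 60:100] = 200  # 40x40 blob, about 8% of the frame

print(motion_detected(still, still))  # False
print(motion_detected(still, moved))  # True
```

Grab a frame every second or so, compare it to the previous one, and trigger the wave when motion_detected flips to True a couple of times in a row.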
// Step 11: Auto-start everything on boot
Create a systemd service so everything runs when the Pi powers on:
sudo tee /etc/systemd/system/jarvis.service << 'EOF'
[Unit]
Description=Jarvis Assistant
After=network-online.target sound.target
Wants=network-online.target
[Service]
Type=simple
User=root
WorkingDirectory=/root/arm
ExecStartPre=/bin/sleep 10
ExecStart=/usr/bin/python3 /root/arm/start_all.py
Restart=on-failure
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
Create the startup script. (The scripts reference /root/arm because the service runs as root; if you keep everything under a regular user's home directory instead, change User= in the service and the /root/arm paths throughout to match.)
cat > ~/arm/start_all.py << 'EOF'
import subprocess
import threading
from time import sleep

def run_greet():
    subprocess.run(['python3', '/root/arm/greet.py'])

def run_wake_loop():
    subprocess.run(['python3', '/root/arm/wake_loop.py'])

def run_triggers():
    while True:
        subprocess.run(['python3', '/root/arm/check_triggers.py'])
        sleep(120)

threading.Thread(target=run_greet, daemon=True).start()
sleep(10)
threading.Thread(target=run_triggers, daemon=True).start()
run_wake_loop()
EOF
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable jarvis.service
sudo systemctl start jarvis.service
Check that it is running:
sudo systemctl status jarvis.service
If it shows active (running), Jarvis will start automatically every time you plug in the Pi. If it does not, read the logs with journalctl -u jarvis.service to see what failed.
// Step 12: What I learned
I am not an engineer. I do not call myself one. But I built a thing that listens, thinks, speaks, and moves. And it checks my email and calendar.
Here is what I actually learned:
Hardware is unforgiving. Software lets you undo. Hardware does not. When the arm swings into the table because your coordinates are wrong, nothing catches that error. You just hear the sound and wince.
Audio is its own universe. Sample rates, buffer sizes, device indices. None of it is intuitive. The documentation is scattered across forums from 2016.
The wake word was the hardest part. Not because the library is bad. OpenWakeWord is excellent. But because the sample rate mismatch between the mic and the model is the kind of bug that gives you no error message. It just silently does not work.
The moment it worked was not dramatic. It was quiet. I said "Hey Jarvis" and it said "Yes?" and I asked it a question and it answered and I just sat there. Right up until then I had thought I might have to return all of this to Amazon and forget it ever happened, lol.
If you are reading this and you are not an engineer and you want to build something, I want you to know that the gap between wanting to build and actually building has never been smaller. The tools exist. The instructions exist. You just have to be willing to sit with confusion for a while.
I sat with it for a weekend. And now there is a robot arm on my desk that waves at me in the morning.
That is worth the confusion.
Already tomorrow,
Thalía