
Building a Real-Time Voice Assistant with Local LLMs on a Raspberry Pi



Introduction

In this document, I’m sharing my journey of turning a Raspberry Pi into a powerful, real-time voice assistant. The goal was to:

  • Capture voice input through a web interface.
  • Process the text using a local LLM (like Mistral) running on the Pi.
  • Generate voice responses using Piper for text-to-speech (TTS).
  • Stream everything in real-time via WebSockets.

All of this runs offline on the Raspberry Pi — no cloud services involved. Let’s dive into how I built it step by step!

1. Setting up the Raspberry Pi

First, I set up my Raspberry Pi with the latest Raspberry Pi OS. It’s important to enable hardware interfaces and connect a USB microphone and speaker.

Steps:

  1. Update the system:
   sudo apt-get update
   sudo apt-get upgrade
  2. Enable the audio interface:
   sudo raspi-config

Navigate to System Options > Audio and select the correct output/input device. To confirm that the USB microphone and speaker are detected, you can also list them with arecord -l and aplay -l.

2. Installing Ollama for Local LLMs

Ollama makes it easy to run local LLMs like Mistral on your Raspberry Pi. I installed it using:

curl -fsSL https://ollama.com/install.sh | sh

Once installed, I pulled the Mistral model:

ollama pull mistral

To confirm it works, I ran a quick test:

ollama run mistral

The model was ready to process text right on the Pi!
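Ollama also exposes a local REST API (on port 11434 by default), which comes in handy when calling the model from code. Here's a minimal sketch of querying Mistral from Node.js this way, assuming Node 18 or newer so the built-in fetch is available:

// ask-mistral.js – minimal sketch: query the local Ollama REST API
// Assumes Ollama is running on its default port (11434) and "mistral" has been pulled.
async function askMistral(prompt) {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'mistral', prompt, stream: false }),
  });
  const data = await res.json();
  return data.response; // the full generated text when stream is false
}

askMistral('Say hello in one short sentence.').then(console.log);

The server later in this post sticks with the ollama CLI via exec to keep things simple, but the HTTP API avoids shell quoting issues and can stream tokens as they are generated.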

3. Setting up Piper for Text-to-Speech (TTS)

For offline voice generation, I chose Piper — a fantastic open-source TTS engine.

  1. Install dependencies:
   sudo apt-get install wget build-essential libsndfile1
  2. Download Piper for ARM64 (Raspberry Pi):
   wget https://github.com/rhasspy/piper/releases/download/v1.0.0/piper_arm64.tar.gz
   tar -xvzf piper_arm64.tar.gz
   chmod +x piper
   sudo mv piper /usr/local/bin/
  3. Test if Piper works:
   echo "Hello, world!" | piper --model en_US --output_file output.wav
   aplay output.wav

The --model argument selects the voice; in practice it expects the path to a downloaded .onnx voice file from the Piper project, so adjust it to wherever your voice model lives.

Now the Pi could "talk" back!
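Since the backend calls Piper from Node.js, here is a minimal sketch of a helper that pipes text into Piper over stdin instead of going through echo, which avoids shell quoting problems when the LLM output contains quotes. The voice path below is a placeholder — point it at whichever .onnx voice file you actually downloaded:

// speak.js – minimal sketch: call Piper from Node.js via stdin
const { spawn } = require('child_process');

// Placeholder path – replace with the .onnx voice file you downloaded.
const VOICE_MODEL = '/home/pi/voices/en_US-voice.onnx';

function speak(text, outFile = 'output.wav') {
  return new Promise((resolve, reject) => {
    const piper = spawn('piper', ['--model', VOICE_MODEL, '--output_file', outFile]);
    piper.on('error', reject); // e.g. piper not on PATH
    piper.on('close', (code) =>
      code === 0 ? resolve(outFile) : reject(new Error(`piper exited with ${code}`))
    );
    piper.stdin.write(text); // send the text to synthesize
    piper.stdin.end();
  });
}

module.exports = { speak };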

4. Creating the Backend (Node.js)

I built a simple Node.js server (using the express and ws packages, installed with npm install express ws) to:

  • Accept text from the client (voice input from a web app).
  • Process it using Mistral (via Ollama).
  • Convert the LLM response to speech with Piper.
  • Stream the audio back to the client.

server.js:

const express = require('express');
const { exec } = require('child_process');
const WebSocket = require('ws');

const app = express();
const PORT = 3001;

// Allow the React dev server (port 3000) to fetch the generated audio file.
app.use((req, res, next) => {
  res.setHeader('Access-Control-Allow-Origin', '*');
  next();
});

// Serve generated files (e.g. output.wav) so the client can download them.
app.use(express.static(__dirname));

// WebSocket setup
const wss = new WebSocket.Server({ port: 3002 });

wss.on('connection', (ws) => {
  console.log('Client connected');

  ws.on('message', (message) => {
    // ws delivers a Buffer; convert it to text and escape double quotes before
    // handing it to the shell (piping via stdin, as in the Piper helper above,
    // is the more robust option).
    const prompt = message.toString().replace(/"/g, '\\"');
    console.log('Received:', prompt);

    // Run Mistral via the Ollama CLI
    exec(`ollama run mistral "${prompt}"`, (err, stdout) => {
      if (err) {
        console.error('LLM error:', err);
        // Keep the same JSON shape the client expects
        ws.send(JSON.stringify({ text: 'Error processing your request.' }));
        return;
      }

      // Convert the LLM response to speech using Piper
      // (adjust --model to point at your downloaded voice file)
      const reply = stdout.replace(/"/g, '\\"');
      exec(`echo "${reply}" | piper --model en_US --output_file output.wav`, (ttsErr) => {
        if (ttsErr) {
          console.error('Piper error:', ttsErr);
          ws.send(JSON.stringify({ text: 'Error generating speech.' }));
          return;
        }

        // Send the text and the name of the audio file back to the client
        ws.send(JSON.stringify({ text: stdout, audio: 'output.wav' }));
      });
    });
  });
});

app.listen(PORT, () => {
  console.log(`Server running at http://localhost:${PORT}`);
});
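Before wiring up the frontend, it's handy to exercise the WebSocket endpoint on its own. Here's a minimal sketch using the same ws package; the address is a placeholder, so use localhost when running it on the Pi itself:

// test-client.js – minimal sketch: send one prompt to the WebSocket server
const WebSocket = require('ws');

// Replace with your Pi's address if running from another machine.
const ws = new WebSocket('ws://localhost:3002');

ws.on('open', () => ws.send('Tell me a one-line joke.'));
ws.on('message', (data) => {
  console.log('Server replied:', data.toString());
  ws.close();
});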

5. Building the Real-Time Web Interface (React)

For the frontend, I created a simple React app to:

  • Record voice input (the minimal version below uses a text box; see the voice-capture sketch after App.js).
  • Display real-time text responses.
  • Play the generated speech audio.

App.js:

import React, { useState, useEffect, useRef } from 'react';

function App() {
  const [text, setText] = useState('');
  const [response, setResponse] = useState('');
  const [audio, setAudio] = useState(null);
  const ws = useRef(null);

  // Use the page's hostname so the app also works when opened via the Pi's IP.
  const host = window.location.hostname;

  useEffect(() => {
    // Open the WebSocket once, not on every render.
    ws.current = new WebSocket(`ws://${host}:3002`);

    ws.current.onmessage = (event) => {
      const data = JSON.parse(event.data);
      setResponse(data.text);

      if (data.audio) {
        // Fetch the generated WAV file from the Express server and play it.
        fetch(`http://${host}:3001/${data.audio}`)
          .then(res => res.blob())
          .then(blob => setAudio(URL.createObjectURL(blob)));
      }
    };

    return () => ws.current.close();
  }, [host]);

  const handleSend = () => {
    if (ws.current && ws.current.readyState === WebSocket.OPEN) {
      ws.current.send(text);
    }
  };

  return (
    <div>
      <h1>Voice Assistant</h1>
      <textarea value={text} onChange={(e) => setText(e.target.value)} />
      <button onClick={handleSend}>Send</button>
      <h2>Response:</h2>
      <p>{response}</p>
      {audio && <audio controls src={audio} />}
    </div>
  );
}

export default App;
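To actually capture voice rather than typed text, one option is the browser's Web Speech API. A minimal sketch, assuming a browser that exposes SpeechRecognition (Chrome's implementation relies on a cloud service, so a fully offline pipeline would need a local speech-to-text engine instead):

// Minimal sketch: fill the text state from the microphone via the Web Speech API.
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;

function listenOnce(onText) {
  const recognition = new SpeechRecognition();
  recognition.lang = 'en-US';
  recognition.interimResults = false;
  recognition.onresult = (event) => {
    // Use the best transcript of the first result.
    onText(event.results[0][0].transcript);
  };
  recognition.start();
}

// In App.js, a "Speak" button could simply call: listenOnce(setText)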

6. Running the Project

Once the backend and frontend were ready, I launched both:

  • Start the backend:
  node server.js
  • Run the React app:
  npm start

I accessed the web app on my Raspberry Pi’s IP at port 3000 and spoke into the mic — and voilà! The assistant responded in real-time, all processed locally.

Conclusion

Building a real-time, fully offline voice assistant on a Raspberry Pi was an exciting challenge. With:

  • Ollama for running local LLMs (like Mistral)
  • Piper for high-quality text-to-speech
  • WebSockets for real-time communication
  • React for a smooth web interface

... I now have a personalized voice AI that works without relying on the cloud.
