# How to Create Voice Agents with Groq and SILMA TTS

This guide explains how to create voice agents optimized for ultra-low latency using Groq LLM models and SILMA TTS.

## 🎯 Quick Summary

When creating a new voice agent at https://www.halavoice.store/app/agents, you can now select:

1. **Groq LLM Models** - Ultra-low latency (~80ms Time-To-First-Token)
2. **SILMA TTS** - Bilingual Arabic/English voice cloning with fast synthesis

## 📝 Step-by-Step Guide

### 1. Navigate to Agent Creation

1. Go to https://www.halavoice.store/app/agents
2. Click the "Create Agent" button
3. The Agent Creation Wizard will open

### 2. Select Use Case (Step 1)

Choose the use case that best fits your needs:
- **Receptionist** - Handle incoming calls, route to departments
- **Appointment Setter** - Schedule meetings, qualify leads
- **Sales Agent** - Lead qualification and sales conversations
- **Support Agent** - Customer service and troubleshooting
- **Survey Agent** - Conduct surveys and collect feedback

Click "Next" to continue.

### 3. Configure Agent Basics (Step 2)

1. **Agent Name**: Enter a name for your agent (e.g., "Sarah", "Receptionist")
2. **Telephony Provider**: Choose your provider
   - **Twilio** - ElevenLabs Conversational AI (default)
   - **Plivo** - OpenAI Realtime API
   - **Twilio + OpenAI** - Twilio with OpenAI Realtime
   - **SILMA TTS** - Bilingual voice cloning (select this for SILMA!)
   - **ElevenLabs SIP** - Native SIP trunk
   - **OpenAI SIP** - OpenAI with SIP trunk

3. **Voice Selection** (depends on provider):
   - For **SILMA TTS**: Select a voice clone you've created
   - For **other providers**: Select from the available voices

### 4. Configure Personality & LLM Model (Step 3)

This is where you select the **Groq LLM Model**:

1. **Voice Tone**: Choose professional, friendly, casual, or authoritative
2. **Personality**: Select helpful, enthusiastic, calm, or confident

3. **LLM Model Selection** (NEW!):
   
   **Voice-Optimized Models** (~80ms TTFT):
   - ⚡ **Llama 3.3 70B (Groq)** - Recommended for voice agents
   - ⚡ **Llama 3.1 70B (Groq)** - Alternative option
   
   **Standard Models** (~300-500ms TTFT):
   - GPT-4o Mini
   - GPT-4o
   - Claude 3.5 Sonnet
   - Claude 3 Haiku
   - Gemini 2.5 Flash
   - And others...

4. **Recommended Selection**:
   - For **real-time voice**: Select Groq models (marked with ⚡)
   - For **complex reasoning**: Select standard models

Click "Next" to continue.
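
The TTFT figures above are worth checking against your own deployment. The sketch below times how long a streaming response takes to yield its first non-empty chunk. `measure_ttft` and `fake_stream` are hypothetical helper names, not part of the platform or of any Groq SDK; `fake_stream` only simulates a delay so the example runs without network access. In practice you would pass in the text chunks from your provider's streaming response instead.

```python
import time
from typing import Iterable, Iterator, List, Optional, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[Optional[float], str]:
    """Consume a text stream; return (seconds until first non-empty chunk, full text)."""
    start = time.perf_counter()
    ttft: Optional[float] = None
    parts: List[str] = []
    for chunk in stream:
        # Record the elapsed time only once, at the first chunk that carries text.
        if ttft is None and chunk:
            ttft = time.perf_counter() - start
        parts.append(chunk)
    return ttft, "".join(parts)


def fake_stream(first_delay_s: float, tokens: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response, for local testing."""
    time.sleep(first_delay_s)  # simulated wait before the first token arrives
    for token in tokens:
        yield token


if __name__ == "__main__":
    ttft, text = measure_ttft(fake_stream(0.08, ["Hello", ", ", "world"]))
    print(f"TTFT: {ttft * 1000:.0f} ms, text: {text!r}")
```

Running the same measurement against a Groq model and a standard model from the same machine gives a like-for-like comparison for your region.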

### 5. Configure Prompts (Step 4)

1. **System Prompt**: Describe your agent's role and behavior
   - Use `{{company_name}}`, `{{agent_name}}` variables
   - Define responsibilities and guidelines

2. **First Message**: What the agent says when answering
   - Example: "Thank you for calling {{company_name}}. How may I help you today?"

Click "Next" to continue.
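
To illustrate how `{{variable}}` placeholders in a prompt could be filled in, here is a minimal sketch. It is not the platform's actual templating code; `render_prompt` is a hypothetical helper and "Acme Dental" is a made-up company name. The sketch assumes that a placeholder with no matching value should be treated as an error rather than left in the spoken text.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} with variables[name]; raise KeyError if a name is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value provided for placeholder '{name}'")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    first_message = "Thank you for calling {{company_name}}. How may I help you today?"
    print(render_prompt(first_message, {"company_name": "Acme Dental"}))
```

Failing loudly on a missing variable is a design choice: it avoids an agent reading a raw `{{company_name}}` token aloud to a caller.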

### 6. Configure Voice Settings (Step 5)

Adjust voice parameters (available for certain providers):
- **Stability**: Voice consistency (0.0 - 1.0)
- **Similarity Boost**: Voice cloning accuracy (0.0 - 1.0)
- **Speed**: Speech rate (0.5 - 2.0)

Click "Next" to continue.
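
If you manage agent settings in scripts, the ranges above can be checked before submitting them. The following is a sketch only: the `VoiceSettings` class, its field names, and its default values are assumptions made for illustration, and only the ranges come from this guide.

```python
from dataclasses import dataclass

# Allowed ranges as listed in this guide.
RANGES = {
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "speed": (0.5, 2.0),
}


@dataclass(frozen=True)
class VoiceSettings:
    # Default values are illustrative assumptions, not platform defaults.
    stability: float = 0.5
    similarity_boost: float = 0.75
    speed: float = 1.0

    def __post_init__(self) -> None:
        # Reject any value outside its documented range.
        for field_name, (low, high) in RANGES.items():
            value = getattr(self, field_name)
            if not low <= value <= high:
                raise ValueError(
                    f"{field_name}={value} is outside the allowed range {low} - {high}"
                )


if __name__ == "__main__":
    print(VoiceSettings(stability=0.6, speed=1.2))
```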

### 7. Review and Create (Step 6)

Review all settings:
- ✅ Use Case and Name
- ✅ Telephony Provider and Voice
- ✅ Language and LLM Model
- ✅ Personality and Prompts
- ✅ Voice Settings

Click "Create Agent" to finalize.

## 🎤 Selecting SILMA TTS

To use SILMA TTS for your voice agent:

1. **Create Voice Clones First**:
   - Go to Settings → SILMA TTS
   - Click "Create Clone"
   - Upload reference audio (8-30 seconds)
   - Enter name, language, and reference text
   - Save the voice clone

2. **Select SILMA TTS Provider**:
   - In Agent Creation Wizard → Step 2 (Basics)
   - Choose **SILMA TTS** as the telephony provider
   - Select your voice clone from the dropdown

3. **Benefits of SILMA TTS**:
   - Bilingual support (Arabic & English)
   - Instant voice cloning
   - Natural-sounding speech
   - Works with voice-optimized LLMs like Groq
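
Before uploading reference audio, it can help to confirm that the clip falls within the 8-30 second window from step 1. The sketch below does this with Python's standard `wave` module. It assumes the clip is an uncompressed WAV file; this guide does not state which formats SILMA TTS accepts, and the function names are hypothetical.

```python
import wave

MIN_SECONDS = 8.0   # lower bound from this guide
MAX_SECONDS = 30.0  # upper bound from this guide


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_reference_clip(path: str) -> bool:
    """Check whether the clip length is within the 8-30 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

For example, `is_valid_reference_clip("my_voice.wav")` returns `True` for a 12-second recording and `False` for a 3-second one.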

## ⚡ Performance Optimization Tips

### For Ultra-Low Latency (~400ms end-to-end):

1. **Use Groq LLM Model**:
   - Select "Llama 3.3 70B (Groq)" in Step 3
   - This provides ~80ms Time-To-First-Token

2. **Configure SILMA TTS API Server** (optional):
   - Go to Settings → SILMA TTS → Server Configuration
   - Select "API Server (Fast)" mode
   - Synthesis takes ~5s in this mode, versus ~75s in local mode

3. **Optimize Geographic Region**:
   - Select the region closest to your users
   - This can reduce network latency by up to roughly 50%, depending on distance

### Expected Latency Comparison:

| Configuration | End-to-End Latency |
|--------------|-------------------|
| Standard LLM + Local SILMA | ~80 seconds |
| Standard LLM + API Server | ~10 seconds |
| **Groq LLM + API Server** | **~400ms** ⚡ |
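
One way to reason about an end-to-end target such as ~400ms is to add up a per-stage budget. The sketch below is illustrative only: the stage breakdown (speech-to-text, LLM, TTS, network) is an assumption about a typical voice pipeline, and every number except the ~80ms Groq TTFT is a placeholder rather than a measurement. It also assumes the TTS streams audio, so only the time to the first audio chunk counts toward the response delay.

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Per-stage latency estimates in milliseconds (hypothetical breakdown)."""
    stt_ms: float              # speech-to-text finalization
    llm_ttft_ms: float         # LLM time-to-first-token
    tts_first_audio_ms: float  # time until the TTS emits its first audio chunk
    network_ms: float          # combined network round trips

    def total_ms(self) -> float:
        return (
            self.stt_ms
            + self.llm_ttft_ms
            + self.tts_first_audio_ms
            + self.network_ms
        )

    def within(self, target_ms: float) -> bool:
        return self.total_ms() <= target_ms


if __name__ == "__main__":
    # Only llm_ttft_ms=80 comes from this guide; the rest are placeholders.
    budget = LatencyBudget(stt_ms=100, llm_ttft_ms=80, tts_first_audio_ms=150, network_ms=50)
    print(f"Total: {budget.total_ms():.0f} ms, within 400 ms target: {budget.within(400)}")
```

Replacing the placeholders with values measured on your own calls shows which stage dominates and where optimization effort is best spent.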

## 🔧 Configuration Checklist

Before creating your agent:

- [ ] Created SILMA voice clones (if using SILMA TTS)
- [ ] Selected appropriate telephony provider
- [ ] Chosen voice-optimized LLM (Groq recommended)
- [ ] Configured system prompt and first message
- [ ] Adjusted voice settings (if available)
- [ ] Reviewed all settings before creation

## 📊 Monitoring Performance

After creating your agent:

1. Make a test call to your agent
2. Measure response latency
3. Check server logs for timing information:
   ```bash
   # Stream the agent's logs and keep only lines mentioning Groq, SILMA, or latency
   pm2 logs agentlabs | grep -iE "groq|silma|latency"
   ```
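
A single test call can be misleading, so it is worth recording the response latency of several calls and summarizing them. The sketch below assumes you have collected the measurements yourself, in milliseconds; `summarize_latencies` is a hypothetical helper and does not parse the server logs, whose format is not described in this guide.

```python
import statistics
from typing import Dict, List


def summarize_latencies(samples_ms: List[float]) -> Dict[str, float]:
    """Return count, median, 95th percentile, and maximum of latency samples."""
    if len(samples_ms) < 2:
        raise ValueError("Need at least two samples to compute percentiles")
    ordered = sorted(samples_ms)
    # n=20 yields 19 cut points at 5% steps; the last one is the 95th percentile.
    cut_points = statistics.quantiles(ordered, n=20, method="inclusive")
    return {
        "count": float(len(ordered)),
        "median_ms": float(statistics.median(ordered)),
        "p95_ms": float(cut_points[-1]),
        "max_ms": float(ordered[-1]),
    }
```

Tracking the 95th percentile alongside the median helps catch occasional slow responses that an average would hide.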

## 🎉 Summary

You can now create voice agents with:
- **Groq LLM Models** - Selected in Step 3 (Personality)
- **SILMA TTS** - Selected in Step 2 (Telephony Provider)

Combining a Groq model with SILMA TTS in API Server mode gives the lowest latency of the configurations compared in this guide, which makes it the best fit for real-time voice conversations!

## 📚 Additional Resources

- [Groq Setup Guide](./GROQ_SETUP.md)
- [SILMA API Setup Guide](./SILMA_API_SETUP.md)
- [Voice Agent Optimization Summary](./VOICE_AGENT_OPTIMIZATION_SUMMARY.md)
- [Deployment Status](./DEPLOYMENT_STATUS.md)

## 🆘 Troubleshooting

**Issue**: Groq models not showing in LLM selection
- **Solution**: Verify that the database migration was applied successfully

**Issue**: SILMA TTS not available in provider list
- **Solution**: Ensure the SILMA TTS plugin is enabled in Settings

**Issue**: High latency even with Groq
- **Solution**: Check network connectivity and switch SILMA TTS to "API Server (Fast)" mode (Settings → SILMA TTS → Server Configuration)

---

**Questions?** Check the SILMA TTS settings page or refer to the documentation above.
