# Groq API Setup for Voice Agents

This guide explains how to configure the Groq API for ultra-low-latency voice agents.

## Why Groq?

According to the benchmarks published at https://www.ntik.me/posts/voice-agent:
- **Time-To-First-Token (TTFT)**: ~80ms, versus 300-500ms for GPT-4o-mini
- **End-to-End Latency**: ~400ms, versus ~1.7s with traditional LLMs
- **Overall Speed**: roughly 3x faster than OpenAI models in the same benchmarks
- **Ideal For**: Real-time voice conversations

## Setup Steps

### 1. Get a Groq API Key

1. Visit [console.groq.com](https://console.groq.com)
2. Sign up or log in
3. Navigate to the API Keys section
4. Create a new API key
5. Copy the API key

### 2. Configure the Environment Variable

Add the Groq API key to your environment:

```bash
export GROQ_API_KEY='gsk_your_api_key_here'
```

For PM2, add it to `/home/ashraffarid2010/halavoice.store/.env`:

```bash
GROQ_API_KEY=gsk_your_api_key_here
```
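Before restarting, it can help to confirm that the variable is actually set in the current shell. The sketch below is a minimal sanity check, not part of the project: `check_groq_key` is a hypothetical helper, and the `gsk_` prefix test is an assumption based on the placeholder shown above.

```bash
# Hypothetical helper: succeeds when the key is non-empty and starts with "gsk_".
# The "gsk_" prefix is an assumption taken from the placeholder in this guide.
check_groq_key() {
  local key="${1:-}"
  if [ -z "$key" ]; then
    echo "GROQ_API_KEY is not set" >&2
    return 1
  fi
  case "$key" in
    gsk_*) return 0 ;;
    *) echo "GROQ_API_KEY does not start with gsk_" >&2; return 1 ;;
  esac
}

# Report the result without printing the key itself.
if check_groq_key "${GROQ_API_KEY:-}"; then
  echo "GROQ_API_KEY looks plausible"
else
  echo "GROQ_API_KEY needs attention"
fi
```

This only checks the shape of the value; it does not prove that Groq accepts the key.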

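### 3. Restart the Server

Restart the PM2 process so it picks up the new key. If the key was exported in the shell rather than written to `.env`, append `--update-env` to the command so PM2 refreshes the process environment:
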
```bash
pm2 restart agentlabs
```

### 4. Verify Integration

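Check the server logs for the Groq LLM initialization message. The `--nostream` flag prints the matching lines and exits instead of following the log:

```bash
pm2 logs agentlabs --lines 50 --nostream | grep -i groq
```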

## Recommended Models for Voice Agents

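The following Groq-hosted models are well suited to voice applications. The TTFT figures are approximate, and Groq retires model IDs over time, so confirm that an ID is still listed in the Groq documentation before selecting it: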

1. **llama-3.3-70b-versatile** (Recommended)
   - Best balance of speed and quality
   - TTFT: ~80ms
   - Ideal for most voice agent use cases

2. **llama-3.1-70b-versatile**
   - Alternative option
   - Similar performance characteristics
   - TTFT: ~85ms

3. **mixtral-8x7b-32768**
   - Faster but slightly less capable
   - TTFT: ~70ms
   - Good for simple voice interactions
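To confirm that a model ID from the list above is available to your account, you can query Groq's OpenAI-compatible models endpoint and look for the ID in the response. The sketch below is illustrative only: `model_listed` and `groq_has_model` are hypothetical helpers, and the check is a plain substring match on the quoted ID rather than real JSON parsing.

```bash
# Hypothetical helper: succeeds if the quoted model ID appears in the JSON text.
# A plain fixed-string match, not a JSON parser; adequate for a quick check.
model_listed() {
  local json="$1" model="$2"
  printf '%s' "$json" | grep -qF "\"$model\""
}

# Fetch the model list (requires network access and GROQ_API_KEY) and check one ID.
groq_has_model() {
  local json
  json=$(curl -s --max-time 10 \
    -H "Authorization: Bearer ${GROQ_API_KEY:-}" \
    https://api.groq.com/openai/v1/models) || true
  model_listed "$json" "$1"
}
```

For example, `groq_has_model llama-3.3-70b-versatile && echo available` prints `available` only when the ID appears in the response.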

## Usage in Voice Agents

When creating or editing a voice agent:

1. Go to Agent Settings
2. Open the "Model" configuration
3. Choose a Groq model (marked with the "Voice Optimized" badge)
4. Save the agent

## Performance Monitoring

The system automatically tracks:
- Time-to-first-token latency
- Total completion time
- Connection pool statistics
- API health status

View these metrics on the SILMA TTS settings page.

## Troubleshooting

### API Key Not Working
- Verify that the API key is correct and was copied in full
- Check that the key has active credits
- Ensure the key has no IP restrictions
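A quick way to separate key problems from application problems is to call the Groq API directly and inspect the HTTP status. The sketch below assumes the OpenAI-compatible endpoint at `api.groq.com/openai/v1/models`. `groq_status_message` and `groq_check` are hypothetical helpers, and the status-to-message mapping covers only the common cases.

```bash
# Hypothetical helper: map an HTTP status code to a short diagnosis.
groq_status_message() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "invalid or missing API key" ;;
    429) echo "rate limited" ;;
    000) echo "network error: could not reach api.groq.com" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Request the model list and report only the resulting status.
# curl prints 000 as the status when no HTTP response was received.
groq_check() {
  local status
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer ${GROQ_API_KEY:-}" \
    https://api.groq.com/openai/v1/models) || true
  groq_status_message "${status:-000}"
}
```

Run `groq_check` in the same shell or environment the server uses. If it prints `ok`, the key itself works and the problem is more likely in how the application loads it.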

### High Latency
- Check network connectivity to the Groq API
- Verify that the agent is using a Groq model (not an OpenAI model)
- Monitor connection pool statistics
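To see how much latency comes from the network path rather than the application, you can time a streaming request from the server itself. The sketch below uses curl's `time_starttransfer` write-out variable as a rough proxy for TTFT. It includes DNS, TCP, and TLS setup and stops at the first response byte, which may arrive before the first token, so treat the number as indicative only. `to_ms` and `measure_first_byte` are hypothetical helpers.

```bash
# Convert curl's fractional seconds (e.g. 0.123456) to whole milliseconds.
to_ms() {
  awk -v s="$1" 'BEGIN { printf "%d\n", s * 1000 + 0.5 }'
}

# Time until the first response byte of a small streaming chat completion.
# Requires network access and GROQ_API_KEY; the default model is an assumption.
measure_first_byte() {
  local model="${1:-llama-3.3-70b-versatile}" seconds
  seconds=$(curl -s -o /dev/null -w '%{time_starttransfer}' --max-time 30 \
    https://api.groq.com/openai/v1/chat/completions \
    -H "Authorization: Bearer ${GROQ_API_KEY:-}" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"stream\":true,\"max_tokens\":16,\"messages\":[{\"role\":\"user\",\"content\":\"Say hi\"}]}") || true
  to_ms "${seconds:-0}"
}
```

Calling `measure_first_byte` a few times and comparing the results with the figures in the monitoring page can indicate whether the delay sits in the network or in the agent pipeline.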

### Migration Not Applied
- Run: `node /tmp/run_migration.js`
- Check database logs for errors

## Cost Considerations

Groq offers competitive pricing:
- Free tier available for testing
- Pay-as-you-go for production
- Significantly cheaper per token than OpenAI

Pricing at the time of writing (check the console for current rates):
- Llama 3.3 70B: $0.59/M input tokens, $0.79/M output tokens
- Llama 3.1 70B: $0.59/M input tokens, $0.79/M output tokens
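The per-token rates above translate into a cost estimate with simple arithmetic: multiply each token count by its rate and divide by one million. The sketch below is a hypothetical helper (`estimate_cost`) that defaults to the rates listed above; pass different rates as the third and fourth arguments if pricing changes.

```bash
# Hypothetical helper: estimate cost in USD.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS [INPUT_RATE] [OUTPUT_RATE]
# Rates are USD per million tokens and default to the figures in this guide.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
      -v in_rate="${3:-0.59}" -v out_rate="${4:-0.79}" \
    'BEGIN { printf "%.4f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}
```

For example, `estimate_cost 2000000 500000` prints `1.5750`: 2M input tokens cost $1.18 and 500K output tokens cost $0.395.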

## Resources

- [Groq Console](https://console.groq.com)
- [Groq Documentation](https://console.groq.com/docs)
- [Voice Agent Performance Guide](https://www.ntik.me/posts/voice-agent)