Overview
Embeddings are numerical representations of text that capture semantic meaning. The application uses these vectors to power search functionality and document comparisons. By default, a local embedding model is used, but you can configure the system to use OpenAI’s embedding API or custom ONNX models.

Quick Setup
For most users, the default embedding configuration works out of the box. You can customize it using environment variables in your deployment.

Environment Variables
Embedding Configuration Options
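As a sketch, configuration might look like the following. `OPEN_RESPONSES_EMBEDDINGS_API_KEY` and `OPEN_RESPONSES_EMBEDDINGS_URL` are the variables referenced in the Troubleshooting section; the values shown are illustrative placeholders, not defaults.

```shell
# Point the application at an OpenAI-compatible embeddings endpoint.
# Replace the placeholder values with your deployment's actual settings.
export OPEN_RESPONSES_EMBEDDINGS_URL="https://api.openai.com/v1"
export OPEN_RESPONSES_EMBEDDINGS_API_KEY="sk-..."
```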
Supported Models
Default Local Model
By default, the application uses the AllMiniLmL6V2 model, which offers:
- Fast, efficient embedding generation
- 384-dimensional vectors
- Good balance of performance and quality
- No external API dependencies
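To illustrate how these fixed-size vectors drive search and document comparison, here is a minimal cosine-similarity sketch over two 384-dimensional vectors (plain NumPy, not the application's internal code; the vectors are random stand-ins for real embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 for identical direction, near 0.0 for unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for two 384-dimensional embeddings (the default model's output size).
rng = np.random.default_rng(0)
doc = rng.normal(size=384)
query = doc + rng.normal(scale=0.1, size=384)  # a slightly perturbed copy

print(cosine_similarity(doc, query))  # close to 1.0 for similar vectors
```

Search works by ranking stored document vectors by their similarity to the query vector.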
OpenAI Models
For higher quality results, you can use OpenAI’s embedding models:
- Higher quality embeddings
- More dimensions (1536 for text-embedding-3-small)
- Better semantic understanding
- Requires internet connectivity
- Incurs API usage costs
- Adds network latency
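For reference, an OpenAI-style embeddings call is a POST of a small JSON body to the `/embeddings` path of the configured URL. The helper below is illustrative only (it builds the request without sending it); the application performs the equivalent call internally:

```python
import json
import os

def build_embeddings_request(text: str, model: str = "text-embedding-3-small"):
    """Build headers and JSON body for an OpenAI-style POST to /embeddings.

    Illustrative sketch: reads the documented API-key variable but does not
    perform any network I/O.
    """
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPEN_RESPONSES_EMBEDDINGS_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "input": text}
    return headers, body

headers, body = build_embeddings_request("hello world")
print(json.dumps(body))
```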
Custom ONNX Models
For advanced users, custom ONNX models can be used.

Performance Considerations
- OpenAI models: higher quality, but add latency and cost
- Local models: faster and work offline, but may have lower quality
- Custom ONNX: flexible and configurable for specific use cases
Troubleshooting
Common Issues
OpenAI Connection Errors
- Check that your OPEN_RESPONSES_EMBEDDINGS_API_KEY is correct
- Verify network connectivity to OPEN_RESPONSES_EMBEDDINGS_URL
- Confirm your OpenAI account has available quota
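A quick first step is to confirm the documented variables are actually set in the running environment. This is a generic stdlib check, not an application feature; it reports presence only and never prints the key value:

```python
import os

def check_embedding_env() -> dict:
    """Report whether each documented variable is set (values are not printed)."""
    required = ("OPEN_RESPONSES_EMBEDDINGS_API_KEY", "OPEN_RESPONSES_EMBEDDINGS_URL")
    return {name: bool(os.environ.get(name)) for name in required}

for name, is_set in check_embedding_env().items():
    print(f"{name}: {'set' if is_set else 'MISSING'}")
```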
Local Model Performance
- The default model requires approximately 150MB of RAM
- Ensure your container has sufficient memory allocated
Custom ONNX Model Issues
- Verify file paths are correct and the files are accessible
- Ensure your model is compatible with the application
- Check logs for specific error messages
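To check the first point, a small stdlib script can verify the model files exist and are readable before the application starts. The paths below are placeholders; substitute the ones your deployment is configured with:

```python
import os

def check_onnx_files(*paths: str) -> list:
    """Return a list of problems found with the given model/tokenizer paths."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append(f"{path}: file not found")
        elif not os.access(path, os.R_OK):
            problems.append(f"{path}: not readable")
    return problems

# Placeholder paths for illustration only.
issues = check_onnx_files("/models/custom-model.onnx", "/models/tokenizer.json")
print(issues if issues else "all model files accessible")
```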