Mistral Small is the best model in its weight class, with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. It outperforms comparable models while delivering inference speeds of 150 tokens per second.
Runs on a single RTX 4090 or a Mac with 32 GB of RAM, making it well suited to on-device applications.
Inference at 150 tokens per second, with low-latency function calling.
Excellent foundation for building specialized models in legal, medical, and technical domains.
Supports image analysis, document verification, and visual inspection tasks.
A 24B-parameter model delivering GPT-4-level performance with faster response times.
Your data stays private with our secure infrastructure and strict privacy policies.
Easy integration with comprehensive API documentation and code examples.
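Integration can be as simple as posting a JSON chat payload. The sketch below builds such a payload; the endpoint URL and model name are assumptions for illustration, so check the current API reference for exact values.

```python
import json

# Hypothetical endpoint and model name -- verify against the API docs.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "mistral-small-latest") -> dict:
    """Build the JSON payload for a single-turn chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Summarize this contract clause in one sentence.")
print(json.dumps(payload, indent=2))
# To send: POST this payload to API_URL with an
# "Authorization: Bearer <your key>" header.
```

From here, any HTTP client (or the official SDK) can submit the request and read the model's reply from the response body.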
A free tier is available, with competitive pricing for enterprise usage.