Models
The Models section of the Core Screen gives you complete control over which AI models are available in your Mamentis environment and how they're configured.
Available Models
Mamentis integrates with leading AI providers to give you access to the best models for each task:
Text Generation Models
OpenAI
- GPT-4 Turbo: Latest high-performance model for complex reasoning
- GPT-4: Reliable model for most text generation tasks
- GPT-3.5 Turbo: Fast and cost-effective for simpler tasks
Anthropic
- Claude 3 Opus: Top-tier model for complex analysis and reasoning
- Claude 3 Sonnet: Balanced performance and speed
- Claude 3 Haiku: Lightning-fast responses for simple queries
Google
- Gemini Ultra: Google's most capable multimodal model
- Gemini Pro: Balanced performance for general use
- PaLM 2: Specialized for coding and technical tasks
Meta
- Llama 2 70B: Open-weight model for privacy-conscious applications
- Code Llama: Specialized for programming tasks
Specialized Models
Microsoft
- Azure OpenAI: Enterprise-grade OpenAI models
- Bing Search Integration: Web-enabled responses
Mistral AI
- Mistral Large: High-performance European AI model
- Mistral Medium: Balanced performance and cost
Others
- DeepSeek: Code-specialized models
- xAI Grok: Real-time information integration
Model Configuration
Enabling/Disabling Models
- Navigate to Core Screen > Models
- Toggle models on/off based on your needs
- Configure access permissions for team members
- Set usage quotas and limits
Performance Settings
Response Quality vs Speed
- High Quality: Uses most capable models, slower responses
- Balanced: Optimal mix of quality and speed
- Fast: Prioritizes quick responses, may sacrifice some quality
Cost Optimization
- Cost-Effective: Favors less expensive models when possible
- Balanced: Considers both cost and performance
- Performance First: Uses best models regardless of cost
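The three cost strategies above amount to different objective functions over a routing table. A minimal sketch, assuming an illustrative table (the quality scores and per-1K-token prices below are made up for the example, not real provider pricing):

```python
# Illustrative routing table: quality is a relative 1-5 score, prices are examples.
MODELS = [
    {"name": "gpt-3.5-turbo", "quality": 2, "cost_per_1k": 0.002},
    {"name": "gpt-4",         "quality": 4, "cost_per_1k": 0.03},
    {"name": "gpt-4-turbo",   "quality": 5, "cost_per_1k": 0.004},
]

def pick_model(strategy):
    if strategy == "cost_effective":
        return min(MODELS, key=lambda m: m["cost_per_1k"])      # cheapest wins
    if strategy == "performance_first":
        return max(MODELS, key=lambda m: m["quality"])          # best model wins
    # balanced: best quality per unit of cost
    return max(MODELS, key=lambda m: m["quality"] / m["cost_per_1k"])

print(pick_model("cost_effective")["name"])     # gpt-3.5-turbo
print(pick_model("performance_first")["name"])  # gpt-4-turbo
print(pick_model("balanced")["name"])           # gpt-4-turbo
```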
Model Management
Usage Analytics
Monitor how your models are performing:
- Response Times: Average time per model
- Success Rates: Completion percentages
- Quality Scores: User satisfaction ratings
- Cost Analysis: Spending per model and time period
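The four metrics above can be computed from a per-request usage log. The record fields below (`latency_s`, `ok`, `cost`) are an assumed example schema, not the actual Mamentis export format:

```python
from statistics import mean

# Example usage log with an assumed schema.
log = [
    {"model": "gpt-4", "latency_s": 2.1, "ok": True,  "cost": 0.03},
    {"model": "gpt-4", "latency_s": 1.9, "ok": False, "cost": 0.03},
    {"model": "claude-3-haiku", "latency_s": 0.4, "ok": True, "cost": 0.001},
]

def summarize(log):
    """Group records per model and compute latency, success rate, and cost."""
    by_model = {}
    for rec in log:
        by_model.setdefault(rec["model"], []).append(rec)
    return {
        model: {
            "avg_latency_s": round(mean(r["latency_s"] for r in recs), 2),
            "success_rate": sum(r["ok"] for r in recs) / len(recs),
            "total_cost": sum(r["cost"] for r in recs),
        }
        for model, recs in by_model.items()
    }

stats = summarize(log)
print(stats["gpt-4"]["success_rate"])  # 0.5
```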
A/B Testing
Compare model performance for your specific use cases:
- Create Test Groups: Define different model configurations
- Run Comparisons: Send similar queries to different models
- Analyze Results: Compare quality, speed, and cost
- Optimize Selection: Update routing rules based on results
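The four A/B steps above can be sketched offline. The two "models" here are stand-in functions so the example runs without provider access, and the speed-based judge is a deliberately simple placeholder; real judges would score answer quality as well:

```python
# Stand-in "models": plain functions instead of real provider calls.
def model_a(q): return {"answer": q.upper(), "latency_s": 0.2}
def model_b(q): return {"answer": q.title(), "latency_s": 0.9}

def ab_test(queries, candidates, judge):
    """Send each query to every candidate and tally wins per the judge."""
    wins = {name: 0 for name in candidates}
    for q in queries:
        results = {name: fn(q) for name, fn in candidates.items()}
        wins[judge(q, results)] += 1
    return wins

# Placeholder judge: prefer the faster response.
def fastest(q, results):
    return min(results, key=lambda name: results[name]["latency_s"])

queries = ["what is rag", "compare models"]
wins = ab_test(queries, {"A": model_a, "B": model_b}, fastest)
print(wins)  # {'A': 2, 'B': 0}
```

The win tallies then feed the "Optimize Selection" step: whichever configuration wins on your own queries becomes the default in the routing rules.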
Model Updates
Stay current with the latest model releases:
- Automatic Updates: Enable auto-updates for model versions
- Testing New Models: Beta access to cutting-edge models
- Version Management: Roll back to previous versions if needed
- Deprecation Notices: Advance warning when models are retired
Advanced Configuration
Local Model Deployment
Run models on your own infrastructure:
- Docker Containers: Deploy models in containers
- Kubernetes: Scale model deployments
- GPU Resources: Optimize for hardware acceleration
- Privacy Mode: Keep all data on-premises
Fine-Tuning
Customize models for your specific domain:
- Data Preparation: Upload training data
- Training Configuration: Set hyperparameters
- Training Monitoring: Track progress and metrics
- Model Deployment: Deploy fine-tuned models
- Performance Validation: Test against benchmarks
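The "Data Preparation" step typically means exporting training pairs as JSON Lines, the format most fine-tuning pipelines ingest. A minimal sketch; the `prompt`/`completion` field names are a common convention but vary by provider, and the examples are invented:

```python
import json
import os
import tempfile

# Invented training pairs; real data would come from your domain.
examples = [
    {"prompt": "Summarize: quarterly revenue rose 8%", "completion": "Revenue up 8% QoQ."},
    {"prompt": "Summarize: churn fell to 2%", "completion": "Churn down to 2%."},
]

# Write one JSON object per line (JSONL).
path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

with open(path) as f:
    lines = f.read().splitlines()
print(len(lines))  # 2
```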
Security and Compliance
Data Handling
- Encryption: All model communications encrypted
- Data Residency: Choose geographic regions for processing
- Audit Trails: Complete logs of model usage
- Access Controls: Role-based permissions for model access
Compliance Features
- SOC 2: Certified secure model operations
- GDPR: European data protection compliance
- HIPAA: Healthcare data protection (Enterprise only)
- Custom Compliance: Configure for industry-specific requirements
Best Practices
Model Selection
- Task-Specific: Use specialized models for specific tasks
- Fallback Strategy: Configure backup models for reliability
- Performance Monitoring: Regularly review model performance
- Cost Management: Balance performance with budget constraints
For model characteristics and task-to-model mapping tips, see the high-level AI Models guide and related sections.
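The "Fallback Strategy" practice above boils down to an ordered chain: try the preferred model, and on failure move to the next. A minimal sketch with stand-in local functions in place of real provider calls:

```python
# Stand-ins: the primary always fails, the backup always succeeds.
def flaky_primary(q):
    raise TimeoutError("primary unavailable")

def stable_backup(q):
    return f"backup answer for: {q}"

def ask_with_fallback(query, chain):
    """Try each (name, model) pair in order; return the first success."""
    errors = []
    for name, model in chain:
        try:
            return name, model(query)
        except Exception as e:
            errors.append((name, e))
    raise RuntimeError(f"all models failed: {errors}")

name, answer = ask_with_fallback(
    "hello",
    [("gpt-4", flaky_primary), ("claude-3-sonnet", stable_backup)],
)
print(name)  # claude-3-sonnet
```

Pairing a fallback chain with the usage analytics above closes the loop: if the primary fails often, that is a signal to reorder the chain.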
Optimization Tips
- Batch Processing: Group similar requests for efficiency
- Caching: Enable response caching for repeated queries
- Load Balancing: Distribute requests across multiple models
- Rate Limiting: Prevent exceeding API limits
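Two of the tips above, caching and client-side rate limiting, fit in a short sketch. The cache uses Python's `functools.lru_cache`; the limiter enforces a minimum interval between calls. The request rate and cache size are illustrative:

```python
import time
from functools import lru_cache

CALLS = {"n": 0}

@lru_cache(maxsize=256)
def cached_ask(query):
    """Stand-in model call; repeated queries are served from the cache."""
    CALLS["n"] += 1  # count only real (non-cached) calls
    return f"answer: {query}"

class RateLimiter:
    """Enforce a minimum interval between outgoing requests."""

    def __init__(self, max_per_second):
        self.min_interval = 1.0 / max_per_second
        self.last = 0.0

    def wait(self):
        now = time.monotonic()
        delay = self.min_interval - (now - self.last)
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

limiter = RateLimiter(max_per_second=100)
for q in ["a", "b", "a"]:  # "a" repeats, so it hits the cache the second time
    limiter.wait()
    cached_ask(q)
print(CALLS["n"])  # 2
```

Note that `lru_cache` only helps for exact-match queries; semantic caching of near-duplicate prompts needs a dedicated layer.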
Troubleshooting
Model Unavailable:
- Check service status on provider dashboards
- Verify API keys and permissions
- Try alternative models with similar capabilities
Poor Performance:
- Review model selection criteria
- Adjust routing rules
- Consider upgrading to more capable models
High Costs:
- Analyze usage patterns
- Implement cost controls and budgets
- Optimize model selection for cost-effectiveness
Integration with Other Features
The Models section integrates seamlessly with:
- Data: Models can access your configured data sources
- Knowledge: Models leverage your knowledge base
- Tools: Models can call external tools and APIs
- Compute: Models utilize available compute resources
Next, explore how to manage your Data sources and connections.