System Status
Operating System
OK
Windows 11 Pro
Build 22621.963
CUDA Version
OK
12.1
Drivers: 535.104.05
PyTorch Version
OK
2.2.0
CUDA 12.1 support enabled
CPU
OK
AMD Ryzen 9 7950X
16 cores / 32 threads @ 4.5 GHz
GPU
OK
NVIDIA RTX 4090
24 GB VRAM | Compute 8.9
23.8 GB available (99% free)
RAM
OK
64 GB DDR5
5600 MHz
42 GB available (65% free)
Storage
OK
2 TB NVMe SSD
Read: 7000 MB/s | Write: 5000 MB/s
1.2 TB available (60% free)
Python Environment
OK
Python 3.10.12
Virtual environment: mokingbird_env
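A minimal sketch of how a status panel like this could be populated, using only the Python standard library. The function names `percent_free` and `collect_status` are hypothetical, and the truncating percentage is an assumption inferred from the figures above (42 of 64 GB shown as 65%). GPU and CUDA fields are left out because they would need PyTorch or a vendor tool to query.

```python
import platform
import shutil
import sys


def percent_free(available: float, total: float) -> int:
    """Return the free share of a resource as a whole percent, rounded down."""
    if total <= 0:
        raise ValueError("total must be positive")
    return int(100 * available / total)


def collect_status(path: str = ".") -> dict:
    """Gather the parts of the panel that the standard library can see."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "storage_free_pct": percent_free(usage.free, usage.total),
    }


if __name__ == "__main__":
    for key, value in collect_status().items():
        print(f"{key}: {value}")
```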
Recommendations
Excellent VRAM! With 24 GB you can fine-tune 7B models with LoRA and models up to about 13B with 4-bit QLoRA. Full fine-tuning of a 7B model typically needs more than 24 GB.
NVMe SSD detected. Fast checkpoint saving and data loading.
Suggested: Enable Unsloth for up to 2x faster training on your RTX 4090.
Consider enabling Flash Attention 2 for memory efficiency.
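As a rough sanity check on VRAM recommendations like these, the arithmetic can be sketched as below. It is a back-of-the-envelope estimate, not the tool's actual logic: the function names are hypothetical, 1 GB is taken as 10^9 bytes, and the 16 bytes per parameter figure for full fine-tuning is the common mixed-precision AdamW rule of thumb (16-bit weights and gradients plus 32-bit master weights and two optimizer moments). Activations and framework overhead are ignored.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "nf4": 0.5}


def weights_gb(params_billion: float, dtype: str = "fp16") -> float:
    """Memory for the frozen base weights alone, in GB (10^9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def full_finetune_gb(params_billion: float, bytes_per_param: float = 16.0) -> float:
    """Rule-of-thumb memory for full fine-tuning with mixed-precision AdamW.

    Assumes 2 (weights) + 2 (gradients) + 4 (master copy) + 4 + 4 (Adam
    moments) = 16 bytes per parameter, before activations.
    """
    return params_billion * bytes_per_param


if __name__ == "__main__":
    print(weights_gb(7, "fp16"))    # 14.0
    print(weights_gb(13, "fp16"))   # 26.0
    print(weights_gb(13, "nf4"))    # 6.5
    print(full_finetune_gb(7))      # 112.0
```

Under these assumptions a 7B model in 16-bit precision leaves room on a 24 GB card for LoRA adapters and activations, while a 13B model only fits once the base weights are quantized.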
Dataset Selection
Model Selection
Llama 3.1 8B Instruct
Meta's 8B instruction-tuned model, optimized for instruction following and chat applications.
Min VRAM:
8 GB
Recommended:
16 GB
Architecture:
Llama
License:
Llama 3.1
Qwen 2.5 7B Instruct
Alibaba's multilingual model with strong performance across languages.
Min VRAM:
8 GB
Recommended:
14 GB
Architecture:
Qwen
License:
Apache 2.0
Mistral 7B v0.3
Mistral AI's efficient 7B model with grouped-query attention.
Min VRAM:
8 GB
Recommended:
14 GB
Architecture:
Mistral
License:
Apache 2.0
Phi 3 Mini 4K
Microsoft's small but powerful model optimized for efficiency.
Min VRAM:
4 GB
Recommended:
8 GB
Architecture:
Phi
License:
MIT
Gemma 2 9B Instruct
Google's open model with strong performance and safety features.
Min VRAM:
10 GB
Recommended:
18 GB
Architecture:
Gemma
License:
Gemma
DeepSeek Coder 6.7B
Specialized code generation model with strong programming capabilities.
Min VRAM:
8 GB
Recommended:
14 GB
Architecture:
DeepSeek
License:
DeepSeek Model License (code: MIT)
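The VRAM figures on the model cards above lend themselves to a simple filter. The sketch below is one way to express it; `ModelCard`, `CATALOG`, and `models_that_fit` are hypothetical names, and the numbers are copied from the cards rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    name: str
    min_vram_gb: int
    recommended_vram_gb: int


# Values transcribed from the model cards listed above.
CATALOG = [
    ModelCard("Llama 3.1 8B Instruct", 8, 16),
    ModelCard("Qwen 2.5 7B Instruct", 8, 14),
    ModelCard("Mistral 7B v0.3", 8, 14),
    ModelCard("Phi 3 Mini 4K", 4, 8),
    ModelCard("Gemma 2 9B Instruct", 10, 18),
    ModelCard("DeepSeek Coder 6.7B", 8, 14),
]


def models_that_fit(vram_gb: float, comfortable: bool = True) -> list[str]:
    """Names of models whose requirement is within the given VRAM.

    comfortable=True compares against the recommended figure,
    comfortable=False against the minimum.
    """
    def required(card: ModelCard) -> int:
        return card.recommended_vram_gb if comfortable else card.min_vram_gb

    return [card.name for card in CATALOG if required(card) <= vram_gb]


if __name__ == "__main__":
    print(models_that_fit(24))                     # all six models
    print(models_that_fit(8, comfortable=True))    # ['Phi 3 Mini 4K']
```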
Framework Selection
Recommended Framework
#1
Auto-selected based on your setup
Unsloth
95% confidence
Speed
2x faster
Memory
-60% VRAM
Compatibility
Excellent
Optimized for your RTX 4090 GPU
Native support for LoRA/QLoRA
Automatic Flash Attention 2
Best for Llama 3.1 architecture
Supervised Fine-Tuning (SFT)
Reinforcement Learning (RL)
🎯 Smart Config Matcher
User Preferences
Training Configuration
Model
Llama 3.1 8B
Dataset
medical_qa_5k
Method
LoRA (rank 32)
Framework
Unsloth
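To give a feel for what "LoRA (rank 32)" means in trainable parameters, here is a small worked calculation. It assumes the published Llama 3.1 8B shape (hidden size 4096, MLP size 14336, 32 layers, key/value projection width 1024) and assumes adapters on all seven linear projections per layer; which modules are actually targeted is not stated above, so treat that as an illustration only.

```python
# Assumed Llama 3.1 8B dimensions (not stated in the configuration above).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024          # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32

# (in_features, out_features) of each linear layer assumed to get an adapter.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank: int) -> int:
    """Trainable parameters added by LoRA: rank * (in + out) per adapted layer."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in TARGETS.values())
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    print(lora_params(32))  # 83886080
```

Under these assumptions rank 32 adds about 84 million trainable parameters, roughly 1% of the base model.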
Estimated Resources
VRAM Usage
22.5 / 24 GB
Training Time
~3.5 hours
Expected Quality
0.92 accuracy
Est. Cost (Cloud)
$2.50
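The estimates above can be cross-checked with a little arithmetic. The sketch below derives the VRAM headroom and the hourly cloud rate implied by the figures shown; the helper names are hypothetical, and the source does not say which cloud price the cost estimate is based on.

```python
def vram_headroom_gb(used_gb: float, total_gb: float) -> float:
    """VRAM left over once the estimated usage is subtracted."""
    return total_gb - used_gb


def vram_utilization(used_gb: float, total_gb: float) -> float:
    """Fraction of total VRAM the run is expected to occupy."""
    return used_gb / total_gb


def implied_hourly_rate(total_cost: float, hours: float) -> float:
    """Hourly price implied by a total cost and a duration."""
    return total_cost / hours


if __name__ == "__main__":
    print(vram_headroom_gb(22.5, 24))                  # 1.5
    print(vram_utilization(22.5, 24))                  # 0.9375
    print(round(implied_hourly_rate(2.50, 3.5), 2))    # 0.71
```

With only 1.5 GB of headroom, a longer sequence length or larger batch size than the estimate assumed could push the run out of memory.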
Ready to Train
All prerequisites met. Click "Start Training" to begin fine-tuning your model.
Training will take approximately 3.5 hours, with an estimated accuracy of 0.92.
Training Loss
Gradient Norm
Post-Processing & Deployment
1️⃣ Merge LoRA Weights
Merge adapter weights into base model for standalone deployment.
2️⃣ Convert to GGUF
Convert to llama.cpp format for CPU/GPU inference.
3️⃣ Deploy to Ollama
Create Modelfile and import to Ollama for easy local serving.
4️⃣ Deploy to vLLM
High-throughput serving with OpenAI-compatible API.
5️⃣ Convert to ONNX
Export to ONNX format for cross-platform deployment.
6️⃣ Push to Hugging Face
Upload the model to the Hugging Face Hub for sharing.
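The first step, merging LoRA weights, comes down to one formula: the merged weight is W + (alpha / r) * B A, where B and A are the low-rank adapter matrices. The sketch below shows that arithmetic on plain Python lists. It is a toy illustration with hypothetical helper names, not the tool's implementation; in practice a library such as PEFT would typically handle this through its `merge_and_unload()` method.

```python
Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w: Matrix, lora_a: Matrix, lora_b: Matrix, alpha: float, r: int) -> Matrix:
    """Fold a LoRA adapter into the base weight: W + (alpha / r) * B @ A.

    w is (out x in), lora_b is (out x r), lora_a is (r x in).
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_val + scale * d_val for w_val, d_val in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]   # 2 x 2 base weight
    a = [[0.5, -0.5]]                 # rank-1 adapter, 1 x 2
    b = [[1.0], [2.0]]                # rank-1 adapter, 2 x 1
    print(merge_lora(base, a, b, alpha=2.0, r=1))  # [[2.0, -1.0], [2.0, -1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the merged model can be converted and deployed as a standalone checkpoint.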