VRAMify

Calculate VRAM requirements for LLM inference

Enter your model parameters and inference settings (including the context window size), and VRAMify reports the VRAM requirements. Example output:

  Model Weights:  14.00 GB
  KV Cache:        0.50 GB
  Total VRAM:     14.50 GB
  Fits on RTX 4090 (24 GB)
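The fit check on the last line can be sketched as a simple lookup against GPU capacities. The GPU table and function name below are illustrative assumptions, not VRAMify's actual data or API:

```python
# Illustrative GPU VRAM capacities in GB (an assumption, not VRAMify's real list).
GPU_VRAM_GB = {"RTX 4090": 24, "RTX 3090": 24, "RTX 4080": 16, "RTX 4060": 8}

def fits_on(total_gb: float, gpu: str) -> bool:
    """True if the estimated total VRAM fits within the named GPU's capacity."""
    return total_gb <= GPU_VRAM_GB[gpu]

print(fits_on(14.50, "RTX 4090"))  # True: 14.50 GB fits in 24 GB
```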
📐 Formulae Used
Model Weights: params × bytes_per_param
KV Cache: 2 × layers × hidden_dim × seq_len × batch_size × bytes_per_element
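The two formulae above can be sketched in Python. The 7B-parameter, FP16, Llama-7B-like shape used below is an assumption chosen to illustrate the math; VRAMify's actual defaults may differ:

```python
def model_weights_gb(n_params: float, bytes_per_param: float = 2) -> float:
    """Model Weights: params × bytes_per_param, in decimal GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

def kv_cache_gb(layers: int, hidden_dim: int, seq_len: int,
                batch_size: int = 1, bytes_per_element: float = 2) -> float:
    """KV Cache: 2 (K and V) × layers × hidden_dim × seq_len × batch_size × bytes_per_element."""
    return 2 * layers * hidden_dim * seq_len * batch_size * bytes_per_element / 1e9

# Assumed example: 7B parameters in FP16 (2 bytes each), 32 layers,
# hidden dimension 4096, 1024-token context, batch size 1.
print(model_weights_gb(7e9))        # 14.0 GB
print(kv_cache_gb(32, 4096, 1024))  # ≈0.54 GB
```

Note that 7B parameters at 2 bytes each gives exactly the 14.00 GB shown in the example output; the KV cache grows linearly with context length and batch size, so long contexts can dominate the total.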