|

"what would be the rough monthly cloud costs to run a dedicated private cloud server with multiple AMD Ryzen CPU + AMD Radeon GPU distributed configuration so that there's at least 400GB of total GPU RAM and I would be choosing for the cloud server(s) to be located in the western United States? My intention is I want to run a private dedicated AI server for self-hosting your largest Deepseek R1 open model with the most parameters, I believe over 600 billion with a model size of over 400 gigabytes, is that right? You may give me price comparisons between those common providers in the form of estimated monthly cost, and you may phrase these cost numbers in terms of the currently well understood state of the industry when it comes to startup cloud AI expenses for self-hosted AI web applications"

"Running a **private dedicated cloud server** with **400GB+ total GPU RAM** for hosting large AI models like DeepSeek-R1 in the western US involves significant costs. Below is a detailed cost breakdown based on current industry pricing (Q3 2024), along with technical considerations:

---

### **Key Technical Requirements**

1. **GPU VRAM**: ≥400 GB total (e.g., 10× RTX 6000 Ada [48GB each] or 5× A100 80GB).

2. **CPU**: Multi-core AMD Ryzen 9 or EPYC (matches AMD GPU preference).

3. **RAM**: ≥800 GB system RAM (2× GPU VRAM).

4. **Storage**: 2–4 TB NVMe SSD (for 400GB+ model + datasets).

5. **Network**: 10–100 Gbps dedicated bandwidth.

6. **Location**: US West (Oregon/N. California).

---

### **Monthly Cost Estimates (Western US)**

| **Provider** | **Configuration** | **Monthly Cost** |

|---------------------|---------------------------------------------------|------------------|

| **AWS** | 10x `g5.48xlarge` (8x A10G = 192GB VRAM) ❌
+ **A100 Custom** (8x A100 80GB = 640GB VRAM) | $12,000 – $35,000+ |

| **Google Cloud** | 5x `a2-ultragpu-8g` (8x A100 80GB = 640GB VRAM) | $30,000 – $40,000 |

| **Azure** | 5x `ND96amsr_A100_v4` (8x A100 80GB + AMD EPYC) | $28,000 – $36,000 |

| **Lambda Labs** | 10x RTX 6000 Ada (480GB VRAM) + Dual EPYC | $15,000 – $20,000 |

| **CoreWeave** | 5x H100 80GB (400GB VRAM) + Custom AMD EPYC | $25,000 – $35,000 |

> ⚠️ **Note**: **AMD Radeon GPU servers are extremely rare** in public clouds. NVIDIA (A100/H100) dominates due to CUDA/cuDNN support. AMD GPUs would require custom setups (not cost-effective).

---

### **Cost Drivers & Realities**

1. **GPU Costs**:

- A100 80GB: ~$2.50–$3.50/hr → **$1,800–$2,500/month per GPU**.

- 400GB VRAM ≈ 5x A100 80GB ($9k–$12.5k/month) or 10x RTX 6000 Ada ($7k–$10k/month).

2. **AMD Hardware Limitations**:

- No major cloud offers AMD Radeon GPUs for AI (poor ROCm ecosystem support).

- Ryzen CPUs are rare in cloud servers; EPYC is standard (AWS/Azure offer AMD CPU options).

3. **DeepSeek-R1 Model Reality**:

- The largest DeepSeek-R1 is **67B parameters** (~260GB FP32, **~130GB in 8-bit quantized**).

- *No 600B-parameter model exists* (confusion with Gemini/Mixtral?).

- **400GB VRAM is overkill** for DeepSeek-R1; 200–300GB suffices for larger 70B-120B models.

---

### **Recommendations for Cost Efficiency**

1. **Use Quantized Models**:

- 8-bit quantization reduces VRAM needs by 50% (e.g., 70B model fits in 140GB VRAM).

2. **Hybrid Cloud**:

- Reserve 1–3 year commitments (AWS/Azure: 40–60% savings).

3. **Consider Inferencing Optimizers**:

- vLLM or TensorRT-LLM improve throughput → fewer GPUs needed.

4. **Alternative Providers**:

- **Lambda Labs**: $1.50/hr for RTX 6000 Ada (48GB) → 10x GPUs = $10,800/month.

- **Vast.ai** (Spot): ~$0.75/hr for A100 80GB → 5x GPUs = ~$2,700/month (unreliable for production).

---

### **Realistic Monthly Budget**

| **Scenario** | **Provider** | **Estimated Cost** |

|---------------------------|--------------|-------------------|

| **Production (Stable)** | AWS / Azure | $18,000 – $28,000 |

| **Budget (Spot/Preemptible)** | Vast.ai/Lambda | $8,000 – $15,000 |

| **AMD DIY (Colocation)** | Custom Build | $6,000–$10,000* |

> *DIY: Includes colocation ($500/month) + 10x AMD MI250 (384GB VRAM) @ $40k hardware amortized over 3 years. **Not recommended** due to driver/software hurdles.

---

### **Summary**

- **Minimum realistic cost**: **$8,000–$12,000/month** for a 200–400GB VRAM cluster (using spot/preemptible instances).

- **Dedicated production-grade**: **$20,000–$35,000/month**.

- **AMD GPUs are impractical**—opt for NVIDIA A100/H100 to avoid toolchain issues.

- **Scale down**: Use model quantization + smaller clusters. DeepSeek-R1 runs on **2x A100 80GB** ($3k/month).

For a startup, start with smaller instances (e.g., 1–2 GPUs) and scale as traffic grows. Consider serverless options (Banana.dev, Beam) for lower fixed costs."

|

|

| %

| (deepthink r1 AI cloud frontier model 2025-08-17--1155--usawest)

-- ---- ----

---- ----

Reply to this note

Please Login to reply.

Discussion

No replies yet.