Running FULL DeepSeek R1 671B Locally (Test and Install!)

Bijan Bowen 62,180 2 weeks ago


Timestamps:
00:00 - Intro
01:12 - How It Works
08:27 - Performance Monitoring
10:27 - Setup Steps
20:55 - Running R1
22:34 - First Output
23:50 - Live Output
24:57 - Comparing Test Results
26:55 - Testing Output
28:50 - Closing Thoughts

Can you really run the full 671B parameter DeepSeek R1 model locally? In this video, we take on the challenge of running this massive model offline and on local hardware, all thanks to Unsloth AI's dynamic quantization technique, which compresses the model by up to 80% for more efficient execution.

We start by explaining how the dynamic quantization process works and how it allows us to run DeepSeek R1 on an enthusiast-grade system with at least 80GB of combined system memory. Next, we monitor performance, analyzing how the model utilizes system RAM and VRAM. Then, we walk through the full setup process using llama.cpp, covering key configurations that can be tricky for first-time users. Once everything is ready, we put the model to the test: running it live, comparing outputs, and analyzing performance to see how well DeepSeek R1 performs locally.
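For a rough sense of scale, the figures in the description can be turned into a back-of-the-envelope size estimate. The sketch below is illustrative only: the 671B parameter count and the "up to 80%" reduction come from the description, while the 8-bits-per-weight starting point and the helper names are assumptions made for this example, not figures quoted in the video.

```python
# Back-of-the-envelope estimate of model weight size before and after
# an "up to 80%" reduction. Only N_PARAMS and REDUCTION come from the
# description; ASSUMED_BITS_PER_WEIGHT is an assumption for illustration.

N_PARAMS = 671e9                 # 671B parameters (from the title)
REDUCTION = 0.80                 # "compresses the model by up to 80%"
ASSUMED_BITS_PER_WEIGHT = 8      # assumption: 8-bit weights before quantization


def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def compressed_size_gb(original_gb: float, reduction: float) -> float:
    """Size remaining after removing the given fraction of the original."""
    return original_gb * (1.0 - reduction)


original_gb = weights_size_gb(N_PARAMS, ASSUMED_BITS_PER_WEIGHT)
compressed_gb = compressed_size_gb(original_gb, REDUCTION)

print(f"Assumed original size: {original_gb:.0f} GB")
print(f"After an 80% reduction: {compressed_gb:.0f} GB")
```

Under these assumptions the weights shrink from several hundred gigabytes to somewhat over a hundred. That is still more than the 80GB minimum mentioned above, so on a minimal system the weights would presumably not all be resident in memory at once; the actual file sizes and how the setup handles this are covered in the video rather than in this description.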
