DeepSeek’s distilled new R1 AI model can run on a single GPU


DeepSeek’s updated R1 reasoning AI model may be getting the bulk of the AI community’s attention this week. But the lab also released a smaller, “distilled” version of the new R1, DeepSeek-R1-0528-Qwen3-8B, which DeepSeek claims beats comparably sized models on certain benchmarks.

The smaller updated R1, which uses the Qwen3-8B model Alibaba launched in May as its foundation, performs better than Google’s Gemini 2.5 Flash on AIME 2025, a collection of challenging math questions.

DeepSeek-R1-0528-Qwen3-8B also nearly matches Microsoft’s recently released Phi 4 reasoning plus model on HMMT, another test of math skills.

So-called distilled models such as DeepSeek-R1-0528-Qwen3-8B are generally less capable than their full-sized counterparts. On the plus side, they are far less computationally demanding. According to the cloud platform NodeShift, Qwen3-8B requires a GPU with 40GB-80GB of RAM to run (e.g., an Nvidia H100). The full-sized new R1 needs around a dozen 80GB GPUs.
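For a rough sense of why an 8-billion-parameter model fits on a single card of that size, here is a back-of-envelope memory estimate. It is an illustration only, not NodeShift’s sizing methodology.

```python
# Back-of-envelope inference-memory estimate for an 8B-parameter model.
# Illustrative only; real requirements depend on precision, context length,
# KV cache size, and serving-framework overhead.
params = 8e9           # approximate parameter count of Qwen3-8B
bytes_per_param = 2    # 16-bit (bf16/fp16) weights

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~16 GB

# Long reasoning traces inflate the KV cache, and runtimes add their own
# overhead, which is why hosts budget a 40GB-80GB GPU rather than just 16GB.
```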

DeepSeek trained DeepSeek-R1-0528-Qwen3-8B by taking text generated by the updated R1 and using it to fine-tune Qwen3-8B. On the model’s page on the AI dev platform Hugging Face, DeepSeek describes DeepSeek-R1-0528-Qwen3-8B as intended for academic research on reasoning models and for industrial development.
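That recipe amounts to supervised fine-tuning of a small “student” model on text produced by a larger “teacher.” The sketch below shows the general pattern using Hugging Face’s trl library; the toy dataset, output path, and hyperparameters are assumptions for illustration, not DeepSeek’s actual pipeline.

```python
# Illustrative distillation-style fine-tuning: the Qwen3-8B "student" is
# fine-tuned on text generated by the larger updated-R1 "teacher."
# The toy dataset and settings below are placeholders, not DeepSeek's recipe.
from datasets import Dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")

# Hypothetical corpus: prompts paired with reasoning traces the teacher produced.
teacher_traces = Dataset.from_dict({
    "text": [
        "Problem: Compute 3^4.\nReasoning: 3^4 = 3*3*3*3 = 81.\nAnswer: 81.",
    ]
})

trainer = SFTTrainer(
    model=student,
    train_dataset=teacher_traces,
    args=SFTConfig(output_dir="r1-0528-qwen3-8b-distilled"),
)
trainer.train()
```

In practice the teacher-generated corpus would be far larger and filtered for quality; the point is only that the student learns from the teacher’s outputs rather than from its weights.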

DeepSeek-R1-0528-Qwen3-8B is available under a permissive license that allows commercial use without restriction. Several hosts, including LM Studio, already offer the model through an API.
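As a concrete example of that kind of access, LM Studio exposes downloaded models through a local OpenAI-compatible server. The sketch below assumes its default address and a hypothetical model identifier; check your host’s documentation for the exact values.

```python
# Querying the distilled model through an OpenAI-compatible endpoint such as
# LM Studio's local server. The base URL, API key, and model name below are
# assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's default local server
    api_key="lm-studio",                  # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="deepseek-r1-0528-qwen3-8b",    # hypothetical identifier; use the one your host lists
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(response.choices[0].message.content)
```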


