https://huggingface.co/deepseek-ai/DeepSeek-R1
So, it looks like SambaNova is going to be removing free access to Llama 3.1 Instruct 405B soon. With the release of DeepSeek R1 and the wide array of models they have released, it makes me wonder how many parameters the model in the app is using.
I can't find a clear answer - albeit I didn't look for TOO long. SambaNova was clearly flexing their tech by offering Llama 3.1 Instruct 405B for free at over 100 tokens/second - a marketing ploy. Makes sense, because offering a model that big for free would take serious resources.
Resources I'm not sure DeepSeek has, in spite of their impressive model and hedge fund daddies.
OR maybe I'm wrong, and they want to throw some weight around and put the big 671B model out for free in the app for the whole world to see. I don't think they want to burn cash like that... but maybe I'm wrong...
Does anybody have any insight into how many parameters the models in the DeepSeek app have, specifically the ones available for public use in their free offering?
Excuse me if this is a dumb question, I'm a complete amateur at this, but I'm curious: I know you can download different DeepSeek-R1 models locally, and they range from 1.5 GB up to 402 GB in size. But which of these models does the online version of DeepSeek use? Thank you.
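For a rough sense of how parameter counts relate to download sizes like the ones mentioned above, here is a minimal back-of-the-envelope sketch. It assumes the weights dominate the file size and that every parameter is stored at the same bit width. The bit widths (16, 8, 4) are illustrative assumptions, not confirmed details of any particular DeepSeek or Llama release, and `weight_size_gb` is a made-up helper name. The 671B and 405B figures are the ones from the post.

```python
def weight_size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores file-format overhead, activations, and KV cache, so real
    downloads and real memory use will differ.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the post; bit widths are assumptions.
    for name, params_b in [("671B model", 671), ("405B model", 405)]:
        for bits in (16, 8, 4):
            size = weight_size_gb(params_b, bits)
            print(f"{name} at {bits}-bit: ~{size:.1f} GB")
```

This only estimates storage for the weights. It says nothing about which model the app actually serves, which is still the open question here.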