🇨🇦 🇧🇬 🇪🇺
* fosstodon.org/@kfet
* threads.net/@kalinf
* github.com/kfet
Not sure if they used Mistral’s work though.
DeepSeek only got a good MoE model much later. I think that's why I got the impression they followed Mistral's work.