Masooka
@masooka.bsky.social
Machine learning enthusiast.
Exploring Rust for competitive programming but finding stdin a bit slow. 🤔 Tempted to dive into the world of unsafe for that speed boost.
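Before reaching for unsafe, a minimal sketch of the usual safe fix, assuming the whole input fits in memory: lock stdin once, read everything up front, and parse tokens from the in-memory buffer, so no per-read locking happens. The input format here (a count followed by that many integers) is made up for illustration.

```rust
use std::io::{self, Read};

fn main() {
    // Each read through io::stdin() without locking re-acquires an
    // internal mutex; locking once and reading everything avoids that.
    let mut input = String::new();
    io::stdin().lock().read_to_string(&mut input).unwrap();

    // Parse tokens lazily from the buffer (hypothetical format:
    // a count n, then n integers to sum).
    let mut tokens = input.split_ascii_whitespace();
    let n: usize = tokens.next().unwrap().parse().unwrap();
    let sum: i64 = tokens
        .take(n)
        .map(|t| t.parse::<i64>().unwrap())
        .sum();
    println!("{sum}");
}
```

In my experience this alone closes most of the gap, without any unsafe.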
August 10, 2023 at 5:26 AM
This is an excellent starting point for grasping LLM concepts. The visualizations of a sequence-to-sequence model with attention were very helpful.

https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
Sequence-to-sequence models are deep learning models that have achieved a lot of success in tasks like machine translation, text summarization, and image captioning. Google Translate started using such a model in production in late 2016. These models are explained in the two pioneering papers (Sutskever et al., 2014; Cho et al., 2014). I found, however, that understanding the model well enough to implement it requires unraveling a series of concepts that build on top of each other. I thought that a bunch of these ideas would be more accessible if expressed visually. That’s what I aim to do in this post. You’ll need some previous understanding of deep learning to get through this post. I hope it can be a useful companion to reading the papers mentioned above (and the attention papers linked later in the post). A sequence-to-sequence model is a model that takes a sequence of items (words, letters, features of an image, etc.) and outputs another sequence of items.
jalammar.github.io
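For anyone who wants the mechanics without the animations: the attention step the post visualizes is just score, softmax, weighted sum at each decoding step. A rough sketch in plain Rust, using dot-product scoring for simplicity (the article also covers learned/additive scoring; none of this is the article's own code):

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Numerically stable softmax over raw attention scores.
fn softmax(scores: &[f32]) -> Vec<f32> {
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// One attention step: score each encoder hidden state against the
/// current decoder state, normalize the scores, and take the weighted
/// sum of encoder states as the context vector for the next prediction.
fn attention(encoder_states: &[Vec<f32>], decoder_state: &[f32]) -> Vec<f32> {
    let scores: Vec<f32> = encoder_states
        .iter()
        .map(|h| dot(h, decoder_state))
        .collect();
    let weights = softmax(&scores);

    let mut context = vec![0.0; decoder_state.len()];
    for (w, h) in weights.iter().zip(encoder_states) {
        for (c, x) in context.iter_mut().zip(h) {
            *c += w * x;
        }
    }
    context
}

fn main() {
    // Toy 2-dimensional hidden states for a 3-token source sentence.
    let encoder_states = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.5, 0.5]];
    let decoder_state = vec![1.0, 0.0];
    println!("{:?}", attention(&encoder_states, &decoder_state));
}
```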
May 20, 2023 at 2:04 PM
Just started exploring Mojo, a new programming language for AI. It looks quite promising.

https://docs.modular.com/mojo/
Modular Docs - Mojo🔥
A new programming language that bridges the gap between research and production by combining the best of Python with systems and metaprogramming.
docs.modular.com
May 12, 2023 at 1:58 AM