Lightnews — Scholar-powered news

Marco

@mcognetta.bsky.social

2.3K followers 1.3K following 710 posts

Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.

I like computers and Korean and computers-and-Korean and high school CS education.

Georgia Tech → 연세대학교 → 東京工業大学.

https://theoreticallygoodwithcomputers.com/

Pinned

Marco @mcognetta.bsky.social · Jan 5

A lot of you followed me due to #NLP, but I like to post about #chess (especially computer chess), #programming (especially puzzles, code golf, etc), and machine learning.

And some less technical stuff like #Korean, #Esperanto, and #trains (mostly in Japan, just due to proximity).

Reposted by Marco

Recurse Center

@recursecenter.bsky.social

The Recurse Center is a self-directed retreat for programmers who come to make for the joy of making, collaborate with kind peers, and, of course, become dramatically better programmers. We don’t charge tuition, since we’re fully funded by our integrated recruiting team.

Applications are now open!

November 22, 2025 at 10:04 PM

Reposted by Marco

Michael Saxon

@saxon.me

And here is the presentation I gave on networking, self-promo, and how to make the most out of a conference. Hope this helps for everyone at NeurIPS!

www.youtube.com/watch?v=B9hG...

Conferencemaxxing: How to grow your profile and network as a scientist

YouTube video by Michael Saxon (NLP & Generative AI research)

www.youtube.com

November 19, 2025 at 11:59 PM

Reposted by Marco

Marco

@mcognetta.bsky.social

My strength is the breadth of my opening repertoire.

December 17, 2024 at 12:16 PM

Reposted by Marco

Jeff Dean

@jeffdean.bsky.social

I’m really excited about our release of Gemini 3 today, the result of hard work by many, many people in the Gemini team and all across Google! 🎊

blog.google/products/gem...

Gemini 3 performs quite well on a wide range of benchmarks.

November 19, 2025 at 2:53 AM

Reposted by Marco

Gasper Begus

@begus.bsky.social

A whale conversation in whale vowels. Pinchy the whale and her conversant.

The vowels are so clear that they can be transcribed with our human letters.

aye, aye!

November 19, 2025 at 12:22 AM

Reposted by Marco

Marisa Hudspeth

@marisahudspeth.bsky.social

(1/2) 🎉 New preprint: "Contextual Morphologically-Guided Tokenization for Latin Encoder Models"
w/ @diyclassics.bsky.social @brenocon.bsky.social

November 14, 2025 at 8:02 PM

Reposted by Marco

mulboyne.bsky.social

@mulboyne.bsky.social

Through November, JR East is hosting live music performances in a Green Car running on the Chuo Line. www.traicy.com/posts/202511...

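JR East to hold live music events in Chuo Line Rapid Green Cars: one round trip per day on weekends and holidays in November - TRAICY

JR East will hold live music events in Chuo Line Rapid Green Cars on weekends and holidays from November 1 to 30. Professional musicians will perform live on the lower level of Green Car No. 5. The designated train and section are the rapid service bound for Tokyo Station departing Toyoda Station at 1:41 PM […]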

www.traicy.com

November 18, 2025 at 2:41 AM

Reposted by Marco

Marco

@mcognetta.bsky.social

I wrote a short blog post about masked softmax layers in PyTorch (i.e., when you have structural constraints that tell you some classes _must_ have probability zero).

This was based on a real bug I found in a neural chess model implementation.

Masked Softmax Layers in PyTorch

Correctly computing masked softmax layers.

mcognetta.github.io

November 3, 2025 at 7:39 PM
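
For readers unfamiliar with the idea, here is a minimal, framework-free sketch of a masked softmax. It is not the code from the linked blog post and it does not use PyTorch; the function name `masked_softmax`, the boolean `mask` convention (True means the class is allowed), and the sample values are all assumptions made for illustration. The post does not say what the bug was, so the comparison at the end only shows one common pitfall: zeroing out forbidden classes after an ordinary softmax leaves probabilities that no longer sum to 1.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the allowed classes only.

    mask[i] is True when class i is allowed (an assumed convention).
    Forbidden classes get probability exactly 0.0.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    allowed = [x for x, keep in zip(logits, mask) if keep]
    if not allowed:
        raise ValueError("at least one class must be allowed")
    # Subtract the max over allowed logits for numerical stability.
    peak = max(allowed)
    exps = [math.exp(x - peak) if keep else 0.0 for x, keep in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def plain_softmax(logits):
    """Ordinary softmax, used only for the comparison below."""
    return masked_softmax(logits, [True] * len(logits))


# Hypothetical example: classes 0 and 3 are structurally impossible.
logits = [5.0, 1.0, 2.0, 0.5]
mask = [False, True, True, False]

probs = masked_softmax(logits, mask)

# Pitfall: masking after the softmax leaves a distribution that sums to < 1.
masked_too_late = [p if keep else 0.0 for p, keep in zip(plain_softmax(logits), mask)]
```

In this sketch `probs` sums to 1 with zeros at the forbidden positions, while `masked_too_late` does not sum to 1 because most of the mass was assigned to the forbidden class 0 before masking.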

Marco

@mcognetta.bsky.social

Do we think Pepsi is trying to replicate "AI co-created" Coke or GenAI Coke Christmas commercials?

Job posting for an AI Engineer at Pepsi.

All the different wheel configurations in the Coke Christmas ad (partially generated by AI).

November 17, 2025 at 1:36 AM

Marco

@mcognetta.bsky.social

My apartment in Tokyo was too small for an espresso machine, but I'm back at it now.

November 16, 2025 at 11:46 PM

Reposted by Marco

Brittany Ellich

@brittanyellich.com

Are we still doing starter packs?

Put this one together because I love seeing things that lovely folks write on the internet, and I'm sure there are more people to meet and add to this list.

go.bsky.app/AnM2t7r

November 15, 2025 at 7:23 PM

Reposted by Marco

hardmaru

@hardmaru.bsky.social

Great to see Tarin Clanuwat featured for her amazing work. She has a deep love for Japanese classical literature and is using AI to build bridges to that past for everyone.

www.tokyoupdates.metro.tokyo.lg.jp/post-1670/

We’re lucky to have her driving this at Sakana AI.

November 14, 2025 at 3:55 AM

Marco

@mcognetta.bsky.social

Incredible figure for the first page. Just brutal.

November 13, 2025 at 10:15 PM

Reposted by Marco

Owen Lacey

@owenlacey.dev

Really happy to have published this post that I've been working on for a few months now 🥰

Safe to say I enjoy these side quests - I'd like to think it's the first of many!

blog.owenlacey.dev/posts/are-yo...

"Are you the one?" is free money

blog.owenlacey.dev

November 10, 2025 at 2:35 PM

Marco

@mcognetta.bsky.social

A side channel attack on streaming LLMs where one can recover conversation topics while only seeing encrypted packet response streams.

arxiv.org/abs/2511.03675

Whisper Leak: A novel side-channel attack on remote language models | Microsoft Security Blog

Understand the risks of encrypted AI traffic exposure and explore practical steps users and cloud providers can take to stay secure. Learn more.

www.microsoft.com

November 10, 2025 at 6:11 AM

Reposted by Marco

Marco

@mcognetta.bsky.social

I was struck with an incredible thought: The Subword Tolkienizer.

The One Ring inscription, but after subword tokenization.

November 8, 2025 at 7:58 AM

Reposted by Marco

EMNLP

@emnlpmeeting.bsky.social

🎉 Congratulations to all #EMNLP2025 award winners 🎉

Starting with the ✨Best Paper award ✨:

"Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index"
by Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi
aclanthology.org/2025.emnlp-m...

1/n

An image of the best paper slide at the EMNLP2025 conference, with the audience in the background

November 7, 2025 at 10:29 PM

Reposted by Marco

Sam Rose

@samwho.dev

Got to the part on "temperature". I was aware that a higher temperature == less predictable, but never knew why.

Turns out it's very simple. Before the "scores" for a set of tokens are turned into a probability distribution, they're divided by the temperature. Higher values "flatten" the distribution.

November 6, 2025 at 5:47 PM
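
A small sketch of the temperature mechanism described in this post, in plain Python. The function name `softmax_with_temperature` and the sample scores are assumptions for illustration and are not taken from the post or from any particular model's sampling code.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Divide each score by the temperature, then apply a softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    # Subtract the max for numerical stability; the result is unchanged.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical token scores (logits).
scores = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(scores, temperature=0.5)  # more peaked
base = softmax_with_temperature(scores, temperature=1.0)   # plain softmax
flat = softmax_with_temperature(scores, temperature=2.0)   # flatter
```

Dividing by a temperature above 1 shrinks the gaps between scores, so the top token's probability in `flat` is lower than in `base`, which in turn is lower than in `sharp`.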

Reposted by Marco

Taylor Smith

@taylorjsmith.bsky.social

Just added my book, "Theory of Computing: An Open Introduction" to OER Commons, and working on getting it listed in Canadian repositories too. One step closer to making education more open and accessible to everyone!
oercommons.org/courses/theo...

Theory of Computing: An Open Introduction

This book is suitable for courses on the theory of computing at both the undergraduate and graduate levels, and for self-study. Topics are introduced in a logical order: we begin with the simple finit...

oercommons.org

November 6, 2025 at 6:12 PM

Reposted by Marco

Tomer Ullman

@tomerullman.bsky.social

It’s grad school application season, and I wanted to give some public advice.

Caveats:
-*-*-*-*

 > These are my opinions, based on my experiences, they are not secret tricks or guarantees
 > They are general guidelines, not meant to cover a host of idiosyncrasies and special cases

November 6, 2025 at 2:55 PM

Marco

@mcognetta.bsky.social

This is very high on my list of advice for PhD applicants.

I've written two SoPs (master's and PhD), and the similarities between what I wrote about in them and what I wrote my theses on end roughly at "written in English".

Tomer Ullman @tomerullman.bsky.social · 16d

Mistake 3, cont': people worry they narrow down by proposing specific questions ("What if this is not the EXACT thing I want to work on in grad school?").

But an SoP is not a *contract*, it will not be waved in front of you when starting grad school.

November 7, 2025 at 12:20 AM

Reposted by Marco

Bluesky

@bsky.app

y'all seem to really like baseball bsky.social/about/blog/1...

The World Series Was Electric — So Was Bluesky - Bluesky

“How can you not be romantic about baseball?” — Moneyball 2011

bsky.social

November 6, 2025 at 9:58 PM

Reposted by Marco

Gabriele Sarti

@gsarti.com

Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗

Paper: aclanthology.org/2025.emnlp-m...
Slides/video/poster: underline.io/events/502/s...

Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement

Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.

aclanthology.org

November 6, 2025 at 1:19 AM
