Marco
mcognetta.bsky.social
Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.

I like computers and Korean and computers-and-Korean and high school CS education.

Georgia Tech → 연세대학교 → 東京工業大学.

https://theoreticallygoodwithcomputers.com/
Reposted by Marco
I will be presenting our paper about tokenizer inequities at the main conference on Dec 4th at 11am (Poster Session 3) bsky.app/profile/cath...
Our #NeurIPS2025 paper shows that even comparable monolingual tokenizers have different compression rates across languages. But by getting rid of whitespace tokenization and using a custom vocab size for each language, we can reduce token premiums. Preprint out now!
November 24, 2025 at 5:20 PM
The cost to me is basically "how much time did it take to write the prompt" + "how much time did it take to verify the output".

The initial start up time of getting the agent to write the code is negligible.
November 23, 2025 at 10:47 PM
2) is much more interesting imo, since the agent could still take a long time to complete each task individually if it is the automation.

But if it is able to write a program that basically completes the task as fast as possible, then I really don't care how long it takes to write the program.
November 23, 2025 at 10:47 PM
This can be interpreted in two ways:

1) the amount of time that it takes the agent to complete the task

but more importantly

2) the amount of time it takes the agent to complete the task of "writing a program to automate this task"
November 23, 2025 at 10:47 PM
Reposted by Marco
The Recurse Center is a self-directed retreat for programmers, coming to make for the joy of making, collaborate with kind peers, and, of course, become a dramatically better programmer. We don’t charge tuition, since we’re fully funded by our integrated recruiting team.

Applications are now open!
November 22, 2025 at 10:04 PM
100% worth it. The cities are extremely English friendly also.
November 19, 2025 at 6:28 AM
I used to get this 부추국수 (garlic chive noodles) that had a preposterous amount of 부추 (garlic chives) and 유부 (fried tofu), and it was like 6000₩. It was awesome.
November 19, 2025 at 6:28 AM
Reposted by Marco
My strength is the breadth of my opening repertoire.
December 17, 2024 at 12:16 PM
But TBH I don't really think it's an LLM problem, because they could probably do a pretty good job of synthesizing your thoughts if you gave them those thoughts as a stream of consciousness or bullet points.

It's more the system that has incentivized and allowed things like this to happen that is the problem.
November 17, 2025 at 8:53 AM
I'm personally pretty bummed out about LLM usage in reviews. I've seen it so much lately (in the reviews of papers I have written and in the other reviews of papers I have reviewed).
November 17, 2025 at 8:53 AM
It's definitely a grey area and I think a lot of people have good intentions when using it, so I would be hesitant to lay down a blanket ruling.

Sucks that they've been used so blatantly and badly (well, maybe it's actually good since it prompted the community to start having this conversation).
November 17, 2025 at 8:46 AM
I'd say no to rephrasing but yes to "point out ambiguities and places to rephrase, etc." and then rewriting it yourself.

I use LLMs in very very limited cases for things like reviews. Mostly as supercharged spell check, but never anything that rewrites more than ~3 words in a row.
November 17, 2025 at 8:46 AM
Sydney and Melbourne are on my list for the coffee.
November 17, 2025 at 4:48 AM
Having a daily coffee spot was really nice for me. I like being a regular at places and even though I really enjoy making coffee every morning (and the savings), I miss having "that spot".
November 17, 2025 at 3:06 AM
Good setup! It's tough in apartments in Asia!

Changing my tastes to like pour overs would have been the more prudent option financially, but in the end I just got a moka pot and then still went to my neighborhood spot for espresso.
November 17, 2025 at 3:06 AM