Delip Rao
@deliprao.bsky.social
Building. Affiliations: @JHU, @Penn, @UCSC, @Amazon, @Twitter || Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Reposted by Delip Rao
Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
November 7, 2025 at 9:11 PM
Reposted by Delip Rao
Yeah, posting something that big for us 2 min before the weekend in the US and late in the evening in France is so not ideal right before a 4-day weekend here, lol, so we'll redo it and tell you guys much more... #TrainingTragedy
Tbh the only visual allegory possible is this...
November 7, 2025 at 10:51 PM
Reposted by Delip Rao
😳 WithdrarXiv 🙏

- Dataset of 14K+ withdrawn arXiv papers
- associated retraction comments
- entire history through 09/24
- taxonomy of retraction reasons, from critical errors to policy violations
- WithdrarXiv-SciFy, enriched version w/ scripts for parsed full-text PDFs

arxiv.org/abs/2412.03775
WithdrarXiv: A Large-Scale Dataset for Retraction Study
Retractions play a vital role in maintaining scientific integrity, yet systematic studies of retractions in computer science and other STEM fields remain scarce. We present WithdrarXiv, the first larg...
arxiv.org
December 15, 2024 at 6:34 PM
Reposted by Delip Rao
Stumbled across this post on Substack by
@deliprao.bsky.social today that I really appreciated as someone trying to break into the field. Simple categorizations can seem trite at times, but they can be deceptively profound in breaking down complex problems.

substack.com/home/post/p-...
Juicy Research Ideas and How to Find them?
How do people come up with research ideas in AI? Will the "AI Scientist" finally make me work full-time on my chicken farm?
substack.com
December 9, 2024 at 1:04 AM
anyone on my TL can endorse me for cs.DL (digital libraries) on arXiv? 🙏
December 4, 2024 at 10:56 PM
Reposted by Delip Rao
Releasing: a dataset of two million Bluesky posts.

This dataset has been collected using Bluesky's API, and I hope it will be useful for all the researchers out there!
November 27, 2024 at 7:13 PM
May I propose beets
November 23, 2024 at 2:35 AM
Did you just get your BlueSky invite? great! Now, help me complete my threads graph. 😘

https://www.threads.net/@delip.rao
July 6, 2023 at 3:09 AM
Posts here are called beets. I don’t make the rules.
April 28, 2023 at 4:31 AM
Reposted by Delip Rao
get in loser

we’re re-territorializing the hilbert space
April 28, 2023 at 1:17 AM