Dallas Card
dallascard.bsky.social
Assistant professor at https://si.umich.edu/ working in computational social science, machine learning, and NLP | https://dallascard.github.io
Ah yes, good point! That's a careless misuse of language on my part. What I meant is that I think the resulting estimate should be correct in expectation (with respect to the sample of labeled/unlabeled data), regardless of the amount of labeled data, but with lower variance for larger samples.
November 18, 2025 at 4:25 PM
I think you're right, although I also cynically expect that a unique first author requirement would lead to a lot of fake first authors (depending on the venue) 😅
November 15, 2025 at 11:22 PM
Excellent thread! This also reminds me of David Bamman's work on films as data, which, if I understand correctly, you *are* legally allowed to use for research, as long as you own and retain a physical copy, and as long as you don't enjoy watching it : )

www.pnas.org/doi/abs/10.1...
Measuring diversity in Hollywood through the large-scale computational analysis of film | PNAS
Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in ...
November 14, 2025 at 10:12 PM
I wonder what would happen if a major conference made a rule that each author is only allowed to submit one paper per cycle? Obviously there would be far fewer total submissions, and many papers would be redirected elsewhere, but could they convince people to only send their best work?
November 14, 2025 at 10:00 PM