Amanvir Parhar
amanvir.bsky.social
building azigy.com • studying @ umd cs • making things amanvir.com • writing venusgirdle.com
Thanks Xavier, really glad you like the site! 😄
June 18, 2025 at 12:44 AM
If you’d like to check out some obscure islands that I find interesting... :)
Obscure Islands I Find Interesting
Explore this curated map of interesting yet relatively unknown islands.
amanvir.com
May 26, 2025 at 10:40 AM
Feels surreal seeing a creative project of mine in such a major publication!
Summer Culture Preview
What’s happening this season in TV, movies, music, art, theatre, and dance.
www.newyorker.com
May 26, 2025 at 10:40 AM
If you'd like to run the visualization and play around with it, please feel free to check it out on my website: amanvir.com/gpt-2-attention!
GPT-2's Attention Weights, Visualized
A tool to visualize attention patterns in the GPT-2 model as it generates text.
amanvir.com
April 20, 2025 at 5:31 PM
I ended up writing an entire article on my website that explains what Dawkins' weasel is and how it can be used to create the cool effect you see in the video below: amanvir.com/blog/animati...

Feel free to check it out, and let me know what you think!
Animating Text with Dawkins' Weasel
Using Richard Dawkins' famous evolutionary simulation to create a text scramble effect.
amanvir.com
March 18, 2025 at 11:02 AM
Haha, that sounds awesome! Truly a win-win 😄
February 28, 2025 at 8:07 PM
If you want to check out the post, please visit: amanvir.com/blog/turning-my-esp32-into-a-dns-sinkhole
Turning My ESP32 into a DNS Sinkhole to Fight Doomscrolling
A fun project that taught me a lot about DNS and low-level programming.
amanvir.com
February 28, 2025 at 10:35 AM
Currently, this is a pretty simple program, but I hope that I (or anyone else in the community!) can expand on it by adding functionality to tackle typos, gibberish text, and other issues found within public domain literature.

Here’s the GitHub repo for this project:
github.com/amanvirparha...
GitHub - amanvirparhar/gempress: A script to fix basic typesetting and formatting issues in public domain eBooks.
github.com
February 12, 2025 at 11:33 PM
These indices are used to create a new ePub that *only* includes meaningful parts of the book's text - the parts that we're interested in reading!
February 12, 2025 at 11:33 PM
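A rough sketch of how chapter indices like these could be used to keep only the wanted paragraphs. The field names (`chapters`, `first_paragraph`, `last_paragraph`) and the helper `select_chapters` are hypothetical and not taken from the gempress repo. The sketch stops at selecting text and does not build the ePub itself.

```python
import json


def select_chapters(paragraphs: list[str], response_json: str) -> list[list[str]]:
    """Return one list of paragraphs per chapter, using inclusive index ranges.

    `response_json` is assumed to look like:
    {"chapters": [{"first_paragraph": 2, "last_paragraph": 3}]}
    (hypothetical field names, for illustration only).
    """
    chapters = json.loads(response_json)["chapters"]
    kept = []
    for chapter in chapters:
        first = chapter["first_paragraph"]
        last = chapter["last_paragraph"]
        # Both ends are inclusive, so the slice needs last + 1.
        kept.append(paragraphs[first:last + 1])
    return kept


# Toy input: front matter and back matter surround the part we want to keep.
paragraphs = [
    "Title page",
    "Contents",
    "Poem one, line a",
    "Poem one, line b",
    "Publisher notes",
]
response = '{"chapters": [{"first_paragraph": 2, "last_paragraph": 3}]}'
kept = select_chapters(paragraphs, response)
```

Everything outside the returned ranges (duplicate tables of contents, boilerplate, and so on) is simply left out.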
The model is asked to respond with JSON that contains the indices for the first and last paragraphs of each chapter. Gemini is able to accurately return the index of any given paragraph, because each “paragraph” from the book’s raw text is wrapped with a numbered tag that the model can reference:
February 12, 2025 at 11:33 PM
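A minimal sketch of the numbered-tag idea, assuming paragraphs are separated by blank lines. The `<p id="N">` tag format and the helper `tag_paragraphs` are assumptions for illustration; the tag format the script actually uses may differ.

```python
def tag_paragraphs(raw_text: str) -> tuple[list[str], str]:
    """Split raw text on blank lines and wrap each paragraph in a numbered tag.

    Returns the list of paragraphs (so indices can be resolved later) and the
    tagged text that would be sent to the model.
    """
    paragraphs = [p.strip() for p in raw_text.split("\n\n") if p.strip()]
    # Hypothetical tag format: the index in the tag matches the list index.
    tagged = "\n".join(
        f'<p id="{i}">{p}</p>' for i, p in enumerate(paragraphs)
    )
    return paragraphs, tagged


raw = "Chapter I\n\nFirst paragraph.\n\nSecond paragraph."
paragraphs, tagged = tag_paragraphs(raw)
print(tagged)
```

Because each tag carries the same number as the paragraph's position in the list, any index the model returns can be mapped straight back to the original text.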
The script I ended up writing works like this: the model first takes in a prompt and the book’s content. It's not asked to return a corrected version of the entire book (the output limit is only ~8k tokens), but is instead asked to return JSON that conforms to a provided schema (structured generation).
February 12, 2025 at 11:33 PM
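To make the structured-generation step concrete, here is a sketch of what such a schema and a sanity check on the reply might look like. The schema shape, the field names, and `check_response` are all hypothetical; no Gemini API call is shown, and the real script's schema may be different.

```python
import json

# Hypothetical JSON-Schema-style description of the expected reply.
CHAPTER_SCHEMA = {
    "type": "object",
    "properties": {
        "chapters": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "first_paragraph": {"type": "integer"},
                    "last_paragraph": {"type": "integer"},
                },
                "required": ["first_paragraph", "last_paragraph"],
            },
        }
    },
    "required": ["chapters"],
}


def check_response(text: str, paragraph_count: int) -> list[tuple[int, int]]:
    """Parse the model's JSON reply and verify every index range is usable."""
    data = json.loads(text)
    ranges = []
    for chapter in data["chapters"]:
        first = chapter["first_paragraph"]
        last = chapter["last_paragraph"]
        if not (isinstance(first, int) and isinstance(last, int)):
            raise ValueError("paragraph indices must be integers")
        if not (0 <= first <= last < paragraph_count):
            raise ValueError(f"index range out of bounds: {first}-{last}")
        ranges.append((first, last))
    return ranges


reply = '{"chapters": [{"first_paragraph": 2, "last_paragraph": 5}]}'
ranges = check_response(reply, paragraph_count=10)
```

A reply made of index pairs stays tiny no matter how long the book is, which is what keeps it well under the output limit.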
The model also performs pretty well at “needle in a haystack”-type tests, so it seemed like the perfect choice! (some tweets by @jeffdean.bsky.social!)
February 12, 2025 at 11:33 PM
I started thinking about fixing some of these typesetting/formatting issues using an LLM, and I quickly realized that Gemini would be perfect for this project - the 1M+ token context window would allow me to paste the text for an entire book in one shot.
February 12, 2025 at 11:33 PM
In my experience, most books on Project Gutenberg are unfortunately formatted in a non-standard way, and a lot of books are actually plagued with much bigger issues than the one I was looking for.

This is essentially the reason why initiatives like standardebooks.com exist.
Standard Ebooks
Free and liberated ebooks, carefully produced for the true book lover. Download free ebooks with professional-quality formatting and typography, in formats compatible with your ereader.
standardebooks.com
February 12, 2025 at 11:33 PM
The ePub I found wasn’t properly formatted: it contained a duplicate, malformed copy of the table of contents spanning several pages, and the worst part was that every single poem was displayed in monospace for some reason.
February 12, 2025 at 11:33 PM
This project was born out of a personal need: I was looking for a poetry collection by Robert Frost on gutenberg.org, a repository with thousands of free eBooks, but when I downloaded the ePub for this particular book, I was rather disappointed.
February 12, 2025 at 11:33 PM
Hey, I'm the creator of this site!

Thank you so much for your kind words, I really appreciate it!
February 11, 2025 at 6:05 AM