Nick Wan
nickwan.bsky.social
Just fyi yall, I retired from content creation and am going all in back to my shitposting roots. So if you’re here for like… data science or neuroscience or whatever… that ain’t me anymore yall.

Like, the only pic of anything remotely data related is this plot of Jon Lester’s pitch mix splits
November 15, 2025 at 2:31 AM
Twitter dms are so f’d they are not only impossible to sift through unread but I have to turn off any notifs for dms now because I’m getting spammed with fake new dms. sajj

I only answer here and discord from now on :(
November 14, 2025 at 4:54 AM
You may be thinking “how can I really tell if nw took this pic? Could this be a stock photo?” but I assure you I took this pic you are just going to have to trust me
November 4, 2025 at 5:07 PM
went to 10 and added shyvana over sej. what a silly comp
November 4, 2025 at 12:42 AM
lmfao NO MONDAY NIGHT FOOTBALL MAKES ME ANGRY
November 4, 2025 at 12:39 AM
i'm in set revival going gigaweird
November 4, 2025 at 12:34 AM
✌️
October 30, 2025 at 3:47 AM
From @nytimes.com, imagine this game cracking 7 hrs ish if this was 2022
October 28, 2025 at 6:45 AM
They are really big
October 28, 2025 at 6:12 AM
September 28, 2025 at 10:07 PM
tried googling for a pitch trajectory function in python but didn't find one easily. made a gist for everyone

gist.github.com/nickwan/ce5c...
September 11, 2025 at 4:06 AM
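For anyone who wants the general idea without opening the gist: below is a minimal constant-acceleration sketch of a pitch trajectory. It is not the code from the gist. The function name `pitch_trajectory`, its parameters, and the sample numbers are all hypothetical; the assumptions are Statcast-style inputs in feet and seconds, a starting point at y = 50 ft, and the front of the plate at y = 17/12 ft.

```python
import numpy as np


def pitch_trajectory(x0, y0, z0, vx0, vy0, vz0, ax, ay, az,
                     y_end=17 / 12, n_points=50):
    """Constant-acceleration flight path from (x0, y0, z0) until y == y_end.

    Assumption: units are feet and seconds, and vy0 is negative
    (the ball travels toward the plate, i.e. toward smaller y).
    Returns (t, xyz) where xyz has shape (n_points, 3).
    """
    if ay == 0:
        t_end = (y_end - y0) / vy0
    else:
        # Solve y0 + vy0*t + 0.5*ay*t^2 = y_end for the first positive root.
        disc = vy0 ** 2 - 2 * ay * (y0 - y_end)
        t_end = (-vy0 - np.sqrt(disc)) / ay

    t = np.linspace(0.0, t_end, n_points)
    start = np.array([x0, y0, z0], dtype=float)
    vel = np.array([vx0, vy0, vz0], dtype=float)
    acc = np.array([ax, ay, az], dtype=float)

    # p(t) = p0 + v0*t + 0.5*a*t^2, evaluated for every time step at once.
    xyz = start + np.outer(t, vel) + 0.5 * np.outer(t ** 2, acc)
    return t, xyz


# Made-up inputs for illustration only, not a real pitch.
t, xyz = pitch_trajectory(x0=-2.0, y0=50.0, z0=6.0,
                          vx0=5.0, vy0=-130.0, vz0=-4.0,
                          ax=-10.0, ay=25.0, az=-20.0)
print(f"flight time: {t[-1]:.3f} s, plate location (x, z): "
      f"({xyz[-1, 0]:.2f}, {xyz[-1, 2]:.2f}) ft")
```

Each row of `xyz` is one (x, y, z) point along the path, so it can go straight into a line plot.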
really this was just a methodological dive into the subtraction method. take two models, one with a feature (in this case, venue ID) and one without. subtract the outputs. the thing causing the difference is the feature. my top 10 and savant's top 10 are pretty close.
September 7, 2025 at 5:33 PM
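A minimal sketch of the subtraction method described above, under loud assumptions: the post doesn't say which model type was used, so plain least-squares regression stands in here, and the data is synthetic with a venue effect baked in on purpose. The names `fit_predict` and `venue_effects` are hypothetical.

```python
import numpy as np


def fit_predict(X, y):
    """Least-squares fit with an intercept; returns in-sample predictions."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


def venue_effects(base_features, venue_ids, y):
    """Average (with-venue minus without-venue) prediction per venue."""
    venues = np.unique(venue_ids)
    one_hot = (venue_ids[:, None] == venues[None, :]).astype(float)

    pred_without = fit_predict(base_features, y)
    pred_with = fit_predict(np.column_stack([base_features, one_hot]), y)

    # The only thing that changed between the two models is the venue
    # feature, so the difference is attributed to venue.
    diff = pred_with - pred_without
    return {v: float(diff[venue_ids == v].mean()) for v in venues.tolist()}


# Synthetic demo data (not real pitches): three venues with a planted effect.
rng = np.random.default_rng(0)
n = 3000
velo = rng.normal(93, 3, n)
venue_ids = rng.integers(0, 3, n)
planted = np.array([-0.05, 0.0, 0.05])
y = 0.25 + 0.01 * (velo - 93) + planted[venue_ids] + rng.normal(0, 0.05, n)

effects = venue_effects(velo[:, None], venue_ids, y)
for venue, effect in sorted(effects.items(), key=lambda kv: kv[1]):
    print(venue, round(effect, 3))
```

Sorting the resulting dict gives a park-factor-style ranking, which is the kind of top 10 that could be compared against another source.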
then i did game applebaum's "whiff factor" project, which is a take on park factors that includes more factors. this was also a project i didn't take completely 1:1 for replication -- mainly because i wanted to show chat how to do park factors without the typical park factors equation
September 7, 2025 at 5:14 PM
so since my model has different features, there's clearly a difference in distribution and prediction spread. i went on to show what happens if you included pitch location into my model and the distributions did start looking much better + the correlation was stronger
September 7, 2025 at 4:55 PM
then worked on replicating ben resnic's swing decision model. this one led to a fairly large discussion on stream about what features should go into a model about decision making. one issue i had was pitch location in this model, since the batter doesn't know that until after they commit to swinging
September 7, 2025 at 4:41 PM
chat distracted me again, so we started talking about whether swing accuracy (if the estimated pitch type = the actual pitch type) made sense. so i took a look at the top players in "swing accuracy" and their respective xwOBAcon. the diff should be converting pitch recognition into positive value
September 7, 2025 at 4:37 PM
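The "swing accuracy" definition above (estimated pitch type equals actual pitch type) boils down to a per-batter hit rate. A minimal sketch, assuming each swing is a `(batter, estimated_type, actual_type)` tuple; the function name, batter labels, and rows are made up for illustration.

```python
from collections import defaultdict


def swing_accuracy(rows):
    """Share of swings per batter where the estimated pitch type matched."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for batter, estimated, actual in rows:
        totals[batter] += 1
        hits[batter] += int(estimated == actual)
    return {batter: hits[batter] / totals[batter] for batter in totals}


# Made-up swings, not real data.
rows = [
    ("batter_a", "FF", "FF"),
    ("batter_a", "SL", "FF"),
    ("batter_a", "CH", "CH"),
    ("batter_a", "SL", "SL"),
    ("batter_b", "FF", "SL"),
    ("batter_b", "FF", "FF"),
]
acc = swing_accuracy(rows)
print(acc)
```

From there the per-batter accuracy can be joined to xwOBAcon to check whether better recognition goes with better contact quality.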
he goes on to do some clustering but i personally didn't see why since i would assume the pitch types on their own were good enough to predict. so went with the simple feature set to predict pitch types based on how a batter swings
September 7, 2025 at 4:34 PM
next thing i did was steve's swing process project. he talks about how swing path tilt and contact depth have signal in where a batter thinks the ball will be. ended up replicating the same image with ohtani's data (didn't filter for hard hit). i preferred the hexbin again, but that's just me
September 7, 2025 at 4:31 PM
they went into how they model expected power and contact. i went a little off the rails at that point because chat was distracting me and tried modeling bip% instead. not as good correlation, but the features i made are probably off since i asked gemini to do the trig for me. but overall, was fun!
September 7, 2025 at 4:27 PM
i went on to say rather than plotting three correlating variables in two ways, i personally think plotting a third var as color can help a ton. here are two plots that depict the barrel% x swstr% x bip% relationship. same data, the hexbin just aggregates a little for some viz smoothness
September 7, 2025 at 4:24 PM
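A rough sketch of the two plot styles described above, with the third variable mapped to color: a scatter and a hexbin of the same points. The data here is synthetic (randomly generated, not real barrel%, swstr%, or bip% values), and the axis assignment, colormap, and grid size are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for per-batter rates (made up for illustration).
rng = np.random.default_rng(0)
n = 500
swstr = rng.normal(11, 3, n).clip(2, 25)
barrel = rng.normal(8, 3, n).clip(0, 20)
bip = 30 - 0.8 * swstr + 0.1 * barrel + rng.normal(0, 1.5, n)

# Share one color scale so both panels are directly comparable.
vmin, vmax = bip.min(), bip.max()

fig, (ax_scatter, ax_hex) = plt.subplots(
    1, 2, figsize=(10, 4), sharex=True, sharey=True
)

# Raw points, colored by the third variable.
sc = ax_scatter.scatter(swstr, barrel, c=bip, cmap="viridis",
                        s=12, vmin=vmin, vmax=vmax)

# Same data, but each hexagon shows the mean of the third variable.
hb = ax_hex.hexbin(swstr, barrel, C=bip, reduce_C_function=np.mean,
                   gridsize=15, cmap="viridis", vmin=vmin, vmax=vmax)

ax_scatter.set_title("scatter")
ax_hex.set_title("hexbin (mean per cell)")
ax_scatter.set_ylabel("barrel% (synthetic)")
for ax in (ax_scatter, ax_hex):
    ax.set_xlabel("swstr% (synthetic)")
fig.colorbar(hb, ax=[ax_scatter, ax_hex], label="bip% (synthetic)")
```

Call `fig.savefig(...)` or `plt.show()` to see the result. The hexbin trades individual points for a smoother read of where the third variable runs high or low.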
they went on to demonstrate the contact RV and power RV correlation to BIP%. i got similar correlations with barrel% and swstr%
September 7, 2025 at 4:21 PM
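For reference, the correlation check can be as small as this. A sketch assuming Pearson correlation (the post doesn't say which kind was used) and synthetic stand-in data; `pearson_r` is a hypothetical helper name.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic stand-ins (made up): bip% falls as swstr% rises.
rng = np.random.default_rng(0)
n = 500
swstr = rng.normal(11, 3, n)
barrel = rng.normal(8, 3, n)
bip = 30 - 0.8 * swstr + 0.1 * barrel + rng.normal(0, 1.5, n)

r_swstr = pearson_r(swstr, bip)
r_barrel = pearson_r(barrel, bip)
print(f"swstr% vs bip%: r = {r_swstr:.2f}")
print(f"barrel% vs bip%: r = {r_barrel:.2f}")
```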
they used contact run value and power run value as somewhat of a base to discuss "what makes up good contact and power?". i didn't know if these were proprietary RV models or just the savant-provided RV in some aggregate, so i went with proxies (barrel% and swstr%), which have similar correlations
September 7, 2025 at 4:09 PM
First presentation I tried to replicate was "Path Finder" from the driveline guys. they presented a swing path model that is dependent on attack angle and attack direction. here's their plot and here's mine. pretty straightforward and easy to replicate. didn't do the poly fit because i was lazy
September 7, 2025 at 4:06 PM
Had a pigeon friend at work today
August 12, 2025 at 1:01 AM
thoughts?
June 19, 2025 at 9:36 AM