Stephen Martin
@stephensrmmartin.bsky.social
Probabilistic modeling and machine learning nerd
Social Psych PhD, quantitative methods
Data Scientist Manager, Comscore - Innovation Research Team.
Llama 3.1 was an excellent local model for its time, and it was very well received in the local model world.
But Gemma, Mistral models, and now Qwen models have far surpassed it.

Llama 4 was DOA. Too big for most local model users. Too small if you're going to use an API anyway.
Qwen has the 👑 now.
August 11, 2025 at 5:59 PM
It lets you move up a level of abstraction, for better or worse. I'm now more able to be a broader ideas person, which is heavily informed by years of experience and expertise in bespoke stats, analysis, modeling, coding, etc. Still solving problems, but in a less nitty-gritty, more supportive way.
August 5, 2025 at 3:52 AM
Some memory issues are too stubborn for post-boot memtesters. I had one that was awful: it never manifested in memtest86, only when the machine was under heavy load, so I had to boot minimally and stress test while checking.
Do you have XMP/EXPO/DOCP enabled?
February 4, 2025 at 4:41 AM
Have you done a proper memtest?
Filesystem errors + freezing sounds like memory could be a culprit.
February 4, 2025 at 3:52 AM
Bah nvm. As soon as I posted this I saw you took the same screenshot haha.
January 25, 2025 at 8:38 PM
From their own GitHub - I suspect it doesn't use one.
January 25, 2025 at 8:38 PM
It's still my gold standard. If it doesn't work in Stan, it is unlikely to work at all.
It's the perfect level of abstraction: not so abstract that it's hard to know what it's really doing, nor so low-level that you have to write a ton of boilerplate and specify every little thing. It's transparent.
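As a rough sketch of what that level of abstraction looks like, here is a minimal normal model written in Stan and fit from R via rstan (assuming rstan is installed; the data are simulated for illustration):

```r
library(rstan)

# A complete Stan program: declare the data, the parameters, and the model;
# Stan handles the sampling.
model_code <- "
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  sigma ~ exponential(1);
  y ~ normal(mu, sigma);
}
"

y <- rnorm(50, mean = 2, sd = 1)                 # simulated data
fit <- stan(model_code = model_code,
            data = list(N = length(y), y = y))
print(fit)
```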
January 11, 2025 at 3:23 AM
But seriously, I cannot imagine a world where Disney would be OK with a model trained on Disney content being used to generate Disney-like content. It could actually spur some changes in what is acceptable use of copyrighted works.
I'm not anti AI, but obviously there are problems with copyright.
December 29, 2024 at 6:03 PM
MELSM**
November 24, 2024 at 8:17 PM
Likewise - use location scale modeling methods if homogeneity is wrong: LSMs, MELSMs, LM-MELSMs, etc. But this is much easier with Bayes. Use wide-tailed distributions. There are a lot of ways models can be improved to better match the apparent DGP without even getting into process changes.
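For a rough sketch of what that can look like (assuming brms; the variables here are simulated placeholders), model the residual SD alongside the mean and swap the Gaussian for a wide-tailed Student-t likelihood:

```r
library(brms)

# Hypothetical data: x is a predictor; y has heteroskedastic, heavy-tailed noise.
dat <- data.frame(x = rnorm(200))
dat$y <- 2 * dat$x + rt(200, df = 3) * exp(0.5 * dat$x)

fit <- brm(
  bf(y ~ x,            # location (mean) submodel
     sigma ~ x),       # scale (residual SD) submodel
  data = dat,
  family = student()   # wide tails accommodate extreme points without deleting them
)
summary(fit)
```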
November 24, 2024 at 8:09 PM
This was why I wrote about "DOCOs" - "Data Otherwise Considered Outliers". People will remove outliers, when really you should only remove a point if it's truly a data-encoding error; otherwise, change your model to accommodate the process that produced the DOCO.
November 24, 2024 at 8:09 PM
I think this mentality is very common in Bayesian methods, perhaps in part because switching assumptions doesn't require a deep dive into how to produce accurate test statistics; the posterior is still tractable and equally interpretable.
November 24, 2024 at 8:09 PM
Well, this was incredibly helpful, thank you. Still trying to reconnect with all of my Twitter circle...
November 24, 2024 at 2:32 AM
And wouldn't most answer "Not well", which is consistent with their attitude toward the efficiency commission?
November 13, 2024 at 6:47 PM
Does anyone in academia claim that academia is run efficiently?
I recall one lab took the better part of a year to get a trash can for their lab because the department was unwilling to secure the funds from their own grant.
November 13, 2024 at 6:45 PM
1. I dunno? I see a lot of docs including info about returned objects. Certainly better than Python, anyway.
2. I'm still not sure why to include both. But I'm more concerned about how to make it seamless if many devs and users used it. So you can do what you want, but attr is better for all.
September 23, 2024 at 5:02 AM
By contrast, if everyone added metadata attributes and, by doing so, announced themselves as inheriting from a metadata type, then anyone could return any type, devs wouldn't need monads or special handling for metadata outputs, etc.
Makes it all more functional without diving into monads.
September 21, 2024 at 7:02 PM
If everyone were to adopt this method, then everyone would have to export either lists or a list of metadata plus their actual object. We'd then want monads to handle this, but most R users won't understand monads.
What of people who use S4? Output matrices? Vectors?
Etc.
September 21, 2024 at 7:01 PM
Yes, please do this. It's a much better practice that doesn't kill composability.
Use attr for metadata. Make a class for things with that metadata. Add methods for those classes.
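A minimal sketch of that pattern (the names here are hypothetical, not from any particular package): return the ordinary object, attach the metadata as an attribute, and prepend a class so generics can see it without changing the return type.

```r
with_meta <- function(x, meta) {
  attr(x, "meta") <- meta
  class(x) <- c("has_meta", class(x))
  x
}

meta <- function(x) attr(x, "meta")   # accessor for the metadata attribute

# A method for the metadata-carrying class: report the metadata, then defer
# to the usual summary of the underlying object.
summary.has_meta <- function(object, ...) {
  cat("metadata:", toString(names(meta(object))), "\n")
  NextMethod()
}

# Usage: the result is still a plain numeric vector, so it composes as before.
res <- with_meta(rnorm(5), list(seed = 42, created = Sys.time()))
mean(res)     # behaves like any numeric vector
meta(res)     # the metadata travels with the object
summary(res)  # dispatches to summary.has_meta
```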
September 20, 2024 at 11:26 PM