appenz.bsky.social
@appenz.bsky.social
And this is a full fine-tune of Flux dev on Replicate with two LoRAs and a good amount of parameter optimization. It's a different style, but also a whole different level of quality.
January 20, 2025 at 6:29 AM
For comparison, here is what Krea AI generates with a photo and a style to give it the animated-movie look. This does look a little like me.
January 20, 2025 at 6:29 AM
6/6 Full text of the "Ensuring U.S. Security and Economic Strength in the Age of Artificial Intelligence" below. I recommend loading it into an LLM and using it to find stuff.

Full text: public-inspection.federalregister.gov/2025-00636.pdf

Just don't trust the LLM to do the math. This is GPT-4o.
January 13, 2025 at 6:23 PM
5/6 No license is required for: Australia, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, Republic of Korea, Spain, Sweden, Taiwan, and the United Kingdom.

Singapore, Switzerland and Israel are missing.
January 13, 2025 at 6:23 PM
4/6 GPUs are covered if their "TPP" exceeds 4,800.

TPP is defined as TOPS * bit length (* 2 w/ sparsity). So for example:
- H100: 1,000 TOPS * 16 bit = 16,000 TPP 🚫
- A100: 312 TOPS * 16 bit = 4,992 TPP 🚫

Full details in the CCL: www.bis.doc.gov/index.php/d...
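
A minimal sketch of that TPP arithmetic in Python (my own illustration following the numbers above, not code from the rule; the sparsity doubling is left out, as in the examples):

```python
# TPP check per the post's arithmetic: TPP = TOPS * bit length.
# TOPS figures are the commonly quoted dense FP16 numbers (illustrative).
LICENSE_THRESHOLD = 4_800

gpus = {
    "H100": (1_000, 16),  # ~1,000 dense FP16 TOPS
    "A100": (312, 16),    # 312 dense FP16 TOPS
}

for name, (tops, bits) in gpus.items():
    tpp = tops * bits
    status = "🚫 license required" if tpp > LICENSE_THRESHOLD else "✅ below threshold"
    print(f"{name}: {tpp:,} TPP {status}")
```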
January 13, 2025 at 6:23 PM
3/6 There are no restrictions for open-weight models. This will certainly help open-source models, as they are now much easier to handle.
January 13, 2025 at 6:23 PM
2/6 It regulates models trained with > 10^26 operations. Quick math, using FLOPs ≈ 6 × params × tokens:
- 70B model: 70 billion × 6 × 15 trillion = 6.3×10^24 ✅
- 405B model: 405 billion × 6 × 15 trillion ≈ 3.6×10^25 ✅
So the cutoff is around 1T weights trained on 15T tokens for one epoch.
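
For reference, the quick math above is the common 6·N·D rule of thumb (≈6 FLOPs per parameter per training token). A minimal sketch, with the 15T-token assumption from above:

```python
# Training-compute estimate via the 6*N*D rule of thumb:
# FLOPs ~= 6 * parameters * training tokens.
CUTOFF = 1e26
TOKENS = 15e12  # one epoch over 15T tokens

def train_flops(params: float, tokens: float = TOKENS) -> float:
    return 6 * params * tokens

for name, params in [("70B", 70e9), ("405B", 405e9)]:
    flops = train_flops(params)
    verdict = "regulated" if flops > CUTOFF else "under the cutoff"
    print(f"{name}: {flops:.1e} FLOPs -> {verdict}")

# Largest unregulated size at 15T tokens: ~1.1e12 parameters (~1.1T weights)
print(f"cutoff: {CUTOFF / (6 * TOKENS):.1e} parameters")
```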
January 13, 2025 at 6:23 PM
5/5 One more thing that surprised me was that all of the top quant trading firms from Wall Street had large recruiting booths at NeurIPS. Crazy times.
December 17, 2024 at 6:59 PM
4/5 The debate of auto-regressive vs. diffusion continues. I thought diffusion had won for images, but the NeurIPS best paper (below) and the new @GroqInc image model are auto-regressive. Diffusion LLMs are also a thing now. 🤷‍♂️

arxiv.org/abs/2404.02905
December 17, 2024 at 6:59 PM
3/5 Inference-time compute. With models topping out, this is the next frontier for improving AI performance. Good intro on the @huggingface blog:

huggingface.co/spaces/Hugg...

And there is a lot more we can do, e.g. prompt optimization (DSPy/TextGrad), workflows, and UI.
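
As a toy illustration of the simplest inference-time-compute recipe, here is best-of-N sampling; `generate` and `score` are hypothetical placeholders, not anything from the HF post:

```python
# Best-of-N sampling: spend more inference compute by drawing N candidates
# and keeping the highest-scoring one. In practice `generate` is your LLM
# and `score` is a verifier or reward model; both are stubbed out here.
import random

def generate(prompt: str) -> str:
    # placeholder: one sampled completion from the model
    return f"{prompt} -> candidate #{random.randint(0, 999)}"

def score(completion: str) -> float:
    # placeholder: a verifier / reward model would score the completion
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 17 * 24?"))
```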
December 17, 2024 at 6:59 PM
2/5 Our assumption since early 2023 has been that pre-training of LLMs would stall as we run out of data. Ilya's Test-of-Time acceptance speech may end the debate. LLM performance will converge, which likely helps OSS models. The emphasis shifts to higher layers, including...
December 17, 2024 at 6:59 PM