jim
@jimoof.bsky.social
We need to consider more deeply:
- what 2% time horizons of 90h mean for mathematics
- autonomous ml research
- improvements in harnesses that accelerate AI progress and general economic output
- how the market will react to job losses + increased productivity
But I'm tired for now.
January 19, 2026 at 5:48 AM
have a 2% time horizon of roughly 90h.
January 19, 2026 at 5:19 AM
Whereas for amplifying programmer productivity the 50% time-horizon is kind of the crucial statistic, for breakthroughs / cutting-edge research the 2% horizon might be just as relevant a figure. We can assume 2% time horizons are roughly 3x the length of 50% time horizons, so at this stage we'd...
January 19, 2026 at 5:18 AM
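For concreteness, the ~90h figure mentioned above follows from applying the rough 3x multiplier to the ~30 h June 50% horizon; a minimal sketch (both inputs are rough assumptions from the thread, not measurements):

```python
# Rough rule of thumb from the thread: 2% time-horizon ~ 3x the 50% time-horizon
multiplier = 3
fifty_pct_horizon_jun_2026 = 30  # hours, approximate projected Jun'26 50% horizon

two_pct_horizon = multiplier * fifty_pct_horizon_jun_2026
print(two_pct_horizon)  # → 90
```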
LLM problem solvers have the advantage of being familiar with ~all the relevant literature. It will be more like having an army of Terry Taos who will work with you for a few days then leave forever than like having an army of, say, grad students who stick around for years.
January 18, 2026 at 11:58 PM
So, LLMs are already near the level where they can make meaningful intellectual contributions to mathematics. At 30h (June) we can expect them to be solidly making intellectual contributions. The hardest problems, it seems, have lengths significantly greater than 30h, but...
January 18, 2026 at 11:57 PM
At the beginning of 2026, LLMs are at the brink of being able to solve interesting open mathematics problems (they have done impressive work on various open Erdős problems). This is with model time-horizons of ~5 hours. And time-horizons are more-or-less a measure of model intellectual capabilities.
January 18, 2026 at 11:54 PM
... problems which machine learning engineers tend to get stuck on?
January 18, 2026 at 11:41 PM
... but what would be really helpful is if it can operate at the cutting edge of human knowledge. Can it accelerate the development of algorithmic improvements in machine learning? Can it detect the kinds of mistakes that machine learning engineers tend to make? Can it solve the kinds of technical...
January 18, 2026 at 11:40 PM
But we need to zoom out further still. This is not yet the full picture. What matters for the sub-question at hand (time-horizons through 2026) is whether 'reason 2' (AI contributing to AI) will be a factor by H2 2026. AI showing some more agency in autonomously selecting coding tasks is nice...
January 18, 2026 at 11:33 PM
This may sound a little troubling, but it is of course a very good thing for the S&P 500. Massively reduced compensation? In exchange for massively increased output? And this is the worst it will ever be? (OK, it's unclear whether market participants will ever be able to get that last point).
January 18, 2026 at 11:27 PM
those who are junior or not particularly bright are unlikely to be trusted with the amount of compute an individual will, at that point, be able to wield (and they will be worse than SOTA LLMs at shorter-horizon tasks).
January 18, 2026 at 11:25 PM
Intuitively, I feel that this may be the point at which job-losses start to be seriously felt amongst programmers. Particularly experienced or intelligent programmers will be able to work at a level of productivity which is extremely high compared to pre-LLM levels, but...
January 18, 2026 at 11:23 PM
30 hours is something like a work-week. We can envision a Claude Code-like system which, instead of doing individual programming tasks reliably, moves up a layer of abstraction and determines for itself which tasks ought to be carried out in order to accomplish a broader goal.
January 18, 2026 at 11:19 PM
Obviously, we'd like the figures all the way to the end of the year. But we should first stop to think about the implications of having a model with a ~30 hr 50% time-horizon and consider whether it could affect the trend.
January 18, 2026 at 11:09 PM
Updated projection:
Jan'26: 10 hr 12 min
Feb'26: 12 hr 59 min
Mar'26: 16 hr 33 min
Apr'26: 21 hr 06 min
May'26: 26 hr 53 min
Jun'26: 34 hr 15 min
January 18, 2026 at 11:05 PM
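The updated table above is consistent with simple constant compounding at the 2.86-month doubling time from the 8-hour Dec'25 starting point (an assumption about the model used, not stated explicitly in the thread); a minimal sketch:

```python
# Recompute the updated projection: constant 2.86-month doubling time from 8 h in Dec'25.
# A fully hyperexponential version would also shrink the doubling time month by month;
# over six months the difference is small.
start_hours = 8.0
doubling_time = 2.86  # months, the beginning-of-2026 figure from the thread

for n, month in enumerate(["Jan'26", "Feb'26", "Mar'26", "Apr'26", "May'26", "Jun'26"], start=1):
    horizon = start_hours * 2 ** (n / doubling_time)  # hours after n months
    h, m = divmod(round(horizon * 60), 60)
    print(f"{month}: {h} hr {m:02d} min")
```

Up to minute-rounding at the boundaries, this reproduces the posted figures (Jan'26 ≈ 10.19 h, Jun'26 ≈ 34.25 h).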
If we assume that the doubling time as of mid-2025 was 3.4 months, and that the doubling time itself halves every 24 months through 2025, we should expect a doubling time of roughly 2.86 months as of the beginning of 2026.
January 18, 2026 at 10:57 PM
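The 2.86 figure is one line of arithmetic: six months at a 24-month halving time shrinks the doubling time by a factor of 2^(6/24). A quick check:

```python
# Doubling time as of mid-2025, and the assumed halving time of that doubling time
doubling_mid_2025 = 3.4    # months
halving_time = 24          # months
months_to_jan_2026 = 6     # mid-2025 -> beginning of 2026

doubling_jan_2026 = doubling_mid_2025 * 2 ** (-months_to_jan_2026 / halving_time)
print(round(doubling_jan_2026, 2))  # → 2.86
```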
So we had a halving time of 48 months? Let's say that fell to 24 months as of beginning of 2025? (vibes-based inference from stuff that's not in your context, related to reason 1). What should we expect the doubling-time to have fallen to by now, then?
January 18, 2026 at 10:49 PM
Already, we have seen doubling times fall from more than seven months to 3.4 months (over the course of perhaps four years). This is likely due to a combination of reasons 1 & 3 (which are not entirely distinct anyway), as well as noise. We should therefore expect this trend to accelerate.
January 18, 2026 at 10:36 PM
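Taking the "perhaps four years" at face value, the fall from just over seven months to 3.4 implies a halving time in the neighbourhood of 46-48 months; a quick check (the 7-month and 48-month inputs are rough figures from the thread):

```python
import math

old_doubling = 7.0   # months, ~four years ago (rough)
new_doubling = 3.4   # months, mid-2025
period = 48          # months elapsed (~four years)

halvings = math.log2(old_doubling / new_doubling)  # ~1.04 halvings
halving_time = period / halvings                   # months per halving
print(round(halving_time))  # → 46
```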
There are three reasons to expect hyperexponentiality in time-horizons:
1. Humans are relatively weak at long time-horizon tasks
2. AI will increasingly contribute to AI progress
3. Time-horizons tend to infinity
January 18, 2026 at 10:27 PM
Then, we extrapolate:
Jan'26: 9 hr 49 min
Feb'26: 12 hr 03 min
Mar'26: 14 hr 48 min
Apr'26: 18 hr 10 min
May'26: 22 hr 18 min
Jun'26: 27 hr 22 min

But, progress is not merely exponential. We can expect faster progress than this.
January 18, 2026 at 9:54 PM
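The table above is constant-doubling-time compounding from the 8-hour Dec'25 point; a sketch, using the unrounded doubling time implied by the 41-minute and 8-hour data points (quoted as 3.4 months in the thread):

```python
import math

# Extrapolate the 50% time-horizon under a constant doubling time.
# Data points from the thread: o1 at 41 min (Dec'24), ~8 h speculative (Dec'25).
doubling_time = 12 / math.log2((8 * 60) / 41)  # ≈ 3.38 months (quoted as 3.4)
start_hours = 8.0

for n, month in enumerate(["Jan'26", "Feb'26", "Mar'26", "Apr'26", "May'26", "Jun'26"], start=1):
    horizon = start_hours * 2 ** (n / doubling_time)  # hours after n months
    h, m = divmod(round(horizon * 60), 60)
    print(f"{month}: {h} hr {m:02d} min")
```

With the unrounded doubling time this matches the posted figures; compounding at the rounded 3.4 instead shifts some entries by a few minutes.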
Dec'24: o1, 41 minutes
Dec'25: speculative, 8 hours
We get a doubling time of 3.4 months for 2025.
January 18, 2026 at 9:42 PM
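The 3.4-month figure is straight log-linear interpolation between the two data points; a minimal check:

```python
import math

# Data points from the thread: o1 at 41 minutes (Dec'24), ~8 hours (Dec'25, speculative)
start_minutes = 41
end_minutes = 8 * 60
months_elapsed = 12

doublings = math.log2(end_minutes / start_minutes)  # ~3.55 doublings in a year
doubling_time = months_elapsed / doublings          # months per doubling
print(round(doubling_time, 1))  # → 3.4
```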
Let's first look at this new, 'true' time-horizon trend, then look into the evidence for an imminent, major OpenAI model release.
January 18, 2026 at 9:35 PM
So, I expect that taking the 50% time-horizon as of late November, 2025 as something closer to eight hours will be more indicative of future model capabilities. Now, let's see what this means for doubling-times and OpenAI's imminent (?) model release.
January 18, 2026 at 9:30 PM
And this is a lower-bound! This is what we should expect, assuming OpenAI phones it in and doesn't manage to make any improvements on the system they had in July. Sure, that system was likely very compute-intensive and experimental, but still.
January 18, 2026 at 9:28 PM