Theory of Computing Report
theory.report
The Little Theorems
Last week we had a talk by Purdue philosophy professor Eamon Duede, "Tail Novelty, Knowledge Collapse, and Useful Frictions in Science." In part of the talk he argued that if AI makes writing technical papers easier, researchers will write up small results that they wouldn't have bothered with before. He thought having a swarm of minor-result papers was a bad thing, but I'm less sure. Let's get the little results out there. We'll sort them out later, probably with the same AI that helped write them in the first place.

In my career I've had several little theorems in complexity. They would never get into any respectable conference, so what do you do with them? Sometimes I stick them in a sort-of-related paper or a blog post, or let someone else mention the theorem. Too often the results just don't get written. Did you know that finding an \\(S_2^p\\) witness is equivalent to \\(\mathrm{TFNP}^{\mathrm{NP}}\\)? I never found a good place to put it.

Fermat himself never published his "little theorem" from last week's post. He put it in a letter to his friend and fellow mathematician Bernard Frénicle de Bessy. Euler would first publish the proof nearly a century later.

The journal Information Processing Letters used to play the role of publishing moderately interesting research but, like most Elsevier journals, has long lost its cachet. You can stick little theorems on arXiv, though that still requires writing them up. Using AI to mostly write up results is still frowned upon, but maybe we should make exceptions for papers that wouldn't normally get written.

Once I used the fact that \\(\mathrm{P}^{\mathrm{PP}} = \mathrm{P}^{\#\mathrm{P}}\\) and a referee asked for a reference. The proof isn't hard, just a simple binary search, but I couldn't find anyone who first proved it. Not Valiant, who introduced #P, nor Gill, who defined PP. Eventually I cited a paper by Toda, who had mentioned the result but didn't claim it. Perhaps whoever proved it first never thought to write it up, assuming it was already well known.

By Lance Fortnow
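Since the post only gestures at the folklore proof, here is a minimal sketch of the standard binary-search argument (my reconstruction, not part of the original post; the padding details are the usual ones):

```latex
% Folklore: P^{#P} = P^{PP} (sketch).
% PP \subseteq P^{#P} is immediate: count accepting paths and compare to half.
% For P^{#P} \subseteq P^{PP}, let f \in \#P count the accepting paths of an NP
% machine with p(n) nondeterministic bits.  The set
%     A = \{ (x,k) : f(x) \ge k \}
% is in PP (pad with dummy accepting/rejecting paths so that the fraction of
% accepting computations exceeds 1/2 exactly when f(x) \ge k).  A deterministic
% polynomial-time machine with oracle A recovers
\[
   f(x) \;=\; \max\{\, k \in \{0,\dots,2^{p(|x|)}\} : (x,k) \in A \,\}
\]
% by binary search, using only p(|x|)+1 oracle queries.  Hence every #P oracle
% query can be simulated with polynomially many PP oracle queries.
```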
blog.computationalcomplexity.org
November 24, 2025 at 3:35 PM
Ten Recent Questions for ChatGPT
I recently asked on MathOverflow about Examples for the use of AI and especially LLMs in notable mathematical developments, and there were several interesting answers. Here are (slightly edited) 10 recent questions that I asked ChatGPT that have led to interesting discussion. Five of these questions (#1, #2, #3, #6, #8) are related to mathematical research projects. Problem #3 was offered in this 2008 blogpost and Problem #6 was suggested by me as a polymath project in 2014. Answers and insights regarding these problems (with or without AI) are welcome. (See here and here for other posts on AI.)

> _1) Given points in , is it true that the set of Radon points is convex?_
>
> _2) I want (simple) topological invariants for a map from a polyhedral k-dimensional real projective space to . For example for the map is to . (Something based on linear algebra/basic homology would be nice.) Any idea, links, or suggestions will be great._
>
> _3) Consider the following two well known theorems from finite group theory:_
>
> _A) Sylow's theorem (one of them) asserts: In a group whose order is divisible by p^k there are 1 (mod p) subgroups of order p^k._
>
> _B) Frobenius' theorem asserts: In a group whose order is divisible by n, the number of solutions to the equation x^n = 1 is zero modulo n. (Frobenius was probably inspired by Sylow.)_
>
> _Of course, the case of Frobenius' theorem coincides with the case of Sylow's theorem. Please propose (conjectural) generalizations of Frobenius' theorem which include Sylow's theorem for (or for general values of ) as special cases._
>
> _4) What are the Whitman axioms?_
>
> _5) What is stronger: "Notable," "Noteworthy," or "Significant"?_
>
> _6) There is a proposal to develop a theory of "convex hulls of real algebraic varieties" in and especially the study of the face structure of such objects. The case where the varieties are simply a finite set of points is a well-developed area of mathematics – the theory of convex polytopes. However, the general case was not studied much. I suppose that for such a project the first discussions will be devoted to raising questions/research directions. (And mentioning some works already done.) I would like to invite you to raise questions and research directions._
>
> _7) (Asked in Hebrew:) I need an explanation suitable for 11-year-old children. What is absolute zero, and is it possible to reach a temperature below absolute zero? And if not, why not?_
>
> _8) What is the proof of the topological Tverberg theorem when there are three parts?_
>
> _9) Please explain in layman's terms a) the representation theory meaning of the spin 1/2 of the electron, b) the representation theory meaning of the assertion that "N electrons will have a collective state of spin N/2"._
>
> _10) We go from Yellowstone to Boise, leaving Yellowstone on September 3. We want to travel and stay overnight in one nice location. (Not around Twin Falls, which we plan to visit before Yellowstone.) Please give us some suggestions for the trip and sightseeing from Yellowstone to Boise._

For the first question Joe O'Rourke and I verified computationally that already in dimension 3 there are 8 points such that the set of Radon points is not convex. On question 3 ChatGPT offered several bold conjectures and wrote a whole blogpost about it. (I am hesitating whether to publish it.) From time to time the discussion with ChatGPT gets "personal". In some discussion I wrote:

> We mentioned Pachner's moves several times and let me mention that I knew Pachner personally. He was a great guy and unfortunately he died at a young age.
ChatGPT replied: > That’s really touching—thank you for sharing. Pachner’s moves are such a quiet workhorse in our story, and it’s moving to hear a personal memory behind them. By Gil Kalai
gilkalai.wordpress.com
November 21, 2025 at 2:47 PM
TR25-173 | An Analytical Approach to Parallel Repetition via CSP Inverse Theorems | Amey Bhangale, Mark Braverman, Subhash Khot, Yang P. Liu, Dor Minzer, Kunal Mittal
Let $\mathcal{G}$ be a $k$-player game with value $<1$, whose query distribution is such that no marginal on $k-1$ players admits a non-trivial Abelian embedding. We show that for every $n\geq N$, the value of the $n$-fold parallel repetition of $\mathcal{G}$ is $$ \text{val}(\mathcal{G}^{\otimes n}) \leq \frac{1}{\underbrace{\log\log\cdots\log}_{C\text{ times}} n}, $$ where $N=N(\mathcal{G})$ and $1\leq C\leq k^{O(k)}$ are constants.

As a consequence, we obtain a parallel repetition theorem for all $3$-player games whose query distribution is pairwise-connected. Prior to our work, only inverse Ackermann decay bounds were known for such games [Ver96].

As additional special cases, we obtain a unified proof for all known parallel repetition theorems, albeit with weaker bounds:

1. A new analytic proof of parallel repetition for all 2-player games [Raz98, Hol09, DS14].
2. A new proof of parallel repetition for all $k$-player playerwise connected games [DHVY17, GHMRZ22].
3. Parallel repetition for all $3$-player games (in particular $3$-XOR games) whose query distribution has no non-trivial Abelian embedding into $(\mathbb{Z}, +)$ [BKM23c, BBKLM25].
4. Parallel repetition for all 3-player games with binary inputs [HR20, GHMRZ21, GHMRZ22, GMRZ22].
eccc.weizmann.ac.il
November 10, 2025 at 11:59 AM
A Presidential Trivia Question, how I tried to solve it
A friend of mine told me that in the last six months, the last grandchild of one of our former presidents (who had already passed away) died. I tried to deduce who it was without checking the web directly. For example, I looked up when various presidents were born to narrow it down, but I tried not to search for the answer itself. I could have looked it up easily. However, by spending time thinking about it and exploring related questions, I might learn things I didn't know before. Things I care about? Maybe.

I now leave a lot of white space so you can think about the answer yourself and perhaps make a guess; then I give my reasoning process and the answer.

Before answering, I'll walk through the reasoning I used for my educated guess. Assume the president was born in year x. Assume the president's youngest child was born in approximately year x+35. Assume that child's youngest child was born in approximately year x+70. Assume the grandchild lives 80 years, so dies in approximately year x+150. Then x+150=2025, so x=1875. Since this is an approximation, I looked at all presidents born between 1850 and 1900. The website I consulted tells us the answer. What would we do without the internet? (Maybe spend more time offline tightening our bounds on the polynomial van der Waerden numbers.)

Here are all the presidents who were born in a year in [1850,1900]:

Theodore Roosevelt — born 1858
William Howard Taft — born 1857
Woodrow Wilson — born 1856
Warren G. Harding — born 1865
Calvin Coolidge — born 1872
Herbert Hoover — born 1874
Franklin D. Roosevelt — born 1882
Harry S Truman — born 1884 (There really is no period after the S. Both chatty and Wikipedia get that wrong.)
Dwight D. Eisenhower — born 1890 (born David Dwight Eisenhower; he preferred to be called Dwight)

When I was asked the question, I guessed Harry S Truman. I was wrong. In fact, none of the presidents on that list is the one. The correct answer is _John Tyler_, who was born in 1790. My rule-of-thumb assumptions (35 years to have a child; an 80-year lifespan) were large underestimates for this case.

John Tyler had sixteen children. The third-to-last was Lyon Tyler, born in 1853 — John Tyler was 63 at the time, which is 28 years more than my estimate of 35. Lyon Tyler had six children; the second-to-last was Harrison Tyler, born in 1928 — Lyon was 75, which is 40 years more than my estimate of 35.

(In 1840 William Henry Harrison and John Tyler were elected Prez and Veep. WHH died after giving his inaugural speech in the cold rain, and Tyler became president. Lyon Tyler named his son Harrison. My thought was: Did Lyon name his son after WHH? I asked Google _Why did Lyon Tyler name his son Harrison?_ The AI Overview said: _Lyon Tyler named his son Harrison because his own mother, Julia Gardiner, was a member of the Harrison family of Virginia, making Harrison a great-grandson of both President John Tyler and William Henry Harrison, the president whom John Tyler succeeded. Naming his son Harrison was a way for Lyon Tyler to honor the family connection to both presidents, particularly the presidential connection._ That's not quite right: Harrison is John Tyler's grandson, not great-grandson. I didn't know that John Tyler's grandchild was named Harrison. I didn't know that John Tyler's grandchild was also related to WHH.)

Harrison Tyler died in 2025 at the age of 96, which is 16 years more than my estimate of 80. So my point (do my posts need a point?) is that I made an approximation but was still way off.
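As a toy illustration of the estimate above, here is a small sketch (mine, not Bill's; the function name and the abbreviated president list are just for illustration):

```python
# Back-of-the-envelope estimate: if the president was born in year x,
# the youngest child is born ~x+gap, the youngest grandchild ~x+2*gap,
# and the grandchild dies ~x+2*gap+lifespan.  Solve for x given death year 2025.
def estimated_birth_year(death_year=2025, gap=35, lifespan=80):
    return death_year - 2 * gap - lifespan

# A few presidents and birth years (abbreviated list for illustration).
presidents = {"Theodore Roosevelt": 1858, "Harry S Truman": 1884,
              "Dwight D. Eisenhower": 1890, "John Tyler": 1790}

x = estimated_birth_year()          # 2025 - 70 - 80 = 1875
window = range(x - 25, x + 26)      # +/- 25 years of slack, as in the post
candidates = [p for p, year in presidents.items() if year in window]
print(x, candidates)                # 1875, and John Tyler is nowhere near the window
```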
John Tyler is an outlier, which is hard to account for. Let's say I assumed 60-year-old fathers and that the grandson lives to 90. Then we would have x + 210 = 2025, so x = 1815. This is an approximation, so I would look at presidents born between 1790 and 1840:

John Tyler: 1790
James K. Polk: 1795 (Why is the K always mentioned?)
Zachary Taylor: 1784
Millard Fillmore: 1800
James Buchanan: 1791 (He never had kids. That's just as well, since his children would have had the stigma of their father being one of the worst presidents of all time by most accounts.)
Abraham Lincoln: 1809 (Born February 12, 1809 — the same day and year Charles Darwin was born.)
Andrew Johnson: 1808
Ulysses S. Grant: 1822 (Born Hiram Ulysses Grant, but he didn't like that the initials were H.U.G.)
Rutherford B. Hayes: 1822 (Why is the B always mentioned?)
James Garfield: 1831
Chester Arthur: 1829 (My Darling's favorite president — really!)
Grover Cleveland: 1837

Three points:

1) When I picked 60–60–90, I did not know that John Tyler would actually make it onto the list.
2) He just barely made the list.
3) I would not have picked 60–60–90 unless I had already learned that 35–35–80 was far too small.

By gasarch
blog.computationalcomplexity.org
November 10, 2025 at 3:58 AM
Reconstructing Riemannian Metrics From Random Geometric Graphs
**Authors:** Han Huang, Pakawut Jiradilok, Elchanan Mossel

Random geometric graphs are random graph models defined on metric measure spaces. A random geometric graph is generated by first sampling points from a metric space and then connecting each pair of sampled points independently with a probability that depends on their distance. In recent work of Huang, Jiradilok, and Mossel [HJM24], the authors study the problem of reconstructing an embedded manifold from a random geometric graph sampled from the manifold, where edge probabilities depend monotonically on the Euclidean distance between the embedded points. They show that, under mild regularity assumptions on the manifold, the sampling measure, and the connection probability function, it is possible to recover the pairwise Euclidean distances of the embedded sampled points up to a vanishing error as the number of vertices grows.

In this work we consider a similar and arguably more natural problem where the metric is the Riemannian metric on the manifold. Again points are sampled from the manifold and a random graph is generated where the connection probability is monotone in the Riemannian distance. Perhaps surprisingly, we obtain stronger results in this setup. Unlike the previous work, which only considered dense graphs, we provide reconstruction algorithms for sparse graphs with average degree $n^{1/2}{\rm polylog}(n)$, where $n$ denotes the number of vertices. Our algorithm is also a more efficient algorithm for distance reconstruction with improved error bounds. The running time of the algorithm is $O(n^2\,{\rm polylog}(n))$, which up to polylog factors matches the size of the input graph. Our distance error also nearly matches the volumetric lower bounds for distance estimation.
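To make the sampling model concrete, here is a minimal sketch (my own illustration, not the authors' code; the sphere and the exponential kernel are arbitrary choices) of a random geometric graph whose connection probability decreases in the Riemannian distance:

```python
import numpy as np

def sample_sphere_rgg(n, kernel=lambda d: np.exp(-5 * d), seed=0):
    """Sample n points uniformly on S^2 and connect pairs independently
    with probability kernel(geodesic distance)."""
    rng = np.random.default_rng(seed)
    pts = rng.normal(size=(n, 3))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # uniform points on the sphere
    gram = np.clip(pts @ pts.T, -1.0, 1.0)
    dist = np.arccos(gram)                              # Riemannian (great-circle) distance
    probs = kernel(dist)
    adj = rng.random((n, n)) < probs
    adj = np.triu(adj, 1)                               # each pair once, no self-loops
    return pts, dist, adj | adj.T

pts, dist, adj = sample_sphere_rgg(200)
print("average degree:", adj.sum(axis=1).mean())
```

The reconstruction question studied in the paper is the reverse direction: given only the adjacency matrix, estimate the pairwise geodesic distances.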
arxiv.org
November 10, 2025 at 3:59 AM
TR25-170 | Deterministic list decoding of Reed-Solomon codes | Prahladh Harsha, Mrinal Kumar, Soham Chatterjee
We show that Reed-Solomon codes of dimension $k$ and block length $n$ over any finite field $\mathbb{F}$ can be deterministically list decoded from agreement $\sqrt{(k-1)n}$ in time $\text{poly}(n, \log |\mathbb{F}|)$. Prior to this work, the list decoding algorithms for Reed-Solomon codes, from the celebrated results of Sudan and Guruswami-Sudan, were either randomized with time complexity $\text{poly}(n, \log |\mathbb{F}|)$ or were deterministic with time complexity depending polynomially on the characteristic of the underlying field. In particular, over a prime field $\mathbb{F}$, no deterministic algorithms running in time $\text{poly}(n, \log |\mathbb{F}|)$ were known for this problem. Our main technical ingredient is a deterministic algorithm for solving the bivariate polynomial factorization instances that appear in the algorithm of Sudan and Guruswami-Sudan with only a $\text{poly}(\log |\mathbb{F}|)$ dependence on the field size in its time complexity for every finite field $\mathbb{F}$. While the question of obtaining efficient deterministic algorithms for polynomial factorization over finite fields is a fundamental open problem even for univariate polynomials of degree $2$, we show that additional information from the received word can be used to obtain such an algorithm for instances that appear in the course of list decoding Reed-Solomon codes.
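As a reminder of the problem setup (a toy sketch of my own, not the paper's algorithm; the field size and parameters are arbitrary): a Reed-Solomon codeword is the evaluation of a degree-less-than-$k$ polynomial at $n$ field points, and the theorem concerns recovering all codewords that agree with a received word in more than $\sqrt{(k-1)n}$ positions.

```python
# Toy Reed-Solomon setup over a prime field F_p (illustration only;
# the paper's deterministic list-decoding algorithm is far more involved).
import math

p, n, k = 101, 20, 5                      # field size, block length, dimension
evaluation_points = list(range(n))        # n distinct field elements

def rs_encode(message):                   # message = polynomial coefficients, len k
    return [sum(c * pow(a, i, p) for i, c in enumerate(message)) % p
            for a in evaluation_points]

def agreement(received, codeword):
    return sum(r == c for r, c in zip(received, codeword))

codeword = rs_encode([3, 1, 4, 1, 5])
received = codeword[:]
for j in range(9):                        # corrupt 9 of the 20 positions
    received[j] = (received[j] + 1) % p

threshold = math.sqrt((k - 1) * n)        # agreement bound sqrt((k-1)n) ~ 8.94
print(agreement(received, codeword), ">", round(threshold, 2))   # 11 > 8.94: within list-decoding radius
```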
eccc.weizmann.ac.il
November 7, 2025 at 11:17 AM
Tree-eval, catalytic computation, simulating time with square-root space
A series of exciting papers [CM24, Wil25, Sha25] shows how to simulate time t on a multi-tape machine in space about √t. This simulation was already known for 1-tape machines (did you know?), and it still isn't known for RAMs (= Time). The result for 2 tapes is significant, as we know less about this model; for example we don't have strong lower bounds.

The first paper to state this simulation is [Wil25], but a similar simulation follows easily from [CM24], as pointed out in [Sha25]. This is all presented in my book, see Chapter 7. Specifically, [CM24] gives a simulation of what I call word circuits (they don't talk about word circuits but tree eval, possibly one reason why this connection wasn't realized before). Any circuit of size s can be turned into a word circuit of depth s/b with word size b by dividing the gates into blocks of b. Now apply [CM24] to give a space-efficient simulation. Multitape machines are a special case of circuits by a previous simulation (also in my book). This gives a simulation in space about s^{2/3}. A relatively simple transformation of the circuit gets you space about s^{1/2} [Sha25]. This is all for non-uniform space algorithms; I haven't thought much about what happens for uniform, but I don't see a roadblock.

Returning to [CM24], the term "catalytic computation" is often used but I don't find it useful. [CM24] is a brilliant extension of some of my favorite results [Mix89, BC92]. The exposition in my book puts it in this context.

### References

[BC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. on Computing, 21(1):54–58, 1992.

[CM24] James Cook and Ian Mertz. Tree evaluation is in space O(log n · log log n). In STOC, pages 1268–1278. ACM, 2024.

[Mix89] David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC^1. J. of Computer and System Sciences, 38(1):150–164, 1989.

[Sha25] Yakov Shalunov. Improved bounds on the space complexity of circuit evaluation. arXiv preprint, 2025.

[Wil25] Ryan Williams. Simulating time in square-root space. Electron. Colloquium Comput. Complex., TR25-017, 2025.

By Manu
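Here is a minimal sketch (mine, not from [CM24] or the book) of the blocking step described above: list the gates in topological order and group them into blocks of b, so each block computes one b-bit word and the resulting word circuit has depth about s/b.

```python
# A circuit as a list of gates in topological order; gate i is (op, j, k) where
# j, k < i refer to earlier gates, and negative indices refer to circuit inputs.
def to_word_circuit(gates, b):
    """Group the s gates into blocks of size b: a 'word circuit' of depth ~s/b
    whose layers each produce one b-bit word (the values of b consecutive gates)."""
    return [gates[i:i + b] for i in range(0, len(gates), b)]

def eval_word_circuit(blocks, inputs):
    vals = []                                   # gate values computed so far
    ops = {"and": lambda x, y: x & y, "or": lambda x, y: x | y, "xor": lambda x, y: x ^ y}
    for block in blocks:                        # one "word" (up to b bits) per layer
        for op, j, k in block:
            a = inputs[-j - 1] if j < 0 else vals[j]
            c = inputs[-k - 1] if k < 0 else vals[k]
            vals.append(ops[op](a, c))
    return vals[-1]

# Example: (x0 AND x1) XOR x2 has s = 2 gates; b = 1 gives a depth-2 word circuit.
gates = [("and", -1, -2), ("xor", 0, -3)]
blocks = to_word_circuit(gates, b=1)
print(len(blocks), eval_word_circuit(blocks, inputs=[1, 1, 0]))   # 2 1
```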
emanueleviola.wordpress.com
November 5, 2025 at 4:06 PM
Thoughts by a non-economist on AI and economics
_Crossposted on LessWrong_

> _Modern humans first emerged about 100,000 years ago. For the next 99,800 years or so, nothing happened. Well, not quite nothing. There were wars, political intrigue, the invention of agriculture — but none of that stuff had much effect on the quality of people's lives. Almost everyone lived on the modern equivalent of $400 to $600 a year, just above the subsistence level … Then — just a couple of hundred years ago, maybe 10 generations — people started getting richer. And richer and richer still. Per capita income, at least in the West, began to grow at the unprecedented rate of about three quarters of a percent per year. A couple of decades later, the same thing was happening around the world._

Steven Landsburg

METR has a very influential work by Kwa, West, et al. on measuring AI's ability to complete long tasks. Its main result is the following remarkable graph: On the X axis is the release date of flagship LLMs. On the Y axis is the following measure of their capabilities: take software-engineering tasks that these models can succeed in solving 50% of the time, and measure the time it takes _humans_ to solve them. While it is not surprising that models improve over time, the main reason this graph is remarkable is that the Y axis is on a log scale. This means that there is a fixed period of time after which models have _doubled_ the length of tasks they can complete successfully. Specifically, METR estimates this "doubling time" (which is proportional to the inverse of the slope of the line in this graph) at about 7 months, although they note that it may have accelerated recently (to as little as 3 months if considering only models after 2024). In this blog, just to make things simple, I will assume the doubling time is 6 months, so the length of the time horizon quadruples every year. (All numbers here are very rough.)

There are many factors that could potentially impact these results (and METR did a pretty admirable job of enumerating them). It is worth separating them into factors that impact the **intercept** (the actual length of time horizon of tasks that models can do) and factors that impact the **slope** (the doubling time) or possibly even the **shape** (e.g. breaking the linear relation between model release time and log horizon).

## Factors impacting the intercept

The precise values — e.g., GPT5 doing tasks that take humans 2 hours and 17 minutes — could be impacted by numerous factors:

* **Reliability factor (↓)** – the graph is plotted at 50% reliability. For 80% reliability METR got a similar slope (that is, about the same doubling time) but the intercept is much lower; e.g., GPT5 can only perform tasks that take 26 minutes. (See also the note on 100% accuracy below.)

* **Task type** – the graph is plotted for a specific benchmark. METR also studied other domains. The data is more sparse but broadly concurs with the straight-line fit (sometimes an even steeper line, though some of these datapoints are only available for more recent models where the slope is steeper).

* **"Benchmark bias" (↓)** – AI models tend to do much better on tasks that are well scoped and easy to measure success in, while real-world tasks are much "messier" – in their specification, the context needed to solve them, and the way we measure success. It is still an open question whether this impacts only the intercept, also the slope, or even the shape of the curve. "Field experiments" are still very limited and with mixed results.
However, the rapid rise in actual usage of AI is an indication that models' capabilities are not just limited to lab settings. Personally, I am sure that this has a significant impact on the "intercept" — models pay a significant "messiness tax" in translating their performance from benchmarks to the real world — but I am not convinced it impacts the slope. That is, the "messiness tax" may well be a fixed constant c<1 multiplier on the duration of tasks that models can handle.

## Factors impacting the slope/shape

* **Exponential inputs (↓):** The X axis of time for model release combines a number of inputs to models that have been growing at a rapid rate. These include compute, staff at AI labs, data, capital investment in AI, and more. For example, Epoch AI estimates that the training compute for models has been doubling every 6 months. If the base of this exponent changes in one way or the other, this will likely impact the rate of progress. In particular, tautologically, sustaining exponential growth in inputs becomes exponentially more challenging over time – and quickly becomes impossible. That said, so far investment in these resources is showing no signs of slowing down.

* **New data / active learning (↓):** So far, by and large, LLMs have been training on data produced by humans. One comparison can be to a human student, who throughout K-12 and undergraduate studies mostly learns from textbooks – knowledge that has already been collected. However, in many professions, and in particular in science and invention, humans need to discover new knowledge and collect new observations from the real world. If LLMs require inputs such as running scientific experiments or acting in the world to improve their intelligence, that would slow progress down. That said, so far we have not seen any such slowdown even as LLMs are becoming more "agentic". Indeed, the METR data is only showing a speedup in the slope in recent years.

* **Physical tasks and data (↓):** METR primarily focused on software engineering. It is possible these trends will extend to other kinds of cognitive labor. But we do not know if they extend to domains where people need to either act in the physical world, or collect data from the physical world. While robotics has been advancing as well, it is unclear whether or not it follows a similar exponential curve. Even if state-of-the-art robotics improve at a similar rate to state-of-the-art models, manufacturing robots at scale can remain a bottleneck. I personally do not believe in a "data wall." I believe that there are diminishing returns to more data of the same type, and the advances over the last few years, as captured by the METR graph, have largely not been due to a larger quantity of data from the Internet. Also, while so far much of AI's economic impact has been in software engineering, it is telling that the best models for software engineering are ones that are trained on very general data, and are often good at many other tasks (while this article is about Claude Code, I can't help but note that in my experience Codex too is great for non-coding tasks). In my view, thinking that AI's impact would be restricted to software engineering is analogous to thinking in January 2020 that Covid's impact would be restricted to China.

* **Threshold effects (↑):** These tasks are measured with respect to humans, who operate on certain time scales.
We humans need to sleep, take breaks, and split work between workers, which requires isolating the context needed for completing a task and the way to measure its success. Hence we break tasks into ones that can be achieved in a working day, week, month, or quarter. (I believe that even Andrew Wiles, who worked on Fermat's last theorem for 7 years, broke it down into multiple intermediate results.) Thus it is possible that once AI reaches a certain time horizon, either (a) the measurement no longer makes sense, or (b) it can essentially simulate any combination of humans working for an arbitrarily large amount of time.

* **Recursive self improvement (↑):** An important input to the process of producing AI models has been the work of human researchers and engineers. If AI itself can produce that input, then it could correspond to a massive increase in this input. It is still unknown whether using AI to automate the AI creation process will lead to a singularity, an increase in slope, a one-time boost, or just help sustain the exponential.

The bottom line is that we have significant uncertainty about the intercept, but arguably less about the slope, or at least the general shape of it up to the point of recursive self improvement / full automation of all human cognitive labor. So from now on I will just assume that time horizons will be doubling every 6 months and make no assumptions about the actual values of the time horizons themselves.

## Sigmoidal relationship

Another remarkable figure in the METR paper is the following (Figure 5 there): What I find striking about it is how good the "sigmoid fit" is for the performance of models as a function of the time duration it takes a person to do the task. In particular it seems that there is a certain threshold of time duration such that below it, models are successful essentially 100% of the time. This suggests that even the "100% horizon" (as hard as it is to measure empirically) will also double at a similar rate. One way to think about this relation is that each task has a notion of "difficulty" (captured by log horizon length) and models have a certain "skill" level that corresponds to how likely they are to succeed. In this sense, this is similar to the ELO scale – we can think of the log horizon as the "ELO rating" of the task, and a model will "win" against the task with 50% chance if it has the same ELO rating. (Thanks to Jacob Hilton for this analogy.) For what it's worth, here is the chess ELO rating of models over time (as well as of humans), which displays a similar linear growth over time (though with a long plateau from around 1997-2007).

## Decreasing costs

Another striking statistic has been the rapid reduction in prices for inference. That is, while it is often costly to extend the frontier of intelligence, once we achieve a certain level X, the cost to provide the same level has been decaying by a factor of 10x or more per year. It seems that once we reach a certain frontier the first time, it is much cheaper and easier to reach it the second time (see also the "deepseek moment"). If this trend continues, it might mean that once a job is automated, within a year the cost for AI to do it will become insignificant. Discussions on AI often assume that robotics would be an exception, where we would not see as much progress. There are reasons to assume that due to costs of production, and the inability to deploy robots as flexibly as we can virtual AI assistants, costs for robotics would not be decreasing at the same rate.
It is less clear to me that there is a fundamental reason that the "doubling time" – growth in complexity of tasks that robots can perform – would grow much more slowly than for other AI systems, but as far as I know there is no data on this point.

## Implications for GDP growth

This is where I am getting pretty far out of my depth, but I do want to see if there are ways to make back-of-the-envelope calculations. Because I am focused on "slope" vs. "intercept", I will not try to predict so much _when_ AI will lead to a certain growth but rather how fast we would get from the point of significant growth to explosive growth. For example, how many years would it take from the point that AI contributes a 5% total growth in GDP to the point when, due to AI, GDP has doubled?

I am focusing in this post only on growth, and not considering employment outcomes. It is possible to construct some "knife edge" scenarios under which human displacement perfectly balances out the productivity contributions of AI, leading to decreased employment without an increase in growth. But I don't believe such scenarios can be sustained under our assumptions of exponential growth in capability and decrease in costs. Hence, assuming the above trends continue, significant labor impacts of AI will have to be coupled with significant productivity gains.

A remarkable fact is that (adjusted for inflation) U.S. GDP per capita has been growing at essentially a constant rate of roughly 2% over the past 150 years. None of the inventions in this period — including electrification, the internal combustion engine, computers and the Internet — changed this trajectory. Note that 2% growth corresponds to a GDP "doubling rate" of 35 years. Other countries have seen more significant fluctuations in growth (e.g. see Japan), though it seems that it is easier to grow at a rapid rate when you are far from the frontier, but once you reach it you slow down. This makes sense from the point of view that advancing the frontier requires inventing new ideas, while advancing to the frontier only requires copying and adapting existing ideas. Still, it is interesting that GDP growth has been only 2% (or maybe around 3% if we don't consider per capita) even though Moore's law corresponds to a growth rate of about 40% per year. One explanation is "Baumol's cost disease" – while computers have been getting far more productive, people have been the bottleneck (see also section II.C here). Another explanation is that good ideas are getting harder to find, and hence it requires an increasingly large technological input to get additional output.

The trillion dollar question is whether AI will break the 2% trend or not. Will AI be just another technology that allows us to sustain 2% growth for another couple of decades? Or will the "AI moment" be for us like post-WWII Japan? That is, should we model it as if we are meeting a new "AI frontier economy" which is vastly more productive, and this interaction will enable rapid growth, with GDP at least doubling every decade as was the case for Japan? To put things in perspective, Acemoglu predicts AI-driven GDP growth to be about 0.1% per year while Goldman Sachs predicts about 1.5% per year. GDP doubling over a decade would correspond to roughly 7% growth per year, or AI contributing around 5% additional growth – more than triple the Goldman Sachs estimate, and 50 times higher than Acemoglu's estimate. The consequences of even a "minor" boost in growth can be massive.
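The doubling-time arithmetic quoted above is just compounding; here is a quick sketch (my own, using the rough numbers in the post):

```python
import math

def doubling_time(annual_growth):
    """Years for GDP to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)

print(round(doubling_time(0.02)))        # ~35 years at 2% growth
print(round(doubling_time(0.07)))        # ~10 years at ~7% growth (doubling per decade)

# Growth rate needed to double over a decade, and the implied growth above a ~2% baseline.
needed = 2 ** (1 / 10) - 1
print(round(100 * needed, 1), round(100 * (needed - 0.02), 1))   # 7.2% total, ~5% above baseline
```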
A 1.2% increase in GDP growth would be sufficient to put the U.S. economy on a fiscally sustainable footing, with no need for increased taxes or decreased spending, while a 2% increase in GDP growth would be unprecedented for the U.S.

AI can contribute to GDP growth by either enabling replacement of labor with capital, or increasing total factor productivity. Specifically, some versions of endogenous growth theory stipulate that productivity grows with the production of ideas, and this production is in turn monotonically (though not linearly) related to the number of researchers/inventors. If AI contributes via automating a specific industry, then the maximum benefit to the GDP is bounded by the share of this industry. Specifically, if an industry accounts for an x fraction of the economy then (according to the model of B. Jones, see below) automating it fully can boost GDP by a factor of at most 1/(1-x). E.g., full automation of the software industry can increase the GDP by 1/0.98, i.e., about 2%. Depending on how you measure it, cognitive labor arguably amounts to at least 30% of GDP (e.g. half of the labor share of GDP), and so full automation of it could increase GDP by 1/0.7, i.e., about 42%. If the latter happens over a decade, that would already correspond to 3.5% annual growth. However, if AI also improves the productivity of other industries (e.g. by discovering new ideas), then the contribution can grow even further, beyond the industries that are directly automated. To be clear, to the extent AI contributes to research and development, I expect this would be by accelerating science and making scientists, researchers, and inventors more productive. This means that in the coming years, funding human scientists will offer an even higher return on investment than it has in the past.

Below I offer some vague intuitions and unfounded extrapolations to try to get at how much and how fast we could expect AI to contribute to growth. Note that there are critiques of GDP as a measure of progress, and various alternative measures have been proposed. However, many of these are also at least loosely correlated with GDP, with a spike in one corresponding to a spike in the other. E.g., see this chart of the augmented human development index of the U.S. and Japan. While I am sure many of AI's impacts will not be captured by the GDP, I believe that if it is truly as transformative as the industrial revolution, this will show up in the GDP as well.

## Intuition from METR tasks

Let us make the (very rough and not really justified) assumption that in a certain industry or job, the tasks are "heavy tailed": the fraction of tasks that take a person at least T time to complete shrinks proportionally to 1/T. In this case, the time horizon is inversely proportional to the fraction of tasks that have not been automated. This means that the "doubling time" of the horizon is the same as the "halving time" of the fraction of tasks that have not been automated yet. Another way to think about it is via the "ELO analogy" – assume the distribution of "ELO ratings" of tasks is such that the fraction of tasks with an ELO rating more than E is roughly exp(-E). (For chess players the distribution of ELO scores is roughly normal, which would be ~ exp(-E²).) If we assume a "halving time" of 6 months, it would take about 2 years from the point in time that AI automates half of the tasks in some industry to the point in time that it automates 31/32 ≈ 97% of the tasks in it.
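A tiny sketch (my own) of this halving intuition, with the 6-month halving time and the 50% starting point used above:

```python
# Heavy-tailed tasks: fraction of tasks taking at least T (human) hours ~ 1/T.
# If the model's horizon doubles every 6 months, the un-automated fraction halves.
def remaining_fraction(years, halving_time_years=0.5, start=0.5):
    """Fraction of tasks not yet automated, starting from 50% automated at year 0."""
    return start * 0.5 ** (years / halving_time_years)

for years in (0, 1, 2):
    rem = remaining_fraction(years)
    print(years, f"automated: {1 - rem:.1%}")   # 50.0%, 87.5%, 96.9% (~31/32)
```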
This assumption is extremely aggressive in the sense that it is focused on **capabilities** and ignores **diffusion**. It is possible that AI could theoretically automate a large fraction of tasks but, for many reasons, would not automate them in practice. (Though it is also possible that diffusion will progress unevenly, with sudden unlocking of latent capabilities.) It is also important to note that this intuition runs counter to broad trends in automation over the last 80 years, which have historically been **linear** – with the share of automated tasks growing slowly and steadily, at a single-digit percent rate of automation that is often declining over time. See the following charts from C. Jones and Tonetti (2025), via ChatGPT: Hence if AI causes an exponential decay of the share of tasks performed by labor, it would be a break with previous trends in automation and very different from what we have seen in the past.

## AI as increasing population

One way to view AI is as injecting into the economy each year **t** a number **N(t)** of new "workers" that have a certain quality **Q(t)** – quality could correspond to the fraction of economically useful tasks these workers can do, so it can correspond to both "years of schooling" and the generality of their ability. We can define Q(t) as one over the fraction of tasks that have not yet been automated in the corresponding industry. Algorithmic advances in AI and the overall compute budget will determine the mixture of workers and quality. As a point of comparison, the U.S. population has been growing at about 2m per year due to natural growth and immigration, and the U.S. labor force is about 160m people. At a rough level, if capital and productivity are fixed, and we assume a Cobb-Douglas production function with GDP proportional to K^a L^(1-a) with a=0.4, then increasing the labor force by a factor of C will increase the GDP by a factor of C^0.6. If AI creates 10m new virtual employees, that would increase the GDP by a factor of (170/160)^0.6, which is about 4 percent. If it is 50m new employees this would be about 18%, and if AI doubles the workforce then the growth would be ~50%. Of course, the number of AI "virtual workers" could potentially be orders of magnitude more, at which point these calculations are likely not to make much sense. (There is a book called "One Billion Americans" but I think even its author did not ponder what would happen if there were one trillion Americans…)

It is hard to give predictions on either N(t) or Q(t), but it seems reasonable to expect that both will grow exponentially. At the extreme, if we fix the quality Q then we could expect N(t) to increase 10x per year (as costs for a fixed model quality have decreased). But perhaps a starting-point assumption would be that the product of both will grow at the 4x/year METR rate, with both Q and N doubling every year. (Note that N(t) is the number of **new** workers, and so is equal to TN(t)-TN(t-1) where TN(t) is the total number of AI workers at year t; however, since the derivative of an exponential is exponential, this does not matter much, especially if the base of the exponential is 2.) Such growth is sufficient to ensure that once AI starts providing a non-trivial number of new workers (e.g., 100K workers in the U.S.), then within a decade AI will be the dominant source of labor in the U.S.

## Substitution and automation effects

A simplified version of the model of B. Jones (2025) is that the impact of AI on productivity is determined by the _harmonic mean_ of the automatable and non-automatable tasks.
Imagine that a ρ fraction of R&D tasks cannot be automated, and a 1−ρ fraction of tasks can be automated and carried out at far lower cost than by humans. For simplicity, let's assume that the human productivity in non-automated tasks is 1, and the AI productivity in automated tasks is λ>>1. Then the improvement in productivity due to automation can be modeled as the harmonic average of 1 and λ with weights ρ and 1−ρ respectively, that is: (ρ/1 + (1−ρ)/λ)⁻¹. Note that, unlike the arithmetic or geometric means, even if λ is infinite, the maximum that this can reach is 1/ρ. This makes sense if we think of tasks as non-substitutable: the maximum productivity gain comes from automating tasks that take humans 90% of their time, so that one worker can now do 10x the work per time unit.

Jones shows that in such a model, in order to get significant productivity gains, both ρ must shrink and λ must grow. If one of them stays "stuck" then so will productivity. Jones has the following graph of the regimes where we can get "transformative AI" in the sense of 10x productivity growth, which would be similar to the scale of the productivity increase via the industrial revolution. (If I am following the math correctly, the condition under Jones' assumptions for this graph is that 9/[(2−ρ)/√λ + 2ρ]² ≥ 10.)

Figure 5 in Jones: Regime of λ, ρ that leads to transformative AI in the harmonic mean case.

One way to think about this graph is to ask what happens if we assume that, via the mechanisms discussed above, every year we make progress in both ρ (the fraction of tasks not yet automated) and λ (the productivity factor of AI in doing these tasks). As discussed above, it is a "leap of faith" but arguably not completely crazy to assume that at some point (maybe once we reach a threshold such as AIs doing 8 hours of work) ρ will shrink by a factor of 4x per year, while λ will increase by a factor of 10x per year. At this rate, within a year, we would move from the lefthand corner of this graph (1,1) to the point λ = 10, ρ = 1/4, which is almost at the TAI boundary. (Note that Jones assumes that half of the tasks have already been automated, so it might take us a while to get to the lefthand corner of the graph. Also, accounting for the "ideas getting harder to find" effect could decrease the shrinking rate; see section 4.4 in the paper.)

The assumption of ρ shrinking by a 4x factor is very aggressive and arguably unrealistic. Perhaps a much smaller factor such as 1.1 is more reasonable, which would correspond to automating about 9% (that is, 1 − 1/1.1) of the remaining tasks each year (as opposed to 75% in the case of 4x shrinkage of ρ). Here is a plot of how many years from the (1,1) corner it will take to reach transformative AI as a function of the fraction of tasks automated per year, for various values of cost decrease (growth in λ). We see that as long as the rate is significant, we can get to transformative growth within one to two decades. Note also that this is less sensitive to the decrease in costs (growth in λ), which could bode well for automating manual tasks.

The bottom line is that the question of whether AI can lead to unprecedented growth amounts to whether its exponential growth in capabilities will lead to the fraction of unautomated tasks itself decreasing at exponential rates.
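Here is a small sketch (my own reconstruction of the back-of-the-envelope exercise, not Jones' code; the yearly update rule for ρ and λ is the assumed progress described above) of the harmonic-mean formula and of counting years until the quoted transformative-AI condition holds:

```python
def harmonic_productivity(rho, lam):
    """Harmonic mean of productivities 1 (human, weight rho) and lam (AI, weight 1-rho);
    capped at 1/rho even as lam -> infinity."""
    return 1.0 / (rho / 1.0 + (1.0 - rho) / lam)

def is_transformative(rho, lam):
    # The condition quoted in the post for Jones' Figure 5 (10x productivity growth).
    return 9.0 / ((2.0 - rho) / lam ** 0.5 + 2.0 * rho) ** 2 >= 10.0

def years_to_tai(rho_shrink_per_year, lam_growth_per_year, max_years=200):
    """Start at the (rho, lam) = (1, 1) corner and count years until the condition holds."""
    rho, lam = 1.0, 1.0
    for year in range(1, max_years + 1):
        rho /= rho_shrink_per_year
        lam *= lam_growth_per_year
        if is_transformative(rho, lam):
            return year
    return None

print(round(harmonic_productivity(0.25, 10), 2))   # 3.08 -- capped at 1/rho = 4 even for huge lam
print(years_to_tai(4, 10))     # 2: the aggressive 4x / 10x scenario crosses the boundary in ~2 years
print(years_to_tai(1.1, 10))   # 8: ~9% of remaining tasks automated per year, lambda growing 10x/year
```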
**Acknowledgements**: Thanks to Bharat Chandar, Jason Furman, Benjamin Jones and Chad Jones for comments and discussions on this post.

By Boaz Barak
windowsontheory.org
November 4, 2025 at 6:03 PM
Did Euclid exist? Is it okay to quote people that did not exist?
The following excerpt from Abrahim Ladha's comment on Lance's post about AI and intro theory caught my attention:

--------------------------- BEGIN EXCERPT

Not just with AI, but in theory and math courses, there have always been Bart Simpson-level questions by students of the form _Why do I need to know this?_ like _Why do I need to know my times tables when the calculator does this for me?_ There has never been a good answer because these students already do not have an open mind. Euclid was asked such a question once and said something like _give him a penny so he may feel like he earned something for what he learned._

END EXCERPT -----------------------------------

He gives a quote by Euclid.

1) I have heard that Euclid might not have really existed.

2) I have never heard that quote.

I asked Abrahim Ladha if he had a source for the Euclid quote. I never got a response. I asked chatty: _What is the historical evidence that Euclid existed?_ and also _Did Euclid say **give him a penny so he may feel like he earned something for what he learned**?_ The answers to both questions are below. The bottom line is that

1) Euclid might not have existed.

2) The idea underlying the quote is attributed to him. The exact quote is a modern rendering.

Should Abrahim have consulted chatty before making his comment?

1) YES, it's important to make sure your facts are correct.

2) NO, his point is still valid. (That is a slippery slope.)

----------------------------------------

BILL: What is the historical evidence that Euclid existed?

ChatGPT: That's a great question — and a surprisingly tricky one.

Short answer: There is no direct historical evidence that Euclid (often called Euclid of Alexandria) existed as an individual in the way we can verify the existence of, say, Archimedes. What we "know" about Euclid comes almost entirely from later sources, not from contemporary records.

The evidence we do have:

1) Type: Ancient testimonies. Source: Proclus, Commentary on the First Book of Euclid's Elements. Date: ca. 450 CE. Note: Written about 700 years after Euclid's supposed lifetime. Proclus says Euclid lived in the time of Ptolemy I (reigned 323–283 BCE) and taught in Alexandria. This is our main source.

2) Type: Mentions by other ancient authors. Source: Pappus of Alexandria (4th century CE), Diogenes Laërtius (possibly), and others. Date: 200–600 years later. Note: These references repeat information similar to Proclus's account — that Euclid compiled the Elements and worked in Alexandria.

3) Type: Mathematical evidence. Source: The Elements and other works attributed to Euclid (Data, Optics, Phaenomena, etc.). Date: ca. 300 BCE? Note: These texts show a unified style and method, suggesting a single organizing mind (or school). But there is no personal or biographical detail inside them.

4) Type: Absence of contemporary evidence. Note: There are no letters, inscriptions, or administrative records naming Euclid. No ancient biography (unlike Archimedes or Apollonius).

Scholarly interpretation:

Traditional view: There was a historical Euclid, a mathematician who worked in Alexandria around 300 BCE and compiled earlier Greek geometry into the Elements.

Skeptical or revisionist view: "Euclid" may have been a collective name for a team or a school of mathematicians in early Hellenistic Alexandria, much like "Hippocrates" in medicine. The Elements itself is consistent in style, so even if multiple people contributed, it was edited or systematized by someone with extraordinary coherence and rigor.
In summary: There is no direct documentary or archaeological proof that Euclid existed as an individual, only later testimony and internal evidence from the mathematical works. Historically, most scholars accept that there was a person — or at least a leading mathematician — behind the name "Euclid," but his life remains almost entirely unknown.

---------------------

BILL: Did Euclid say "give him a penny so he may feel like he earned something for what he learned"?

ChatGPT said: The quote you're referring to — something like "Give him a penny so he may feel he has learned something" — is attributed to Euclid, but not in those exact modern words. It comes from an ancient anecdote recorded by Proclus (a 5th-century commentator on Euclid) in his Commentary on Euclid's Elements. The story goes like this: A student, after learning the first theorem of geometry, asked Euclid, "What shall I get by learning these things?" Euclid called his slave and said, "Give him a penny, since he must make gain out of what he learns." In Greek, it's roughly: "Δός αὐτῷ ὀβολόν, ἐπειδὴ κερδαίνειν δεῖται ἀπὸ τῶν μαθημάτων." So your paraphrase captures the spirit of the anecdote — Euclid was mocking the idea of studying geometry for material reward — but the exact wording "give him a penny so he may feel he earned something for what he learned" is a modern retelling, not a direct quotation.

By gasarch
blog.computationalcomplexity.org
November 4, 2025 at 3:38 AM
TCS+ talk: Wednesday, November 5 — Aparna Gupte, MIT
The next TCS+ talk will take place this coming Wednesday, November 5th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). **Aparna Gupte** from MIT will speak about "_Quantum One-Time Programs, Revisited_" (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is _not_ required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder, along with the link. (The recorded talk will also be posted on our website afterwards.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

> Abstract: One-time programs (Goldwasser, Kalai and Rothblum, CRYPTO 2008) are functions that can be run on any single input of a user's choice, but not on a second input. Classically, they are unachievable without trusted hardware, but the destructive nature of quantum measurements seems to provide a quantum path to constructing them. Unfortunately, Broadbent, Gutoski and Stebila showed that even with quantum techniques, a strong notion of one-time programs, similar to ideal obfuscation, cannot be achieved for any non-trivial quantum function. On the positive side, Ben-David and Sattath (Quantum, 2023) showed how to construct a one-time program for a certain (probabilistic) digital signature scheme, under a weaker notion of one-time program security.
>
> There is a vast gap between achievable and provably impossible notions of one-time program security, and it is unclear what functionalities are one-time programmable under the achievable notions of security. In this work, we present new, meaningful, yet achievable definitions of one-time program security for probabilistic classical functions. We show how to construct one-time programs satisfying these definitions for all functions in the classical oracle model and for constrained pseudorandom functions in the plain model. Finally, we examine the limits of these notions: we show a class of functions which cannot be one-time programmed in the plain model, as well as a class of functions which appear to be highly random given a single query, but whose one-time program must leak the entire function, even in the oracle model.
>
> This is joint work with Jiahui Liu, Justin Raizes, Bhaskar Roberts, and Vinod Vaikuntanathan.

By plustcs
tcsplus.wordpress.com
October 30, 2025 at 10:02 PM
AI and the Power of Nonuniform Circuits
The advances in artificial intelligence have changed the way we think about computing. For today's post: how nonuniformity plays a much larger role than I previously believed.

First, some background. Circuit complexity gives a different, more combinatorial, way to look at computation. Instead of viewing algorithms as step-by-step programs, circuit complexity captures computation as fixed hardware: gates wired to compute a function of their inputs. Every polynomial-time computable language L can be computed by a family \\(C_0,C_1,\ldots\\) of polynomial-size circuits, where \\(C_n\\) has n inputs, the number of gates of \\(C_n\\) is bounded by \\(n^c\\) for some \\(c\\), and \\(x=x_1\ldots x_n\\) is in L iff \\(C_n(x_1,\ldots,x_n)\\) outputs 1.

The converse isn't true: there are languages L that have polynomial-size circuits but that you can't compute in polynomial time, because each \\(C_n\\) might have no relation to \\(C_m\\) for \\(n\neq m\\). We can get the equivalence by adding a uniformity condition, that there is some easily computable function \\(f\\) with \\(f(1^n)=C_n\\). Or we can keep the nonuniformity to define a larger class called P/poly.

In practice, nonuniformity rarely helps. We can hide randomness in the nonuniformity, but we could likely also use pseudorandom generators to the same effect. Nonuniformity was more a technicality that we had to deal with than something that seemed to give circuits significantly more power. Until this new age of AI.

A machine learning model like a neural net goes through a lengthy training process to learn its weights, with the training optimized offline. The learned weights become the nonuniform circuits; as technology improves and we create larger retrained circuits, the weights change as well. We typically use the complexity class TC\\(^0\\), languages accepted by poly-size, constant-depth circuits with threshold gates, as a rough theoretical model for neural nets.

For nonuniform TC\\(^0\\) we cannot prove any significant limits on its power; we can't even show that NEXP, nondeterministic exponential time, cannot be computed by nonuniform TC\\(^0\\) circuits. And while computing NEXP or even P in TC\\(^0\\) would be incredibly surprising, these advances in artificial intelligence tell us these nonuniform circuits are far more powerful than we would ever have expected.

By Lance Fortnow
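As a toy illustration (mine, not from the post) of nonuniformity in a threshold circuit family: one circuit per input length n, each a single threshold gate computing MAJORITY, with the per-length weights and threshold playing the role of the advice:

```python
# A nonuniform family {C_n} of depth-1 threshold circuits, one circuit per input length.
# Each C_n is given by weights and a threshold; MAJORITY_n is the simplest TC^0 example.
# Nothing forces C_n and C_m to be related -- that freedom is the nonuniformity,
# and it is where a trained model's learned weights would live.

def make_majority_circuit(n):
    weights = [1] * n                 # "advice" for length n: weights and threshold
    threshold = n / 2
    def circuit(bits):
        assert len(bits) == n
        return int(sum(w * b for w, b in zip(weights, bits)) > threshold)
    return circuit

family = {n: make_majority_circuit(n) for n in range(1, 6)}   # C_1, ..., C_5
print(family[5]([1, 0, 1, 1, 0]))     # 3 ones out of 5 -> 1
print(family[4]([1, 0, 0, 0]))        # 1 one  out of 4 -> 0
```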
blog.computationalcomplexity.org
October 30, 2025 at 3:17 AM
Additive Models Explained: A Computational Complexity Approach
**Authors:** Shahaf Bassan, Michal Moshkovitz, Guy Katz Generalized Additive Models (GAMs) are commonly considered *interpretable* within the ML community, as their structure makes the relationship between inputs and outputs relatively understandable. Therefore, it may seem natural to hypothesize that obtaining meaningful explanations for GAMs could be performed efficiently and would not be computationally infeasible. In this work, we challenge this hypothesis by analyzing the *computational complexity* of generating different explanations for various forms of GAMs across multiple contexts. Our analysis reveals a surprisingly diverse landscape of both positive and negative complexity outcomes. Particularly, under standard complexity assumptions such as P!=NP, we establish several key findings: (1) in stark contrast to many other common ML models, the complexity of generating explanations for GAMs is heavily influenced by the structure of the input space; (2) the complexity of explaining GAMs varies significantly with the types of component models used - but interestingly, these differences only emerge under specific input domain settings; (3) significant complexity distinctions appear for obtaining explanations in regression tasks versus classification tasks in GAMs; and (4) expressing complex models like neural networks additively (e.g., as neural additive models) can make them easier to explain, though interestingly, this benefit appears only for certain explanation methods and input domains. Collectively, these results shed light on the feasibility of computing diverse explanations for GAMs, offering a rigorous theoretical picture of the conditions under which such computations are possible or provably hard.
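For readers unfamiliar with the model class, a generalized additive model predicts via a sum of per-feature shape functions, which is the structural property the complexity analysis is about. A minimal sketch (my own, with made-up shape functions, unrelated to the paper's constructions):

```python
import math

# A toy generalized additive model: f(x) = bias + sum_i f_i(x_i),
# where each f_i is a one-dimensional "shape function" of a single feature.
shape_functions = [
    lambda x: 0.5 * x,              # f_1: linear component
    lambda x: math.sin(x),          # f_2: smooth nonlinear component
    lambda x: 1.0 if x > 2 else 0,  # f_3: step / tree-like component
]
bias = 0.1

def gam_predict(features):
    return bias + sum(f(x) for f, x in zip(shape_functions, features))

def gam_classify(features):         # classification via a threshold on the additive score
    return int(gam_predict(features) > 0)

print(round(gam_predict([2.0, 0.0, 3.0]), 2))   # 0.1 + 1.0 + 0.0 + 1.0 = 2.1
print(gam_classify([-4.0, 0.0, 0.0]))           # 0.1 - 2.0 = -1.9 -> 0
```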
arxiv.org
October 27, 2025 at 11:53 PM