The question of what it "really is" will be very difficult to answer, or even to ask, precisely. The question of how to model it most effectively to get results is easier.
I have a colleague who is a native French speaker but can only think about science in English. Not because there are not enough French words, but because the thoughts don't come in French.
This is not about what they *really are*; it is about what gets good results for this human user.
The key question is what mental model we humans should have of LLMs to allow us to get the best out of them. ("LLM psychology")
Maybe this is saying the same as you but in anthropomorphic language.
Solution: we asked it to write in numpy, then had yet another LLM instance translate numpy to JAX!
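A sketch of what that two-stage pipeline produces — the functional form here is invented for illustration, not taken from the actual runs. Because `jax.numpy` mirrors most of the numpy API, the second LLM's translation is often close to mechanical:

```python
import numpy as np
import jax.numpy as jnp

# Hypothetical candidate function, as the first LLM might write it in numpy.
def candidate_np(x, a, b, c):
    return a * np.exp(-b * x) * np.cos(c * x)

# What the second LLM's numpy -> JAX translation would produce: same
# expression, with np swapped for jnp.
def candidate_jax(x, a, b, c):
    return a * jnp.exp(-b * x) * jnp.cos(c * x)
```

Once in JAX, the candidate is differentiable and jit-compilable, which is presumably the point of wanting JAX in the first place.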
1. We tried asking the LLM to do both of these things in one prompt. It didn't work as well. Perhaps when the LLM "thought" it had to also write a fitting function, it became too "conservative" to creatively come up with weird functions.
But the LLM never had to write code to do the final parameter search.
We externally fit each cell's parameters with gradient descent; this fitting code wasn't written by the LLM.
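A minimal sketch of what such external fitting code could look like in JAX, with a hypothetical one-parameter-pair model standing in for an LLM-proposed candidate — the model, loss, and hyperparameters are placeholders, not the actual fitting code:

```python
import jax
import jax.numpy as jnp

# Hypothetical candidate function standing in for an LLM-proposed one.
def model(params, x):
    a, b = params
    return a * jnp.exp(-b * x)

# Mean squared error between the candidate's prediction and one cell's data.
def loss(params, x, y):
    return jnp.mean((model(params, x) - y) ** 2)

# Plain gradient descent on the parameters; learning rate and step count
# are illustrative, not tuned values from the original work.
def fit(x, y, init, lr=0.1, steps=2000):
    grad_fn = jax.jit(jax.grad(loss))
    params = jnp.asarray(init, dtype=jnp.float32)
    for _ in range(steps):
        params = params - lr * grad_fn(params, x, y)
    return params
```

The same `fit` routine can then be reused across every LLM-generated candidate without the LLM ever seeing it.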
I think what it did here was assemble several things that were used before, in a new combination, for a new purpose.
But that's what most scientific progress is anyway!
en.wikipedia.org/wiki/Stretch...
1. Combinatorial SR seems impractical because evaluating each candidate function requires a non-convex gradient descent parameter search. We had the LLMs write functions that estimate starting points for that gradient search, which ablation tests showed was essential. Combinatorial SR couldn't have done this.
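To illustrate the idea — both functions below are hypothetical; the real LLM-written estimators would be tailored to each candidate form — here is a candidate paired with a heuristic that guesses starting points from the data, reading the amplitude off the peak and the decay rate off a log-linear fit:

```python
import numpy as np

# Hypothetical candidate form the gradient search must fit.
def candidate(x, a, b):
    return a * np.exp(-b * x)

# Hypothetical startpoint estimator: cheap heuristics that take a0 from the
# peak amplitude and b0 from the slope of log(y), giving gradient descent a
# sensible place to begin in a non-convex landscape.
def guess_start(x, y):
    a0 = float(np.max(np.abs(y)))
    pos = y > 1e-9          # only take logs of positive samples
    b0 = 1.0                # fallback if too few usable points
    if pos.sum() >= 2:
        slope = float(np.polyfit(x[pos], np.log(y[pos]), 1)[0])
        b0 = max(1e-3, -slope)
    return a0, b0
```

The estimator is cheap relative to the gradient search it seeds, which is what makes evaluating many candidate functions tractable at all.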