
I somewhat disagree with that. I don't think the model infers physical laws. Those are all expressed in words online, and sophisticated word filling-in using statistics is good enough to look like physical reasoning. (Not downplaying statistics in any way...I think it is cool.) I don't think physical laws are inferable from words, but ways of filling in words describing physical laws can be learned from text describing physical laws.

What the model has is interesting generalization patterns for sequences, and when applied over language it looks smart and looks like it is inferring things. It learns small automata that can mix slots and content, because the architecture's inductive biases push it that way. I will write more about this soon...hopefully as a paper.


Interested to hear your reasoning! In particular, whether you feel the model architecture isn't theoretically able to store these sorts of models, or whether they just can't be learned via gradient descent, etc.
