It’s still inscrutable, but it makes more sense if you think of all of these as arbitrary function approximation on higher-dimensional manifolds. We can’t build traditional numerical solvers for these problems because the underlying analytical models fall apart when you over-parameterize them. Backprop is very robust at extreme parameter counts and comes with much weaker assumptions than things like series decomposition, so it really just looks like a generic numerical method that can scale to absurd levels.
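To make the "generic numerical method" framing a bit more concrete, here is a toy sketch of my own, not anything from a real library: a one-hidden-layer tanh network fit to sin(x) with hand-written backprop and plain gradient descent. The function name `train`, the width, the learning rate, and the step count are all arbitrary illustrative choices. The only point is that the model has far more parameters (96) than data points (16) and the same generic update rule still applies, with no basis functions chosen in advance.

```python
import math
import random


def train(hidden=32, steps=1000, lr=0.005, seed=0):
    """Fit y = sin(x) with a one-hidden-layer tanh net using manual backprop.

    Returns the mean squared error recorded before each update, so
    losses[0] is the untrained loss and losses[-1] is the loss after
    `steps` gradient-descent updates.
    """
    rng = random.Random(seed)
    xs = [-math.pi + 2 * math.pi * i / 15 for i in range(16)]
    ys = [math.sin(x) for x in xs]
    n = len(xs)

    # 3 * hidden = 96 parameters for 16 points: deliberately over-parameterized.
    w = [rng.gauss(0, 1) for _ in range(hidden)]
    b = [rng.gauss(0, 1) for _ in range(hidden)]
    a = [rng.gauss(0, 1) / math.sqrt(hidden) for _ in range(hidden)]

    losses = []
    for _ in range(steps + 1):
        gw = [0.0] * hidden
        gb = [0.0] * hidden
        ga = [0.0] * hidden
        loss = 0.0
        for x, y in zip(xs, ys):
            # Forward pass: pred = sum_j a_j * tanh(w_j * x + b_j)
            h = [math.tanh(w[j] * x + b[j]) for j in range(hidden)]
            pred = sum(a[j] * h[j] for j in range(hidden))
            err = pred - y
            loss += err * err / n

            # Backward pass: chain rule applied from the loss to each parameter.
            d_pred = 2 * err / n
            for j in range(hidden):
                ga[j] += d_pred * h[j]
                d_pre = d_pred * a[j] * (1 - h[j] * h[j])  # tanh' = 1 - tanh^2
                gw[j] += d_pre * x
                gb[j] += d_pre
        losses.append(loss)

        # Plain full-batch gradient descent step.
        for j in range(hidden):
            w[j] -= lr * gw[j]
            b[j] -= lr * gb[j]
            a[j] -= lr * ga[j]
    return losses


if __name__ == "__main__":
    losses = train()
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Nothing in there knows the target is a sine. Swap `ys` for another smooth function and the code stays the same, which is roughly what I mean by weaker assumptions than a series decomposition.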