There is a lot of intellectual dishonesty around transformers, because we don't actually know how they work. We have the model and the weights, which together are the solution to a problem expressed as a parameterized equation (not literally a polynomial, since the layers are nonlinear). Much as you solve the Pythagorean theorem for an unknown side, training solves the equation for its unknowns, the weights. The catch is that we are handed the equation and the numbers, but not an explanation of what those numbers mean. We still don't know how transformers work.
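To make the "known equation, unknown weights" point concrete, here is a minimal sketch in Python. It is a toy, not a transformer: the equation `y = w * x + b`, the function name `fit_line`, the data, and the learning-rate and step settings are all assumptions chosen for illustration. The form of the equation is fixed up front, and gradient descent only has to find the two numbers that make it fit.

```python
# Hypothetical toy example: the "equation" y = w * x + b is known in advance;
# only the weights w and b are unknown and must be found from data.

def fit_line(xs, ys, lr=0.01, steps=5000):
    """Find w and b by plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x - 1.0 for x in xs]  # data generated with w = 3, b = -1

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

With two weights you can read the answer off directly. A transformer is trained in the same spirit but with a vastly larger number of weights, and that is roughly where having the numbers stops being the same as understanding them.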