The original paper itself, for those who are interested.
Overall, this is really interesting research and a good “first step.” I will be interested to see whether it can be replicated on other models. One thing that stood out, though, is that certain details are withheld because Sonnet is proprietary. Hopefully follow-on work will be done on one of the open-source models to confirm the method.
One of the notable limitations is quantifying how an activation correlates with the meaning of the text, which will make any sort of control difficult. Sure, you can just massively increase or decrease a weight, and for some things that will be fine, but for real manual fine-tuning, that will be a problem.
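To make the "just crank it up or down" idea concrete, here is a minimal sketch. It assumes the thing being adjusted is the strength of one direction in a hidden-state vector, not a literal weight, and that is my assumption, not something taken from the paper. The `steer` function, the vector size, and the random data are all hypothetical stand-ins.

```python
import numpy as np

def steer(activation, direction, target):
    """Shift `activation` so its projection onto `direction` equals `target`."""
    unit = direction / np.linalg.norm(direction)
    current = float(activation @ unit)
    # Move only along the chosen direction; everything orthogonal is untouched.
    return activation + (target - current) * unit

rng = np.random.default_rng(0)
activation = rng.normal(size=8)  # stand-in for one hidden-state vector (assumed)
direction = rng.normal(size=8)   # stand-in for one learned feature direction (assumed)
unit = direction / np.linalg.norm(direction)

boosted = steer(activation, direction, target=10.0)
print(float(activation @ unit), float(boosted @ unit))
```

Setting the strength is the easy part. Knowing what value of `target` corresponds to a given change in the meaning of the output is the part that is hard to quantify.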
I suspect this method is generalizable (maybe with some tweaks?), and I’d be interested to see how this type of analysis could be done on other neural networks.
Except this is a state case. Unless he also has New York’s Court of Appeals (the state’s highest court) under his control, he’s going to have a rough road ahead, I think.
The truly unfortunate part is that this may be the only conviction we get. The documents case is being destroyed by the judge, and SCOTUS is likely going to tank the January 6th case.