Mech Interp
- Can you unembed from any point in the model?
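- I think not, because the activation width is not constant across the model (e.g. MLP hidden activations are typically wider than the residual stream)
- But would it work at any point in the residual stream?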
- How do we measure if a feature is monosemantic?
- What if I use SAE (sparse autoencoder) activations as the steering vectors I am trying to transfer from one model to another? Like this tutorial
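
On the unembedding question: a minimal sketch of the shape argument, using a toy setup in plain Python rather than a real model. Everything here is an assumption for illustration: the sizes `D_MODEL`, `D_MLP`, `VOCAB`, the random matrix `W_U`, and the helpers `layer_norm` and `unembed` are all hypothetical. The idea of applying the final norm plus the unembedding to intermediate residual states is often called the logit lens. The sketch only shows that the projection is well-defined at any residual-stream position, not that the resulting logits are meaningful at early layers.

```python
import math
import random

D_MODEL, D_MLP, VOCAB = 4, 16, 6  # toy sizes, chosen only for illustration

random.seed(0)
# Hypothetical unembedding matrix W_U with shape (D_MODEL, VOCAB).
W_U = [[random.gauss(0, 1) for _ in range(VOCAB)] for _ in range(D_MODEL)]


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def unembed(vec):
    """Project a D_MODEL-wide vector to vocabulary logits via W_U."""
    if len(vec) != D_MODEL:
        raise ValueError(f"expected width {D_MODEL}, got {len(vec)}")
    normed = layer_norm(vec)
    return [sum(normed[i] * W_U[i][t] for i in range(D_MODEL)) for t in range(VOCAB)]


# Stand-ins for the residual stream after each layer: always D_MODEL wide,
# so the same W_U can be applied at every one of them.
resid_by_layer = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(3)]
logits_by_layer = [unembed(r) for r in resid_by_layer]

# Stand-in for an MLP hidden activation: D_MLP wide, so W_U does not apply.
mlp_hidden = [random.gauss(0, 1) for _ in range(D_MLP)]
try:
    unembed(mlp_hidden)
    mlp_unembed_ok = True
except ValueError:
    mlp_unembed_ok = False
```

In this toy, every entry of `resid_by_layer` unembeds to a `VOCAB`-length logit vector, while `mlp_hidden` is rejected on width alone. To unembed a non-residual activation in a real model, it would first have to be mapped back into the residual stream (for an MLP hidden activation, through that layer's output projection).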