Interp Explorer

MNIST-SAE Log 4

Work Done

Loaded in CIFAR-10 data
ran it through the MNIST model + the first SAE
cached all the activations
found max activating images
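
The last step above, finding max activating images from the cached activations, can be sketched roughly as follows. This is a minimal sketch, not the code used for this log: it assumes the cache is a NumPy array of shape (n_images, n_features), and `top_activating_images` is a hypothetical helper name.

```python
import numpy as np

def top_activating_images(acts, k=9):
    """Return, per feature, the indices of the k images with the highest activation.

    acts: array of shape (n_images, n_features) -- assumed layout of the cache.
    Returns an int array of shape (n_features, k), strongest image first.
    """
    acts = np.asarray(acts)
    k = min(k, acts.shape[0])
    # argsort ascending along the image axis, keep the last k rows, flip to descending
    order = np.argsort(acts, axis=0)[-k:][::-1]
    return order.T

# Toy stand-in for the cache: 5 images, 2 SAE features
acts = np.array([
    [0.1, 0.0],
    [0.9, 0.2],
    [0.3, 0.8],
    [0.0, 0.5],
    [0.7, 0.1],
])
top = top_activating_images(acts, k=2)
print(top.tolist())  # [[1, 4], [2, 3]]
```

Row `f` of the result lists image indices for feature `f`, so those indices can be used directly to pull the matching images back out of the dataset.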

Confusions

struggling to understand: does the dataset have a constant set of images that it pulls from, or is it randomized every time?
- need to validate that the images are not random or that I can control them
how do I actually visualize the clean image from CIFAR-10?
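
Both questions can probably be settled with a few small checks. If the data comes from torchvision's `CIFAR10` dataset class (an assumption; the log does not say how it was loaded), the same index returns the same stored image every time, and run-to-run differences come from `shuffle=True` in the `DataLoader` or from random augmentation transforms. The sketch below is library-agnostic: it assumes only that the dataset is indexable and returns (image, label) pairs, with images in channels-first layout. The helper names and the toy mean/std values are hypothetical; the real mean/std should match whatever normalization was applied at load time (use mean 0 and std 1 if none was).

```python
import numpy as np

def fixed_indices(n_items, k, seed=0):
    """Pick k distinct indices reproducibly: the same seed gives the same images every run."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_items, size=k, replace=False))

def is_deterministic(dataset, indices):
    """Fetch each index twice and check that the image is identical both times.

    dataset: anything indexable that returns (image, label) pairs (assumed interface).
    A False result points at a random transform somewhere inside the dataset.
    """
    for i in indices:
        first, _ = dataset[int(i)]
        second, _ = dataset[int(i)]
        if not np.array_equal(np.asarray(first), np.asarray(second)):
            return False
    return True

def to_displayable(img_chw, mean, std):
    """Undo per-channel normalization and return an HWC uint8 image for plotting."""
    img = np.asarray(img_chw, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    img = np.clip(img * std + mean, 0.0, 1.0)  # back to the [0, 1] range
    return (img.transpose(1, 2, 0) * 255).round().astype(np.uint8)

# Toy stand-in for a normalized dataset: 4 random 3x32x32 "images"
rng = np.random.default_rng(0)
toy = [(rng.normal(size=(3, 32, 32)), label) for label in range(4)]

idx = fixed_indices(len(toy), k=2, seed=0)
stable = is_deterministic(toy, idx)
picture = to_displayable(toy[int(idx[0])][0],
                         mean=(0.5, 0.5, 0.5), std=(0.25, 0.25, 0.25))
```

`picture` is in the height x width x channel layout that matplotlib's `plt.imshow` expects, so the clean image can be shown next to its activation value. Reusing the output of `fixed_indices` across runs is one way to control which images get inspected.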

Next Steps

Visualize the images [x]
Visualize the images with ImageNet or another image set that has way more diverse images
Repeat the process for the meta-SAE
use a VLM to do labeling (maybe CLIP?)
go really meta and have a loop that continuously constructs meta-SAEs - trying to find if there is a point at which you cannot get more fine-grained

Cool Stuff

Hugo Fry's experiments with Vision SAEs
Hugo Fry et al.'s work with medical SAEs: Ranjani could be interested in this