MNIST-SAE Log 11

Work Done

  • Established rudimentary interp via EMNIST classification labels
  • Fixed up some of the experiment code to save the indices
  • Got preliminary findings that there is an optimal depth for meta-SAEs

  • Found that the average number of activations increases the deeper you go, which implies some level of fine-grainedness

  • The max number of activations hits its peak 2 depths in.
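
A rough sketch of how the per-depth numbers above could be computed, assuming "number of activations" means how many latents are above a threshold for a given sample. The function name `depth_summary`, the `acts_by_depth` layout (depth -> list of per-sample latent rows), and the threshold are all hypothetical, not the actual experiment code.

```python
def depth_summary(acts_by_depth, threshold=0.0):
    """Summarise how many latents fire per sample at each meta-SAE depth.

    acts_by_depth: dict mapping depth -> list of rows, where each row holds
    the latent activations for one sample (assumed layout).
    """
    summary = {}
    for depth, acts in acts_by_depth.items():
        # Count latents above the threshold for every sample at this depth.
        active_counts = [sum(1 for a in row if a > threshold) for row in acts]
        summary[depth] = {
            "mean_active": sum(active_counts) / len(active_counts),
            "max_active": max(active_counts),
        }
    return summary


# Toy data only, to show the call shape.
toy = {
    0: [[0.0, 1.0], [0.0, 0.0]],
    1: [[1.0, 1.0], [0.0, 2.0]],
}
toy_summary = depth_summary(toy)
```

Comparing `mean_active` and `max_active` across depths is one way to check whether the average keeps rising while the max peaks at a particular depth.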

Confusions

  • Am I actually doing what I think I am doing?
    • What I am trying to do: take a trained MNIST model, feed it MNIST data, and train an SAE on its activations. Then track the values with EMNIST data to see what patterns there are outside of numbers (in this case a balanced dataset of letters and numbers)
    • The max activations are checking whether there are any neurons that fire exclusively on one label, OR whether there are some basic trends like letters vs. numbers
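
A minimal sketch of the label check described above, to make the intent concrete. It assumes SAE activations have already been collected for EMNIST samples as rows of per-latent values, and that labels below 10 are digits while the rest are letters (my understanding of the EMNIST Balanced label mapping, worth verifying). The name `label_selectivity` and the "purity" measure are hypothetical, not the actual experiment code.

```python
from collections import Counter, defaultdict


def label_selectivity(activations, labels, threshold=0.0):
    """For each latent, report which label it fires on most and how exclusively.

    activations: list of rows, one per sample, each a list of latent values.
    labels: integer class label per sample (assumed: < 10 means digit).
    """
    counts = defaultdict(Counter)
    for row, label in zip(activations, labels):
        for j, a in enumerate(row):
            if a > threshold:
                counts[j][label] += 1

    report = {}
    for j, c in counts.items():
        total = sum(c.values())
        top_label, top_count = c.most_common(1)[0]
        digit_hits = sum(n for lab, n in c.items() if lab < 10)
        report[j] = {
            "top_label": top_label,
            # 1.0 means the latent fires on exactly one label.
            "purity": top_count / total,
            # Near 0 or 1 suggests a letter-only or digit-only latent.
            "digit_fraction": digit_hits / total,
        }
    return report


# Toy data only: three samples, three latents.
toy_acts = [[1.0, 0.0, 0.5], [2.0, 0.0, 0.0], [0.0, 0.0, 0.3]]
toy_labels = [3, 3, 12]
toy_report = label_selectivity(toy_acts, toy_labels)
```

Latents that never fire are left out of the report. A purity of 1.0 flags a candidate "one label only" latent, while `digit_fraction` covers the coarser letters-vs-numbers trend.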

Next Steps

  • Plot some of the data using the techniques from GPT
  • Create a preliminary writeup using these charts and any others to show what I have so far
    • REPLICATE, REPLICATE, REPLICATE with other datasets (like fruit or other EMNISTs) --> basically take all the classification datasets
  • Also need to look at the analysis in the meta-SAE paper

Cool Stuff

links

social