Icebreaker post! What’s something related to ML that you are in the process of learning more about or just learned?

  • ShadowAetherOPM · 1 year ago

    Working on now: learning how machine learning models are evaluated beyond just loss/accuracy/confusion matrices, e.g. how generalization is tested and how much the choice of evaluation method matters (like 10-fold cross-validation versus a single train/test split).
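
    For the cross-validation part, a minimal sketch of the kind of comparison I mean (the model and data here are stand-ins, not my actual project):

    ```python
    # k-fold cross-validation with scikit-learn: the data is split into
    # k folds, and each fold takes one turn as the held-out test set.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import KFold, cross_val_score

    X = np.random.rand(100, 8)          # placeholder features
    y = np.random.randint(0, 2, 100)    # placeholder binary labels

    model = LogisticRegression()
    cv = KFold(n_splits=10, shuffle=True, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv)

    # The spread across folds says something about generalization
    # that a single hold-out score hides.
    print(scores.mean(), scores.std())
    ```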

    Learned last: I learned how to build 1D convolutional autoencoders with Keras. I also learned that autoencoders may not be a good choice for my dataset.
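
    Roughly what that looked like, as a hedged sketch (sequence length, channel count, and layer sizes are made up for illustration):

    ```python
    # 1D convolutional autoencoder in Keras: Conv1D/pooling layers
    # compress the sequence, then a mirrored stack reconstructs it.
    from tensorflow.keras import layers, models

    seq_len, channels = 128, 1  # assumed input shape

    inputs = layers.Input(shape=(seq_len, channels))
    # Encoder: shrink along the time axis
    x = layers.Conv1D(16, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling1D(2, padding="same")(x)
    x = layers.Conv1D(8, 3, activation="relu", padding="same")(x)
    encoded = layers.MaxPooling1D(2, padding="same")(x)
    # Decoder: mirror the encoder back up to the original length
    x = layers.Conv1D(8, 3, activation="relu", padding="same")(encoded)
    x = layers.UpSampling1D(2)(x)
    x = layers.Conv1D(16, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling1D(2)(x)
    outputs = layers.Conv1D(channels, 3, activation="linear", padding="same")(x)

    autoencoder = models.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")  # reconstruction loss
    ```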

  • planish · 1 year ago

    I am trying to learn the fine art of training models and WTF I am supposed to do with the squiggly lines on TensorBoard.

    But I am doing this in the context of training Stable Diffusion LoRAs, so my sources are all highly suspicious text dumps from angry channers, and consequently there is a lot of trial and error.

    Important and possibly correct learnings:

    • The LoRAs are apparently really meant to be used at weight 1. But everybody overfits them and then tells you to compensate by turning the weight back down.
    • More training is not better: with only 30 to 50 training images, throwing more than about half an hour of compute at it can overfit a lot.
    • You probably want a small LoRA, and you want to give it more repeats of the images but only a few epochs.
    • The loss is indeed supposed to go down over time; if it is not doing that, you have broken something.
    • The learning rate scheduler is important, but I have yet to really get it right. Starting fast and then slowing down seems to be required, but what counts as “fast” and “slow” varies by data set (see the sketch after this list).
    • The model learns to rely on prompt structure to produce image structure. If the training prompts always say someone is making some kind of facial expression, but you don’t add any facial expression to the prompt when you use the model, it will magically lose its ability to draw a face. This is presumably overfitting, but I don’t know how to fix it.
    • The tools for captioning data sets are terrible. Everything wants to make tag soup with no real sentences, which is not how people prompt the models.
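
    On the scheduler point above, a sketch of the “start fast, then slow down” shape I mean: linear warmup into cosine decay. All the numbers are placeholders; the right values seem to vary per data set.

    ```python
    import math

    def lr_at(step, base_lr=1e-4, warmup_steps=100, total_steps=2000, min_lr=1e-6):
        # Ramp up quickly from zero to the base rate...
        if step < warmup_steps:
            return base_lr * (step + 1) / warmup_steps
        # ...then ease down along a half cosine toward a floor.
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

    for s in (0, 50, 100, 1000, 2000):
        print(s, lr_at(s))
    ```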

    Things I am still trying to learn:

    • How to take all this and make a LoRA that actually teaches the model something useful at weight 1.
    • ShadowAetherOPM · 1 year ago

      Very relatable, some days I feel like my work is just staring at loss graphs going “wtf does it all mean” lol. Training models feels more like an art than a science to me most of the time (but I’m sure the meta-learning people are working on that).