I want to extract and process the metadata from PNG images and the first line of .safetensors files for LLMs and LoRAs. I could spend ages farting around with sed or awk, but file formats are constantly changing. I’d like a faster way to see a summary of the training and a few other details when they are available.

  • eldavi
    5 months ago

    I’m assuming that command line means bash, in which case jq and regex are your friends.

    • j4k3@lemmy.worldOP
      5 months ago

      I found a Python project that does enough for my needs. jq looks super powerful though. Thanks. I managed to get yq working for PNGs, but I had trouble with both jq and yq on safetensors files. I couldn’t figure out how to parse a string embedded after a binary prefix that differs from file to file, especially in massive files. I could get in and grab the first line with head. I tried some stuff with shell expansions, but that didn’t work and sent me looking for others who have solved the issue better than I have.
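
For anyone hitting the same wall: the binary prefix is, per the safetensors format, an 8-byte little-endian unsigned integer giving the length of the JSON header that follows it, so the header can be read without touching the tensor data. Below is a minimal Python sketch under that assumption. The function names and the `max_header_bytes` guard are made up for illustration, and which keys show up under `__metadata__` depends on the tool that wrote the file.

```python
import json
import struct
import sys


def read_safetensors_header(path, max_header_bytes=100_000_000):
    """Return the JSON header of a .safetensors file as a dict.

    Only the first 8 + N bytes are read, so file size does not matter.
    max_header_bytes is a hypothetical sanity limit, not part of any API.
    """
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file is too short to hold a header length")
        # The first 8 bytes are the header length as a little-endian u64.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > max_header_bytes:
            raise ValueError(f"implausible header length: {header_len}")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("file ends before the header does")
    return json.loads(raw.decode("utf-8"))


def training_metadata(path):
    """Return the optional free-form metadata map, or {} if it is absent."""
    return read_safetensors_header(path).get("__metadata__", {})


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print each metadata entry, truncated so huge values stay readable.
    for key, value in sorted(training_metadata(sys.argv[1]).items()):
        print(f"{key}: {str(value)[:200]}")
```

Saved under a name of your choice, it takes the file path as its only argument. To keep using jq, `print(json.dumps(...))` on the returned dict and pipe that instead.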