Hey everyone! I have a bunch of untagged audio files in different directories that all follow the same naming format: "artist - album with spaces - 2-digit-tracknum title with spaces". The final separator is a space instead of a dash, which is causing me some issues when trying to process these filenames. I'm looking for advice on how to differentiate between the space that comes after the two-digit track number and other spaces in the filename, especially since there might be other occurrences of two digits in different fields. I've gotten started with a script, but I'm not sure if it's the best approach. Any guidance would be greatly appreciated!
2 Answers
Alternatively, you could read the filenames in a while loop, using the NUL byte as the delimiter so that spaces and newlines in names are handled safely. Something like: `while IFS= read -r -d '' FILE_NAME; do ...; done < <(find /path/to/directory -type f -name "*.mp3" -print0)`. This way, you can split the strings with the shell's built-in string handling instead of spawning external tools for every file, which could make your process more efficient.
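To make the loop above concrete, here is a minimal sketch. It assumes bash (process substitution and `read -d ''` are not POSIX sh), and `list_mp3_basenames` is a made-up helper name, not something from your script:

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the basename of every .mp3 file under a directory.
list_mp3_basenames() {
    local dir=$1 file_name
    # find emits NUL-terminated paths; read -d '' consumes them one at a time,
    # so spaces (or even newlines) in filenames do not split a path.
    while IFS= read -r -d '' file_name; do
        # ${file_name##*/} strips the directory part without calling basename.
        printf '%s\n' "${file_name##*/}"
    done < <(find "$dir" -type f -name '*.mp3' -print0)
}
```

You would call it as `list_mp3_basenames /path/to/directory` and swap the `printf` for whatever per-file processing you need.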
You might want to use regex with parenthesized sub-patterns to extract the different parts of the filename. For example, a pattern like `"^(.*) - (.*) - ([0-9][0-9]) (.*)\.mp3$"` could be really helpful. This would let you capture the artist, album, track number, and title separately. Just be careful with filenames that don’t fit this pattern; you'll need to handle those cases specifically. Also, if your collection is large, using tools like Beets could help streamline everything by pulling metadata automatically!
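As a rough sketch of how that pattern could be applied in bash itself, using `[[ ... =~ ... ]]` and the `BASH_REMATCH` array. The function name `parse_track_name` and the variables it sets are my own placeholders, and it assumes the input is a basename ending in `.mp3`:

```bash
#!/usr/bin/env bash

# Hypothetical helper: split "artist - album - NN title.mp3" into four fields.
# On success it sets the global variables artist, album, track and title;
# it returns 1 when the name does not fit the pattern.
parse_track_name() {
    local name=$1
    # Keeping the pattern in a variable avoids quoting problems inside [[ ]].
    local re='^(.*) - (.*) - ([0-9][0-9]) (.*)\.mp3$'
    if [[ $name =~ $re ]]; then
        artist=${BASH_REMATCH[1]}
        album=${BASH_REMATCH[2]}
        track=${BASH_REMATCH[3]}
        title=${BASH_REMATCH[4]}
        return 0
    fi
    return 1
}

if parse_track_name "Some Artist - Some Album - 03 Track 22 Title.mp3"; then
    printf 'artist=%s album=%s track=%s title=%s\n' \
        "$artist" "$album" "$track" "$title"
fi
# prints: artist=Some Artist album=Some Album track=03 title=Track 22 Title
```

Because the pattern requires ` - ` right before the two digits and a space right after them, the `22` inside the title is not mistaken for the track number. Names where the artist or album itself contains ` - ` are still ambiguous, so it is worth logging anything that fails to match, or that splits oddly, and handling those by hand.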

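That regex is clever, but `.*` is greedy and might capture too much. Consider changing it to something like `\S+` where you can, though keep in mind that `\S+` matches only non-whitespace characters, so it only suits fields that never contain spaces.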