

Released a new article after having to rename a batch of files.

Want to rename files in bulk and looking for a good tool that works on Linux? #files #howto #linux #regex-rename #rename #rnr

Link: https://linux-audit.com/linux-tools-to-bulk-rename-files/

Feedback and boosts welcome 🚀

in reply to Michael Boelen

Your examples contain incorrect code, but because they also use the wrong quoting, they accidentally break in a way that gives the results you want.

Your first example has a search pattern of (\d+)-(\d+)-(\d+)- and a replacement part of ${3}.
The pattern means: Match one or more digits and capture them as group 1, then a hyphen, then one or more digits (captured as group 2), then a hyphen, then one or more digits (captured as group 3), then a hyphen.
The replacement part means: Insert the contents of capture group 3, which (since we're matching dates) should be the day number.
For example, ./2014-10-25-how-to-create-custom-tests-in-lynis.md would become ./25how-to-create-custom-tests-in-lynis.md (note the 25).

However, since you're passing these arguments in double quotes on the command line, the shell interprets them first, and ${3} is the syntax for inserting the contents of the third parameter to the shell—which, for a login/interactive shell, is normally empty. So the shell turns "${3}" into "" (the empty string), which is the correct replacement string for what you're trying to do in the first place.
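The quoting difference is easy to check in isolation. This is a minimal sketch for a POSIX shell; the `set --` line and the variable names are my own additions, used only to mimic a shell with no positional parameters:

```shell
# Clear the positional parameters, as in a typical interactive shell.
set --

# Double quotes: the shell expands ${3} before any command sees it.
double_quoted="${3}"
# Single quotes: the text is passed through literally.
single_quoted='${3}'

printf 'double: [%s]\n' "$double_quoted"   # prints: double: []
printf 'single: [%s]\n' "$single_quoted"   # prints: single: [${3}]
```

With double quotes the tool only ever receives an empty replacement string, so the capture groups in the pattern never come into play.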

The correct invocation is:

rnr --dry-run '\d+-\d+-\d+-' '' ./*.md

Note the use of single quotes to avoid interpretation by the shell. (Since we effectively just delete parts of the input string, we don't need to capture anything, so no ( ) are needed in the pattern.)

Similarly, your second example has s/(\d+){4}-(\d+){2}-(\d+){2}-/$1/. Apart from the regex being slightly bizarre, it tries to use $1 (the contents of the first capture group) as the replacement. Now, (\d+){4} is basically a nested loop: It matches one or more digits, repeated 4 times in a row. Which is the same as matching 4 or more digits (since each iteration of the inner loop consumes at least one digit), except for the capturing part: Since the capturing parens ( ) are inside the {4} loop, they only capture the contents of the fourth and last iteration, which is always the last digit. In other words, (\d+){4} is equivalent to \d+\d\d(\d). (Similarly, (\d+){2} is \d+(\d).)

Normally this would turn ./2014-03-02-lynis-stuck-during-testing.md into ./4lynis-stuck-during-testing.md (note the 4).

However, double quotes and shell processing kick in again: The shell expands $1 to the contents of its first command-line parameter, which is normally empty, so again, the replacement part (as seen by the rename tool) is empty.

The correct invocation is:

# to mimic the first example exactly:
rename --nono 's/\d+-\d+-\d+-//' ./*.md
# alternatively, to match dates more precisely:
rename --nono 's/\d{4}-\d{2}-\d{2}-//' ./*.md

#regex #shell #sh
in reply to Kalamata Hari

@barubary
Thanks for the feedback. Not sure why we get different results. Will check it and see what improvements can be made. Cheers!