The content below is taken from the original ( Linux Fu: Bash Strings), to continue reading please visit the site. Remember to respect the Author & Copyright.
If you are a traditional programmer, using bash for scripting may seem limiting sometimes, but for certain tasks, bash can be very productive. It turns out that some of the limits of bash are really limits of older shells, and people code to those limits to stay compatible. Still other perceived issues arise because some of the advanced functions in bash are arcane or confusing.
Strings are a good example. You don’t think of bash as a string manipulation language, but it has many powerful ways to handle strings. In fact, it may have too many ways, since the functionality winds up in more than one place. Of course, you can also call out to programs, and sometimes it is just easier to make a call to an awk or Python script to do the heavy lifting.
But let’s stick with bash-isms for handling strings. Obviously, you can put a string in a variable and pull it back out. I am going to assume you know how string interpolation and quoting work. In other words, this should make sense:
echo "Your path is $PATH and the current directory is ${PWD}"
The Long and the Short
Suppose you want to know the length of a string. That’s a pretty basic string operation. In bash, you can write ${#var} to find the length of $var:
#!/bin/bash
echo -n "Project Name? "
read PNAME
if (( ${#PNAME} > 16 ))
then
    echo "Error: Project name longer than 16 characters"
else
    echo "${PNAME} it is!"
fi
The “((” forms an arithmetic context, which is why you can get away with an unquoted greater-than sign here. If you don’t mind using expr, which is an external program, there are at least two more ways to get there:
echo ${#STR}
expr length "${STR}"
expr match "${STR}" '.*'
Of course, if you allow yourself to call outside of bash, you could use awk or anything else to do this, too, but we’ll stick with expr as it is relatively lightweight.
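To show ${#var} doing some work inside a loop, here is a minimal sketch of a helper that picks the longest of its arguments without leaving bash. The function name longest_of and the sample words are made up for this example.

```shell
#!/usr/bin/env bash
# longest_of is a hypothetical helper, not something from a library.
# It prints the longest of its arguments, using only ${#var} for lengths.
longest_of() {
    local best="" word
    for word in "$@"; do
        # (( )) is an arithmetic context, so > means "greater than" here
        if (( ${#word} > ${#best} )); then
            best=$word
        fi
    done
    printf '%s\n' "$best"
}

LONGEST=$(longest_of foo hackaday bar)
echo "$LONGEST"    # prints hackaday
```

If two arguments tie, this version keeps the first one it saw, which may or may not be what you want.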
Swiss Army Knife
In fact, expr can do a lot of string manipulations in addition to length and match. You can pull a substring from a string using substr. It is often handy to use index to find a particular character in the string first. The expr program uses 1 as the first character of the string. So, for example:
#!/bin/bash
echo -n "Full path? "
read FFN
LAST_SLASH=0
SLASH=$( expr index "$FFN" / )   # find first slash
while (( $SLASH != 0 ))
do
    let LAST_SLASH=$LAST_SLASH+$SLASH   # point at next slash
    SLASH=$(expr index "${FFN:$LAST_SLASH}" / )   # look for another
done
# now LAST_SLASH points to last slash
echo -n "Directory: "
expr substr "$FFN" 1 $LAST_SLASH
echo -or-
echo ${FFN:0:$LAST_SLASH}
# Yes, I know about dirname but this is an example
Enter a full path (like /foo/bar/hackaday) and the script will find the last slash and print the name up to and including the last slash using two different methods. This script makes use of expr but also uses the syntax for bash’s built-in substring extraction, which starts at index zero. For example, if the variable FOO contains “Hackaday”:
- ${FOO} -> Hackaday
- ${FOO:1} -> ackaday
- ${FOO:5:3} -> day
The first number is an offset and the second is a length if it is positive. You can also make either of the numbers negative, although you need a space after the colon if the offset is negative. The last character of the string is at index -1, for example. A negative length is shorthand for an absolute position from the end of the string. So:
- ${FOO: -3} -> day
- ${FOO:1:-4} -> ack
- ${FOO: -8:-4} -> Hack
Of course, either or both numbers could be variables, as you can see in the example.
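One place offsets and lengths earn their keep is fixed-width data. The sketch below assumes a date stamp in YYYYMMDD form; the variable names and the sample value are invented for illustration.

```shell
#!/usr/bin/env bash
# STAMP is a made-up sample value in YYYYMMDD form.
STAMP=20240131

YEAR=${STAMP:0:4}      # offset 0, length 4   -> 2024
MONTH=${STAMP:4:2}     # offset 4, length 2   -> 01
DAY=${STAMP: -2}       # last two characters  -> 31 (note the space)

# The offset and length are arithmetic expressions, so bare variable names work
START=4
LEN=2
SAME_MONTH=${STAMP:START:LEN}

echo "$YEAR-$MONTH-$DAY"    # prints 2024-01-31
```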
Less is More
Sometimes you don’t want to find something, you just want to get rid of it. bash has lots of ways to remove substrings using fixed strings or glob-based pattern matching. There are four variations. One pair of deletions removes the shortest (#) or longest (##) possible match from the front of the string, and the other pair does the same thing from the back of the string (% and %%). Consider this:
TSTR=my.first.file.txt
echo ${TSTR%.*}    # prints my.first.file
echo ${TSTR%%.*}   # prints my
echo ${TSTR#*fi}   # prints rst.file.txt
echo ${TSTR##*fi}  # prints le.txt
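A common use for these four operators is pulling a path apart without calling dirname or basename. This is only a sketch with a sample path and made-up variable names, and it does not handle every edge case the real tools do. A name with no slash in it, for example, comes back unchanged as the directory.

```shell
#!/usr/bin/env bash
# FFN is a sample path; the variable names are just for illustration.
FFN=/foo/bar/hackaday.tar.gz

DIR=${FFN%/*}       # shortest match of /* from the back  -> /foo/bar
BASE=${FFN##*/}     # longest match of */ from the front  -> hackaday.tar.gz
EXT=${BASE##*.}     # longest match of *. from the front  -> gz
STEM=${BASE%%.*}    # longest match of .* from the back   -> hackaday

echo "$DIR | $BASE | $EXT | $STEM"
```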
Transformation
Of course, sometimes you don’t want to delete as much as you want to replace some string with another string. You can use a single slash to replace the first instance of a search string or two slashes to replace globally. You can also omit the replacement string, which gives you another way to delete parts of strings. One other trick is to add a # or % after the slash to anchor the match to the start or end of the string, much like with a deletion.
TSTR=my.first.file.txt
echo ${TSTR/fi/Fi}          # my.First.file.txt
echo ${TSTR//fi/Fi}         # my.First.File.txt
echo ${TSTR/#*./PREFIX-}    # PREFIX-txt (note: always longest match)
echo ${TSTR/%.*/.backup}    # my.backup (note: always longest match)
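Here is a small sketch that exercises the same replacement forms on a string with spaces in it, including the delete-by-omission trick and an anchored pattern that fails to match. The sample string and variable names are invented for this example.

```shell
#!/usr/bin/env bash
# TITLE is a made-up sample string.
TITLE="Linux Fu: Bash Strings"

SLUG=${TITLE// /_}           # replace every space          -> Linux_Fu:_Bash_Strings
NOCOLON=${TITLE/:/}          # empty replacement deletes    -> Linux Fu Bash Strings
FRONT=${TITLE/#Linux/BSD}    # matches only at the start    -> BSD Fu: Bash Strings
NOMATCH=${TITLE/#Bash/Zsh}   # Bash is not at the start, so nothing changes

echo "$SLUG"
echo "$NOCOLON"
echo "$FRONT"
echo "$NOMATCH"
```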
Miscellaneous
Some of the more common ways to manipulate strings in bash have to do with dealing with parameters. Suppose you have a script that expects a variable called OTERM to be set but you want to be sure:
REALTERM=${OTERM:-vt100}
Now REALTERM will have the value of OTERM or the string “vt100” if there was nothing in OTERM. Sometimes you want to set OTERM itself, so while you could assign to OTERM instead of REALTERM, there is an easier way. Use := instead of the :- sequence. If you do that, you don’t necessarily need an assignment at all, although you can use one if you like:
echo ${OTERM:=vt100} # now OTERM is vt100 if it was empty before
You can also reverse the sense so that you replace the value only if the main value is not empty, although that’s not as generally useful:
echo ${DEBUG:+"Debug mode is ON"} # reverse -; no assignment
A more drastic measure lets you print an error message to stderr and abort a non-interactive shell:
REALTERM=${OTERM:?"Error. Please set OTERM before calling this script"}
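To see all four forms side by side, the following sketch starts from unset variables and records what each expansion does. The result variable names and NO_SUCH_VAR are hypothetical. Because :? aborts a non-interactive shell, the sketch tries it inside a subshell so the rest of the script keeps running.

```shell
#!/usr/bin/env bash
unset OTERM DEBUG NO_SUCH_VAR

FIRST=${OTERM:-vt100}     # FIRST is vt100, but OTERM itself is untouched
AFTER_DASH=${OTERM:-}     # still empty, which shows :- did not assign anything

: "${OTERM:=vt100}"       # the : builtin ignores its arguments, so this only assigns
AFTER_EQUALS=$OTERM       # vt100

QUIET=${DEBUG:+"Debug mode is ON"}   # empty, because DEBUG is unset
DEBUG=1
LOUD=${DEBUG:+"Debug mode is ON"}    # Debug mode is ON

# Try :? in a subshell; it fails there and the parent script carries on
CAUGHT=no
if ! ( : "${NO_SUCH_VAR:?must be set}" ) 2>/dev/null; then
    CAUGHT=yes
fi

echo "$FIRST $AFTER_EQUALS [$QUIET] [$LOUD] $CAUGHT"
```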
Just in Case
Converting things to upper or lower case is fairly simple. You can provide a glob pattern that matches a single character. If you omit it, it is the same as ?, which matches any character. You can elect to change all the matching characters or just attempt to match the first character. Here are the obligatory examples:
NAME="joe Hackaday"
echo ${NAME^}       # prints Joe Hackaday (first match of any character)
echo ${NAME^^}      # prints JOE HACKADAY (all of any character)
echo ${NAME^^[a]}   # prints joe HAckAdAy (all a characters)
echo ${NAME,,}      # prints joe hackaday (all characters)
echo ${NAME,}       # prints joe Hackaday (first character matched and didn't convert)
NAME="Joe Hackaday"
echo ${NAME,,[A-H]} # prints Joe hackaday (apply pattern to all characters and convert A-H to lowercase)
Recent versions of bash can also convert upper and lower case using ${VAR@U} and ${VAR@L} along with just the first character using @u and @l, but your mileage may vary.
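A typical job for the case operators is comparing user input without caring how it was typed. The sketch below assumes bash 4.0 or later, since that is where ^^ and ,, appeared; the helper name same_ignoring_case is made up.

```shell
#!/usr/bin/env bash
# same_ignoring_case is a hypothetical helper name.
# It lowercases both arguments with ,, and compares them literally.
same_ignoring_case() {
    [[ "${1,,}" == "${2,,}" ]]
}

ANSWER="YeS"
if same_ignoring_case "$ANSWER" "yes"; then
    RESULT=match
else
    RESULT=differ
fi

# Lowercase everything, then capitalize only the first character
TIDY=${ANSWER,,}
TIDY=${TIDY^}

echo "$RESULT $TIDY"    # prints match Yes
```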
Pass the Test
You probably realize that when you do a standard test, you are using what was traditionally an external program, although bash has a built-in version of it as well:
if [ $f -eq 0 ]
then ...
If you do an ls on /usr/bin, you’ll see an executable actually named “[” used as a shorthand for the test program. However, bash has its own test in the form of two brackets:
if [[ $f == 0 ]]
then ...
That test built-in can handle regular expressions using =~ so that’s another option for matching strings:
if [[ "$NAME" =~ [hH]a.k ]] ...
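When =~ succeeds, bash also fills in the BASH_REMATCH array with the matched text and any parenthesized groups, which makes it handy for picking strings apart. This is a minimal sketch; the version string and pattern are made-up samples. Keeping the pattern in a variable and leaving it unquoted on the right-hand side avoids quoting surprises.

```shell
#!/usr/bin/env bash
# The version string and the pattern are made-up samples.
VERSION="bash-5.2.15"
PATTERN='^([a-z]+)-([0-9]+)\.([0-9]+)'

if [[ "$VERSION" =~ $PATTERN ]]; then
    WHOLE=${BASH_REMATCH[0]}   # text matched by the whole expression
    PROG=${BASH_REMATCH[1]}    # first parenthesized group
    MAJOR=${BASH_REMATCH[2]}   # second group
    MINOR=${BASH_REMATCH[3]}   # third group
    echo "$PROG version $MAJOR.$MINOR"   # prints bash version 5.2
fi
```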
Choose Wisely
Of course, if you are doing a slew of text processing, maybe you don’t need to be using bash. Even if you are, don’t forget you can always leverage other programs like tr, awk, sed, and many others to do things like this. Sure, performance probably won’t be as good, but if you are worried about performance, why are you writing a script?
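For comparison, here is the same uppercase conversion done three ways: once inside bash and twice by piping to an external program. It is only a sketch and assumes tr and awk are on your path, as they are on nearly any Linux system.

```shell
#!/usr/bin/env bash
NAME="joe Hackaday"

WITH_BASH=${NAME^^}                                             # no extra process
WITH_TR=$(printf '%s' "$NAME" | tr '[:lower:]' '[:upper:]')     # one pipe to tr
WITH_AWK=$(printf '%s' "$NAME" | awk '{ print toupper($0) }')   # one pipe to awk

echo "$WITH_BASH / $WITH_TR / $WITH_AWK"
```

All three should produce the same string; the external versions simply cost an extra process each.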
Unless you just swear off scripting altogether, it is nice to have some of these tricks in your back pocket. Use them wisely.