Introduction
This final part of the article series covers some other important behaviour of bash, particularly that relating to text expansions and substitutions.
The famed text-processing abilities of the Bourne shell (sh) have been vastly expanded in the Bourne-Again shell (bash). These abilities are efficient to a fault and the code can be even more cryptic. This makes the bash command interpreter and programming language very powerful but also a minefield of errors.
Before going further, let us review what we learned in the first part of this article:
- sh and bash are not the same
- if statements check for exit codes (0 for true and non-zero for false), not boolean values
- [, true and false are programs, not keywords
- Presence/absence of space in variable comparisons and assignments make quite a difference
- [[ is not the fail-safe version of [
- Arithmetic operations are not straight-forward as in other languages
- Array operations use cryptic operators
- By default, bash will not stop for errors
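As a quick refresher, a few of these points can be sketched in a handful of lines. This is only an illustration; the variable names are made up for the example.

```shell
#!/usr/bin/env bash
# 'if' looks only at the exit status of the command it runs.
if true; then
    true_branch="taken"
fi

# Capture the exit status of 'false' (non-zero means false).
false || false_status=$?

# Assignments take no spaces around '='; the [ command needs spaces
# around each of its arguments, including the closing ].
greeting="hello"
if [ "$greeting" = "hello" ]; then
    comparison="equal"
fi

# Arithmetic needs its own syntax.
sum=$((2 + 3))

echo "$true_branch $false_status $comparison $sum"
```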
In this part, we will focus on how bash performs text expansions and substitutions. I will only cover what I think are the most important text-processing features. For comprehensive information, you will have to study the Bash Reference Manual. Bash and several commands such as sed and grep also use regular expressions to perform text processing. Regular expressions are a separate topic of their own and I will not cover them either.
History Expansion Character (!)
This feature is available when typing commands at the shell prompt. It is used to access commands stored in the bash history file.
!n | Execute nth command in bash history |
!! | Execute last command (equivalent to !-1) |
!leword | Execute last command beginning with 'leword' |
!?leword? | Execute last command containing 'leword' |
^search^replace | Execute last command after replacing first occurrence of 'search' with 'replace' |
You can modify the history search using certain word designators, preceded by a colon (:).
!?leword?:0 | Execute with the 0th word (usually the command executable) of the last command containing 'leword'. |
!?leword?:2 | Execute with the second word of the last command containing 'leword'. |
!?leword?:$ | Execute with the last word of the last command containing 'leword'. |
!?leword?:2-6 | Execute with the second to sixth words of the last command containing 'leword'. |
!?leword?:-6 | Execute with all words up to the sixth word of the last command containing 'leword' (equivalent to !?leword?:0-6) |
!?leword?:* | Execute with all words except the 0th word of the last command containing 'leword' (equivalent to !?leword?:1-$) |
!?leword?:2* | Execute with the second word to the last word of the last command containing 'leword' (equivalent to !?leword?:2-$) |
!?leword?:2- | Execute with all words from the second word to the last-but-one word of the last command containing 'leword'. |
Remember that bash will execute whatever you have retrieved from the history together with whatever you have already typed at the prompt. You can also use any number of modifiers, each preceded by a colon (:).
!?leword?:p | Display (but do not execute) the last command containing 'leword' |
!?leword?:t | Execute with the last command containing 'leword' after removing all leading pathname components (i.e., leaving the tail containing the file name) |
!?leword?:r | Execute with the last command containing 'leword' after removing the trailing file extension |
!?leword?:e | Execute with the last command containing 'leword' after removing everything but the trailing file extension |
!?leword?:s/search/replace | Execute the last command containing 'leword' after replacing the first instance of 'search' with 'replace' |
!?leword?:as/search/replace | Execute the last command containing 'leword' after replacing all instances of 'search' with 'replace' |
If you omit the search text ('leword') and use the history expansion character with the word designators and the modifiers, bash will search the last command. Until you become proficient in using the history expansion character, use the modifier :p to display the command before you actually execute it.
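History expansion is meant for the interactive prompt, but the history builtin offers a way to preview expansions without running anything: history -s appends an entry to the history list and history -p prints the result of an expansion. The following is a rough sketch under that assumption; the cp command and its paths are hypothetical and are never executed.

```shell
#!/usr/bin/env bash
# Add a (hypothetical) command to the history list without running it.
history -s 'cp /tmp/drafts/chapter1.txt /tmp/backup/chapter1.txt'

# history -p expands its arguments and prints them; nothing is executed.
command_word=$(history -p '!!:0')    # 0th word
first_arg=$(history -p '!!:1')       # first argument
last_arg=$(history -p '!!:$')        # last word
all_args=$(history -p '!!:*')        # all words except the 0th
tail_only=$(history -p '!!:$:t')     # last word, leading path removed
no_ext=$(history -p '!!:$:r')        # last word, trailing extension removed

printf '%s\n' "$command_word" "$first_arg" "$last_arg" \
              "$all_args" "$tail_only" "$no_ext"
```

At the prompt, the same designators can be typed directly (for example, !!:$:t), ideally with :p appended until the result is what you expect.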
Text Expansions and Substitutions
These features are available at the shell prompt and in shell scripts.
- Tilde (~): In your commands, bash will expand an unquoted ~ at the start of a word to the value of the environment variable $HOME, that is, your home directory.
- ? and *: These are metacharacters. In file name patterns, ? matches any one character while * matches any number of any characters. If they do not match any file names, bash will use their literal values.
- Brace expansion: You can use comma-separated text strings within curly brackets to generate combinations of strings with their suffixes and/or prefixes. When I start a new book, I create its folders like this.
mkdir -p NewBook/{ebook/images,html/images,image-sources,isbn,pub,ref}
This command creates folders like this:
NewBook
NewBook/ref
NewBook/pub
NewBook/isbn
NewBook/image-sources
NewBook/html
NewBook/html/images
NewBook/ebook
NewBook/ebook/images
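The three expansions above can be tried safely in a throwaway directory. The sketch below assumes default shell options (no nullglob or failglob); the note files are hypothetical names used only to give the patterns something to match.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Brace expansion happens before the command runs; echo shows the result.
expanded=$(echo NewBook/{ebook,html}/images)

# The same mechanism creates the whole folder tree in one command.
mkdir -p NewBook/{ebook/images,html/images,image-sources,isbn,pub,ref}
dir_count=$(find NewBook -type d | wc -l | tr -d ' ')

# ? matches exactly one character, * matches any run of characters.
touch NewBook/ref/note1.txt NewBook/ref/note2.txt NewBook/ref/notes.md
one_char=$(echo NewBook/ref/note?.txt)
no_match=$(echo NewBook/ref/*.pdf)    # nothing matches, so the pattern stays literal

# An unquoted ~ at the start of a word expands to the home directory.
home_dir=~

echo "$expanded"
echo "$dir_count"
echo "$one_char"
echo "$no_match"
echo "$home_dir"

# Clean up.
cd - >/dev/null && rm -rf "$workdir"
```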
- Parameter expansion: When bash executes a script, it creates these special variables for the script.
Shell variable | Use |
$0 | Name of the shell script |
$1, $2,… | Positional parameters or arguments passed to the script |
$# | Total count of arguments passed to the script |
$? | Exit status of last command |
$* | All arguments (when double-quoted as "$*", joined into a single word) |
$@ | All arguments (when double-quoted as "$@", each argument stays a separate word) |
$$ | Process ID of current shell/script |
At the terminal, $0 will usually expand to the shell program (/bin/bash). On a terminal, you can use the set command to specify parameters for the current shell directly.
echo $#
echo $*
set -- hello world
echo $#
echo $*
set --
echo $#
The option -- (two hyphens) represents the end of options and implies that whatever follows it must be command parameters.
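The difference between $* and $@ only shows up when they are double-quoted. A small sketch, using set -- to supply two illustrative parameters (one of which contains a space):

```shell
#!/usr/bin/env bash
# Replace the positional parameters of the current shell.
set -- "hello world" bash
count=$#                        # number of parameters: 2

# "$@" keeps every parameter as a separate word.
at_words=0
for arg in "$@"; do
    at_words=$((at_words + 1))
done

# "$*" joins all parameters into a single word.
star_words=0
for arg in "$*"; do
    star_words=$((star_words + 1))
done
joined="$*"

# Unquoted, the parameters are split again on whitespace.
unquoted_words=0
for arg in $*; do
    unquoted_words=$((unquoted_words + 1))
done

echo "count=$count at=$at_words star=$star_words unquoted=$unquoted_words"
echo "joined=$joined"

set --                          # clear the positional parameters
cleared=$#
```

In most scripts, "$@" is the form you want when passing arguments along to another command.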
- Command substitution: Instead of backquotes, you can use the form $(commands) to capture the output of those commands for use in some other commands or variables. It makes quoting and escaping much easier.
- Variable substitution: You can use these substitutions with command parameters (created by bash for a shell script) or with variables that you have created.
Substitution | Effect |
${var1:-word} | If var1 is null or unset, the text word is used; var1 is left unchanged |
${var1:=word} | If var1 is null or unset, the text word is used and also assigned to var1 |
${var1:?msg} | If var1 is null or unset, msg is written to standard error (a non-interactive shell then exits) |
${var1:+word} | If var1 is set and not null, the text word is used; var1 is left unchanged |
${var:offset} | Everything of var after offset number of characters |
${var:offset:length} | length number of characters of var after offset number of characters |
${!prefix*} ${!prefix@} | All variable names beginning with prefix |
${!var[@]} ${!var[*]} | All indexes of array variable var |
${#var} | Length of value of var |
${var#drop} | Value of var with the shortest prefix matching the glob pattern drop removed |
${var##drop} | Value of var with the longest prefix matching the glob pattern drop removed |
${var%drop} | Value of var with the shortest suffix matching the glob pattern drop removed |
${var%%drop} | Value of var with the longest suffix matching the glob pattern drop removed |
${var^letter} | Changes the first character of var to uppercase if it matches letter (a single letter, * or ?). If letter is not specified, the first character is changed to uppercase. |
${var^^letter} | Changes every character of var that matches letter (a single letter, * or ?) to uppercase. If letter is not specified, all characters are changed to uppercase. |
${var,letter} | Changes the first character of var to lowercase if it matches letter (a single letter, * or ?). If letter is not specified, the first character is changed to lowercase. |
${var,,letter} | Changes every character of var that matches letter (a single letter, * or ?) to lowercase. If letter is not specified, all characters are changed to lowercase. |
${var/find/replace} | Value of var with the first match of the pattern find replaced with replace. Use ${var//find/replace} to replace every match. If find begins with '#', the match must be at the beginning of the value. A '%' makes it match at the end. |
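A few of the substitutions from the table, applied to illustrative values. This sketch assumes bash 4 or later (the case-changing forms are not available in older versions); the path and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Defaults and alternatives
unset name
default_used=${name:-guest}        # guest (name is still unset)
: "${name:=guest}"                 # name is now assigned: guest
alt=${name:+present}               # present, because name is set

# Length and substrings
path=/home/user/book/chapter01.tar.gz
length=${#path}                    # 32
from_offset=${path:11}             # book/chapter01.tar.gz
slice=${path:6:4}                  # user

# Prefix and suffix removal (glob patterns, not regular expressions)
file=${path##*/}                   # chapter01.tar.gz  (longest prefix)
rest=${path#*/}                    # home/user/book/chapter01.tar.gz  (shortest prefix)
dir=${path%/*}                     # /home/user/book   (shortest suffix)
one_ext_off=${file%.*}             # chapter01.tar     (shortest suffix)
all_ext_off=${file%%.*}            # chapter01         (longest suffix)

# Case changes
word=bash
cap=${word^}                       # Bash
upper=${word^^}                    # BASH
lower=${upper,,}                   # bash

# Search and replace
first_replaced=${path/o/0}         # /h0me/user/book/chapter01.tar.gz
all_replaced=${path//o/0}          # /h0me/user/b00k/chapter01.tar.gz
suffix_replaced=${file/%.gz/.xz}   # chapter01.tar.xz

printf '%s\n' "$default_used" "$name" "$alt" "$length" "$from_offset" "$slice" \
    "$file" "$rest" "$dir" "$one_ext_off" "$all_ext_off" \
    "$cap" "$upper" "$lower" "$first_replaced" "$all_replaced" "$suffix_replaced"
```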
Escaping
You can escape:
- special characters using the backslash (\). To escape the backslash character, use double backslashes (\\).
- literal text strings by wrapping them in single quotation marks (' '). Bash will not perform any expansions or substitutions. The single-quoted string should not have any more single-quotation marks. Bash will not perform any backslash-escaping either.
- literal text strings by wrapping them in double-quotation marks (" ") but allowing for
  - $-prefixed variables, expansions and substitutions
  - backslash-escaped characters
  - backquoted (` `) command strings
  - history-expansion character
a=World; echo "Hello $a"
a=World; echo 'Hello $a'
a=World; echo "Hello '$a'"
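The same three commands, with their results captured in variables, plus a few related cases: an escaped $, a nested command substitution inside double quotes, and one common way to get an apostrophe into a single-quoted string. This is a sketch for illustration; the variable names are made up.

```shell
#!/usr/bin/env bash
a=World

double="Hello $a"          # variable is expanded:        Hello World
single='Hello $a'          # everything is literal:       Hello $a
mixed="Hello '$a'"         # inner single quotes are just characters: Hello 'World'
escaped="Cost: \$5"        # backslash keeps the $ literal: Cost: $5

# $(...) works inside double quotes and nests without the extra
# backslashes that nested backquotes would need.
nested="$(basename "$(dirname /usr/share/doc)")"    # share

# A single quote cannot appear inside '...': close the string,
# add an escaped quote, and open the string again.
with_apostrophe='It'\''s bash'

printf '%s\n' "$double" "$single" "$mixed" "$escaped" "$nested" "$with_apostrophe"
```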
Printer's Error
In several places, the Bash Reference Manual (or maybe even this article) uses wrong characters for quotation marks. The apostrophe (U+0027) used in single-quoted strings can get replaced with the right single quotation mark (U+2019). The grave accent (U+0060) used in backquoted strings may be replaced with the left single quotation mark (U+2018). The quotation mark (U+0022) used in double-quoted strings may also be replaced with left and right double quotation marks. They look similar but will result in an error if used in a shell script or on the command line.
I write my books and articles in CommonMark (a standardized Markdown dialect about which I got to write the first book ever) and output them as HTML, ODT, EPUB and PDF documents. Such documents will not have these quotation mark errors. When someone edits the document (before it goes to print) in a rich-text editor or page layout software such as LibreOffice Writer, Microsoft Word or Adobe InDesign, that program's autocorrect feature will replace ordinary quotation marks and backquotes with typographic (curly) quotation marks. Just be aware that this can happen. To avoid mistakes, type the commands by hand. Do not copy-paste them.
Summary
I am sure you will also conclude that bash code can be very cryptic. A lot of production code (industrial-strength shell scripts) is hundreds of lines long. If bash were not so succinct and powerful, it would take forever to write those lines. If you are doing any kind of serious shell scripting, then it is best if you know all about bash's myriad secrets. I think I have covered enough of them to kindle your interest. You are on your own now.
Notes
- This article was originally published in the Open Source For You magazine in 2022. I have re-posted it on CodeProject in 2023.
- This article has been sourced from my book Linux Command-Line Tips & Tricks. It is available for free in many ebook stores.
- My book on the standardized Markdown dialect, the CommonMark Ready Reference, is also free in many ebook stores. If you write a coding-related article or book (hopefully using CommonMark), add a note to your editor/typesetter/publisher to first disable the autocorrect feature in their software.
History
- 6th March, 2023: Initial version