
Shell Programming Secrets Nobody Talks About (Part 2)

6 Mar 2023 | CPOL | 9 min read

Introduction

This final part of the article series covers some other important behaviour of bash, particularly that relating to text expansions and substitutions.

The famed text-processing abilities of the Bourne shell (sh) have been vastly expanded in the Bourne-Again shell (bash). These abilities are efficient to a fault and the code can be even more cryptic. This makes the bash command interpreter and programming language very powerful but also a minefield of errors.

Before going further, let us review what we learned in the first part of this article:

  • sh and bash are not the same
  • if statements check for exit codes (0 for true and non-zero for false), not boolean values
  • [, true and false are programs, not keywords
  • The presence or absence of spaces in variable comparisons and assignments makes quite a difference
  • [[ is not the fail-safe version of [
  • Arithmetic operations are not as straightforward as in other languages
  • Array operations use cryptic operators
  • By default, bash will not stop for errors

In this part, we will focus on how bash performs text expansions and substitutions. I will only cover what I think are the most important text-processing features. For comprehensive information, you will have to study the Bash Reference Manual. Bash and several commands such as sed and grep also use regular expressions to perform text processing. Regular expressions are a separate topic of their own and I will not cover them here.

History Expansion Character (!)

This feature is available when typing commands at the shell prompt. It is used to access commands stored in the bash history list. In shell scripts, history expansion is turned off by default.

!n Execute nth command in bash history
!! Execute last command (equivalent to !-1)
!leword Execute last command beginning with ‘leword’
!?leword? Execute last command containing ‘leword’
^search^replace Execute last command after replacing first occurrence of ‘search’ with ‘replace’

You can modify the history search using certain word designators, preceded by a colon (:).

!?leword?:0 Execute with 0th word (usually the command executable) in last command containing ‘leword’.
!?leword?:2 Execute with second word of last command containing ‘leword’.
!?leword?:$ Execute with last word in last command containing ‘leword’.
!?leword?:2-6 Execute with second word to sixth word in last command containing ‘leword’.
!?leword?:-6 Execute with all words up to 6th word in last command containing ‘leword’ (Equivalent to !?leword?:0-6)
!?leword?:* Execute with all words of last command (but not the 0th word) containing ‘leword’ (Equivalent to !?leword?:1-$)
!?leword?:2* Execute with the second word to the last word in last command (but not the 0th word) containing ‘leword’ (equivalent to !?leword?:2-$)
!?leword?:2- Execute with all words from the second word to the last-but-one word of the last command containing ‘leword’.

Remember that bash will execute whatever you have retrieved from the history together with whatever you have already typed at the prompt. You can also use any number of modifiers, each preceded by a colon (:).

!?leword?:p Display (but do not execute) last command containing ‘leword’
!?leword?:t Execute what is left of the last command containing ‘leword’ after removing all leading pathname components (i.e., leave only the tail containing the file name)
!?leword?:r Execute with last command containing ‘leword’ after removing the trailing file extension
!?leword?:e Execute what is left of the last command containing ‘leword’ after removing everything but the trailing file extension
!?leword?:s/search/replace Execute last command containing ‘leword’ after replacing the first instance of ‘search’ with ‘replace’
!?leword?:as/search/replace Execute last command containing ‘leword’ after replacing all instances of ‘search’ with ‘replace’

If you omit the search text (‘leword’) and use the history expansion character with only the word designators and the modifiers, bash will use the last command. Until you become proficient in using the history expansion character, use the modifier :p to display the command before you actually execute it.
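History expansion is normally tried out at the prompt, but the word designators can also be demonstrated in a script. The following is only a minimal sketch, assuming bash runs the file non-interactively and that the history and histexpand options are switched on by hand; the variable names second and last are made up for illustration.

```bash
#!/usr/bin/env bash
# History expansion is off in scripts, so enable it for this demonstration.
set -o history -o histexpand

echo alpha beta gamma
second=$(echo !!:2)
last=$(echo !?alpha?:$)

# Switch both options off again so that later lines are left untouched.
set +o histexpand +o history
echo "second word: $second, last word: $last"
```

The two expansion lines sit directly below the command they refer to, with no comment in between, so that the previous history entry is the echo command itself.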

Text Expansions and Substitutions

These features are available at the shell prompt and in shell scripts.

  • Tilde (~): In your commands, bash replaces an unquoted ~ at the start of a word with the value of the environment variable HOME, that is, your home directory.
  • ? and *: These are metacharacters. In file names, ? matches any one character while * matches any number of characters (including none). If they do not match any file names, bash will use their literal values.
  • Brace expansion: You can use comma-separated text strings within curly brackets to generate combinations of strings with their prefixes and/or suffixes. When I start a new book, I create its folders like this:
    mkdir -p NewBook/{ebook/images,html/images,image-sources,isbn,pub,ref}
    This command creates folders like this:
    NewBook
    NewBook/ref
    NewBook/pub
    NewBook/isbn
    NewBook/image-sources
    NewBook/html
    NewBook/html/images
    NewBook/ebook
    NewBook/ebook/images 
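The tilde, metacharacter and brace expansions described above can be tried without leaving any files behind. This is a sketch under a few assumptions: default shell options (nullglob and failglob not set), a scratch directory created with mktemp, and array names chosen only for illustration.

```bash
#!/usr/bin/env bash
# Brace expansion is plain text generation: the names need not exist as files.
dirs=(NewBook/{ebook/images,html/images,isbn})
printf '%s\n' "${dirs[@]}"

# Tilde expands only when it is unquoted.
unquoted=~
quoted="~"

# Globs are matched against real file names, so work in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/long.txt"
one_char=("$workdir"/?.txt)        # a.txt and b.txt
any_chars=("$workdir"/*.txt)       # all three files
unmatched=("$workdir"/*.nomatch)   # no match, so the pattern stays literal
rm -r "$workdir"

echo "${#one_char[@]} ${#any_chars[@]} ${unmatched[0]##*/}"
```

Storing the results in arrays keeps each generated word separate, which makes it easy to count what an expansion produced.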
  • Parameter expansion: When bash executes a script, it creates these special variables for the script.
    Shell variable Use
    $0 Name of the shell script
    $1, $2,… Positional parameters or arguments passed to the script
    $# Total count of arguments passed to the script
    $? Exit status of last command
    $* All arguments; when double-quoted ("$*"), they expand to a single word
    $@ All arguments; when double-quoted ("$@"), each argument expands to a separate word
    $$ Process ID of current shell/script

    At the terminal, $0 will usually expand to the name of the shell program (for example, /bin/bash). There, you can use the set command to assign parameters to the current shell itself.

    Bash
    # Displays 0
    echo $#
    
    # Displays an empty string and causes a new line
    echo $*
    
    # Sets hello and world as parameters to current shell
    set -- hello world
    
    # Displays 2 (the number of parameters)
    echo $#
    
    # Displays hello world
    echo $*
    
    # Remove parameters to current shell
    set --
    
    # Displays 0 (as earlier)
    echo $#

    The option -- (two hyphens) marks the end of options and implies that whatever follows it must be command parameters.
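The difference between $* and $@ only shows up once an argument contains a space. Here is a small sketch of that behaviour, assuming the default value of IFS; count_args is a hypothetical helper that simply reports how many arguments it received.

```bash
#!/usr/bin/env bash
# count_args is a made-up helper: it prints the number of arguments it gets.
count_args() { echo "$#"; }

# Two parameters, the first of which contains a space
set -- "hello world" again

star=$(count_args "$*")   # one word: hello world again
at=$(count_args "$@")     # two words: "hello world" and "again"
bare=$(count_args $*)     # unquoted: split on whitespace into three words

echo "quoted \$*: $star, quoted \$@: $at, unquoted \$*: $bare"
```

In scripts that pass their arguments on to another command, "$@" is usually the form that preserves them unchanged.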

  • Command substitution: Instead of backquotes, you can use the form $(commands) to capture the output of those commands for use in some other commands or variables. It makes quoting, escaping and nesting much easier.
  • Variable substitution: You can use these substitutions with command parameters (created by bash for a shell script) or with variables that you have created. In the first four forms, var2 and msg stand for text (which may itself contain expansions), not for the name of another variable.
    Substitution Effect
    ${var1:-var2} If var1 is null or does not exist, var2 is used
    ${var1:=var2} If var1 is null or does not exist, var2 is used and also assigned to var1
    ${var1:?msg} If var1 is null or does not exist, msg is displayed as an error
    ${var1:+var2} If var1 exists and is not null, var2 is used but not assigned to var1
    ${var:offset} Everything of var after offset number of characters
    ${var:offset:length} length number of characters of var after offset number of characters
    ${!prefix*} ${!prefix@} All variable names beginning with prefix
    ${!var[@]} ${!var[*]} All indexes of array variable var
    ${#var} Length of value of var
    ${var#drop} Value of var without the shortest prefix matching the glob pattern drop
    ${var##drop} Value of var without the longest prefix matching the glob pattern drop
    ${var%drop} Value of var without the shortest suffix matching the glob pattern drop
    ${var%%drop} Value of var without the longest suffix matching the glob pattern drop
    ${var^letter} Changes first character of var to uppercase if it matches letter (a single-character pattern such as a letter, * or ?)
    If letter is not specified, the first character of var will be changed to uppercase whatever it is
    ${var^^letter} Changes every character of var that matches letter to uppercase
    If letter is not specified, all letters of var will be changed to uppercase
    ${var,letter} Changes first character of var to lowercase if it matches letter (a single-character pattern such as a letter, * or ?)
    If letter is not specified, the first character of var will be changed to lowercase whatever it is
    ${var,,letter} Changes every character of var that matches letter to lowercase
    If letter is not specified, all letters of var will be changed to lowercase
    ${var/find/replace} Value of var with the first instance of find replaced with replace. Use ${var//find/replace} to replace all instances. If find begins with '#', then a match is made only at the beginning. A '%' makes it match only at the end.
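A few of the substitutions in the table are easier to follow on a concrete value. The sketch below assumes bash 4 or later (needed for the case-changing forms) and uses a made-up path; the patterns are shell glob patterns, where a single # or % removes the shortest match and a doubled one removes the longest.

```bash
#!/usr/bin/env bash
unset name
default=${name:-guest}       # name does not exist, so the text guest is used
: "${name:=guest}"           # same test, but guest is also assigned to name
present=${name:+yes}         # name now exists and is not null

path=/home/user/book.tar.gz  # hypothetical path, used only as sample text
file=${path##*/}             # longest prefix matching */ removed
dir=${path%/*}               # shortest suffix matching /* removed
base=${file%%.*}             # longest suffix matching .* removed
ext=${file#*.}               # shortest prefix matching *. removed

first3=${file:0:3}           # three characters from offset 0
length=${#file}              # number of characters in file
upper=${base^^}              # every letter to uppercase
capital=${base^}             # first letter to uppercase
swapped=${path/user/guest}   # first match of user replaced
zeroed=${file//o/0}          # every o replaced

printf '%s\n' "$file" "$dir" "$base" "$ext" "$upper" "$capital" "$swapped" "$zeroed"
```

Splitting a path into directory, base name and extension this way avoids starting external programs such as basename or sed for every file.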

Escaping

You can escape:

  • special characters using the backslash (\). To escape the backslash character, use double backslashes (\\).
  • literal text strings by wrapping them in single quotation marks (' '). Bash will not perform any expansions, substitutions or backslash-escaping inside them, so a single-quoted string cannot itself contain a single quotation mark.
  • literal text strings by wrapping them in double-quotation marks (" ") but allowing for
    • $-prefixed variables, expansions and substitutions
    • backslash-escaped characters
    • backquoted (` `) command strings
    • history-expansion character
Bash
# Displays Hello World
a=World; echo "Hello $a"

# Displays Hello $a
a=World; echo 'Hello $a'

# Displays Hello 'World'
a=World; echo "Hello '$a'" 
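Two further cases are worth trying alongside the examples above: escaping a $ inside double quotes, and getting an apostrophe into text that is otherwise single-quoted. This is a sketch of one common idiom rather than the only way to do it; the variable names are illustrative.

```bash
#!/usr/bin/env bash
a=World

# A backslash keeps the $ literal inside double quotes.
escaped="Hello \$a"

# Close the single-quoted string, add an escaped quote, then reopen it.
apostrophe='It'\''s done'

# Command substitution is still performed inside double quotes.
shout="Hello $(printf '%s' "$a" | tr '[:lower:]' '[:upper:]')"

printf '%s\n' "$escaped" "$apostrophe" "$shout"
```

The three lines displayed are Hello $a, It's done and Hello WORLD.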

Printer's Error

In several places, the Bash Reference Manual (or maybe even this article) uses the wrong characters for quotation marks. The apostrophe (U+0027) used in single-quoted strings can get replaced with the right single quotation mark (U+2019). The grave accent (U+0060) used in backquoted strings may be replaced with the left single quotation mark (U+2018). The quotation mark (U+0022) used in double-quoted strings may also be replaced with the left and right double quotation marks (U+201C and U+201D). They look similar but will result in an error if used in a shell script or on the command line.

I write my books and articles in CommonMark (a standardized Markdown dialect about which I got to write the first book ever) and output them as HTML, ODT, EPUB and PDF documents. Such documents will not have these quotation mark errors. When someone edits the document (before it goes to print) in a rich-text editor or page layout software such as LibreOffice Writer, Microsoft Word or Adobe InDesign, that program's autocorrect feature will replace ordinary quotation marks and backquotes with curly quotation marks. Just be aware that this can happen. To avoid mistakes, type the commands by hand. Do not copy-paste them.
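If you do paste commands from a formatted document, you can check them for typographic quotation marks first. The function below is a hypothetical helper, not a standard tool: a minimal sketch that assumes UTF-8 encoded text on standard input and looks for the byte sequences of U+2018, U+2019, U+201C and U+201D.

```bash
#!/usr/bin/env bash
# find_curly_quotes is a made-up helper. It prints every input line that
# contains a curly quotation mark, prefixed with its line number, and
# returns 0 if it found at least one such line.
find_curly_quotes() {
    local line n=0 found=1
    # UTF-8 byte sequences for U+2018, U+2019, U+201C and U+201D
    local lsq=$'\xe2\x80\x98' rsq=$'\xe2\x80\x99'
    local ldq=$'\xe2\x80\x9c' rdq=$'\xe2\x80\x9d'
    while IFS= read -r line || [[ -n $line ]]; do
        n=$((n + 1))
        case $line in
            *"$lsq"*|*"$rsq"*|*"$ldq"*|*"$rdq"*)
                printf '%d: %s\n' "$n" "$line"
                found=0
                ;;
        esac
    done
    return "$found"
}

# Sample text: line 1 uses straight quotes, line 2 uses curly ones.
sample=$'echo \'straight\'\necho \xe2\x80\x98curly\xe2\x80\x99'
report=$(printf '%s\n' "$sample" | find_curly_quotes)
echo "$report"
```

For a real script, the same function could be fed with find_curly_quotes < myscript.sh, where myscript.sh stands for whatever file you want to check.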

Summary

I am sure you will also conclude that bash code can be very cryptic. A lot of production code (industrial-strength shell scripts) is hundreds of lines long. If bash were not so succinct and powerful, it would take forever to write such scripts. If you are doing any kind of serious shell scripting, then it is best if you know all about bash's myriad secrets. I think I have covered enough of them to kindle your interest. You are on your own now.

Notes

  • This article was originally published in the Open Source For You magazine in 2022. I have re-posted it on CodeProject in 2023.
  • This article has been sourced from my book Linux Command-Line Tips & Tricks. It is available for free in many ebook stores.
  • My book on the standardized Markdown dialect, the CommonMark Ready Reference, is also free in many ebook stores. If you write a coding-related article or book (hopefully using CommonMark), add a note to your editor/printsetter/publisher to first disable the autocorrect feature in their software.

History

  • 6th March, 2023: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)