Shell Strings and Arrays


  • Description: Quoting rules, parameter expansion (length, substring, pattern removal, replacement, defaults, case), and indexed + associative arrays
  • My Notion Note ID: K2A-E-3
  • Created: 2020-06-03
  • Updated: 2026-05-18
  • License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io

1. Quoting Rules

Form               | Expands $var? | Expands `cmd` / $(cmd)? | Backslash escapes?                  | Preserves literal '?
'literal' (single) | no            | no                      | no                                  | no (must close the quote first)
"literal" (double) | yes           | yes                     | only before $, `, ", \, and newline | yes
\c (backslash)     | n/a           | n/a                     | escapes a single char               | n/a
name=Alice
echo 'hello $name'    # hello $name      (literal)
echo "hello $name"    # hello Alice      (expanded)
echo "path: \$HOME"   # path: $HOME      (escaped)
echo "$(date)"        # captures stdout of `date`
  • Quote everything you expand. rm $file is broken if $file contains a space — it becomes two args. rm "$file" is correct.
  • A single quote cannot appear inside single quotes; close, escape, reopen: 'it'\''s' yields it's.
  • Backslash at end of line continues the line (line continuation), useful for long commands.
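The quoting rules above can be checked with a small sketch. The helper count_args and the sample filename are made up for illustration; the helper only reports how many arguments it received.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

file="my report.txt"            # sample value containing a space

unquoted=$(count_args $file)    # unquoted: split into "my" and "report.txt"
quoted=$(count_args "$file")    # quoted: stays one argument

echo "unquoted: $unquoted"      # unquoted: 2
echo "quoted:   $quoted"        # quoted:   1

# Close, escape, reopen to place a single quote in a single-quoted string.
msg='it'\''s'
echo "$msg"                     # it's
```

The unquoted call is deliberately wrong; it is there only to show the split.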

2. Concatenation and $'...'

greet="hello, "
who="world"
msg="$greet$who"             # concatenate by adjacency
msg2="${greet}${who}"        # same, brace form

path=/usr/local
echo "$path/bin"             # /usr/local/bin

tabbed=$'col1\tcol2\tcol3'   # ANSI-C quoting — \t, \n, \xHH, \uXXXX work
echo "$tabbed"
  • No + operator for strings — string concatenation is just placing values next to each other.
  • $'...' (ANSI-C quoting, bash) works like single quotes but processes backslash escapes (\n, \t, \nnn octal, \xHH, and \uXXXX in bash 4.2+). It is the cleanest way to write control characters such as tab or newline directly in a string literal.
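A minimal sketch of the difference between plain single quotes and $'...', assuming bash. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
plain='a\tb'        # single quotes: backslash and t stay two literal characters
ansi=$'a\tb'        # ANSI-C quoting: \t becomes one tab character

echo "${#plain}"    # 4
echo "${#ansi}"     # 3

# Split a tab-separated record; IFS is changed only for this one read command.
IFS=$'\t' read -r first second <<< "$ansi"
echo "$first / $second"    # a / b
```

Setting IFS as a prefix to read keeps the change from leaking into the rest of the script.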

3. Length and Substring

s="Hello, World!"

echo ${#s}                # 13 — length in characters
echo ${s:7}               # "World!"     — from offset 7 to end
echo ${s:7:5}             # "World"      — offset 7, length 5
echo ${s: -6}             # "World!"     — negative offset (note space!)
echo ${s: -6:5}           # "World"
  • ${#var} — length.
  • ${var:offset[:length]} — substring. Offset is 0-based; length is optional.
  • Pitfall: a negative offset needs a space (or parens): ${s: -6} or ${s:(-6)}. Without the space, ${s:-6} is the default-value expansion (§6) and means "if s is unset or empty, use 6".
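The negative-offset pitfall is easy to see side by side. This is a small sketch; the variable names tail_ok, tail_bad, and fallback are chosen here for illustration.

```shell
#!/usr/bin/env bash
s="Hello, World!"
empty=""

tail_ok="${s: -6}"        # space before the minus: the last six characters
tail_bad="${s:-6}"        # no space: default-value expansion; s is set, so this is $s
fallback="${empty:-6}"    # empty is empty, so the default "6" is used

echo "$tail_ok"     # World!
echo "$tail_bad"    # Hello, World!
echo "$fallback"    # 6
```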

4. Pattern Removal

Glob-pattern prefix/suffix stripping. *, ?, [abc] are shell globs (not regex).

Syntax      | Meaning
${var#pat}  | Strip shortest match from the front
${var##pat} | Strip longest match from the front
${var%pat}  | Strip shortest match from the back
${var%%pat} | Strip longest match from the back
f="/home/yu/notes/shell-basics.en.md"

echo "${f##*/}"     # shell-basics.en.md   — basename
echo "${f%/*}"      # /home/yu/notes        — dirname
echo "${f##*.}"     # md                    — extension
echo "${f%.*}"      # /home/yu/notes/shell-basics.en   — strip last ext
echo "${f%%.*}"     # /home/yu/notes/shell-basics      — strip all exts
  • Mnemonic: # is on the left of the $ key → strips left. % is on the right → strips right. Doubled = greedy.
  • Useful for pure-shell dirname/basename/extension extraction without forking a subprocess.
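One way to package these expansions is a small function. The sketch below is an illustration, not a drop-in replacement for dirname and basename: split_path and the globals it sets (dir, base, stem, ext) are hypothetical names, and trailing slashes and the root directory are not handled.

```shell
#!/usr/bin/env bash
# split_path is a hypothetical helper; it sets the globals dir, base, stem, ext.
split_path() {
  local p=$1
  base="${p##*/}"                          # strip everything up to the last /
  if [[ $p == */* ]]; then dir="${p%/*}"; else dir="."; fi
  stem="${base%.*}"                        # strip the last extension, if any
  if [[ $base == *.* ]]; then ext="${base##*.}"; else ext=""; fi
}

split_path "/home/yu/notes/shell-basics.en.md"
echo "$dir | $base | $stem | $ext"
# /home/yu/notes | shell-basics.en.md | shell-basics.en | md

split_path "README"
echo "$dir | $base | $stem | $ext"
# . | README | README |
```

The extension is taken from base rather than from the full path, so a dot in a directory name (for example /etc/conf.d/app) is not mistaken for an extension.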

5. Pattern Replacement

s="foo bar foo baz"

echo "${s/foo/qux}"      # qux bar foo baz    — first match only
echo "${s//foo/qux}"     # qux bar qux baz    — all matches (//)
echo "${s/#foo/qux}"     # qux bar foo baz    — anchor to start (#)
echo "${s/%foo/qux}"     # foo bar foo baz    — anchor to end (%) — no match here
echo "${s/%baz/qux}"     # foo bar foo qux
echo "${s/foo/}"         #  bar foo baz       — empty replacement = delete
  • ${var/pat/repl} — first; ${var//pat/repl} — all.
  • ${var/#pat/repl} / ${var/%pat/repl} — anchored to start / end.
  • Patterns are globs, not regex. For regex, use [[ $var =~ regex ]] or pipe through sed.
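To illustrate the glob versus regex distinction, here is a sketch that uses a glob replacement for a simple deletion and [[ =~ ]] with BASH_REMATCH for structured extraction. The sample string and variable names are invented for the example.

```shell
#!/usr/bin/env bash
line="release v2.14.3 (stable)"

# Glob replacement: * matches any run of characters; parentheses are escaped literals.
no_parens="${line// \(*\)/}"
echo "$no_parens"                # release v2.14.3

# Regex match: capture groups land in the BASH_REMATCH array.
re='v([0-9]+)\.([0-9]+)\.([0-9]+)'
if [[ $line =~ $re ]]; then
  major="${BASH_REMATCH[1]}"
  minor="${BASH_REMATCH[2]}"
  patch="${BASH_REMATCH[3]}"
  echo "$major $minor $patch"    # 2 14 3
fi
```

Keeping the regex in a variable and expanding it unquoted on the right side of =~ avoids quoting surprises, since a quoted right-hand side is matched as a literal string.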

6. Default-Value Expansions

For optional values, fallbacks, and required-arg checks:

Syntax          | When var is unset OR empty
${var:-default} | Yields default, doesn't assign
${var:=default} | Yields default AND assigns it to var
${var:?error}   | Prints error to stderr and exits with non-zero status (script); an interactive shell does not exit
${var:+alt}     | Yields nothing; yields alt only when var is set and non-empty (inverse of :-)

Drop the : to test only "unset" (set-but-empty is then treated as set). The examples below use the colon forms:

echo "${LOG:-info}"            # info if LOG empty or unset
out="${OUT:=/tmp/result}"      # assign OUT if unset or empty; $out and $OUT both /tmp/result
: "${CONFIG:?CONFIG must be set}"   # guard early in a script
flag=${VERBOSE:+--verbose}     # flag is "--verbose" or empty
  • The leading : in : "${CONFIG:?...}" is the null command — runs nothing but triggers the expansion (and its error).
  • These are the standard idioms for "config defaults" and "required env vars" in shell scripts.
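The sketch below puts the colon and non-colon forms next to each other. All variable names (APP_PORT, APP_MODE, APP_DEBUG, APP_TOKEN, APP_MISSING) are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical configuration variables, set up so each case is visible.
unset APP_PORT APP_DEBUG APP_MISSING
APP_MODE=""                               # set but empty
APP_TOKEN="secret"

port="${APP_PORT:-8080}"                  # unset, so the fallback 8080 is used
mode_colon="${APP_MODE:-dev}"             # with colon: empty counts as missing
mode_plain="${APP_MODE-dev}"              # without colon: empty counts as set
debug_flag=${APP_DEBUG:+--debug}          # empty unless APP_DEBUG is non-empty
: "${APP_TOKEN:?APP_TOKEN must be set}"   # passes here because APP_TOKEN is set

# Run a failing guard in a subshell so this script itself keeps going.
( : "${APP_MISSING:?APP_MISSING must be set}" ) 2>/dev/null
missing_status=$?

echo "port=$port mode_colon=$mode_colon mode_plain='$mode_plain' debug='$debug_flag'"
# port=8080 mode_colon=dev mode_plain='' debug=''
```

Here missing_status ends up non-zero, because the guard aborts the subshell it runs in.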

7. Case Modification

s="hello World"

echo "${s^}"      # "Hello World"  — uppercase first char
echo "${s^^}"     # "HELLO WORLD"  — uppercase all
echo "${s,}"      # "hello World"  — lowercase first char (already lowercase here)
echo "${s,,}"     # "hello world"  — lowercase all
echo "${s~~}"     # "HELLO wORLD"  — toggle each char

# With a pattern: only modify matching chars
echo "${s^^[aeiou]}"   # "hEllO WOrld" — uppercase vowels
  • Bash 4+. macOS's system bash is still 3.2 → these don't work; install via Homebrew (brew install bash).
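Where bash 4 is not available, a common fallback is tr. The sketch below swaps in tr for the case-modification operators; it forks an external process, so it is slower in tight loops, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
s="hello World"

# tr is an external POSIX tool, so this also works under bash 3.2.
upper=$(printf '%s' "$s" | tr '[:lower:]' '[:upper:]')
lower=$(printf '%s' "$s" | tr '[:upper:]' '[:lower:]')

# Uppercase only the first character by combining tr with substring expansion.
first=$(printf '%s' "${s:0:1}" | tr '[:lower:]' '[:upper:]')
capitalized="$first${s:1}"

echo "$upper"          # HELLO WORLD
echo "$lower"          # hello world
echo "$capitalized"    # Hello World
```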

8. Indexed Arrays

arr=(apple banana cherry)         # declare + init
arr[3]=date                       # set by index (sparse OK)

echo "${arr[0]}"                  # apple
echo "${arr[@]}"                  # apple banana cherry date — all elements
echo "${#arr[@]}"                 # 4 — length
echo "${!arr[@]}"                 # 0 1 2 3 — indices

arr+=(elderberry)                 # append
arr[10]=fig                       # sparse: indices 5..9 are skipped
echo "${!arr[@]}"                 # 0 1 2 3 4 10

# Slicing
echo "${arr[@]:1:2}"              # banana cherry

unset 'arr[2]'                    # remove one element (creates a hole)

# Iterate: always quote and use [@]
for item in "${arr[@]}"; do
  echo "$item"
done

unset arr                         # remove the whole array
  • ${arr[@]} vs ${arr[*]}: same word-splitting rule as $@ vs $*. Inside double quotes, "${arr[@]}" gives one word per element; "${arr[*]}" joins everything into one word, separated by the first character of IFS (a space by default).
  • Arrays are indexed from 0. ${arr} without subscript means ${arr[0]} — easy to forget.
  • Pitfall: quoting matters. for x in ${arr[@]} (no quotes) splits each element on IFS whitespace and then glob-expands the pieces. for x in "${arr[@]}" is correct.
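The three expansion forms can be compared by counting the words each one produces. In this sketch count_args is a hypothetical helper and the file names are sample data.

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }              # hypothetical helper: number of arguments

files=("my report.txt" "notes.md")

n_at=$(count_args "${files[@]}")         # one word per element
n_star=$(count_args "${files[*]}")       # all elements joined into one word
n_bare=$(count_args ${files[@]})         # unquoted: "my report.txt" is split in two

echo "$n_at $n_star $n_bare"             # 2 1 3

# "${files[*]}" joins with the first character of IFS; the subshell keeps IFS local.
csv=$(IFS=,; echo "${files[*]}")
echo "$csv"                              # my report.txt,notes.md
```

Joining through "${files[*]}" with a temporary IFS is a compact way to build a delimited string, as long as the delimiter is a single character.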

9. Associative Arrays

Bash 4+. Hash-map / dict semantics with string keys.

declare -A user                   # MUST declare with -A first
user[name]=alice
user[role]=admin
user[email]=alice@example.com     # placeholder address

echo "${user[name]}"              # alice
echo "${user[@]}"                 # values: alice admin alice@example.com
echo "${!user[@]}"                # keys:   name role email
echo "${#user[@]}"                # count:  3

# Iterate keys → values
for key in "${!user[@]}"; do
  echo "$key=${user[$key]}"
done

unset 'user[email]'               # remove one entry
[[ -v user[name] ]] && echo "name is set"   # test key existence (bash 4.3+)
  • Must declare -A before assigning. Without it, user[name]=alice treats user as an indexed array, and name (an unset variable) evaluates to 0 → everything ends up at index 0.
  • Iteration order is unspecified (bash uses internal hash order). Don't rely on it.
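Both points above can be demonstrated in a few lines. This is a sketch under stated assumptions: the sample data is invented, the keys contain no newlines, and mapfile requires bash 4+.

```shell
#!/usr/bin/env bash
declare -A user=([name]=alice [role]=admin [team]=infra)   # sample data

# Sort the keys first when a stable order matters.
mapfile -t keys < <(printf '%s\n' "${!user[@]}" | sort)

report=""
for key in "${keys[@]}"; do
  report+="$key=${user[$key]};"
done
echo "$report"                     # name=alice;role=admin;team=infra;

# Without declare -A, subscripts are evaluated arithmetically: both land on index 0.
unset name role                    # make sure the subscript names are unset variables
plain=()
plain[name]=alice
plain[role]=admin
echo "${#plain[@]} ${plain[0]}"    # 1 admin
```

Sorting costs one extra process, which is usually acceptable for the small maps shell scripts tend to hold.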

10. References