Sunday, 14 August 2011

Yet Another Bash Introduction


A shell is a command line user interface. It provides the prompt, e.g.
username@pc87 ~ $
where you type commands. These commands are then executed. The first shell was called sh. Many improved shells have been published since, like ksh, csh, tcsh, zsh, bash, ash, dsh, ... Nowadays the bash is the most commonly used shell on Linux. For complete documentation please refer to its amazingly extensive man page (on my machine it has 5375 (!) lines).

User Interface

The bash uses readline for a comfortable look and feel. readline is a library also used by other programs for keyboard input. It gives you many features over a simple input line. With the arrow keys [Left] and [Right] you can move the cursor back and forth. [Backspace] and [Del] are used to delete the character before or at the cursor, respectively. Use the arrow keys [Up] and [Down] to get command lines you have previously typed (the so-called history).

A very impressive feature of the bash is file name completion. Just type the beginning of a file name and then press the [Tab] key. The file name will be completed. If it is not unique, the shell will only complete the common part and beep. Pressing [Tab] twice gives you a list of all possibilities. This command line completion is very smart and extensible. Try it and you will miss it whenever it is not available.

Another very useful feature is the (reverse) history search with the ^R key combination. Just type [Ctrl]+[R] and then any part of a previously entered command line. The most recent matching command line is displayed. Pressing [Enter] immediately invokes this command again. You can also use the arrow keys to modify the old command line before execution.

To get the last argument (the last word of the previous history entry) for your current command line use [Alt]+[.]. Repeated use moves back through the history list, inserting the last argument of each line in turn.

There are some more useful key combinations:
Key         Function
[Alt]+[F]   move cursor one word forward
[Alt]+[B]   move cursor one word backward
[Ctrl]+[K]  delete until end of input line
[Ctrl]+[U]  delete everything before the cursor
[Ctrl]+[C]  discard the command line and print a new prompt
[Ctrl]+[L]  clear the screen, keeping the command line

Note that some keys are not transmitted properly via some terminal connections and thus don't work.

Pressing [Ctrl]+[C] while a program is running will usually terminate it.

Shell commands I: history, alias, echo

The builtin command history shows all previous command lines. When the shell exits, the history is saved to the file .bash_history; usually only the last 500 lines are kept.

The builtin command alias is used to create command aliases. Use it e.g. for
alias dir='ls -l --color=auto'
if you like the command dir.

The builtin command echo just prints its parameters to stdout. It always prints a newline at the end, which can be suppressed with the -n option. Also have a look at the builtin printf, which is more powerful than echo.
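As a small illustration of the difference, here is a sketch that prints with echo and with printf. The strings and the variable name line are made up for this example.

```bash
# echo appends a newline unless -n is given.
echo -n "no newline here"
echo " <- this echo continues on the same line"

# printf takes a format string and only prints a newline if you ask for one.
printf '%s has %d items\n' "basket" 3

# Pad a string to 6 characters and right-align a number in 4 columns.
line=$(printf '%-6s|%4d|' "abc" 42)
echo "${line}"   # abc   |  42|
```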

Environment Variables

In addition to its command line, every program receives environment variables. Variables can be defined in the bash by
VARIABLE=content
Such a variable is known only to the shell itself and is not inherited by any executed program. To export a variable to all future child processes, use
export VARIABLE
You can combine the assignment and the export statement
export VARIABLE=content
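The effect of export can be observed by asking a child process for the variable. This is only a sketch; the variable name GREETING is made up, and the child process is simply another bash started with -c.

```bash
# Make sure the example variable does not already exist in the environment.
unset GREETING

# A plain assignment is visible only inside the current shell.
GREETING=hello
before=$(bash -c 'echo "${GREETING:-unset}"')
echo "before export: ${before}"   # unset

# After export, child processes inherit the variable.
export GREETING
after=$(bash -c 'echo "${GREETING:-unset}"')
echo "after export: ${after}"     # hello
```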
Important and commonly used variables are
EDITOR your favorite editor, this is used by svn, crontab -e, ...
PAGER your favorite pager, usually less, this is used by man, ...
PATH a list of all directories where executables should be searched
LD_LIBRARY_PATH a list of all directories where libraries should be searched
DISPLAY the display for X programs
HOME the current user's home directory
IFS the Internal Field Separator that is used for word splitting after expansion, only used internally by the bash
USER the currently logged in user
UID the numeric ID of that user
TERM the terminal type you are using


Environment variables can also be used in command lines with the $ prefix. Usage:
echo $PATH
echo ${PATH}
The curly braces { } are optional but recommended. They are obligatory when the variable name is directly followed by characters that could belong to a name, e.g. echo "I'm the ${UID}th user!".
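Here is a short sketch of what goes wrong without the braces. The variable FILE and the suffix _backup are invented for the example.

```bash
FILE=report
unset FILE_backup

# Without braces the shell looks up a variable named FILE_backup, which is unset.
without="$FILE_backup"
# With braces the variable name ends at the closing brace.
with="${FILE}_backup"

echo "without braces: '${without}'"   # ''
echo "with braces: '${with}'"         # 'report_backup'
```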

Most environment variables which are a list (e.g. ${PATH}, ${LD_LIBRARY_PATH}, ...) are a colon separated list (':').
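One possible way to take such a list apart is to let read split it at the colons. This sketch uses a made-up variable SEARCH_DIRS instead of the real ${PATH} and stores the parts in a bash array (dirs), a feature not covered in this introduction.

```bash
# A hypothetical colon separated list in the style of ${PATH}.
SEARCH_DIRS="/usr/local/bin:/usr/bin:/bin"

# IFS is set to ':' only for this one read command.
IFS=: read -r -a dirs <<< "${SEARCH_DIRS}"

for d in "${dirs[@]}" ; do
  echo "directory: ${d}"
done
echo "count: ${#dirs[@]}"   # count: 3
```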

With environment variables you can even do string processing.
${parameter:-word} Use Default Values. If parameter is unset or null, the expansion of word is substituted.
${parameter:offset}
${parameter:offset:length} Substring Expansion. Expands to up to length characters of parameter starting at the character specified by offset.
${#parameter} The length in characters of the value of parameter is substituted.
${parameter#pattern} Delete beginning of parameter matched by pattern
${parameter%pattern} Delete end of parameter matched by pattern
${parameter/pattern/string} Substitute pattern by string in parameter
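The following sketch applies these expansions to a made-up path stored in FILENAME. It also uses the doubled forms ## and %%, which delete the longest match instead of the shortest one.

```bash
FILENAME="/home/user/archive.tar.gz"   # hypothetical example value
unset UNSET_VARIABLE

default="${UNSET_VARIABLE:-fallback}"  # fallback
tail="${FILENAME:6}"                   # user/archive.tar.gz
part="${FILENAME:6:4}"                 # user
length="${#FILENAME}"                  # 25
base="${FILENAME##*/}"                 # archive.tar.gz
stem="${FILENAME%.*}"                  # /home/user/archive.tar
short="${FILENAME%%.*}"                # /home/user/archive
swapped="${FILENAME/user/admin}"       # /home/admin/archive.tar.gz

echo "${default} ${part} ${length} ${base}"
```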


Command line expansion

When you enter a command line it is preprocessed by the bash before the command is executed. All environment variables are substituted by their value. All file name wildcards are expanded. That means that if you type
ls *.txt
the command ls does not get the string '*.txt' but it gets the expanded list of files matching the wildcard, e.g. file1.txt file2.txt file3.txt. Try this with echo *.txt.

Note that the command line is split into words at whitespace (the results of expansions are split using ${IFS}) and the words are then supplied as a list (remember int main(int argc, char* argv[])). Thus, the command
echo This      is a       string with lots     of    spaces
will print
This is a string with lots of spaces
To have a string as a single argument, put it between double quotes, e.g.
echo "This      is a       string with lots     of    spaces"
Double quotes also prevent file name expansion but still allow environment variable substitution. Try this with
echo "${HOME} = *"
To specify a string without environment variable substitution, use single quotes '', e.g.
echo '${HOME} = *'
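To see how many arguments a command really receives, a tiny helper function can count them. This is only a sketch; the function name count_args is invented here.

```bash
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#" ; }

unquoted=$(count_args This      is a       string)
quoted=$(count_args "This      is a       string")
echo "unquoted: ${unquoted} arguments"   # 4
echo "quoted: ${quoted} argument"        # 1

# Single quotes keep the text literally, double quotes substitute variables.
literal='${HOME} = *'
echo "${literal}"                        # ${HOME} = *
```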

Command execution

When a command is entered, the bash first tries to apply an alias. Then it checks if the command is a builtin. To get a list of all builtins, use
help
With help followed by the name of a builtin you get a short description of that particular command. If the command is not a builtin, the bash searches the ${PATH} for an executable with the name. Note that it does not search the current directory! If you also want the current directory to be searched, add it to the path with export PATH=${PATH}:.
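If you are unsure how a name will be resolved, the builtin type (not mentioned above, see help type) can tell you. A small sketch:

```bash
# type -t prints a single word describing how bash would interpret the name.
kind_echo=$(type -t echo)
kind_if=$(type -t if)
kind_ls=$(type -t ls)

echo "echo is a ${kind_echo}"   # builtin
echo "if is a ${kind_if}"       # keyword
# For an external program found via ${PATH} this is normally "file",
# unless an alias or function with the same name has been defined.
echo "ls is a ${kind_ls}"
```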

Every program returns an 8 bit exit code (the return value of its int main() routine). Contrary to C, a value of 0 means true (no error) and any other value means false (an error has occurred). The latest exit code is always available in the special variable $?.
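A quick sketch to watch exit codes, using the standard commands true and false. The last part starts a second bash that exits with 300 to show that only 8 bits survive; the variable names are made up.

```bash
true
status_true=$?
echo "true returned ${status_true}"     # 0

false
status_false=$?
echo "false returned ${status_false}"   # 1

# Only the lowest 8 bits are kept: 300 mod 256 = 44.
bash -c 'exit 300'
wrapped=$?
echo "exit 300 arrives as ${wrapped}"   # 44
```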

Control Operators: ; && || ( )

To execute more than one command in a single command line, you can use the control operators. Use the ';' operator to execute the commands one after another. If you want the following command to depend on the exit code of the previous one, use && or ||. If the exit code is 0 (i.e. no error), && will execute the following command, otherwise it stops. If you want the following command to be executed on a non-zero exit code (i.e. on an error), use ||.
grep -qs pattern filename && echo "The file contains the pattern!"
grep -qs pattern filename || echo "The file doesn't contain the pattern!"
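Here is a self-contained sketch of the same idea that creates its own test file with mktemp, so it can be run anywhere. The search words and variable names are made up.

```bash
# Create a temporary file with known content.
tmpfile=$(mktemp)
echo "needle in a haystack" > "${tmpfile}"

found="no"
missing="no"
# grep exits with 0 here, so the command after && runs.
grep -qs needle "${tmpfile}" && found="yes"
# grep exits with 1 here, so the command after || runs.
grep -qs thread "${tmpfile}" || missing="yes"

echo "found=${found} missing=${missing}"   # found=yes missing=yes
rm -f "${tmpfile}"
```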
You can execute a sub-shell by putting a list of commands in parentheses. This is especially useful in shell scripts:
(
  echo "First line"
  cat /path/to/file
  grep pattern /path/to/*
  echo "Last line" 
) > outfile
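A sub-shell is a separate process, so changes made inside the parentheses (variables, current directory) do not leak out. A minimal sketch with a made-up variable COUNTER:

```bash
start_dir=$(pwd)
COUNTER=1
(
  # Both changes only affect the sub-shell.
  cd /
  COUNTER=2
  echo "inside: COUNTER=${COUNTER}, directory=$(pwd)"
)
echo "outside: COUNTER=${COUNTER}, directory=$(pwd)"
```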

Processes: &, ^Z, bg, fg, jobs

When a program is executed, it takes over control of the console. That means that all input to the console is directed to the program and the shell is paused. If you want a program to be executed in the background (with detached stdin), append the & operator to its command line.
xclock &
This is only useful for programs without interactivity at the console like most X programs and daemons. Note that the program's output is still printed to the console.

If you forgot to append the & operator, you can suspend any program by pressing [Ctrl]+[Z] at the console. The program is put on hold and the prompt of the shell is displayed. Then type bg to continue execution of the program in the background. To continue execution in the foreground, enter fg. With jobs you get a list of all currently running or suspended processes of the current shell.
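In a script, background jobs work the same way. The following sketch uses sleep as a stand-in for a long running program and two things not described above: the special variable $! (PID of the last background job) and the builtin wait (see help wait).

```bash
# Start a stand-in for a long running program in the background.
sleep 1 &
pid=$!
echo "started background job with PID ${pid}"

# List the jobs of this shell while the job is still running.
jobs

# Block until the job has finished and pick up its exit code.
wait "${pid}"
bg_status=$?
echo "background job finished with exit code ${bg_status}"
```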

Redirection

Every program uses stdin, stdout and stderr for input and output. These file descriptors can be redirected by the bash. To redirect the output of a program (stdout) to a file, use the '>' character
echo abc > filename
after the command and all its parameters. To append to an existing file, use '>>'.
echo abc >> filename
To redirect stderr, use
grep pattern /* 2> filename
To redirect stdin use the '<' character
tr abc ABC < filein > fileout
after the command and all its parameters.

To redirect the output of a program to the input of another program (pipe) use the '|' character:
tr abc ABC < filein | grep X | grep -v y
You can make long chains of commands connected through these pipes. The data is passed from one program to the next.
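The redirections can be tried out safely in a temporary directory. This is a sketch only; the file names out.txt and err.txt and the directory created by mktemp -d are made up.

```bash
workdir=$(mktemp -d)

# '>' creates or overwrites the file, '>>' appends to it.
echo "first line"  >  "${workdir}/out.txt"
echo "second line" >> "${workdir}/out.txt"

# The error message of ls goes to err.txt instead of the console.
ls "${workdir}/does-not-exist" 2> "${workdir}/err.txt"
errors=$(cat "${workdir}/err.txt")

# stdin comes from the file, stdout is piped into grep.
upper=$(tr 'a-z' 'A-Z' < "${workdir}/out.txt" | grep SECOND)
echo "${upper}"   # SECOND LINE

# Count the lines that contain the word "line".
lines=$(grep -c line < "${workdir}/out.txt")
echo "${lines}"   # 2

rm -r "${workdir}"
```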

Command Substitution

The output of a program can even be redirected to the command line of another program with back quotes or the $(command) syntax (the latter is recommended), e.g.
grep pattern $(find . -iname '*.c')
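Command substitution is also handy for storing output in a variable. A short sketch with made-up variable names; it also shows that $( ) can be nested easily, which is awkward with back quotes.

```bash
# Store the output of date in a variable.
today=$(date +%Y-%m-%d)
echo "Today is ${today}"

# Nested substitution: dirname yields /usr/local, basename then yields local.
parent=$(basename "$(dirname /usr/local/bin)")
echo "${parent}"   # local
```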

Mathematics

To calculate (with integers) use the following expression
echo "Next usable user ID is $(( $UID + 1 ))"
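A few more integer operations, as a sketch with made-up variables a and b. Inside $(( )) variable names may be written without the $ prefix.

```bash
a=7
b=2

sum=$(( a + b ))        # 9
quotient=$(( a / b ))   # 3, the fractional part is dropped
remainder=$(( a % b ))  # 1
power=$(( a ** b ))     # 49

echo "${sum} ${quotient} ${remainder} ${power}"
```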

Control Structures: if, case, while, for, function

The bash is even more powerful, similar to a full-featured programming language. Use control structures like if
if [ -f /etc/passwd ] ; then cat /etc/passwd ; else echo "not found" ; fi
It wants a true value (i.e. exit code 0) to execute the first branch. Note the fi (reverse of if) as endif tag.
The command '[ ... ]' is a builtin and is equivalent to the builtin test. Its option '-f' checks whether the given file exists and is a regular file (see help test). It returns an exit code of 0 if the condition is true.
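Some more tests from help test, as a sketch. The variable names are made up; / is used because it exists on every system and is a directory, not a regular file.

```bash
# -d checks for a directory, -f for a regular file.
if [ -d / ] ; then is_dir="yes" ; else is_dir="no" ; fi
if [ -f / ] ; then is_file="yes" ; else is_file="no" ; fi

# = compares strings, -lt compares integers.
name="bash"
if [ "${name}" = "bash" ] ; then same="yes" ; else same="no" ; fi
if [ 3 -lt 10 ] ; then smaller="yes" ; else smaller="no" ; fi

echo "is_dir=${is_dir} is_file=${is_file} same=${same} smaller=${smaller}"
```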

A case looks like this
case ${VAR} in
  pattern1)
    commands
    ;;
  pattern2)
    commands
    ;;
esac
Note the ;; which is obligatory after each branch. The patterns are glob patterns. Note the esac (reverse of case) as the end tag; Unix programmers are lazy but like to have fun.
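A concrete case might look like the following sketch. The function name classify and the file names are invented; the pattern * at the end acts as a default branch.

```bash
# Hypothetical helper: guess the kind of a file from its name.
classify() {
  case "$1" in
    *.txt)
      echo "text"
      ;;
    *.jpg|*.png)
      echo "image"
      ;;
    *)
      echo "unknown"
      ;;
  esac
}

classify notes.txt   # text
classify photo.png   # image
classify Makefile    # unknown
```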

This while loop
while true ; do ls -l filename ; sleep 1 ; done
shows the file (including its size) and then sleeps for 1 second. Then the loop is repeated (infinitely). Press [Ctrl]-[C] to terminate the loop. A loop to utilize 100% CPU looks like this
while : ; do : ; done
':' is equivalent to true and does nothing except returning an exit code of 0.
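A while loop that ends by itself needs a condition that eventually becomes false. A minimal countdown sketch with made-up variable names:

```bash
n=3
collected=""
# The loop runs as long as the test returns exit code 0.
while [ "${n}" -gt 0 ] ; do
  collected="${collected}${n} "
  n=$(( n - 1 ))
done
echo "${collected}"   # 3 2 1
```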

Another loop type is the for loop. Contrary to many popular programming languages, it usually does not count an integer value but iterates through a list.
for i in *.txt ; do echo "I found a file named $i" ; done
Here the *.txt is expanded to the list of all matching files. You can also use command substitution
for i in $(find . -iname '*.txt') ; do echo "I found a file named $i" ; done
If you need a list of numbers, use the seq program
for i in `seq 1 12` ; do echo $i ; done
Note: Never forget the do in a loop, the then in an if and the in in a case.
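As a sketch, here is a for loop that sums up the numbers from seq. For completeness, the second loop uses the arithmetic for syntax that bash offers in addition (see help for); the variable names are made up.

```bash
# Iterate over the list produced by seq.
total=0
for i in $(seq 1 5) ; do
  total=$(( total + i ))
done
echo "${total}"     # 15

# bash specific counting loop with (( )).
squares=""
for (( i = 1 ; i <= 3 ; i++ )) ; do
  squares="${squares}$(( i * i )) "
done
echo "${squares}"   # 1 4 9
```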

Scripts

Shell scripts have the following structure
#!/bin/bash
#
# Description, author, date, ...
#

commands
The first line (the so-called shebang) always has to look like this, because it tells the kernel to execute the script with the interpreter /bin/bash. Comments are introduced by the # character and can start at the beginning of a line or anywhere after a command. All commands work exactly as described above. You can put commands on separate lines instead of separating them by ;.

Within shell scripts you have some more predefined special variables.
$0 the path to the script as it was invoked
$1, $2, ... first, second, ... command line parameter, see also help shift
$# count of command line parameters
$* All command line parameters. When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable.
$@ All command line parameters. When the expansion occurs within double quotes, each parameter expands to a separate word.
$? status of the most recently executed foreground process
$$ PID of the current shell (i.e. script)
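The difference between $* and $@ is easiest to see by counting words. This sketch simulates command line parameters with the builtin set (see help set) instead of calling a real script; the helper count_words and the parameter values are made up.

```bash
# Simulate a script called with three parameters, the first containing a space.
set -- "first arg" second third
echo "number of parameters: $#"   # 3

# Hypothetical helper: prints the number of arguments it was called with.
count_words() { echo "$#" ; }

star=$(count_words "$*")   # 1, everything joined into a single word
at=$(count_words "$@")     # 3, each parameter stays a separate word
bare=$(count_words $*)     # 4, unquoted, so "first arg" is split again
echo "star=${star} at=${at} bare=${bare}"

# shift drops $1 and moves all other parameters one position down.
shift
first_after_shift="$1"
echo "${first_after_shift}"   # second
```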
