§ Text filtering commands
Administering systems from scripts and the command line often benefits from pipelines of text filtering commands. Here are some that are easy to overlook or forget.
mmencode converts to and from the base64 and "quoted-printable" formats used in email; look for it in the metamail package. Unfortunately, it has become hard to find. Alternatively, uuencode -m converts to base64, and uudecode -m converts from base64. You can also decode and encode quoted-printable with perl:
perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::decode($_);'
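Where mmencode is unavailable, the base64 tool from GNU coreutils (an assumption: it is present on most modern Linux systems) covers the base64 half of the job:

```shell
# Encode and decode with coreutils' base64, a common stand-in for
# mmencode/uuencode -m on systems where those are missing.
encoded=$(printf 'hello, world' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```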
uniq lets you remove duplicated lines from a sorted file. Count how many times each line occurs with
sort | uniq -c | sort -n
Split input into one word per line with
perl -pe 's/\s+/\n/g'
and rejoin lines into a single line of words with
paste -s -d" "
comm lets you suppress lines unique to one or both of two sorted files.
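For example, keeping only the lines common to two files (a sketch using temporary files; comm requires sorted input):

```shell
# -1 and -2 suppress the lines unique to file 1 and file 2,
# leaving only the lines common to both.
f1=$(mktemp); f2=$(mktemp)
printf 'a\nb\nc\n' > "$f1"
printf 'b\nc\nd\n' > "$f2"
common=$(comm -12 "$f1" "$f2")
rm -f "$f1" "$f2"
echo "$common"
```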
cat -s never prints more than one blank line in a row, and you can remove blank lines entirely with
perl -ne 'print if /\S/'
Print the block of lines starting with one containing FOO and ending with one containing BAR:
sed -n '/FOO/,/BAR/p'
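A quick check of the range syntax:

```shell
# Print only the lines from FOO through BAR, inclusive.
printf 'skip\nFOO\nkeep\nBAR\nskip\n' | sed -n '/FOO/,/BAR/p'
```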
merge and diff3 merge changes in files edited from a common ancestor.
fold breaks lines to a given width, and fmt will reformat lines into paragraphs. dirname and basename let you extract the directory and file name from a full path to a file.
namei breaks a pathname into pieces and follows symbolic links.
expand and col -x replace tabs by spaces. col -b removes backspaces from a file. cat -v shows non-printing characters as ASCII escapes.
sed '1,10d' deletes the first 10 lines.
sed -n '3p' and sed -n '3{p;q}' both print the third line, but the latter is more efficient because it quits immediately instead of reading the rest of the file.
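For example:

```shell
# Both forms print the third line; this one quits right after.
third=$(printf 'a\nb\nc\nd\n' | sed -n '3{p;q}')
echo "$third"
```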
sed '/foo/q' truncates a file after the line containing foo. sed -ne '/foo/,/bar/p' prints everything from the line containing foo to the line containing bar. Align text into columns with column -t. Right-justify a prompt with printf:
printf "%40s" "Do you want to delete? [y/N] "
dos2unix and unix2dos convert line endings between DOS and Unix formats and vice versa:
dos2unix file.txt
cat -n and nl both number lines. Substitute text with tr or sed; the latter allows more general regular expressions:
sed -e 's/oldtext/newtext/g'
Here's how to replace opening double quotes by TeX quotes:
< in.tex perl -pne 's%\B"\b%``%g'
Use iconv to convert between character encodings.
grep searches for a pattern in a file or standard input:
grep 'pattern' filename [file] [< file]
awk prints selected fields from each line:
awk '{print $1,$3}'
Convert to lower case with
tr '[A-Z]' '[a-z]'
Character substitutions and deletions may be simplest with tr:
tr -d '\n'   # delete newlines
$ echo 1-2a-3b | tr "[1-9]" "[2-9]" | tr '-' '_' | tr -d 'a'
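Note that when the two sets differ in length the padding behavior varies between tr implementations; with equal-length sets the mapping is unambiguous (a small verified variant of the example above):

```shell
# Map digits up by one, turn dashes into underscores, delete 'a'.
result=$(echo 1-2a-3b | tr '123' '234' | tr '-' '_' | tr -d 'a')
echo "$result"   # 2_3_4b
```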
Read lines into variables with read. Here is a complicated way to cat a text file, piping in and out of a loop:
cat file | while read a; do echo "$a" ; done | cat
paste file1 file2 | while read a b ; do echo "$a $b" ; done
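A self-contained run of the paste loop (temporary files stand in for file1 and file2):

```shell
# Join two files column-wise, then read each line's pair in a loop.
f1=$(mktemp); f2=$(mktemp)
printf 'x\ny\n' > "$f1"
printf '1\n2\n' > "$f2"
joined=$(paste "$f1" "$f2" | while read a b ; do echo "$a=$b" ; done)
rm -f "$f1" "$f2"
echo "$joined"
```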
Sort a dependency list topologically with tsort. Reverse the order of lines with tac and the characters within each line with rev.

§ Files and directories
List only the text files in a directory with
\ls | perl -lne 'print if -T'
The perl heuristic (-T) for detecting text files is very good.
To handle arbitrary filenames, including ones with spaces and dashes, delimit the files with null characters instead of whitespace or newlines:
find -type f -print0 | xargs -r0 ls
Test whether a directory has any contents with
if ls -A "${dir}"/* >& /dev/null ; then echo "has files" ; else echo "no files" ; fi
See if files of a certain type exist:
if ls *.par >& /dev/null ; then echo "has par files" ; fi
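The same test in a throwaway directory, using the portable >/dev/null 2>&1 in place of the bash/csh-style >&:

```shell
# Check whether any *.par files exist in a directory.
dir=$(mktemp -d)
touch "$dir/job.par"
if ls "$dir"/*.par >/dev/null 2>&1 ; then found=yes ; else found=no ; fi
rm -rf "$dir"
echo "$found"
```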
readlink -f will fully resolve what a symbolic link points to.
Find all broken symbolic links with
find . -type l | perl -lne 'print if ! -e'
§ Variables
To test a string against a regular expression, combine if and grep. For example, to see if the name of a file begins with a dot, try
if echo "$filename" | grep '^[.]' >/dev/null
expr also has support for limited regular expressions:
if [ `expr "$filename" : '[.].*'` -ne 0 ]
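expr STRING : REGEX prints the number of characters matched, and the regex is anchored at the start of the string, so the test above succeeds only for dot-files:

```shell
# ".profile" matches '[.].*' completely, so all 8 characters count.
filename=".profile"
matched=$(expr "$filename" : '[.].*')
echo "$matched"
```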
Use read to avoid tokenizing filenames with spaces. Here's how to find all files whose names contain a space and replace the spaces with underscores:
find . -iname '* *' | while read f ; do mv "$f" "${f// /_}" ; done
Do integer arithmetic with expr:
N=`expr "$N" + 3`
For floating-point math, use bc:
echo "scale=10; 4*a(1)" | bc -l   # Get pi to 10 places with arctangent (bc man page)
seq 1 100 generates all integers between 1 and 100. To iterate a loop 100 times, try
for i in `seq 1 100` ; do ... ; done
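For instance, summing the first hundred integers:

```shell
# Gauss's sum via a seq-driven loop.
total=0
for i in $(seq 1 100) ; do
    total=$((total + i))
done
echo "$total"   # 5050
```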
You can set an environment variable for a single command by defining the variable on the same line. The current shell is not affected:
$ x=doggie sh -c 'echo x=$x'
Test whether a string is non-empty with
if [ -n "$string" ] ; then echo "not empty" ; fi
The -n is actually the default for a
string expression, so you can omit it:
if [ "$string" ] ; then echo "not empty" ; fi
Scripts often set default values for environment variables. Many do this with
if [ ! "$VARIABLE" ] ; then VARIABLE="default value" ; fi
A simple alternative is
: ${VARIABLE:="default value"}
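A quick check that the no-op default works and does not clobber an existing value:

```shell
# := assigns the default only when the variable is unset or empty.
unset VARIABLE
: ${VARIABLE:="default value"}
first=$VARIABLE

VARIABLE="already set"
: ${VARIABLE:="default value"}
second=$VARIABLE
```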
The colon at the beginning of the line is
necessary as a no-op that allows its
arguments to be evaluated.
Perhaps you want to accept a variable that is defined as an empty string. If so, omit the colon before the equals sign when setting the default:
: ${VARIABLE="default value"}
To test whether a variable is defined, even if empty, use
if [ "${VARIABLE+x}" ] ; then echo DEFINED ; fi
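The +x expansion distinguishes an empty-but-defined variable from an unset one:

```shell
# ${VAR+x} expands to x when VAR is defined, even if empty.
EMPTY=""
unset NOVAR
[ "${EMPTY+x}" ] && empty_defined=yes || empty_defined=no
[ "${NOVAR+x}" ] && novar_defined=yes || novar_defined=no
```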
List the names of all defined variables beginning with X using echo ${!X*}.

§ Running commands
Use "$@" when passing command-line arguments unaltered to subprocesses. This is equivalent to passing "$1" "$2" ..., but the first version also works properly when there are no arguments:
$ set a 'b c' d
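Continuing that example, a small counter function (the name is illustrative) shows the difference between "$@" and unquoted $@:

```shell
# Count arguments the way a subprocess would see them.
count_args() { echo "$#" ; }
set -- a 'b c' d
quoted=$(count_args "$@")    # spacing preserved: 3 arguments
unquoted=$(count_args $@)    # 'b c' split by whitespace: 4 arguments
```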
Examine your shell settings with set -o, bind -p, shopt -p and stty -a. For example, bash edits the command line in emacs mode by default. Change to vi mode with
set -o vi
In emacs mode, you can edit your current command in your $EDITOR with ctrl-x ctrl-e. In vi mode, use esc-v. See help fc for more.
Repeat the last argument of the previous command with !$, and all of its arguments without the command itself with !*. To ensure a background command outlives the current shell, add extra parentheses like this:
( command & )
Otherwise your current shell, by exiting X or ssh, may terminate all processes that have it as their parent. The extra parentheses start a subshell that exits as soon as the command is spawned in the background, so the background process is reparented to init (process ID 1). This is a command-line version of the "double fork."
Retry a command until it succeeds:
while ! cvs -z 3 -q update -dPA ; do echo -n . ; sleep 60 ; done
Print dots while waiting for a background process to finish (polling a process):
sleep 10 & while ps -p $! >/dev/null; do echo -n . ; sleep 1 ; done ; echo
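The same polling idiom shortened so it finishes quickly, using the kill -0 builtin instead of ps (the timings are illustrative):

```shell
# Poll until a background process exits, counting the checks.
sleep 1 &
pid=$!
checks=0
while kill -0 "$pid" 2>/dev/null ; do
    checks=$((checks + 1))
    sleep 0.2
done
```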
pgrep -f or killall -0 are alternatives to pidof for this purpose.
§ Manipulating paths
Print each directory in your PATH by tokenizing it with the character ':':
IFS=':' ; for dir in $PATH ; do echo $dir ; done
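Wrapping the loop in a subshell keeps the IFS change from leaking into the rest of the script (the path value here is illustrative):

```shell
# Split a PATH-like string on ':' without altering the caller's IFS.
mypath='/usr/local/bin:/usr/bin:/bin'
dirs=$( IFS=':' ; for dir in $mypath ; do echo "$dir" ; done )
echo "$dirs"
```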
Here is a function that finds every version of a command in your PATH:
function checkPath() {
    IFS=':'
    for dir in $PATH ; do
        if [ -x "$dir/$1" ] ; then echo "$dir/$1" ; fi
    done
}
Here is a handy function for adding a directory to PATH without duplicates that I found in /etc/profile:
pathmunge () {
    if ! echo $PATH | grep -E -q "(^|:)$1($|:)" ; then
        if [ "$2" = "after" ] ; then
            PATH=$PATH:$1
        else
            PATH=$1:$PATH
        fi
    fi
}
§ Common script chores
Echo each command before it runs with set -x. Exit a script at the first failed command with set -e. Loop over the command-line arguments with for i in "$@" ; do ... ; done. Check for missing arguments or a request for help like this:
if [ $# -lt 1 -o "$1" = "-h" -o "$1" = "-help" -o "$1" = "--help" ] ; then
Often an error exit is handled most cleanly with a function, for instance:
print_usage_and_exit() {
    echo "usage: $0 [arguments]" >&2
    exit 1
}
Here's how to find the absolute directory containing a script, following symbolic links (taken from the launch script of FindBugs):
program="$0"
Ever try to interrupt your script, then
discover that it killed only one command and
continued to the next? Force a complete exit
by adding the following line early in your
script.
trap "exit 1" 1 2 3 15   # HUP INT QUIT TERM
You can also trap normal and error exits:
# force script to exit when any command fails
trap 'exit 1' ERR
trap 'echo "exiting"' EXIT
Get the process ID of the current shell with $$, of the parent shell with $PPID, and of the most recently backgrounded child process with $!. Interactively, you can see child PIDs with jobs -p.
Here is a yes/no prompt with a default of no. It checks whether the first letter of the reply is y or Y and ignores leading spaces:
echo -n "Do you want to continue? [y/N]: "
read answer
case "$answer" in
    [yY]*) echo "continuing" ;;
    *)     exit 1 ;;
esac
Here's how to read a password without echoing the characters. The trapping ensures that an interrupt does not leave the echoing off:
trap "stty echo ; exit 1" 1 2 3 15
stty -echo
read password
stty echo
Gnome and other frameworks often allow simple
scripting of GUIs:
password=`zenity --entry --text "Enter password:"`
§ File descriptors
Here are common ways to capture the
standard output and standard error
of a single command in a log file:
command >file.log 2>&1
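A quick check that both streams land in the log (a command group stands in for the single command):

```shell
# Send stdout and stderr of a command group to one log file.
log=$(mktemp)
{ echo "to stdout" ; echo "to stderr" >&2 ; } > "$log" 2>&1
lines=$(grep -c '' "$log")   # count the lines captured
content=$(cat "$log")
rm -f "$log"
```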
If a script runs many commands, you can have them all write to the same log file by default:
# save default standard output in file descriptor 10
exec 10>&1
# send subsequent standard output and error to the log
exec >file.log 2>&1
Avoid file descriptor 5, which bash already
uses. (ulimit -n should show many
available file descriptors.)
Test whether a file descriptor is attached to a terminal with test -t:
test -t 1 && echo "Connected to a terminal"
Associate a file descriptor, say 4, with a TCP socket via bash's /dev/tcp:
exec 4<> /dev/tcp/$hostname/$port
and close it with
exec 4>&-
A more portable solution is to use nc.
Listen on a port with
nc -l -p 3535
Connect to a remote host port like
echo 'GET /' | nc hostname 80
An even more general utility is socat,
which also handles Unix sockets.