Convert Tabs to Spaces or Spaces to Tabs
Many operating systems have utilities to convert tabs to spaces and vice versa, so why did I write my own?
The simple reason is that when I first wrote it, I did not know about the Unix "expand" utility.
There are two more complicated reasons.
- Expand did not work exactly the way I wanted.
- I wanted the same utility to convert tabs to spaces and to do the opposite, convert spaces to tabs.
Also, I wanted a demonstration program for my filter package, documented elsewhere.
ctab.c does not contain the complete program. It uses filter package routines to provide:
- Command line parsing (ParseOptions.c)
- Displaying Help text (showhelp.c)
- Parsing and looping through the filename arguments on the command line (filter.c).
As with all my filters, ctab.c contains only the main routine, a subroutine to process one file, a subroutine to process individual options (without parsing the command line), and the actual help text that will be displayed. This way the filter package can provide all the infrastructure that is shared by all my filter routines.
The program function is documented in the program itself and displayed by entering ctab -h. This is what will be displayed:
Usage: ctab [-hlqdbc [# t#] [d#] [k#] [files...]
Convert tabs to spaces or spaces to tabs, maintaining the same visual
appearance of each page, similar to Unix utility "expand". However, unlike
"expand", this routine will stop converting at the first non-white space
character on a line if the -l option is specified.
Note that converting tabs to spaces is not a reversible operation: because
the number of spaces represented by a single tab character is indeterminate
it is in general impossible to convert a file's tabs to spaces and then
back to tabs and end up with a file identical to the original.
However, just because it is impossible doesn't mean we can't try! The -d#
option attempts to convert strings of two or more spaces back to tabs. As
usual, the tabsize of the input file can be specified with the -# (or -t#)
options, the output tabsize is indicated by the numeric portion of the -d
option.
One difficulty is that the conversion of short runs of spaces is ambiguous.
If the short run of spaces crosses a tab boundary, should some of the
spaces be replaced with a tab character? If, for example, a string
consisted of two characters spanning a tabstop, the default -d action would
convert this to a tab followed by a space.
This "short run deconversion" can be overridden by the -k option. -k# will
not convert strings of # spaces. Note that the default value is -k1, so
that 1 character strings (ending at a tab stop) will be preserved. To
convert even these strings, specify -k or -k0. To keep all strings of one
and two spaces, specify -k2, and so on.
  -d#  (de)convert # or more spaces to tabs (default=8, see above)
  -k#  do not deconvert strings of # spaces (default=1, see above)
  -#   treat tabstops as occurring every # characters (default=8)
       works for converting tabs to spaces and spaces to tabs
  -t#  same as -#, included for historical compatibility
  -b   contract indentlevel 8 to indentlevel 4 (same as -lt4d8)
  -c   expand indentlevel 4 to indentlevel 8 (same as -lt8d4)
  -l   stop conversion at first non-blank or non-tab on each line
  -q   does not convert inside quoted strings
  -h   displays this message and terminates
TabStops = 8, Indent Level = 4, -d4 to convert to Indent Level = 8
TabStops = 8, Indent Level = 8, -t4 -d8 to convert to Indent Level = 4
also specify -l to skip conversion of embedded tabs
The "files" argument indicates the names of files to be converted. If no
files are specified or if the filename is a single '-', standard input is
read.
Converted output is always written to a single standard output stream
(which may be redirected to a file).
On specifying options:
Option letters are case-insensitive, e.g. -h and -H are equivalent.
Options must occur before filename arguments and start with a dash
('-') immediately followed by an option character.
For options that take values (example: -t#), the value may optionally
be separated from the option letter by white space.
(ctab $Revision: 1.4 $ $Date: 2007-02-23 13:17:12-05 $)
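The two conversions described in the help text can be sketched in C. This is a minimal illustration under stated assumptions, not the code in ctab.c: the function names expand_tabs and entab_spaces and their parameters are hypothetical, and the -q option, the -b and -c shortcuts, and the filter package are not modelled. The leading_only flag mimics the behaviour described for -l, and keep mimics the behaviour described for -k#.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: column of the next tab stop after col. */
static size_t next_stop(size_t col, size_t ts)
{
    return (col / ts + 1) * ts;
}

/* Sketch: expand tabs to spaces with tab stops every tabsize columns.
 * If leading_only is nonzero, stop converting at the first character
 * that is neither a space nor a tab (as described for -l).
 * Writes at most outsize-1 characters plus a NUL; returns the length. */
size_t expand_tabs(const char *in, char *out, size_t outsize,
                   int tabsize, int leading_only)
{
    size_t ts = tabsize < 1 ? 1 : (size_t)tabsize;
    size_t col = 0;       /* current column, 0-based */
    size_t n = 0;         /* characters written so far */
    int converting = 1;   /* cleared once -l style conversion stops */

    if (outsize == 0)
        return 0;
    for (; *in != '\0' && n + 1 < outsize; in++) {
        if (*in == '\t' && converting) {
            size_t stop = next_stop(col, ts);
            while (col < stop && n + 1 < outsize) {
                out[n++] = ' ';
                col++;
            }
        } else {
            if (leading_only && *in != ' ' && *in != '\t')
                converting = 0;
            out[n++] = *in;
            col = (*in == '\t') ? next_stop(col, ts) : col + 1;
        }
    }
    out[n] = '\0';
    return n;
}

/* Sketch: convert runs of spaces back to tabs for an output tab size.
 * Runs of keep or fewer spaces are copied unchanged (as described
 * for -k#). Longer runs get one tab per tab stop they reach, followed
 * by any spaces left over past the last stop. */
size_t entab_spaces(const char *in, char *out, size_t outsize,
                    int tabsize, int keep)
{
    size_t ts = tabsize < 1 ? 1 : (size_t)tabsize;
    size_t k = keep < 0 ? 0 : (size_t)keep;
    size_t col = 0, n = 0;

    if (outsize == 0)
        return 0;
    while (*in != '\0') {
        if (*in == ' ') {
            size_t len = 0;
            while (in[len] == ' ')
                len++;
            size_t end = col + len;   /* column just past the run */
            in += len;
            if (len > k) {
                while (next_stop(col, ts) <= end) {
                    if (n + 1 < outsize)
                        out[n++] = '\t';
                    col = next_stop(col, ts);
                }
            }
            while (col < end) {       /* leftover or preserved spaces */
                if (n + 1 < outsize)
                    out[n++] = ' ';
                col++;
            }
        } else {
            if (n + 1 < outsize)
                out[n++] = *in;
            col = (*in == '\t') ? next_stop(col, ts) : col + 1;
            in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

With a tab size of 8 and keep set to 1, calling entab_spaces on "abcdefg  h" (two spaces spanning the tab stop at column 8) yields "abcdefg", a tab, one space, and "h", which matches the "tab followed by a space" case described above. With a single space in the same position the string is left unchanged, and with keep set to 0 that single space becomes a tab.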