
What is the general syntax of a Unix shell command?

In particular, why is it that the options to some commands are sometimes preceded by a + sign and sometimes by a - sign?

for example:

sort -f
sort -nr
sort +4n
sort +3nr
Lazer asked Jan 29 '10 05:01



2 Answers

These days, the POSIX notation implemented by getopt() (aka getopt(3)) is the widely used standard, but in the early days, people were experimenting. The + notation for sort is one such relic: sort +4n is the old way of writing a numeric key that starts after the fourth field (the modern spelling is sort -k 5n), and on some machines the sort command no longer supports it. Various other commands (notably ar and tar) accept controls without any prefix character, and dd (alluded to by Alok in a comment) uses another convention altogether.

The GNU convention of using '--' for long options (supported by getopt_long(3)) replaced an earlier GNU convention of using '+'. X11 software, by contrast, uses a single dash before multi-character options. So the whole thing is a collection of historic relics from the time when people were experimenting with how best to handle options.

POSIX documents the Utility Conventions that it works to, except where historical precedent is stronger.


What styles of option handling are there?

[At one time, SO 367309 contained the following material as my answer. It was originally asked 2008-12-15 02:02 by FerranB, but was subsequently closed and deleted.]

How many different types of options do you recognize? I can think of many, including:

  • Single-letter options preceded by single dash, groupable when there is no argument, argument can be attached to option letter or in next argument (many, many Unix commands; most POSIX commands).
  • Single-letter options preceded by single dash, grouping not allowed, arguments must be attached (RCS).
  • Single-letter options preceded by single dash, grouping not allowed, arguments must be separate (pre-POSIX SCCS, IIRC).
  • Multi-letter options preceded by single dash, arguments may be attached or in next argument (X11 programs; also Java and many programs on Mac OS X with a NeXTSTEP heritage).
  • Multi-letter options preceded by single dash, may be abbreviated (Atria ClearCase).
  • Multi-letter options preceded by single plus (obsolete).
  • Multi-letter options preceded by double dash; arguments may follow '=' or be separate (GNU utilities).
  • Options without prefix/suffix, some names have abbreviations or are implied, arguments must be separate. (AmigaOS Shell)
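
As a rough illustration of the first style in the list (single-letter options, groupable, argument attached or separate), here is a minimal sketch using the POSIX shell getopts built-in. The function name parse_demo and the options -a, -b and -o are made up for the example; they do not belong to any real command.

```shell
# parse_demo: hypothetical function. -a and -b are flags; -o takes an argument.
parse_demo() {
    OPTIND=1                      # reset so the function can be called repeatedly
    aflag=0 bflag=0 outfile=
    while getopts "abo:" opt; do
        case "$opt" in
            a) aflag=1 ;;
            b) bflag=1 ;;
            o) outfile=$OPTARG ;;
            *) return 2 ;;        # getopts has already reported the bad option
        esac
    done
    shift $((OPTIND - 1))         # drop the options, keep the operands
    printf '%s\n' "a=$aflag b=$bflag o=$outfile operands=$*"
}

parse_demo -ab -o out.txt file1       # grouped flags, detached argument
parse_demo -a -b -oout.txt file1      # separate flags, attached argument
# both print: a=1 b=1 o=out.txt operands=file1
```

Both spellings give the same result, which is the point of the POSIX convention: the caller chooses the grouping and the attachment, and the parser does not care.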

For options taking an optional argument, sometimes the argument must be attached (co -p1.3 rcsfile.c), sometimes it must follow an '=' sign. POSIX doesn't support optional arguments meaningfully (the POSIX getopt() only allows them for the last option on the command line).

All sensible option systems use an option consisting of double-dash ('--') alone to mean "end of options" — the following arguments are "non-option arguments" (usually file names; POSIX calls them 'operands') even if they start with a dash. (I regard supporting this notation as an imperative. Be aware that if the -- is preceded by an option requiring an argument, the -- will be treated as the argument to the option, not as the 'end of options' marker.)
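
Both behaviours of '--' can be seen in a small sketch with the shell getopts built-in. The function show_operands and its options (-v as a flag, -o taking an argument) are hypothetical, chosen only to show the effect.

```shell
# show_operands: hypothetical function. -v is a flag; -o takes an argument.
show_operands() {
    OPTIND=1
    verbose=0 out=
    while getopts "vo:" opt; do
        case "$opt" in
            v) verbose=1 ;;
            o) out=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    printf '%s\n' "v=$verbose o=$out operands=$*"
}

show_operands -- -v       # prints: v=0 o= operands=-v
show_operands -o -- -v    # prints: v=1 o=-- operands=
```

In the first call, '--' ends the options, so -v is an operand. In the second, '--' is swallowed as the argument to -o, so -v is still parsed as an option.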

Many but not all programs accept single dash as a file name to mean standard input (usually) or standard output (occasionally). Sometimes, as with GNU 'tar', both can be used in a single command line:

... | tar -cf - -T - | ...

The first solo dash means 'write to stdout'; the second means 'read file names from stdin'.
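
If you are writing your own script and want it to honour the solo-dash convention, the check is short. This is only a sketch: count_lines is a hypothetical helper, not an existing utility, and it handles the 'read from stdin' meaning only.

```shell
# count_lines: hypothetical helper. Prints the line count of each operand,
# treating the operand "-" as standard input.
count_lines() {
    for f in "$@"; do
        if [ "$f" = "-" ]; then
            n=$(wc -l | tr -d ' ')          # no file operand: wc reads stdin
        else
            n=$(wc -l < "$f" | tr -d ' ')
        fi
        printf '%s %s\n' "$f" "$n"
    done
}

printf 'one\ntwo\n' | count_lines -         # prints: - 2
```

The tr call is there only because some wc implementations pad the count with spaces.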

Some programs use other conventions — that is, options not preceded by a dash. Many of these are from the oldest days of Unix. For example, 'tar' and 'ar' both accept options without a dash, so:

tar cvzf /tmp/somefile.tgz some/directory

The dd command uses opt=value exclusively:

dd if=/some/file of=/another/file bs=16k count=200
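
A parser for this style needs no getopt-like machinery at all, only a split at the first '='. The sketch below is an assumption about how one might do it in shell; parse_kv is a made-up name and this is not how dd itself is implemented.

```shell
# parse_kv: hypothetical parser for dd-style operands (name=value, no dashes).
parse_kv() {
    for arg in "$@"; do
        case "$arg" in
            *=*)
                key=${arg%%=*}      # text before the first '='
                val=${arg#*=}       # text after the first '='
                printf '%s -> %s\n' "$key" "$val"
                ;;
            *)
                printf 'unrecognized operand: %s\n' "$arg" >&2
                return 2
                ;;
        esac
    done
}

parse_kv if=/some/file of=/another/file bs=16k count=200
# prints:
# if -> /some/file
# of -> /another/file
# bs -> 16k
# count -> 200
```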

Some programs allow you to interleave options and other arguments completely; examples are the C compiler, make, and the GNU utilities when run without POSIXLY_CORRECT in the environment. Many other programs expect the options to precede the other arguments.

Note that git and other VCS commands often use a hybrid system:

git commit -m 'This is why it was committed'

Here, commit is a sub-command given as one of the arguments, and -m is an option to that sub-command. Often, there will also be optional 'global' options that can be specified between the command and the sub-command. There are examples of this in POSIX: the sccs command is in this category, and you can argue that some of the other commands that run other commands are too (nice and xargs spring to mind). Non-POSIX examples include sudo, svn and cvs.
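
The hybrid layout can be sketched in shell by parsing twice: once for the global options, which stops at the sub-command name, and once more inside the sub-command. Everything here (the vcs and vcs_commit functions, the -C global option, the status sub-command) is hypothetical and only loosely modelled on the git example; it is not how git is implemented.

```shell
# vcs: hypothetical front end with one global option (-C DIR) and two
# sub-commands, one of which has its own option.
vcs() {
    OPTIND=1
    dir=.
    while getopts "C:" opt; do          # stops at the first non-option argument
        case "$opt" in
            C) dir=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))

    [ $# -gt 0 ] || { echo "usage: vcs [-C dir] command [args]" >&2; return 2; }
    cmd=$1
    shift

    case "$cmd" in
        commit) vcs_commit "$@" ;;
        status) printf 'status in %s\n' "$dir" ;;
        *) printf 'unknown command: %s\n' "$cmd" >&2; return 2 ;;
    esac
}

# vcs_commit: parses the options that belong to the sub-command only.
vcs_commit() {
    OPTIND=1
    msg=
    while getopts "m:" opt; do
        case "$opt" in
            m) msg=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    printf 'commit in %s: %s\n' "$dir" "$msg"
}

vcs -C /tmp/repo commit -m 'This is why it was committed'
# prints: commit in /tmp/repo: This is why it was committed
```

Because getopts stops at the first non-option argument, the sub-command name acts as a natural boundary between the two sets of options.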


I don't have strong preferences between the different systems. When there are few enough options, then single letters with mnemonic value are convenient. GNU supports this, but recommends backing it up with multi-letter options preceded by a double-dash.

There are some things I do object to. One of the worst is the same option letter being used with different meanings depending on what other option letters have preceded it. In my book, that's a no-no, but I know of software where it is done.

Another objectionable behaviour is inconsistency in style of handling arguments (especially for a single program, but also within a suite of programs). Either require attached arguments or require detached arguments (or allow either), but do not have some options requiring an attached argument and others requiring a detached argument. And be consistent about whether '=' may be used to separate the option and the argument.

As with many, many (software-related) things — consistency is more important than the individual decisions. Using tools that automate and standardize the argument processing helps with consistency.


Whatever you do, please read the Command-Line Options section of TAOUP (The Art of Unix Programming) and consider the GNU Standards for Command Line Interfaces. (Added by J F Sebastian; thanks, I agree.)

Jonathan Leffler answered Oct 04 '22 02:10



It's completely arbitrary; the command may implement all of the option handling in its own special way or it might call out to some other convenience functions. The getopt() family of functions is pretty popular, so most software written even remotely recently follows the conventions set by those routines. There are always exceptions, of course!

Carl Norum answered Oct 04 '22 02:10
