Command-line processing with getopt()

Conforming To

getopt():

getopt() conforms to POSIX.2 and POSIX.1-2001, provided the environment variable POSIXLY_CORRECT is set. Otherwise, the elements of argv are not really const,
because the GNU implementation permutes them. The prototype declares them const only to be compatible with other systems.

The use of '+' and '-' in optstring is a GNU extension.

On some older implementations, getopt() was declared in <stdio.h>. SUSv1 permitted the declaration to appear in either
<unistd.h> or <stdio.h>. POSIX.1-2001 marked the use of <stdio.h> for this purpose as LEGACY. POSIX.1-2001 does not
allow the declaration to appear in <stdio.h>.

getopt_long() and getopt_long_only():
These functions are GNU extensions.

Return Value

If an option was successfully found, then getopt() returns the option character. If all command-line options have been parsed, then
getopt() returns -1. If getopt() encounters an option character that was not in optstring, then '?' is returned. If getopt() encounters
an option with a missing argument, then the return value depends on the first character in optstring: if it is ':', then ':' is returned;
otherwise '?' is returned.

getopt_long() and getopt_long_only() also return the option character when a short option is recognized. For a long option, they return
val if flag is NULL, and 0 otherwise. Error and -1 returns are the same as for getopt(), plus '?' for an ambiguous match or an
extraneous parameter.

In Shell

Shell script programmers commonly want to provide a consistent way of providing options. To achieve this goal, they turn to getopts and seek to port it to their own language.

The first attempt at porting was the program getopt, implemented by Unix System Laboratories (USL). This version was unable to deal with quoting and shell metacharacters, as it makes no attempt at quoting. FreeBSD inherited it.

In 1986, USL decided that being unsafe around metacharacters and whitespace was no longer acceptable, and created the builtin getopts command for the Unix SVR3 Bourne shell instead. The advantage of building the command into the shell is that it has access to the shell's variables, so values can be written safely without quoting. It uses the shell variables OPTIND and OPTARG to track the current position and the current option argument, and returns the option name in a shell variable.

In 1995, getopts was included in the Single UNIX Specification version 1 / X/Open Portability Guidelines Issue 4. Now a part of the POSIX Shell standard, getopts has spread far and wide to many other shells trying to be POSIX-compliant.

getopt was basically forgotten until util-linux came out with an enhanced version that fixed all of old getopt's problems by escaping. It also supports GNU's long option names. On the other hand, long options have rarely been implemented in the getopts command of other shells, ksh93 being an exception.

Usage

For users

The command-line syntax for getopt-based programs is the POSIX-recommended Utility Argument Syntax. In short:

  • Options are single-character alphanumerics preceded by a - (hyphen-minus) character.
  • Options can take an argument, mandatory or optional, or none.
  • When an option takes an argument, this can be in the same token or in the next one. In other words, if o takes an argument, -ofoo is the same as -o foo.
  • Multiple options can be chained together, as long as the non-last ones do not take arguments. If a and b take no arguments while e takes an optional argument, -abe is the same as -a -b -e, but -bea is not the same as -b -e a due to the preceding rule.
  • All options precede non-option arguments (except in the GNU extension). -- always marks the end of options.

Extensions on the syntax include the GNU convention and Sun’s CLIP specification.

For programmers

The getopt manual from GNU specifies such a usage for getopt:

#include <unistd.h>

int getopt(int argc, char * const argv[],
           const char *optstring);

Here argc and argv are defined exactly as in the prototype of the C main function, i.e. argc gives the length of the argv array of strings. The optstring specifies which options to look for (alphanumeric characters, except W) and which of them accept arguments (marked with colons). For example, "vf::o:" refers to three options: an argumentless v, an optional-argument f, and a mandatory-argument o. GNU reserves W for an extension that provides long option synonyms.

getopt itself returns an integer that is either an option character or -1 for end-of-options. The idiom is to use a while-loop to go through options, and to use a switch-case statement to pick and act on options. See the example section of this article.

To communicate extra information back to the program, getopt sets a few global variables that the program can read:

extern char *optarg;
extern int optind, opterr, optopt;
optarg
A pointer to the argument of the current option, if present.
optind
The index in argv at which getopt is currently looking. It can be set to control where to start parsing (again).
opterr
A boolean switch controlling whether getopt should print error messages.
optopt
If an unrecognized option occurs, the value of that unrecognized character.

The GNU extension getopt_long has a similar interface, although it is declared in a different header file and takes two extra arguments: an array of struct option that defines the long options (each with an optional "short" name and some extra controls) and a longindex pointer. If longindex is not NULL, getopt_long stores the index of the matched long option in the variable it points to.

#include <getopt.h>

int getopt_long(int argc, char * const argv[],
           const char *optstring,
           const struct option *longopts, int *longindex);
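
A minimal sketch of getopt_long in use follows. The option names (--verbose, --output, --dry-run), the struct long_settings type and the function parse_long are all hypothetical; the sketch shows both ways a long option can report itself, by returning val or by storing val through the flag pointer.

```c
#include <stddef.h>
#include <unistd.h>
#include <getopt.h>

/* Hypothetical result record; the option names are illustrative only. */
struct long_settings {
    int verbose;
    int dry_run;
    const char *output;
};

/* Returns the index of the first non-option argument, or -1 on error. */
static int parse_long(int argc, char *argv[], struct long_settings *s)
{
    /* flag == NULL: getopt_long() returns val.
     * flag != NULL: *flag is set to val and getopt_long() returns 0. */
    const struct option longopts[] = {
        { "verbose", no_argument,       NULL,        'v' },
        { "output",  required_argument, NULL,        'o' },
        { "dry-run", no_argument,       &s->dry_run, 1   },
        { NULL,      0,                 NULL,        0   }
    };
    int c;

    s->verbose = 0;
    s->dry_run = 0;
    s->output = NULL;
    optind = 1;                          /* start at argv[1] */
    opterr = 0;                          /* no messages from getopt_long */

    while ((c = getopt_long(argc, argv, "vo:", longopts, NULL)) != -1) {
        switch (c) {
        case 0:   break;                 /* --dry-run was stored via flag */
        case 'v': s->verbose = 1; break;
        case 'o': s->output = optarg; break;
        default:  return -1;             /* unknown or malformed option */
        }
    }
    return optind;
}
```

Because "verbose" and "output" share their val with the short options v and o, the same switch cases serve both spellings; only --dry-run has no short form here.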

Examples

Here is a bash script using getopts. The script prints a greeting, with an optional name, a variable number of times. It takes two possible options: -n NAME and -t TIMES.

#!/bin/bash
NAME=""                                        # Name of person to greet.
TIMES=1                                        # Number of greetings to give. 
usage() {                                      # Function: Print a help message.
  echo "Usage: $0 [ -n NAME ] [ -t TIMES ]" 1>&2
}
exit_abnormal() {                              # Function: Exit with error.
  usage
  exit 1
}
while getopts ":n:t:" options; do              # Loop: Get the next option;
                                               # use silent error checking;
                                               # options n and t take arguments.
  case "${options}" in                         # 
    n)                                         # If the option is n,
      NAME=${OPTARG}                           # set $NAME to specified value.
      ;;
    t)                                         # If the option is t,
      TIMES=${OPTARG}                          # Set $TIMES to specified value.
      re_isanum='^[0-9]+$'                     # Regex: match whole numbers only
      if ! [[ $TIMES =~ $re_isanum ]] ; then   # if $TIMES not a whole number:
        echo "Error: TIMES must be a positive, whole number."
        exit_abnormal
      elif [ "$TIMES" -eq 0 ]; then            # If it's zero:
        echo "Error: TIMES must be greater than zero."
        exit_abnormal                          # Exit abnormally.
      fi
      ;;
    :)                                         # If expected argument omitted:
      echo "Error: -${OPTARG} requires an argument."
      exit_abnormal                            # Exit abnormally.
      ;;
    *)                                         # If unknown (any other) option:
      exit_abnormal                            # Exit abnormally.
      ;;
  esac
done
if [ -z "$NAME" ]; then                        # If $NAME is an empty string,
  STRING="Hi!"                                 # our greeting is just "Hi!"
else                                           # Otherwise,
  STRING="Hi, $NAME!"                          # it is "Hi, (name)!"
fi
COUNT=1                                        # A counter.
while [ "$COUNT" -le "$TIMES" ]; do            # While counter is less than
                                               # or equal to $TIMES,
  echo "$STRING"                               # print a greeting,
  COUNT=$((COUNT + 1))                         # then increment the counter.
done
exit 0                                         # Exit normally.

If this script is named greeting, here’s what the output looks like with different options:

./greeting
Hi!
./greeting -n Dave
Hi, Dave!
./greeting -t 3
Hi!
Hi!
Hi!
./greeting -t 4 -n Betty
Hi, Betty!
Hi, Betty!
Hi, Betty!
Hi, Betty!
./greeting -n
Error: -n requires an argument.
Usage: ./greeting [ -n NAME ] [ -t TIMES ]
./greeting -t
Error: -t requires an argument.
Usage: ./greeting [ -n NAME ] [ -t TIMES ]
./greeting -t -1
Error: TIMES must be a positive, whole number.
Usage: ./greeting [ -n NAME ] [ -t TIMES ]
./greeting -t 0
Error: TIMES must be greater than zero.
Usage: ./greeting [ -n NAME ] [ -t TIMES ]

Options

-a, --alternative
Allow long options to start with a single '-'.
-h, --help
Output a small usage guide and exit successfully. No other output is generated.
-l, --longoptions longopts
The long (multi-character) options to be recognized. More than one option name may be specified at once, by separating the names with commas. This option
may be given more than once; the longopts are cumulative. Each long option name in longopts may be followed by one colon to indicate it has a
required argument, and by two colons to indicate it has an optional argument.
-n, --name progname
The name that will be used by the getopt(3) routines when they report errors. Note that errors of getopt(1) are still reported
as coming from getopt.
-o, --options shortopts
The short (one-character) options to be recognized. If this option is not found, the first parameter of getopt that does not start with a '-'
(and is not an option argument) is used as the short options string. Each short option character in shortopts may be followed by one colon to indicate
it has a required argument, and by two colons to indicate it has an optional argument. The first character of shortopts may be '+' or '-' to
influence the way options are parsed and output is generated (see section SCANNING MODES for details).
-q, --quiet
Disable error reporting by getopt(3).
-Q, --quiet-output
Do not generate normal output. Errors are still reported by getopt(3), unless you also use -q.
-s, --shell shell
Set quoting conventions to those of shell. If no -s argument is found, the BASH conventions are used. Valid arguments are currently
'sh', 'bash', 'csh', and 'tcsh'.
-u, --unquoted
Do not quote the output. Note that whitespace and special (shell-dependent) characters can cause havoc in this mode (like they do with other
getopt(1) implementations).
-T, --test
Test if your getopt(1) is this enhanced version or an old version. This generates no output, and sets the error status to 4. Other
implementations of getopt(1), and this version if the environment variable GETOPT_COMPATIBLE is set, will return '--' and error
status 0.
-V, --version
Output version information and exit successfully. No other output is generated.

DESCRIPTION

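getopt is used to break up (parse) options in command lines for easy parsing by shell procedures, and to check for valid options. It uses the GNU getopt(3) routines to do this.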

The parameters getopt is called with can be divided into two parts: options which modify the way getopt will parse (options and -o|--options optstring in the SYNOPSIS), and the parameters which are to be parsed (parameters in the SYNOPSIS). The second part will start at the first non-option parameter that is not an option argument, or after the first occurrence of '--'. If no -o or --options option is found in the first part, the first parameter of the second part is used as the short options string.

If the environment variable GETOPT_COMPATIBLE is set, or if its first parameter is not an option (does not start with a '-'; this is the first format in the SYNOPSIS), getopt will generate output that is compatible with that of other versions of getopt(1). It will still do parameter shuffling and recognize optional arguments (see the Compatibility section for more information).

Traditional implementations of getopt(1) are unable to cope with whitespace and other (shell-specific) special characters in arguments and non-option parameters. To solve this problem, this implementation can generate quoted output which must once again be interpreted by the shell (usually by using the eval command). This has the effect of preserving those characters, but you must call getopt in a way that is no longer compatible with other versions (the second or third format in the SYNOPSIS). To determine whether this enhanced version of getopt(1) is installed, a special test option (-T) can be used.

In other languages

getopt is a concise description of the common POSIX command argument structure, and it is replicated widely by programmers seeking to provide a similar interface, both to themselves and to the user on the command-line.

  • C: non-POSIX systems do not ship getopt in the C library, but gnulib and MinGW (both accept GNU-style), as well as some more minimal libraries, can be used to provide the functionality. Alternative interfaces also exist:
    • The popt library, used by the RPM package manager, has the additional advantage of being reentrant.
    • The argp family of functions in glibc and gnulib provides some more convenience and modularity.
  • D: The D programming language has a getopt module in the standard library.
  • Go: comes with the flag package, which allows long flag names. The getopt package supports processing closer to the C function. There is also another getopt package providing an interface much closer to the original POSIX one.
  • Haskell: comes with System.Console.GetOpt, which is essentially a Haskell port of the GNU getopt library.
  • Java: There is no implementation of getopt in the Java standard library. Several open source modules exist, including gnu.getopt.Getopt, which is ported from GNU getopt, and Apache Commons CLI.
  • Lisp: has many different dialects with no common standard library. There are some third party implementations of getopt for some dialects of Lisp. Common Lisp has a prominent third party implementation.
  • Free Pascal: has its own implementation as one of its standard units named GetOpts. It is supported on all platforms.
  • Perl: has two separate derivatives of getopt in its standard library: Getopt::Long and Getopt::Std.
  • PHP: has a getopt() function.
  • Python: contains a getopt module in its standard library based on C's getopt and GNU extensions. Python's standard library also contains other modules to parse options that are more convenient to use.
  • Ruby: has an implementation of getopt_long in its standard library, GetoptLong. Ruby also has modules in its standard library with a more sophisticated and convenient interface. A third party implementation of the original getopt interface is available.
  • .NET Framework: does not have getopt functionality in its standard library. Third-party implementations are available.

Compatibility

getopt(1)

If the first character of the first parameter of getopt is not a '-', getopt goes into compatibility mode. It will interpret its first parameter as
the string of short options, and all other arguments will be parsed. It will still do parameter shuffling (i.e. all non-option parameters are output at the
end), unless the environment variable POSIXLY_CORRECT is set.

The environment variable GETOPT_COMPATIBLE forces getopt into compatibility mode. Setting both this environment variable and
POSIXLY_CORRECT offers 100% compatibility for 'difficult' programs. Usually, though, neither is needed.

In compatibility mode, leading '-' and '+' characters in the short options string are ignored.

SCANNING MODES

The first character of the short options string may be a '-' or a '+' to indicate a special scanning mode. If the first calling form in the SYNOPSIS is used they are ignored; the environment variable POSIXLY_CORRECT is still examined, though.

If the first character is '+', or if the environment variable POSIXLY_CORRECT is set, parsing stops as soon as the first non-option parameter (i.e. a parameter that does not start with a '-') is found that is not an option argument. The remaining parameters are all interpreted as non-option parameters.

If the first character is a '-', non-option parameters are output at the place where they are found; in normal operation, they are all collected at the end of output after a '--' parameter has been generated. Note that this '--' parameter is still generated, but it will always be the last parameter in this mode.

Flag Syntax

Support is provided for both short (-f) and long (--flag) options. A single
option may have both a short and a long name. Each option may be a flag or a
value. A value takes an argument.

Declaring no long names causes this package to process arguments like the
traditional BSD getopt.

Short flags may be combined into a single parameter. For example, "-a -b -c"
may also be expressed as "-abc". Long flags must stand on their own: "--alpha
--beta".

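Values require an argument. For short options the argument may either
immediately follow the short name or be given as the next argument. Only one short
value may be combined with short flags in a single argument; the short value
must be after all short flags. For example, if f is a flag and v is a value,
then:

    -vvalue    (sets v to "value")
    -v value   (sets v to "value")
    -fvvalue   (sets f, and sets v to "value")
    -fv value  (sets f, and sets v to "value")
    -vf value  (sets v to "f"; value is the first parameter)

For the long value option val:

    --val value  (sets val to "value")
    --val=value  (sets val to "value")
    --valvalue   (invalid option "valvalue")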

Values with an optional value only set the value if the value is part of the
same argument. In any event, the option count is increased and the option is
marked as seen.

There is no convenience function defined for making the value optional. The
SetOptional method must be called on the actual Option.

Parsing continues until the first non-option or "--" is encountered.

The short name "-" can be used, but it is either specified as "-" or as part
of a group of options, for example "-f-". If there are no long options
specified then "--f" could also be used. If "-" is not declared as an option
then the single "-" will also terminate the option processing but, unlike
"--", the "-" will be part of the remaining arguments.

Summary

UNIX users have always controlled commands through arguments, especially for the utilities designed as part of the collection of small tools that makes up the UNIX shell environment. Programs must be able to handle options and arguments quickly, without taking much of the developer's time. After all, most programs are written to carry out their actual task, not merely to process command parameters.

getopt() is a standard library function that lets you walk through the command-line parameters and detect options (with or without accompanying arguments) using a simple while/switch construct. Its sibling getopt_long() handles long options with almost no extra work, which developers appreciate.

Now that you know how easy it is to process command options, you can concentrate on improving your program by adding support for long options and for any other options you had been putting off because of the effort needed to handle them.

Be sure to document somewhere all the options and arguments your program accepts, and provide a built-in help function for forgetful users.
