How to run a Python script with parameters

1. Command line and environment

The CPython interpreter scans the command line and the environment for various settings.

CPython implementation detail: Other implementations’ command line schemes may differ. See Alternate Implementations for further resources.

1.1. Command line

When invoking Python, you may specify any of these options:

python [-bBdEhiIOqsSuvVWx?] [-c command | -m module-name | script | - ] [args]

The most common use case is, of course, a simple invocation of a script:

python myscript.py

1.1.1. Interface options

The interpreter interface resembles that of the UNIX shell, but provides some additional methods of invocation:

When called with standard input connected to a tty device, it prompts for commands and executes them until an EOF (an end-of-file character, which you can produce with Ctrl-D on UNIX or Ctrl-Z, Enter on Windows) is read.

When called with a file name argument or with a file as standard input, it reads and executes a script from that file.

When called with a directory name argument, it reads and executes an appropriately named script from that directory.

When called with -c command , it executes the Python statement(s) given as command. Here command may contain multiple statements separated by newlines. Leading whitespace is significant in Python statements!

When called with -m module-name , the given module is located on the Python module path and executed as a script.

In non-interactive mode, the entire input is parsed before it is executed.

An interface option terminates the list of options consumed by the interpreter; all subsequent arguments end up in sys.argv. Note that the first element, subscript zero (sys.argv[0]), is a string reflecting the program’s source.

-c <command>

Execute the Python code in command. command can be one or more statements separated by newlines, with significant leading whitespace as in normal module code.

If this option is given, the first element of sys.argv will be "-c" and the current directory will be added to the start of sys.path (allowing modules in that directory to be imported as top level modules).
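To illustrate the sys.argv behaviour described above, here is a minimal sketch that starts a child interpreter with -c and reads back its argument list. The helper name argv_under_c and the sample arguments are assumptions made for this example only.

```python
import json
import subprocess
import sys


def argv_under_c(*extra_args):
    """Run `python -c ...` in a child interpreter and return its sys.argv."""
    code = "import sys, json; print(json.dumps(sys.argv))"
    result = subprocess.run(
        [sys.executable, "-c", code, *extra_args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# The first element is the literal string "-c"; the rest are the extra arguments.
print(argv_under_c("alpha", "beta"))
```

Everything after the command string is handed to the program rather than being interpreted as an interpreter option.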

Raises an auditing event cpython.run_command with argument command .

-m <module-name>

Search sys.path for the named module and execute its contents as the __main__ module.

Since the argument is a module name, you must not give a file extension ( .py ). The module name should be a valid absolute Python module name, but the implementation may not always enforce this (e.g. it may allow you to use a name that includes a hyphen).

Package names (including namespace packages) are also permitted. When a package name is supplied instead of a normal module, the interpreter will execute <pkg>.__main__ as the main module. This behaviour is deliberately similar to the handling of directories and zipfiles that are passed to the interpreter as the script argument.

This option cannot be used with built-in modules and extension modules written in C, since they do not have Python module files. However, it can still be used for precompiled modules, even if the original source file is not available.

If this option is given, the first element of sys.argv will be the full path to the module file (while the module file is being located, the first element will be set to "-m" ). As with the -c option, the current directory will be added to the start of sys.path .

-I option can be used to run the script in isolated mode where sys.path contains neither the current directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too.

Many standard library modules contain code that is invoked on their execution as a script. An example is the timeit module:
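As a hedged sketch, a timeit run through -m can be driven from Python with subprocess. The -n and -r values and the benchmarked statement are arbitrary choices made here to keep the run short; they are not taken from the original text.

```python
import subprocess
import sys


def run_timeit():
    """Invoke `python -m timeit` in a child interpreter and return its report."""
    cmd = [
        sys.executable, "-m", "timeit",
        "-n", "1",                         # one loop per repetition
        "-r", "1",                         # one repetition
        "-s", "data = list(range(100))",   # setup statement
        "sorted(data)",                    # statement being timed
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


report = run_timeit()
print(report)
```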

Raises an auditing event cpython.run_module with argument module-name .

runpy.run_module(): Equivalent functionality directly available to Python code

PEP 338 – Executing modules as scripts

Changed in version 3.1: Supply the package name to run a __main__ submodule.

Changed in version 3.4: namespace packages are also supported

-

Read commands from standard input (sys.stdin). If standard input is a terminal, -i is implied.

If this option is given, the first element of sys.argv will be "-" and the current directory will be added to the start of sys.path .

Raises an auditing event cpython.run_stdin with no arguments.

<script>

Execute the Python code contained in script, which must be a filesystem path (absolute or relative) referring to either a Python file, a directory containing a __main__.py file, or a zipfile containing a __main__.py file.

If this option is given, the first element of sys.argv will be the script name as given on the command line.

If the script name refers directly to a Python file, the directory containing that file is added to the start of sys.path , and the file is executed as the __main__ module.

If the script name refers to a directory or zipfile, the script name is added to the start of sys.path and the __main__.py file in that location is executed as the __main__ module.
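The directory case can be checked with a small experiment. The sketch below creates a temporary directory containing a __main__.py, passes the directory to a child interpreter, and reports the module name and whether the first sys.path entry points at that directory. The function name is hypothetical, and paths are compared after os.path.realpath because temporary directories are sometimes reached through symbolic links.

```python
import os
import subprocess
import sys
import tempfile


def run_directory_as_script():
    """Execute a directory containing __main__.py and inspect the child's state."""
    with tempfile.TemporaryDirectory() as app_dir:
        main_path = os.path.join(app_dir, "__main__.py")
        with open(main_path, "w", encoding="utf-8") as fh:
            fh.write("import sys\nprint(__name__)\nprint(sys.path[0])\n")
        result = subprocess.run(
            [sys.executable, app_dir],
            capture_output=True,
            text=True,
            check=True,
        )
        module_name, first_entry = result.stdout.splitlines()
        same_dir = os.path.realpath(first_entry) == os.path.realpath(app_dir)
        return module_name, same_dir


print(run_directory_as_script())
```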

-I option can be used to run the script in isolated mode where sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too.

Raises an auditing event cpython.run_file with argument filename .

runpy.run_path(): Equivalent functionality directly available to Python code

If no interface option is given, -i is implied, sys.argv[0] is an empty string ("") and the current directory will be added to the start of sys.path. Also, tab-completion and history editing are automatically enabled, if available on your platform (see Readline configuration).

Changed in version 3.4: Automatic enabling of tab-completion and history editing.

1.1.2. Generic options

-?, -h, --help

Print a short description of all command line options and corresponding environment variables and exit.

--help-env

Print a short description of Python-specific environment variables and exit.

New in version 3.11.

--help-xoptions

Print a description of implementation-specific -X options and exit.

New in version 3.11.

--help-all

Print complete usage information and exit.

New in version 3.11.

-V, --version

Print the Python version number and exit. The output has the form "Python X.Y.Z".

When given twice (-VV), print more information about the build, such as the build date and the compiler used.

New in version 3.6: The -VV option.

1.1.3. Miscellaneous options

-b

Issue a warning when comparing bytes or bytearray with str or bytes with int. Issue an error when the option is given twice (-bb).

Changed in version 3.5: Affects comparisons of bytes with int .
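A short experiment makes the difference between no flag, -b and -bb visible. This is only a sketch: the helper run_with and the compared literals are illustrative assumptions.

```python
import subprocess
import sys

# Comparing bytes with str is always False, but normally silent.
CODE = 'print(b"abc" == "abc")'


def run_with(flag=None):
    """Run CODE in a child interpreter, optionally with -b or -bb."""
    args = [sys.executable]
    if flag:
        args.append(flag)
    args += ["-c", CODE]
    return subprocess.run(args, capture_output=True, text=True)


plain = run_with()        # no warning expected
warned = run_with("-b")   # BytesWarning reported on stderr
strict = run_with("-bb")  # BytesWarning raised as an error

print(plain.returncode, warned.stderr.strip(), strict.returncode)
```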

-B

If given, Python won’t try to write .pyc files on the import of source modules. See also PYTHONDONTWRITEBYTECODE.

--check-hash-based-pycs default|always|never

Control the validation behavior of hash-based .pyc files. See Cached bytecode invalidation. When set to default, checked and unchecked hash-based bytecode cache files are validated according to their default semantics. When set to always, all hash-based .pyc files, whether checked or unchecked, are validated against their corresponding source file. When set to never, hash-based .pyc files are not validated against their corresponding source files.

The semantics of timestamp-based .pyc files are unaffected by this option.

-d

Turn on parser debugging output (for experts only, depending on compilation options). See also PYTHONDEBUG.

-E

Ignore all PYTHON* environment variables, e.g. PYTHONPATH and PYTHONHOME, that might be set.

See also the -P and -I (isolated) options.

-i

When a script is passed as first argument or the -c option is used, enter interactive mode after executing the script or the command, even when sys.stdin does not appear to be a terminal. The PYTHONSTARTUP file is not read.

This can be useful to inspect global variables or a stack trace when a script raises an exception. See also PYTHONINSPECT .

-I

Run Python in isolated mode. This also implies the -E, -P and -s options.

In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.

New in version 3.4.

-O

Remove assert statements and any code conditional on the value of __debug__. Augment the filename for compiled (bytecode) files by adding .opt-1 before the .pyc extension (see PEP 488). See also PYTHONOPTIMIZE.

Changed in version 3.5: Modify .pyc filenames according to PEP 488.

-OO

Do -O and also discard docstrings. Augment the filename for compiled (bytecode) files by adding .opt-2 before the .pyc extension (see PEP 488).

Changed in version 3.5: Modify .pyc filenames according to PEP 488.
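The effect of the two optimization levels can be observed from a child interpreter. In the sketch below, the probe function f and the helper run_optimized are hypothetical names used only for this demonstration.

```python
import subprocess
import sys

# The probe has a docstring, an assert that always fails, and reports __debug__.
CODE = (
    "def f():\n"
    "    'doc'\n"
    "    assert False, 'boom'\n"
    "    return __debug__\n"
    "print(f(), f.__doc__)\n"
)


def run_optimized(flag=None):
    """Run CODE with no flag, -O, or -OO and return the completed process."""
    args = [sys.executable]
    if flag:
        args.append(flag)
    args += ["-c", CODE]
    return subprocess.run(args, capture_output=True, text=True)


default = run_optimized()       # the assert fires, so the child fails
level1 = run_optimized("-O")    # assert removed, docstring kept
level2 = run_optimized("-OO")   # assert removed, docstring discarded

print(default.returncode, level1.stdout.strip(), level2.stdout.strip())
```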

-P

Don’t prepend a potentially unsafe path to sys.path:

python -m module command line: Don’t prepend the current working directory.

python script.py command line: Don’t prepend the script’s directory. If it’s a symbolic link, resolve symbolic links.

python -c code and python (REPL) command lines: Don’t prepend an empty string, which means the current working directory.

See also the PYTHONSAFEPATH environment variable, and -E and -I (isolated) options.

New in version 3.11.

-q

Don’t display the copyright and version messages even in interactive mode.

New in version 3.2.

-R

Turn on hash randomization. This option only has an effect if the PYTHONHASHSEED environment variable is set to 0, since hash randomization is enabled by default.

On previous versions of Python, this option turns on hash randomization, so that the __hash__() values of str and bytes objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python.

Hash randomization is intended to provide protection against a denial-of-service caused by carefully chosen inputs that exploit the worst case performance of a dict construction, O(n^2) complexity. See http://ocert.org/advisories/ocert-2011-003.html for details.

PYTHONHASHSEED allows you to set a fixed value for the hash seed secret.
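As a rough check of the fixed-seed behaviour, the sketch below computes the hash of the same string in two separate child interpreters that share a PYTHONHASHSEED value. The helper name, the seed 123 and the string 'spam' are arbitrary choices for the example.

```python
import os
import subprocess
import sys


def hash_in_child(seed):
    """Return hash('spam') as computed by a child interpreter with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('spam'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


first = hash_in_child("123")
second = hash_in_child("123")
# With the same fixed seed, separate processes agree on the hash value.
print(first == second)
```

With PYTHONHASHSEED unset or set to random, two such runs would normally produce different values.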

Changed in version 3.7: The option is no longer ignored.

New in version 3.2.3.

-s

Don’t add the user site-packages directory to sys.path.

PEP 370 – Per user site-packages directory

-S

Disable the import of the module site and the site-dependent manipulations of sys.path that it entails. Also disable these manipulations if site is explicitly imported later (call site.main() if you want them to be triggered).

-u

Force the stdout and stderr streams to be unbuffered. This option has no effect on the stdin stream.

Changed in version 3.7: The text layer of the stdout and stderr streams now is unbuffered.

-v

Print a message each time a module is initialized, showing the place (filename or built-in module) from which it is loaded. When given twice (-vv), print a message for each file that is checked for when searching for a module. Also provides information on module cleanup at exit.

Changed in version 3.10: The site module reports the site-specific paths and .pth files being processed.

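-W arg

Warning control. Python’s warning machinery by default prints warning messages to sys.stderr.

The simplest settings apply a particular action unconditionally to all warnings emitted by a process (even those that are otherwise ignored by default):

-Wdefault  (warn once per call location)
-Werror    (convert to exceptions)
-Walways   (warn every time)
-Wmodule   (warn once per calling module)
-Wonce     (warn once per Python process)
-Wignore   (never warn)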

The action names can be abbreviated as desired and the interpreter will resolve them to the appropriate action name. For example, -Wi is the same as -Wignore .

The full form of argument is:

action:message:category:module:lineno

Empty fields match all values; trailing empty fields may be omitted. For example -W ignore::DeprecationWarning ignores all DeprecationWarning warnings.

The action field is as explained above but only applies to warnings that match the remaining fields.

The message field must match the whole warning message; this match is case-insensitive.

The category field matches the warning category (e.g. DeprecationWarning). This must be a class name; the match tests whether the actual warning category of the message is a subclass of the specified warning category.

The module field matches the (fully qualified) module name; this match is case-sensitive.

The lineno field matches the line number, where zero matches all line numbers and is thus equivalent to an omitted line number.

Multiple -W options can be given; when a warning matches more than one option, the action for the last matching option is performed. Invalid -W options are ignored (though, a warning message is printed about invalid options when the first warning is issued).

Warnings can also be controlled using the PYTHONWARNINGS environment variable and from within a Python program using the warnings module. For example, the warnings.filterwarnings() function can be used to use a regular expression on the warning message.
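The filter syntax can be tried out with a child interpreter. The following sketch runs the same warning-emitting snippet under two different -W specifications; the helper name run_with_filter and the warning text are assumptions for illustration.

```python
import subprocess
import sys

CODE = (
    "import warnings\n"
    "warnings.warn('old API', DeprecationWarning)\n"
    "print('done')\n"
)


def run_with_filter(spec):
    """Run CODE under `-W spec` and return the completed process."""
    return subprocess.run(
        [sys.executable, "-W", spec, "-c", CODE],
        capture_output=True,
        text=True,
    )


# action "error" with an empty message field and category DeprecationWarning
as_error = run_with_filter("error::DeprecationWarning")
# action "ignore" for the same category
ignored = run_with_filter("ignore::DeprecationWarning")

print(as_error.returncode, ignored.stdout.strip())
```

In the first run the warning is turned into an exception, so the child exits with a non-zero status before reaching the final print.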

-x

Skip the first line of the source, allowing use of non-Unix forms of #!cmd. This is intended for a DOS specific hack only.

-X

Reserved for various implementation-specific options. CPython currently defines the following possible values:

-X faulthandler to enable faulthandler . See also PYTHONFAULTHANDLER .

-X showrefcount to output the total reference count and number of used memory blocks when the program finishes or after each statement in the interactive interpreter. This only works on debug builds .

-X tracemalloc to start tracing Python memory allocations using the tracemalloc module. By default, only the most recent frame is stored in a traceback of a trace. Use -X tracemalloc=NFRAME to start tracing with a traceback limit of NFRAME frames. See tracemalloc.start() and PYTHONTRACEMALLOC for more information.

-X importtime to show how long each import takes. It shows module name, cumulative time (including nested imports) and self time (excluding nested imports). Note that its output may be broken in multi-threaded applications. Typical usage is python3 -X importtime -c 'import asyncio'. See also PYTHONPROFILEIMPORTTIME.

-X dev : enable Python Development Mode , introducing additional runtime checks that are too expensive to be enabled by default.

-X utf8 enables the Python UTF-8 Mode . -X utf8=0 explicitly disables Python UTF-8 Mode (even when it would otherwise activate automatically). See also PYTHONUTF8 .

-X pycache_prefix=PATH enables writing .pyc files to a parallel tree rooted at the given directory instead of to the code tree. See also PYTHONPYCACHEPREFIX .

-X warn_default_encoding issues an EncodingWarning when the locale-specific default encoding is used for opening files. See also PYTHONWARNDEFAULTENCODING.

-X no_debug_ranges disables the inclusion of the tables mapping extra location information (end line, start column offset and end column offset) to every instruction in code objects. This is useful when smaller code objects and pyc files are desired as well as suppressing the extra visual location indicators when the interpreter displays tracebacks. See also PYTHONNODEBUGRANGES .

-X frozen_modules determines whether or not frozen modules are ignored by the import machinery. A value of “on” means they get imported and “off” means they are ignored. The default is “on” if this is an installed Python (the normal case). If it’s under development (running from the source tree) then the default is “off”. Note that the “importlib_bootstrap” and “importlib_bootstrap_external” frozen modules are always used, even if this flag is set to “off”.

It also allows passing arbitrary values and retrieving them through the sys._xoptions dictionary.
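A small sketch of retrieving arbitrary -X values through sys._xoptions follows. The option names demo_flag and demo_level are made up for this example and have no meaning to CPython.

```python
import json
import subprocess
import sys


def xoptions_in_child(*x_args):
    """Start a child interpreter with the given -X values and return sys._xoptions."""
    cmd = [sys.executable]
    for item in x_args:
        cmd += ["-X", item]
    cmd += ["-c", "import sys, json; print(json.dumps(sys._xoptions))"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# A bare name is stored with the value True; name=value is stored as a string.
opts = xoptions_in_child("demo_flag", "demo_level=3")
print(opts)
```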

Changed in version 3.2: The -X option was added.

New in version 3.3: The -X faulthandler option.

New in version 3.4: The -X showrefcount and -X tracemalloc options.

New in version 3.6: The -X showalloccount option.

New in version 3.7: The -X importtime , -X dev and -X utf8 options.

New in version 3.8: The -X pycache_prefix option. The -X dev option now logs close() exceptions in io.IOBase destructor.

Changed in version 3.9: Using -X dev option, check encoding and errors arguments on string encoding and decoding operations.

The -X showalloccount option has been removed.

New in version 3.10: The -X warn_default_encoding option.

Deprecated since version 3.9, removed in version 3.10: The -X oldparser option.

New in version 3.11: The -X no_debug_ranges option.

New in version 3.11: The -X frozen_modules option.

New in version 3.11: The -X int_max_str_digits option.

1.1.4. Options you shouldn’t use

-J

Reserved for use by Jython.

1.2. Environment variables

These environment variables influence Python’s behavior; they are processed before the command-line switches other than -E or -I. It is customary that command-line switches override environment variables where there is a conflict.

PYTHONHOME

Change the location of the standard Python libraries. By default, the libraries are searched in prefix/lib/pythonversion and exec_prefix/lib/pythonversion, where prefix and exec_prefix are installation-dependent directories, both defaulting to /usr/local.

When PYTHONHOME is set to a single directory, its value replaces both prefix and exec_prefix. To specify different values for these, set PYTHONHOME to prefix:exec_prefix.

PYTHONPATH

Augment the default search path for module files. The format is the same as the shell’s PATH: one or more directory pathnames separated by os.pathsep (e.g. colons on Unix or semicolons on Windows). Non-existent directories are silently ignored.

In addition to normal directories, individual PYTHONPATH entries may refer to zipfiles containing pure Python modules (in either source or compiled form). Extension modules cannot be imported from zipfiles.

The default search path is installation dependent, but generally begins with prefix/lib/pythonversion (see PYTHONHOME above). It is always appended to PYTHONPATH.

An additional directory will be inserted in the search path in front of PYTHONPATH as described above under Interface options . The search path can be manipulated from within a Python program as the variable sys.path .
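To see PYTHONPATH at work, the sketch below writes a throwaway module into a temporary directory, puts that directory on PYTHONPATH and imports the module from a child interpreter. The module name demo_helper and its VALUE attribute are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def import_via_pythonpath():
    """Import a module that is only reachable through PYTHONPATH."""
    with tempfile.TemporaryDirectory() as lib_dir:
        module_path = os.path.join(lib_dir, "demo_helper.py")
        with open(module_path, "w", encoding="utf-8") as fh:
            fh.write("VALUE = 42\n")
        # Replaces any inherited PYTHONPATH for the child process only.
        env = dict(os.environ, PYTHONPATH=lib_dir)
        result = subprocess.run(
            [sys.executable, "-c", "import demo_helper; print(demo_helper.VALUE)"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


print(import_via_pythonpath())
```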

PYTHONSAFEPATH

If this is set to a non-empty string, don’t prepend a potentially unsafe path to sys.path: see the -P option for details.

New in version 3.11.

PYTHONPLATLIBDIR

If this is set to a non-empty string, it overrides the sys.platlibdir value.

New in version 3.9.

PYTHONSTARTUP

If this is the name of a readable file, the Python commands in that file are executed before the first prompt is displayed in interactive mode. The file is executed in the same namespace where interactive commands are executed so that objects defined or imported in it can be used without qualification in the interactive session. You can also change the prompts sys.ps1 and sys.ps2 and the hook sys.__interactivehook__ in this file.

Raises an auditing event cpython.run_startup with the filename as the argument when called on startup.

PYTHONOPTIMIZE

If this is set to a non-empty string it is equivalent to specifying the -O option. If set to an integer, it is equivalent to specifying -O multiple times.

PYTHONBREAKPOINT

If this is set, it names a callable using dotted-path notation. The module containing the callable will be imported and then the callable will be run by the default implementation of sys.breakpointhook(), which itself is called by the built-in breakpoint(). If not set, or set to the empty string, it is equivalent to the value "pdb.set_trace". Setting this to the string "0" causes the default implementation of sys.breakpointhook() to do nothing but return immediately.

New in version 3.7.
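The "0" setting is handy for running code that still contains breakpoint() calls without stopping in a debugger. A minimal sketch, with a hypothetical helper name:

```python
import os
import subprocess
import sys


def run_with_breakpoints_disabled():
    """Run code containing breakpoint() with PYTHONBREAKPOINT=0 in a child."""
    env = dict(os.environ, PYTHONBREAKPOINT="0")
    result = subprocess.run(
        [sys.executable, "-c", "breakpoint(); print('continued')"],
        env=env,
        stdin=subprocess.DEVNULL,  # no debugger input is available anyway
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# breakpoint() returns immediately, so execution reaches the print call.
print(run_with_breakpoints_disabled())
```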

PYTHONDEBUG

If this is set to a non-empty string it is equivalent to specifying the -d option. If set to an integer, it is equivalent to specifying -d multiple times.

PYTHONINSPECT

If this is set to a non-empty string it is equivalent to specifying the -i option.

This variable can also be modified by Python code using os.environ to force inspect mode on program termination.

PYTHONUNBUFFERED

If this is set to a non-empty string it is equivalent to specifying the -u option.

PYTHONVERBOSE

If this is set to a non-empty string it is equivalent to specifying the -v option. If set to an integer, it is equivalent to specifying -v multiple times.

PYTHONCASEOK

If this is set, Python ignores case in import statements. This only works on Windows and macOS.

PYTHONDONTWRITEBYTECODE

If this is set to a non-empty string, Python won’t try to write .pyc files on the import of source modules. This is equivalent to specifying the -B option.

PYTHONPYCACHEPREFIX

If this is set, Python will write .pyc files in a mirror directory tree at this path, instead of in __pycache__ directories within the source tree. This is equivalent to specifying the -X pycache_prefix=PATH option.

New in version 3.8.

PYTHONHASHSEED

If this variable is not set or set to random, a random value is used to seed the hashes of str and bytes objects.

If PYTHONHASHSEED is set to an integer value, it is used as a fixed seed for generating the hash() of the types covered by the hash randomization.

Its purpose is to allow repeatable hashing, such as for selftests for the interpreter itself, or to allow a cluster of python processes to share hash values.

The integer must be a decimal number in the range [0,4294967295]. Specifying the value 0 will disable hash randomization.

New in version 3.2.3.

PYTHONINTMAXSTRDIGITS

If this variable is set to an integer, it is used to configure the interpreter’s global integer string conversion length limitation.

New in version 3.11.

PYTHONIOENCODING

If this is set before running the interpreter, it overrides the encoding used for stdin/stdout/stderr, in the syntax encodingname:errorhandler. Both the encodingname and the :errorhandler parts are optional and have the same meaning as in str.encode().

For stderr, the :errorhandler part is ignored; the handler will always be 'backslashreplace'.

Changed in version 3.4: The encodingname part is now optional.

Changed in version 3.6: On Windows, the encoding specified by this variable is ignored for interactive console buffers unless PYTHONLEGACYWINDOWSSTDIO is also specified. Files and pipes redirected through the standard streams are not affected.

PYTHONNOUSERSITE

If this is set, Python won’t add the user site-packages directory to sys.path.

PEP 370 – Per user site-packages directory

PYTHONUSERBASE

Defines the user base directory, which is used to compute the path of the user site-packages directory and Distutils installation paths for python setup.py install --user.

PEP 370 – Per user site-packages directory

PYTHONEXECUTABLE

If this environment variable is set, sys.argv[0] will be set to its value instead of the value obtained through the C runtime. Only works on macOS.

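PYTHONWARNINGS

This is equivalent to the -W option. If set to a comma separated string, it is equivalent to specifying -W multiple times, with filters later in the list taking precedence over those earlier in the list.

The simplest settings apply a particular action unconditionally to all warnings emitted by a process (even those that are otherwise ignored by default):

PYTHONWARNINGS=default  (warn once per call location)
PYTHONWARNINGS=error    (convert to exceptions)
PYTHONWARNINGS=always   (warn every time)
PYTHONWARNINGS=module   (warn once per calling module)
PYTHONWARNINGS=once     (warn once per Python process)
PYTHONWARNINGS=ignore   (never warn)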

PYTHONFAULTHANDLER

If this environment variable is set to a non-empty string, faulthandler.enable() is called at startup: install a handler for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL signals to dump the Python traceback. This is equivalent to the -X faulthandler option.

New in version 3.3.

PYTHONTRACEMALLOC

If this environment variable is set to a non-empty string, start tracing Python memory allocations using the tracemalloc module. The value of the variable is the maximum number of frames stored in a traceback of a trace. For example, PYTHONTRACEMALLOC=1 stores only the most recent frame. See the tracemalloc.start() function for more information. This is equivalent to setting the -X tracemalloc option.

New in version 3.4.

PYTHONPROFILEIMPORTTIME

If this environment variable is set to a non-empty string, Python will show how long each import takes. This is equivalent to setting the -X importtime option.

New in version 3.7.

PYTHONASYNCIODEBUG

If this environment variable is set to a non-empty string, enable the debug mode of the asyncio module.

New in version 3.4.

PYTHONMALLOC

Set the Python memory allocators and/or install debug hooks.

Set the family of memory allocators used by Python:

malloc : use the malloc() function of the C library for all domains ( PYMEM_DOMAIN_RAW , PYMEM_DOMAIN_MEM , PYMEM_DOMAIN_OBJ ).

pymalloc : use the pymalloc allocator for PYMEM_DOMAIN_MEM and PYMEM_DOMAIN_OBJ domains and use the malloc() function for the PYMEM_DOMAIN_RAW domain.

debug : install debug hooks on top of the default memory allocators .

malloc_debug : same as malloc but also install debug hooks.

pymalloc_debug : same as pymalloc but also install debug hooks.

Changed in version 3.7: Added the "default" allocator.

New in version 3.6.

PYTHONMALLOCSTATS

If set to a non-empty string, Python will print statistics of the pymalloc memory allocator every time a new pymalloc object arena is created, and on shutdown.

This variable is ignored if the PYTHONMALLOC environment variable is used to force the malloc() allocator of the C library, or if Python is configured without pymalloc support.

Changed in version 3.6: This variable can now also be used on Python compiled in release mode. It now has no effect if set to an empty string.

PYTHONLEGACYWINDOWSFSENCODING

If set to a non-empty string, the default filesystem encoding and error handler mode will revert to their pre-3.6 values of 'mbcs' and 'replace', respectively. Otherwise, the new defaults 'utf-8' and 'surrogatepass' are used.

This may also be enabled at runtime with sys._enablelegacywindowsfsencoding() .

New in version 3.6: See PEP 529 for more details.

PYTHONLEGACYWINDOWSSTDIO

If set to a non-empty string, Python does not use the new console reader and writer. This means that Unicode characters will be encoded according to the active console code page, rather than using utf-8.

This variable is ignored if the standard streams are redirected (to files or pipes) rather than referring to console buffers.

New in version 3.6.

PYTHONCOERCECLOCALE

If set to the value 0, causes the main Python command line application to skip coercing the legacy ASCII-based C and POSIX locales to a more capable UTF-8 based alternative.

If this variable is not set (or is set to a value other than 0), the LC_ALL locale override environment variable is also not set, and the current locale reported for the LC_CTYPE category is either the default C locale, or else the explicitly ASCII-based POSIX locale, then the Python CLI will attempt to configure the following locales for the LC_CTYPE category in the order listed before loading the interpreter runtime:

C.UTF-8
C.utf8
UTF-8

If setting one of these locale categories succeeds, then the LC_CTYPE environment variable will also be set accordingly in the current process environment before the Python runtime is initialized. This ensures that in addition to being seen by both the interpreter itself and other locale-aware components running in the same process (such as the GNU readline library), the updated setting is also seen in subprocesses (regardless of whether or not those processes are running a Python interpreter), as well as in operations that query the environment rather than the current C locale (such as Python’s own locale.getdefaultlocale() ).

Configuring one of these locales (either explicitly or via the above implicit locale coercion) automatically enables the surrogateescape error handler for sys.stdin and sys.stdout ( sys.stderr continues to use backslashreplace as it does in any other locale). This stream handling behavior can be overridden using PYTHONIOENCODING as usual.

For debugging purposes, setting PYTHONCOERCECLOCALE=warn will cause Python to emit warning messages on stderr if either the locale coercion activates, or else if a locale that would have triggered coercion is still active when the Python runtime is initialized.

Also note that even when locale coercion is disabled, or when it fails to find a suitable target locale, PYTHONUTF8 will still activate by default in legacy ASCII-based locales. Both features must be disabled in order to force the interpreter to use ASCII instead of UTF-8 for system interfaces.

New in version 3.7: See PEP 538 for more details.

PYTHONDEVMODE

If this environment variable is set to a non-empty string, enable Python Development Mode, introducing additional runtime checks that are too expensive to be enabled by default. This is equivalent to setting the -X dev option.

New in version 3.7.

PYTHONUTF8

If set to 1, enable the Python UTF-8 Mode.

If set to 0 , disable the Python UTF-8 Mode .

Setting any other non-empty string causes an error during interpreter initialisation.

New in version 3.7.
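Whether UTF-8 Mode is active can be read from sys.flags.utf8_mode. The sketch below sets PYTHONUTF8 for a child interpreter and reports the flag; the helper name is an assumption for this example.

```python
import os
import subprocess
import sys


def utf8_mode_in_child(setting):
    """Return sys.flags.utf8_mode from a child started with PYTHONUTF8=setting."""
    env = dict(os.environ, PYTHONUTF8=setting)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


enabled = utf8_mode_in_child("1")
disabled = utf8_mode_in_child("0")
print(enabled, disabled)
```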

PYTHONWARNDEFAULTENCODING

If this environment variable is set to a non-empty string, issue an EncodingWarning when the locale-specific default encoding is used.

New in version 3.10.

PYTHONNODEBUGRANGES

If this variable is set, it disables the inclusion of the tables mapping extra location information (end line, start column offset and end column offset) to every instruction in code objects. This is useful when smaller code objects and pyc files are desired as well as suppressing the extra visual location indicators when the interpreter displays tracebacks.

New in version 3.11.

1.2.1. Debug-mode variables

PYTHONTHREADDEBUG

If set, Python will print threading debug info into stdout.

Deprecated since version 3.10, will be removed in version 3.12.

PYTHONDUMPREFS

If set, Python will dump objects and reference counts still alive after shutting down the interpreter.

Needs Python configured with the --with-trace-refs build option.

PYTHONDUMPREFSFILE=FILENAME

If set, Python will dump objects and reference counts still alive after shutting down the interpreter into a file called FILENAME.

Passing arguments to a script (argv)

Very often a script solves some general task. For example, a script processes a configuration file in some way. In that case, of course, you do not want to edit the file name by hand in the script every time.

It is much better to pass the file name as a script argument and then use the file that was specified.

The sys module lets you work with script arguments through argv.

Checking that the script works:

The arguments passed to the script are substituted into the template as values.

A few points need explaining here:

argv is a list

all arguments are stored in the list as strings

argv contains not only the arguments passed to the script but also the name of the script itself
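The three points above can be sketched in a few lines. This is an illustrative example only: the template text, the function name render and the fallback values are hypothetical and are not the original script.

```python
import sys

# Hypothetical template; the two placeholders are filled from the arguments.
TEMPLATE = "Processing file {} for user {}"


def render(argv):
    """Fill TEMPLATE from argv, where argv[0] is the script name."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} FILE USER")
    file_name, user = argv[1], argv[2]  # both values arrive as strings
    return TEMPLATE.format(file_name, user)


if __name__ == "__main__":
    # Fall back to sample values when the script is started without arguments.
    args = sys.argv if len(sys.argv) == 3 else [sys.argv[0], "config.txt", "admin"]
    print(render(args))
```

Called as `python script.py config.txt admin`, the two arguments land in argv[1] and argv[2], while argv[0] holds the script name.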

How to Pass Arguments to a Python Script from the Command Line

Running Python scripts from the command line can be a great way to automate your workflows. To do this, you’ll need to learn how to pass arguments from the command line to a Python script. This will allow you to create reusable scripts that can be updated, or run for new situations or data, just by passing in a couple of new arguments. In Python, getting arguments from the command line into a script is quite easy.

Before you can pass arguments to a script, you’ll need to understand how to run a Python script from the command line. Follow this tutorial for a step-by-step guide.

In Python, arguments are passed to a script from the command line using the sys module. The argv member of sys (sys.argv) stores all the information in the command line entry and can be accessed inside the Python script. Python’s getopt module can also be used to parse named arguments.

Let’s go through some examples.

A Python Script to Read Command Line Arguments

To start, we’ll create a script that prints out the entire command line statement. Then we can examine how the arguments are passed and learn how to incorporate those into our code.

In the Python script, we’ll import sys , then just print out the full value of sys.argv . The script looks like this.

Save this script as myscript.py . Now we’ll call this script from the command line (follow this tutorial if you need directions), as follows. Make sure your working directory is the same directory that contains myscript.py .

You’ll notice when I call the script that I’ve included three arguments separated by a space (arg1, arg2, and arg3). These are just to illustrate how sys stores and displays the arguments. They don’t have any meaning.

Here’s my call to myscript.py from the command line. The second line of code shows the output.

You can see that sys.argv has stored the arguments as strings in a list. Let’s try this again with different data types (float, int, and string) to see how they are stored.

Here’s the script call and output.

As you can see, the float and integer were also stored as strings by sys.argv .
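To reproduce this observation, the sketch below writes a two-line myscript.py into a temporary directory and runs it with a float, an integer and a word. The helper show_argv and the sample values are assumptions for the demonstration.

```python
import os
import subprocess
import sys
import tempfile


def show_argv(*args):
    """Run a throwaway myscript.py that prints sys.argv and return its output."""
    with tempfile.TemporaryDirectory() as work_dir:
        script_path = os.path.join(work_dir, "myscript.py")
        with open(script_path, "w", encoding="utf-8") as fh:
            fh.write("import sys\nprint(sys.argv)\n")
        result = subprocess.run(
            [sys.executable, "myscript.py", *args],
            cwd=work_dir,  # working directory contains the script
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


# Every element, including 3.14 and 42, comes back quoted, i.e. as a string.
print(show_argv("3.14", "42", "hello"))
```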

Accessing Command Line Arguments in a Python Script

Now that we have some basic information about how to access command-line arguments and how they are stored, we can start parsing those arguments for use in our script.

In this simple example, we’ll iterate through each argument (except the first one, which is the script name) and print it to the console.

Let’s start by updating the Python script, myscript.py . We’ll add a loop to iterate through the last three arguments in sys.argv . For each element, we’ll print out its index (or position) and its value.

Here’s the new script. Notice that we iterate through a range that starts at 1. This skips the first argument, which is the script name.
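One way to write that loop is sketched below. The function name `format_args` and the output wording are assumptions made for illustration; the essential part is the range starting at 1.

```python
import sys


def format_args(argv):
    """Return one 'Argument i: value' line per argument, skipping argv[0]."""
    lines = []
    for i in range(1, len(argv)):  # start at 1 to skip the script name
        lines.append(f"Argument {i}: {argv[i]}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_args(sys.argv)))
```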

Run the script using the last set of arguments. Like this.

You should get output that looks like this.

That gives you the basics of passing command-line arguments to a Python script. From here, you’ll probably want to do some logical checks to make sure the input values are the appropriate types and fall within the correct range or set of values.

Improved Parsing of Python Command Line Arguments

The examples above are enough to get you started. However, if you want users to specify arguments with keywords and to see help messages, we’ll need something a little more advanced.

To retrieve named arguments from the command line we’ll use Python’s getopt module. getopt is built into base Python so you don’t need to install it.

Let’s start a new script that uses both sys and getopt to parse command-line arguments. The script will accept up to four named arguments: 'help', 'input', 'user', and 'output'. From the command line, these arguments can be specified with a single dash and the first letter ( -h ) or a double dash and the full argument name ( --help ). Name this script myscript2.py .

This script will consist of two parts. The first part is a function ( myfunc ) that will take the arguments ( argv ) as an input. The second part is an if statement that will recognize when the script is called and pass the arguments from sys.argv to myfunc .

In the body of myfunc , we’ll define variables for the input, user, and output. We’ll also define a variable for 'help' and give it a value. The 'help' variable will print out if an error is thrown or if the user specifies -h or --help .

Now call getopt.getopt and pass it the arguments from the command line, but not the script name (like this: argv[1:] ). The call to getopt.getopt is also where we specify both the short and long parameter names. The colons (:) following i , u , and o indicate that a value is required for that parameter. The equal signs (=) following input , user , and output indicate the same.

I’ve put the call to getopt.getopt into a try except statement so that the script will print the help message and then exit if there are any problems. Here’s what the script looks like so far.

In the final part of the script, we’ll parse the arguments based on their short or long names, or keywords, and print out the final values.

To start, loop through all the elements of opts . This will return the argument name ( opt ) and value ( arg ). Then use an if , elif , else statement to determine which variable to assign the argument to. After all the arguments have been handled, print out the argument name and its value.

The final script should look similar to this.
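The steps above can be sketched as follows. The help text wording, the exit codes, and the returned tuple are assumptions added for illustration; only the option names and the use of getopt come from the description.

```python
import getopt
import sys


def myfunc(argv):
    arg_input = ""
    arg_user = ""
    arg_output = ""
    arg_help = f"{argv[0]} -i <input> -u <user> -o <output>"

    try:
        # argv[1:] drops the script name; ':' and '=' mark options needing a value.
        opts, args = getopt.getopt(
            argv[1:], "hi:u:o:", ["help", "input=", "user=", "output="]
        )
    except getopt.GetoptError:
        print(arg_help)  # unknown option or missing value
        sys.exit(2)      # exit code 2 is an assumption

    for opt, arg in opts:
        if opt in ("-h", "--help"):
            print(arg_help)
            sys.exit(0)  # exit code 0 for help is an assumption
        elif opt in ("-i", "--input"):
            arg_input = arg
        elif opt in ("-u", "--user"):
            arg_user = arg
        elif opt in ("-o", "--output"):
            arg_output = arg

    print("input:", arg_input)
    print("user:", arg_user)
    print("output:", arg_output)
    return arg_input, arg_user, arg_output
```

To run this as a script, the file would typically end with `if __name__ == "__main__": myfunc(sys.argv)`.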

Let’s use this script in a couple of different ways to see what happens.

First, let’s get the help message using both the short -h and long --help names.

As expected, both examples resulted in the help message printing to the console.

Next, let’s see what happens if we specify an invalid argument name, --madeup .

This caused an error, which resulted in the help message printing to the console again.

Now, let’s enter the correct arguments.

The arguments were assigned to the appropriate variables.

Next Steps

This article gives you a primer on passing and parsing command line arguments with Python. For a full-fledged implementation, there is still more work you will want to do. It will be important to check the types and values of the input arguments to be sure they are valid. You’ll also want to make sure to print out helpful messages to the user when an error or other exception occurs. If you’re just implementing this for personal use, those features aren’t so important. I’ve found that writing my Python scripts to run from the command line has helped me automate many of my tasks and analyses and has saved me lots of time.


Konrad has a Master's Degree in Ecology and a Doctorate Degree in Water Resources and has been performing geospatial analysis and writing code (in multiple programming languages) for over a decade. He writes code to develop models and analysis workflows to predict and evaluate changes to landscapes and water resources. He has published multiple articles in prominent peer-reviewed, scientific journals. Konrad's code and workflow contribute to operational products that inform water and ecosystem management.


Python Command-Line Arguments

Adding the capability of processing Python command-line arguments provides a user-friendly interface to your text-based command line program. It’s similar to what a graphical user interface is for a visual application that’s manipulated by graphical elements or widgets.

Python exposes a mechanism to capture and extract your Python command-line arguments. These values can be used to modify the behavior of a program. For example, if your program processes data read from a file, then you can pass the name of the file to your program, rather than hard-coding the value in your source code.

By the end of this tutorial, you’ll know:

  • The origins of Python command-line arguments
  • The underlying support for Python command-line arguments
  • The standards guiding the design of a command-line interface
  • The basics to manually customize and handle Python command-line arguments
  • The libraries available in Python to ease the development of a complex command-line interface

If you want a user-friendly way to supply Python command-line arguments to your program without importing a dedicated library, or if you want to better understand the common basis for the existing libraries that are dedicated to building the Python command-line interface, then keep on reading!


The Command-Line Interface

A command-line interface (CLI) provides a way for a user to interact with a program running in a text-based shell interpreter. Some examples of shell interpreters are Bash on Linux or Command Prompt on Windows. A command-line interface is enabled by the shell interpreter that exposes a command prompt. It can be characterized by the following elements:

  • A command or program
  • Zero or more command line arguments
  • An output representing the result of the command
  • Textual documentation referred to as usage or help

Not every command-line interface may provide all these elements, but this list isn’t exhaustive, either. The complexity of the command line ranges from the ability to pass a single argument, to numerous arguments and options, much like a Domain Specific Language. For example, some programs may launch web documentation from the command line or start an interactive shell interpreter like Python.

The following two examples with the Python command illustrate the description of a command-line interface:

In this first example, the Python interpreter takes option -c for command, which says to execute the Python command-line arguments following the option -c as a Python program.

Another example shows how to invoke Python with -h to display the help:

Try this out in your terminal to see the complete help documentation.

The C Legacy

Python command-line arguments directly inherit from the C programming language. As Guido van Rossum wrote in An Introduction to Python for Unix/C Programmers in 1993, C had a strong influence on Python. Guido mentions the definitions of literals, identifiers, operators, and statements like break , continue , or return . The use of Python command-line arguments is also strongly influenced by the C language.

To illustrate the similarities, consider the following C program:

Line 4 defines main() , which is the entry point of a C program. Take good note of the parameters:

  1. argc is an integer representing the number of arguments of the program.
  2. argv is an array of pointers to characters containing the name of the program in the first element of the array, followed by the arguments of the program, if any, in the remaining elements of the array.

You can compile the code above on Linux with gcc -o main main.c , then execute with ./main to obtain the following:

Unless explicitly expressed at the command line with the option -o , a.out is the default name of the executable generated by the gcc compiler. It stands for assembler output and is reminiscent of the executables that were generated on older UNIX systems. Observe that the name of the executable ./main is the sole argument.

Let’s spice up this example by passing a few Python command-line arguments to the same program:

The output shows that the number of arguments is 5 , and the list of arguments includes the name of the program, main , followed by each word of the phrase "Python Command Line Arguments" , which you passed at the command line.

Note: argc stands for argument count, while argv stands for argument vector. To learn more, check out A Little C Primer/C Command Line Arguments.

The compilation of main.c assumes that you used a Linux or macOS system. On Windows, you can also compile this C program with one of the following options:

  • Windows Subsystem for Linux (WSL): It’s available in a few Linux distributions, like Ubuntu, OpenSUSE, and Debian, among others. You can install it from the Microsoft Store.
  • Windows Build Tools: This includes the Windows command line build tools, the Microsoft C/C++ compiler cl.exe , and a compiler front end named clang.exe for C/C++.
  • Microsoft Visual Studio: This is the main Microsoft integrated development environment (IDE). To learn more about IDEs that can be used for both Python and C on various operating systems, including Windows, check out Python IDEs and Code Editors (Guide).
  • MinGW-w64: This supports the GCC compiler on Windows.

If you’ve installed Microsoft Visual Studio or the Windows Build Tools, then you can compile main.c as follows:

You’ll obtain an executable named main.exe that you can start with:

You could implement a Python program, main.py , that’s equivalent to the C program, main.c , you saw above:
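A sketch of such a program is shown below. The output labels and the helper name `describe` are assumptions chosen for illustration; the use of len() and enumerate() follows the surrounding explanation.

```python
import sys


def describe(argv):
    """Return the argument count followed by one line per indexed argument."""
    lines = [f"Arguments count: {len(argv)}"]
    for i, arg in enumerate(argv):
        lines.append(f"Argument {i:>6}: {arg}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(sys.argv)))
```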

You don’t see an argc variable like in the C code example. It doesn’t exist in Python because sys.argv is sufficient. You can parse the Python command-line arguments in sys.argv without having to know the length of the list, and you can call the built-in len() if the number of arguments is needed by your program.

Also, note that enumerate() , when applied to an iterable, returns an enumerate object that can emit pairs associating the index of an element in sys.argv to its corresponding value. This allows looping through the content of sys.argv without having to maintain a counter for the index in the list.

Execute main.py as follows:

sys.argv contains the same information as in the C program:

  • The name of the program main.py is the first item of the list.
  • The arguments Python , Command , Line , and Arguments are the remaining elements in the list.

With this short introduction into a few arcane aspects of the C language, you’re now armed with some valuable knowledge to further grasp Python command-line arguments.

Two Utilities From the Unix World

To use Python command-line arguments in this tutorial, you’ll implement some partial features of two utilities from the Unix ecosystem:

  • sha1sum
  • seq

You’ll gain some familiarity with these Unix tools in the following sections.

sha1sum

sha1sum calculates SHA-1 hashes, and it’s often used to verify the integrity of files. For a given input, a hash function always returns the same value. Any minor changes in the input will result in a different hash value. Before you use the utility with concrete parameters, you may try to display the help:

Displaying the help of a command line program is a common feature exposed in the command-line interface.

To calculate the SHA-1 hash value of the content of a file, you proceed as follows:

The result shows the SHA-1 hash value as the first field and the name of the file as the second field. The command can take more than one file as arguments:

Thanks to the wildcard expansion feature of the Unix terminal, it’s also possible to provide Python command-line arguments with wildcard characters. One such character is the asterisk or star ( * ):

The shell converts main.* to main.c and main.py , which are the two files matching the pattern main.* in the current directory, and passes them to sha1sum . The program calculates the SHA1 hash of each of the files in the argument list. You’ll see that, on Windows, the behavior is different. Windows has no wildcard expansion, so the program may have to accommodate for that. Your implementation may need to expand wildcards internally.
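Where the shell does not expand wildcards, a program can do it itself. The following is a minimal sketch using the standard library's glob module; the function name `expand_wildcards` and the choice to keep non-matching patterns unchanged are assumptions.

```python
import glob


def expand_wildcards(args):
    """Expand shell-style patterns; keep arguments that match nothing as-is."""
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        expanded.extend(matches if matches else [arg])
    return expanded
```

Calling `expand_wildcards(sys.argv[1:])` at the start of a program would then give the same file list on Windows that a Unix shell produces before the program even starts.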

Without any argument, sha1sum reads from the standard input. You can feed data to the program by typing characters on the keyboard. The input may incorporate any characters, including the carriage return Enter . To terminate the input, you must signal the end of file with Enter , followed by the sequence Ctrl + D :

You first enter the name of the program, sha1sum , followed by Enter , and then Real and Python , each also followed by Enter . To close the input stream, you type Ctrl + D . The result is the value of the SHA1 hash generated for the text Real\nPython\n . The name of the file is - . This is a convention to indicate the standard input. The hash value is the same when you execute the following commands:

Up next, you’ll read a short description of seq .

seq

seq generates a sequence of numbers. In its most basic form, like generating the sequence from 1 to 5, you can execute the following:

To get an overview of the possibilities exposed by seq , you can display the help at the command line:

For this tutorial, you’ll write a few simplified variants of sha1sum and seq . In each example, you’ll learn a different facet or combination of features about Python command-line arguments.
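As a first taste, here is a rough sketch of a seq-like function. It is a simplification under stated assumptions: integers only, and the three accepted forms (LAST, FIRST LAST, FIRST INCREMENT LAST) mirror the common usage of seq rather than its full feature set.

```python
def seq(args):
    """Return the integers a seq-like command would print for 1 to 3 arguments."""
    numbers = [int(a) for a in args]  # arguments arrive as strings
    if len(numbers) == 1:
        first, increment, last = 1, 1, numbers[0]
    elif len(numbers) == 2:
        first, increment, last = numbers[0], 1, numbers[1]
    elif len(numbers) == 3:
        first, increment, last = numbers
    else:
        raise ValueError("expected 1 to 3 integer arguments")
    if increment == 0:
        raise ValueError("increment must not be zero")
    # range() excludes its stop value, so step one past the last number.
    stop = last + 1 if increment > 0 else last - 1
    return list(range(first, stop, increment))


print(seq(["5"]))
```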

On macOS and Linux, sha1sum and seq should come pre-installed, though the features and the help information may sometimes differ slightly between systems or distributions. If you’re using Windows 10, then the most convenient method is to run sha1sum and seq in a Linux environment installed on the WSL. If you don’t have access to a terminal exposing the standard Unix utilities, then you may have access to online terminals:

  • Create a free account on PythonAnywhere and start a Bash Console.
  • Create a temporary Bash terminal on repl.it.

These are two examples, and you may find others.

The sys.argv Array

Before exploring some accepted conventions and discovering how to handle Python command-line arguments, you need to know that the underlying support for all Python command-line arguments is provided by sys.argv . The examples in the following sections show you how to handle the Python command-line arguments stored in sys.argv and to overcome typical issues that occur when you try to access them. You’ll learn:

  • How to access the content of sys.argv
  • How to mitigate the side effects of the global nature of sys.argv
  • How to process whitespaces in Python command-line arguments
  • How to handle errors while accessing Python command-line arguments
  • How to ingest the original format of the Python command-line arguments passed by bytes

Let’s get started!

Displaying Arguments

The sys module exposes an array named argv that includes the following:

  1. argv[0] contains the name of the current Python program.
  2. argv[1:] , the rest of the list, contains any and all Python command-line arguments passed to the program.

The following example demonstrates the content of sys.argv :
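A sketch of argv.py consistent with the line-by-line description that follows is given here. The label text is an assumption; the `=` specifier inside the f-strings requires Python 3.8 or later.

```python
# argv.py
import sys

print(f"Name of the script      : {sys.argv[0]=}")
print(f"Arguments of the script : {sys.argv[1:]=}")
```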

Here’s how this code works:

  • Line 2 imports the internal Python module sys .
  • Line 4 extracts the name of the program by accessing the first element of the list sys.argv .
  • Line 5 displays the Python command-line arguments by fetching all the remaining elements of the list sys.argv .

Note: The f-string syntax used in argv.py leverages the new debugging specifier in Python 3.8. To read more about this new f-string feature and others, check out Cool New Features in Python 3.8.

If your Python version is less than 3.8, then simply remove the equals sign ( = ) in both f-strings to allow the program to execute successfully. The output will only display the value of the variables, not their names.

Execute the script argv.py above with a list of arbitrary arguments as follows:

The output confirms that the content of sys.argv[0] is the Python script argv.py , and that the remaining elements of the sys.argv list contain the arguments of the script, ['un', 'deux', 'trois', 'quatre'] .

To summarize, sys.argv contains all the argv.py Python command-line arguments. When the Python interpreter executes a Python program, it parses the command line and populates sys.argv with the arguments.

Reversing the First Argument

Now that you have enough background on sys.argv , you’re going to operate on arguments passed at the command line. The example reverse.py reverses the first argument passed at the command line:
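The core of reverse.py can be sketched as below. To keep the sketch callable without a real command line, the logic is wrapped in a hypothetical function `reverse_first` that receives the argument list; in a script, you would pass it `sys.argv`.

```python
def reverse_first(argv):
    """Return the first command-line argument reversed."""
    arg = argv[1]      # index 0 holds the program name
    return arg[::-1]   # a slice with step -1 walks the string backwards


print(reverse_first(["reverse.py", "Real Python"]))
```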

In reverse.py the process to reverse the first argument is performed with the following steps:

  • Line 5 fetches the first argument of the program stored at index 1 of sys.argv . Remember that the program name is stored at index 0 of sys.argv .
  • Line 6 prints the reversed string. arg[::-1] is a Pythonic way to use a slice operation to reverse a sequence, in this case a string.

You execute the script as follows:

As expected, reverse.py operates on "Real Python" and reverses the only argument to output "nohtyP laeR" . Note that surrounding the multi-word string "Real Python" with quotes ensures that the shell passes it as a single argument, instead of two arguments. You’ll delve into argument separators in a later section.

Mutating sys.argv

sys.argv is globally available to your running Python program. All modules imported during the execution of the process have direct access to sys.argv . This global access might be convenient, but sys.argv isn’t immutable. You may want to implement a more reliable mechanism to expose program arguments to different modules in your Python program, especially in a complex program with multiple files.

Observe what happens if you tamper with sys.argv :

You invoke .pop() to remove and return the last item in sys.argv .

Execute the script above:

Notice that the fourth argument is no longer included in sys.argv .

In a short script, you can safely rely on the global access to sys.argv , but in a larger program, you may want to store arguments in a separate variable. The previous example could be modified as follows:
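The idea can be sketched with a plain list standing in for sys.argv, so the effect is visible without running a script. The file name and argument values are illustrative assumptions.

```python
argv = ["argv_pop.py", "un", "deux", "trois", "quatre"]  # stand-in for sys.argv

args = argv[1:]  # slicing creates an independent copy of the arguments
argv.pop()       # simulate some other code tampering with the shared list

print(argv)  # the last argument is gone
print(args)  # the copy still holds all four arguments
```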

This time, although sys.argv lost its last element, args has been safely preserved. args isn’t global, and you can pass it around to parse the arguments per the logic of your program. The Python package manager, pip , uses this approach. Here’s a short excerpt of the pip source code:

In this snippet of code taken from the pip source code, main() saves into args the slice of sys.argv that contains only the arguments and not the file name. sys.argv remains untouched, and args isn’t impacted by any inadvertent changes to sys.argv .

Escaping Whitespace Characters

In the reverse.py example you saw earlier, the first and only argument is "Real Python" , and the result is "nohtyP laeR" . The argument includes a whitespace separator between "Real" and "Python" , and it needs to be escaped.

On Linux, whitespaces can be escaped by doing one of the following:

  1. Surrounding the arguments with single quotes ( ' )
  2. Surrounding the arguments with double quotes ( " )
  3. Prefixing each space with a backslash ( \ )

Without one of the escape solutions, reverse.py stores two arguments, "Real" in sys.argv[1] and "Python" in sys.argv[2] :

The output above shows that the script only reverses "Real" and that "Python" is ignored. To ensure both arguments are stored, you’d need to surround the overall string with double quotes ( " ).

You can also use a backslash ( \ ) to escape the whitespace:

With the backslash ( \ ), the command shell exposes a single argument to Python, and then to reverse.py .
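The three escaping styles can be compared from within Python using shlex.split(), which splits a string using POSIX shell-like rules. This is only an approximation of what a real shell does, offered as an illustration.

```python
import shlex

commands = [
    "reverse.py Real Python",    # unescaped: two separate arguments
    "reverse.py 'Real Python'",  # single quotes
    'reverse.py "Real Python"',  # double quotes
    r"reverse.py Real\ Python",  # backslash-escaped space
]
for command in commands:
    print(shlex.split(command))
```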

In Unix shells, the internal field separator (IFS) defines characters used as delimiters. The content of the shell variable, IFS , can be displayed by running the following command:

From the result above, ' \t\n' , you identify three delimiters:

  1. Space ( ' ' )
  2. Tab ( \t )
  3. Newline ( \n )

Prefixing a space with a backslash ( \ ) bypasses the default behavior of the space as a delimiter in the string "Real Python" . This results in one block of text as intended, instead of two.

Note that, on Windows, the whitespace interpretation can be managed by using a combination of double quotes. It’s slightly counterintuitive because, in the Windows terminal, a double quote ( " ) is interpreted as a switch to disable and subsequently to enable special characters like space, tab, or pipe ( | ).

As a result, when you surround more than one string with double quotes, the Windows terminal interprets the first double quote as a command to ignore special characters and the second double quote as one to interpret special characters.

With this information in mind, it’s safe to assume that surrounding more than one string with double quotes will give you the expected behavior, which is to expose the group of strings as a single argument. To confirm this peculiar effect of the double quote on the Windows command line, observe the following two examples:

In the example above, you can intuitively deduce that "Real Python" is interpreted as a single argument. However, realize what occurs when you use a single double quote:

The command prompt passes the whole string "Real Python" as a single argument, in the same manner as if the argument was "Real Python" . In reality, the Windows command prompt sees the lone double quote as a switch to disable the behavior of the whitespaces as separators and passes anything following the double quote as a single argument.

For more information on the effects of double quotes in the Windows terminal, check out A Better Way To Understand Quoting and Escaping of Windows Command Line Arguments.

Handling Errors

Python command-line arguments are loose strings. Many things can go wrong, so it’s a good idea to provide the users of your program with some guidance in the event they pass incorrect arguments at the command line. For example, reverse.py expects one argument, and if you omit it, then you get an error:

The Python exception IndexError is raised, and the corresponding traceback shows that the error is caused by the expression arg = sys.argv[1] . The message of the exception is list index out of range . You didn’t pass an argument at the command line, so there’s nothing in the list sys.argv at index 1 .

This is a common pattern that can be addressed in a few different ways. For this initial example, you’ll keep it brief by including the expression arg = sys.argv[1] in a try block. Modify the code as follows:
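A sketch of that change is shown below. It wraps the logic in a hypothetical function `reverse_or_exit` so it can be exercised with an explicit list; the usage text is an assumption. Raising SystemExit with a string prints the message to standard error and exits with status 1.

```python
def reverse_or_exit(argv):
    """Return the first argument reversed, or exit with a usage message."""
    try:
        arg = argv[1]
    except IndexError:
        # A string passed to SystemExit is printed and the exit status is 1.
        raise SystemExit(f"Usage: {argv[0]} <string_to_reverse>")
    return arg[::-1]
```

In a script, the last line would typically be `print(reverse_or_exit(sys.argv))`.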

The expression on line 4 is included in a try block. Line 8 raises the built-in exception SystemExit . If no argument is passed to reverse_exc.py , then the process exits with a status code of 1 after printing the usage. Note the integration of sys.argv[0] in the error message. It exposes the name of the program in the usage message. Now, when you execute the same program without any Python command-line arguments, you can see the following output:

reverse_exc.py didn’t have an argument passed at the command line. As a result, the program raises SystemExit with an error message. This causes the program to exit with a status of 1 , which displays when you print the special variable $? with echo .

Calculating the sha1sum

You’ll write another script to demonstrate that, on Unix-like systems, Python command-line arguments are passed by bytes from the OS. This script takes a string as an argument and outputs the hexadecimal SHA-1 hash of the argument:
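The hashing steps can be sketched as follows. The function name `sha1_of_text` is hypothetical and stands in for reading `sys.argv[1]` directly, so the sketch can be called without a command line.

```python
import hashlib


def sha1_of_text(data):
    """Return the hexadecimal SHA-1 digest of a string encoded as UTF-8."""
    m = hashlib.sha1()
    m.update(bytes(data, "utf-8"))  # update() needs bytes, not str
    return m.hexdigest()
```

In a script, you would call `print(sha1_of_text(sys.argv[1]))`.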

This is loosely inspired by sha1sum , but it intentionally processes a string instead of the contents of a file. In sha1sum.py , the steps to ingest the Python command-line arguments and to output the result are the following:

  • Line 6 stores the content of the first argument in data .
  • Line 7 instantiates a SHA1 algorithm.
  • Line 8 updates the SHA1 hash object with the content of the first program argument. Note that the update() method of the hash object takes a bytes-like object as an argument, so it’s necessary to convert data from a string to bytes.
  • Line 9 prints a hexadecimal representation of the SHA1 hash computed on line 8.

When you run the script with an argument, you get this:

For the sake of keeping the example short, the script sha1sum.py doesn’t handle missing Python command-line arguments. Error handling could be addressed in this script the same way you did it in reverse_exc.py .

Note: Check out hashlib for more details about the hash functions available in the Python standard library.

From the sys.argv documentation, you learn that in order to get the original bytes of the Python command-line arguments, you can use os.fsencode() . By directly obtaining the bytes from sys.argv[1] , you don’t need to perform the string-to-bytes conversion of data :
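A sketch of the bytes-based variant follows. As before, the function name `sha1_of_argument` is hypothetical and takes the argument explicitly instead of reading `sys.argv[1]`.

```python
import hashlib
import os


def sha1_of_argument(arg):
    """Return the SHA-1 digest of an argument using its original OS bytes."""
    data = os.fsencode(arg)  # recover the bytes the OS passed for this argument
    m = hashlib.sha1()
    m.update(data)           # no manual str-to-bytes conversion needed
    return m.hexdigest()
```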

The main differences between sha1sum.py and sha1sum_bytes.py are highlighted in the following lines:

  • Line 7 populates data with the original bytes passed to the Python command-line arguments.
  • Line 9 passes data as an argument to m.update() , which receives a bytes-like object.

Execute sha1sum_bytes.py to compare the output:

The hexadecimal value of the SHA1 hash is the same as in the previous sha1sum.py example.

The Anatomy of Python Command-Line Arguments

Now that you’ve explored a few aspects of Python command-line arguments, most notably sys.argv , you’re going to apply some of the standards that are regularly used by developers while implementing a command-line interface.

Python command-line arguments are a subset of the command-line interface. They can be composed of different types of arguments:

  1. Options modify the behavior of a particular command or program.
  2. Arguments represent the source or destination to be processed.
  3. Subcommands allow a program to define more than one command with the respective set of options and arguments.

Before you go deeper into the different types of arguments, you’ll get an overview of the accepted standards that have been guiding the design of the command-line interface and arguments. These have been refined since the advent of the computer terminal in the mid-1960s.

Standards

A few available standards provide some definitions and guidelines to promote consistency for implementing commands and their arguments. These are the main UNIX standards and references:

  • POSIX Utility Conventions
  • GNU Standards for Command Line Interfaces
  • docopt

The standards above define guidelines and nomenclatures for anything related to programs and Python command-line arguments. The following points are examples taken from those references:

  • POSIX:
    • A program or utility is followed by options, option-arguments, and operands.
    • All options should be preceded with a hyphen or minus ( - ) delimiter character.
    • Option-arguments should not be optional.
  • GNU:
    • All programs should support two standard options, which are --version and --help .
    • Long-named options are equivalent to the single-letter Unix-style options. An example is --debug and -d .
  • docopt:
    • Short options can be stacked, meaning that -abc is equivalent to -a -b -c .
    • Long options can have arguments specified after a space or the equals sign ( = ). The long option --input=ARG is equivalent to --input ARG .
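The last two conventions in the list above can be observed with the standard library's getopt module. This is only an illustration of the conventions, not a recommendation of one parsing library over another.

```python
import getopt

# Stacked short options: -abc is parsed the same way as -a -b -c.
stacked, _ = getopt.getopt(["-abc"], "abc")
separate, _ = getopt.getopt(["-a", "-b", "-c"], "abc")
print(stacked == separate)

# A long option accepts its argument after '=' or as the next word.
with_equals, _ = getopt.getopt(["--input=ARG"], "", ["input="])
with_space, _ = getopt.getopt(["--input", "ARG"], "", ["input="])
print(with_equals == with_space)
```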

    These standards define notations that are helpful when you describe a command. A similar notation can be used to display the usage of a particular command when you invoke it with the option -h or --help .

    The GNU standards are very similar to the POSIX standards but provide some modifications and extensions. Notably, they add the long option that’s a fully named option prefixed with two hyphens ( -- ). For example, to display the help, the regular option is -h and the long option is --help .

    Note: You don’t need to follow those standards rigorously. Instead, follow the conventions that have been used successfully for years since the advent of UNIX. If you write a set of utilities for you or your team, then ensure that you stay consistent across the different utilities.

    In the following sections, you’ll learn more about each of the command line components, options, arguments, and sub-commands.

    Options

    An option, sometimes called a flag or a switch, is intended to modify the behavior of the program. For example, the command ls on Linux lists the content of a given directory. Without any arguments, it lists the files and directories in the current directory:

    Let’s add a few options. You can combine -l and -s into -ls , which changes the information displayed in the terminal:

    An option can take an argument, which is called an option-argument. See an example in action with od below:

    od stands for octal dump. This utility displays data in different printable representations, like octal (which is the default), hexadecimal, decimal, and ASCII. In the example above, it takes the binary file main and displays the first 16 bytes of the file in hexadecimal format. The option -t expects a type as an option-argument, and -N expects the number of input bytes.

    In the example above, -t is given type x1 , which stands for hexadecimal and one byte per integer. This is followed by z to display the printable characters at the end of the input line. -N takes 16 as an option-argument for limiting the number of input bytes to 16.
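In Python, options that take option-arguments, together with an operand, might be declared with argparse as sketched here. The program name `dump` and the destination names are hypothetical; the sketch only parses the values and does not reproduce od itself.

```python
import argparse

parser = argparse.ArgumentParser(prog="dump")  # hypothetical program name
parser.add_argument("-t", dest="type", help="output type, for example x1z")
parser.add_argument("-N", dest="count", type=int, help="maximum number of input bytes")
parser.add_argument("file", help="operand: the file to read")

# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
ns = parser.parse_args(["-t", "x1z", "-N", "16", "main"])
print(ns.type, ns.count, ns.file)
```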

    Arguments

    The arguments are also called operands or parameters in the POSIX standards. The arguments represent the source or the destination of the data that the command acts on. For example, the command cp , which is used to copy one or more files to a file or a directory, takes at least one source and one target:

    Here, cp takes two arguments:

    1. main : the source file
    2. main2 : the target file

    It then copies the content of main to a new file named main2 . Both main and main2 are arguments, or operands, of the program cp .

    Subcommands

    The concept of subcommands isn’t documented in the POSIX or GNU standards, but it does appear in docopt. The standard Unix utilities are small tools adhering to the Unix philosophy. Unix programs are intended to be programs that do one thing and do it well. This means no subcommands are necessary.

    By contrast, a new generation of programs, including git , go , docker , and gcloud , come with a slightly different paradigm that embraces subcommands. They’re not necessarily part of the Unix landscape as they span several operating systems, and they’re deployed with a full ecosystem that requires several commands.

    Take git as an example. It handles several commands, each possibly with their own set of options, option-arguments, and arguments. The following examples apply to the git subcommand branch :

    • git branch displays the branches of the local git repository.
    • git branch custom_python creates a local branch custom_python in a local repository.
    • git branch -d custom_python deletes the local branch custom_python .
    • git branch --help displays the help for the git branch subcommand.

    In the Python ecosystem, pip has the concept of subcommands, too. Some pip subcommands include list , install , freeze , or uninstall .

    Windows

    On Windows, the conventions regarding Python command-line arguments are slightly different, in particular, those regarding command line options. To validate this difference, take tasklist , which is a native Windows executable that displays a list of the currently running processes. It’s similar to ps on Linux or macOS systems. Below is an example of how to execute tasklist in a command prompt on Windows:

    Note that the separator for an option is a forward slash ( / ) instead of a hyphen ( - ) like the conventions for Unix systems. For readability, there’s a space between the program name, tasklist , and the option /FI , but it’s just as correct to type tasklist/FI .

    The particular example above executes tasklist with a filter to only show the Notepad processes currently running. You can see that the system has two running instances of the Notepad process. Although it’s not equivalent, this is similar to executing the following command in a terminal on a Unix-like system:

    The ps command above shows all the current running vi processes. The behavior is consistent with the Unix Philosophy, as the output of ps is transformed by two grep filters. The first grep command selects all the occurrences of vi , and the second grep filters out the occurrence of grep itself.

    With the spread of Unix tools making their appearance in the Windows ecosystem, non-Windows-specific conventions are also accepted on Windows.

    Visuals

    At the start of a Python process, Python command-line arguments are split into two categories:

    Python options: These influence the execution of the Python interpreter. For example, adding option -O is a means to optimize the execution of a Python program by removing assert statements and any code conditional on the value of __debug__ . There are other Python options available at the command line.

    Python program and its arguments: Following the Python options (if there are any), you’ll find the Python program, which is a file name that usually has the extension .py , and its arguments. By convention, those can also be composed of options and arguments.

    Take the following command that’s intended to execute the program main.py , which takes options and arguments. Note that, in this example, the Python interpreter also takes some options, which are -B and -v .

    In the command line above, the options are Python command-line arguments and are organized as follows:

    • The option -B tells Python not to write .pyc files on the import of source modules. For more details about .pyc files, check out the section What Does a Compiler Do? in Your Guide to the CPython Source Code.
    • The option -v stands for verbose and tells Python to trace all import statements.
    • The arguments passed to main.py are fictitious and represent two long options ( --verbose and --debug ) and two arguments ( un and deux ).

    This example of Python command-line arguments can be illustrated graphically as follows:

    [Figure: Anatomy of the Python Command Line Arguments]

    Within the Python program main.py , you only have access to the Python command-line arguments inserted by Python in sys.argv . The Python options may influence the behavior of the program but are not accessible in main.py .
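    A small experiment can make this split visible. The sketch below is not the tutorial’s main.py . It assumes a one-line stand-in program passed with -c , and it launches a child interpreter with the -B option to show that only the program’s own arguments reach sys.argv .

```python
import subprocess
import sys

# Hypothetical stand-in for main.py: a one-line program that prints sys.argv.
program = "import sys; print(sys.argv)"

# -B is consumed by the interpreter; everything after the program text
# belongs to the program and ends up in its sys.argv.
result = subprocess.run(
    [sys.executable, "-B", "-c", program, "--verbose", "--debug", "un", "deux"],
    capture_output=True,
    text=True,
    check=True,
)
program_argv = result.stdout.strip()
print(program_argv)  # ['-c', '--verbose', '--debug', 'un', 'deux']
```

    The interpreter option -B does not appear in the output, while the first element is "-c" because the program was given as a command rather than as a file.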

    A Few Methods for Parsing Python Command-Line Arguments

    Now you’re going to explore a few approaches to handling options, option-arguments, and operands. This is done by parsing Python command-line arguments. In this section, you’ll see some concrete aspects of Python command-line arguments and techniques to handle them. First, you’ll see an example that introduces a straightforward approach relying on list comprehensions to collect and separate options from arguments. Then you will:

    • Use regular expressions to extract elements of the command line
    • Learn how to handle files passed at the command line
    • Handle the standard input in a way that’s compatible with the Unix tools
    • Differentiate the regular output of the program from the errors
    • Implement a custom parser to read Python command-line arguments

    This will serve as a preparation for options involving modules in the standard libraries or from external libraries that you’ll learn about later in this tutorial.

    For something uncomplicated, the following pattern, which doesn’t enforce ordering and doesn’t handle option-arguments, may be enough:
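    A minimal sketch of such a pattern might look like the code below. It assumes the three options described in this section ( -c , -u , and -l ). The argv parameter and the usage text are hypothetical additions that make the sketch easy to try without a real command line.

```python
import sys


def main(argv=None):
    # argv is a hypothetical parameter; a real script would rely on sys.argv.
    argv = sys.argv[1:] if argv is None else argv

    # Anything starting with a hyphen is treated as an option ...
    opts = [arg for arg in argv if arg.startswith("-")]
    # ... and everything else as a program argument.
    args = [arg for arg in argv if not arg.startswith("-")]

    if "-c" in opts:
        return " ".join(arg.capitalize() for arg in args)
    if "-u" in opts:
        return " ".join(arg.upper() for arg in args)
    if "-l" in opts:
        return " ".join(arg.lower() for arg in args)
    raise SystemExit(f"Usage: {sys.argv[0]} (-c | -u | -l) <arguments>...")


print(main(["-c", "un", "deux", "trois"]))  # Un Deux Trois
print(main(["un", "DEUX", "-l"]))           # un deux
```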

    The intent of the program above is to modify the case of the Python command-line arguments. Three options are available:

    • -c to capitalize the arguments
    • -u to convert the arguments to uppercase
    • -l to convert the arguments to lowercase

    The code collects and separates the different argument types using list comprehensions:

    • One list comprehension collects all the options by keeping any Python command-line arguments that start with a hyphen ( - ).
    • A second list comprehension assembles the program arguments by filtering out the options.

    When you execute the Python program above with a set of options and arguments, you get the following output:

    This approach might suffice in many situations, but it would fail in the following cases:

    • If the order is important, and in particular, if options should appear before the arguments
    • If support for option-arguments is needed
    • If some arguments are prefixed with a hyphen ( - )

    You can leverage other techniques before you resort to a library like argparse or click .

    Regular Expressions

    You can use a regular expression to enforce a certain order, specific options and option-arguments, or even the type of arguments. To illustrate the usage of a regular expression to parse Python command-line arguments, you’ll implement a Python version of seq , which is a program that prints a sequence of numbers. Following the docopt conventions, a specification for seq.py could be this:

    First, look at a regular expression that’s intended to capture the requirements above:
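    One way such a pattern could be written is sketched below, together with the small parse() helper and the joined argument line that this section goes on to discuss. The names args_pattern , parse() , and arg_line come from the text. The pattern body itself is an assumption chosen to be consistent with the group names used here, and in this sketch only the long --help form is matched.

```python
import re

args_pattern = re.compile(
    r"""
    ^
    (
        (--(?P<HELP>help).*)|
        ((?:-s|--separator)\s(?P<SEP>.*?)\s)?
        ((?P<OP1>-?\d+))(\s(?P<OP2>-?\d+))?(\s(?P<OP3>-?\d+))?
    )
    $
    """,
    re.VERBOSE,  # lets the pattern span several lines; whitespace is ignored
)


def parse(arg_line):
    """Return the named groups that matched, or an empty dict if nothing did."""
    match = args_pattern.match(arg_line)
    if match is None:
        return {}
    return {
        name: value
        for name, value in match.groupdict().items()
        if value is not None
    }


# Stand-in for sys.argv[1:], joined into a single string with str.join().
arg_line = " ".join(["-s", "T", "10"])
print(parse(arg_line))   # {'SEP': 'T', 'OP1': '10'}
print(parse("--help"))   # {'HELP': 'help'}
print(parse("10 -s T"))  # {} because options must come before operands
```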

    To experiment with the regular expression above, you may use the snippet recorded on Regular Expression 101. The regular expression captures and enforces a few aspects of the requirements given for seq . In particular, the command may take:

    1. A help option, in short ( -h ) or long format ( --help ), captured as a named group called HELP
    2. A separator option, -s or --separator , taking an optional argument, and captured as named group called SEP
    3. Up to three integer operands, respectively captured as OP1 , OP2 , and OP3

    For clarity, the pattern args_pattern above uses the flag re.VERBOSE . This allows you to spread the regular expression over a few lines to enhance readability. The pattern validates the following:

    • Argument order: Options and arguments are expected to be laid out in a given order. For example, options are expected before the arguments.
    • Option values: Only --help , -s , or --separator are expected as options.
    • Argument mutual exclusivity: The option --help isn’t compatible with other options or arguments.
    • Argument type: Operands are expected to be positive or negative integers.

    For the regular expression to be able to handle these things, it needs to see all Python command-line arguments in one string. You can collect them using str.join():

    This makes arg_line a string that includes all arguments, except the program name, separated by a space.

    Given the pattern args_pattern above, you can extract the Python command-line arguments with the following function:

    The pattern is already handling the order of the arguments, mutual exclusivity between options and arguments, and the type of the arguments. parse() is applying re.match() to the argument line to extract the proper values and store the data in a dictionary.

    The dictionary includes the names of each group as keys and their respective values. For example, if the arg_line value is --help , then the dictionary is {'HELP': 'help'} . If arg_line is -s T 10 , then the dictionary becomes {'SEP': 'T', 'OP1': '10'} .

    The code below implements a limited version of seq with a regular expression to handle the command line parsing and validation:

    You can execute the code above by running this command:

    This should output the following:

    Try this command with other combinations, including the --help option.

    You didn’t see a version option supplied here. This was done intentionally to reduce the length of the example. You may consider adding the version option as an extended exercise. As a hint, you could modify the regular expression by replacing the line (--(?P<HELP>help).*)| with (--(?P<HELP>help).*)|(--(?P<VER>version).*)| . An additional if block would also be needed in main() .

    At this point, you know a few ways to extract options and arguments from the command line. So far, the Python command-line arguments were only strings or integers. Next, you’ll learn how to handle files passed as arguments.

    File Handling

    It’s time now to experiment with Python command-line arguments that are expected to be file names. Modify sha1sum.py to handle one or more files as arguments. You’ll end up with a downgraded version of the original sha1sum utility, which takes one or more files as arguments and displays the hexadecimal SHA1 hash for each file, followed by the name of the file:

    sha1sum() is applied to the data read from each file that you passed at the command line, rather than the string itself. Take note that m.update() takes a bytes-like object as an argument and that the result of invoking read() after opening a file with the mode rb will return a bytes object. For more information about handling file content, check out Reading and Writing Files in Python, and in particular, the section Working With Bytes.
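    The behavior described above can be sketched as follows. The helper name process_files() and the throwaway file used for the demonstration are assumptions made for illustration. A real script would pass sys.argv[1:] instead.

```python
import hashlib
import os
import tempfile


def sha1sum(data: bytes) -> str:
    m = hashlib.sha1()
    m.update(data)  # update() needs a bytes-like object
    return m.hexdigest()


def process_files(filenames):
    """Return one 'hash  name' line per file, similar to the sha1sum utility."""
    lines = []
    for filename in filenames:
        with open(filename, "rb") as f:  # mode "rb": read() returns bytes
            lines.append(f"{sha1sum(f.read())}  {filename}")
    return lines


# Try it on a throwaway file; a real script would use sys.argv[1:] here.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "main.txt")
    with open(path, "wb") as f:
        f.write(b"abc")
    for line in process_files([path]):
        print(line)
```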

    The evolution of sha1sum_file.py from handling strings at the command line to manipulating the content of files is getting you closer to the original implementation of sha1sum :

    The execution of the Python program with the same Python command-line arguments gives this:

    Because you interact with the shell interpreter or the Windows command prompt, you also get the benefit of the wildcard expansion provided by the shell. To prove this, you can reuse main.py , which displays each argument with the argument number and its value:

    You can see that the shell automatically performs wildcard expansion so that any file with a base name matching main , regardless of the extension, is part of sys.argv .

    The wildcard expansion isn’t available on Windows. To obtain the same behavior, you need to implement it in your code. To refactor main.py to work with wildcard expansion, you can use glob . The following example works on Windows and, though it isn’t as concise as the original main.py , the same code behaves similarly across platforms:
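    A minimal sketch of such an expand_args() helper is shown below. It is an assumption about how the expansion could be written, not the exact main_win.py code. In particular, this version keeps an argument unchanged when it matches no file, so that options still reach the program, and the temporary directory exists only for the demonstration.

```python
import glob
import os
import tempfile


def expand_args(args):
    """Expand shell-style wildcards; keep arguments that match nothing as-is."""
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        expanded.extend(matches if matches else [arg])
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    for name in ("main.c", "main.py", "other.txt"):
        open(os.path.join(tmp, name), "w").close()
    pattern = os.path.join(tmp, "main.*")
    print([os.path.basename(p) for p in expand_args([pattern, "--flag"])])
    # ['main.c', 'main.py', '--flag']
```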

    In main_win.py , expand_args relies on glob.glob() to process the shell-style wildcards. You can verify the result on Windows and any other operating system:

    This addresses the problem of handling files using wildcards like the asterisk ( * ) or question mark ( ? ), but how about stdin ?

    If you don’t pass any parameter to the original sha1sum utility, then it expects to read data from the standard input. This is the text you enter at the terminal that ends when you type Ctrl + D on Unix-like systems or Ctrl + Z on Windows. These control sequences send an end of file (EOF) to the terminal, which stops reading from stdin and returns the data that was entered.

    In the next section, you’ll add to your code the ability to read from the standard input stream.

    Standard Input

    When you modify the previous Python implementation of sha1sum to handle the standard input using sys.stdin , you’ll get closer to the original sha1sum :

    Two conventions are applied to this new sha1sum version:

    1. Without any arguments, the program expects the data to be provided in the standard input, sys.stdin , which is a readable file object.
    2. When a hyphen ( - ) is provided as a file argument at the command line, the program interprets it as reading the file from the standard input.
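    The two conventions above can be sketched as follows. The function name process() and its stdin parameter are hypothetical. The parameter exists only so that the sketch can be exercised with an in-memory stream instead of a terminal.

```python
import hashlib
import io
import sys


def sha1sum(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def process(filenames, stdin=None):
    # Read bytes from the real standard input unless a stream is supplied.
    stdin = sys.stdin.buffer if stdin is None else stdin
    results = []
    for name in filenames or ["-"]:  # convention 1: no arguments means stdin
        if name == "-":              # convention 2: "-" means stdin
            data = stdin.read()
        else:
            with open(name, "rb") as f:
                data = f.read()
        results.append(f"{sha1sum(data)}  {name}")
    return results


print(process([], stdin=io.BytesIO(b"abc")))
# ['a9993e364706816aba3e25717850c26c9cd0d89d  -']
```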

    Try this new script without any arguments. Enter the first aphorism of The Zen of Python, then complete the entry with the keyboard shortcut Ctrl + D on Unix-like systems or Ctrl + Z on Windows:

    You can also include one of the arguments as stdin mixed with the other file arguments like so:

    Another approach on Unix-like systems is to provide /dev/stdin instead of - to handle the standard input:

    On Windows there’s no equivalent to /dev/stdin , so using - as a file argument works as expected.

    The script sha1sum_stdin.py isn’t covering all necessary error handling, but you’ll cover some of the missing features later in this tutorial.

    Standard Output and Standard Error

    Command line processing may have a direct relationship with stdin to respect the conventions detailed in the previous section. The standard output, although not immediately relevant, is still a concern if you want to adhere to the Unix Philosophy. To allow small programs to be combined, you may have to take into account the three standard streams:

    1. stdin
    2. stdout
    3. stderr

    The output of a program becomes the input of another one, allowing you to chain small utilities. For example, if you wanted to sort the aphorisms of the Zen of Python, then you could execute the following:

    The output above is truncated for better readability. Now imagine that you have a program that outputs the same data but also prints some debugging information:

    Executing the Python script above gives:

    The ellipsis ( ... ) indicates that the output was truncated to improve readability.

    Now, if you want to sort the list of aphorisms, then execute the command as follows:

    You may realize that you didn’t intend to have the debug output as the input of the sort command. To address this issue, you want to send traces to the standard error stream, stderr , instead:
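    A minimal sketch of this separation is shown below. The function name emit() , the wording of the traces, and the three-line sample are assumptions made for illustration. Only the idea of writing data to stdout and traces to stderr comes from the text.

```python
import sys

ZEN_SAMPLE = [
    "Simple is better than complex.",
    "Beautiful is better than ugly.",
    "Explicit is better than implicit.",
]


def emit(lines, out=sys.stdout, err=sys.stderr):
    # Traces go to the error stream, so a pipe such as "| sort" never sees them.
    print("DEBUG >>> about to print the sample", file=err)
    for line in lines:
        print(line, file=out)  # data goes to the standard output
    print("DEBUG >>> done", file=err)


emit(ZEN_SAMPLE)
```

    When the output of such a script is piped into sort , only the three aphorisms are sorted, while the DEBUG lines still show up in the terminal.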

    Execute zen_sort_stderr.py to observe the following:

    Now, the traces are displayed to the terminal, but they aren’t used as input for the sort command.

    Custom Parsers

    You can implement seq by relying on a regular expression if the arguments aren’t too complex. Nevertheless, the regex pattern may quickly render the maintenance of the script difficult. Before you try getting help from specific libraries, another approach is to create a custom parser. The parser is a loop that fetches each argument one after another and applies a custom logic based on the semantics of your program.

    A possible implementation for processing the arguments of seq_parse.py could be as follows:
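    One possible shape for such a parser is sketched below. It is an illustration under assumptions rather than the exact seq_parse.py code: the default separator, the error messages, and the usage text are hypothetical.

```python
from collections import deque


def parse(args):
    """Return (separator, operands) from a list such as sys.argv[1:]."""
    arguments = deque(args)
    separator = "\n"
    operands = []
    while arguments:
        arg = arguments.popleft()
        if not operands:  # options are only accepted before the operands
            if arg in ("-h", "--help"):
                raise SystemExit("usage: seq_parse.py [-h] [-s SEP] operand...")
            if arg in ("-s", "--separator"):
                if not arguments:
                    raise SystemExit("error: missing separator value")
                separator = arguments.popleft()
                continue
        try:
            operands.append(int(arg))  # also accepts negative integers
        except ValueError:
            raise SystemExit(f"error: {arg!r} is not an integer")
        if len(operands) > 3:
            raise SystemExit("error: at most three operands are allowed")
    if not operands:
        raise SystemExit("error: at least one operand is required")
    return separator, operands


print(parse(["-s", ", ", "1", "2", "10"]))  # (', ', [1, 2, 10])
```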

    parse() is given the list of arguments without the Python file name and uses collections.deque() to get the benefit of .popleft() , which removes the elements from the left of the collection. As the items of the arguments list unfold, you apply the logic that’s expected for your program. In parse() you can observe the following:

    • The while loop is at the core of the function, and terminates when there are no more arguments to parse, when the help is invoked, or when an error occurs.
    • If the separator option is detected, then the next argument is expected to be the separator.
    • operands stores the integers that are used to calculate the sequence. There should be at least one operand and at most three.

    A full version of the code for parse() is available below:

    Note that some error handling aspects are kept to a minimum so as to keep the examples relatively short.

    This manual approach of parsing the Python command-line arguments may be sufficient for a simple set of arguments. However, it quickly becomes error-prone when complexity increases due to the following:

    • A large number of arguments
    • Complexity and interdependency between arguments
    • Validation to perform against the arguments

    The custom approach isn’t reusable and requires reinventing the wheel in each program. By the end of this tutorial, you’ll have improved on this hand-crafted solution and learned a few better methods.

    A Few Methods for Validating Python Command-Line Arguments

    You’ve already performed validation for Python command-line arguments in a few examples like seq_regex.py and seq_parse.py . In the first example, you used a regular expression, and in the second example, a custom parser.

    Both of these examples took the same aspects into account. They considered the expected options as short-form ( -s ) or long-form ( --separator ). They considered the order of the arguments so that options would not be placed after operands. Finally, they considered the type, integer for the operands, and the number of arguments, from one to three arguments.

    Type Validation With Python Data Classes

    The following is a proof of concept that attempts to validate the type of the arguments passed at the command line. In the following example, you validate the number of arguments and their respective type:

    Unless you pass the --help option at the command line, this script expects two or three arguments:

    1. A mandatory string: firstname
    2. A mandatory string: lastname
    3. An optional integer: age

    Because all the items in sys.argv are strings, you need to convert the optional third argument to an integer if it’s composed of digits. str.isdigit() validates if all the characters in a string are digits. In addition, by constructing the data class Arguments with the values of the converted arguments, you obtain two validations:

    1. If the number of arguments doesn’t correspond to the number of mandatory fields expected by Arguments , then you get an error. This is a minimum of two and a maximum of three fields.
    2. If the types after conversion aren’t matching the types defined in the Arguments data class definition, then you get an error.
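    The two validations above can be sketched as follows. The names Arguments and check_type() come from the text, while validate() , the error message, and the sample values are assumptions. Handling of --help is left out to keep the sketch short.

```python
import dataclasses


@dataclasses.dataclass
class Arguments:
    firstname: str
    lastname: str
    age: int = 0


def check_type(obj):
    """Return the names of fields whose value doesn't have the declared type."""
    return [
        field.name
        for field in dataclasses.fields(obj)
        if type(getattr(obj, field.name)) is not field.type
    ]


def validate(args):
    args = list(args)
    if len(args) > 2 and args[2].isdigit():
        args[2] = int(args[2])  # command-line items always arrive as strings
    try:
        arguments = Arguments(*args)  # a wrong number of fields raises TypeError
    except TypeError as exc:
        raise SystemExit(f"error: {exc}")
    return arguments, check_type(arguments)


print(validate(["Guido", "Van Rossum", "25"]))
print(validate(["Guido", "Van", "Rossum"]))
```

    The first call reports no mismatched field. The second call reports age , because the unquoted last name was split and its second word landed in the integer field.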

    You can see this in action with the following execution:

    In the execution above, the number of arguments is correct and the type of each argument is also correct.

    Now, execute the same command but omit the third argument:

    The result is also successful because the field age is defined with a default value, 0 , so the data class Arguments doesn’t require it.

    On the contrary, if the third argument isn’t of the proper type—say, a string instead of integer—then you get an error:

    The expected value Van Rossum isn’t surrounded by quotes, so it’s split. The second word of the last name, Rossum , is a string that’s handled as the age, which is expected to be an int . The validation fails.

    Note: For more details about the usage of data classes in Python, check out The Ultimate Guide to Data Classes in Python 3.7.

    Similarly, you could also use a NamedTuple to achieve a similar validation. You’d replace the data class with a class deriving from NamedTuple , and check_type() would change as follows:
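    A sketch of that variation is shown below. It assumes the same three fields as the data class version. As a deliberate substitution, it reads the declared types through typing.get_type_hints() rather than accessing __annotations__ directly, and it returns the mismatched field names instead of printing them.

```python
from typing import NamedTuple, get_type_hints


class Arguments(NamedTuple):
    firstname: str
    lastname: str
    age: int = 0


def check_type(obj):
    """Return the names of fields whose value doesn't have the declared type."""
    hints = get_type_hints(type(obj))  # maps each field name to its type
    return [
        name
        for name, value in obj._asdict().items()
        if type(value) is not hints[name]
    ]


print(check_type(Arguments("Guido", "Van Rossum", 25)))  # []
print(check_type(Arguments("Guido", "Van", "Rossum")))   # ['age']
```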

    A NamedTuple exposes functions like _asdict that transform the object into a dictionary that can be used for data lookup. It also exposes attributes like __annotations__ , which is a dictionary storing types for each field. For more on __annotations__ , check out Python Type Checking (Guide).

    As highlighted in Python Type Checking (Guide), you could also leverage existing packages like Enforce, Pydantic, and Pytypes for advanced validation.

    Custom Validation

    Not unlike what you’ve already explored earlier, detailed validation may require some custom approaches. For example, if you attempt to execute sha1sum_stdin.py with an incorrect file name as an argument, then you get the following:

    bad_file.txt doesn’t exist, but the program attempts to read it.

    Revisit main() in sha1sum_stdin.py to handle non-existing files passed at the command line:
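    One way to add this check is sketched below. The helper name process_file() , its err parameter, and the message format are assumptions. The point is only that a missing file is reported on stderr instead of ending the program with a traceback.

```python
import hashlib
import sys


def process_file(filename, err=sys.stderr):
    """Return 'hash  name', or None after reporting the problem on err."""
    try:
        with open(filename, "rb") as f:
            data = f.read()
    except (FileNotFoundError, IsADirectoryError) as exc:
        # The message goes to the error stream so that piped data stays clean.
        print(f"{sys.argv[0]}: {filename}: {exc.strerror}", file=err)
        return None
    return f"{hashlib.sha1(data).hexdigest()}  {filename}"


print(process_file("bad_file.txt"))
```

    If bad_file.txt does not exist in the current directory, the call reports the error on stderr and returns None . Note that on Windows, opening a directory may raise PermissionError instead, which this sketch does not catch.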

    When you execute this modified script, you get this:

    Note that the error displayed to the terminal is written to stderr , so it doesn’t interfere with the data expected by a command that would read the output of sha1sum_val.py :

    This command pipes the output of sha1sum_val.py to cut to only include the first field. You can see that cut ignores the error message because it only receives the data sent to stdout .

    The Python Standard Library

    Despite the different approaches you took to process Python command-line arguments, any complex program might be better off leveraging existing libraries to handle the heavy lifting required by sophisticated command-line interfaces. As of Python 3.7, there are three command line parsers in the standard library: argparse , getopt , and optparse .

    The recommended module to use from the standard library is argparse . The standard library also exposes optparse but it’s officially deprecated and only mentioned here for your information. It was superseded by argparse in Python 3.2 and you won’t see it discussed in this tutorial.

    argparse

    You’re going to revisit sha1sum_val.py , the most recent clone of sha1sum , to introduce the benefits of argparse . To this effect, you’ll modify main() and add init_argparse to instantiate argparse.ArgumentParser :
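    A minimal sketch of such an init_argparse() function could look like this. The usage string, the description, and the version text are illustrative assumptions. The argument name files matches the field discussed in this section.

```python
import argparse


def init_argparse():
    parser = argparse.ArgumentParser(
        usage="%(prog)s [OPTION] [FILE]...",
        description="Print or check SHA1 (160-bit) checksums.",
    )
    # action="version" prints the version string and exits.
    parser.add_argument(
        "-v", "--version", action="version", version="%(prog)s version 1.0.0"
    )
    # Zero or more file names; a lone "-" is accepted as a positional argument.
    parser.add_argument("files", nargs="*")
    return parser


# A real script would call parse_args() without arguments to read sys.argv.
args = init_argparse().parse_args(["main.c", "-"])
print(args.files)  # ['main.c', '-']
```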

    For the cost of a few more lines compared to the previous implementation, you get a clean approach to add --help and --version options that didn’t exist before. The expected arguments (the files to be processed) are all available in the files field of the argparse.Namespace object. This object is populated by calling parse_args() .

    To illustrate the immediate benefit you obtain by introducing argparse in this program, execute the following:

    getopt

    getopt finds its origins in the getopt C function. It facilitates parsing the command line and handling options, option arguments, and arguments. Revisit parse from seq_parse.py to use getopt :

    getopt.getopt() takes the following arguments:

    1. The usual arguments list minus the script name, sys.argv[1:]
    2. A string defining the short options
    3. A list of strings for the long options

    Note that a short option followed by a colon ( : ) expects an option argument, and that a long option trailed with an equals sign ( = ) expects an option argument.
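    The call described above can be sketched as follows. The default separator, the usage text, and the error message are assumptions. Note that with this simple approach, getopt would treat a negative operand such as -1 as an unknown option.

```python
import getopt


def parse(argv):
    """argv is the argument list without the script name, e.g. sys.argv[1:]."""
    try:
        # "hs:" means -h takes no argument and -s expects one;
        # "separator=" means --separator expects one as well.
        options, operands = getopt.getopt(argv, "hs:", ["help", "separator="])
    except getopt.GetoptError as exc:
        raise SystemExit(f"error: {exc}")
    separator = "\n"
    for option, value in options:
        if option in ("-h", "--help"):
            raise SystemExit("usage: seq_getopt.py [-h] [-s SEP] operand...")
        if option in ("-s", "--separator"):
            separator = value
    return separator, [int(operand) for operand in operands]


print(parse(["--separator", ":", "1", "5"]))  # (':', [1, 5])
print(parse(["-s", ", ", "10"]))              # (', ', [10])
```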

    The remaining code of seq_getopt.py is the same as seq_parse.py .

    Next, you’ll take a look at some external packages that will help you parse Python command-line arguments.

    A Few External Python Packages

    Building upon the existing conventions you saw in this tutorial, there are a few libraries available on the Python Package Index (PyPI) that take many more steps to facilitate the implementation and maintenance of command-line interfaces.

    The following sections offer a glance at Click and Python Prompt Toolkit. You’ll only be exposed to very limited capabilities of these packages, as they both would require a full tutorial—if not a whole series—to do them justice!

    Click

    As of this writing, Click is perhaps the most advanced library to build a sophisticated command-line interface for a Python program. It’s used by several Python products, most notably Flask and Black. Before you try the following example, you need to install Click in either a Python virtual environment or your local environment. If you’re not familiar with the concept of virtual environments, then check out Python Virtual Environments: A Primer.

    To install Click, proceed as follows:

    So, how could Click help you handle the Python command-line arguments? Here’s a variation of the seq program using Click:

    Setting ignore_unknown_options to True ensures that Click doesn’t parse negative arguments as options. Negative integers are valid seq arguments.

    As you may have observed, you get a lot for free! A few well-carved decorators are sufficient to bury the boilerplate code, allowing you to focus on the main code, which is the content of seq() in this example.

    Note: For more about Python decorators, check out Primer on Python Decorators.

    The only import remaining is click . The declarative approach of decorating the main command, seq() , eliminates repetitive code that’s otherwise necessary. This could be any of the following:

    • Defining a help or usage procedure
    • Handling the version of the program
    • Capturing and setting up default values for options
    • Validating arguments, including the type

    The new seq implementation barely scratches the surface. Click offers many niceties that will help you craft a very professional command-line interface:

    • Output coloring
    • Prompt for omitted arguments
    • Commands and sub-commands
    • Argument type validation
    • Callback on options and arguments
    • File path validation
    • Progress bar

    There are many other features as well. Check out Writing Python Command-Line Tools With Click to see more concrete examples based on Click.

    Python Prompt Toolkit

    There are other popular Python packages that are handling the command-line interface problem, like docopt for Python. So, you may find the choice of the Prompt Toolkit a bit counterintuitive.

    The Python Prompt Toolkit provides features that may make your command line application drift away from the Unix philosophy. However, it helps to bridge the gap between an arcane command-line interface and a full-fledged graphical user interface. In other words, it may help to make your tools and programs more user-friendly.

    You can use this tool in addition to processing Python command-line arguments as in the previous examples, but this gives you a path to a UI-like approach without you having to depend on a full Python UI toolkit. To use prompt_toolkit , you need to install it with pip :

    You may find the next example a bit contrived, but the intent is to spur ideas and move you slightly away from more rigorous aspects of the command line with respect to the conventions you’ve seen in this tutorial.

    As you’ve already seen the core logic of this example, the code snippet below only presents the code that significantly deviates from the previous examples:

    The code above involves ways to interact and possibly guide users to enter the expected input, and to validate the input interactively using three dialog boxes:

    1. button_dialog
    2. message_dialog
    3. input_dialog

    The Python Prompt Toolkit exposes many other features intended to improve interaction with users. The call to the handler in main() is triggered by calling a function stored in a dictionary. Check out Emulating switch/case Statements in Python if you’ve never encountered this Python idiom before.

    When you execute the code above, you’re greeted with a dialog prompting you for action. Then, if you choose the action Sequence, another dialog box is displayed. After collecting all the necessary data, options, or arguments, the dialog box disappears, and the result is printed at the command line, as in the previous examples:

    As the command line evolves and you can see some attempts to interact with users more creatively, other packages like PyInquirer also allow you to capitalize on a very interactive approach.

    To further explore the world of the Text-Based User Interface (TUI), check out Building Console User Interfaces and the Third Party section in Your Guide to the Python Print Function.

    If you’re interested in researching solutions that rely exclusively on the graphical user interface, then you may consider checking out the following resources:

    Conclusion

    In this tutorial, you’ve navigated many different aspects of Python command-line arguments. You should feel prepared to apply the following skills to your code:

    • The conventions and pseudo-standards of Python command-line arguments
    • The origins of sys.argv in Python
    • The usage of sys.argv to provide flexibility in running your Python programs
    • The Python standard libraries like argparse or getopt that abstract command line processing
    • The powerful Python packages like click and prompt_toolkit to further improve the usability of your programs

    Whether you’re running a small script or a complex text-based application, when you expose a command-line interface you’ll significantly improve the user experience of your Python software. In fact, you’re probably one of those users!

    Next time you use your application, you’ll appreciate the documentation you supplied with the --help option or the fact that you can pass options and arguments instead of modifying the source code to supply different data.

    Additional Resources

    To gain further insights about Python command-line arguments and their many facets, you may want to check out the following resources:

    You may also want to try other Python libraries that target the same problems while providing you with different solutions:

    About Andre Burgaud

    Andre is a seasoned software engineer passionate about technology and programming languages, in particular, Python.
