Confuse: Painless Configuration
Confuse is a straightforward, full-featured configuration system for Python.
Basic Usage
Set up your Configuration object, which provides unified access to all of your application’s config settings:
config = confuse.Configuration("MyGreatApp", __name__)
The first parameter is required; it’s the name of your application, which will
be used to search the system for a config file named config.yaml. See
Search Paths for the specific locations searched.
The second parameter is optional: it’s the name of a module that will guide the
search for a defaults file. Use this if you want to include a
config_default.yaml file inside your package. (The included example
package does exactly this.)
Now, you can access your configuration data as if it were a simple structure
consisting of nested dicts and lists—except that you need to call the method
.get() on the leaf of this tree to get the result as a value:
value = config["foo"][2]["bar"].get()
Under the hood, accessing items in your configuration tree builds up a view
into your app’s configuration. Then, get() flattens this view into a value,
performing a search through each configuration data source to find an answer.
(More on views later.)
If you know that a configuration value should have a specific type, just pass
that type to get():
int_value = config["number_of_goats"].get(int)
This way, Confuse will either give you an integer or raise a ConfigTypeError
if the user has messed up the configuration. You’re safe to assume after this
call that int_value has the right type. If the key doesn’t exist in any
configuration file, Confuse will raise a NotFoundError. Together, catching
these exceptions (both subclasses of confuse.ConfigError) lets you
painlessly validate the user’s configuration as you go.
View Theory
The Confuse API is based on the concept of views. You can think of a view as a
place to look in a config file: for example, one view might say “get the value
for key number_of_goats.” Another might say “get the value at index 8 inside
the sequence for key animal_counts.” To get the value for a given view, you
resolve it by calling the get() method.
This concept separates the specification of a location from the mechanism for retrieving data from a location. (In this sense, it’s a little like XPath: you specify a path to data you want and then you retrieve it.)
Using views, you can write config['animal_counts'][8] and know that no
exceptions will be raised until you call get(), even if the
animal_counts key does not exist. More importantly, it lets you write a
single expression to search many different data sources without preemptively
merging all sources together into a single data structure.
Views also solve an important problem with overriding collections. Imagine, for
example, that you have a dictionary called deliciousness in your config file
that maps food names to tastiness ratings. If the default configuration gives
carrots a rating of 8 and the user’s config rates them a 10, then clearly
config['deliciousness']['carrots'].get() should return 10. But what if the
two data sources have different sets of vegetables? If the user provides a value
for broccoli and zucchini but not carrots, should carrots have a default
deliciousness value of 8 or should Confuse just throw an exception? With
Confuse’s views, the application gets to decide.
The above expression, config['deliciousness']['carrots'].get(), returns 8
(falling back on the default). However, you can also write
config['deliciousness'].get(). This expression will cause the entire
user-specified mapping to override the default one, providing a dict object like
{'broccoli': 7, 'zucchini': 9}. As a rule, then, resolve a view at the same
granularity you want config files to override each other.
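To make the idea concrete, here is a minimal sketch of how a view could work. SketchView is a hypothetical stand-in written for illustration, not Confuse's actual implementation; it assumes each source is a plain nested dict or list and that sources are ordered from highest to lowest priority.

```python
# Simplified stand-in for a Confuse-style view; not Confuse's real code.
_MISSING = object()


class SketchView:
    """Remembers a path into the config; nothing is looked up until get()."""

    def __init__(self, sources, path=()):
        self.sources = sources  # highest priority first
        self.path = path

    def __getitem__(self, key):
        # Indexing only extends the path, so it can never raise.
        return SketchView(self.sources, self.path + (key,))

    def get(self):
        # Walk the path in each source; the first source that has it wins.
        for source in self.sources:
            value = source
            for key in self.path:
                try:
                    value = value[key]
                except (KeyError, IndexError, TypeError):
                    value = _MISSING
                    break
            if value is not _MISSING:
                return value
        raise KeyError("/".join(map(str, self.path)))


user = {"deliciousness": {"broccoli": 7, "zucchini": 9}}
defaults = {"deliciousness": {"carrots": 8}, "animal_counts": [1, 2]}
config = SketchView([user, defaults])

carrots = config["deliciousness"]["carrots"].get()  # falls back to defaults
whole = config["deliciousness"].get()               # user mapping wins outright
pending = config["animal_counts"][8]                # no error raised here

try:
    pending.get()
    resolved = True
except KeyError:
    resolved = False  # the missing index only surfaces at get() time
```

In this model, resolving at the leaf lets the default fill in for carrots, while resolving at the mapping level hands back the user's dict untouched.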
Warning
It may appear that calling config.get() would retrieve the entire
configuration at once. However, this will return only the highest-priority
configuration source, masking any lower-priority values for keys that are
not present in the top source. This pitfall is especially likely when using
Command-Line Options or Environment Variables, which may place
an empty configuration at the top of the stack. A subsequent call to
config.get() might then return no configuration at all.
Validation
We saw above that you can easily assert that a configuration value has a certain
type by passing that type to get(). But sometimes you need to do more than
just type checking. For this reason, Confuse provides a few methods on views
that perform fancier validation or even conversion:
as_filename(): Normalize a filename, substituting tildes and absolute-ifying relative paths. For filenames defined in a config file, by default the filename is relative to the application’s config directory (Configuration.config_dir(), as described below). However, if the config file was loaded with the base_for_paths parameter set to True (see Manually Specifying Config Files), then a relative path refers to the directory containing the config file. A relative path from any other source (e.g., command-line options) is relative to the working directory. For full control over relative path resolution, use the Filename template directly (see Filename).
as_choice(choices): Check that a value is one of the provided choices. The argument should be a sequence of possible values. If the sequence is a dict, then this method returns the associated value instead of the key.
as_number(): Raise an exception unless the value is of a numeric type.
as_pairs(): Get a collection as a list of pairs. The collection should be a list of elements that are either pairs (i.e., two-element lists) already or single-entry dicts. This can be helpful because, in YAML, lists of single-element mappings have a simple syntax (- key: value) and, unlike real mappings, preserve order.
as_str_seq(): Given either a string or a list of strings, return a list of strings. A single string is split on whitespace.
as_str_expanded(): Expand any environment variables contained in a string using os.path.expandvars().
For example, config['path'].as_filename() ensures that you get a reasonable
filename string from the configuration. And calling
config['direction'].as_choice(['up', 'down']) will raise a
ConfigValueError unless the direction value is either “up” or “down”.
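If it helps to see the described behavior spelled out, the following sketch re-creates three of these conversions as plain functions. The names str_seq, pairs, and choice are hypothetical, and the built-in TypeError and ValueError stand in for Confuse's own exception classes; the real methods may handle more cases than shown here.

```python
def str_seq(value):
    """Stand-in for as_str_seq(): a single string is split on whitespace."""
    if isinstance(value, str):
        return value.split()
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return list(value)
    raise TypeError("expected a string or a list of strings")


def pairs(value):
    """Stand-in for as_pairs(): accept two-element lists or one-entry dicts."""
    result = []
    for item in value:
        if isinstance(item, dict) and len(item) == 1:
            result.append(next(iter(item.items())))
        elif isinstance(item, (list, tuple)) and len(item) == 2:
            result.append(tuple(item))
        else:
            raise ValueError(f"not a pair: {item!r}")
    return result


def choice(value, choices):
    """Stand-in for as_choice(): a dict maps the key to its associated value."""
    if value not in choices:
        raise ValueError(f"{value!r} is not one of {list(choices)!r}")
    return choices[value] if isinstance(choices, dict) else value


words = str_seq("alpha beta  gamma")
ordered = pairs([{"first": 1}, ["second", 2]])
direction = choice("up", {"up": 1, "down": -1})
```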
Command-Line Options
Arguments to command-line programs can be seen as just another source for configuration options. Just as options in a user-specific configuration file should override those from a system-wide config, command-line options should take priority over all configuration files.
You can use the argparse and optparse modules from the standard library with
Confuse to accomplish this. Just call the set_args method on any view and
pass in the object returned by the command-line parsing library. Values from the
command-line option namespace object will be added to the overlay for the view
in question. For example, with argparse:
args = parser.parse_args()
config.set_args(args)
Correspondingly, with optparse:
options, args = parser.parse_args()
config.set_args(options)
This call will turn all of the command-line options into a top-level source in your configuration. The key associated with each option in the parser will become a key available in your configuration. For example, consider this argparse script:
config = confuse.Configuration("myapp")
parser = argparse.ArgumentParser()
parser.add_argument("--foo", help="a parameter")
args = parser.parse_args()
config.set_args(args)
print(config["foo"].get())
This will allow the user to override the configured value for key foo by
passing --foo <something> on the command line.
To override nested values, pass dots=True and use dot-delimited property names on the incoming object.
parser.add_argument("--bar", help="nested parameter", dest="foo.bar")
args = parser.parse_args() # args looks like: {'foo.bar': 'value'}
config.set_args(args, dots=True)
print(config["foo"]["bar"].get())
set_args works with generic dictionaries too.
args = {"foo": {"bar": 1}}
config.set_args(args, dots=True)
print(config["foo"]["bar"].get())
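As a rough illustration of what dots=True implies, this sketch expands dot-delimited names into nested dictionaries using only the standard library. The expand_dots helper is hypothetical and is not part of Confuse; it only shows the shape of the transformation.

```python
import argparse


def expand_dots(flat):
    """Turn {'foo.bar': 1} into {'foo': {'bar': 1}} (hypothetical helper)."""
    nested = {}
    for dotted, value in flat.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


parser = argparse.ArgumentParser()
parser.add_argument("--bar", help="nested parameter", dest="foo.bar")
args = parser.parse_args(["--bar", "value"])  # explicit argv for the demo

flat = vars(args)           # the namespace as a flat dict
nested = expand_dots(flat)  # the same data as a nested dict
```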
Note that, while you can use the full power of your favorite command-line
parsing library, you’ll probably want to avoid specifying defaults in your
argparse or optparse setup. This way, Confuse can use other configuration
sources—possibly your config_default.yaml—to fill in values for
unspecified command-line switches. Otherwise, the argparse/optparse default
value will hide options configured elsewhere.
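The effect of parser defaults can be seen in a small standard-library sketch. It assumes, for illustration only, that the overlay skips options whose value is None; the configured dict stands in for values loaded from config files.

```python
import argparse

# Pretend these values came from a config file.
configured = {"foo": "from-config-file", "level": 3}

parser = argparse.ArgumentParser()
parser.add_argument("--foo")                         # no default: None if omitted
parser.add_argument("--level", type=int, default=1)  # default is always present

args = parser.parse_args([])  # the user passed no options at all

# Overlay only options that carry a value, mimicking a top-priority source.
overlay = {key: value for key, value in vars(args).items() if value is not None}
effective = {**configured, **overlay}
```

Here foo keeps its configured value because the parser left it unset, but level is replaced by the parser default of 1 even though the user never passed --level.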
Environment Variables
Confuse supports using environment variables as another source to provide an
additional layer of configuration. The environment variables to include are
identified by a prefix, which defaults to the uppercased name of your
application followed by an underscore. Matching environment variable names are
first stripped of this prefix and then lowercased to determine the corresponding
configuration option. To load the environment variables for your application
using the default prefix, just call set_env on your Configuration
object. Config values from the environment will then be added as an overlay at
the highest precedence. For example:
export MYAPP_FOO=something
import confuse
config = confuse.Configuration("myapp", __name__)
config.set_env()
print(config["foo"].get())
Nested config values can be overridden by using a separator string in the environment variable name. By default, double underscores are used as the separator for nesting, to avoid clashes with config options that contain single underscores. Note that most shells restrict environment variable names to alphanumeric and underscore characters, so dots are not a valid separator.
export MYAPP_FOO__BAR=something
import confuse
config = confuse.Configuration("myapp", __name__)
config.set_env()
print(config["foo"]["bar"].get())
Both the prefix and the separator can be customized when using set_env. Note
that prefix matching is done to the environment variables prior to
lowercasing, while the separator is matched after lowercasing.
export APPFOO_NESTED_BAR=something
import confuse
config = confuse.Configuration("myapp", __name__)
config.set_env(prefix="APP", sep="_nested_")
print(config["foo"]["bar"].get())
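The matching rules above can be modeled in a few lines. This is a simplified sketch, not Confuse's implementation: env_to_config is a hypothetical function, it takes the environment as a plain dict, and it leaves out type conversion and list handling.

```python
def env_to_config(environ, prefix, sep="__"):
    """Build a nested dict from matching variables (simplified model)."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(prefix):   # prefix is matched before lowercasing
            continue
        key = name[len(prefix):].lower()  # strip the prefix, then lowercase
        node = config
        *parents, leaf = key.split(sep)   # separator is matched after lowercasing
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


environ = {
    "MYAPP_FOO__BAR": "something",
    "APPFOO_NESTED_BAR": "other",
    "HOME": "/root",
}
default_style = env_to_config(environ, prefix="MYAPP_")
custom_style = env_to_config(environ, prefix="APP", sep="_nested_")
```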
For configurations that include lists, use integers starting from 0 as nested keys to invoke “list conversion.” If any of the sibling nested keys are not integers or the integers are not sequential starting from 0, then conversion will not be performed. Nested lists and combinations of nested dicts and lists are supported.
export MYAPP_FOO__0=first
export MYAPP_FOO__1=second
export MYAPP_FOO__2__BAR__0=nested
import confuse
config = confuse.Configuration("myapp", __name__)
config.set_env()
print(config["foo"].get()) # ['first', 'second', {'bar': ['nested']}]
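The list conversion rule can be sketched as a recursive pass over the nested dict built from the variable names. The convert_lists function below is a hypothetical model of the rule as described, not the library's code.

```python
def convert_lists(node):
    """Recursively turn dicts keyed '0', '1', ... into lists (simplified model)."""
    if not isinstance(node, dict):
        return node
    converted = {key: convert_lists(value) for key, value in node.items()}
    try:
        by_index = {int(key): value for key, value in converted.items()}
    except ValueError:
        return converted  # at least one sibling key is not an integer
    if not by_index or sorted(by_index) != list(range(len(converted))):
        return converted  # empty, or not sequential starting from 0
    return [by_index[i] for i in range(len(by_index))]


nested = {"foo": {"0": "first", "1": "second", "2": {"bar": {"0": "nested"}}}}
as_lists = convert_lists(nested)

mixed = convert_lists({"foo": {"0": "a", "x": "b"}})  # non-integer sibling
gappy = convert_lists({"foo": {"0": "a", "2": "b"}})  # index 1 is missing
```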
For consistency with YAML config files, the values of environment variables are
type converted using the same YAML parser used for file-based configs. This
means that numeric strings will be converted to integers or floats, “true” and
“false” will be converted to booleans, and the empty string or “null” will be
converted to None. Setting an environment variable to the empty string or
“null” allows unsetting a config value from a lower-precedence source.
To change the lowercasing and list handling behaviors when loading environment
variables or to enable full YAML parsing of environment variables, you can
initialize an EnvSource configuration source directly.
If you use config overlays from both command-line args and environment
variables, the order of calls to set_args and set_env will determine the
precedence, with the last call having the highest precedence.
Search Paths
Confuse looks in a number of locations for your application’s configurations.
The locations are determined by the platform. For each platform, Confuse has a
list of directories in which it looks for a directory named after the
application. For example, the first search location on Unix-y systems is
$XDG_CONFIG_HOME/AppName for an application called AppName.
Here are the default search paths for each platform:
macOS: ~/.config/app and ~/Library/Application Support/app
Other Unix: ~/.config/app and /etc/app
Windows: %APPDATA%\app, where the APPDATA environment variable falls back to %HOME%\AppData\Roaming if undefined
On both macOS and other Unix operating systems, Confuse also reads the
XDG_CONFIG_HOME and XDG_CONFIG_DIRS environment variables, if set, and
searches those directories as well.
Users can also add an override configuration directory with an environment
variable. The variable name is the application name in capitals with “DIR”
appended: for an application named AppName, the environment variable is
APPNAMEDIR.
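As a loose sketch of how these rules combine on a non-macOS Unix system, the function below lists candidate directories. The candidate_dirs name and, in particular, the ordering (override first, then XDG_CONFIG_HOME, then XDG_CONFIG_DIRS, then /etc) are assumptions made for illustration; check Confuse itself for the exact order it uses.

```python
import posixpath


def candidate_dirs(appname, environ):
    """List directories to check, highest priority first (assumed ordering)."""
    dirs = []
    override = environ.get(appname.upper() + "DIR")  # e.g. APPNAMEDIR
    if override:
        dirs.append(override)
    home = environ.get("HOME", "~")
    xdg_home = environ.get("XDG_CONFIG_HOME") or posixpath.join(home, ".config")
    bases = [xdg_home]
    bases += [d for d in environ.get("XDG_CONFIG_DIRS", "").split(":") if d]
    bases.append("/etc")
    dirs += [posixpath.join(base, appname) for base in bases]
    return dirs


environ = {
    "HOME": "/home/u",
    "APPNAMEDIR": "/tmp/cfg",
    "XDG_CONFIG_DIRS": "/etc/xdg",
}
found = candidate_dirs("AppName", environ)
```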
Manually Specifying Config Files
You may want to leverage Confuse’s features without Search Paths. This can be done by manually specifying the YAML files you want to include, which also allows changing how relative paths in the file will be resolved:
import confuse
# Instantiates config. Confuse searches for a config_default.yaml
config = confuse.Configuration("MyGreatApp", __name__)
# Add config items from specified file. Relative path values within the
# file are resolved relative to the application's configuration directory.
config.set_file("subdirectory/default_config.yaml")
# Add config items from a second file. If some items were already defined,
# they will be overwritten (new file precedes the previous ones). With
# `base_for_paths` set to True, relative path values in this file will be
# resolved relative to the config file's directory (i.e., 'subdirectory').
config.set_file("subdirectory/local_config.yaml", base_for_paths=True)
val = config["foo"]["bar"].get(int)
Your Application Directory
Confuse provides a simple helper, Configuration.config_dir(), that gives you
a directory used to store your application’s configuration. If a configuration
file exists in any of the searched locations, then the highest-priority
directory containing a config file is used. Otherwise, a directory is created
for you and returned. So you can always expect this method to give you a
directory that actually exists.
As an example, you may want to migrate a user’s settings to Confuse from an older configuration system such as ConfigParser. Just do something like this:
config_filename = os.path.join(config.config_dir(), confuse.CONFIG_FILENAME)
with open(config_filename, "w") as f:
    yaml.dump(migrated_config, f)
Dynamic Updates
Occasionally, a program will need to modify its configuration while it’s running. For example, an interactive prompt from the user might cause the program to change a setting for the current execution only. Or the program might need to add a derived configuration value that the user doesn’t specify.
To facilitate this, Confuse lets you assign to view objects using ordinary
Python assignment. Assignment will add an overlay source that precedes all other
configuration sources in priority. Here’s an example of programmatically setting
a configuration value based on a DEBUG constant:
if DEBUG:
    config["verbosity"] = 100
...
my_logger.setLevel(config["verbosity"].get(int))
This example allows the constant to override the default verbosity level, which would otherwise come from a configuration file.
Assignment works by creating a new “source” for configuration data at the top of
the stack. This new source takes priority over all other, previously-loaded
sources. You can cause this explicitly by calling the set() method on any
view. A related method, add(), works similarly but instead adds a new
lowest-priority source to the bottom of the stack. This can be used to provide
defaults for options that may be overridden by previously-loaded configuration
files.
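The difference between set() and add() comes down to where the new source lands in the stack. The SourceStack class below is a hypothetical, flat-dict model of that idea and not Confuse's own data structure.

```python
class SourceStack:
    """Simplified model of prioritized sources; index 0 is highest priority."""

    def __init__(self, *sources):
        self.sources = list(sources)

    def set(self, values):
        self.sources.insert(0, values)  # new top-priority overlay

    def add(self, values):
        self.sources.append(values)     # new lowest-priority fallback

    def get(self, key):
        for source in self.sources:
            if key in source:
                return source[key]
        raise KeyError(key)


stack = SourceStack({"verbosity": 1})         # pretend this came from a file
stack.add({"verbosity": 0, "color": "auto"})  # defaults only fill gaps
file_level = stack.get("verbosity")           # the file still wins over add()
stack.set({"verbosity": 100})                 # override for this run
run_level = stack.get("verbosity")
color = stack.get("color")
```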
YAML Tweaks
Confuse uses the PyYAML module to parse YAML configuration files. However, it deviates very slightly from the official YAML specification to provide a few niceties suited to human-written configuration files. Those tweaks are:
All strings are returned as Python Unicode objects.
YAML maps are parsed as Python OrderedDict objects. This means that you can recover the order in which the user wrote a dictionary’s keys.
Bare strings can begin with the % character. In stock PyYAML, this will throw a parse error.
To produce a YAML string reflecting a configuration, just call
config.dump(). This does not cleanly round-trip YAML, but it does play some
tricks to preserve comments and spacing in the original file.
Custom YAML Loaders
You can also specify your own PyYAML Loader object to parse YAML files. Supply the loader parameter to a Configuration constructor, like this:
config = confuse.Configuration("name", loader=yaml.Loader)
To imbue a loader with Confuse’s special parser overrides, use its add_constructors method:
class MyLoader(yaml.Loader): ...
confuse.Loader.add_constructors(MyLoader)
config = confuse.Configuration("name", loader=MyLoader)
Configuring Large Programs
One problem that must be solved by a configuration system is the issue of global configuration for complex applications. In a large program with many components and many config options, it can be unwieldy to explicitly pass configuration values from component to component. You quickly end up with monstrous function signatures with dozens of keyword arguments, decreasing code legibility and testability.
In such systems, one option is to pass a single Configuration object through to each component. To avoid even this, however, it’s sometimes appropriate to use a little bit of shared global state. As evil as shared global state usually is, configuration is (in my opinion) one valid use: since configuration is mostly read-only, it’s relatively unlikely to cause the sorts of problems that global values sometimes can. And having a global repository for configuration options can vastly reduce the amount of boilerplate threading-through needed to explicitly pass configuration from call to call.
To use global configuration, consider creating a configuration object in a well-known module (say, the root of a package). But since this object will be initialized at module load time, Confuse provides a LazyConfig object that loads your configuration files on demand instead of when the object is constructed. (Doing complicated stuff like parsing YAML at module load time is generally considered a Bad Idea.)
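The load-on-demand idea can be illustrated with a small stand-in. LazySettings and read_settings are hypothetical names invented for this sketch; it shows the general pattern only and says nothing about how LazyConfig is implemented internally.

```python
class LazySettings:
    """Stand-in showing load-on-first-access; not Confuse's LazyConfig."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns a dict
        self._data = None
        self.load_count = 0

    def _ensure_loaded(self):
        if self._data is None:
            self._data = self._loader()
            self.load_count += 1

    def __getitem__(self, key):
        self._ensure_loaded()
        return self._data[key]


def read_settings():
    # Placeholder for expensive work such as reading and parsing YAML files.
    return {"verbosity": 2}


settings = LazySettings(read_settings)  # cheap, so safe at module import time
before = settings.load_count            # nothing has been loaded yet
level = settings["verbosity"]           # first access triggers the load
again = settings["verbosity"]           # later accesses reuse the loaded data
```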
Global state can cause problems for unit testing. To alleviate this, consider adding code to your test fixtures (e.g., setUp in the unittest module) that clears out the global configuration before each test is run. Something like this:
config.clear()
config.read(user=False)
These lines will empty out the current configuration and then re-load the defaults (but not the user’s configuration files). Your tests can then modify the global configuration values without affecting other tests since these modifications will be cleared out before the next test runs.
Redaction
You can also mark certain configuration values as “sensitive” and avoid including them in output. Just set the redact flag:
config["key"].redact = True
Then flatten or dump the configuration like so:
config.dump(redact=True)
The resulting YAML will contain “key: REDACTED” instead of the original data.
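Conceptually, redaction amounts to swapping flagged values for a placeholder while the data is being flattened. The sketch below models that with a hypothetical redacted_copy function over plain nested dicts, where flagged locations are given as key-path tuples; it is an illustration of the idea rather than Confuse's implementation.

```python
def redacted_copy(data, redact_paths, path=()):
    """Return a copy of nested dicts with flagged paths replaced."""
    result = {}
    for key, value in data.items():
        here = path + (key,)
        if here in redact_paths:
            result[key] = "REDACTED"
        elif isinstance(value, dict):
            result[key] = redacted_copy(value, redact_paths, here)
        else:
            result[key] = value
    return result


settings = {"key": "hunter2", "db": {"user": "me", "password": "s3cret"}}
flagged = {("key",), ("db", "password")}
safe = redacted_copy(settings, flagged)
```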