Introduction
Welcome to the documentation of the pycomex package for computational experiments in Python! This page
contains some general remarks about the design choices of the project, installation instructions and
a quickstart code example.
More detailed explanations of the package's features are provided as a series of annotated code Examples.
Main Philosophy
This package aims to improve the experience of conducting computational experiments and managing their records, as is often required in academic fields such as algorithm engineering, data science or machine learning.
The main assumption is that each individual experiment is, or can be, defined within its own Python code file. These experiment files act as the basic units of code encapsulation and separation of concerns. Think of these files as serving the same basic purpose as encapsulating complex code in reusable functions.
The basic goals of this approach are the following:
Improved management of records/artifacts. Every run of an experiment creates new data, be it in the form of logged metrics, result plots, images or other kinds of file artifacts. This package creates a wrapping layer around each experiment, which automatically takes care of these menial “housekeeping” tasks. For every run of each experiment a new folder is automatically created. This folder will contain all the relevant artifacts and data created during the run.
Improved repeatability. A core goal of any kind of experiment is reproducibility of results. However, the reproducibility of intermediate results is often lost as the experiment code or parameters are irreversibly changed over time. Pycomex saves a snapshot of the exact original code used to produce the results, so that every experiment can be repeated in the future.
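The two goals above can be illustrated with a minimal standard-library sketch. It is not pycomex's actual implementation: the function names `create_run_folder` and `snapshot_code`, the `run_0000` folder naming and the `snapshot.py` file name are all assumptions made for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def create_run_folder(base_path: Path, namespace: str) -> Path:
    """Create and return a fresh folder for a single experiment run.

    Hypothetical helper: folders are numbered run_0000, run_0001, ...
    so that repeated runs never overwrite each other.
    """
    namespace_path = base_path / namespace
    namespace_path.mkdir(parents=True, exist_ok=True)
    index = 0
    while True:
        run_path = namespace_path / f"run_{index:04d}"
        try:
            # mkdir fails if the folder already exists, so we try the next index
            run_path.mkdir()
            return run_path
        except FileExistsError:
            index += 1


def snapshot_code(source_file: Path, run_path: Path) -> Path:
    """Copy the experiment source file into the run folder (hypothetical helper)."""
    target = run_path / "snapshot.py"
    shutil.copy2(source_file, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        source = base / "experiment.py"
        source.write_text("print('hello world!')\n")

        run_path = create_run_folder(base, "example/quickstart")
        copy_path = snapshot_code(source, run_path)
        print(run_path.name, copy_path.name)
```

Keeping the copied source next to the artifacts means a run can later be repeated from exactly the code that produced it, even if the original file has changed since.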
Note
Why not use functions?
As mentioned, the rationale behind the experiment files is basically the same as encapsulating various pieces of code into their own functions. Why not just have experiments be functions then?
That certainly would have been possible and a lot of this comes down to arbitrary design choices / personal preference. That being said, I think there are some advantages to using files:
Files are more self-contained than functions or classes. Being able to execute one file as a (more or less) standalone unit makes it easier to implement the repeatability of experiments.
Code for computational experiments can become rather complex to the point where additional functions and classes specific to an individual experiment may have to be defined. Having to define local functions/classes within another function may be considered bad practice and decrease readability.
Installation
Stable release
To install pycomex, run this command in your terminal:
$ pip install pycomex
This is the preferred method to install pycomex, as it will always install the most recent stable release.
If you don’t have pip installed, the Python installation guide can walk you through the process.
From sources
The sources for pycomex can be downloaded from the GitHub repo.
You can clone the public repository:
$ git clone https://github.com/the16thpythonist/pycomex
Once you have a copy of the source, you can install it with:
$ cd pycomex
$ pip3 install .
$ python -m pycomex.cli --version
0.6.0
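The installation can also be checked from within Python. The following is a small sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name `pycomex` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pycomex")
    if found is None:
        print("pycomex is not installed in this environment")
    else:
        print(f"pycomex {found} is installed")
```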
Getting Started
To get started, create a new Python file to contain the experiment code and import the Experiment
context manager from the pycomex package. Place any imports at the top of the file as usual, followed
by the concrete values of the experiment parameters and any helper functions.
The main logic of the experiment goes into the with block of the context manager. Upon entering
the context manager, the experiment folder for this particular run is created. The Experiment context
manager also handles errors (saving them to a file) and takes care of storing and saving the main
associative data store, among other things.
"""
This doc string will be saved as the "description" meta data of the experiment records
"""
from pycomex.experiment import Experiment
from pycomex.util import Skippable

SHORT_DESCRIPTION = ('An example experiment describing the very first steps, which are needed to get '
                     'started with the library')

# Experiment parameters can simply be defined as uppercase global variables.
# These are automatically detected and can possibly be overwritten in command
# line invocation
HELLO = "hello "
WORLD = "world!"

# Experiment context manager needs 3 positional arguments:
# - Path to an existing folder in which to store the results
# - A namespace name unique for each experiment
# - access to the local globals() dict
with Skippable(), (e := Experiment("/tmp", "example/quickstart", globals())):

    # Internally saved into automatically created nested dict
    # {'strings': {'hello_world': '...'}}
    e["strings/hello_world"] = HELLO + WORLD

    # Alternative to "print". Message is printed to stdout as well as
    # recorded to log file
    e.info("some debug message")

    # Automatically saves text file artifact to the experiment record folder
    file_name = "hello_world.txt"
    e.commit_raw(file_name, HELLO + WORLD)
    # e.commit_fig(file_name, fig)
    # e.commit_png(file_name, image)
    # ...

# All the code inside this context will be copied to the "analysis.py"
# file which will be created as an experiment artifact.
with Skippable(), e.analysis:
    # And we can access all the internal fields of the experiment object
    # and the experiment parameters here!
    print(HELLO, WORLD)
    print(e['strings/hello_world'])
    # logging will print to stdout but not modify the log file
    e.info('analysis done')
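Two conventions used in the quickstart, uppercase globals as parameters and slash-separated keys that map onto a nested dict, can be pictured with the plain-Python sketch below. This is only an illustration of the idea, not the library's internals: `collect_parameters`, `set_nested` and `get_nested` are hypothetical helpers that do not exist in pycomex.

```python
from typing import Any, Dict


def collect_parameters(namespace: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out uppercase names, e.g. from a module's globals() dict."""
    return {
        name: value
        for name, value in namespace.items()
        if name.isupper() and not name.startswith("_")
    }


def set_nested(store: Dict[str, Any], key: str, value: Any) -> None:
    """Store a value under a slash-separated key, creating nested dicts on the way."""
    *parents, leaf = key.split("/")
    current = store
    for part in parents:
        current = current.setdefault(part, {})
    current[leaf] = value


def get_nested(store: Dict[str, Any], key: str) -> Any:
    """Look up a value by its slash-separated key."""
    current: Any = store
    for part in key.split("/"):
        current = current[part]
    return current


if __name__ == "__main__":
    fake_globals = {"HELLO": "hello ", "WORLD": "world!", "helper": len}
    params = collect_parameters(fake_globals)

    data: Dict[str, Any] = {}
    set_nested(data, "strings/hello_world", params["HELLO"] + params["WORLD"])
    print(data)
    print(get_nested(data, "strings/hello_world"))
```

Note that under this simple rule every uppercase global counts as a parameter, which in the quickstart would include SHORT_DESCRIPTION as well as HELLO and WORLD.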
For a more detailed introduction to the various features, see the series of Examples in the next chapter!