Lab: An environment for running experiments
-
class
epyc.
Lab
(notebook: epyc.labnotebook.LabNotebook = None, design: epyc.design.Design = None)¶ A laboratory for computational experiments.
A Lab conducts an experiment at different points in a multi-dimensional parameter space. The default performs all the experiments locally; sub-classes exist to perform remote parallel experiments.
A Lab stores its results in a notebook, an instance of LabNotebook. By default the base Lab class uses an in-memory notebook, essentially just a dict; sub-classes use persistent notebooks to manage larger sets of experiments.
Each lab has an associated Design that turns a set of parameter ranges into a set of individual “points” of the parameter space at which to perform actual experiments. The default is to use a FactorialDesign that performs an experiment for every combination of parameter values. This might be a lot of experiments, and other designs can be used to reduce or modify the space.
Parameters: - notebook – the notebook used to store results (defaults to an empty LabNotebook)
- design – the experimental design to use (defaults to a FactorialDesign)
Lab creation and management
-
Lab.
__init__
(notebook: epyc.labnotebook.LabNotebook = None, design: epyc.design.Design = None)¶ Initialize self. See help(type(self)) for accurate signature.
-
Lab.
open
()¶ Open a lab for business. Sub-classes might insist that they are opened and closed explicitly when experiments are being performed. The default does nothing.
-
Lab.
close
()¶ Shut down a lab. Sub-classes might insist that they are opened and closed explicitly when experiments are being performed. The default does nothing.
-
Lab.
updateResults
()¶ Update the lab’s results. This method is called by all other methods that return results in some sense, and may be overridden to let the results “catch up” with external processing. The default does nothing.
Parameter management
A Lab
is equipped with a multi-dimensional parameter space
over which to run experiments, one experiment per point. The
dimensions of the space can be defined by single values, lists, or
iterators that give the points along that dimension. Strings are
considered to be single values, even though they’re technically
iterable in Python. Experiments are then conducted on the cross
product of the dimensions.
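The way ranges are interpreted can be sketched in plain Python. This is an illustration of the behaviour described above, not epyc's own code: `as_range` and `cross_product` are hypothetical helper names, and the parameter names are made up for the example.

```python
from collections.abc import Iterable
from itertools import product
from typing import Any, Dict, List


def as_range(r: Any) -> List[Any]:
    """Normalise a parameter range to a list of values.

    Strings are treated as single values even though they are iterable.
    """
    if isinstance(r, str):
        return [r]
    if isinstance(r, Iterable):
        return list(r)
    return [r]


def cross_product(space: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one point (a dict of parameter values) per combination."""
    names = list(space)
    ranges = [as_range(space[k]) for k in names]
    return [dict(zip(names, values)) for values in product(*ranges)]


# A list, an iterator, and a string (which counts as a single value)
space = {"n": [10, 20, 30], "p": iter([0.1, 0.5]), "topology": "ring"}
points = cross_product(space)
print(len(points))   # 3 * 2 * 1 = 6 points
print(points[0])     # {'n': 10, 'p': 0.1, 'topology': 'ring'}
```

The string `"ring"` contributes a dimension of size one, so it appears unchanged in every point rather than being split into characters.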
-
Lab.
addParameter
(k: str, r: Any)¶ Add a parameter to the experiment’s parameter space. k is the parameter name, and r is its range. The range can be a single value or a list, or any other iterable. (Strings are counted as single values.)
Parameters: - k – parameter name
- r – parameter range
-
Lab.
parameters
() → List[str]¶ Return a list of parameter names.
Returns: a list of parameter names
-
Lab.
__len__
() → int¶ The length of an experiment is the total number of data points that will be explored. This is the length of the experimental configuration returned by experiments().
Returns: the number of experimental runs
-
Lab.
__getitem__
(k: str) → Any¶ Access a parameter range using array notation.
Parameters: k – parameter name
Returns: the parameter range
-
Lab.
__setitem__
(k: str, r: Any)¶ Add a parameter using array notation.
Parameters: - k – the parameter name
- r – the parameter range
Parameters can be dropped, either individually or en masse, to
prepare the lab for another experiment. This will often accompany
creating or selecting a new result set in the LabNotebook
.
-
Lab.
__delitem__
(k: str)¶ Delete a parameter using array notation.
Parameters: k – the key
-
Lab.
deleteParameter
(k: str)¶ Delete a parameter from the parameter space. If the parameter doesn’t exist then this is a no-op.
Parameters: k – the parameter name
-
Lab.
deleteAllParameters
()¶ Delete all parameters from the parameter space.
Building the parameter space
The parameter ranges defined above need to be translated into “points”
in the parameter space at which to conduct experiments. This function
is delegated to the experimental design, an instance of
Design
, which turns ranges into points. The design is
provided at construction time: by default a FactorialDesign
is used, and this will be adequate for most use cases.
-
Lab.
design
() → epyc.design.Design¶ Return the experimental design this lab uses.
Returns: the design
-
Lab.
experiments
(e: epyc.experiment.Experiment) → List[Tuple[epyc.experiment.Experiment, Dict[str, Any]]]¶ Return the experimental configuration, a list consisting of experiments and the points at which they should be run. The structure of the experimental space is defined by the lab’s experimental design, which may also change the experiment to be run.
Parameters: e – the experiment
Returns: an experimental configuration
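To illustrate what a design does, the sketch below contrasts a factorial design with a hypothetical "pointwise" alternative that pairs values position by position. Both functions are stand-ins written for this example and are not epyc's Design classes; they return (experiment, point) pairs in the shape described for experiments().

```python
from itertools import product
from typing import Any, Dict, List, Tuple

Point = Dict[str, Any]
Configuration = List[Tuple[Any, Point]]


def factorial(e: Any, ranges: Dict[str, List[Any]]) -> Configuration:
    """One point for every combination of parameter values."""
    names = list(ranges)
    combos = product(*(ranges[k] for k in names))
    return [(e, dict(zip(names, values))) for values in combos]


def pointwise(e: Any, ranges: Dict[str, List[Any]]) -> Configuration:
    """One point per position: first values together, second values together, ..."""
    names = list(ranges)
    rows = zip(*(ranges[k] for k in names))
    return [(e, dict(zip(names, values))) for values in rows]


ranges = {"a": [1, 2, 3], "b": [10, 20, 30]}
experiment = object()   # placeholder standing in for an experiment

print(len(factorial(experiment, ranges)))   # 9
print(len(pointwise(experiment, ranges)))   # 3
print(pointwise(experiment, ranges)[1][1])  # {'a': 2, 'b': 20}
```

The same ranges give nine runs under the factorial sketch and three under the pointwise one, which is the sense in which a design can reduce the space.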
Running experiments
Running experiments involves providing an Experiment object,
which is executed by setting its parameter point (using Experiment.set())
and then running it (by calling Experiment.run()). The Lab
co-ordinates the running of the experiment at all the points chosen by
the design.
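The set-then-run cycle can be sketched with a stand-in class. `SquareExperiment` and `run_all` are hypothetical names for this example, and the shape of the returned dict (with "parameters" and "results" keys) is an assumption made for illustration rather than a statement about epyc's results format.

```python
from typing import Any, Dict, List


class SquareExperiment:
    """Stand-in exposing the two methods named above: set() and run()."""

    def __init__(self) -> None:
        self._point: Dict[str, Any] = {}

    def set(self, point: Dict[str, Any]) -> "SquareExperiment":
        # Record the parameter point for the next run
        self._point = dict(point)
        return self

    def run(self) -> Dict[str, Any]:
        # Assumed result shape, for illustration only
        x = self._point["x"]
        return {"parameters": self._point, "results": {"y": x * x}}


def run_all(e: SquareExperiment, points: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run the experiment at every point and collect the results in order."""
    notebook = []
    for point in points:
        notebook.append(e.set(point).run())
    return notebook


notebook = run_all(SquareExperiment(), [{"x": x} for x in range(4)])
print([r["results"]["y"] for r in notebook])   # [0, 1, 4, 9]
```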
-
Lab.
runExperiment
(e: epyc.experiment.Experiment)¶ Run an experiment over all the points in the parameter space. The results will be stored in the notebook.
Parameters: e – the experiment
-
Lab.
ready
(tag: str = None) → bool¶ Test whether all the results are ready in the tagged result set – that is, none are pending.
Parameters: tag – (optional) the result set to check (default is the current result set)
Returns: True if the results are in
-
Lab.
readyFraction
(tag: str = None) → float¶ Return the fraction of results available (not pending) in the tagged result set after first updating the results.
Parameters: tag – (optional) the result set to check (default is the current result set)
Returns: the ready fraction
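The relationship between the two checks can be written as simple arithmetic over counts of available and pending results. The function names below are hypothetical, and the treatment of an empty result set is an assumption of this sketch.

```python
def ready_fraction(available: int, pending: int) -> float:
    """Fraction of results that are available rather than pending."""
    total = available + pending
    if total == 0:
        return 1.0   # assumption: an empty result set counts as fully ready
    return available / total


def is_ready(pending: int) -> bool:
    """All results are in exactly when none are pending."""
    return pending == 0


print(ready_fraction(3, 1))   # 0.75
print(is_ready(1))            # False
print(is_ready(0))            # True
```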
Conditional experiments
Sometimes it is useful to run experiments conditionally, for example
to create a result set only if it doesn’t already
exist. Lab
can do this by providing a function to execute in
order to populate a result set.
Note
This technique works especially well with Jupyter notebooks, to avoid re-computing some cells. See Avoiding repeated computation.
-
Lab.
createWith
(tag: str, f: Callable[[Lab], bool], description: str = None, propagate: bool = True, delete: bool = True, finish: bool = False, deleteAllParameters: bool = True)¶ Use a function to create a result set.
If the result set already exists in the lab’s notebook, it is selected; if it doesn’t, it is created, selected, and the creation function is called. The creation function is passed a reference to the lab it is populating.
By default any exception in the creation function will cause the incomplete result set to be deleted and the previously current result set to be re-selected: this can be inhibited by setting delete=False. Any raised exception is propagated by default: this can be inhibited by setting propagate=False. The result set can be locked after creation by setting finish=True, as long as the creation was successful: poorly-created result sets aren’t locked.
By default the lab has its parameters cleared before calling the creation function, so that it occurs “clean”. Set deleteAllParameters=False to inhibit this.
Parameters: - tag – the result set tag
- f – the creation function (taking Lab as argument)
- description – (optional) description if a result set is created
- propagate – (optional) propagate any exception (defaults to True)
- delete – (optional) delete on exception (default is True)
- finish – (optional) lock the result set after creation (defaults to False)
- deleteAllParameters – (optional) delete all lab parameters before creation (defaults to True)
Returns: True if the result set exists already or was properly created
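The select-or-create logic described for createWith can be sketched with a small dict-backed stand-in. `TinyLab` and its attributes are hypothetical and are not epyc's classes; the sketch also ignores the creation function's return value, which is a simplification.

```python
from typing import Callable, Dict, List, Set


class TinyLab:
    """Dict-backed stand-in for a lab and its notebook (illustration only)."""

    def __init__(self) -> None:
        self.result_sets: Dict[str, List[dict]] = {"default": []}
        self.current = "default"
        self.locked: Set[str] = set()
        self.parameters: Dict[str, object] = {}

    def create_with(self, tag: str, f: Callable[["TinyLab"], object],
                    propagate: bool = True, delete: bool = True,
                    finish: bool = False,
                    delete_all_parameters: bool = True) -> bool:
        if tag in self.result_sets:
            self.current = tag            # already exists: just select it
            return True
        previous = self.current
        self.result_sets[tag] = []        # create and select the new set
        self.current = tag
        if delete_all_parameters:
            self.parameters.clear()       # start the creation "clean"
        try:
            f(self)
        except Exception:
            if delete:
                del self.result_sets[tag]  # drop the incomplete set
                self.current = previous    # re-select the earlier set
            if propagate:
                raise
            return False
        if finish:
            self.locked.add(tag)          # lock only after success
        return True


def populate(lab: TinyLab) -> None:
    lab.result_sets[lab.current].append({"y": 1})


def broken(lab: TinyLab) -> None:
    raise ValueError("failed part-way")


lab = TinyLab()
print(lab.create_with("first", populate))               # True
print(lab.create_with("first", populate))               # True, populate not re-run
print(lab.create_with("bad", broken, propagate=False))  # False
print(lab.current, "bad" in lab.result_sets)            # first False
```

Calling `create_with` a second time with the same tag only selects the existing result set, which is what makes the pattern useful for avoiding repeated computation.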
Accessing results
Results of experiments can be accessed directly, via the lab’s
underlying LabNotebook, or as a DataFrame from
the pandas analysis package.
-
Lab.
notebook
() → epyc.labnotebook.LabNotebook¶ Return the notebook being used by this lab.
Returns: the notebook
-
Lab.
results
() → List[Dict[str, Dict[str, Any]]]¶ Return the current results as a list of results dicts after resolving any pending results that have completed. This makes use of the underlying notebook’s current result set. For finer control, access the notebook’s LabNotebook.results() or LabNotebook.resultsFor() methods directly.
Note that this approach to acquiring results is a lot slower and more memory-hungry than using dataframe(), but may be useful for small sets of results that benefit from a more Pythonic interface.
-
Lab.
dataframe
(only_successful: bool = True) → pandas.core.frame.DataFrame¶ Return the current results as a pandas DataFrame after resolving any pending results that have completed. This makes use of the underlying notebook’s current result set. For finer control, access the notebook’s LabNotebook.dataframe() or LabNotebook.dataframeFor() methods directly.
Parameters: only_successful – only return successful results
Returns: the resulting dataset as a DataFrame