From Inputs to Outputs
What is a run?
QA-Board will run:
- your code on inputs with a given configuration
- and expect output files along with metrics.
Depending on your domain, those will be different. Here are some examples:
Domain | Input | Configuration | Output | Metric |
---|---|---|---|---|
Image processing | image | feature flag & params | transformed image, debug data... | SNR, sharpness, color accuracy... |
Cloud server choice | integration test | instance type | | cost, throughput... |
Machine learning | database/sample | hyperparameters | convergence plots / individual results | loss |
Optimization research | problem | model type, solver | solution | cost, runtime... |
Software performance | unit test | feature flag, platform | perf recordings, benchmark histograms | runtime, memory, latency, IPC, throughput... |
Hardware/driver perf | hardware/unit-test | sysbench config | latency histogram | ops/s, runtime... |
How QA-Board looks for inputs
To make things simple, QA-Board expects that your inputs are existing paths.
note
It is also possible to use external databases, not just files, or a list of unit tests. If you need it, read this.
Try to run:
qa run --input /dev/random
#=> it should invite you to run *your* code
tip
Relative paths will be searched relative to the "database". The default is / (or C:// on Windows), and you can change it in qaboard.yaml (inputs.database).
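As a rough illustration of the rule in the tip above, here is a minimal Python sketch. The helper `resolve_input` is hypothetical and is not part of QA-Board's API; it only assumes that absolute paths are used as-is and that relative paths are joined to the database root.

```python
from pathlib import PurePosixPath


def resolve_input(input_path: str, database: str = "/") -> PurePosixPath:
    """Hypothetical helper: keep absolute paths, join relative ones to the database."""
    path = PurePosixPath(input_path)
    if path.is_absolute():
        return path
    return PurePosixPath(database) / path


# Relative path: looked up under the database root (here an assumed /mnt/datasets).
print(resolve_input("images/A.jpg", database="/mnt/datasets"))
# Absolute path: returned unchanged.
print(resolve_input("/dev/random"))
```

With the default database of /, a relative path such as images/A.jpg would resolve to /images/A.jpg under this assumption.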
Batches of inputs
To run on batches of multiple inputs, use qa batch my-batch, where my-batch is defined in:
my-batch:
  inputs:
    - images/A.jpg
    - images/B.jpg
qa batch my-batch
#=> qa run --input images/A.jpg
#=> qa run --input images/B.jpg
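Conceptually, a batch is just a named list of inputs that expands into one run per input. The Python sketch below mimics that expansion. It is an illustration only: `batches` stands in for the parsed YAML, and `expand_batch` is a hypothetical name, not how QA-Board is implemented.

```python
from typing import Dict, List

# Stand-in for the parsed batch definitions (the YAML is assumed to be loaded already).
batches: Dict[str, dict] = {
    "my-batch": {"inputs": ["images/A.jpg", "images/B.jpg"]},
}


def expand_batch(name: str, definitions: Dict[str, dict]) -> List[str]:
    """Hypothetical helper: one `qa run` command line per input of the batch."""
    return [f"qa run --input {path}" for path in definitions[name]["inputs"]]


for command in expand_batch("my-batch", batches):
    print(command)
```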
note
We'll cover batches in more depth later. By default, batches run in parallel locally, but you can easily set up an async task queue like Celery (built-in support), LSF or others.
(Optional) Identifying inputs
You'll often want to do something like "run on all the images in a given folder". For that to work, you have to tell QA-Board how to identify those images as inputs.
In qaboard.yaml, set inputs.globs to a glob pattern. Here is an example where your inputs are .jpg images:
inputs:
  globs: '*.jpg'
When you do this, you no longer have to define an explicit list of input paths in your batches. You can instead use folders or even globs/wildcards (*, **...):
my-batch:
  inputs:
    - images
tip
To run on all the inputs found under $database / $PATH you can simply use qa batch $PATH.
You can give multiple patterns:
inputs:
  globs:
    - '*.jpg'
    - '*.bmp'
    - '*.dng'
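To make the idea of identifying inputs concrete, here is a small Python sketch that collects files matching any of several glob patterns under a folder. It assumes a recursive search, which may differ from what QA-Board does internally, and `find_inputs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_inputs(folder: Path, globs: Iterable[str]) -> List[Path]:
    """Hypothetical helper: recursively collect files matching any of the patterns."""
    found = set()
    for pattern in globs:
        found.update(p for p in folder.rglob(pattern) if p.is_file())
    return sorted(found)


# Build a throwaway "database" to try the helper on.
with tempfile.TemporaryDirectory() as tmp:
    database = Path(tmp)
    (database / "images" / "night").mkdir(parents=True)
    for name in ["images/A.jpg", "images/night/B.bmp", "images/notes.txt"]:
        (database / name).touch()
    inputs = find_inputs(database / "images", ["*.jpg", "*.bmp", "*.dng"])
    # notes.txt matches no pattern, so it is not reported as an input.
    print([p.relative_to(database).as_posix() for p in inputs])
```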
A common use case is identifying folders that contain a file matching a pattern, for instance movies stored as a sequence of frames: frame_000.jpg, frame_001.jpg... In this case you can use use_parent_folder:
inputs:
  globs: frame_000.jpg
  use_parent_folder: true
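The effect of use_parent_folder can be pictured with the following Python sketch: each file matching the pattern is replaced by the folder that contains it, so every movie is reported once. This is an assumed reading of the option, and `find_movie_inputs` is a hypothetical helper rather than QA-Board code.

```python
import tempfile
from pathlib import Path
from typing import List


def find_movie_inputs(folder: Path, pattern: str = "frame_000.jpg") -> List[Path]:
    """Hypothetical helper: return each folder containing a file that matches `pattern`."""
    return sorted({match.parent for match in folder.rglob(pattern)})


# Two movies, each stored as a folder of three frames.
with tempfile.TemporaryDirectory() as tmp:
    database = Path(tmp)
    for movie in ["movies/sunset", "movies/city"]:
        (database / movie).mkdir(parents=True)
        for index in range(3):
            (database / movie / f"frame_{index:03d}.jpg").touch()
    movies = find_movie_inputs(database / "movies")
    # Each movie folder appears once, however many frames it holds.
    print([m.relative_to(database).as_posix() for m in movies])
```

Matching only the first frame (frame_000.jpg) rather than frame_*.jpg avoids visiting every frame just to find the folder.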
(Advanced) Handling multiple input types
Big projects sometimes need to distinguish between different types of inputs, each processed with its own logic.
inputs:
  types:
    default: image
    image:
      globs: '*.raw'
    movie:
      globs: frame_000.jpg
      use_parent_folder: true
      # you can override the defaults per-type
      database:
        linux: /mnt/movies
        windows: F://movies
You can choose what type each batch is:
my-images:
  inputs:
    - my/image.jpg
my-batch:
  type: movie
  inputs:
    - folder/of/movies
    - other/movies
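One way to picture how a batch picks its input settings is the Python sketch below: a batch uses its own type key if present and otherwise falls back to the default type. The dictionaries mirror the configuration above, and `settings_for_batch` is a hypothetical helper. The real lookup in QA-Board may differ.

```python
from typing import Dict

# Stand-ins for the parsed configuration and batch definitions.
input_types: Dict[str, object] = {
    "default": "image",
    "image": {"globs": "*.raw"},
    "movie": {"globs": "frame_000.jpg", "use_parent_folder": True},
}
batches: Dict[str, dict] = {
    "my-images": {"inputs": ["my/image.jpg"]},
    "my-batch": {"type": "movie", "inputs": ["folder/of/movies", "other/movies"]},
}


def settings_for_batch(name: str) -> Dict[str, object]:
    """Hypothetical helper: the batch's `type`, else the default type, with its settings."""
    type_name = batches[name].get("type", input_types["default"])
    return {"type": type_name, **input_types[type_name]}


print(settings_for_batch("my-images"))
print(settings_for_batch("my-batch"))
```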
If needed, you can also specify the input type from the CLI:
qa batch my-images # by default look for images
qa --type movie batch my-movies # here we look for movies
(Advanced) Allowing duplicate batch names
In some of our projects, we have tons of subprojects, which all define the same "standard" batch name used in the CI. Sometimes we also want to run all of those batches at once from the root project. To do so:
inputs:
  batches: some/path/*/batches.yaml
  allow_duplicate_batches: true
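The sketch below shows one plausible reading of this option, in Python: batch definitions gathered from several files are grouped by name, and a repeated name is an error unless duplicates are allowed. The file paths, the batch name `ci`, and the `collect_batches` helper are all hypothetical. QA-Board's actual behavior is not confirmed by this page.

```python
from collections import defaultdict
from typing import Dict, List


def collect_batches(
    files: Dict[str, Dict[str, dict]], allow_duplicates: bool = False
) -> Dict[str, List[dict]]:
    """Hypothetical helper: group batch definitions from several files by batch name."""
    collected: Dict[str, List[dict]] = defaultdict(list)
    for path, definitions in files.items():
        for name, definition in definitions.items():
            if collected[name] and not allow_duplicates:
                raise ValueError(f"batch {name!r} is defined more than once (see {path})")
            collected[name].append(definition)
    return dict(collected)


# Two subprojects that both define a batch with the same (made-up) name "ci".
files = {
    "some/path/a/batches.yaml": {"ci": {"inputs": ["a/images"]}},
    "some/path/b/batches.yaml": {"ci": {"inputs": ["b/images"]}},
}
merged = collect_batches(files, allow_duplicates=True)
print(len(merged["ci"]))  # both definitions are kept under the same name
```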