11 Named Scenarios | NoSQLBench Project (PREVIEW)

There is one final element of a yaml that you need to know about: named scenarios.

Named Scenarios allow anybody to run your testing workflows with a single command.

You can provide named scenarios for a workload like this:

# contents of myworkloads.yaml
scenarios:
  default:
    - run driver=diag cycles=10 alias=first-ten
    - run driver=diag cycles=10..20 alias=second-ten
  longrun:
    - run driver=diag cycles=10M

This lets you specify more detailed workflows that users may want to run, without requiring them to build up a command line themselves.

A couple of other forms are supported in the YAML, for terseness:

scenarios:
  oneliner: run driver=diag cycles=10
  mapform:
    part1: run driver=diag cycles=10 alias=part1
    part2: run driver=diag cycles=20 alias=part2

These forms are conveniences for common editing habits. Internally, each one is read as a list of named steps. In the map form, the keys are used to name activities, and their order is retained.
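
The reading of all three forms as a list of named steps can be sketched in Python. This is a minimal illustration of the rule as described here, not NoSQLBench's actual implementation; the function name `normalize_scenario` is hypothetical, and the zero-padded step names for the list and string forms follow the naming described under Automatic Parameters.

```python
from typing import Dict, List, Tuple, Union

ScenarioValue = Union[str, List[str], Dict[str, str]]


def normalize_scenario(value: ScenarioValue) -> List[Tuple[str, str]]:
    """Return a scenario as an ordered list of (step_name, command) pairs."""
    if isinstance(value, str):
        # One-liner form: treat it as a list with a single step.
        value = [value]
    if isinstance(value, dict):
        # Map form: keys are the step names; dicts keep insertion order.
        return list(value.items())
    # List form: steps are numbered 000, 001, ... within this scenario.
    return [(f"{i:03d}", cmd) for i, cmd in enumerate(value)]


scenarios = {
    "oneliner": "run driver=diag cycles=10",
    "default": [
        "run driver=diag cycles=10 alias=first-ten",
        "run driver=diag cycles=10..20 alias=second-ten",
    ],
    "mapform": {
        "part1": "run driver=diag cycles=10",
        "part2": "run driver=diag cycles=20",
    },
}
normalized = {name: normalize_scenario(v) for name, v in scenarios.items()}
for name, steps in normalized.items():
    print(name, steps)
```

Whatever form the author chose, downstream logic in this sketch only ever sees ordered (step name, command) pairs.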

Scenario selection

A scenario is always run by name, so that it can be looked up in the list of named scenarios under your scenarios: property. The only exception is when no explicit scenario name is given on the command line, in which case the name default is assumed.

Some examples may be more illustrative:

# runs the scenario named 'default' if it exists, or throws an error if it does not.
nb5 myworkloads
# or
nb5 myworkloads default

# runs the named scenario 'longrun' if it exists, or throws an error if it does not.
nb5 myworkloads longrun

# runs the named scenario 'longrun' if it exists, or throws an error if it does not.
# this is simply the canonical form which is more verbose, but more explicit.
nb5 scenario myworkloads longrun

# run multiple named scenarios from one workload, and then some from another
nb5 scenario myworkloads longrun default longrun scenario another.yaml name1 name2
# In this form ^ you may have to add the explicit form to avoid conflicts between
# workload names and scenario names. That's why the explicit form is provided, after all.
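
The lookup rule behind these commands can be sketched in Python as follows. This is an illustrative model only: `select_scenario` is a hypothetical helper, and the error text is modeled on the message nb5 prints for a missing scenario rather than copied from its source.

```python
from typing import Dict, Optional


def select_scenario(workload: str, scenarios: Dict[str, object],
                    name: Optional[str] = None) -> object:
    """Look up a scenario by name, assuming 'default' when no name is given."""
    chosen = name if name is not None else "default"
    if chosen not in scenarios:
        available = ",".join(scenarios)
        raise ValueError(
            f"Unable to find named scenario '{chosen}' in workload "
            f"'{workload}', but you can pick from {available}"
        )
    return scenarios[chosen]


myworkloads = {
    "default": [
        "run driver=diag cycles=10 alias=first-ten",
        "run driver=diag cycles=10..20 alias=second-ten",
    ],
    "longrun": ["run driver=diag cycles=10M"],
}

# Equivalent to "nb5 myworkloads" and "nb5 myworkloads longrun" in this model.
print(select_scenario("myworkloads", myworkloads))
print(select_scenario("myworkloads", myworkloads, "longrun"))
```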

Workload selection

The examples above contain no reference to a workload (formerly called yaml). They don't need one, because each command template implicitly refers to the workload file that contains it. You may add a workload= parameter to the command templates if you like, but this is never needed for basic use, and it is error-prone to keep the filename matched to the command template. Just leave it out by default.

However, if you are doing advanced scripting across multiple systems, you can provide a workload= parameter specifically to use another workload description in your test.

👉 This is a powerful feature for workload automation and organization. However, it can get unwieldy quickly. Caution is advised for deep-linking too many scenarios in a workspace, as there is no mechanism for keeping them in sync when small changes are made.

Named Scenario Discovery

Users can list all the named scenarios that are currently bundled or in view of their current directory. Scenario publishers must follow a couple of simple rules to keep this straightforward:

Workload files in the current directory with name *.yaml are considered.
Workload files under the relative path activities/ with name *.yaml are considered.
The same rules are used when looking in the bundled NoSQLBench, so built-ins come along for the ride.
Any workload file that contains a scenarios: tag is included, but all others are ignored.

This doesn't mean that you can't use named scenarios for workloads in other locations. It simply means that when users use the --list-scenarios option, these are the only ones they will see listed.
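
The local part of these discovery rules can be sketched in Python. This is an approximation under stated assumptions, not how NoSQLBench implements --list-scenarios: `discover_workloads` is a hypothetical name, bundled workloads are not modeled, and a top-level scenarios: key is detected with a regular expression instead of a YAML parser.

```python
import re
from pathlib import Path
from typing import List

# Simplification: match a top-level "scenarios:" key textually.
SCENARIOS_KEY = re.compile(r"^scenarios:", re.MULTILINE)


def discover_workloads(base: Path) -> List[Path]:
    """List *.yaml files in base and base/activities that define scenarios."""
    candidates: List[Path] = []
    for directory in (base, base / "activities"):
        if directory.is_dir():
            candidates.extend(sorted(directory.glob("*.yaml")))
    return [
        path for path in candidates
        if SCENARIOS_KEY.search(path.read_text(encoding="utf-8"))
    ]


if __name__ == "__main__":
    for workload in discover_workloads(Path.cwd()):
        print(workload)
```

A workload file stored anywhere else is simply never visited by this search, which matches the point above: it can still be run by path, but it will not be listed.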

Parameter Overrides

You can override parameters that are provided by named scenarios. Any parameter that you specify on the command line after your workload and optional scenario name will be used to override or augment the commands that are provided for the named scenario.

This is powerful, but it also means that user-provided activity parameters on the command line can sometimes be combined with the named scenario commands in ways that may not make sense. To solve this, the parameters in the named scenario commands may be locked. You can lock them silently, or you can use verbose locking, which causes an error if the user even tries to adjust them.

Silent locking is provided with a form like param==value. Any silently locked parameter will reject overrides from the command line, but will not interrupt the user.

Verbose locking is provided with a form like param===value. Any time a user provides a value on the command line for the locked parameter, an error is thrown, and they are informed that this is not possible. This level is provided for cases in which you would not want the user to be unaware of a rejected override for a parameter which is germane and specific to the named scenario.

Any other parameter provided by the user that matches a parameter in a command template will take the place of that same-named parameter, keeping its position in the template. Any remaining parameters provided by the user will be added to each of the command templates in the order they appear on the command line.

This is a little counter-intuitive at first, but once you see some examples it should make sense.
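
As a rough model of the rules above, the following Python sketch applies user parameters to one command template. It is an illustration only, not NoSQLBench's implementation: `split_param` and `apply_overrides` are hypothetical names, templates are split on whitespace (so quoted values are not handled), and the error wording is invented for the sketch.

```python
from typing import Dict, Tuple


def split_param(token: str) -> Tuple[str, int, str]:
    """Split 'name=value', 'name==value' or 'name===value'.

    Returns (name, level, value), where level is the number of '=' signs.
    """
    name, rest = token.split("=", 1)
    level = 1
    while rest.startswith("=") and level < 3:
        rest = rest[1:]
        level += 1
    return name, level, rest


def apply_overrides(template: str, user_params: Dict[str, str]) -> str:
    """Merge command-line parameters into a single command template."""
    out = []
    seen = set()
    for word in template.split():
        if "=" not in word:
            out.append(word)  # the command itself, e.g. "run"
            continue
        name, level, value = split_param(word)
        seen.add(name)
        if name in user_params:
            if level == 3:
                # Verbose lock: refuse the override loudly.
                raise ValueError(f"parameter '{name}' is locked and cannot be overridden")
            if level == 1:
                # Unlocked: the user value replaces the template value in place.
                value = user_params[name]
            # level == 2 (silent lock): keep the template value without comment.
        out.append(f"{name}={value}")
    # Parameters the template does not mention are appended in command-line order.
    out.extend(f"{n}={v}" for n, v in user_params.items() if n not in seen)
    return " ".join(out)


print(apply_overrides("run driver=stdout cycles=10", {"threads": "4", "cycles": "7"}))
print(apply_overrides("run driver=stdout cycles==10", {"cycles": "7"}))
```

In this model an override never moves a parameter within the template, while parameters the template does not mention are appended after it.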

Parameter Override Examples

Consider a simple workload with three named scenarios:

# basics.yaml
scenarios:
  s1: run driver=stdout cycles=10
  s2: run driver=stdout cycles==10
  s3: run driver=stdout cycles===10

bindings:
  c: Identity()

statements:
  - A: "cycle={c}\n"

Running this without naming a scenario fails, because the workload has no scenario named default. The error lists the named scenarios you can pick from:

$ nb5 basics
ERROR: Unable to find named scenario 'default' in workload 'basics', but you can pick from s1,s2,s3
$

Basic Override example

If you run the first scenario s1 with your own value for cycles=7, it does as you ask:

$ nb5 basics s1 cycles=7
Logging to logs/scenario_20200324_205121_554.log
cycle=0
cycle=1
cycle=2
cycle=3
cycle=4
cycle=5
cycle=6
$

Silent Locking example

If you run the second scenario s2 with your own value for cycles=7, then it does what the locked parameter cycles==10 requires, without telling you that it is ignoring the specified value on your command line.

$ nb5 basics s2 cycles=7
Logging to logs/scenario_20200324_205339_486.log
cycle=0
cycle=1
cycle=2
cycle=3
cycle=4
cycle=5
cycle=6
cycle=7
cycle=8
cycle=9
$

Sometimes, this is appropriate, such as when specifying settings like threads== for schema activities.

Verbose Locking example

If you run the third scenario s3 with your own value for cycles=7, then you will get an error telling you that this is not possible. Sometimes you want to make sure that the user knows a parameter should not be changed, and that if they want to change it, they'll have to make their own custom version of the scenario in question.

$ nb5 basics s3 cycles=7
ERROR: Unable to reassign value for locked param 'cycles===7'
$

Ultimately, it is up to the scenario designer to decide when to lock parameters for users. The built-in workloads offer some examples of how to set these parameters so that the right values are locked in place without bothering the user, while other values are made very clear in how they should be set. Please look at these examples for inspiration when you need it.

Forcing Undefined Parameters

If you want to ensure that any parameter in a named scenario template remains unset in the generated scenario script, you can assign it a value of UNDEF. The locking behaviors described above apply to this one as well. Thus, for schema commands which rely on the default sequence length (which is based on the number of active statements), you can set cycles==UNDEF to ensure that when a user passes a cycles parameter, the schema activity doesn't break with too many cycles.
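
The interaction between UNDEF and locking can be sketched in Python as follows. This is a simplified model under assumptions, not the real implementation: `render_params` is a hypothetical helper, and template parameters are given as pre-parsed (name, operator, value) tuples rather than as a command string.

```python
from typing import Dict, List, Tuple

UNDEF = "UNDEF"


def render_params(template: List[Tuple[str, str, str]],
                  user: Dict[str, str]) -> List[str]:
    """Render parameters for one step, leaving out any that resolve to UNDEF.

    Each template entry is (name, operator, value), where operator is
    '=', '==' (silent lock) or '===' (verbose lock).
    """
    rendered = []
    for name, op, value in template:
        if name in user:
            if op == "===":
                raise ValueError(f"parameter '{name}' is locked and cannot be set")
            if op == "=":
                value = user[name]
            # '==' keeps the template value, including UNDEF.
        if value != UNDEF:
            rendered.append(f"{name}={value}")
    template_names = {name for name, _, _ in template}
    rendered.extend(f"{n}={v}" for n, v in user.items() if n not in template_names)
    return rendered


schema_step = [("driver", "=", "stdout"), ("cycles", "==", UNDEF)]
# The user's cycles=7 is silently dropped, so the activity keeps its default.
print(render_params(schema_step, {"cycles": "7"}))
```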

Automatic Parameters

Some parameters are already known due to the fact that you are using named scenarios.

workload

The workload parameter is, by default, set to the logical path (fully qualified workload name) of the yaml file containing the named scenario. However, if the command template contains this parameter, users may override it like any other parameter, subject to the assignment operators explained above.

alias

The alias parameter is, by default, set to the expanded name of WORKLOAD_SCENARIO_STEP, which means that each activity within the scenario has a distinct and symbolic name. This is important for distinguishing metrics from one another across workloads, named scenarios, and steps within a named scenario. The above words are interpolated into the alias as follows:

WORKLOAD - The simple name part of the fully qualified workload name. For example, with a workload (yaml path) of foo/bar/baz.yaml, the WORKLOAD name used here would be baz.
SCENARIO - The name of the scenario as provided on the command line.
STEP - The name of the step in the named scenario. If you used the list or string forms to provide a command template, then the steps are automatically named as a zero-padded number representing the step in the named scenario, starting from 000, per named scenario. (The numbers are not globally assigned)

Because it is important to have uniquely named activities for the sake of sane metrics and logging, any alias provided when using named scenarios which does not include the three tokens above will cause a warning to be issued to the user explaining why this is a bad idea.
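
The alias expansion and the missing-token warning can be sketched in Python. This is an illustrative model, assuming the tokens are replaced textually; `expand_alias` is a hypothetical name and the warning text is invented for the example.

```python
import re
import warnings
from pathlib import PurePosixPath

TOKENS = ("WORKLOAD", "SCENARIO", "STEP")
DEFAULT_ALIAS = "WORKLOAD_SCENARIO_STEP"


def expand_alias(workload: str, scenario: str, step: str,
                 template: str = DEFAULT_ALIAS) -> str:
    """Interpolate WORKLOAD, SCENARIO and STEP into an alias template."""
    missing = [token for token in TOKENS if token not in template]
    if missing:
        warnings.warn(
            f"alias '{template}' does not include {', '.join(missing)}; "
            "activity names may not be unique"
        )
    values = {
        # Simple name of the workload: foo/bar/baz.yaml -> baz
        "WORKLOAD": PurePosixPath(workload).stem,
        "SCENARIO": scenario,
        "STEP": step,
    }
    # Replace all tokens in a single pass so substituted values are not re-expanded.
    return re.sub("|".join(TOKENS), lambda match: values[match.group(0)], template)


print(expand_alias("foo/bar/baz.yaml", "default", "000"))  # baz_default_000
```

Passing a fixed alias such as first-ten to this sketch returns it unchanged and emits the warning, which mirrors why fixed aliases are discouraged in named scenarios.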

👉 UNDEF is handled before alias expansion above, so it is possible to force the default activity naming behavior above with alias===UNDEF. This is generally recommended, and will inform users if they try to set the alias in an unsafe way.
