Perform a parameter sensitivity analysis on the StrathE2E model.

Performs a one-at-a-time parameter sensitivity analysis on the StrathE2E model using the Morris Method for factorial sampling of the physical configuration parameters, the ecology model parameters, the fishing fleet model parameters, and the environmental forcings.

Usage

e2e_run_sens(
  model,
  nyears = 50,
  n_traj = 16,
  trajsd = 0.0075,
  n_setoflevels = 4,
  v_setoflevels = 0.1,
  coldstart = TRUE,
  quiet = TRUE,
  postprocess = TRUE,
  csv.output = FALSE,
  runtime.plot = TRUE,
  outID = 0
)

Arguments

model: R-list object generated by the e2e_read() function which defines the parent model configuration.
nyears: Number of years to run the model in each iteration (default=50).
n_traj: Number of trajectories of parameter sets (default=16).
trajsd: Coefficient of variation used to set the standard deviation of the Gaussian distribution from which new parameter values are drawn to create each trajectory baseline from the initial parent values (default=0.0075).
n_setoflevels: Number of fixed levels of coefficient of variation used to generate the individual parameter values in each level-run. Must be an even number (default=4).
v_setoflevels: Maximum coefficient of variation for the set of levels (default=0.1, i.e. -10 percent to +10 percent).
coldstart: Logical. If TRUE then the run starts from cold, which means that the first trajectory baseline is the parent configuration as specified in the 'model' list object. If FALSE then this is a parallel run which will later be merged with the 'coldstart=TRUE' run. In this case the first trajectory baseline is a derivative of the parent. Default=TRUE.
quiet: Logical. If TRUE then suppress informational messages at the start of each iteration (default=TRUE).
postprocess: Logical. If TRUE then process the results through to a final sorted list of parameter sensitivities for plotting. If FALSE then just produce the raw results. The reason for NOT processing would be that the job has been shared across multiple machines/processors and several raw result files need to be merged before processing. Default=TRUE.
csv.output: Logical. If TRUE then enable writing of csv output files (default=FALSE).
runtime.plot: Logical. If FALSE then disable runtime plotting of the progress of the run - useful for testing (default=TRUE).
outID: Numeric value in the range 0 to 251. Selects the output criterion to be used as the basis for the analysis. Default=0 corresponds to the likelihood of the observed target data. The other available values can be listed by running e2e_get_senscrit().

Value

Depends on the settings of the arguments 'postprocess' and 'csv.output'. If postprocess=TRUE and csv.output=TRUE then the outputs are csv files of raw parameter vectors, likelihoods and Elementary Effects for each run, and a parameter list sorted by EE_mean; in addition, the function returns the data of sorted parameters. If postprocess=FALSE and csv.output=FALSE then the function simply returns a dataframe of likelihoods and Elementary Effects for each run.

Details

The basis for the method is a scheme for sampling the model parameters, one at a time, from distributions around baseline sets, and testing the effects on the performance of the model against some criterion. The default criterion is the likelihood of the observational data on the state of the ecosystem that are used as the target for parameter optimization in the various simulated annealing functions supplied with the package. However, the criterion can in principle be any metric output from the model, e.g. a state variable value at some point in time or averaged over a time interval, or one of the computed fluxes.

The process requires an initial set of parameters for the model. We refer to this as the 'parent' parameter set. It is recommended that this should be the parameters producing the maximum likelihood of the observational target data (as estimated by e.g. the e2e_optimize_eco() function). The MODEL_SETUP.csv file in the folder /Models/Modelname/Modelvariant/ should be configured to point to the relevant files, and then loaded with the e2e_read() function.

From this parent set, a series of 'child' parameter sets is generated by applying a separate random increment to each parameter. Each increment is drawn from a Gaussian distribution of mean 0 and standard deviation given by a fixed coefficient of variation applied to the parent-set value of that parameter.
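The sampling step described above can be sketched in base R. This is a minimal illustration rather than the package's internal code: the parameter names and values in `parent` are invented, the helper `make_child()` is hypothetical, and taking the standard deviation as `trajsd * abs(parent)` is an assumption about how the coefficient of variation is applied.

```r
# Hypothetical parent parameter set (names and values invented for illustration)
parent <- c(uptake_rate = 0.8, mortality = 0.05, half_sat = 12)

# Coefficient of variation for the trajectory baselines (the trajsd default)
trajsd <- 0.0075

# Draw one child set: each parameter gets its own Gaussian increment with
# mean 0 and sd = cv * |parent value| (assumed interpretation)
make_child <- function(parent, cv) {
  parent + rnorm(length(parent), mean = 0, sd = cv * abs(parent))
}

set.seed(1)
# Five child sets, one per row, one column per parameter
children <- t(replicate(5, make_child(parent, trajsd)))
print(children)
```

With a coefficient of variation this small, every child set stays within a few percent of the parent values.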

For each of the child-sets of parameters, the chosen output criterion is saved following runs of StrathE2E to stationary state. We refer to these as trajectory baselines.

Then, for each trajectory, the parameters are varied in turn, one at a time, by adding a fixed proportionality increment to the trajectory baseline values, the model is re-run, and the output criterion is computed. We refer to these as 'level runs'. The proportionality increment is the same for all of the level runs within a given trajectory, and is drawn at random from a set of fixed levels distributed symmetrically around 0 (e.g. -10, -5, +5, +10 percent, i.e. proportions of the trajectory baseline values = 0.9, 0.95, 1.05, 1.10).
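As an illustration of how the set of levels and the level runs fit together, the following base R sketch builds the levels from `n_setoflevels` and `v_setoflevels` and then perturbs each parameter in turn. The even spacing of the levels is an assumption chosen to reproduce the example above; `make_levels()` and the `baseline` values are hypothetical and not part of the package.

```r
# Build the symmetric set of proportional increments (even spacing assumed)
make_levels <- function(n_setoflevels = 4, v_setoflevels = 0.1) {
  stopifnot(n_setoflevels %% 2 == 0)      # must be an even number
  half <- n_setoflevels / 2
  positive <- v_setoflevels * seq_len(half) / half
  c(-rev(positive), positive)
}

level_set <- make_levels()                # -0.10 -0.05  0.05  0.10

# Hypothetical trajectory baseline
baseline <- c(a = 2.0, b = 0.5, c = 10)

# One increment is drawn at random for the whole trajectory
set.seed(42)
delta <- sample(level_set, 1)

# One level run per parameter: only that parameter is changed,
# all others stay at their baseline values
level_runs <- t(sapply(seq_along(baseline), function(i) {
  p <- baseline
  p[i] <- p[i] * (1 + delta)
  p
}))
print(level_runs)
```

Each row of `level_runs` is the parameter vector for one level run, so a trajectory with n parameters needs n level runs plus the baseline run.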

For each level run, the 'Elementary Effect (EE)' of the given parameter is calculated from the difference between the level run criterion value and the corresponding trajectory baseline criterion value.

On completion of all the trajectories, the raw results are (optionally) post-processed to generate the mean and standard deviations of all the EE values for each parameter. EE_mean is an index of the magnitude of the sensitivity, and EE_sd is an index of the extent of interaction with other parameters.
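The calculation of EE_mean and EE_sd can be sketched with a toy criterion standing in for a model run. Everything here is illustrative: `toy_criterion()` is a made-up function, the parameter values are invented, and dividing the difference by the increment follows the usual Morris definition, which is an assumption about the scaling used by the package.

```r
# Toy stand-in for a model run returning one criterion value
# (hypothetical; a real run would be a StrathE2E simulation)
toy_criterion <- function(p) {
  -sum((p - c(1, 2, 3))^2) + p[[1]] * p[[2]]
}

set.seed(7)
parent <- c(p1 = 1, p2 = 2, p3 = 3)
level_set <- c(-0.10, -0.05, 0.05, 0.10)
n_traj <- 6

# One row per trajectory, one column per parameter
ee <- matrix(NA_real_, nrow = n_traj, ncol = length(parent),
             dimnames = list(NULL, names(parent)))

for (r in seq_len(n_traj)) {
  # Trajectory baseline: small Gaussian perturbation of the parent
  baseline <- parent * (1 + rnorm(length(parent), 0, 0.0075))
  base_val <- toy_criterion(baseline)
  delta <- sample(level_set, 1)           # same increment for the whole trajectory
  for (i in seq_along(parent)) {
    p <- baseline
    p[i] <- p[i] * (1 + delta)
    # Elementary Effect: criterion difference scaled by the increment (assumed)
    ee[r, i] <- (toy_criterion(p) - base_val) / delta
  }
}

# Summarise across trajectories and sort by |EE_mean|, largest first
summary_tab <- data.frame(parameter = colnames(ee),
                          EE_mean = colMeans(ee),
                          EE_sd = apply(ee, 2, sd),
                          row.names = NULL)
summary_tab <- summary_tab[order(abs(summary_tab$EE_mean), decreasing = TRUE), ]
print(summary_tab)
```

A parameter with a large EE_sd relative to its EE_mean has an effect that changes from one baseline to another, which is why EE_sd is read as an index of interaction with other parameters.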

During the run the function produces a real-time plot for each trajectory, in which the x-axis represents the sequence of parameters, and the y-axis is the likelihood of the target data. A horizontal red line indicates the likelihood of the parent parameter set, a horizontal grey line indicates the likelihood for each trajectory baseline, and each level-run likelihood is shown by a symbol. The y-axis range can be changed in real-time by editing the setup file "/Models/Modelname/Modelvariant/Param/control/sensitivity.csv".

The outputs from the function are saved as list objects and directed to csv files (provided that the argument csv.output=TRUE) in the "results" folder specified in an e2e_read() function call. The outputs are:

Table of parameter values applied in each run of the model (OAT_parameter_values-*.csv, where * = model.ident as defined by e2e_read())
Table of the criterion value and EE value for each trajectory/level run (OAT_results-*.csv)
If post-processing is selected, then a table of Mean EE and standard deviation of EE for each parameter, sorted by the absolute value of EE_mean (sorted_parameter_elemental_effects-*.csv)

As mentioned above, the default criterion for assessing the model sensitivity is the likelihood of the observed target data set on the state of the ecosystem given each set of model drivers and parameters. However, a function argument allows other criteria to be chosen as the basis for the analysis from the list of annually averaged or integrated variables saved in the output objects:

results$final.year.output$mass_results_wholedomain (whole-domain annual averages of state variables over the final year of a model run), and
results$final.year.output$annual_flux_results_wholedomain (whole-domain annual integrals of fluxes between state variables over the final year of a model run).

The criterion is chosen by setting a value for the argument outID. The default outID=0 selects the likelihood of the observed target data. Other values in the range 1 to 251 select annually averaged mass or annually integrated flux outputs from a list viewable by running the function e2e_get_senscrit().

WARNING - The e2e_run_sens() function will take several days to run to completion on a single processor with even a modest number of iterations. The total number of model runs required to support the analysis is r*(n+1) where r is the number of trajectories and n is the number of parameters. The function incorporates all of the physical configuration parameters, fixed and fitted ecology model parameters, the fishing fleet model parameters, and the environmental forcings into the analysis, so n = 450. Each model run needs to be sufficiently long to achieve a stationary state, and as a consequence a typical runtime will be around 10 hours per trajectory. The minimum recommended number of trajectories is 15, so a complete analysis on a single processor takes at least 6 days.
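The cost estimate above can be checked with a few lines of arithmetic. The values of n and the runtime per trajectory are those quoted in the warning; the number of trajectories is the n_traj default from the Usage section.

```r
n_params <- 450        # n, number of parameters as stated above
n_traj <- 16           # r, the n_traj default
hours_per_traj <- 10   # typical runtime per trajectory quoted above

# Each trajectory needs one baseline run plus one level run per parameter
total_runs <- n_traj * (n_params + 1)
total_days <- n_traj * hours_per_traj / 24

cat("Model runs:", total_runs, "\n")                                # 7216
cat("Approx. days on one processor:", round(total_days, 1), "\n")   # 6.7
```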

However, it is possible to spread the load over multiple processors/machines, and the function has arguments (coldstart and postprocess) for managing this parallelization. Afterwards, the raw results files are combined into a single data set using the e2e_merge_sens_mc() function, and then processed using the function e2e_process_sens_mc().
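One way to plan such a split is to tabulate the argument settings for each worker before launching anything. The sketch below only builds that table and does not call the package. The number of workers and the even split of trajectories are hypothetical; the coldstart and postprocess settings follow the argument descriptions above, and csv.output=TRUE is assumed to be needed so that raw result files exist to be merged.

```r
# Hypothetical split of 16 trajectories across 4 workers
n_workers <- 4
n_traj_total <- 16

plan <- data.frame(
  worker = seq_len(n_workers),
  n_traj = rep(n_traj_total / n_workers, n_workers),
  # Only one run starts from the parent configuration
  coldstart = c(TRUE, rep(FALSE, n_workers - 1)),
  # Keep raw results so they can be merged before processing
  postprocess = FALSE,
  # Assumed: raw results must be written to file to be merged later
  csv.output = TRUE
)
print(plan)
```

Each row gives the values to pass to one e2e_run_sens() call; the merged results still cover all 16 trajectories, but the elapsed time falls roughly in proportion to the number of workers.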

A separate function e2e_plot_sens_mc() produces a graphical representation of the EE_mean and EE_sd results.

References

For details on the sensitivity analysis method see: Morris, M.D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33, 161-174.

For a review of sensitivity analysis methods including the Morris Method see: Wu, J. et al. (2013). Sensitivity analysis of infectious disease models: methods, advances and their application. J R Soc Interface 10: 20121018, 14pp.