e2e_run_sens {StrathE2E2} | R Documentation |
Performs a one-at-a-time parameter sensitivity analysis on the StrathE2E model using the Morris Method for factorial sampling of the physical configuration parameters, ecology model parameters, the fishing fleet model parameters, and the environmental forcings.
e2e_run_sens( model, nyears = 50, n_traj = 16, trajsd = 0.0075, n_setoflevels = 4, v_setoflevels = 0.1, coldstart = TRUE, quiet = TRUE, postprocess = TRUE, csv.output = TRUE, runtime.plot = TRUE )
model |
R-list object generated by the e2e_read() function which defines the parent model configuration. |
nyears |
Number of years to run the model in each iteration (default=50). |
n_traj |
Number of trajectories of parameter sets (default=16). |
trajsd |
Coefficient of variation used to set the standard deviation of the gaussian distribution from which new parameter values are drawn to create each trajectory baseline from the initial parent values (default=0.0075). |
n_setoflevels |
Number of fixed levels of coefficient of variation used to generate the individual parameter values in each level-run. Must be an even number (default=4). |
v_setoflevels |
Maximum coefficient of variation for the set of levels (default=0.1, i.e. -10 percent to +10 percent). |
coldstart |
Logical. If TRUE then the run is starting from cold, which means that the first trajectory baseline is the parent configuration as specified in the 'model' list object. If FALSE, this signifies a parallel run which will later be merged with the 'coldstart=TRUE' run. In this case the first trajectory baseline is a derivative of the parent. Default=TRUE. |
quiet |
Logical. If TRUE then suppress informational messages at the start of each iteration (default=TRUE). |
postprocess |
Logical. If TRUE then process the results through to a final sorted list of parameter sensitivities for plotting. If FALSE just produce the raw results. The reason for NOT processing would be if the job has been shared across multiple machines/processors and several raw result files need to be merged before processing. Default=TRUE. |
csv.output |
Logical. If FALSE then disable writing of csv output files - useful for testing (default=TRUE). |
runtime.plot |
Logical. If FALSE then disable runtime plotting of the progress of the run - useful for testing (default=TRUE). |
The basis for the method is a scheme for sampling the model parameters, one at a time, from distributions around baseline sets, and testing the effects on the performance of the model against some criterion. In this case, the criterion is the likelihood of the observational data on the state of the ecosystem that are used as the target for parameter optimization in the various simulated annealing functions supplied with the package.
The process requires an initial set of parameters for the model. We refer to this as the 'parent' parameter set. It is recommended that this should be the parameters producing the maximum likelihood of the observational target data (as estimated by e.g. the e2e_optimize_eco() function). The MODEL_SETUP.csv file in the folder /Models/Modelname/Modelvariant/ should be configured to point to the relevant files, and then loaded with the e2e_read() function.
From this parent set, a series of 'child' parameter sets are generated by applying a separate random increment to each parameter drawn from a gaussian distribution of mean 0 and standard deviation given by a fixed coefficient of variation applied to the parent-set value of each parameter.
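The sampling step described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the parameter names and values are hypothetical, and `make_child` is a helper invented for this sketch. The value 0.0075 is the documented default of `trajsd`.

```r
# Hypothetical parent parameter values (illustrative only, not real StrathE2E parameters)
parent <- c(uptake_rate = 2.5, mortality = 0.04, half_sat = 12)

# Coefficient of variation, as supplied through the trajsd argument
trajsd <- 0.0075

# Draw one child set: each parameter receives its own gaussian increment with
# mean 0 and standard deviation = cv * |parent value|
make_child <- function(parent, cv) {
  parent + rnorm(length(parent), mean = 0, sd = cv * abs(parent))
}

set.seed(1)
child <- make_child(parent, trajsd)
print(round(child / parent, 4))  # ratios close to 1
```

With a coefficient of variation this small, each child value stays within a few percent of its parent value, so the trajectory baselines sample the neighbourhood of the parent set rather than the whole parameter space.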
For each of the child-sets of parameters, the likelihood of the observed target data is calculated following runs of StrathE2E to stationary state. We refer to these as trajectory baselines.
Then, for each trajectory, the parameters are varied in turn, one at a time, by adding a fixed proportionality increment to the trajectory baseline values, the model is re-run, and the likelihood is computed. We refer to these as 'level runs'. The proportionality increment is the same for all of the level runs within a given trajectory, and is drawn at random from a set of fixed levels distributed symmetrically around 0 (e.g. -10, -5, +5, +10 percent, i.e. proportions of the trajectory baseline values = 0.90, 0.95, 1.05, 1.10).
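As a rough sketch of how such a set of levels and the one-at-a-time perturbation could be built in base R: the equal spacing of the levels is an assumption inferred from the -10, -5, +5, +10 percent example, and `make_levels` and the baseline values are hypothetical names introduced here, not package functions.

```r
# Build a symmetric set of fixed levels around zero (equal spacing is assumed)
make_levels <- function(n_setoflevels = 4, v_setoflevels = 0.1) {
  stopifnot(n_setoflevels %% 2 == 0)          # the number of levels must be even
  half <- n_setoflevels / 2
  pos <- v_setoflevels * seq_len(half) / half # positive half of the set
  c(-rev(pos), pos)
}

level_set <- make_levels(4, 0.1)
print(level_set)

# One increment is drawn at random and used for the whole trajectory
set.seed(42)
increment <- sample(level_set, 1)

# Hypothetical trajectory baseline; each level run changes exactly one parameter
baseline <- c(a = 2.0, b = 0.5, c = 10)
level_sets <- lapply(seq_along(baseline), function(i) {
  p <- baseline
  p[i] <- p[i] * (1 + increment)
  p
})
```

Each element of `level_sets` is a full parameter vector that differs from the baseline in a single position, which is what makes the design one-at-a-time.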
For each level run, the 'Elementary Effect (EE)' of the given parameter is calculated from the difference between the level run likelihood and the corresponding trajectory baseline likelihood.
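A toy illustration of this calculation follows. The function `toy_likelihood` is a hypothetical stand-in for running the model and scoring it against the target data, and the sketch takes the EE to be the plain difference in likelihood. Morris (1991) scales the difference by the step size; whether and how the package scales it is not stated here, so treat that detail as an open assumption.

```r
# Hypothetical stand-in for "run the model to stationary state and compute the likelihood"
toy_likelihood <- function(p) {
  exp(-sum((p - c(2, 0.5, 10))^2 / c(4, 0.25, 100)))
}

baseline  <- c(a = 2.1, b = 0.48, c = 10.5)  # hypothetical trajectory baseline
increment <- 0.05                            # one level, fixed within the trajectory

base_lik <- toy_likelihood(baseline)

# Elementary effect per parameter: level-run likelihood minus baseline likelihood
ee <- sapply(seq_along(baseline), function(i) {
  p <- baseline
  p[i] <- p[i] * (1 + increment)
  toy_likelihood(p) - base_lik
})
names(ee) <- names(baseline)
print(ee)
```

In this toy case the perturbation moves `a` away from its optimum and `b` towards it, so the two effects have opposite signs.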
On completion of all the trajectories, the raw results are (optionally) post-processed to generate the mean and standard deviations of all the EE values for each parameter. EE_mean is an index of the magnitude of the sensitivity, and EE_sd is an index of the extent of interaction with other parameters.
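The post-processing step amounts to a grouped mean and standard deviation. A minimal base-R sketch on made-up data is shown here; the column names and EE values are hypothetical, and sorting in descending order of the absolute mean is an assumption.

```r
# Hypothetical raw results: one EE value per parameter per trajectory
raw <- data.frame(
  trajectory = rep(1:3, each = 3),
  parameter  = rep(c("a", "b", "c"), times = 3),
  EE         = c(0.10, -0.50, 0.02,
                 0.12, -0.10, 0.01,
                 0.08, -0.90, 0.03)
)

# Mean and standard deviation of EE for each parameter across trajectories
EE_mean <- tapply(raw$EE, raw$parameter, mean)
EE_sd   <- tapply(raw$EE, raw$parameter, sd)

summary_tab <- data.frame(
  parameter = names(EE_mean),
  EE_mean   = as.numeric(EE_mean),
  EE_sd     = as.numeric(EE_sd[names(EE_mean)])
)

# Sort by the absolute value of EE_mean, largest first (ordering direction assumed)
summary_tab <- summary_tab[order(-abs(summary_tab$EE_mean)), ]
print(summary_tab)
```

Parameter `b` has both the largest mean effect and the largest spread in this example, which under the interpretation above would flag it as sensitive and as interacting with other parameters.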
During the run the function produces a real-time plot for each trajectory, in which the x-axis represents the sequence of parameters, and the y-axis is the likelihood of the target data. A horizontal red line indicates the likelihood of the parent parameter set, a horizontal grey line indicates the likelihood for each trajectory baseline, and each level-run likelihood is shown by a symbol. The y-axis range can be changed in real-time by editing the setup file "/Models/Modelname/Modelvariant/Param/control/sensitivity.csv".
The outputs from the function are directed to csv files in the current "results" folder. The outputs are:
a) Table of parameter values applied in each run of the model (OAT_parameter_values-*.csv, where * = model.ident as defined by e2e_read())
b) Table of the likelihood and EE value for each trajectory/level run (OAT_results-*.csv)
c) If post-processing is selected, a table of mean EE and standard deviation of EE for each parameter, sorted by the absolute value of EE_mean (sorted_parameter_elemental_effects-*.csv)
WARNING - This function will take several days to run to completion on a single processor with even a modest number of iterations. The total number of model runs required to support the analysis is r*(n+1), where r is the number of trajectories and n is the number of parameters. The function incorporates all of the physical configuration parameters, fixed and fitted ecology model parameters, the fishing fleet model parameters, and the environmental forcings into the analysis, so n = 453. Each model run needs to be sufficiently long to achieve a stationary state, and as a consequence a typical runtime is around 10 h per trajectory. The minimum recommended number of trajectories is 15, so the function can take several days to complete.
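The arithmetic behind this warning can be checked with a few lines of R. Both helper functions are hypothetical, the parameter count in the first call is an arbitrary illustrative number, and the time estimate simply assumes the roughly 10 hours per trajectory quoted above with trajectories shared evenly between workers.

```r
# Total model runs for r trajectories and n parameters: r * (n + 1)
total_runs <- function(n_traj, n_params) {
  n_traj * (n_params + 1)
}

# Rough wall-clock estimate in days, assuming ~10 h per trajectory and
# an even split of trajectories across workers
est_days <- function(n_traj, n_workers = 1, hours_per_traj = 10) {
  ceiling(n_traj / n_workers) * hours_per_traj / 24
}

print(total_runs(n_traj = 16, n_params = 100))  # illustrative parameter count
print(est_days(15))                             # single processor
print(est_days(16, n_workers = 2))              # two parallel batches of 8
```

The estimate makes the case for splitting the work: halving the trajectories per machine roughly halves the elapsed time, at the cost of a merge step afterwards.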
However, it is possible to spread the load over multiple processors/machines, with arguments in the function allowing for management of this parallelization. Afterwards, the raw results files are combined into a single data set using the e2e_merge_sens_mc() function, and then processed using the function e2e_process_sens_mc().
A separate function e2e_plot_sens_mc() produces a graphical representation of the EE_mean and EE_sd results.
The return value depends on the settings of the arguments 'postprocess' and 'csv.output': If postprocess=TRUE and csv.output=TRUE then the outputs are csv files of raw parameter vectors, likelihoods and Elementary Effects for each run, and a parameter list sorted by EE_mean, plus the function returns the data of sorted parameters. If postprocess=FALSE and csv.output=FALSE then the function simply returns a dataframe of likelihoods and Elementary Effects for each run.
Morris, M.D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33, 161-174.
e2e_read, e2e_merge_sens_mc, e2e_process_sens_mc, e2e_plot_sens_mc
## Not run:
# Load the 1970-1999 version of the North Sea model supplied with the package:
model <- e2e_read("North_Sea", "1970-1999")

# Run the sensitivity analysis process (a quick demonstration):
# WARNING - Running a full sensitivity analysis takes days of computer time on a single
# machine/processor because it involves a huge number of model runs.
# The example below is just a (relatively) quick minimalist demonstration and should NOT
# be taken as the basis for any analysis or conclusions.
# Even so, this minimalist demonstration run could take 45 min to complete because it
# involves 1359 model runs.
sens_results <- e2e_run_sens(model, nyears=1, n_traj=3, csv.output=FALSE)
head(sens_results)

# A more realistic sensitivity analysis would be something like:
sens_results <- e2e_run_sens(model, nyears=50, n_traj=16, postprocess=TRUE)
# DO NOT launch this configuration unless you are prepared to wait many days for the results

# Example of parallelizing the process:
# Launch two (or more) runs separately on different processors...
# Launch batch 1 (on processor 1):
model1 <- e2e_read("North_Sea", "1970-1999", model.ident="BATCH1")
sens_results1 <- e2e_run_sens(model1, nyears=50, n_traj=10, coldstart=TRUE, postprocess=FALSE)
# Note that coldstart=TRUE for the first batch only.
# Launch batch 2 (on processor 2):
model2 <- e2e_read("North_Sea", "1970-1999", model.ident="BATCH2")
sens_results2 <- e2e_run_sens(model2, nyears=50, n_traj=10, coldstart=FALSE, postprocess=FALSE)
# Note that these two runs return only raw data since postprocess=FALSE

# Then, afterwards, merge the two raw results files with text-tags BATCH1 and BATCH2,
# and post process the combined file:
model3 <- e2e_read("North_Sea", "1970-1999", model.ident="COMBINED")
processed_data <- e2e_merge_sens_mc(model3, selection="SENS",
                                    ident.list=c("BATCH1","BATCH2"),
                                    postprocess=TRUE, csv.output=TRUE)
# or...
combined_data <- e2e_merge_sens_mc(model3, selection="SENS",
                                   ident.list=c("BATCH1","BATCH2"),
                                   postprocess=FALSE, csv.output=TRUE)
processed_data <- e2e_process_sens_mc(model3, selection="SENS",
                                      use.example=FALSE, csv.output=TRUE)

# Plot a diagram of parameter sensitivities from the combined data
e2e_plot_sens_mc(model3, selection="SENS", use.example=FALSE)
## End(Not run)