Settings#

settings activitysim.core.configuration.Settings#

The overall settings for the ActivitySim model system.

The input for these settings is typically stored in one main YAML file, usually called settings.yaml.

Note that this implementation is currently used only to generate documentation. Future work may migrate the settings implementation to use this pydantic code to validate the settings before running the model.

Fields
  • benchmarking (bool)

  • check_for_variability (bool)

  • checkpoints (Union[bool, list])

  • chunk_method (str)

  • chunk_size (int)

  • chunk_training_mode (str)

  • cleanup_pipeline_after_run (bool)

  • cleanup_trace_files_on_resume (bool)

  • create_input_store (bool)

  • disable_destination_sampling (bool)

  • disable_zarr (bool)

  • fail_fast (bool)

  • households_sample_size (int)

  • input_store (str)

  • input_table_list (list[activitysim.core.configuration.top.InputTable])

  • instrument (bool)

  • keep_mem_logs (bool)

  • log_alt_losers (bool)

  • memory_profile (bool)

  • models (list[str])

  • multiprocess (bool)

  • multiprocess_steps (list[activitysim.core.configuration.top.MultiprocessStep])

  • num_processes (int)

  • offset_preprocessing (bool)

  • output_tables (activitysim.core.configuration.top.OutputTables)

  • recode_pipeline_columns (bool)

  • resume_after (str)

  • rotate_logs (bool)

  • sharrow (Union[bool, str])

  • testing_fail_trip_destination (bool)

  • trace_hh_id (Union[int, list])

  • trace_od (list[int])

  • use_shadow_pricing (bool)

  • want_dest_choice_presampling (bool)

  • want_dest_choice_sample_tables (bool)

  • write_raw_tables (bool)

field benchmarking: bool = False#

Flag this model run as a benchmarking run.

New in version 1.1.

This is generally a developer-only feature and not needed for regular usage of ActivitySim.

By flagging a model run as a benchmark, certain operations of the model are altered, to ensure valid benchmark readings. For example, in regular operation, data such as skims are loaded on-demand within the first model component that needs them. With benchmarking enabled, all data are always pre-loaded before any component is run, to ensure that recorded times are the runtime of the component itself, and not data I/O operations that are neither integral to that component nor necessarily stable over replication.

field check_for_variability: bool = False#

Debugging feature to find broken model specifications.

Enabling this check does not alter valid results but slows down model runs.

field checkpoints: Union[bool, list] = True#

When to write checkpoints (intermediate table states) to disk.

If True, checkpoints are written at each step. If False, no intermediate checkpoints are written before the end of the run. Alternatively, provide an explicit list of models to checkpoint.
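
The three forms of this setting can be sketched as a small decision function. This is only an illustration: the helper name should_checkpoint is hypothetical, the step names are placeholders, and the assumption that the final state is always written at the end of the run is inferred from the description above rather than taken from ActivitySim's code.

```python
from typing import List, Union


def should_checkpoint(
    checkpoints: Union[bool, List[str]], step_name: str, is_last_step: bool
) -> bool:
    """Decide whether to write a checkpoint after `step_name` (hypothetical helper)."""
    if is_last_step:
        # Assumption: the end-of-run state is written whatever the setting says.
        return True
    if isinstance(checkpoints, bool):
        # True -> checkpoint every step, False -> no intermediate checkpoints.
        return checkpoints
    # Otherwise the setting is an explicit list of model names.
    return step_name in checkpoints


# Placeholder step names, used only for this demonstration.
steps = ["initialize_landuse", "auto_ownership_simulate", "write_tables"]
for setting in (True, False, ["auto_ownership_simulate"]):
    written = [
        name
        for i, name in enumerate(steps)
        if should_checkpoint(setting, name, is_last_step=(i == len(steps) - 1))
    ]
    print(setting, "->", written)
```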

field chunk_method: str = None#

Memory use measure to use for chunking.

See Chunk.

field chunk_size: int = None#

Approximate amount of RAM to allocate to ActivitySim for batch processing.

See Chunk for more details.

field chunk_training_mode: str = None#

The method to use for chunk training.

Valid values include {disabled, training, production, adaptive}. See Chunk for more details.

field cleanup_pipeline_after_run: bool = False#

Clean up the pipeline after a successful run.

The pipeline is cleaned up only after successful runs, by creating a single-checkpoint pipeline file and deleting any subprocess pipelines.

field cleanup_trace_files_on_resume: bool = False#

Clean all trace files when restarting a model from a checkpoint.

field create_input_store: bool = False#

Write the inputs, as read in, back to an HDF5 store.

If enabled, this writes the store to the outputs folder to use for subsequent model runs, as reading HDF5 can be faster than reading CSV files.

field disable_destination_sampling: bool = False#

field disable_zarr: bool = False#

Disable the use of zarr format skims.

New in version 1.2.

By default, if sharrow is enabled (any setting other than false), ActivitySim currently loads data from zarr format skims if a zarr location is provided, and data is found there. If no data is found there, then original OMX skim data is loaded, any transformations or encodings are applied, and then this data is written out to a zarr file at that location. Setting this option to True will disable the use of zarr.
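
The loading rules described above can be read as a short decision tree. The sketch below is not ActivitySim's implementation: the function skim_loading_plan and its parameters are hypothetical, and the choice to load OMX skims directly whenever sharrow is off or no zarr location is given is an assumption.

```python
from typing import Optional, Union


def skim_loading_plan(
    sharrow: Union[bool, str],
    disable_zarr: bool,
    zarr_location: Optional[str],
    zarr_has_data: bool,
) -> str:
    """Return a label describing where skims would come from (hypothetical helper)."""
    # "Any setting other than false" counts as sharrow being enabled.
    sharrow_enabled = sharrow not in (False, "false")
    if not sharrow_enabled or disable_zarr or zarr_location is None:
        # Assumption: without zarr in play, the original OMX skims are used.
        return "load OMX"
    if zarr_has_data:
        return "load zarr"
    # Nothing cached yet: read OMX, apply encodings, then populate the zarr store.
    return "load OMX, encode, write zarr"


print(skim_loading_plan("require", False, "skims.zarr", True))
print(skim_loading_plan(True, False, "skims.zarr", False))
print(skim_loading_plan(True, True, "skims.zarr", True))
```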

field fail_fast: bool = False#

field households_sample_size: int = None#

Number of households to sample and simulate.

If omitted or set to 0, ActivitySim will simulate all households.
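
As a rough illustration of the "omitted or 0 means everything" rule, the following sketch samples household ids with the standard library. The helper sample_households, its seed parameter, and the behavior when the requested size exceeds the number of households are assumptions made for the example; ActivitySim's own sampling procedure may differ.

```python
import random
from typing import List, Optional


def sample_households(
    household_ids: List[int], sample_size: Optional[int], seed: int = 0
) -> List[int]:
    """Pick households to simulate (hypothetical helper, not ActivitySim code)."""
    # None or 0 means "simulate all households".
    # Assumption: a request larger than the population also returns everyone.
    if not sample_size or sample_size >= len(household_ids):
        return list(household_ids)
    rng = random.Random(seed)  # fixed seed so repeated runs pick the same sample
    return sorted(rng.sample(household_ids, sample_size))


all_ids = list(range(1, 101))
print(len(sample_households(all_ids, None)))
print(len(sample_households(all_ids, 0)))
print(len(sample_households(all_ids, 10)))
```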

field input_store: str = None#

HDF5 inputs file.

field input_table_list: list[activitysim.core.configuration.top.InputTable] [Required]#

List of table names, indices, and column re-maps for each table in input_store.

field instrument: bool = False#

Use pyinstrument to profile component performance.

New in version 1.2.

This is generally a developer-only feature and not needed for regular usage of ActivitySim.

Use this setting to enable statistical profiling of ActivitySim code with the pyinstrument library (an optional dependency, which must also be installed). A separate profiling session is triggered for each model component. See the pyinstrument documentation for a description of how this tool works.

When activated, a “profiling–*” directory is created in the output directory of the model, tagged with the date and time of the profiling run. Profile output is always tagged like this and never overwrites previous profiling outputs, facilitating serial comparisons of runtimes in response to code or configuration changes.
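
One way to get output directories that never overwrite each other is to put a timestamp in the name and refuse to reuse an existing directory. The sketch below shows that idea only; the helper make_profiling_dir, the "profiling--" prefix, and the exact timestamp format are assumptions, not ActivitySim's actual naming code.

```python
import tempfile
import time
from pathlib import Path


def make_profiling_dir(output_dir: Path) -> Path:
    """Create a fresh, timestamped profiling directory (hypothetical helper)."""
    # Assumed tag format: date and time down to the second.
    tag = time.strftime("%Y-%m-%d--%H-%M-%S")
    path = output_dir / f"profiling--{tag}"
    # exist_ok=False raises instead of reusing (and overwriting) an earlier run.
    path.mkdir(parents=True, exist_ok=False)
    return path


with tempfile.TemporaryDirectory() as tmp:
    created = make_profiling_dir(Path(tmp))
    print(created.name)
```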

field keep_mem_logs: bool = False#

field log_alt_losers: bool = False#

Write out expressions when all alternatives are unavailable.

This can be useful for model development to catch errors in specifications. Enabling this check does not alter valid results but slows down model runs.

field memory_profile: bool = False#

Generate a memory profile by sampling memory usage from a secondary process.

New in version 1.2.

This is generally a developer-only feature and not needed for regular usage of ActivitySim.

Using this feature will open a secondary process, whose only job is to poll memory usage for the main ActivitySim process. The usage is logged to a file with time stamps, so it can be cross-referenced against ActivitySim logs to identify what parts of the code are using RAM. The profiling is done from a separate process to prevent the profiler itself from significantly slowing the main model core or, more importantly, generating memory usage of its own that pollutes the collected data.

field models: list[str] [Required]#

List of model steps to run, such as auto ownership and tour frequency.

See Pipeline for more details about each step.

field multiprocess: bool = False#

Enable multiprocessing for this model.

field multiprocess_steps: list[activitysim.core.configuration.top.MultiprocessStep] [Required]#

A list of multiprocess steps.

field num_processes: int = None#

If running in multiprocessing mode, use this number of processes by default.

If not given or set to 0, the number of processes to use is set to half the number of available CPU cores, plus 1.
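
The default described above is simple arithmetic, sketched here with the standard library. The helper resolve_num_processes is hypothetical, and rounding "half" down with integer division is an assumption about how odd core counts are handled.

```python
import os
from typing import Optional


def resolve_num_processes(
    configured: Optional[int], cpu_count: Optional[int] = None
) -> int:
    """Pick the number of worker processes (hypothetical helper)."""
    if configured:
        # An explicit, non-zero setting wins.
        return configured
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cpu_count = os.cpu_count() or 1
    # Half the available cores (rounded down, by assumption), plus 1.
    return cpu_count // 2 + 1


print(resolve_num_processes(None, cpu_count=8))
print(resolve_num_processes(0, cpu_count=8))
print(resolve_num_processes(4, cpu_count=8))
```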

field offset_preprocessing: bool = False#

Flag to indicate whether offset preprocessing has already been done.

New in version 1.2.

This flag is generally set automatically within ActivitySim during a run, and not by a user ahead of time. The ability to set it manually is provided as a developer-only feature for testing and development.

field output_tables: activitysim.core.configuration.top.OutputTables = None#

List of output tables to write to CSV or HDF5.

field recode_pipeline_columns: bool = True#

Apply recoding instructions on input and final output for pipeline tables.

New in version 1.2.

Recoding instructions can be provided in individual InputTable.recode_columns and OutputTable.decode_columns settings. This global setting permits disabling all recoding processes simultaneously.

Warning

Disabling recoding is fine in legacy mode but it is generally not compatible with using Settings.sharrow.
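
A configuration check for this combination might look like the sketch below. The function check_recode_settings is hypothetical and is not part of ActivitySim; it only encodes the warning above, treating any sharrow value other than false as "sharrow in use".

```python
import warnings
from typing import Union


def check_recode_settings(
    recode_pipeline_columns: bool, sharrow: Union[bool, str]
) -> bool:
    """Return True if the two settings are compatible (hypothetical helper)."""
    sharrow_enabled = sharrow not in (False, "false")
    compatible = recode_pipeline_columns or not sharrow_enabled
    if not compatible:
        # Mirror the documentation's warning rather than failing outright.
        warnings.warn(
            "recode_pipeline_columns is disabled while sharrow is enabled; "
            "this combination is generally not compatible."
        )
    return compatible


print(check_recode_settings(True, "require"))
print(check_recode_settings(False, False))
```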

field resume_after: str = None#

Resume running the data pipeline after the last successful checkpoint.

field rotate_logs: bool = False#

field sharrow: Union[bool, str] = False#

Set the sharrow operating mode.

New in version 1.2.

  • false - Do not use sharrow. This is the default if no value is given.

  • true - Use sharrow optimizations when possible, but fall back to legacy pandas.eval systems when any error is encountered. This is the preferred mode for running with sharrow if reliability is more important than performance.

  • require - Use sharrow optimizations, and raise an error if they fail unexpectedly. This is the preferred mode for running with sharrow if performance is a concern.

  • test - Run every relevant calculation using both sharrow and legacy systems, and compare them to ensure the results match. This is the slowest mode of operation, but useful for development and debugging.
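
The four modes differ mainly in what happens when the optimized path fails. The sketch below mimics that control flow with two plain callables standing in for the sharrow and legacy code paths; run_with_mode and both callables are hypothetical and do not use the sharrow library itself.

```python
from typing import Callable, Union


def run_with_mode(
    mode: Union[bool, str],
    fast: Callable[[], float],
    legacy: Callable[[], float],
) -> float:
    """Dispatch one calculation according to the sharrow mode (hypothetical helper)."""
    if mode is False or mode == "false":
        return legacy()
    if mode is True or mode == "true":
        try:
            return fast()
        except Exception:
            # Reliability first: any error falls back to the legacy path.
            return legacy()
    if mode == "require":
        # Performance first: errors from the fast path propagate to the caller.
        return fast()
    if mode == "test":
        fast_result, legacy_result = fast(), legacy()
        if fast_result != legacy_result:
            raise AssertionError("sharrow and legacy results differ")
        return fast_result
    raise ValueError(f"unknown sharrow mode: {mode!r}")


def broken_fast() -> float:
    raise RuntimeError("optimized path failed")


print(run_with_mode(True, broken_fast, lambda: 1.0))
print(run_with_mode("test", lambda: 2.0, lambda: 2.0))
```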

field testing_fail_trip_destination: bool = False#

field trace_hh_id: Union[int, list] = None#

Trace household id(s).

If omitted, no tracing is written out.
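
Because the setting accepts either a single id or a list, code that consumes it usually normalizes the value first. The helper below, normalize_trace_hh_ids, is a hypothetical illustration of that step and is not taken from ActivitySim.

```python
from typing import List, Optional, Union


def normalize_trace_hh_ids(trace_hh_id: Optional[Union[int, List[int]]]) -> List[int]:
    """Turn the setting into a list of household ids (hypothetical helper)."""
    if trace_hh_id is None:
        # Omitted setting: nothing is traced.
        return []
    if isinstance(trace_hh_id, int):
        return [trace_hh_id]
    return list(trace_hh_id)


print(normalize_trace_hh_ids(None))
print(normalize_trace_hh_ids(982875))
print(normalize_trace_hh_ids([982875, 1024353]))
```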

field trace_od: list[int] = None#

Trace an origin-destination pair in the accessibility calculation.

If omitted, no tracing is written out.

field use_shadow_pricing: bool = False#

Turn shadow pricing on or off for work and school location.

field want_dest_choice_presampling: bool = False#

field want_dest_choice_sample_tables: bool = False#

Turn writing of sample_tables on or off for all models.

field write_raw_tables: bool = False#

Dump input tables back to disk immediately after loading them.

This is generally a developer-only feature and not needed for regular usage of ActivitySim.

The data tables are written out before any annotation steps, but after initial processing (renaming, filtering columns, recoding).