Disaggregate Accessibility#

The disaggregate accessibility model is an extension of the base accessibility model. While the base accessibility model is based on a mode-specific decay function and uses fixed market segments in the population (i.e., income), the disaggregate accessibility model extracts the actual destination choice logsums by purpose (i.e., mandatory fixed school/work location and non-mandatory tour destinations by purpose) from the actual model calculations using a user-defined proto-population. This enables users to include features that may be more critical to destination choice than just income (e.g., automobile ownership).

Structure#

Inputs

  • disaggregate_accessibility.yaml - Configuration settings for disaggregate accessibility model.

  • annotate.csv [optional] - Users can specify additional annotations specific to disaggregate accessibility. For example, annotating the proto-population tables.

Outputs

  • final_disaggregate_accessibility.csv [optional]

  • final_non_mandatory_tour_destination_accessibility.csv [optional]

  • final_workplace_location_accessibility.csv [optional]

  • final_school_location_accessibility.csv [optional]

  • final_proto_persons.csv [optional]

  • final_proto_households.csv [optional]

  • final_proto_tours.csv [optional]

The above tables are created in the model pipeline, but the model will not save any outputs unless they are listed under output_tables in settings.yaml. Users can return the proto-population tables for inspection, as well as the raw logsum accessibilities for mandatory school/work and non-mandatory destinations. The logsums are then merged at the household level in final_disaggregate_accessibility.csv, with each tour purpose's logsums shown as separate columns.

Usage

The disaggregate accessibility model is run as a model step in the model list. There are two necessary steps:

  • initialize_proto_population

  • compute_disaggregate_accessibility

The steps must be separate to enable multiprocessing. The proto-population must be fully generated and initialized before ActivitySim slices the tables across separate processes. These steps must also occur before initialize_households in order to avoid conflict with the shadow_pricing model.
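The ordering constraint can be checked mechanically. The following is a minimal sketch, not part of ActivitySim: `steps_in_required_order` and `example_models` are hypothetical names, and `some_later_step` is a placeholder for whatever else a model list contains.

```python
# Hypothetical helper (not an ActivitySim function): verifies that the two
# disaggregate accessibility steps appear before initialize_households.
REQUIRED_ORDER = [
    "initialize_proto_population",
    "compute_disaggregate_accessibility",
    "initialize_households",
]


def steps_in_required_order(models):
    """Return True if all three steps are present and correctly ordered."""
    try:
        positions = [models.index(step) for step in REQUIRED_ORDER]
    except ValueError:
        return False  # at least one required step is missing
    # The step names are distinct, so sorted positions means strictly increasing.
    return positions == sorted(positions)


# Illustrative model list; "some_later_step" is a placeholder name.
example_models = [
    "initialize_proto_population",
    "compute_disaggregate_accessibility",
    "initialize_households",
    "some_later_step",
]
print(steps_in_required_order(example_models))
```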

The model steps can be run either as part of the ActivitySim model run, or set up as a standalone run to pre-compute the accessibility values. For standalone implementations, final_disaggregate_accessibility.csv is read into the pipeline and initialized with the initialize_households model step.

  • Configuration File: disaggregate_accessibility.yaml

  • Core Table: Users define the variables to be generated for ‘PROTO_HOUSEHOLDS’, ‘PROTO_PERSONS’, and ‘PROTO_TOURS’ tables. These tables must include all basic fields necessary for running the actual model. Additional fields can be annotated in pre-processing using the annotation settings of this file.
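To illustrate what a proto-population might look like, the sketch below builds one record per combination of user-listed attribute values. This is an assumption about how the tables are populated (the text above only says that users define the variables); the variable names, values, and the `expand_variables` helper are all hypothetical.

```python
from itertools import product

# Hypothetical variable definitions; names and values are illustrative only.
proto_household_variables = {
    "income": [15000, 60000, 150000],
    "auto_ownership": [0, 1, 2],
}


def expand_variables(variables):
    """Build one record for every combination of the listed values."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


proto_households = expand_variables(proto_household_variables)
print(len(proto_households))  # 3 income levels x 3 auto ownership levels
```

Under this assumption the table size grows multiplicatively with each added variable, which is one reason to keep the set of proto-population attributes small.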

Configuration#

settings activitysim.abm.models.disaggregate_accessibility.DisaggregateAccessibilitySettings#

Bases: PydanticReadable

Config:
  • extra: str = forbid

Fields:
field BASE_RANDOM_SEED: int = 0#
field CREATE_TABLES: dict[str, DisaggregateAccessibilityTableSettings | str] = {}#
field DESTINATION_SAMPLE_SIZE: float | int = 0#

Number of destination zone alternatives sampled for calculating the destination logsum.

Setting this to zero implies sampling all zones.

Decimal values < 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.
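The three cases above (zero, fraction, count) could be resolved as in the following sketch. `resolve_sample_size` is a hypothetical helper, not an ActivitySim function, and the rounding and capping behavior are assumptions.

```python
import math


def resolve_sample_size(setting, n_zones):
    """Translate a sample-size setting into a number of zones to sample.

    0          -> all zones
    0 < x < 1  -> that fraction of the zones
    x >= 1     -> that many zones (capped at n_zones, an assumption)
    """
    if setting < 0:
        raise ValueError("sample size must be non-negative")
    if setting == 0:
        return n_zones
    if setting < 1:
        # Rounding up is an assumption; the source does not specify it.
        return max(1, math.ceil(setting * n_zones))
    return min(int(setting), n_zones)


print(resolve_sample_size(0.5, 200))
```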

field FROM_TEMPLATES: bool = False#
field KEEP_COLS: list[str] | None = None#

The disaggregate accessibility table is grouped by the “by” columns (see MERGE_ON) and the KEEP_COLS are averaged across each group. If columns that do not apply to a given auto ownership level are initialized as NA, they are skipped in the groupby mean and the resulting values are correct. (This avoids having to update the code to reshape the table and introduce new functionality there.) If None, all columns with “accessibility” in the name are kept.
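The effect of the NA convention can be seen in a small pure-Python sketch of a grouped mean that skips missing values. The column names and the `grouped_mean` helper are hypothetical, and the real implementation may differ.

```python
from collections import defaultdict


def grouped_mean(rows, by, keep_cols):
    """Average keep_cols within each group of `by` values, skipping None."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in by)].append(row)
    result = {}
    for key, members in groups.items():
        means = {}
        for col in keep_cols:
            values = [m[col] for m in members if m.get(col) is not None]
            means[col] = sum(values) / len(values) if values else None
        result[key] = means
    return result


# Hypothetical rows: each accessibility column applies to one auto ownership
# level only, and is left as None (NA) on the other rows.
rows = [
    {"income_segment": 1, "workplace_accessibility_0": 8.0, "workplace_accessibility_1": None},
    {"income_segment": 1, "workplace_accessibility_0": None, "workplace_accessibility_1": 12.0},
]
means = grouped_mean(
    rows,
    by=["income_segment"],
    keep_cols=["workplace_accessibility_0", "workplace_accessibility_1"],
)
print(means[(1,)])
```

Had the missing entries been stored as 0 instead of NA, the two means would have been halved (4.0 and 6.0), which is the distortion the NA convention avoids.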

field MERGE_ON: dict[str, list[str]] [Required]#

Fields used to merge the proto-population logsums onto the full synthetic population.

The proto-population should be designed such that the logsums can be joined exactly to the full population on the variables specified here. Users specify the fields to join on using:

  • by: An exact merge will be attempted using these discrete variables.

  • asof [optional]: The model can perform an “asof” join for continuous variables, which finds the nearest value. This method should not be necessary since synthetic populations are all discrete.

  • method [optional]: Optional join method can be “soft”; the default is None. For cases where a full inner join is not possible, a Naive Bayes clustering method is available, which is fast but discretely constrained. The proto-population is treated as the “training data” to match each synthetic population value to the best possible proto-population candidate. Some refinement may be necessary to make this procedure work.
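The “by” and “asof” options can be pictured with the following pure-Python sketch: an exact match on the discrete keys, optionally refined by the nearest value of one continuous variable. It is a simplified illustration under stated assumptions, not the model's merge code; `merge_logsums`, the field names, and the sample values are hypothetical, and the “soft” method is not covered.

```python
def merge_logsums(population, proto, by, asof=None):
    """Attach a proto-population logsum to each population row.

    Rows are matched exactly on the `by` fields. If `asof` is given, the
    candidate with the nearest `asof` value is chosen among exact matches.
    """
    index = {}
    for p in proto:
        index.setdefault(tuple(p[k] for k in by), []).append(p)

    merged = []
    for row in population:
        candidates = index.get(tuple(row[k] for k in by), [])
        if not candidates:
            match = None  # no exact match on the discrete keys
        elif asof is None:
            match = candidates[0]
        else:
            match = min(candidates, key=lambda p: abs(p[asof] - row[asof]))
        merged.append({**row, "logsum": match["logsum"] if match else None})
    return merged


# Hypothetical proto-population and synthetic households.
proto = [
    {"auto_ownership": 0, "income": 20000, "logsum": 7.5},
    {"auto_ownership": 0, "income": 80000, "logsum": 8.1},
    {"auto_ownership": 1, "income": 20000, "logsum": 9.2},
]
households = [
    {"household_id": 1, "auto_ownership": 0, "income": 72000},
    {"household_id": 2, "auto_ownership": 1, "income": 25000},
    {"household_id": 3, "auto_ownership": 2, "income": 50000},
]
merged = merge_logsums(households, proto, by=["auto_ownership"], asof="income")
print([m["logsum"] for m in merged])
```

Household 3 receives no logsum because no proto-household has an auto ownership of 2, which shows why the proto-population should cover every combination of the “by” variables found in the synthetic population.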

field NEAREST_METHOD: str = 'skims'#
field ORIGIN_SAMPLE_METHOD: Literal[None, 'full', 'uniform', 'uniform-taz', 'kmeans'] = None#

The method in which origins are sampled.

Population weighted sampling can be TAZ-based or “TAZ-agnostic” using KMeans clustering. The potential advantage of KMeans is to provide a more geographically even spread of MAZs sampled that do not rely on TAZ hierarchies. Unweighted sampling is also possible using ‘uniform’ and ‘uniform-taz’.

  • None [Default] - Sample zones weighted by population, ensuring at least one MAZ is sampled per TAZ. If n-samples > n-tazs, then sample 1 MAZ from each TAZ until n-remaining-samples < n-tazs, then sample n-remaining-samples TAZs and sample an MAZ within each of those TAZs. If n-samples < n-tazs, the procedure skips directly to that final step.

  • “kmeans” - K-Means clustering is performed on the zone centroids (must be provided as maz_centroids.csv), weighted by population. The clustering yields k XY coordinates weighted by zone population for n-samples = k-clusters specified. Once the k cluster centroids are found, each is matched to the nearest available zone centroid, and accessibilities are calculated for those zones. By default, the k-means method is run on 10 different initial cluster seeds (n_init) using the [k-means++ seeding algorithm](https://en.wikipedia.org/wiki/K-means%2B%2B). The k-means method runs for max_iter iterations (default=300).

  • “uniform” - Unweighted sample of N zones independent of each other.

  • “uniform-taz” - Unweighted sample of 1 zone per taz up to the N samples specified.
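The default (None) method described above can be sketched in pure Python as follows. This is one reading of the description, not the ActivitySim implementation: `sample_origins` and `weighted_pick` are hypothetical names, MAZs are drawn without replacement, and exhausted TAZs are dropped from later passes, both of which are assumptions.

```python
import random


def weighted_pick(rng, items, weights):
    """Pick one item by weight, falling back to uniform if all weights are 0."""
    if sum(weights) <= 0:
        return rng.choice(items)
    return rng.choices(items, weights=weights, k=1)[0]


def sample_origins(maz_to_taz, population, n_samples, seed=0):
    """Population-weighted, TAZ-based sample of MAZs (illustrative sketch)."""
    rng = random.Random(seed)
    remaining = {}  # TAZ -> MAZs not yet sampled
    for maz, taz in maz_to_taz.items():
        remaining.setdefault(taz, []).append(maz)
    n_samples = min(n_samples, len(maz_to_taz))
    chosen = []

    def draw_maz(taz):
        mazs = remaining[taz]
        pick = weighted_pick(rng, mazs, [population[m] for m in mazs])
        mazs.remove(pick)
        if not mazs:
            del remaining[taz]  # this TAZ has no MAZs left to sample
        return pick

    # Full passes: one MAZ from every TAZ while enough samples remain.
    while remaining and n_samples - len(chosen) >= len(remaining):
        for taz in list(remaining):
            chosen.append(draw_maz(taz))

    # Final partial pass: sample the leftover number of TAZs (weighted by
    # their remaining population), then one MAZ within each of them.
    tazs = list(remaining)
    for _ in range(n_samples - len(chosen)):
        taz_weights = [sum(population[m] for m in remaining[t]) for t in tazs]
        taz = weighted_pick(rng, tazs, taz_weights)
        tazs.remove(taz)
        chosen.append(draw_maz(taz))
    return chosen


# Hypothetical zone system: six MAZs nested in three TAZs.
maz_to_taz = {101: 1, 102: 1, 201: 2, 202: 2, 301: 3, 302: 3}
population = {101: 50, 102: 10, 201: 0, 202: 30, 301: 5, 302: 5}
origins = sample_origins(maz_to_taz, population, n_samples=4)
print(sorted(origins))
```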

field ORIGIN_SAMPLE_SIZE: float | int = 0#

The number of sampled origins for which the logsum is calculated.

Setting this to zero implies sampling all zones.

Origins without a logsum will draw from the nearest zone with a logsum. This parameter is useful for systems with a large number of zones with similar accessibility. Fractional values less than 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.
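The nearest-zone fallback can be illustrated as follows. This sketch uses straight-line distance between hypothetical centroids purely for simplicity; how the model actually measures nearness (see NEAREST_METHOD) is not shown here, and `fill_missing_logsums` is a hypothetical helper.

```python
import math


def fill_missing_logsums(all_zones, sampled_logsums, distance):
    """Give every zone a logsum, borrowing from the nearest sampled zone."""
    filled = {}
    for zone in all_zones:
        if zone in sampled_logsums:
            filled[zone] = sampled_logsums[zone]
        else:
            nearest = min(sampled_logsums, key=lambda s: distance(zone, s))
            filled[zone] = sampled_logsums[nearest]
    return filled


# Hypothetical zone centroids (x, y) and logsums for two sampled origins.
centroids = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (5.0, 0.0), 4: (6.0, 0.0)}
sampled = {1: 7.0, 4: 9.0}


def euclidean(a, b):
    """Straight-line distance between two zone centroids (an assumption)."""
    return math.dist(centroids[a], centroids[b])


filled = fill_missing_logsums(centroids, sampled, euclidean)
print(filled)
```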

field ORIGIN_WEIGHTING_COLUMN: str [Required]#
field add_size_tables: bool = True#
field annotate_proto_tables: list[DisaggregateAccessibilityAnnotateSettings] = []#

Allows modification of the proto-population.

Annotation configurations are available here, if users wish to modify the proto-population beyond basic generation in the YAML.

field explicit_chunk: float | None = None#

If > 0, use this chunk size instead of adaptive chunking. If less than 1, use this fraction of the total number of rows. If not supplied or None, will default to the chunk size in the location choice model settings.

field postprocess_proto_tables: list[DisaggregateAccessibilityAnnotateSettings] = []#

List of preprocessor settings to apply to the proto-population tables after generation.

field suffixes: DisaggregateAccessibilitySuffixes = DisaggregateAccessibilitySuffixes(source_file_paths=None, SUFFIX='proto_', ROOTS=['persons', 'households', 'tours', 'persons_merged', 'person_id', 'household_id', 'tour_id'])#
field zone_id_names: dict[str, str] = {'index_col': 'zone_id'}#

Examples#

Implementation#