Models#

The currently implemented example ActivitySim AB models are described below. See the example model Sub-Model Specification Files, Example ARC Sub-Model Specification Files, and Example SEMCOG Sub-Model Specification Files for more information.

Initialize#

The initialize model isn’t really a model, but rather a few data processing steps in the data pipeline. The initialize data processing steps code variables used in downstream models, such as household and person value-of-time. This step also pre-loads the land_use, households, persons, and person_windows tables because random seeds are set differently for each step and therefore the sampling of households depends on which step they are initially loaded in.

The main interface to the initialize land use step is the initialize_landuse() function. The main interface to the initialize household step is the initialize_households() function. The main interface to the initialize tours step is the initialize_tours() function. These functions are registered as Inject steps in the example Pipeline.

activitysim.abm.models.initialize.preload_injectables()#

preload bulky injectables up front - stuff that isn’t inserted into the pipeline

Initialize LOS#

The initialize LOS model isn’t really a model, but rather a series of data processing steps in the data pipeline. The initialize LOS model does two things:

  • Loads skims and caches them for later use, if desired

  • Loads network LOS inputs for transit virtual path building (see Transit Virtual Path Builder), pre-computes tap-to-tap total utilities, and caches them for later use, if desired

The main interface to the initialize LOS step is the initialize_los() function. The main interface to the initialize TVPB step is the initialize_tvpb() function. These functions are registered as Inject steps in the example Pipeline.

activitysim.abm.models.initialize_los.initialize_los(network_los)#

Currently, this step is only needed for THREE_ZONE systems in which the tap_tap_utilities are precomputed in the (presumably subsequent) initialize_tvpb step.

Adds attribute_combinations_df table to the pipeline so that it can be used as the slicer for multiprocessing the initialize_tvpb step.

FIXME - this step is only strictly necessary when multiprocessing, but initialize_tvpb would need to be tweaked to instantiate attribute_combinations_df if the pipeline table version were not available.

activitysim.abm.models.initialize_los.initialize_tvpb(network_los, attribute_combinations, chunk_size)#

Initialize STATIC tap_tap_utility cache and write mmap to disk.

uses pipeline attribute_combinations table created in initialize_los to determine which attribute tuples to compute utilities for.

if we are single-processing, this will be the entire set of attribute tuples required to fully populate cache

if we are multiprocessing, then the attribute_combinations will have been sliced and we compute only a subset of the tuples (and the other processes will compute the rest). All processes wait until the cache is fully populated before returning, and the locutor process writes the results.

FIXME - if we did not close this, we could avoid having to reload it from mmap when single-process?

Accessibility#

The accessibilities model is an aggregate model that calculates multiple origin-based accessibility measures by origin zone to all destination zones.

The accessibility measure first multiplies an employment variable by a mode-specific decay function. The product reflects how the difficulty of reaching activities grows the farther (in terms of round-trip travel time) the jobs are from the location in question. The products to each destination zone are next summed for each origin zone, and taking the logarithm of that sum mutes large differences. The decay function on the walk accessibility measure is steeper than automobile or transit. The minimum accessibility is zero.

Level-of-service variables from three time periods are used, specifically the AM peak period (6 am to 10 am), the midday period (10 am to 3 pm), and the PM peak period (3 pm to 7 pm).
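As a rough illustration of the calculation described above, the following sketch computes one accessibility measure with NumPy. The exponential form of the decay function, the coefficient values, and the use of log(1 + x) to keep the minimum at zero are assumptions made for this example; the actual expressions come from the accessibility spec file.

```python
import numpy as np


def accessibility_measure(employment, round_trip_time, decay_coefficient):
    """Log of decay-weighted employment summed over destinations.

    employment: shape (n_dest,), jobs in each destination zone
    round_trip_time: shape (n_orig, n_dest), minutes
    decay_coefficient: negative number (hypothetical value)
    """
    employment = np.asarray(employment, dtype=float)
    round_trip_time = np.asarray(round_trip_time, dtype=float)
    # Assumed decay function: exponential in round-trip travel time.
    decay = np.exp(decay_coefficient * round_trip_time)
    # Sum the employment-times-decay products over destinations per origin.
    weighted_sum = (employment * decay).sum(axis=1)
    # log(1 + x) mutes large differences and is never below zero.
    return np.log1p(weighted_sum)


employment = np.array([100.0, 0.0, 250.0])
round_trip_time = np.array([[5.0, 20.0, 40.0],
                            [20.0, 5.0, 15.0],
                            [40.0, 15.0, 5.0]])
# A steeper (more negative) coefficient stands in for the walk measure.
walk = accessibility_measure(employment, round_trip_time, -0.20)
auto = accessibility_measure(employment, round_trip_time, -0.05)
```

With the same travel times, the steeper walk coefficient yields values no larger than the automobile ones, which mirrors the statement that the walk decay function is steeper.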

Inputs

  • Highway skims for the three periods. Each skim is expected to include a table named “TOLLTIMEDA”, which is the drive alone in-vehicle travel time for automobiles willing to pay a “value” (time-savings) toll.

  • Transit skims for the three periods. Each skim is expected to include the following tables: (i) “IVT”, in-vehicle time; (ii) “IWAIT”, initial wait time; (iii) “XWAIT”, transfer wait time; (iv) “WACC”, walk access time; (v) “WAUX”, auxiliary walk time; and, (vi) “WEGR”, walk egress time.

  • Zonal data with the following fields: (i) “TOTEMP”, total employment; (ii) “RETEMPN”, retail trade employment per the NAICS classification.

Outputs

  • taz, travel analysis zone number

  • autoPeakRetail, the accessibility by automobile during peak conditions to retail employment for this TAZ

  • autoPeakTotal, the accessibility by automobile during peak conditions to all employment

  • autoOffPeakRetail, the accessibility by automobile during off-peak conditions to retail employment

  • autoOffPeakTotal, the accessibility by automobile during off-peak conditions to all employment

  • transitPeakRetail, the accessibility by transit during peak conditions to retail employment

  • transitPeakTotal, the accessibility by transit during peak conditions to all employment

  • transitOffPeakRetail, the accessibility by transit during off-peak conditions to retail employment

  • transitOffPeakTotal, the accessibility by transit during off-peak conditions to all employment

  • nonMotorizedRetail, the accessibility by walking during all time periods to retail employment

  • nonMotorizedTotal, the accessibility by walking during all time periods to all employment

The main interface to the accessibility model is the compute_accessibility() function. This function is registered as an Inject step in the example Pipeline.

Core Table: skims | Result Table: accessibility | Skims Keys: O-D, D-O

activitysim.abm.models.accessibility.compute_accessibility(land_use, accessibility, network_los, chunk_size, trace_od)#

Compute accessibility for each zone in land use file using expressions from accessibility_spec

The actual results depend on the expressions in accessibility_spec, but this is initially intended to permit implementation of the mtc accessibility calculation as implemented by Accessibility.job

Compute measures of accessibility used by the automobile ownership model. The accessibility measure first multiplies an employment variable by a mode-specific decay function. The product reflects how the difficulty of reaching activities grows the farther (in terms of round-trip travel time) the jobs are from the location in question. The products to each destination zone are next summed for each origin zone, and taking the logarithm of that sum mutes large differences. The decay function on the walk accessibility measure is steeper than automobile or transit. The minimum accessibility is zero.

Disaggregate Accessibility#

The disaggregate accessibility model is an extension of the base accessibility model. While the base accessibility model is based on a mode-specific decay function and uses fixed market segments in the population (i.e., income), the disaggregate accessibility model extracts the actual destination choice logsums by purpose (i.e., mandatory fixed school/work location and non-mandatory tour destinations by purpose) from the actual model calculations using a user-defined proto-population. This enables users to include features that may be more critical to destination choice than just income (e.g., automobile ownership).

Inputs:
  • disaggregate_accessibility.yaml - Configuration settings for disaggregate accessibility model.

  • annotate.csv [optional] - Users can specify additional annotations specific to disaggregate accessibility. For example, annotating the proto-population tables.

Outputs:
  • final_disaggregate_accessibility.csv [optional]

  • final_non_mandatory_tour_destination_accesibility.csv [optional]

  • final_workplace_location_accessibility.csv [optional]

  • final_school_location_accessibility.csv [optional]

  • final_proto_persons.csv [optional]

  • final_proto_households.csv [optional]

  • final_proto_tours.csv [optional]

The above tables are created in the model pipeline, but the model will not save any outputs unless specified in settings.yaml - output_tables. Users can return the proto-population tables for inspection, as well as the raw logsum accessibilities for mandatory school/work and non-mandatory destinations. The logsums are then merged at the household level in final_disaggregate_accessibility.csv, with the logsums for each tour purpose shown as separate columns.

Usage

The disaggregate accessibility model is run as a model step in the model list. There are two necessary steps:

  • initialize_proto_population

  • compute_disaggregate_accessibility

The reason the steps must be separate is to enable multiprocessing. The proto-population must be fully generated and initialized before activitysim slices the tables into separate threads. These steps must also occur before initialize_households in order to avoid conflict with the shadow_pricing model.

The model steps can be run either as part of the activitysim model run or set up as a standalone run that pre-computes the accessibility values. For standalone implementations, the final_disaggregate_accessibility.csv is read into the pipeline and initialized with the initialize_households model step.

Configuration of disaggregate_accessibility.yaml:
  • CREATE_TABLES - Users define the variables to be generated for PROTO_HOUSEHOLDS, PROTO_PERSONS, and PROTO_TOURS tables. These tables must include all basic fields necessary for running the actual model. Additional fields can be annotated in pre-processing using the annotation settings of this file. The base variables in each table are defined using the following parameters:

    • VARIABLES - The base variable, must be a value or a list. Results in the cartesian product (all non-repeating combinations) of the fields.

    • mapped_fields [optional] - For non-combinatorial fields, users can map a variable to the fields generated in VARIABLES (e.g., income category bins mapped to median dollar values).

    • filter_rows [optional] - Users can also filter rows using pandas expressions if specific variable combinations are not desired.

    • JOIN_ON [required only for PROTO_TOURS] - specify the persons variable to join the tours to (e.g., person_number).

  • MERGE_ON - User-specified fields used to merge the proto-population logsums onto the full synthetic population. The proto-population should be designed so that the logsums can be joined exactly to the full population on these variables. Users specify the fields to join on using:

    • by: An exact merge will be attempted using these discrete variables.

    • asof [optional]: The model can perform an “asof” join for continuous variables, which finds the nearest value. This method should not be necessary since synthetic populations are all discrete.

    • method [optional]: Optional join method can be “soft”, default is None. For cases where a full inner join is not possible, a Naive Bayes clustering method is available; it is fast but discretely constrained. The proto-population is treated as the “training data” to match each synthetic population value to the best possible proto-population candidate. Some refinement may be necessary to make this procedure work.

  • annotate_proto_tables [optional] - Annotation configurations if users wish to modify the proto-population beyond basic generation in the YAML.

  • DESTINATION_SAMPLE_SIZE - The destination sample size (0 = all zones), e.g., the number of destination zone alternatives sampled for calculating the destination logsum. Decimal values < 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.

  • ORIGIN_SAMPLE_SIZE - The origin sample size (0 = all zones), e.g., the number of origins where logsum is calculated. Origins without a logsum will draw from the nearest zone with a logsum. This parameter is useful for systems with a large number of zones with similar accessibility. Decimal values < 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.

  • ORIGIN_SAMPLE_METHOD - The method in which origins are sampled. Population weighted sampling can be TAZ-based or “TAZ-agnostic” using KMeans clustering. The potential advantage of KMeans is to provide a more geographically even spread of MAZs sampled that do not rely on TAZ hierarchies. Unweighted sampling is also possible using ‘uniform’ and ‘uniform-taz’.

    • None [Default] - Sample zones weighted by population, ensuring at least one MAZ is sampled per TAZ. If n-samples > n-tazs then sample 1 MAZ from each TAZ until n-remaining-samples < n-tazs, then sample n-remaining-samples TAZs and sample an MAZ within each of those TAZs. If n-samples < n-tazs, then it proceeds to the above ‘then’ condition.

    • “kmeans” - K-Means clustering is performed on the zone centroids (must be provided as maz_centroids.csv), weighted by population. The clustering yields k XY coordinates weighted by zone population for n-samples = k-clusters specified. Once the k new cluster centroids are found, each is snapped to the nearest available zone centroid, and those zones are used to calculate accessibilities. By default, the k-means method is run on 10 different initial cluster seeds (n_init) using the “k-means++” seeding algorithm (https://en.wikipedia.org/wiki/K-means%2B%2B). The k-means method runs for max_iter iterations (default=300).

    • “uniform” - Unweighted sample of N zones independent of each other.

    • “uniform-taz” - Unweighted sample of 1 zone per taz up to the N samples specified.
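To illustrate how the VARIABLES, mapped_fields, and filter_rows settings under CREATE_TABLES fit together, here is a minimal pandas sketch of the cartesian-product idea. The function name, variable names, category values, and filter expression are all hypothetical, and this is not the ActivitySim implementation.

```python
from itertools import product

import pandas as pd


def build_proto_table(variables, mapped_fields=None, filter_rows=None):
    """Build a proto-population table from lists of category values.

    variables: dict of column name -> list of values (like VARIABLES)
    mapped_fields: dict of source column -> {new column: {value: mapped}}
    filter_rows: optional pandas query expression
    """
    # Cartesian product: every combination of the listed values, once.
    rows = list(product(*variables.values()))
    table = pd.DataFrame(rows, columns=list(variables))
    # Non-combinatorial fields are derived from an existing column.
    for source, targets in (mapped_fields or {}).items():
        for new_column, mapping in targets.items():
            table[new_column] = table[source].map(mapping)
    # Drop combinations that are not wanted.
    if filter_rows:
        table = table.query(filter_rows).reset_index(drop=True)
    return table


proto_households = build_proto_table(
    variables={"hhsize": [1, 2], "auto_ownership": [0, 1, 2],
               "income_segment": [1, 2]},
    mapped_fields={"income_segment": {"income": {1: 15000, 2: 60000}}},
    filter_rows="auto_ownership <= hhsize",
)
```

Here the 12 raw combinations are reduced to 10 rows because the filter removes one-person households with two automobiles.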

Work From Home#

Telecommuting refers to workers who work from home instead of traveling to a workplace. It only applies to workers with a regular workplace outside of the home. The telecommute model consists of two submodels: this work from home model and a person Telecommute Frequency model. This model predicts, for all workers, whether they usually work from home.

The work from home model includes the ability to adjust a work from home alternative constant to attempt to realize a work from home percent for what-if type analysis. This iterative single process procedure takes as input a number of iterations, a filter on the choosers to use for the calculation, a target work from home percent, a tolerance percent for convergence, and the name of the coefficient to adjust. An example setup is provided and the coefficient adjustment at each iteration is: new_coefficient = log( target_percent / current_percent ) + current_coefficient.
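The adjustment rule above can be sketched as a short loop. This is an illustration only: it assumes a binary logit in which the coefficient is an additive constant on the work from home utility, it uses expected shares rather than simulated choices, and the function name, the utilities, and the parameter values are hypothetical. Shares are used in place of percents, which leaves the ratio in the formula unchanged.

```python
import numpy as np


def calibrate_constant(base_utility, target_share, coefficient=0.0,
                       max_iterations=50, tolerance=0.001):
    """Adjust a constant until the work from home share hits the target."""
    base_utility = np.asarray(base_utility, dtype=float)
    current_share = None
    for _ in range(max_iterations):
        # Binary logit probability of working from home for each chooser.
        probability = 1.0 / (1.0 + np.exp(-(base_utility + coefficient)))
        current_share = probability.mean()
        if abs(current_share - target_share) <= tolerance:
            break
        # new_coefficient = log(target / current) + current_coefficient
        coefficient = np.log(target_share / current_share) + coefficient
    return coefficient, current_share


base_utility = np.linspace(-2.0, 0.0, 101)  # hypothetical chooser utilities
coefficient, share = calibrate_constant(base_utility, target_share=0.30)
low_coefficient, low_share = calibrate_constant(base_utility, target_share=0.10)
```

A lower target share drives the constant downward, and the loop stops as soon as the share is within the tolerance or the iteration limit is reached.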

The main interface to the work from home model is the work_from_home() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Field: work_from_home | Skims Keys: NA

activitysim.abm.models.work_from_home.work_from_home(persons_merged, persons, chunk_size, trace_hh_id)#

This model predicts whether a person (worker) works from home. The output from this model is TRUE (if works from home) or FALSE (works away from home). The workplace location choice is overridden for workers who work from home and set to -1.

School Location#

The usual school location choice models assign a usual school location for the primary mandatory activity of each child and university student in the synthetic population. The models are composed of a set of accessibility-based parameters (including one-way distance between home and primary destination and the tour mode choice logsum - the expected maximum utility in the mode choice model which is given by the logarithm of the sum of exponentials in the denominator of the logit formula) and size terms, which describe the quantity of grade-school or university opportunities in each possible destination.

The school location model is made up of four steps:
  • sampling - selects a sample of alternative school locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative school location.

  • simulate - starts with the table created above and chooses a final school location, this time with the mode choice logsum included.

  • shadow prices - compare modeled zonal destinations to target zonal size terms and calculate updated shadow prices.

These steps are repeated until shadow pricing convergence criteria are satisfied or a max number of iterations is reached. See Shadow Pricing.
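The control flow of this iteration can be sketched as follows. The three callables are hypothetical stand-ins for the sample/logsums/simulate sequence, the shadow price update, and the convergence test; the sketch shows only the loop structure, not how ActivitySim implements each part.

```python
def iterate_until_converged(run_choice, update_shadow_prices, has_converged,
                            max_iterations):
    """Repeat choice and shadow price update until converged or exhausted."""
    shadow_prices = None
    choices = None
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        # sample -> logsums -> simulate, given the current shadow prices
        choices = run_choice(shadow_prices)
        if has_converged(choices):
            break
        # compare modeled results to targets and update the shadow prices
        shadow_prices = update_shadow_prices(choices, shadow_prices)
    return choices, iteration


# Toy stand-ins: a single "gap" value that halves after each update.
def toy_run_choice(shadow_prices):
    return 8.0 if shadow_prices is None else 8.0 / (2 ** shadow_prices)


def toy_update(choices, shadow_prices):
    return 1 if shadow_prices is None else shadow_prices + 1


def toy_converged(choices):
    return choices <= 1.0


converged_result = iterate_until_converged(
    toy_run_choice, toy_update, toy_converged, max_iterations=10)
capped_result = iterate_until_converged(
    toy_run_choice, toy_update, toy_converged, max_iterations=2)
```

The second call shows the other exit path: the loop stops at the iteration limit even though the convergence test has not passed.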

School location choice for placeholder_multiple_zone models uses Presampling by default.

The main interface to the model is the school_location() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: persons | Result Field: school_taz | Skims Keys: TAZ, alt_dest, AM time period, MD time period

Work Location#

The usual work location choice models assign a usual work location for the primary mandatory activity of each employed person in the synthetic population. The models are composed of a set of accessibility-based parameters (including one-way distance between home and primary destination and the tour mode choice logsum - the expected maximum utility in the mode choice model which is given by the logarithm of the sum of exponentials in the denominator of the logit formula) and size terms, which describe the quantity of work opportunities in each possible destination.

The work location model is made up of four steps:
  • sample - selects a sample of alternative work locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative work location.

  • simulate - starts with the table created above and chooses a final work location, this time with the mode choice logsum included.

  • shadow prices - compare modeled zonal destinations to target zonal size terms and calculate updated shadow prices.

These steps are repeated until shadow pricing convergence criteria are satisfied or a max number of iterations is reached. See Shadow Pricing.

Work location choice for placeholder_multiple_zone models uses Presampling by default.

The main interface to the model is the workplace_location() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: persons | Result Field: workplace_taz | Skims Keys: TAZ, alt_dest, AM time period, PM time period

activitysim.abm.models.location_choice.iterate_location_choice(model_settings, persons_merged, persons, households, network_los, estimator, chunk_size, trace_hh_id, locutor, trace_label)#

iterate run_location_choice updating shadow pricing until convergence criteria satisfied or max_iterations reached.

(If use_shadow_pricing not enabled, then just iterate once)

Parameters
model_settings : dict
persons_merged : injected table
persons : injected table
network_los : los.Network_LOS
chunk_size : int
trace_hh_id : int
locutor : bool

whether this process is the privileged logger of shadow_pricing when multiprocessing

trace_label : str
Returns
adds choice column model_settings[‘DEST_CHOICE_COLUMN_NAME’]
adds logsum column model_settings[‘DEST_CHOICE_LOGSUM_COLUMN_NAME’] - if provided
adds annotations to persons table
activitysim.abm.models.location_choice.run_location_choice(persons_merged_df, network_los, shadow_price_calculator, want_logsums, want_sample_table, estimator, model_settings, chunk_size, chunk_tag, trace_hh_id, trace_label, skip_choice=False)#

Run the three-part location choice algorithm to generate a location choice for each chooser

Handle the various segments separately and in turn for simplicity of expression files

Parameters
persons_merged_df : pandas.DataFrame

persons table merged with households and land_use

network_los : los.Network_LOS
shadow_price_calculator : ShadowPriceCalculator

to get size terms

want_logsums : boolean
want_sample_table : boolean
estimator : Estimator object
model_settings : dict
chunk_size : int
trace_hh_id : int
trace_label : str
Returns
choices : pandas.DataFrame indexed by persons_merged_df.index

‘choice’ : location choices (zone ids)
‘logsum’ : float logsum of choice utilities across alternatives

logsums optional & only returned if DEST_CHOICE_LOGSUM_COLUMN_NAME specified in model_settings
activitysim.abm.models.location_choice.run_location_logsums(segment_name, persons_merged_df, network_los, location_sample_df, model_settings, chunk_size, chunk_tag, trace_label)#

add logsum column to existing location_sample table

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in location_sample, and computing the logsum of all the utilities

PERID | dest_zone_id | rand | pick_count | logsum (added)
23750 | 14 | 0.565502716034 | 4 | 1.85659498857
23750 | 16 | 0.711135838871 | 6 | 1.92315598631
23751 | 12 | 0.408038878552 | 1 | 2.40612135416
23751 | 14 | 0.972732479292 | 2 | 1.44009018355
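For reference, the logsum of a set of utilities is the logarithm of the sum of their exponentials. The helper below is a generic NumPy sketch of that formula, not the ActivitySim code path; the function name and the example utilities are made up, and the subtraction of the maximum is a standard numerical-stability step that does not change the result.

```python
import numpy as np


def logsum(utilities):
    """Log of the sum of exponentiated utilities along the last axis."""
    utilities = np.asarray(utilities, dtype=float)
    # Shift by the maximum so that exp() cannot overflow.
    shift = utilities.max(axis=-1, keepdims=True)
    total = np.exp(utilities - shift).sum(axis=-1, keepdims=True)
    return (shift + np.log(total)).squeeze(-1)


# One row of hypothetical mode utilities per (person, dest_zone_id) pair.
mode_utilities = np.array([[0.0, 0.0],
                           [1.5, -0.5],
                           [1000.0, 1000.0]])
pair_logsums = logsum(mode_utilities)
```

The result is always at least as large as the best single utility, which is why it can be read as the expected maximum utility across modes.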

activitysim.abm.models.location_choice.run_location_sample(segment_name, persons_merged, network_los, dest_size_terms, estimator, model_settings, chunk_size, chunk_tag, trace_label)#

select a sample of alternative locations.

Logsum calculations are expensive, so we build a table of persons * all zones and then select a sample subset of potential locations

The sample subset is generated by making multiple choices (<sample_size> number of choices), which results in a sample containing up to <sample_size> choices for each chooser (e.g. person) and a pick_count indicating how many times that choice was selected for that chooser.

person_id, dest_zone_id, rand, pick_count
23750, 14, 0.565502716034, 4
23750, 16, 0.711135838871, 6
…
23751, 12, 0.408038878552, 1
23751, 14, 0.972732479292, 2
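The sampling-with-pick_count idea can be sketched with NumPy and pandas. This is a simplified illustration under stated assumptions: the choice probabilities are given directly rather than computed from a sample utility, the rand column is omitted, and the function name and input values are hypothetical.

```python
import numpy as np
import pandas as pd


def sample_alternatives(probabilities, sample_size, rng):
    """Draw sample_size zones per chooser and count repeated picks.

    probabilities: DataFrame indexed by person_id with one column per
    destination zone id; each row sums to 1.
    """
    records = []
    for person_id, row in probabilities.iterrows():
        # Draw with replacement, so the same zone can be picked repeatedly.
        picks = rng.choice(row.index.to_numpy(), size=sample_size,
                           replace=True, p=row.to_numpy())
        zones, counts = np.unique(picks, return_counts=True)
        for zone, count in zip(zones, counts):
            records.append((person_id, int(zone), int(count)))
    return pd.DataFrame(records,
                        columns=["person_id", "dest_zone_id", "pick_count"])


probabilities = pd.DataFrame(
    [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]],
    index=pd.Index([23750, 23751], name="person_id"),
    columns=[12, 14, 16],
)
sample = sample_alternatives(probabilities, sample_size=10,
                             rng=np.random.default_rng(0))
```

Each chooser ends up with at most sample_size distinct rows, and the pick_count values for a chooser always add up to sample_size.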

activitysim.abm.models.location_choice.run_location_simulate(segment_name, persons_merged, location_sample_df, network_los, dest_size_terms, want_logsums, estimator, model_settings, chunk_size, chunk_tag, trace_label, skip_choice=False)#

run location model on location_sample annotated with mode_choice logsum to select a dest zone from sample alternatives

Returns
choices : pandas.DataFrame indexed by persons_merged_df.index

choice : location choices (zone ids)
logsum : float logsum of choice utilities across alternatives

logsums optional & only returned if DEST_CHOICE_LOGSUM_COLUMN_NAME specified in model_settings
activitysim.abm.models.location_choice.school_location(persons_merged, persons, households, network_los, chunk_size, trace_hh_id, locutor)#

School location choice model

iterate_location_choice adds location choice column and annotations to persons table

activitysim.abm.models.location_choice.workplace_location(persons_merged, persons, households, network_los, chunk_size, trace_hh_id, locutor)#

workplace location choice model

iterate_location_choice adds location choice column and annotations to persons table

activitysim.abm.models.location_choice.write_estimation_specs(estimator, model_settings, settings_file)#

write sample_spec, spec, and coefficients to estimation data bundle

Parameters
model_settings
settings_file

Shadow Pricing#

The shadow pricing calculator used by work and school location choice.

Turning on and saving shadow prices

Shadow pricing is activated by setting the use_shadow_pricing to True in the settings.yaml file. Once this setting has been activated, ActivitySim will search for shadow pricing configuration in the shadow_pricing.yaml file. When shadow pricing is activated, the shadow pricing outputs will be exported by the tracing engine. As a result, the shadow pricing output files will be prepended with trace followed by the iteration number the results represent. For example, the shadow pricing outputs for iteration 3 of the school location model will be called trace.shadow_price_school_shadow_prices_3.csv.

In total, ActivitySim generates three types of output files for each model with shadow pricing:

  • trace.shadow_price_<model>_desired_size.csv The size terms by zone that the ctramp and daysim methods are attempting to target. These equal the size term columns in the land use data multiplied by size term coefficients.

  • trace.shadow_price_<model>_modeled_size_<iteration>.csv These are the modeled size terms after the iteration of shadow pricing identified by the <iteration> number. In other words, these are the predicted choices by zone and segment for the model after the iteration completes. (Not applicable for simulation option.)

  • trace.shadow_price_<model>_shadow_prices_<iteration>.csv The actual shadow price for each zone and segment after the <iteration> of shadow pricing. This is the file that can be used to warm start the shadow pricing mechanism in ActivitySim. (Not applicable for simulation option.)

There are three shadow pricing methods in activitysim: ctramp, daysim, and simulation. The first two methods try to match model output with workplace/school location model size terms, while the last method matches model output with actual employment/enrollment data.

The simulation approach proceeds as follows. First, every worker / student is assigned without shadow prices applied. The modeled share and the target share for each zone are then compared. If a zone is over-assigned, a sample of people from that zone is selected for re-simulation. Shadow prices for over-assigned zones are set to -999 for the next iteration, which removes those zones from the set of alternatives. The sampled people are then forced to choose from the under-assigned zones that still have the initial shadow price of 0. (In this approach, the shadow price variable is really just a switch turning a zone on or off for selection in subsequent iterations. For this reason, warm-start functionality is not applicable to this approach.) This process repeats until the overall convergence criteria are met or the maximum number of allowed iterations is reached.
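One pass of the simulation approach might look like the sketch below. It is illustrative only: it compares counts rather than shares, it assumes the number of people re-simulated from a zone equals that zone's excess over its target, and the function name and toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def resimulation_step(choices, targets, rng):
    """Flag over-assigned zones and pick the people to re-simulate.

    choices: Series of chosen zone id, indexed by person id
    targets: Series of target count, indexed by zone id
    """
    modeled = choices.value_counts().reindex(targets.index, fill_value=0)
    over_assigned = (modeled > targets).to_numpy()
    # -999 switches a zone off for the next iteration; 0 leaves it available.
    shadow_prices = pd.Series(np.where(over_assigned, -999.0, 0.0),
                              index=targets.index)
    resimulate = []
    for zone in targets.index[over_assigned]:
        people = choices.index[(choices == zone).to_numpy()].to_numpy()
        # Assumption: re-simulate exactly the excess above the target.
        n_excess = int(modeled.loc[zone] - targets.loc[zone])
        resimulate.extend(rng.choice(people, size=n_excess,
                                     replace=False).tolist())
    return shadow_prices, sorted(resimulate)


choices = pd.Series([1, 1, 1, 2], index=[10, 11, 12, 13])
targets = pd.Series({1: 2, 2: 1, 3: 1})
shadow_prices, resimulate = resimulation_step(
    choices, targets, np.random.default_rng(0))
```

In this toy case zone 1 has three people against a target of two, so it is switched off and one of its three people is returned for re-simulation among zones 2 and 3.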

Because the simulation approach only re-simulates workers / students who were over-assigned in the previous iteration, run time is significantly less (~90%) than the CTRAMP or DaySim approaches which re-simulate all workers and students at each iteration.

shadow_pricing.yaml Attributes

  • shadow_pricing_models List model_selectors and model_names of models that use shadow pricing. This list identifies which size_terms to preload which must be done in single process mode, so predicted_size tables can be scaled to population

  • LOAD_SAVED_SHADOW_PRICES global switch to enable/disable loading of saved shadow prices. From the above example, this would be trace.shadow_price_<model>_shadow_prices_<iteration>.csv renamed and stored in the data_dir.

  • MAX_ITERATIONS If no loaded shadow prices, maximum number of times shadow pricing can be run on each model before proceeding to the next model.

  • MAX_ITERATIONS_SAVED If loaded shadow prices, maximum number of times shadow pricing can be run.

  • SIZE_THRESHOLD Ignore zones in failure calculation (ctramp or daysim method) with smaller size term value than size_threshold.

  • TARGET_THRESHOLD Ignore zones in failure calculation (simulation method) with smaller employment/enrollment than target_threshold.

  • PERCENT_TOLERANCE Maximum percent difference between modeled and desired size terms

  • FAIL_THRESHOLD percentage of zones exceeding the PERCENT_TOLERANCE considered a failure

  • SHADOW_PRICE_METHOD [ctramp | daysim | simulation]

  • workplace_segmentation_targets dict matching workplace segment to landuse employment column target. Only used as part of simulation option. If multiple segments list the same target column, the segments will be added together for comparison. (Same with the school option below.)

  • school_segmentation_targets dict matching school segment to landuse enrollment column target. Only used as part of simulation option.

  • DAMPING_FACTOR On each iteration, ActivitySim will attempt to adjust the model to match desired size terms. The number is multiplied by adjustment factor to dampen or amplify the ActivitySim calculation. (only for CTRAMP)

  • DAYSIM_ABSOLUTE_TOLERANCE Absolute tolerance for DaySim option

  • DAYSIM_PERCENT_TOLERANCE Relative tolerance for DaySim option

  • WRITE_ITERATION_CHOICES [True | False ] Writes the choices of each person out to the trace folder. Used for debugging or checking iteration convergence. WARNING: every person is written for each sub-process so the disk space used can get large.
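To show how SIZE_THRESHOLD, PERCENT_TOLERANCE, and FAIL_THRESHOLD could interact in a convergence test, here is a hedged sketch. The function name is hypothetical, and the choice to compute the failing percentage over only the zones that pass the size threshold is an assumption; the update_shadow_prices and check_fit docstrings describe the actual behavior.

```python
import numpy as np


def shadow_price_converged(modeled_size, desired_size, size_threshold,
                           percent_tolerance, fail_threshold):
    """Return True if few enough zones miss their desired size."""
    modeled_size = np.asarray(modeled_size, dtype=float)
    desired_size = np.asarray(desired_size, dtype=float)
    # Ignore small zones; assumes size_threshold > 0 so division is safe.
    considered = desired_size >= size_threshold
    if not considered.any():
        return True
    modeled = modeled_size[considered]
    desired = desired_size[considered]
    percent_diff = 100.0 * np.abs(modeled - desired) / desired
    # Share of considered zones whose difference exceeds the tolerance.
    percent_failing = 100.0 * (percent_diff > percent_tolerance).mean()
    return bool(percent_failing <= fail_threshold)


modeled_size = [105.0, 50.0, 3.0]
desired_size = [100.0, 100.0, 2.0]
strict = shadow_price_converged(modeled_size, desired_size, size_threshold=10,
                                percent_tolerance=10, fail_threshold=10)
lenient = shadow_price_converged(modeled_size, desired_size, size_threshold=10,
                                 percent_tolerance=10, fail_threshold=60)
```

The third zone is ignored because its desired size is below the threshold, so one of the two remaining zones fails, which is 50 percent of the zones considered.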

activitysim.abm.tables.shadow_pricing.block_name(model_selector)#

return canonical block name for model_selector

Ordinarily and ideally this would just be model_selector, but since mp_tasks saves all shared data blocks in a common dict to pass to sub-tasks, we want to be able override block naming convention to handle any collisions between model_selector names and skim names. Until and unless that happens, we just use model_selector name.

Parameters
model_selector
Returns
block_namestr

canonical block name

activitysim.abm.tables.shadow_pricing.buffers_for_shadow_pricing(shadow_pricing_info)#

Allocate shared_data buffers for multiprocess shadow pricing

Allocates one buffer per model_selector. Buffer datatype and shape specified by shadow_pricing_info

buffers are multiprocessing.Array (RawArray protected by a multiprocessing.Lock wrapper) We don’t actually use the wrapped version as it slows access down and doesn’t provide protection for numpy-wrapped arrays, but it does provide a convenient way to bundle RawArray and an associated lock. (ShadowPriceCalculator uses the lock to coordinate access to the numpy-wrapped RawArray.)
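The bundling described above can be illustrated with the standard library and NumPy. This is a sketch of the general idiom, not the ActivitySim code: the function name is hypothetical, and the float64 datatype is an assumption, since the real dtype comes from shadow_pricing_info.

```python
import multiprocessing

import numpy as np


def allocate_shared_block(num_zones, num_segments):
    """Allocate a shared buffer plus a numpy view and its lock."""
    shape = (num_zones, num_segments + 1)  # extra column, as described
    # multiprocessing.Array bundles a RawArray with a lock.
    buffer = multiprocessing.Array("d", shape[0] * shape[1])
    lock = buffer.get_lock()
    # Wrap the underlying RawArray directly; access through this view is
    # NOT protected, so callers must hold the lock themselves.
    data = np.frombuffer(buffer.get_obj(), dtype=np.float64).reshape(shape)
    return buffer, data, lock


buffer, data, lock = allocate_shared_block(num_zones=4, num_segments=2)
with lock:
    data[1, 0] += 10.0  # the write goes straight to the shared memory
```

Because the numpy array is a view onto the same memory, the update is visible through the original buffer and would be visible to any sub-process that was handed the buffer.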

Parameters
shadow_pricing_info : dict
Returns
data_buffers : dict {<model_selector> : <shared_data_buffer>}
dict of multiprocessing.Array keyed by model_selector
activitysim.abm.tables.shadow_pricing.buffers_for_shadow_pricing_choice(shadow_pricing_choice_info)#

Same as the buffers_for_shadow_pricing function above, except that now we need to store the actual choices for the simulation-based shadow pricing method.

This allocates a multiprocessing.Array that can store the choice for each person and then wraps a dataframe around it. That means the dataframe can be shared and accessed across all threads.

Parameters
shadow_pricing_info : dict
Returns
data_buffers : dict {<model_selector> : <shared_data_buffer>}
dict of multiprocessing.Array keyed by model_selector and wrapped in a pandas dataframe

activitysim.abm.tables.shadow_pricing.get_shadow_pricing_choice_info()#

return dict with info about dtype and shapes of desired and modeled size tables

block shape is (num_zones, num_segments + 1)

Returns
shadow_pricing_info: dict

dtype: <sp_dtype>, block_shapes: dict {<model_selector>: <block_shape>}

activitysim.abm.tables.shadow_pricing.get_shadow_pricing_info()#

return dict with info about dtype and shapes of desired and modeled size tables

block shape is (num_zones, num_segments + 1)

Returns
shadow_pricing_info: dict

dtype: <sp_dtype>, block_shapes: dict {<model_selector>: <block_shape>}

activitysim.abm.tables.shadow_pricing.load_shadow_price_calculator(model_settings)#

Initialize ShadowPriceCalculator for model_selector (e.g. school or workplace)

If multiprocessing, get the shared_data buffer to coordinate global_desired_size calculation across sub-processes

Parameters
model_settings : dict
Returns
spc : ShadowPriceCalculator
activitysim.abm.tables.shadow_pricing.logger = <Logger activitysim.abm.tables.shadow_pricing (WARNING)>#

ShadowPriceCalculator and associated utility methods

See docstrings for documentation on:

  • update_shadow_prices: how shadow_price coefficients are calculated

  • synchronize_modeled_size: interprocess communication to compute aggregate modeled_size

  • check_fit: convergence criteria for shadow_price iteration

Important concepts and variables:

model_selector: str

Identifies a specific location choice model (e.g. ‘school’, ‘workplace’). The various models work similarly, but use different expression files, model settings, etc.

segment: str

Identifies a specific demographic segment of a model (e.g. ‘elementary’ segment of ‘school’). Models can have different size term coefficients (in the destination_choice_size_terms file) and different utility coefficients in the model’s location and location_sample csv expression files.

size_table: pandas.DataFrame

activitysim.abm.tables.shadow_pricing.shadow_price_data_from_buffers(data_buffers, shadow_pricing_info, model_selector)#
Parameters
data_buffersdict of {<model_selector><multiprocessing.Array>}

multiprocessing.Array is simply a convenient way to bundle Array and Lock. We extract the lock and wrap the RawArray in a numpy array for convenience in indexing. The shared data buffer has shape (<num_zones>, <num_segments> + 1); the extra column is for reverse semaphores with TALLY_CHECKIN and TALLY_CHECKOUT.

shadow_pricing_infodict
dict of useful info

dtype: sp_dtype, block_shapes : OrderedDict({<model_selector>: <shape tuple>}) dict mapping model_selector to block shape (including extra column for semaphores), e.g. {‘school’: (num_zones, num_segments + 1)}

model_selectorstr

location type model_selector (e.g. school or workplace)

Returns
shared_data, shared_data_lock

shared_data : numpy array wrapping multiprocessing.RawArray or None (if single process)

shared_data_lock : lock extracted from the multiprocessing.Array or None (if single process)
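
The buffer handling described above can be illustrated with a minimal sketch. It is not the ActivitySim implementation: the sizes, the 64-bit integer dtype, and the variable names are assumptions chosen for illustration. It only shows the general pattern of extracting the lock from a multiprocessing.Array and wrapping the underlying RawArray in a numpy array.

```python
import ctypes
import multiprocessing

import numpy as np

# Hypothetical sizes, for illustration only.
num_zones, num_segments = 4, 2
block_shape = (num_zones, num_segments + 1)  # extra column for the tally semaphores

# multiprocessing.Array bundles a RawArray with a lock.
buffer = multiprocessing.Array(ctypes.c_int64, num_zones * (num_segments + 1))

# Extract the lock, then wrap the underlying RawArray in a numpy array (no copy).
shared_data_lock = buffer.get_lock()
shared_data = np.frombuffer(buffer.get_obj(), dtype=np.int64).reshape(block_shape)

# Hold the lock while modifying the shared block.
with shared_data_lock:
    shared_data[0, 0] += 5  # the write lands in the shared memory itself
```

Because the numpy array is a view rather than a copy, indexing it is fast, but nothing enforces locking automatically; the caller has to acquire the lock explicitly, as in the `with` block.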

activitysim.abm.tables.shadow_pricing.shadow_price_data_from_buffers_choice(data_buffers, shadow_pricing_info, model_selector)#
Parameters
data_buffersdict of {<model_selector><multiprocessing.Array>}

multiprocessing.Array is simply a convenient way to bundle Array and Lock. We extract the lock and wrap the RawArray in a numpy array for convenience in indexing. The shared data buffer has shape (<num_zones>, <num_segments> + 1); the extra column is for reverse semaphores with TALLY_CHECKIN and TALLY_CHECKOUT.

shadow_pricing_infodict
dict of useful info

dtype: sp_dtype, block_shapes : OrderedDict({<model_selector>: <shape tuple>}) dict mapping model_selector to block shape (including extra column for semaphores), e.g. {‘school’: (num_zones, num_segments + 1)}

model_selectorstr

location type model_selector (e.g. school or workplace)

Returns
shared_data, shared_data_lock

shared_data : numpy array wrapping multiprocessing.RawArray or None (if single process)

shared_data_lock : lock extracted from the multiprocessing.Array or None (if single process)

activitysim.abm.tables.shadow_pricing.size_table_name(model_selector)#

Returns canonical name of injected destination desired_size table

Parameters
model_selectorstr

e.g. school or workplace

Returns
table_namestr

Transit Pass Subsidy#

The transit fare discount applies to persons who purchase or are provided a transit pass. It consists of two submodels: this transit pass subsidy model and a person Transit Pass Ownership model. The result of this model can be used to condition downstream models, such as the person Transit Pass Ownership model and the tour and trip mode choice models, via fare discount adjustments.

The main interface to the transit pass subsidy model is the transit_pass_subsidy() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Field: transit_pass_subsidy | Skims Keys: NA

activitysim.abm.models.transit_pass_subsidy.transit_pass_subsidy(persons_merged, persons, chunk_size, trace_hh_id)#

Transit pass subsidy model.

Transit Pass Ownership#

The transit fare discount applies to persons who purchase or are provided a transit pass. It consists of two submodels: this transit pass ownership model and a person Transit Pass Subsidy model. The result of this model can be used to condition downstream models, such as the tour and trip mode choice models, via fare discount adjustments.

The main interface to the transit pass ownership model is the transit_pass_ownership() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Field: transit_pass_ownership | Skims Keys: NA

activitysim.abm.models.transit_pass_ownership.transit_pass_ownership(persons_merged, persons, chunk_size, trace_hh_id)#

Transit pass ownership model.

Auto Ownership#

The auto ownership model selects a number of autos for each household in the simulation. The primary model components are household demographics, zonal density, and accessibility.

The main interface to the auto ownership model is the auto_ownership_simulate() function. This function is registered as an Inject step in the example Pipeline.

Core Table: households | Result Field: auto_ownership | Skims Keys: NA

activitysim.abm.models.auto_ownership.auto_ownership_simulate(households, households_merged, chunk_size, trace_hh_id)#

Auto ownership is a standard model which predicts how many cars a household with given characteristics owns

Vehicle Type Choice#

The vehicle type choice model selects a vehicle type for each household vehicle. A vehicle type is a combination of the vehicle’s body type, age, and fuel type. For example, a 13-year-old gas-powered van would have a vehicle type of van_13_gas.

There are two vehicle type choice model structures implemented:

  1. Simultaneous choice of body type, age, and fuel type.

  2. Simultaneous choice of body type and age, with fuel type assigned from a probability distribution.

The vehicle_type_choice.yaml file contains the following model specific options:

  • SPEC: Filename for input utility expressions

  • COEFS: Filename for input utility expression coefficients

  • LOGIT_TYPE: Specifies whether you are using a nested or multinomial logit structure

  • combinatorial_alts: Specifies the alternatives for the choice model. Has sub-categories of body_type, age, and fuel_type.

  • PROBS_SPEC: Filename for input fuel type probabilities. Supplying probabilities corresponds to implementation structure 2 above, and not supplying probabilities would correspond to implementation structure 1. If provided, the fuel_type category in combinatorial_alts will be excluded from the model alternatives such that only body type and age are selected. Input PROBS_SPEC table will have an index column named vehicle_type which is a combination of body type and age in the form {body type}_{age}. Subsequent column names specify the fuel type that will be added and the column values are the probabilities of that fuel type. The vehicle type model will select a fuel type for each vehicle based on the provided probabilities.

  • VEHICLE_TYPE_DATA_FILE: Filename for input vehicle type data. Must have columns body_type, fuel_type, and vehicle_year. Vehicle age is computed using the FLEET_YEAR option. Data for every alternative specified in the combinatorial_alts option must be included in the file. Vehicle type data file will be joined to the alternatives and can be used in the utility expressions if PROBS_SPEC is not provided. If PROBS_SPEC is provided, the vehicle type data will be joined after a vehicle type is decided so the data can be used in downstream models.

  • COLS_TO_INCLUDE_IN_VEHICLE_TABLE: List of columns from the vehicle type data file to include in the vehicle table that can be used in downstream models. Examples of data that might be needed are vehicle range for the Vehicle Allocation model, auto operating costs to use in tour and trip mode choice, and emissions data for post-model-run analysis.

  • FLEET_YEAR: Integer specifying the fleet year to be used in the model run. This is used to compute age in the vehicle type data table where age = (1 + FLEET_YEAR - vehicle_year). Computing age on the fly with the FLEET_YEAR variable allows the user flexibility to compile and share a single vehicle type data file containing all years and simply change the FLEET_YEAR to run different scenario years.

  • Optional additional settings that work the same in other models are constants, expression preprocessor, and annotate tables.
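
The FLEET_YEAR age calculation above can be sketched with pandas. The table contents and the FLEET_YEAR value below are made up for illustration; only the column names body_type, fuel_type, and vehicle_year and the formula age = (1 + FLEET_YEAR - vehicle_year) come from the option descriptions.

```python
import pandas as pd

FLEET_YEAR = 2015  # hypothetical scenario year

# Hypothetical rows in the shape of the vehicle type data file.
vehicle_type_data = pd.DataFrame(
    {
        "body_type": ["van", "car"],
        "fuel_type": ["gas", "electric"],
        "vehicle_year": [2003, 2015],
    }
)

# Age is computed on the fly, so one data file can serve several scenario years.
vehicle_type_data["age"] = 1 + FLEET_YEAR - vehicle_type_data["vehicle_year"]

# Vehicle type names combine body type, age, and fuel type.
vehicle_type_data["vehicle_type"] = (
    vehicle_type_data["body_type"]
    + "_"
    + vehicle_type_data["age"].astype(str)
    + "_"
    + vehicle_type_data["fuel_type"]
)
```

Changing FLEET_YEAR to a later year shifts every age by the same amount without touching the data file.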

Input vehicle type data included in prototype_mtc_extended came from a variety of sources. The number of vehicle makes, models, MPG, and electric vehicle range were sourced from the Environmental Protection Agency (EPA). Additional data on vehicle costs were derived from the National Household Travel Survey. Auto operating costs in the vehicle type data file were a sum of fuel costs and maintenance costs. Fuel costs were calculated from MPG assuming a $3.00 cost for a gallon of gas. When MPG was not available to calculate fuel costs, the closest year, vehicle type, or body type available was used. Maintenance costs were taken from AAA’s 2017 driving cost study. Size categories within body types were averaged, e.g. car was an average of AAA’s small, medium, and large sedan categories. Motorcycles were assigned the small sedan maintenance costs since they were not included in AAA’s report. Maintenance costs were not varied by vehicle year. (According to data from the U.S. Bureau of Labor Statistics, there was no consistent relationship between vehicle age and maintenance costs.)

Using the above methodology, the average auto operating cost of vehicles output from prototype_mtc_extended was 18.4 cents. This value is very close to the auto operating cost of 18.3 cents used in prototype_mtc. Non-household vehicles in prototype_mtc_extended use the auto operating cost of 18.3 cents used in prototype_mtc. Users are encouraged to make their own assumptions and calculate auto operating costs as they see fit.

The distribution of fuel type probabilities included in prototype_mtc_extended is computed directly from the National Household Travel Survey data and covers the entire US. Therefore, there is “lumpiness” in the probabilities due to poor statistics in the data for some vehicle types. The user is encouraged to adjust the probabilities to their modeling region and “smooth” them for more consistent results.

Further discussion of output results and model sensitivities can be found here.

activitysim.abm.models.vehicle_type_choice.annotate_vehicle_type_choice_households(model_settings, trace_label)#

Add columns to the households table in the pipeline according to spec.

Parameters
model_settingsdict
trace_labelstr
activitysim.abm.models.vehicle_type_choice.annotate_vehicle_type_choice_persons(model_settings, trace_label)#

Add columns to the persons table in the pipeline according to spec.

Parameters
model_settingsdict
trace_labelstr
activitysim.abm.models.vehicle_type_choice.annotate_vehicle_type_choice_vehicles(model_settings, trace_label)#

Add columns to the vehicles table in the pipeline according to spec.

Parameters
model_settingsdict
trace_labelstr
activitysim.abm.models.vehicle_type_choice.append_probabilistic_vehtype_type_choices(choices, model_settings, trace_label)#

Select a fuel type for the provided body type and age of the vehicle.

Make probabilistic choices based on the PROBS_SPEC file.

Parameters
choicespandas.DataFrame

selection of {body_type}_{age} to append vehicle type to

probs_spec_filestr
trace_labelstr
Returns
choicespandas.DataFrame

table of chosen vehicle types
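
As a rough illustration of the probabilistic step, the sketch below samples a fuel type for each {body type}_{age} choice and appends it to form the full vehicle type name. The lookup values are invented, and the standard-library random generator is a stand-in: the way ActivitySim actually reads PROBS_SPEC and draws its random numbers is not shown here.

```python
import random

# Hypothetical PROBS_SPEC-style lookup: {body type}_{age} -> fuel type probabilities.
probs_spec = {
    "van_13": {"gas": 0.9, "diesel": 0.1},
    "car_1": {"gas": 0.6, "electric": 0.4},
}


def append_fuel_type(body_age_choice, rng):
    """Sample a fuel type for one {body type}_{age} choice and append it."""
    probs = probs_spec[body_age_choice]
    fuel_type = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
    return f"{body_age_choice}_{fuel_type}"


rng = random.Random(0)  # fixed seed so the sketch is repeatable
choices = ["van_13", "car_1", "car_1"]
vehicle_types = [append_fuel_type(choice, rng) for choice in choices]
```

Every {body type}_{age} value chosen by the logit model needs a row in the lookup; a missing key would raise a KeyError in this sketch.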

activitysim.abm.models.vehicle_type_choice.construct_model_alternatives(model_settings, alts_cats_dict, vehicle_type_data)#

Construct the table of vehicle type alternatives.

Vehicle type data is joined to the alternatives table for use in utility expressions.

Parameters
model_settingsdict
alts_cats_dictdict

nested dictionary of vehicle body, age, and fuel options

vehicle_type_datapandas.DataFrame
Returns
alts_widepd.DataFrame

includes column indicators and data for each alternative

alts_longpd.DataFrame

rows just list the alternatives

activitysim.abm.models.vehicle_type_choice.get_combinatorial_vehicle_alternatives(alts_cats_dict)#

Build a pandas dataframe containing columns for each vehicle alternative.

Rows will correspond to the alternative number and will be 0 except for the 1 in the column corresponding to that alternative.

Parameters
alts_cats_dictdict
model_settingsdict
Returns
alts_widepd.DataFrame in wide format expanded using pandas get_dummies function
alts_longpd.DataFrame in long format
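
A small sketch of the wide and long formats, following the description above (one indicator column per alternative). The categories are hypothetical and the code is an illustration of the pandas get_dummies technique, not a copy of the ActivitySim function.

```python
import itertools

import pandas as pd

# Hypothetical categories, in the shape of the combinatorial_alts option.
alts_cats_dict = {
    "body_type": ["car", "van"],
    "age": [1, 2],
    "fuel_type": ["gas", "electric"],
}

# Long format: one row per combination of body type, age, and fuel type.
alts_long = pd.DataFrame(
    list(itertools.product(*alts_cats_dict.values())),
    columns=list(alts_cats_dict),
).astype(str)

# Name each alternative {body_type}_{age}_{fuel_type}.
alt_names = alts_long["body_type"] + "_" + alts_long["age"] + "_" + alts_long["fuel_type"]

# Wide format: one indicator column per alternative, a single 1 in each row.
alts_wide = pd.get_dummies(alt_names, dtype=int)
```

With two options in each of three categories, both tables have eight rows, and alts_wide has eight columns.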
activitysim.abm.models.vehicle_type_choice.get_vehicle_type_data(model_settings, vehicle_type_data_file)#

Reads in the vehicle type data and computes the vehicle age.

Parameters
model_settingsdict
vehicle_type_data_filestr

name of vehicle type data file found in config folder

Returns
vehicle_type_datapandas.DataFrame

table of vehicle type data with required body_type, age, and fuel_type columns

activitysim.abm.models.vehicle_type_choice.iterate_vehicle_type_choice(vehicles_merged, model_settings, model_spec, locals_dict, estimator, chunk_size, trace_label)#

Select vehicle type for each household vehicle sequentially.

Iterate through household vehicle numbers and select a vehicle type of the form {body_type}_{age}_{fuel_type}. The preprocessor is run for each iteration on the entire chooser table, not just the one for the current vehicle number. This allows for computation of terms involving the presence of other household vehicles.

Vehicle type data is read in according to the specification and joined to the alternatives. It can optionally be included in the output vehicles table by specifying the COLS_TO_INCLUDE_IN_VEHICLE_TABLE option in the model yaml.

Parameters
vehicles_mergedorca.DataFrameWrapper

vehicle list owned by each household merged with households table

model_settingsdict

yaml model settings file as dict

model_specpandas.DataFrame

omnibus spec file with expressions in index and one column per segment

locals_dictdict

additional variables available when writing expressions

estimatorEstimator object
chunk_sizeorca.injectable
trace_labelstr
Returns
all_choicespandas.DataFrame

single table of selected vehicle types and associated data

all_chooserspandas.DataFrame

single table of chooser data with preprocessor variables included

activitysim.abm.models.vehicle_type_choice.vehicle_type_choice(persons, households, vehicles, vehicles_merged, chunk_size, trace_hh_id)#

Assign a vehicle type to each vehicle in the vehicles table.

If “SIMULATION_TYPE” is set to simple_simulate in the vehicle_type_choice.yaml config file, then the model specification .csv file should contain one column of coefficients for each distinct alternative. This format corresponds to ActivitySim’s activitysim.core.simulate.simple_simulate() format. Otherwise, this model will construct a table of alternatives, at run time, based on all possible combinations of values of the categorical variables enumerated as “combinatorial_alts” in the .yaml config. In this case, the model leverages ActivitySim’s activitysim.core.interaction_simulate() model design, in which the model specification .csv has only one column of coefficients, and the utility expressions can turn coefficients on or off based on attributes of either the chooser or the alternative.

As an optional second step, the user may also specify a “PROBS_SPEC” .csv file in the main .yaml config, corresponding to a lookup table of additional vehicle attributes and probabilities to be sampled and assigned to vehicles after the logit choices have been made. The rows of the “PROBS_SPEC” file must include all body type and vehicle age choices assigned in the logit model. These additional attributes are concatenated with the selected alternative from the logit model to form a single vehicle type name to be stored in the vehicles table as the vehicle_type column.

Only one household vehicle is selected at a time to allow for the introduction of owned vehicle related attributes. For example, a household may be less likely to own a second van if they already own one. The model is run sequentially through household vehicle numbers. The preprocessor is run for each iteration on the entire vehicles table to allow for computation of terms involving the presence of other household vehicles.

The user may also augment the households or persons tables with new vehicle type-based fields specified via expressions in “annotate_households_vehicle_type.csv” and “annotate_persons_vehicle_type.csv”, respectively.

Parameters
personsorca.DataFrameWrapper
householdsorca.DataFrameWrapper
vehiclesorca.DataFrameWrapper
vehicles_mergedorca.DataFrameWrapper
chunk_sizeorca.injectable
trace_hh_idorca.injectable

Telecommute Frequency#

Telecommuting is defined as working from home instead of traveling to the workplace. It only applies to workers with a regular workplace outside of home. The telecommute model consists of two submodels - a person Work From Home model and this person telecommute frequency model.

For all workers that work out of the home, the telecommute model predicts the level of telecommuting. The model alternatives are the frequency of telecommuting in days per week (0 days, 1 day, 2 to 3 days, 4+ days).

The main interface to the telecommute frequency model is the telecommute_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Field: telecommute_frequency | Skims Keys: NA

activitysim.abm.models.telecommute_frequency.telecommute_frequency(persons_merged, persons, chunk_size, trace_hh_id)#

This model predicts the frequency of telecommute for a person (worker) who does not work from home. The alternatives of this model are ‘No Telecommute’, ‘1 day per week’, ‘2 to 3 days per week’ and ‘4 days per week’. This model reflects the choices of people who prefer a combination of working from home and office during a week.

Free Parking Eligibility#

The Free Parking Eligibility model predicts the availability of free parking at a person’s workplace. It is applied for people who work in zones that have parking charges, which are generally located in the Central Business Districts. The purpose of the model is to adequately reflect the cost of driving to work in subsequent models, particularly in mode choice.

The main interface to the free parking eligibility model is the free_parking() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Field: free_parking_at_work | Skims Keys: NA

Coordinated Daily Activity Pattern#

The Coordinated Daily Activity Pattern (CDAP) model predicts the choice of daily activity pattern (DAP) for each member in the household, simultaneously. The DAP is categorized into three types as follows:

  • Mandatory: the person engages in travel to at least one out-of-home mandatory activity - work, university, or school. The mandatory pattern may also include non-mandatory activities such as separate home-based tours or intermediate stops on mandatory tours.

  • Non-mandatory: the person engages in only maintenance and discretionary tours, which, by definition, do not contain mandatory activities.

  • Home: the person does not travel outside the home.

The CDAP model is a sequence of vectorized table operations:

  • create a person level table and rank each person in the household for inclusion in the CDAP model. Priority is given to full time workers (up to two), then to part time workers (up to two workers, of any type), then to children (youngest to oldest, up to three). Additional members up to five are randomly included for the CDAP calculation.

  • solve individual M/N/H utilities for each person

  • take as input an interaction coefficients table and then programmatically produce and write out the expression files for the household size 1, 2, 3, 4, and 5 models, each independent of the others

  • select households of size 1, join all required person attributes, and then read and solve the automatically generated expressions

  • repeat for households of size 2, 3, 4, and 5. Each model is independent of the others.

The main interface to the CDAP model is the run_cdap() function. This function is called by the Inject step cdap_simulate which is registered as an Inject step in the example Pipeline. There are two cdap class definitions in ActivitySim. The first is at cdap() and contains the Inject wrapper for running it as part of the model pipeline. The second is at cdap() and contains CDAP model logic.

Core Table: persons | Result Field: cdap_activity | Skims Keys: NA

activitysim.abm.models.cdap.cdap_simulate(persons_merged, persons, households, chunk_size, trace_hh_id)#

CDAP stands for Coordinated Daily Activity Pattern, which is a choice of high-level activity pattern for each person, in a coordinated way with other members of a person’s household.

Because Python requires vectorization of computation, there are some specialized routines in the cdap directory of activitysim for this purpose. This module simply applies those utilities using the simulation framework.

Mandatory Tour Frequency#

The individual mandatory tour frequency model predicts the number of work and school tours taken by each person with a mandatory DAP. The primary drivers of mandatory tour frequency are demographics, accessibility-based parameters such as drive time to work, and household automobile ownership. It also creates mandatory tours in the data pipeline.

The main interface to the mandatory tour purpose frequency model is the mandatory_tour_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Fields: mandatory_tour_frequency | Skims Keys: NA

activitysim.abm.models.mandatory_tour_frequency.mandatory_tour_frequency(persons_merged, chunk_size, trace_hh_id)#

This model predicts the frequency of making mandatory trips (see the alternatives above) - these trips include work and school in some combination.

Mandatory Tour Scheduling#

The mandatory tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each mandatory tour. The primary drivers in the model are accessibility-based parameters such as the mode choice logsum for the departure/arrival hour combination, demographics, and time pattern characteristics such as the time windows available from previously scheduled tours. This model uses Person Time Windows.

Note

For prototype_mtc, the modeled time periods for all submodels are hourly from 3 am to 3 am the next day, and any times before 5 am are shifted to time period 5, and any times after 11 pm are shifted to time period 23.
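
The shifting rule in this note amounts to clamping each hour into the range 5 to 23. The sketch below assumes hours are counted on a 3 am to 3 am clock, so that midnight, 1 am, and 2 am of the next day are represented as 24, 25, and 26; that encoding and the function name are assumptions made for illustration.

```python
def to_modeled_period(hour):
    """Clamp an hour on a 3 am to 3 am clock (3..26) to the modeled periods 5..23."""
    return min(max(hour, 5), 23)


# 3 am and 4 am collapse into period 5; midnight to 2 am collapse into period 23.
periods = [to_modeled_period(hour) for hour in range(3, 27)]
```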

If tour_departure_and_duration_segments.csv is included in the configs, then the model will use these representative start and end time periods when calculating mode choice logsums instead of the specific start and end combinations for each alternative to reduce runtime. This feature, known as representative logsums, takes advantage of the fact that the mode choice logsum, say, from 6 am to 2 pm is very similar to the logsum from 6 am to 3 pm, and 6 am to 4 pm, and so using just 6 am to 3 pm (with the idea that 3 pm is the “representative time period”) for these alternatives is sufficient for tour scheduling. By reusing the 6 am to 3 pm mode choice logsum, ActivitySim saves significant runtime.
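
The runtime saving can be illustrated with a toy sketch. The mapping, the function names, and the placeholder arithmetic standing in for the logsum are all hypothetical; the layout of tour_departure_and_duration_segments.csv is not reproduced here. The point is only that three alternatives share one expensive calculation.

```python
from functools import lru_cache

# Hypothetical mapping from an end period to its representative end period:
# 2 pm, 3 pm, and 4 pm (hours 14, 15, 16) all share 3 pm.
REPRESENTATIVE_END = {14: 15, 15: 15, 16: 15}

calls = []  # records which (start, end) pairs were actually computed


@lru_cache(maxsize=None)
def mode_choice_logsum(start, end):
    """Stand-in for an expensive logsum calculation (placeholder arithmetic)."""
    calls.append((start, end))
    return -0.1 * (end - start)


def representative_logsum(start, end):
    """Look up the logsum for the representative period instead of the exact one."""
    return mode_choice_logsum(start, REPRESENTATIVE_END.get(end, end))


logsums = [representative_logsum(6, end) for end in (14, 15, 16)]
```

Here the stand-in calculation runs once for the 6 am to 3 pm pair and the cached result is reused for the other two alternatives.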

The main interface to the mandatory tour purpose scheduling model is the mandatory_tour_scheduling() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, workplace_taz, school_taz, start, end

activitysim.abm.models.mandatory_scheduling.mandatory_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)#

This model predicts the departure time and duration of each activity for mandatory tours

School Escorting#

The school escort model determines whether children are dropped-off at or picked-up from school, simultaneously with the chaperone responsible for chauffeuring the children, which children are bundled together on half-tours, and the type of tour (pure escort versus rideshare). The model is run after work and school locations have been chosen for all household members, and after work and school tours have been generated and scheduled. The model labels household members of driving age as potential ‘chauffeurs’ and children with school tours as potential ‘escortees’. The model then attempts to match potential chauffeurs with potential escortees in a choice model whose alternatives consist of ‘bundles’ of escortees with a chauffeur for each half tour.

School escorting is a household level decision – each household will choose an alternative from the school_escorting_alts.csv file, with the first alternative being no escorting. This file contains the following columns:

Column Name | Column Description

Alt | Alternative number

bundle[1,2,3] | Bundle number for child 1, 2, and 3

chauf[1,2,3] | Chauffeur number for child 1, 2, and 3: 0 = child not escorted; 1 = chauffeur 1 as ride share; 2 = chauffeur 1 as pure escort; 3 = chauffeur 2 as ride share; 4 = chauffeur 2 as pure escort

nbund[1,2] | Number of escorting bundles for chauffeur 1 and 2

nbundles | Total number of bundles; equals nbund1 + nbund2

nrs1 | Number of ride share bundles for chauffeur 1

npe1 | Number of pure escort bundles for chauffeur 1

nrs2 | Number of ride share bundles for chauffeur 2

npe2 | Number of pure escort bundles for chauffeur 2

Description | Text description of alternative

The model as currently implemented contains three escortees and two chauffeurs. Escortees are students under age 16 with a mandatory tour, whereas chaperones are all persons in the household over the age of 18. For households that have more than three possible escortees, the three youngest children are selected for the model. The two chaperones are selected as the adults of the household with the highest weight according to the following calculation:

\(\text{Weight} = 100 \cdot \text{personType} + 10 \cdot \text{gender} + 1 \cdot \text{age}(0,1)\)

where personType is the person type number from 1 to 5, gender is 1 for male and 2 for female, and age(0,1) is a binary indicator equal to 1 if age is over 25 and 0 otherwise.
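
A worked example of the chaperone weight may help. The household below is invented, and the function and field names are hypothetical; only the weighting formula and the rule of keeping the two highest-weighted adults come from the description above.

```python
def chaperone_weight(person_type, gender, age):
    """Weight = 100 * personType + 10 * gender + 1 * (1 if age is over 25 else 0)."""
    return 100 * person_type + 10 * gender + (1 if age > 25 else 0)


# Hypothetical adults in one household.
adults = [
    {"id": "A", "person_type": 1, "gender": 2, "age": 40},  # 100 + 20 + 1 = 121
    {"id": "B", "person_type": 2, "gender": 1, "age": 24},  # 200 + 10 + 0 = 210
    {"id": "C", "person_type": 4, "gender": 1, "age": 30},  # 400 + 10 + 1 = 411
]

# Keep the two adults with the highest weight as the potential chaperones.
ranked = sorted(
    adults,
    key=lambda p: chaperone_weight(p["person_type"], p["gender"], p["age"]),
    reverse=True,
)
chaperones = [p["id"] for p in ranked[:2]]
```

Because the person type term is multiplied by 100, it dominates the ranking; gender and the age indicator only break ties between adults of the same person type.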

The model is run sequentially three times, once in the outbound direction, once in the inbound direction, and again in the outbound direction with additional conditions on what happened in the inbound direction. There are therefore three sets of utility specifications, coefficients, and pre-processor files. Each of these files is specified in the school_escorting.yaml file along with the number of escortees and number of chaperones.

There is also a constants section in the school_escorting.yaml file which contains two constants: one sets the maximum time bin difference to match school and work tours for ride sharing, and the other sets the number of minutes per time bin. In the prototype_mtc_extended example, these are set to 1 and 60 respectively.

After a school escorting alternative is chosen for the inbound and outbound direction, the model will create the tours and trips associated with the decision. Pure escort tours are created, and the mandatory tour start and end times are changed to match the school escort bundle start and end times. (Outbound tours have their start times matched and inbound tours have their end times matched.) Escortee drop-off / pick-up order is determined by the distance from home to the school locations. They are ordered from smallest to largest in the outbound direction, and largest to smallest in the inbound direction. Trips are created for each half-tour that includes school escorting according to the provided order.
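
The drop-off and pick-up ordering rule can be sketched as a simple sort. The distances and names below are hypothetical; the rule itself (nearest school first when outbound, farthest school first when inbound) is taken from the paragraph above.

```python
def escortee_order(distance_by_child, direction):
    """Order escortees by home-to-school distance for one half-tour."""
    farthest_first = direction == "inbound"
    return sorted(distance_by_child, key=distance_by_child.get, reverse=farthest_first)


# Hypothetical home-to-school distances for three escortees.
distances = {"child1": 3.2, "child2": 1.1, "child3": 5.0}

outbound_order = escortee_order(distances, "outbound")
inbound_order = escortee_order(distances, "inbound")
```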

The created pure escort tours are joined to the already created mandatory tour table in the pipeline and are also saved separately to the pipeline under the table name “school_escort_tours”. Created school escorting trips are saved to the pipeline under the table name “school_escort_trips”. By saving these to the pipeline, their data can be queried in downstream models to set correct purposes, destinations, and schedules to satisfy the school escorting model choice.

There are a host of downstream model changes that are involved when including the school escorting model. The following list contains the models that are changed in some way when school escorting is included:

  • Joint tour scheduling: Joint tours are not allowed to be scheduled over school escort tours. This happens automatically by updating the timetable object with the updated mandatory tour times and created pure escort tour times after the school escorting model is run. There were no code or config changes in this model, but it is still affected by school escorting.

  • Non-Mandatory tour frequency: Pure school escort tours are joined to the tours created in the non-mandatory tour frequency model and tour statistics (such as tour_count and tour_num) are re-calculated.

  • Non-Mandatory tour destination: Since the primary destination of pure school escort tours is known, they are removed from the choosers table and have their destination set according to the destination in the school_escort_tours table. They are also excluded from the estimation data bundle.

  • Non-Mandatory tour scheduling: Pure escort tours need to have the non-escorting portion of their tour scheduled. This is done by inserting availability conditions in the model specification that ensure the alternative chosen for the start of the tour is equal to the alternative start time for outbound tours and the end time is equal to the alternative end time for the inbound tours. There are additional terms that ensure the tour does not overlap with subsequent school escorting tours as well. Beware – If the availability conditions in the school escorting model are not set correctly, the tours created may not be consistent with each other and this model will fail.

  • Tour mode choice: Availability conditions are set in tour mode choice to prohibit the drive alone mode if the tour contains an escortee and the shared-ride 2 mode if the tour contains more than one escortee.

  • Stop Frequency: No stops are allowed on half-tours that include school escorting. This is enforced by adding availability conditions in the stop frequency model. After the stop frequency model is run, the school escorting trips are merged with the trips created by the stop frequency model and a new stop frequency is computed along with updated trip numbers.

  • Trip purpose, destination, and scheduling: Trip purpose, destination, and departure times are known for school escorting trips. As such they are removed from their respective chooser tables and the estimation data bundles, and set according to the values in the school_escort_trips table residing in the pipeline.

  • Trip mode choice: As in tour mode choice, availability conditions are set to prohibit trips containing an escortee from using the drive alone mode, and trips with more than one escortee from using the shared-ride 2 mode.

Many of the changes discussed in the above list are handled in the code and the user is not required to make any changes when implementing the school escorting model. However, it is the user’s responsibility to include the changes in the following model configuration files for models downstream of the school escorting model:

File Name(s) | Change(s) Needed

non_mandatory_tour_scheduling_annotate_tours_preprocessor.csv, tour_scheduling_nonmandatory.csv | Set availability conditions based on those times; do not schedule over other school escort tours

tour_mode_choice_annotate_choosers_preprocessor.csv, tour_mode_choice.csv | Count the number of escortees on the tour by parsing the escort_participants column; set mode choice availability based on the number of escortees

stop_frequency_school.csv, stop_frequency_work.csv, stop_frequency_univ.csv, stop_frequency_escort.csv | Do not allow stops for half-tours that include school escorting

trip_mode_choice_annotate_trips_preprocessor.csv, trip_mode_choice.csv | Count the number of escortees on the trip by parsing the escort_participants column; set mode choice availability based on the number of escortees

When not including the school escorting model, all of the escort trips to and from school are counted implicitly in escort tours determined in the non-mandatory tour frequency model. Thus, when including the school escort model and accounting for these tours explicitly, extra care should be taken not to double count them in the non-mandatory tour frequency model. The non-mandatory tour frequency model should be re-evaluated and likely changed to decrease the number of escort tours generated by that model. This was not implemented in the prototype_mtc_extended implementation due to a lack of data surrounding the number of escort tours in the region.

activitysim.abm.models.school_escorting.check_alts_consistency(alts)#

Checking to ensure that the alternatives file is consistent with the number of chaperones and escortees set in the model settings.

activitysim.abm.models.school_escorting.create_bundle_attributes(row)#

Parse a bundle to determine escortee numbers and tour info.

activitysim.abm.models.school_escorting.create_school_escorting_bundles_table(choosers, tours, stage)#

Creates a table that has one row for every school escorting bundle. Additional calculations are performed to help facilitate tour and trip creation including escortee order, times, etc.

Parameters
choosers: pd.DataFrame

households pre-processed for the school escorting model

tours: pd.DataFrame

mandatory tours

stage: str

inbound or outbound_cond

Returns
bundles: pd.DataFrame

one school escorting bundle per row

activitysim.abm.models.school_escorting.determine_escorting_participants(choosers, persons, model_settings)#

Determines which persons correspond to chauffeur 1..n and escortee 1..n. Chauffeurs are those with the highest weight, given by weight = 100 * person type + 10 * gender + 1 * (age > 25). Escortees are selected youngest to oldest.
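The ranking rule above can be illustrated with a minimal sketch. The `Person` record and the integer codes for person type and gender are hypothetical stand-ins; the actual coding comes from the model settings and is not shown in this section.

```python
from dataclasses import dataclass


@dataclass
class Person:
    person_id: int
    ptype: int   # hypothetical person-type code
    gender: int  # hypothetical gender code
    age: int


def chauffeur_weight(p: Person) -> int:
    # weight = 100 * person type + 10 * gender + 1 * (age > 25)
    return 100 * p.ptype + 10 * p.gender + int(p.age > 25)


def rank_chauffeurs(adults, n):
    # Highest weight first; chauffeur 1 is the first entry.
    return sorted(adults, key=chauffeur_weight, reverse=True)[:n]


def rank_escortees(children, n):
    # Youngest first; escortee 1 is the first entry.
    return sorted(children, key=lambda p: p.age)[:n]


adults = [Person(1, 1, 2, 40), Person(2, 2, 1, 24)]
children = [Person(3, 7, 1, 9), Person(4, 7, 2, 6)]
chauffeurs = rank_chauffeurs(adults, 2)
escortees = rank_escortees(children, 2)
```

Because person type carries the largest multiplier, it dominates the ordering; gender and the age flag only break ties within a person type.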

activitysim.abm.models.school_escorting.school_escorting(households, households_merged, persons, tours, chunk_size, trace_hh_id)#

school escorting model

The school escorting model determines whether children are dropped-off at or picked-up from school, simultaneously with the driver responsible for chauffeuring the children, which children are bundled together on half-tours, and the type of tour (pure escort versus rideshare).

Run iteratively for an outbound choice, an inbound choice, and an outbound choice conditional on the inbound choice. The choices for inbound and outbound conditional are used to create school escort tours and trips.

Updates / adds the following tables to the pipeline:

- households with school escorting choice
- tours including pure school escorting
- school_escort_tours which contains only pure school escort tours
- school_escort_trips
- timetable to avoid joint tours scheduled over school escort tours

Joint Tour Frequency#

The joint tour generation models are divided into three sub-models: the joint tour frequency model, the party composition model, and the person participation model. In the joint tour frequency model, the household chooses the purposes and number (up to two) of its fully joint travel tours. It also creates joint tours in the data pipeline.

The main interface to the joint tour purpose frequency model is the joint_tour_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: households | Result Fields: num_hh_joint_tours | Skims Keys: NA

activitysim.abm.models.joint_tour_frequency.joint_tour_frequency(households, persons, chunk_size, trace_hh_id)#

This model predicts the frequency of making fully joint trips (see the alternatives above).

Joint Tour Composition#

In the joint tour party composition model, the makeup of the travel party (adults, children, or mixed - adults and children) is determined for each joint tour. The party composition determines the general makeup of the party of participants in each joint tour in order to allow the micro-simulation to faithfully represent the prevalence of adult-only, children-only, and mixed joint travel tours for each purpose while permitting simplicity in the subsequent person participation model.

The main interface to the joint tour composition model is the joint_tour_composition() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Fields: composition | Skims Keys: NA

activitysim.abm.models.joint_tour_composition.joint_tour_composition(tours, households, persons, chunk_size, trace_hh_id)#

This model predicts the makeup of the travel party (adults, children, or mixed).

Joint Tour Participation#

In the joint tour person participation model, each eligible person sequentially makes a choice to participate or not participate in each joint tour. Since the party composition model determines what types of people are eligible to join a given tour, the person participation model can operate in an iterative fashion, with each household member choosing to join or not to join a travel party independent of the decisions of other household members. In the event that the constraints posed by the result of the party composition model are not met, the person participation model cycles through the household members multiple times until the required types of people have joined the travel party.

This step also creates the joint_tour_participants table in the pipeline, which stores the person ids for each person on the tour.

The main interface to the joint tour participation model is the joint_tour_participation() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Fields: number_of_participants, person_id (for the point person) | Skims Keys: NA

activitysim.abm.models.joint_tour_participation.joint_tour_participation(tours, persons_merged, chunk_size, trace_hh_id)#

Predicts for each eligible person to participate or not participate in each joint tour.

activitysim.abm.models.joint_tour_participation.participants_chooser(probs, choosers, spec, trace_label)#

custom alternative to logit.make_choices for simulate.simple_simulate

Choosing participants for mixed tours is trickier than for adult or child tours because we need at least one adult and one child participant in a mixed tour. We call logit.make_choices and then check whether each tour satisfies this requirement, and rechoose for any that fail until all are satisfied.

In principle, this should always occur eventually, but we fail after MAX_ITERATIONS, just in case there is some failure in program logic (we haven’t seen this occur).

The return values are the same as logit.make_choices

Parameters
probs: pandas.DataFrame

Rows for choosers and columns for the alternatives from which they are choosing. Values are expected to be valid probabilities across each row, e.g. they should sum to 1.

choosers: pandas.DataFrame

simple_simulate choosers df

spec: pandas.DataFrame

simple_simulate spec df. We only need spec so we can know the column index of the ‘participate’ alternative indicating that the participant has been chosen to participate in the tour.

trace_label: str
Returns
choices, rands

choices, rands as returned by logit.make_choices (in same order as probs)
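The rechoose-until-satisfied logic described for participants_chooser can be sketched with the standard library alone. This is an illustration of the idea, not the ActivitySim implementation: the data layout, the helper names, and the value of `MAX_ITERATIONS` are all assumptions.

```python
import random

MAX_ITERATIONS = 100  # assumed cap; the real constant's value is not given here


def draw(probs, rng):
    # One independent participate / don't-participate draw per candidate.
    return [rng.random() < p for p in probs]


def is_satisfied(choices, is_adult):
    # A mixed tour needs at least one adult and at least one child.
    has_adult = any(c and a for c, a in zip(choices, is_adult))
    has_child = any(c and not a for c, a in zip(choices, is_adult))
    return has_adult and has_child


def choose_mixed_participants(tours, seed=0):
    """tours maps tour_id -> (participation probs, is_adult flags)."""
    rng = random.Random(seed)
    choices = {}
    pending = set(tours)
    for _ in range(MAX_ITERATIONS):
        # Redraw only for tours that have not yet met the requirement.
        for tour_id in sorted(pending):
            probs, _ = tours[tour_id]
            choices[tour_id] = draw(probs, rng)
        pending = {t for t in pending
                   if not is_satisfied(choices[t], tours[t][1])}
        if not pending:
            return choices
    raise RuntimeError(
        f"{len(pending)} tours unsatisfied after {MAX_ITERATIONS} iterations"
    )


tours = {
    1: ([0.5, 0.5, 0.5], [True, False, False]),
    2: ([0.9, 0.5], [True, False]),
}
result = choose_mixed_participants(tours)
```

Tours that already satisfy the requirement keep their earlier draw, so only the failing tours consume additional random numbers.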

Joint Tour Destination Choice#

The joint tour destination choice model operates similarly to the usual work and school location choice model, selecting the primary destination for travel tours. The only procedural difference between the models is that the usual work and school location choice model selects the usual location of an activity whether or not the activity is undertaken during the travel day, while the joint tour destination choice model selects the location for an activity which has already been generated.

The tour’s primary destination is the location of the activity that is assumed to provide the greatest impetus for engaging in the travel tour. In the household survey, the primary destination was not asked, but rather inferred from the pattern of stops in a closed loop in the respondents’ travel diaries. The inference was made by weighing multiple criteria including a defined hierarchy of purposes, the duration of activities, and the distance from the tour origin. The model operates in the reverse direction, designating the primary purpose and destination and then adding intermediate stops based on spatial, temporal, and modal characteristics of the inbound and outbound journeys to the primary destination.

The joint tour destination choice model is made up of three model steps:
  • sample - selects a sample of alternative locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative location.

  • simulate - starts with the table created above and chooses a final location, this time with the mode choice logsum included.
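The three steps listed above can be sketched as a small pipeline. This is a simplified illustration using only the standard library: the utility functions, the logsum function, and the coefficient are hypothetical, and the sketch omits any correction for the probability with which each alternative was sampled.

```python
import math
import random


def softmax(utils):
    # Numerically stable multinomial logit probabilities.
    m = max(utils)
    exps = [math.exp(u - m) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]


def sample_alternatives(zones, simple_utility, sample_size, rng):
    # Step 1 (sample): draw zones with replacement using a cheap utility,
    # then keep the unique zones drawn.
    probs = softmax([simple_utility(z) for z in zones])
    return sorted(set(rng.choices(zones, weights=probs, k=sample_size)))


def add_logsums(sampled, mode_choice_logsum):
    # Step 2 (logsums): attach the mode choice logsum to each sampled zone.
    return {z: mode_choice_logsum(z) for z in sampled}


def simulate(logsums, simple_utility, logsum_coef, rng):
    # Step 3 (simulate): final choice among sampled zones, logsum included.
    zones = list(logsums)
    utils = [simple_utility(z) + logsum_coef * logsums[z] for z in zones]
    return rng.choices(zones, weights=softmax(utils), k=1)[0]


zones = list(range(1, 11))
distance = {z: float(z) for z in zones}          # made-up impedance
rng = random.Random(1)
sampled = sample_alternatives(zones, lambda z: -0.1 * distance[z], 5, rng)
logsums = add_logsums(sampled, lambda z: -0.05 * distance[z])
chosen = simulate(logsums, lambda z: -0.1 * distance[z], 0.5, rng)
```

The point of the split is cost: the expensive logsum is computed only for the sampled zones rather than for every zone in the model.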

Joint tour location choice for placeholder_multiple_zone models uses Presampling by default.

The main interface to the model is the joint_tour_destination() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Fields: destination | Skims Keys: TAZ, alt_dest, MD time period

activitysim.abm.models.joint_tour_destination.joint_tour_destination(tours, persons_merged, households_merged, network_los, chunk_size, trace_hh_id)#

Given the tour generation from the above, each tour needs to have a destination, so in this case tours are the choosers (with the associated person that’s making the tour)

Joint Tour Scheduling#

The joint tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each joint tour. This model uses Person Time Windows. The primary drivers in the models are accessibility-based parameters such as the auto travel time for the departure/arrival hour combination, demographics, and time pattern characteristics such as the time windows available from previously scheduled tours. The joint tour scheduling model does not use mode choice logsums.

The main interface to the joint tour purpose scheduling model is the joint_tour_scheduling() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, destination, MD time period, MD time period

activitysim.abm.models.joint_tour_scheduling.joint_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)#

This model predicts the departure time and duration of each joint tour

Non-Mandatory Tour Frequency#

The non-mandatory tour frequency model selects the number of non-mandatory tours made by each person on the simulation day. It also adds non-mandatory tours to the tours in the data pipeline. The individual non-mandatory tour frequency model operates in two stages:

  • A choice is made using a random utility model between combinations of tours containing zero, one, and two or more escort tours, and between zero and one or more tours of each other purpose.

  • Up to two additional tours of each purpose are added according to fixed extension probabilities.
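The second stage can be sketched as follows. The extension probabilities here are made up (the real values come from a lookup table), and the eligibility rule reflects one reading of the extend_tour_counts description: a person's total tour count must be between 1 and 4, and only a purpose whose count is already at its first-stage maximum (2 for escort, 1 otherwise) can be extended.

```python
import random

# First-stage maximum count per purpose, per the description in this section.
MAX_FREQ = {"escort": 2, "shopping": 1, "othmaint": 1,
            "othdiscr": 1, "eatout": 1, "social": 1}

# Hypothetical probabilities of adding 0, 1, or 2 tours of a purpose.
EXTENSION_PROBS = {purpose: (0.8, 0.15, 0.05) for purpose in MAX_FREQ}


def extend_counts(counts, rng):
    total = sum(counts.values())
    if not 1 <= total <= 4:
        return dict(counts)  # assumed: not eligible for extension
    extended = dict(counts)
    for purpose, n in counts.items():
        if n == MAX_FREQ[purpose]:  # only counts at the max can grow
            extra = rng.choices((0, 1, 2),
                                weights=EXTENSION_PROBS[purpose])[0]
            extended[purpose] = n + extra
    return extended


person = {"escort": 2, "shopping": 0, "othmaint": 0,
          "othdiscr": 1, "eatout": 1, "social": 0}
extended = extend_counts(person, random.Random(0))
```

With these rules an escort count of 2 can become 3 or 4, and any other purpose with a count of 1 can become 2 or 3, while purposes with a count of 0 are left alone.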

The main interface to the non-mandatory tour purpose frequency model is the non_mandatory_tour_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: persons | Result Fields: non_mandatory_tour_frequency | Skims Keys: NA

activitysim.abm.models.non_mandatory_tour_frequency.extend_tour_counts(persons, tour_counts, alternatives, trace_hh_id, trace_label)#

extend tour counts based on a probability table

counts can only be extended if original count is between 1 and 4 and tours can only be extended if their count is at the max possible (e.g. 2 for escort, 1 otherwise) so escort might be increased to 3 or 4 and other tour types might be increased to 2 or 3

Parameters
persons: pandas dataframe

(need this for join columns)

tour_counts: pandas dataframe

one row per person, one column per tour_type

alternatives

alternatives from nmtv interaction_simulate; only need this to know max possible frequency for a tour type

trace_hh_id
trace_label
Returns
extended tour_counts

tour_counts looks like this:

parent_id  escort  shopping  othmaint  othdiscr  eatout  social
2588676    2       0         0         1         1       0
2588677    0       1         0         1         0       0
activitysim.abm.models.non_mandatory_tour_frequency.non_mandatory_tour_frequency(persons, persons_merged, chunk_size, trace_hh_id)#

This model predicts the frequency of making non-mandatory trips (alternatives for this model come from a separate csv file which is configured by the user). These trips include escort, shopping, othmaint, othdiscr, eatout, and social trips in various combinations.

Non-Mandatory Tour Destination Choice#

The non-mandatory tour destination choice model chooses a destination zone for non-mandatory tours. The three step (sample, logsums, final choice) process also used for mandatory tour destination choice is used for non-mandatory tour destination choice.

Non-mandatory tour location choice for placeholder_multiple_zone models uses Presampling by default.

The main interface to the non-mandatory tour destination choice model is the non_mandatory_tour_destination() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Field: destination | Skims Keys: TAZ, alt_dest, MD time period, MD time period

activitysim.abm.models.non_mandatory_destination.non_mandatory_tour_destination(tours, persons_merged, network_los, chunk_size, trace_hh_id)#

Given the tour generation from the above, each tour needs to have a destination, so in this case tours are the choosers (with the associated person that’s making the tour)

Non-Mandatory Tour Scheduling#

The non-mandatory tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each non-mandatory tour. This model uses Person Time Windows. Includes support for Mandatory Tour Scheduling.

The main interface to the non-mandatory tour purpose scheduling model is the non_mandatory_tour_scheduling() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, destination, MD time period, MD time period

activitysim.abm.models.non_mandatory_scheduling.non_mandatory_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)#

This model predicts the departure time and duration of each activity for non-mandatory tours

Vehicle Allocation#

The vehicle allocation model selects which vehicle would be used for a tour of given occupancy. The alternatives for the vehicle allocation model consist of the vehicles owned by the household and an additional non household vehicle option. (Zero-auto households would be assigned the non-household vehicle option since there are no owned vehicles in the household). A vehicle is selected for each occupancy level set by the user such that different tour modes that have different occupancies could see different operating characteristics. The output of the vehicle allocation model is appended to the tour table with column names vehicle_occup_{occupancy} and the values are the vehicle type selected.

In prototype_mtc_extended, three occupancy levels are used: 1, 2, and 3.5. The auto operating cost for occupancy level 1 is used in the drive alone mode and drive to transit modes. Occupancy levels 2 and 3.5 are used for shared ride 2 and shared ride 3+ auto operating costs, respectively. Auto operating costs are selected in the mode choice pre-processors by selecting the allocated vehicle type data from the vehicles table. If the allocated vehicle type was the non-household vehicle, the auto operating costs uses the previous default value from prototype_mtc. All trips and atwork subtours use the auto operating cost of the parent tour. Functionality was added in tour and atwork subtour mode choice to annotate the tour table and create a selected_vehicle which denotes the actual vehicle used. If the tour mode does not include a vehicle, then the selected_vehicle entry is left blank.
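The lookup described above can be sketched as follows. The mode names, vehicle type labels, the non-household label, and all cost values are hypothetical placeholders; only the vehicle_occup_{occupancy} column naming and the 1 / 2 / 3.5 occupancy levels come from this section.

```python
# Hypothetical mapping from tour mode to the occupancy level it uses.
MODE_OCCUPANCY = {
    "DRIVEALONE": 1,
    "DRIVE_TRANSIT": 1,
    "SHARED2": 2,
    "SHARED3": 3.5,
}
DEFAULT_OP_COST = 18.0          # placeholder default for non-household vehicles
NON_HH_VEHICLE = "non_hh_veh"   # placeholder label for the non-household option

# Stand-in for the vehicles table: vehicle type -> auto operating cost.
vehicle_op_cost = {"Car_2015_Gas": 16.5, "SUV_2010_Gas": 22.0}


def auto_operating_cost(tour, mode):
    occupancy = MODE_OCCUPANCY.get(mode)
    if occupancy is None:
        return None  # the mode does not use a vehicle
    vehicle = tour[f"vehicle_occup_{occupancy}"]
    if vehicle == NON_HH_VEHICLE:
        return DEFAULT_OP_COST
    return vehicle_op_cost[vehicle]


tour = {
    "vehicle_occup_1": "Car_2015_Gas",
    "vehicle_occup_2": "SUV_2010_Gas",
    "vehicle_occup_3.5": "non_hh_veh",
}
drive_alone_cost = auto_operating_cost(tour, "DRIVEALONE")
shared3_cost = auto_operating_cost(tour, "SHARED3")
```

Keeping one allocated vehicle per occupancy level lets each candidate mode see the operating cost of the vehicle it would actually use, before the mode itself has been chosen.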

The current implementation does not account for possible use of the household vehicles by other household members. Thus, it is possible for a selected vehicle to be used in two separate tours at the same time.

activitysim.abm.models.vehicle_allocation.annotate_vehicle_allocation(model_settings, trace_label)#

Add columns to the tours table in the pipeline according to spec.

Parameters
model_settings: dict
trace_label: str
activitysim.abm.models.vehicle_allocation.get_skim_dict(network_los, choosers)#

Returns a dictionary of skim wrappers to use in expression writing.

Skims have origin as home_zone_id and destination as the tour destination.

Parameters
network_los: activitysim.core.los.Network_LOS object
choosers: pd.DataFrame
Returns
skims: dict

index is skim wrapper name, value is the skim wrapper

activitysim.abm.models.vehicle_allocation.vehicle_allocation(persons, households, vehicles, tours, tours_merged, network_los, chunk_size, trace_hh_id)#

Selects a vehicle for each occupancy level for each tour.

Alternatives consist of up to the number of household vehicles, plus one option for non-household vehicles.

The model will be run once for each tour occupancy defined in the model yaml. The output tour table will have columns added for each occupancy level.

The user may also augment the tours tables with new vehicle type-based fields specified via the annotate_tours option.

Parameters
persons: orca.DataFrameWrapper
households: orca.DataFrameWrapper
vehicles: orca.DataFrameWrapper
vehicles_merged: orca.DataFrameWrapper
tours: orca.DataFrameWrapper
tours_merged: orca.DataFrameWrapper
chunk_size: orca.injectable
trace_hh_id: orca.injectable

Tour Mode Choice#

The mandatory, non-mandatory, and joint tour mode choice model assigns to each tour the “primary” mode that is used to get from the origin to the primary destination. The tour-based modeling approach requires a reconsideration of the conventional mode choice structure. Instead of a single mode choice model used in a four-step structure, there are two different levels where the mode choice decision is modeled: (a) the tour mode level (upper-level choice); and, (b) the trip mode level (lower-level choice conditional upon the upper-level choice).

The mandatory, non-mandatory, and joint tour mode level represents the decisions that apply to the entire tour, and that will affect the alternatives available for each individual trip or joint trip. These decisions include the choice to use a private car versus using public transit, walking, or biking; whether carpooling will be considered; and whether transit will be accessed by car or by foot. Trip-level decisions correspond to details of the exact mode used for each trip, which may or may not change over the trips in the tour.

The mandatory, non-mandatory, and joint tour mode choice structure is a nested logit model which separates similar modes into different nests to more accurately model the cross-elasticities between the alternatives. The eighteen modes are incorporated into the nesting structure specified in the model settings file. The first level of nesting represents the use of a private car, non-motorized means, or transit. In the second level of nesting, the auto nest is divided into vehicle occupancy categories, and transit is divided into walk access and drive access nests. The final level splits the auto nests into free or pay alternatives and the transit nests into the specific line-haul modes.

The primary variables are in-vehicle time, other travel times, cost (the influence of which is derived from the automobile in-vehicle time coefficient and the persons’ modeled value of time), characteristics of the destination zone, demographics, and the household’s level of auto ownership.

The main interface to the mandatory, non-mandatory, and joint tour mode model is the tour_mode_choice_simulate() function. This function is called in the Inject step tour_mode_choice_simulate and is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Field: mode | Skims Keys: TAZ, destination, start, end

activitysim.abm.models.tour_mode_choice.append_tour_leg_trip_mode_choice_logsums(tours)#

Creates trip mode choice logsum column in tours table for each tour mode and leg

Parameters
tours: pd.DataFrame
Returns
tours: pd.DataFrame

Adds two * n_modes logsum columns to each tour row, e.g. “logsum_DRIVE_outbound”

activitysim.abm.models.tour_mode_choice.create_logsum_trips(tours, segment_column_name, model_settings, trace_label)#

Construct table of trips from half-tours (1 inbound, 1 outbound) for each tour-mode.

Parameters
tours: pandas.DataFrame
segment_column_name: str

column in tours table used for segmenting model spec

model_settings: dict
trace_label: str
Returns
pandas.DataFrame

Table of trips: 2 per tour, with O/D and purpose inherited from tour

activitysim.abm.models.tour_mode_choice.get_alts_from_segmented_nested_logit(model_settings, segment_name, trace_label)#

Infer alts from logit spec

Parameters
model_settings: dict
segment_name: str
trace_label: str
Returns
list
activitysim.abm.models.tour_mode_choice.get_trip_mc_logsums_for_all_modes(tours, segment_column_name, model_settings, trace_label)#

Creates pseudo-trips from tours and runs trip mode choice to get logsums

Parameters
tours: pandas.DataFrame
segment_column_name: str

column in tours table used for segmenting model spec

model_settings: dict
trace_label: str
Returns
tours: pd.DataFrame

Adds two * n_modes logsum columns to each tour row, e.g. “logsum_DRIVE_outbound”

activitysim.abm.models.tour_mode_choice.logger = <Logger activitysim.abm.models.tour_mode_choice (WARNING)>#

Tour mode choice is run for all tours to determine the transportation mode that will be used for the tour

activitysim.abm.models.tour_mode_choice.tour_mode_choice_simulate(tours, persons_merged, network_los, chunk_size, trace_hh_id)#

Tour mode choice simulate

At-work Subtours Frequency#

The at-work subtour frequency model selects the number of at-work subtours made for each work tour. It also creates at-work subtours by adding them to the tours table in the data pipeline. These at-work sub-tours are travel tours taken during the workday with their origin at the work location, rather than from home. Explanatory variables include employment status, income, auto ownership, the frequency of other tours, characteristics of the parent work tour, and characteristics of the workplace zone.

  • Choosers: work tours

  • Alternatives: none, 1 eating out tour, 1 business tour, 1 maintenance tour, 2 business tours, 1 eating out tour + 1 business tour

  • Dependent tables: household, person, accessibility

  • Outputs: work tour subtour frequency choice, at-work tours table (with only tour origin zone at this point)

The main interface to the at-work subtours frequency model is the atwork_subtour_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: atwork_subtour_frequency | Skims Keys: NA

activitysim.abm.models.atwork_subtour_frequency.atwork_subtour_frequency(tours, persons_merged, chunk_size, trace_hh_id)#

This model predicts the frequency of making at-work subtours (alternatives for this model come from a separate csv file which is configured by the user).

At-work Subtours Destination Choice#

The at-work subtours destination choice model is made up of three model steps:

  • sample - selects a sample of alternative locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative location.

  • simulate - starts with the table created above and chooses a final location, this time with the mode choice logsum included.

At-work subtour location choice for placeholder_multiple_zone models uses Presampling by default.

Core Table: tours | Result Field: destination | Skims Keys: workplace_taz, alt_dest, MD time period

The main interface to the at-work subtour destination model is the atwork_subtour_destination() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

At-work Subtour Scheduling#

The at-work subtours scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each at-work subtour. This model uses Person Time Windows.

This model is the same as the mandatory tour scheduling model except that it operates on the at-work tours and constrains the alternative set to the available Person Time Windows. The at-work subtour scheduling model does not use mode choice logsums. The at-work subtour frequency model can choose multiple tours, so this model must process all first tours and then all second tours, since isFirstAtWorkTour is an explanatory variable.

  • Choosers: at-work tours

  • Alternatives: alternative departure time and arrival back at origin time pairs WITHIN the work tour departure time and arrival time back at origin AND the person time window. If no time window is available for the tour, make the first and last time periods within the work tour available, make the choice, and log the number of times this occurs.

  • Dependent tables: skims, person, land use, work tour

  • Outputs: at-work tour departure time and arrival back at origin time, updated person time windows

The main interface to the at-work subtours scheduling model is the atwork_subtour_scheduling() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: workplace_taz, alt_dest, MD time period, MD time period

activitysim.abm.models.atwork_subtour_scheduling.atwork_subtour_scheduling(tours, persons_merged, tdd_alts, skim_dict, chunk_size, trace_hh_id)#

This model predicts the departure time and duration of each activity for at-work subtours

At-work Subtour Mode#

The at-work subtour mode choice model assigns a travel mode to each at-work subtour using the Tour Mode Choice model.

The main interface to the at-work subtour mode choice model is the atwork_subtour_mode_choice() function. This function is called in the Inject step atwork_subtour_mode_choice and is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Field: tour_mode | Skims Keys: workplace_taz, destination, start, end

activitysim.abm.models.atwork_subtour_mode_choice.atwork_subtour_mode_choice(tours, persons_merged, network_los, chunk_size, trace_hh_id)#

At-work subtour mode choice simulate

Intermediate Stop Frequency#

The stop frequency model assigns to each tour the number of intermediate destinations a person will travel to on each leg of the tour from the origin to tour primary destination and back. The model incorporates the ability for more than one stop in each direction, up to a maximum of 3, for a total of 8 trips per tour (four on each tour leg).

Intermediate stops are not modeled for drive-transit tours because doing so can have unintended consequences because of the difficulty of tracking the location of the vehicle. For example, consider someone who used a park and ride for work and then took transit to an intermediate shopping stop on the way home. Without knowing the vehicle location, it cannot be determined if it is reasonable to allow the person to drive home. Even if the tour were constrained to allow driving only on the first and final trip, the trip home from an intermediate stop may not use the same park and ride where the car was dropped off on the outbound leg, which is usually as close as possible to home because of the impracticality of coding drive access links from every park and ride lot to every zone.

This model also creates a trips table in the pipeline for later models.

The main interface to the intermediate stop frequency model is the stop_frequency() function. This function is registered as an Inject step in the example Pipeline.

Core Table: tours | Result Field: stop_frequency | Skims Keys: NA

activitysim.abm.models.stop_frequency.stop_frequency(tours, tours_merged, stop_frequency_alts, network_los, chunk_size, trace_hh_id)#

stop frequency model

For each tour, choose a number of intermediate inbound stops and outbound stops. Create a trip table with inbound and outbound trips.

Thus, a tour with stop_frequency ‘2out_0in’ will have two outbound and zero inbound stops, and four corresponding trips: three outbound, and one inbound.
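The relationship between a stop_frequency value and the resulting trips can be sketched as below. The `trips_for_tour` helper is hypothetical and only mirrors the arithmetic described here (each leg has one more trip than it has stops); it produces a subset of the trips table columns.

```python
import re


def trips_for_tour(tour_id, stop_frequency):
    # Parse a label such as '2out_0in' into outbound and inbound stop counts.
    m = re.fullmatch(r"(\d+)out_(\d+)in", stop_frequency)
    if m is None:
        raise ValueError(f"unrecognized stop_frequency: {stop_frequency!r}")
    n_out, n_in = int(m.group(1)), int(m.group(2))

    trips = []
    for outbound, n_stops in ((True, n_out), (False, n_in)):
        trip_count = n_stops + 1  # stops split the leg into stops + 1 trips
        for trip_num in range(1, trip_count + 1):
            trips.append({
                "tour_id": tour_id,
                "outbound": outbound,
                "trip_num": trip_num,
                "trip_count": trip_count,
            })
    return trips


trips = trips_for_tour(7, "2out_0in")
n_outbound = sum(t["outbound"] for t in trips)
```

The same arithmetic gives the upper bound quoted for this model: three stops in each direction yield four trips per leg and eight trips per tour.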

Adds stop_frequency str column to trips, with fields

creates trips table with columns:

- person_id
- household_id
- tour_id
- primary_purpose
- atwork
- trip_num
- outbound
- trip_count

Trip Purpose#

For each trip other than the last trip outbound or inbound, assign a purpose based on an observed frequency distribution. The distribution is segmented by tour purpose, tour direction, and person type. Work tours are also segmented by departure or arrival time period.
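A minimal sketch of drawing intermediate trip purposes from a segmented frequency table follows. The lookup contents, the `ptype` key, and the helper name are hypothetical; the sketch simply skips the last trip on each leg, whose purpose is not drawn from the table.

```python
import random

# Hypothetical lookup: (tour purpose, outbound?, person type) -> purpose shares.
PROBS = {
    ("work", True, 1): {"escort": 0.2, "shopping": 0.3, "eatout": 0.5},
    ("work", False, 1): {"shopping": 0.6, "othmaint": 0.4},
}


def choose_intermediate_purposes(trips, rng):
    """Return {trip_id: purpose} for intermediate trips only."""
    purposes = {}
    for trip in trips:
        if trip["trip_num"] == trip["trip_count"]:
            continue  # last trip on the leg is not drawn from the table
        key = (trip["primary_purpose"], trip["outbound"], trip["ptype"])
        shares = PROBS[key]
        names = list(shares)
        purposes[trip["trip_id"]] = rng.choices(
            names, weights=[shares[n] for n in names]
        )[0]
    return purposes


trips = [
    {"trip_id": 1, "primary_purpose": "work", "outbound": True,
     "ptype": 1, "trip_num": 1, "trip_count": 2},
    {"trip_id": 2, "primary_purpose": "work", "outbound": True,
     "ptype": 1, "trip_num": 2, "trip_count": 2},
    {"trip_id": 3, "primary_purpose": "work", "outbound": False,
     "ptype": 1, "trip_num": 1, "trip_count": 1},
]
purposes = choose_intermediate_purposes(trips, random.Random(0))
```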

The main interface to the trip purpose model is the trip_purpose() function. This function is registered as an Inject step in the example Pipeline.

Core Table: trips | Result Field: purpose | Skims Keys: NA

Note

Trip purpose and trip destination choice can be run iteratively together via Trip Purpose and Destination.

activitysim.abm.models.trip_purpose.choose_intermediate_trip_purpose(trips, probs_spec, estimator, probs_join_cols, use_depart_time, trace_hh_id, trace_label)#

choose purpose for intermediate trips based on probs_spec, which assigns relative weights (summing to 1) to the possible purpose choices

Returns
purpose: pandas.Series of purpose (str) indexed by trip_id
activitysim.abm.models.trip_purpose.run_trip_purpose(trips_df, estimator, chunk_size, trace_hh_id, trace_label)#

trip purpose - main functionality separated from model step so it can be called iteratively

For each intermediate stop on a tour (i.e. trip other than the last trip outbound or inbound) each trip is assigned a purpose based on an observed frequency distribution

The distribution should always be segmented by tour purpose and tour direction. By default it is also segmented by person type. The join columns can be overwritten using the “probs_join_cols” parameter in the model settings. The model will attempt to segment by trip depart time as well if necessary and depart time ranges are specified in the probability lookup table.

Returns
purpose: pandas.Series of purpose (str) indexed by trip_id
activitysim.abm.models.trip_purpose.trip_purpose(trips, chunk_size, trace_hh_id)#

trip purpose model step - calls run_trip_purpose to run the actual model

adds purpose column to trips

Trip Destination Choice#

See Trip Destination.

Trip Purpose and Destination#

After running trip purpose and trip destination separately, the two models can be run together in an iterative fashion on the remaining failed trips (i.e. trips that cannot be assigned a destination). Each iteration uses new random numbers.

The main interface to the trip purpose and destination model is the trip_purpose_and_destination() function. This function is registered as an Inject step in the example Pipeline.

Core Table: trips | Result Field: purpose, destination | Skims Keys: origin, (tour primary) destination, dest_taz, trip_period

Trip Scheduling (Probabilistic)#

For each trip, assign a departure hour based on an input lookup table of percents by tour purpose, direction (inbound/outbound), tour hour, and trip index.

  • The tour hour is the tour start hour for outbound trips and the tour end hour for inbound trips. The trip index is the trip sequence on the tour, with up to four trips per half tour

  • For outbound trips, the trip depart hour must be greater than or equal to the previously selected trip depart hour

  • For inbound trips, trips are handled in reverse order from the next-to-last trip in the leg back to the first. The tour end hour serves as the anchor time point from which to start assigning trip time periods.

  • Outbound trips on at-work subtours are assigned the tour depart hour and inbound trips on at-work subtours are assigned the tour end hour.

The assignment of trip depart time is run iteratively up to a max number of iterations since it is possible that the time period selected for an earlier trip in a half-tour makes selection of a later trip time period impossible (or very low probability). Thus, the sampling is re-run until a feasible set of trip time periods is found. If a trip can’t be scheduled after the max iterations, then the trip is assigned the previous trip’s choice (i.e. assumed to happen right after the previous trip) or dropped, as configured by the user. The trip scheduling model does not use mode choice logsums.
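The retry loop described above can be sketched for a single outbound leg as follows. This is a simplified illustration under stated assumptions, not the ActivitySim code: the function name, the dictionary-based probability rows, and the default iteration limit are hypothetical.

```python
import random


def schedule_outbound_leg(probs_by_trip, tour_start, tour_end, rng, max_iterations=100):
    """Return one depart hour per trip, or None if no feasible set is found.

    probs_by_trip: list of {hour: probability} dicts, one per trip in trip order.
    """
    for _ in range(max_iterations):
        earliest = tour_start
        departs = []
        for probs in probs_by_trip:
            # Only hours at or after the previous trip's depart remain available.
            hours = [h for h in range(earliest, tour_end + 1) if probs.get(h, 0.0) > 0]
            if not hours:
                break  # an earlier choice blocked this trip; resample the whole leg
            weights = [probs[h] for h in hours]
            choice = rng.choices(hours, weights=weights, k=1)[0]
            departs.append(choice)
            earliest = choice
        else:
            return departs  # every trip in the leg was scheduled
    return None


# The second trip can only depart at hour 8, so a first-trip choice of 9 forces a retry.
departs = schedule_outbound_leg([{8: 0.5, 9: 0.5}, {8: 1.0}], 8, 10, random.Random(0))
```

A return value of None corresponds to the case where the trip still cannot be scheduled after the maximum number of iterations and the configured fallback applies.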

Alternatives: Available time periods in the tour window (i.e. tour start and end period). When processing stops on work tours, the available time periods are constrained by the at-work subtour start and end period as well.

The main interface to the trip scheduling model is the trip_scheduling() function. This function is registered as an Inject step in the example Pipeline.

Core Table: trips | Result Field: depart | Skims Keys: NA

activitysim.abm.models.trip_scheduling.logger = <Logger activitysim.abm.models.trip_scheduling (WARNING)>#

StopDepartArrivePeriodModel

StopDepartArriveProportions.csv tourpurp,isInbound,interval,trip,p1,p2,p3,p4,p5…p40

activitysim.abm.models.trip_scheduling.schedule_trips_in_leg(outbound, trips, probs_spec, model_settings, is_last_iteration, trace_hh_id, trace_label)#
Parameters
outbound
trips
probs_spec
depart_alt_base
is_last_iteration
trace_hh_id
trace_label
Returns
choices: pd.Series

depart choice for trips, indexed by trip_id

activitysim.abm.models.trip_scheduling.set_stop_num(trips)#

Convert trip_num to stop_num in order to work with duration-based probs that are keyed on stop num. For outbound trips, trip n chooses the duration of stop n-1 (the trip origin). For inbound trips, trip n chooses the duration of stop n (the trip destination). This means outbound trips technically choose a departure time while inbound trips choose an arrival.
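A minimal pandas sketch of this conversion, assuming a boolean outbound column and a 1-based trip_num column on the trips table (the sample values are made up):

```python
import pandas as pd

trips = pd.DataFrame(
    {"outbound": [True, True, False, False], "trip_num": [1, 2, 1, 2]},
    index=pd.Index([1, 2, 3, 4], name="trip_id"),
)

# Outbound trip n chooses the duration of stop n-1 (its origin);
# inbound trip n chooses the duration of stop n (its destination).
trips["stop_num"] = trips["trip_num"].where(~trips["outbound"], trips["trip_num"] - 1)
```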

activitysim.abm.models.trip_scheduling.set_tour_hour(trips, tours)#

add columns ‘tour_hour’, ‘earliest’, ‘latest’ to trips

Parameters
trips: pd.DataFrame
tours: pd.DataFrame
Returns
modifies trips in place
activitysim.abm.models.trip_scheduling.trip_scheduling(trips, tours, chunk_size, trace_hh_id)#

Trip scheduling assigns depart times for trips within the start, end limits of the tour.

The algorithm is simplistic:

The first outbound trip starts at the tour start time, and subsequent outbound trips are processed in trip_num order, to ensure that subsequent trips do not depart before the trip that precedes them.

Inbound trips are handled similarly, except in reverse order, starting with the last trip, and working backwards to ensure that inbound trips do not depart after the trip that succeeds them.

The probability spec assigns probabilities for depart times, but those possible departs must be clipped to disallow depart times outside the tour limits, the departs of prior trips, and in the case of work tours, the start/end times of any atwork subtours.

Scheduling can fail if the probability table assigns zero probabilities to all the available depart times in a trip’s depart window. (This could be avoided by giving every window a small probability, rather than zero, but the existing mtctm1 prob spec does not do this. I believe this is due to its having been generated from a small household travel survey sample that lacked any departs for some time periods.)

Rescheduling the trips that fail (along with their inbound or outbound leg-mates) can sometimes fix this problem, if it was caused by an earlier trip’s depart choice blocking a subsequent trip’s ability to schedule a depart within the resulting window. But it can also happen if a tour is very short (e.g. one time period) and the prob spec has a zero probability for that tour hour.

Therefore we need to handle trips that could not be scheduled. There are two ways (at least) to solve this problem:

1) choose_most_initial: simply assign a depart time to the trip, even if it has a zero probability. It makes most sense, in this case, to assign the ‘most initial’ depart time, so that subsequent trips are minimally impacted. This can be done in the final iteration, thus affecting only the trips that could not be scheduled by the standard approach.

2) drop_and_cleanup: drop trips that could not be scheduled, and adjust their leg mates, as is done for failed trips in trip_destination.

Which option is applied is determined by the FAILFIX model setting
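The clipping step and the two fallback options can be sketched as follows. This is an illustration only: both helper functions are hypothetical, treating the ‘most initial’ depart time as the earliest hour of the trip’s window is an assumption, and the adjustment of leg mates after a drop is not shown.

```python
def clip_and_normalize(probs, earliest, latest):
    """Keep depart hours inside [earliest, latest] and rescale them to sum to 1."""
    clipped = {h: p for h, p in probs.items() if earliest <= h <= latest and p > 0}
    total = sum(clipped.values())
    # An empty result means the trip cannot be scheduled in this window.
    return {h: p / total for h, p in clipped.items()} if total > 0 else {}


def apply_failfix(choices, earliest_by_trip, failfix):
    """Resolve unscheduled trips (choice is None) according to the failfix option."""
    if failfix == "choose_most_initial":
        # Assumption: the most initial depart time is the earliest hour allowed.
        return {
            trip: choice if choice is not None else earliest_by_trip[trip]
            for trip, choice in choices.items()
        }
    if failfix == "drop_and_cleanup":
        return {trip: choice for trip, choice in choices.items() if choice is not None}
    raise ValueError(f"unknown failfix option: {failfix}")


clipped = clip_and_normalize({7: 0.2, 8: 0.3, 9: 0.5}, earliest=8, latest=9)
fixed = apply_failfix({1: 8, 2: None}, {1: 8, 2: 9}, "choose_most_initial")
```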

activitysim.abm.models.trip_scheduling.update_tour_earliest(trips, outbound_choices)#

Updates “earliest” column for inbound trips based on the maximum outbound trip departure time of the tour. This is done to ensure inbound trips do not depart before the last outbound trip of a tour.

Parameters
trips: pd.DataFrame
outbound_choices: pd.Series

time periods depart choices, one per trip (except for trips with zero probs)

Returns
modifies trips in place
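A minimal pandas sketch of this update, assuming the trips table carries tour_id, outbound, and earliest columns (the function name and the sample data are hypothetical):

```python
import pandas as pd

trips = pd.DataFrame(
    {
        "tour_id": [10, 10, 10, 20, 20],
        "outbound": [True, True, False, True, False],
        "earliest": [8, 8, 8, 7, 7],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="trip_id"),
)
outbound_choices = pd.Series([9, 11, 7], index=pd.Index([1, 2, 4], name="trip_id"))


def update_earliest(trips, outbound_choices):
    # Latest outbound departure for each tour.
    tour_ids = trips["tour_id"].reindex(outbound_choices.index)
    last_outbound = outbound_choices.groupby(tour_ids).max()
    inbound = ~trips["outbound"]
    mapped = trips.loc[inbound, "tour_id"].map(last_outbound)
    # Tours without any outbound choice keep their previous earliest value.
    trips.loc[inbound, "earliest"] = mapped.fillna(trips.loc[inbound, "earliest"]).astype(int)


update_earliest(trips, outbound_choices)
```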

Trip Scheduling Choice (Logit Choice)#

This model uses a logit-based formulation to determine potential trip windows for the three main components of a tour.

  • Outbound Leg: The time window from leaving the origin location to the second-to-last outbound stop.

  • Main Leg: The time window from the last outbound stop through the main tour destination to the first inbound stop.

  • Inbound Leg: The time window from the first inbound stop to the tour origin location.

Core Table: tours | Result Field: outbound_duration, main_leg_duration, inbound_duration | Skims Keys: NA

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

Trip Departure Choice (Logit Choice)#

Used in conjunction with Trip Scheduling Choice (Logit Choice), this model chooses departure time periods consistent with the time windows for the appropriate leg of the trip.

Core Table: trips | Result Field: depart | Skims Keys: NA

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

Trip Mode Choice#

The trip mode choice model assigns a travel mode for each trip on a given tour. It operates similarly to the tour mode choice model, but only certain trip modes are available for each tour mode. The correspondence rules are defined according to the following principles:

  • Pay trip modes are only available for pay tour modes (for example, drive-alone pay is only available at the trip mode level if drive-alone pay is selected as a tour mode).

  • The auto occupancy of the tour mode is determined by the maximum occupancy across all auto trips that make up the tour. Therefore, the auto occupancy for the tour mode is the maximum auto occupancy for any trip on the tour.

  • Transit tours can include auto shared-ride trips for particular legs. Therefore, ‘casual carpool’, wherein travelers share a ride to work and take transit back to the tour origin, is explicitly allowed in the tour/trip mode choice model structure.

  • The walk mode is allowed for any trip.

  • The availability of transit line-haul submodes on transit tours depends on the skimming and tour mode choice hierarchy. Free shared-ride modes are also available in walk-transit tours, albeit with a low probability. Paid shared-ride modes are not allowed on transit tours because no stated preference data is available on the sensitivity of transit riders to automobile value tolls, and no observed data is available to verify the number of people shifting into paid shared-ride trips on transit tours.

The trip mode choice model’s explanatory variables include household and person variables, level-of-service between the trip origin and destination according to the time period for the tour leg, urban form variables, and alternative-specific constants segmented by tour mode.

The main interface to the trip mode choice model is the trip_mode_choice() function. This function is registered as an Inject step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: trips | Result Field: trip_mode | Skims Keys: origin, destination, trip_period

activitysim.abm.models.trip_mode_choice.trip_mode_choice(trips, network_los, chunk_size, trace_hh_id)#

Trip mode choice - compute trip_mode (same values as for tour_mode) for each trip.

Modes for each primary tour purpose are calculated separately because they have different coefficient values (stored in the trip_mode_choice_coefficients.csv coefficient file).

Adds trip_mode column to trip table

Parking Location Choice#

The parking location choice model selects a parking location for specified trips. While the model does not require parking location be applied to any specific set of trips, it is usually applied for drive trips to specific zones (e.g., CBD) in the model.

The model provides a filter for both the eligible choosers and the eligible parking location zones. The trips dataframe is the chooser of this model. The zone selection filter is applied to the land use zones dataframe.

If this model is specified in the pipeline, the Write Trip Matrices step will use the parking location choice results to build trip tables in lieu of the trip destination.

The main interface to the parking location choice model is the parking_location_choice() function. This function is registered as an Inject step, and it is available from the pipeline. See Writing Logsums for how to write logsums for estimation.

Skims

  • odt_skims: Origin to Destination by Time of Day

  • dot_skims: Destination to Origin by Time of Day

  • opt_skims: Origin to Parking Zone by Time of Day

  • pdt_skims: Parking Zone to Destination by Time of Day

  • od_skims: Origin to Destination

  • do_skims: Destination to Origin

  • op_skims: Origin to Parking Zone

  • pd_skims: Parking Zone to Destination

Core Table: trips

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

  • CHOOSER_FILTER_COLUMN_NAME

    Boolean field on the chooser table defining which choosers are eligible for the parking location choice model. If no filter is specified, all choosers (trips) are eligible for the model.

  • CHOOSER_SEGMENT_COLUMN_NAME

    Column on the chooser table defining the parking segment for the logit model

  • SEGMENTS

    List of eligible chooser segments in the logit specification

  • ALTERNATIVE_FILTER_COLUMN_NAME

    Boolean field used to filter land use zones as eligible parking location choices. If no filter is specified, then all land use zones are considered as viable choices.

  • ALT_DEST_COL_NAME

    The column name to append with the parking location choice results. For choosers (trips) ineligible for this model, a -1 value will be placed in the column.

  • TRIP_ORIGIN

    Origin field on the chooser trip table

  • TRIP_DESTINATION

    Destination field on the chooser trip table
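The way the two filter columns and ALT_DEST_COL_NAME interact can be sketched as follows. This is an illustration under assumptions: the column names in the settings dictionary and the sample tables are made up, and the logit choice itself is replaced by a placeholder that picks the first eligible zone.

```python
import pandas as pd

# Hypothetical settings values; the keys are the YAML attributes listed above.
settings = {
    "CHOOSER_FILTER_COLUMN_NAME": "is_cbd_drive_trip",
    "ALTERNATIVE_FILTER_COLUMN_NAME": "is_parking_zone",
    "ALT_DEST_COL_NAME": "parking_zone",
}

trips = pd.DataFrame(
    {"destination": [3, 5, 3], "is_cbd_drive_trip": [True, False, True]},
    index=pd.Index([1, 2, 3], name="trip_id"),
)
land_use = pd.DataFrame(
    {"is_parking_zone": [False, True, True]},
    index=pd.Index([3, 4, 5], name="zone_id"),
)

# Apply the chooser filter to trips and the alternative filter to land use zones.
choosers = trips[trips[settings["CHOOSER_FILTER_COLUMN_NAME"]]]
alternatives = land_use[land_use[settings["ALTERNATIVE_FILTER_COLUMN_NAME"]]]

# Placeholder for the logit choice: every chooser gets the first eligible zone.
choices = pd.Series(alternatives.index[0], index=choosers.index)

# Ineligible trips receive -1 in the result column.
trips[settings["ALT_DEST_COL_NAME"]] = choices.reindex(trips.index, fill_value=-1)
```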

activitysim.abm.models.parking_location_choice.parking_destination_simulate(segment_name, trips, destination_sample, model_settings, skims, chunk_size, trace_hh_id, trace_label)#

Choose destination from destination_sample (with od_logsum and dp_logsum columns added)

Returns
choices - pandas.Series

destination alt chosen

activitysim.abm.models.parking_location_choice.parking_location(trips, trips_merged, land_use, network_los, chunk_size, trace_hh_id)#

Given a set of trips, each trip needs to have a parking location if it is eligible for remote parking.

activitysim.abm.models.parking_location_choice.wrap_skims(model_settings)#

wrap skims of trip destination using origin, dest column names from model settings. Various of these are used by destination_sample, compute_logsums, and destination_simulate so we create them all here with canonical names.

Note that compute_logsums aliases their names so it can use the same equations to compute logsums from origin to alt_dest, and from alt_dest to primary destination

  • odt_skims - SkimStackWrapper: trip origin, trip alt_dest, time_of_day

  • dot_skims - SkimStackWrapper: trip alt_dest, trip origin, time_of_day

  • dpt_skims - SkimStackWrapper: trip alt_dest, trip primary_dest, time_of_day

  • pdt_skims - SkimStackWrapper: trip primary_dest, trip alt_dest, time_of_day

  • od_skims - SkimDictWrapper: trip origin, trip alt_dest

  • dp_skims - SkimDictWrapper: trip alt_dest, trip primary_dest

Parameters
model_settings
Returns
dict containing skims, keyed by canonical names relative to tour orientation

Write Trip Matrices#

Write open matrix (OMX) trip matrices for assignment. Reads the trips table after preprocessing and runs expressions to code additional data fields, with one data field for each matrix specified. The matrices are scaled by a household level expansion factor, which by default is the household sample rate calculated when households are read in at the beginning of a model run. The main interface to write trip matrices is the write_trip_matrices() function. This function is registered as an Inject step in the example Pipeline.

If the Parking Location Choice model is defined in the pipeline, the parking location zone will be used in lieu of the destination zone.

Core Table: trips | Result: omx trip matrices | Skims Keys: origin, destination

activitysim.abm.models.trip_matrices.annotate_trips(trips, network_los, model_settings)#

Add columns to local trips table. The annotator has access to the origin/destination skims and everything defined in the model settings CONSTANTS.

Pipeline tables can also be accessed by listing them under TABLES in the preprocessor settings.

activitysim.abm.models.trip_matrices.write_matrices(aggregate_trips, zone_index, orig_index, dest_index, model_settings, is_tap=False)#

Write aggregated trips to OMX format.

The MATRICES setting lists the new OMX files to write. Each file can contain any number of ‘tables’, each specified by a table key (‘name’) and a trips table column (‘data_field’) to use for aggregated counts.

Any data type may be used for columns added in the annotation phase, but the table ‘data_field’s must be summable types: ints, floats, bools.
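The aggregation of a summable data_field into an origin-destination matrix can be sketched with pandas and NumPy. This is an illustration only: the DRIVE_AM column, the zone list, and the scalar expansion factor are hypothetical, and writing the array to an OMX file is left out.

```python
import numpy as np
import pandas as pd

zone_index = pd.Index([1, 2, 3], name="zone_id")
trips = pd.DataFrame(
    {
        "origin": [1, 1, 2, 3],
        "destination": [2, 2, 3, 1],
        "DRIVE_AM": [True, True, False, True],  # hypothetical boolean data_field
    }
)
expansion_factor = 2.0  # made-up value applied to every trip in this sketch

# Sum the data_field by origin-destination pair, then scale.
aggregate = trips.groupby(["origin", "destination"])["DRIVE_AM"].sum() * expansion_factor

# Place the aggregated counts into a dense zone-by-zone matrix.
matrix = np.zeros((len(zone_index), len(zone_index)))
rows = zone_index.get_indexer(aggregate.index.get_level_values("origin"))
cols = zone_index.get_indexer(aggregate.index.get_level_values("destination"))
matrix[rows, cols] = aggregate.to_numpy()
```

Booleans sum as 0 and 1, which is why the data_field must be a summable type.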

activitysim.abm.models.trip_matrices.write_trip_matrices(network_los)#

Write trip matrices step.

Adds boolean columns to local trips table via annotation expressions, then aggregates trip counts and writes OD matrices to OMX. Save annotated trips table to pipeline if desired.

Writes taz trip tables for one and two zone systems. Writes taz and tap trip tables for three zone systems. Add is_tap:True to the settings file to identify an output matrix as tap level trips as opposed to taz level trips.

For one zone system, uses the land use table for the set of possible tazs. For two zone system, uses the taz skim zone names for the set of possible tazs. For three zone system, uses the taz skim zone names for the set of possible tazs and uses the tap skim zone names for the set of possible taps.

Util#

Additional helper classes

CDAP#

activitysim.abm.models.util.cdap.add_interaction_column(choosers, p_tup)#

Add an interaction column in place to choosers, listing the ptypes of the persons in p_tup

The name of the interaction column will be determined by the cdap_ranks from p_tup, and the rows in the column contain the ptypes of those persons in that household row.

For instance, for p_tup = (1,3) choosers interaction column name will be ‘p1_p3’

For a household where person 1 is part-time worker (ptype=2) and person 3 is infant (ptype 8) the corresponding row value interaction code will be 28

We take advantage of the fact that interactions are symmetrical to simplify spec expressions. We name the interaction_column in increasing pnum (cdap_rank) order (p1_p2 and not p3_p1), and we format row values in increasing ptype order (28 and not 82). This simplifies the spec expressions as we don’t have to test for p1_p3 == 28 | p1_p3 == 82.

Parameters
choosers: pandas.DataFrame

household choosers, indexed on _hh_index_. choosers should contain columns ptype_p1, ptype_p2 for each cdap_rank person in hh

p_tup: int tuple

tuple specifying the cdap_ranks for the interaction column. p_tup = (1,3) means persons with cdap_rank 1 and 3

Returns
activitysim.abm.models.util.cdap.add_pn(col, pnum)#

return the canonical column name for the indiv_util column or columns in merged hh_chooser df for individual with cdap_rank pnum

e.g. M_p1, ptype_p2 but leave _hh_id_ column unchanged

activitysim.abm.models.util.cdap.assign_cdap_rank(persons, person_type_map, trace_hh_id=None, trace_label=None)#

Assign an integer index, cdap_rank, to each household member. (Starting with 1, not 0)

Modifies persons df in place

The cdap_rank order is important, because cdap only assigns activities to the first MAX_HHSIZE persons in each household.

This will preferentially be two working adults and the three youngest children.

Rank is assigned starting at 1. This necessitates some care in indexing, but is preferred as it follows the convention of 1-based pnums in expression files.

According to the documentation of reOrderPersonsForCdap in mtctm2.abm.ctramp HouseholdCoordinatedDailyActivityPatternModel:

“Method reorders the persons in the household for use with the CDAP model, which only explicitly models the interaction of five persons in a HH. Priority in the reordering is first given to full time workers (up to two), then to part time workers (up to two workers, of any type), then to children (youngest to oldest, up to three). If the method is called for a household with less than 5 people, the cdapPersonArray is the same as the person array.”

We diverge from the above description in that a cdap_rank is assigned to all persons, including ‘extra’ household members, whose activity is assigned subsequently. The pair _hh_id_, cdap_rank will uniquely identify each household member.

Parameters
persons: pandas.DataFrame

Table of persons data. Must contain columns _hh_size_, _hh_id_, _ptype_, _age_

Returns
cdap_rank: pandas.Series

integer cdap_rank of every person, indexed on _persons_index_

activitysim.abm.models.util.cdap.build_cdap_spec(interaction_coefficients, hhsize, trace_spec=False, trace_label=None, cache=True)#

Build a spec file for computing utilities of alternative household member interaction patterns for households of specified size.

We generate this spec automatically from a table of rules and coefficients because the interaction rules are fairly simple and can be expressed compactly whereas there is a lot of redundancy between the spec files for different household sizes, as well as in the vectorized expression of the interaction alternatives within the spec file itself

interaction_coefficients has five columns:
activity

A single character activity type name (M, N, or H)

interaction_ptypes

List of ptypes in the interaction (in order of increasing ptype) or empty for wildcards (meaning that the interaction applies to all ptypes in that size hh)

cardinality

the number of persons in the interaction (e.g. 3 for a 3-way interaction)

slug

a human friendly coefficient name so we can dump a readable spec trace file for debugging; this slug is replaced with the numerical coefficient value after we dump the trace file

coefficient

The coefficient to apply for all hh interactions for this activity and set of ptypes

The generated spec will have the eval expression in the index, and a utility column for each alternative (e.g. [‘HH’, ‘HM’, ‘HN’, ‘MH’, ‘MM’, ‘MN’, ‘NH’, ‘NM’, ‘NN’] for hhsize 2)

In order to be able to dump the spec in a human-friendly fashion to facilitate debugging the cdap_interaction_coefficients table, we first populate utility columns in the spec file with the coefficient slugs, dump the spec file, and then replace the slugs with coefficients.

Parameters
interaction_coefficients: pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

hhsize: int

household size for which the spec should be built.

Returns
spec: pandas.DataFrame
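The utility columns of the generated spec correspond to every combination of activities across household members. A small sketch of that enumeration, using only the standard library (the function name is hypothetical, and the ‘HMN’ ordering is chosen to reproduce the hhsize 2 example above):

```python
from itertools import product

ACTIVITIES = "HMN"  # Home, Mandatory, NonMandatory


def household_alternatives(hhsize):
    """List every activity pattern string for a household of the given size."""
    return ["".join(pattern) for pattern in product(ACTIVITIES, repeat=hhsize)]


alternatives = household_alternatives(2)
```

The number of alternatives grows as 3 to the power of hhsize, which is one reason the interaction patterns are only modelled explicitly up to MAX_HHSIZE persons.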
activitysim.abm.models.util.cdap.extra_hh_member_choices(persons, cdap_fixed_relative_proportions, locals_d, trace_hh_id, trace_label)#

Generate the activity choices for the ‘extra’ household members who weren’t handled by cdap

Following the CTRAMP HouseholdCoordinatedDailyActivityPatternModel, “a separate, simple cross-sectional distribution is looked up for the remaining household members”

The cdap_fixed_relative_proportions spec is handled like an activitysim logit utility spec, EXCEPT that the values computed are relative proportions, not utilities (i.e. values are not exponentiated before being normalized to probabilities summing to 1.0)

Parameters
persons: pandas.DataFrame
Table of persons data indexed on _persons_index_

We expect, at least, columns [_hh_id_, _ptype_]

cdap_fixed_relative_proportions

spec to compute/specify the relative proportions of each activity (M, N, H) that should be used to choose activities for additional household members not handled by CDAP.

locals_d: Dict

dictionary of local variables that eval_variables adds to the environment for an evaluation of an expression that begins with @

Returns
choices: pandas.Series

list of alternatives chosen for all extra members, indexed by _persons_index_
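The difference between relative proportions and utilities can be sketched as follows: the values are normalized directly, without exponentiation, before a choice is drawn. The function names and the sample proportions are made up for the example.

```python
import random


def proportions_to_probs(proportions):
    """Normalize non-negative relative proportions so they sum to 1 (no exp())."""
    total = sum(proportions.values())
    return {activity: weight / total for activity, weight in proportions.items()}


def choose_activity(proportions, rng):
    probs = proportions_to_probs(proportions)
    activities = list(probs)
    return rng.choices(activities, weights=[probs[a] for a in activities], k=1)[0]


probs = proportions_to_probs({"M": 1.0, "N": 2.0, "H": 1.0})
choice = choose_activity({"M": 1.0, "N": 2.0, "H": 1.0}, random.Random(0))
```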

activitysim.abm.models.util.cdap.hh_choosers(indiv_utils, hhsize)#

Build a chooser table for calculating household utilities for all households of specified hhsize

The choosers table will have one row per household with columns containing the indiv_utils for all non-extra (i.e. cdap_rank <= MAX_HHSIZE) persons. That makes 3 columns for each individual. e.g. the utilities of person with cdap_rank 1 will be included as M_p1, N_p1, H_p1

The chooser table will also contain interaction columns for all possible interactions involving from 2 to 3 persons (actually MAX_INTERACTION_CARDINALITY, which is currently 3).

The interaction columns list the ptypes of the persons in the interaction set, sorted by ptype. For instance, the interaction between persons with cdap_rank 1 and 3 will be listed in a column named ‘p1_p3’, and a household where the ptypes of persons p1 and p3 are 2 and 4 will have a row value of 24 in the p1_p3 column.

Parameters
indiv_utils: pandas.DataFrame

CDAP utilities for each individual, ignoring interactions. ind_utils has index of _persons_index_ and a column for each alternative i.e. three columns ‘M’ (Mandatory), ‘N’ (NonMandatory), ‘H’ (Home)

hhsize: int

household size for which the choosers table should be built. Households with more than MAX_HHSIZE members will be included with MAX_HHSIZE choosers since they are handled the same, and the activities of the extra members are assigned afterwards

Returns
choosers: pandas.DataFrame

choosers households of hhsize with activity utility columns and interaction columns for all (non-extra) household members
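The naming and coding of an interaction column can be sketched as follows. This is a simplified stand-in for add_interaction_column, assuming the ptype_p1, ptype_p2, ... column layout described above; the household data are made up.

```python
import pandas as pd

choosers = pd.DataFrame(
    {"ptype_p1": [2, 4], "ptype_p2": [1, 1], "ptype_p3": [8, 2]},
    index=pd.Index([100, 200], name="household_id"),
)


def add_interaction(choosers, p_tup):
    """Add, in place, a column such as 'p1_p3' holding the sorted ptypes as one integer."""
    ranks = sorted(p_tup)
    source_cols = [f"ptype_p{rank}" for rank in ranks]
    name = "_".join(f"p{rank}" for rank in ranks)
    # Sort ptypes within each row so that 28 and 82 both become 28.
    choosers[name] = choosers[source_cols].apply(
        lambda row: int("".join(str(ptype) for ptype in sorted(row))), axis=1
    )
    return name


column = add_interaction(choosers, (1, 3))
```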

activitysim.abm.models.util.cdap.household_activity_choices(indiv_utils, interaction_coefficients, hhsize, trace_hh_id=None, trace_label=None)#

Calculate household utilities for each activity pattern alternative for households of hhsize. The resulting activity pattern for each household will be coded as a string of activity codes, e.g. ‘MNHH’ for a 4 person household with activities Mandatory, NonMandatory, Home, Home

Parameters
indiv_utils: pandas.DataFrame

CDAP utilities for each individual, ignoring interactions. ind_utils has index of _persons_index_ and a column for each alternative i.e. three columns ‘M’ (Mandatory), ‘N’ (NonMandatory), ‘H’ (Home)

interaction_coefficients: pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

hhsize: int

the size of household for which activity pattern should be calculated (1..MAX_HHSIZE)

Returns
choices: pandas.Series

the chosen cdap activity pattern for each household represented as a string (e.g. ‘MNH’) with same index (_hh_index_) as utils

activitysim.abm.models.util.cdap.individual_utilities(persons, cdap_indiv_spec, locals_d, trace_hh_id=None, trace_label=None)#

Calculate CDAP utilities for all individuals.

Parameters
persons: pandas.DataFrame

DataFrame of individual persons data.

cdap_indiv_spec: pandas.DataFrame

CDAP spec applied to individuals.

Returns
utilities: pandas.DataFrame

Will have index of persons and columns for each of the alternatives, plus some ‘useful columns’ [_hh_id_, _ptype_, ‘cdap_rank’, _hh_size_]

activitysim.abm.models.util.cdap.preprocess_interaction_coefficients(interaction_coefficients)#

The input cdap_interaction_coefficients.csv file has three columns:

activity

A single character activity type name (M, N, or H)

interaction_ptypes

List of ptypes in the interaction (in order of increasing ptype) Stars (***) instead of ptypes means the interaction applies to all ptypes in that size hh.

coefficient

The coefficient to apply for all hh interactions for this activity and set of ptypes

To facilitate building the spec for a given hh size, we add two additional columns:

cardinality

the number of persons in the interaction (e.g. 3 for a 3-way interaction)

slug

a human friendly coefficient name so we can dump a readable spec trace file for debugging; this slug is then replaced with the numerical coefficient value prior to evaluation

activitysim.abm.models.util.cdap.run_cdap(persons, person_type_map, cdap_indiv_spec, cdap_interaction_coefficients, cdap_fixed_relative_proportions, locals_d, chunk_size=0, trace_hh_id=None, trace_label=None)#

Choose individual activity patterns for persons.

Parameters
persons: pandas.DataFrame

Table of persons data. Must contain at least a household ID, household size, person type category, and age, plus any columns used in cdap_indiv_spec

cdap_indiv_spec: pandas.DataFrame

CDAP spec for individuals without taking any interactions into account.

cdap_interaction_coefficients: pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

cdap_fixed_relative_proportions: pandas.DataFrame

Spec for the relative proportions of each activity (M, N, H) to choose activities for additional household members not handled by CDAP

locals_d: Dict

This is a dictionary of local variables that will be the environment for an evaluation of an expression that begins with @ in either the cdap_indiv_spec or cdap_fixed_relative_proportions expression files

chunk_size: int

Chunk size or 0 for no chunking

trace_hh_id: int

hh_id to trace or None if no hh tracing

trace_label: str

label for tracing or None if no tracing

Returns
choices: pandas.DataFrame

dataframe is indexed on _persons_index_ and has two columns:

cdap_activity: str

activity for that person expressed as ‘M’, ‘N’, ‘H’

activitysim.abm.models.util.cdap.unpack_cdap_indiv_activity_choices(persons, hh_choices, trace_hh_id, trace_label)#

Unpack the household activity choice list into choices for each (non-extra) household member

Parameters
persons: pandas.DataFrame

Table of persons data indexed on _persons_index_. We expect, at least, columns [_hh_id_, ‘cdap_rank’]

hh_choices: pandas.Series

household activity pattern is encoded as a string (of length hhsize) of activity codes e.g. ‘MNHH’ for a 4 person household with activities Mandatory, NonMandatory, Home, Home

Returns
cdap_indiv_activity_choices: pandas.Series

series contains one activity per individual hh member, indexed on _persons_index_
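Unpacking can be sketched as indexing each household’s pattern string by cdap_rank. The column names and sample data below are assumptions for the example, and extra household members (cdap_rank beyond the length of the pattern string) are assumed to have been excluded beforehand.

```python
import pandas as pd

persons = pd.DataFrame(
    {"household_id": [1, 1, 1, 2], "cdap_rank": [1, 2, 3, 1]},
    index=pd.Index([11, 12, 13, 21], name="person_id"),
)
hh_choices = pd.Series({1: "MNH", 2: "H"})

# Look up each person's household pattern, then take the character at cdap_rank.
pattern = persons["household_id"].map(hh_choices)
cdap_activity = pd.Series(
    [p[rank - 1] for p, rank in zip(pattern, persons["cdap_rank"])],  # rank is 1-based
    index=persons.index,
    name="cdap_activity",
)
```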

Estimation#

See Estimation for more information.

Logsums#

activitysim.abm.models.util.logsums.compute_logsums(choosers, tour_purpose, logsum_settings, model_settings, network_los, chunk_size, chunk_tag, trace_label, in_period_col=None, out_period_col=None, duration_col=None)#
Parameters
choosers
tour_purpose
logsum_settings
model_settings
network_los
chunk_size
trace_hh_id
trace_label
Returns
logsums: pandas series

computed logsums with same index as choosers

Mode#

activitysim.abm.models.util.mode.mode_choice_simulate(choosers, spec, nest_spec, skims, locals_d, chunk_size, mode_column_name, logsum_column_name, trace_label, trace_choice_name, trace_column_names=None, estimator=None)#

common method for both tour_mode_choice and trip_mode_choice

Parameters
choosers
spec
nest_spec
skims
locals_d
chunk_size
mode_column_name
logsum_column_name
trace_label
trace_choice_name
estimator
Returns
activitysim.abm.models.util.mode.run_tour_mode_choice_simulate(choosers, tour_purpose, model_settings, mode_column_name, logsum_column_name, network_los, skims, constants, estimator, chunk_size, trace_label=None, trace_choice_name=None)#

This is a utility to run a mode choice model for each segment (usually segments are tour/trip purposes). Pass in the tours/trip that need a mode, the Skim object, the spec to evaluate with, and any additional expressions you want to use in the evaluation of variables.

Overlap#

activitysim.abm.models.util.overlap.p2p_time_window_overlap(p1_ids, p2_ids)#
Parameters
p1_ids
p2_ids
Returns
activitysim.abm.models.util.overlap.rle(a)#

Compute run lengths of values in rows of a two dimensional ndarray of ints.

We assume the first and last columns are buffer columns (because this is the case for time windows) and so don’t include them in results.

Return arrays giving row_id, start_pos, run_length, and value of each run of any length.

Parameters
a: numpy.ndarray of int shape(n, <num_time_periods_in_a_day>)

The input array would normally only have values of 0 or 1 to detect overlapping time period availability but we don’t assume this, and will detect and report runs of any values. (Might prove useful in future?…)

Returns
row_id: numpy.ndarray int shape(<num_runs>)
start_pos: numpy.ndarray int shape(<num_runs>)
run_length: numpy.ndarray int shape(<num_runs>)
run_val: numpy.ndarray int shape(<num_runs>)
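The idea of row-wise run-length encoding with buffer columns excluded can be sketched in plain Python. This is a readable loop-based illustration, not the vectorized implementation: it returns lists rather than NumPy arrays, and reporting start_pos as a column position in the original (untrimmed) row is an assumption.

```python
from itertools import groupby


def rle_rows(a):
    """Run-length encode each row of a 2-D sequence, skipping first and last columns."""
    row_id, start_pos, run_length, run_val = [], [], [], []
    for r, row in enumerate(a):
        pos = 1  # first non-buffer column
        for value, run in groupby(row[1:-1]):
            n = len(list(run))
            row_id.append(r)
            start_pos.append(pos)
            run_length.append(n)
            run_val.append(value)
            pos += n
    return row_id, start_pos, run_length, run_val


windows = [
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
row_id, start_pos, run_length, run_val = rle_rows(windows)
```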

Tour Destination#

class activitysim.abm.models.util.tour_destination.SizeTermCalculator(size_term_selector)#

convenience object to provide size_terms for a selector (e.g. non_mandatory) for various segments (e.g. tour_type or purpose); returns size terms for specified segment in df or series form

activitysim.abm.models.util.tour_destination.choose_MAZ_for_TAZ(taz_sample, MAZ_size_terms, trace_label)#

Convert taz_sample table with TAZ zone sample choices to a table with a MAZ zone chosen for each TAZ. Choose MAZ probabilistically (proportionally by size_term) from the set of MAZ zones in the parent TAZ.

Parameters
taz_sample: dataframe with duplicated index <chooser_id_col> and columns: <DEST_TAZ>, prob, pick_count
MAZ_size_terms: dataframe with duplicated index <chooser_id_col> and columns: zone_id, dest_TAZ, size_term
Returns
dataframe with duplicated index <chooser_id_col> and columns: <DEST_MAZ>, prob, pick_count
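
As a rough illustration of choosing a MAZ in proportion to its size term, here is a minimal single-TAZ sketch. The helper `choose_maz_in_taz`, the toy table, and the use of `numpy.random.default_rng` are assumptions made for the example; the actual function operates on whole sample tables at once.

```python
import numpy as np
import pandas as pd


def choose_maz_in_taz(maz_size_terms, taz, rng):
    """Pick one MAZ inside `taz` with probability proportional to size_term."""
    candidates = maz_size_terms[maz_size_terms["dest_TAZ"] == taz]
    probs = (candidates["size_term"] / candidates["size_term"].sum()).to_numpy()
    pick = rng.choice(len(candidates), p=probs)
    return int(candidates["zone_id"].iloc[pick]), float(probs[pick])


# toy size terms: MAZ 101 has zero size, so TAZ 1 can only yield MAZ 102
maz_size_terms = pd.DataFrame({
    "zone_id": [101, 102, 201, 202],
    "dest_TAZ": [1, 1, 2, 2],
    "size_term": [0.0, 5.0, 1.0, 3.0],
})
rng = np.random.default_rng(0)
maz, prob = choose_maz_in_taz(maz_size_terms, taz=1, rng=rng)
maz2, prob2 = choose_maz_in_taz(maz_size_terms, taz=2, rng=rng)
```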
activitysim.abm.models.util.tour_destination.run_destination_logsums(tour_purpose, persons_merged, destination_sample, model_settings, network_los, chunk_size, trace_label)#

add logsum column to existing tour_destination_sample table

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in destination_sample, and computing the logsum of all the utilities

person_id   dest_zone_id   rand             pick_count   logsum (added)
23750       14             0.565502716034   4            1.85659498857
23750       16             0.711135838871   6            1.92315598631
23751       12             0.408038878552   1            2.40612135416
23751       14             0.972732479292   2            1.44009018355
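
For reference, the logsum of a set of alternative utilities is the natural log of the sum of their exponentials. The sketch below is a generic, numerically stable version of that formula; the function name `logsum` is illustrative and this is not the code path ActivitySim uses to evaluate mode choice utilities.

```python
import numpy as np


def logsum(utilities):
    """log(sum(exp(u))) over the last axis, shifted by the max for stability."""
    u = np.asarray(utilities, dtype=float)
    m = u.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(u - m).sum(axis=-1, keepdims=True))).squeeze(-1)


two_equal = float(logsum([0.0, 0.0]))             # log(2)
per_row = logsum([[1000.0, 1000.0], [0.0, 0.0]])  # no overflow on large values
```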

activitysim.abm.models.util.tour_destination.run_destination_simulate(spec_segment_name, tours, persons_merged, destination_sample, want_logsums, model_settings, network_los, destination_size_terms, estimator, chunk_size, trace_label, skip_choice=False)#

run destination_simulate on tour_destination_sample annotated with mode_choice logsum to select a destination from sample alternatives

Tour Frequency#

activitysim.abm.models.util.tour_frequency.create_tours(tour_counts, tour_category, parent_col='person_id')#

This method processes the tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the tours that were generated

Parameters
tour_counts: DataFrame

Table specifying how many tours of each type to create: one row per person (or parent_tour for atwork subtours), and one (int) column per tour_type holding the number of tours to create.

tour_category: str

one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’

Returns
tours: pandas.DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

tours.tour_type - tour type (e.g. school, work, shopping, eat)
tours.tour_type_num - if there are two ‘school’ type tours, they will be numbered 1 and 2
tours.tour_type_count - number of tours of tour_type parent has (parent’s max tour_type_num)
tours.tour_num - index of tour (of any type) for parent
tours.tour_count - number of tours (of any type) for parent (parent’s max tour_num)
tours.tour_category - one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’
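
To make the numbering columns concrete, the following pandas sketch expands a small tour_counts table into one row per tour and derives tour_type_num, tour_type_count, tour_num, and tour_count. The helper `expand_tour_counts` is hypothetical and omits tour ids and tour_category; it only illustrates the column definitions listed above.

```python
import pandas as pd


def expand_tour_counts(tour_counts, parent_col="person_id"):
    """Turn a (parent x tour_type) count table into one row per tour."""
    wide = tour_counts.copy()
    wide.index.name = parent_col
    long = wide.reset_index().melt(
        id_vars=parent_col, var_name="tour_type", value_name="tour_type_count"
    )
    # keep tour types in column order within each parent, drop zero counts
    long = long.sort_values(parent_col, kind="stable")
    long = long[long["tour_type_count"] > 0]

    # repeat each row once per tour to create
    tours = long.loc[long.index.repeat(long["tour_type_count"])].reset_index(drop=True)
    tours["tour_type_num"] = tours.groupby([parent_col, "tour_type"]).cumcount() + 1
    tours["tour_num"] = tours.groupby(parent_col).cumcount() + 1
    tours["tour_count"] = tours.groupby(parent_col)["tour_num"].transform("max")
    return tours


tour_counts = pd.DataFrame({"shopping": [2, 0], "eatout": [1, 1]}, index=[1, 2])
tours = expand_tour_counts(tour_counts)
```

Person 1 ends up with three tours (two shopping, one eatout) and person 2 with a single eatout tour.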

activitysim.abm.models.util.tour_frequency.process_atwork_subtours(work_tours, atwork_subtour_frequency_alts)#

This method processes the atwork_subtour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the subtours that were generated

Parameters
work_tours: DataFrame

A DataFrame which has the parent work tour tour_id as the index and columns with person_id and atwork_subtour_frequency.

atwork_subtour_frequency_alts: DataFrame

A DataFrame which has a unique index of atwork_subtour_frequency values and frequency counts for the subtours to be generated for that choice

Returns
tours: DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

activitysim.abm.models.util.tour_frequency.process_joint_tours(joint_tour_frequency, joint_tour_frequency_alts, point_persons)#

This method processes the joint_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the joint tours that were generated

Parameters
joint_tour_frequency: pandas.Series

household joint_tour_frequency (which came out of the joint tour frequency model) indexed by household_id

joint_tour_frequency_alts: DataFrame

A DataFrame which has a unique index of joint_tour_frequency values and frequency counts for the tours to be generated for that choice

point_persons: pandas.DataFrame

table with columns for (at least) person_ids and home_zone_id indexed by household_id

Returns
tours: DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a tour identifier, a household_id column, a tour_type column, and tour_type_num and tour_num columns which are set to 1 or 2 depending on whether it is the first or second joint tour made by the household.

activitysim.abm.models.util.tour_frequency.process_mandatory_tours(persons, mandatory_tour_frequency_alts)#

This method processes the mandatory_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the mandatory tours that were generated

Parameters
persons: DataFrame

Persons is a DataFrame which has a column called mandatory_tour_frequency (which came out of the mandatory tour frequency model) and a column is_worker which indicates the person’s worker status. The only valid values of the mandatory_tour_frequency column are “work1”, “work2”, “school1”, “school2” and “work_and_school”.

Returns
tours: DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a tour identifier, a person_id column, a tour_type column which is “work” or “school” and a tour_num column which is set to 1 or 2 depending on whether it is the first or second mandatory tour made by the person. The logic for whether the work or school tour comes first given a “work_and_school” choice depends on the is_worker column: work tours come first for workers and second for non-workers.

activitysim.abm.models.util.tour_frequency.process_non_mandatory_tours(persons, tour_counts)#

This method processes the non_mandatory_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the non-mandatory tours that were generated

Parameters
persons: pandas.DataFrame

persons table containing a non_mandatory_tour_frequency column which has the index of the chosen alternative as the value

non_mandatory_tour_frequency_alts: DataFrame

A DataFrame which has a unique index that relates to the values in the series above. It typically includes columns which are named for trip purposes with values which are counts for that trip purpose. Example trip purposes include escort, shopping, othmaint, othdiscr, eatout, social, etc. A row would be an alternative which might be to take one shopping trip and zero trips of other purposes, etc.

Returns
tours: DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

activitysim.abm.models.util.tour_frequency.process_tours(tour_frequency, tour_frequency_alts, tour_category, parent_col='person_id')#

This method processes the tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the tours that were generated

Parameters
tour_frequency: Series

A series which has <parent_col> as the index and the chosen alternative index as the value

tour_frequency_alts: DataFrame

A DataFrame which has a unique index that relates to the values in the series above. It typically includes columns which are named for trip purposes with values which are counts for that trip purpose. Example trip purposes include escort, shopping, othmaint, othdiscr, eatout, social, etc. A row would be an alternative which might be to take one shopping trip and zero trips of other purposes, etc.

tour_category: str

one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’

parent_col: str

the name of the index (parent_tour_id for atwork subtours, otherwise person_id)

Returns
tours: pandas.DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

tours.tour_type - tour type (e.g. school, work, shopping, eat)
tours.tour_type_num - if there are two ‘school’ type tours, they will be numbered 1 and 2
tours.tour_type_count - number of tours of tour_type parent has (parent’s max tour_type_num)
tours.tour_num - index of tour (of any type) for parent
tours.tour_count - number of tours (of any type) for parent (parent’s max tour_num)
tours.tour_category - one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’

Trip#

activitysim.abm.models.util.trip.cleanup_failed_trips(trips)#

Drop failed trips and clean up fields in leg_mates:

trip_num - assign new ordinal trip num after failed trips are dropped
trip_count - assign new count of trips in leg, sans failed trips
first - update first flag as we may have dropped first trip (last trip can’t fail)
next_trip_id - assign id of next trip in leg after failed trips are dropped
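
The renumbering listed above might look roughly like the pandas sketch below. It is an illustration under stated assumptions, not the library code: the helper `renumber_after_failures` is hypothetical, a leg is assumed to be identified by (tour_id, outbound), trip_id is kept as an ordinary column, and next_trip_id is simply left missing for the last trip of a leg.

```python
import pandas as pd


def renumber_after_failures(trips):
    """Drop failed trips and recompute per-leg sequence fields (sketch)."""
    keys = ["tour_id", "outbound"]  # assumption: these two columns identify a leg
    kept = trips[~trips["failed"]].sort_values("trip_id").copy()
    kept["trip_num"] = kept.groupby(keys).cumcount() + 1
    kept["trip_count"] = kept.groupby(keys)["trip_num"].transform("max")
    kept["first"] = kept["trip_num"] == 1
    # id of the following trip in the same leg; missing (<NA>) for the last trip
    kept["next_trip_id"] = kept.groupby(keys)["trip_id"].shift(-1).astype("Int64")
    return kept.drop(columns="failed")


trips = pd.DataFrame({
    "trip_id": [1, 2, 3, 4, 5],
    "tour_id": [10, 10, 10, 20, 20],
    "outbound": [True] * 5,
    "failed": [True, False, False, False, False],
})
cleaned = renumber_after_failures(trips)
```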

activitysim.abm.models.util.trip.flag_failed_trip_leg_mates(trips_df, col_name)#

set boolean flag column of specified name to identify failed trip leg_mates in place

activitysim.abm.models.util.trip.generate_alternative_sizes(max_duration, max_trips)#

Builds a lookup NumPy array of pattern sizes based on the number of trips in the leg and the duration available to the leg.

Parameters
max_duration
max_trips
Returns

activitysim.abm.models.util.trip.get_time_windows(residual, level)#
Parameters
  • residual

  • level

Returns

activitysim.abm.models.util.trip.initialize_from_tours(tours, stop_frequency_alts, addtl_tour_cols_to_preserve=None)#

Instantiates a trips table based on tour-level attributes: stop frequency, tour origin, tour destination.

Vectorize Tour Scheduling#

activitysim.abm.models.util.vectorize_tour_scheduling.compute_logsums(alt_tdd, tours_merged, tour_purpose, model_settings, skims, trace_label)#

Compute logsums for the tour alt_tdds, which will differ based on their different start, stop times of day, which translate to different odt_skim out_period and in_periods.

In mtctm1, tdds are hourly, but there are only 5 skim time periods, so some of the tdd_alts will be the same once converted to skim time periods. With 5 skim time periods there are 15 unique (out-period, in-period) pairs but 190 tdd alternatives.

For efficiency, rather than compute a lot of redundant logsums, we compute logsums for the unique (out-period, in-period) pairs and then join them back to the alt_tdds.
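
The deduplicate-then-join pattern can be sketched with pandas as follows. The hourly range (5 through 23), the period breakpoints and labels, and the placeholder `expensive_logsum` function are all assumptions chosen so that the counts match the 190 alternatives and 15 period pairs mentioned above; they are not taken from the model configuration.

```python
import pandas as pd

# assumption: 19 hourly periods (5..23), every start <= end combination is an alternative
hours = range(5, 24)
alt_tdd = pd.DataFrame(
    [(s, e) for s in hours for e in hours if s <= e], columns=["start", "end"]
)

# assumption: illustrative breakpoints mapping hours to 5 skim periods
bins = [4, 5, 9, 14, 18, 23]
labels = ["EA", "AM", "MD", "PM", "EV"]
alt_tdd["out_period"] = pd.cut(alt_tdd["start"], bins=bins, labels=labels).astype(str)
alt_tdd["in_period"] = pd.cut(alt_tdd["end"], bins=bins, labels=labels).astype(str)

# compute the costly value once per unique (out_period, in_period) pair
unique_pairs = (
    alt_tdd[["out_period", "in_period"]].drop_duplicates().reset_index(drop=True)
)


def expensive_logsum(out_period, in_period):
    """Placeholder standing in for a real mode choice logsum calculation."""
    return float(labels.index(out_period) + 10 * labels.index(in_period))


unique_pairs["logsum"] = [
    expensive_logsum(o, i)
    for o, i in zip(unique_pairs["out_period"], unique_pairs["in_period"])
]

# join the per-pair results back onto every alternative
alt_tdd = alt_tdd.merge(unique_pairs, on=["out_period", "in_period"], how="left")
```

In this toy setup the expensive function runs 15 times instead of 190, and the saving grows with the number of choosers.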

activitysim.abm.models.util.vectorize_tour_scheduling.get_previous_tour_by_tourid(current_tour_window_ids, previous_tour_by_window_id, alts)#

Matches current tours with attributes of previous tours for the same person. See the return value below for more information.

Parameters
current_tour_window_ids: Series

A Series of parent ids for the tours we’re about to make the choice for - index should match the tours DataFrame.

previous_tour_by_window_id: Series

A Series where the index is the parent (window) id and the value is the index of the alternatives of the scheduling.

alts: DataFrame

The alternatives of the scheduling.

Returns
prev_alts: DataFrame

A DataFrame with an index matching the CURRENT tours we’re making a decision for, but with columns from the PREVIOUS tour of the person associated with each of the CURRENT tours. Columns listed in PREV_TOUR_COLUMNS from the alternatives will have “_previous” added as a suffix to keep differentiated from the current alternatives that will be part of the interaction.
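
A minimal pandas sketch of this lookup is shown below. The helper name `previous_tour_alts` and the choice of `start` and `end` as the carried-over columns are assumptions for illustration; the real function uses the columns listed in PREV_TOUR_COLUMNS.

```python
import pandas as pd


def previous_tour_alts(current_tour_window_ids, previous_tour_by_window_id, alts,
                       prev_cols=("start", "end")):
    """Look up each current tour's previous-tour alternative and suffix its columns."""
    # alternative chosen by the previous tour of each current tour's parent
    prev_choice = previous_tour_by_window_id.loc[current_tour_window_ids.to_numpy()]
    prev_alts = alts.loc[prev_choice.to_numpy(), list(prev_cols)]
    # re-index on the CURRENT tours and mark columns as belonging to the PREVIOUS tour
    prev_alts.index = current_tour_window_ids.index
    prev_alts.columns = [f"{c}_previous" for c in prev_alts.columns]
    return prev_alts


alts = pd.DataFrame({"start": [5, 6, 7], "end": [8, 9, 10]}, index=[0, 1, 2])
previous_tour_by_window_id = pd.Series({100: 2, 200: 0})
current_tour_window_ids = pd.Series([200, 100], index=[11, 12])
prev_alts = previous_tour_alts(current_tour_window_ids, previous_tour_by_window_id, alts)
```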

activitysim.abm.models.util.vectorize_tour_scheduling.run_alts_preprocessor(model_settings, alts, segment, locals_dict, trace_label)#

run preprocessor on alts, as specified by ALTS_PREPROCESSOR in model_settings

we are agnostic on whether alts are merged or not

Parameters
model_settings: dict

yaml model settings file as dict

alts: pandas.DataFrame

tdd_alts or tdd_alts merged with choosers (we are agnostic)

segment: string

segment selector as understood by caller (e.g. logsum_tour_purpose)

locals_dict: dict

We let the caller worry about what needs to be in it, though that actually depends on the modeler’s needs.

trace_label: string
Returns
alts: pandas.DataFrame

annotated copy of alts

activitysim.abm.models.util.vectorize_tour_scheduling.schedule_tours(tours, persons_merged, alts, spec, logsum_tour_purpose, model_settings, timetable, timetable_window_id_col, previous_tour, tour_owner_id_col, estimator, chunk_size, tour_trace_label, tour_chunk_tag, sharrow_skip=False)#

chunking wrapper for _schedule_tours

While interaction_sample_simulate provides chunking support, the merged tours, persons dataframe and the tdd_interaction_dataset are very big, so we want to create them inside the chunking loop to minimize memory footprint. So we implement the chunking loop here, and pass a chunk_size of 0 to interaction_sample_simulate to disable its chunking support.

activitysim.abm.models.util.vectorize_tour_scheduling.tdd_interaction_dataset(tours, alts, timetable, choice_column, window_id_col, trace_label)#

interaction_sample_simulate expects the alts index to be the same as the choosers index (e.g. tour_id), together with the name of the choice column in alts.

Parameters
tours: pandas.DataFrame

must have person_id column and index on tour_id

alts: pandas.DataFrame

alts index must be timetable tdd id

timetable: TimeTable object
choice_column: str

name of column to store alt index in alt_tdd DataFrame (since alt_tdd is duplicate index on person_id but unique on person_id,alt_id)

Returns
alt_tdd: pandas.DataFrame

columns: start, end, duration, <choice_column>

index: tour_id

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_joint_tour_scheduling(joint_tours, joint_tour_participants, persons_merged, alts, persons_timetable, spec, model_settings, estimator, chunk_size=0, trace_label=None, sharrow_skip=False)#

Like vectorize_tour_scheduling but specifically for joint tours

joint tours have a few peculiarities necessitating separate treatment:

Timetable has to be initialized to set all timeperiods…

Parameters
tours: DataFrame

DataFrame of tours containing tour attributes, as well as a person_id column to define the nth tour for each person.

persons_merged: DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts: DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec: DataFrame

The spec which will be passed to interaction_simulate. (or dict of specs keyed on tour_type if tour_types is not None)

model_settings: dict
Returns
choices: Series

A Series of choices where the index is the index of the tours DataFrame and the values are the index of the alts DataFrame.

persons_timetable: TimeTable

timetable updated with joint tours (caller should replace_table for it to persist)

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_subtour_scheduling(parent_tours, subtours, persons_merged, alts, spec, model_settings, estimator, chunk_size=0, trace_label=None, sharrow_skip=False)#

Like vectorize_tour_scheduling but specifically for atwork subtours

subtours have a few peculiarities necessitating separate treatment:

Timetable has to be initialized to set all timeperiods outside the parent tour footprint as unavailable, so atwork subtour timewindows are limited to the footprint of the parent work tour. The parent_tour_id column of tours is used instead of parent_id as the timetable row_id.

Parameters
parent_tours: DataFrame

parent tours of the subtours (because we need to know the tdd of the parent tour to assign_subtour_mask of timetable indexed by parent_tour id)

subtours: DataFrame

atwork subtours to schedule

persons_merged: DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts: DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec: DataFrame

The spec which will be passed to interaction_simulate. (all subtours share same spec regardless of subtour type)

model_settings: dict
chunk_size
trace_label
Returns
choices: Series

A Series of choices where the index is the index of the subtours DataFrame and the values are the index of the alts DataFrame.

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_tour_scheduling(tours, persons_merged, alts, timetable, tour_segments, tour_segment_col, model_settings, chunk_size=0, trace_label=None)#

The purpose of this method is fairly straightforward - it takes tours and schedules them into time slots. Alternatives should be specified so as to define those time slots (usually with start and end times).

schedule_tours adds variables that can be used in the spec which have to do with the previous tours per person. Every column in the alternatives table is appended with the suffix “_previous” and made available. So if your alternatives table has columns for start and end, then start_previous and end_previous will be set to the start and end of the most recent tour for a person. The first time through, start_previous and end_previous are undefined, so make sure to protect with a tour_num >= 2 in the variable computation.
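
As a small illustration of that guard, the sketch below computes a gap between the current tour's start and the previous tour's end only where tour_num >= 2. The column values and the name `gap_since_previous` are made up for the example; in practice the equivalent expression would live in the scheduling spec or a preprocessor.

```python
import numpy as np
import pandas as pd

# toy interaction rows: the first tour has no previous tour, so end_previous is undefined
df = pd.DataFrame({
    "tour_num": [1, 2],
    "start": [8, 14],
    "end_previous": [np.nan, 12],
})

# only use the *_previous columns when a previous tour exists
gap_since_previous = np.where(df["tour_num"] >= 2, df["start"] - df["end_previous"], 0)
```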

FIXME - fix docstring: tour_segments, tour_segment_col

Parameters
tours: DataFrame

DataFrame of tours containing tour attributes, as well as a person_id column to define the nth tour for each person.

persons_merged: DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts: DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec: DataFrame

The spec which will be passed to interaction_simulate. (or dict of specs keyed on tour_type if tour_types is not None)

model_settings: dict
Returns
choices: Series

A Series of choices where the index is the index of the tours DataFrame and the values are the index of the alts DataFrame.

timetable: TimeTable

persons timetable updated with tours (caller should replace_table for it to persist)

Tests#

See activitysim.abm.test and activitysim.abm.models.util.test