Models

The currently implemented example ActivitySim AB models are described below. See the example model Sub-Model Specification Files for more information.

Initialize

The initialize model isn’t really a model, but rather a few data processing steps in the data pipeline. The initialize data processing steps code variables used in downstream models, such as household and person value-of-time. This step also pre-loads the land_use, households, persons, and person_windows tables because random seeds are set differently for each step and therefore the sampling of households depends on which step they are initially loaded in.

The main interface to the initialize land use step is the initialize_landuse() function. The main interface to the initialize household step is the initialize_households() function. The main interface to the initialize tours step is the initialize_tours() function. These functions are registered as orca steps in the example Pipeline.

activitysim.abm.models.initialize.preload_injectables()

Preload bulky injectables up front - stuff that isn’t inserted into the pipeline.

Initialize LOS

The initialize LOS model isn’t really a model, but rather a series of data processing steps in the data pipeline. The initialize LOS model does two things:

  • Loads skims and caches them for later use if desired

  • Loads network LOS inputs for transit virtual path building (see Transit Virtual Path Builder) and pre-computes tap-to-tap total utilities, caching them for later use if desired

The main interface to the initialize LOS step is the initialize_los() function. The main interface to the initialize TVPB step is the initialize_tvpb() function. These functions are registered as orca steps in the example Pipeline.

activitysim.abm.models.initialize_los.initialize_los(network_los)

Currently, this step is only needed for THREE_ZONE systems in which the tap_tap_utilities are precomputed in the (presumably subsequent) initialize_tvpb step.

Adds the attribute_combinations_df table to the pipeline so that it can be used as the slicer for multiprocessing the initialize_tvpb step.

FIXME - this step is only strictly necessary when multiprocessing, but initialize_tvpb would need to be tweaked to instantiate attribute_combinations_df if the pipeline table version were not available.

activitysim.abm.models.initialize_los.initialize_tvpb(network_los, attribute_combinations, chunk_size)

Initialize STATIC tap_tap_utility cache and write mmap to disk.

Uses the pipeline attribute_combinations table created in initialize_los to determine which attribute tuples to compute utilities for.

If we are single-processing, this will be the entire set of attribute tuples required to fully populate the cache.

If we are multiprocessing, then attribute_combinations will have been sliced and we compute only a subset of the tuples (the other processes compute the rest). All processes wait until the cache is fully populated before returning, and the spokesman/locutor process writes the results.

FIXME - if we did not close this, we could avoid having to reload it from mmap when single-process?

activitysim.abm.models.initialize_los.initialize_tvpb_calc_row_size(choosers, network_los, trace_label)

rows_per_chunk calculator for initialize_tvpb

Accessibility

The accessibilities model is an aggregate model that calculates multiple origin-based accessibility measures from each origin zone to all destination zones.

The accessibility measure first multiplies an employment variable by a mode-specific decay function. The product reflects the difficulty of accessing the activities the farther (in terms of round-trip travel time) the jobs are from the location in question. The products to each destination zone are next summed over each origin zone, and the logarithm of the product mutes large differences. The decay function on the walk accessibility measure is steeper than automobile or transit. The minimum accessibility is zero.
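The calculation can be sketched as follows. The exponential decay form and the dispersion value here are illustrative assumptions; the actual functional forms and coefficients live in the accessibility expression files.

```python
import numpy as np

def accessibility(employment, round_trip_time, dispersion=0.05):
    """Aggregate accessibility of one origin zone to all destination zones.

    dispersion=0.05 is a made-up illustrative value, not an ActivitySim default.
    """
    employment = np.asarray(employment, dtype=float)
    round_trip_time = np.asarray(round_trip_time, dtype=float)
    # weight each destination's employment by a time-decay function
    weighted = employment * np.exp(-dispersion * round_trip_time)
    # sum over destinations, then log to mute large differences; floor at zero
    return max(np.log1p(weighted.sum()), 0.0)
```

A steeper decay (larger dispersion) for walk than for auto or transit makes distant jobs count for less in the walk measure.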

Level-of-service variables from three time periods are used, specifically the AM peak period (6 am to 10 am), the midday period (10 am to 3 pm), and the PM peak period (3 pm to 7 pm).

Inputs

  • Highway skims for the three periods. Each skim is expected to include a table named “TOLLTIMEDA”, which is the drive alone in-vehicle travel time for automobiles willing to pay a “value” (time-savings) toll.

  • Transit skims for the three periods. Each skim is expected to include the following tables: (i) “IVT”, in-vehicle time; (ii) “IWAIT”, initial wait time; (iii) “XWAIT”, transfer wait time; (iv) “WACC”, walk access time; (v) “WAUX”, auxiliary walk time; and, (vi) “WEGR”, walk egress time.

  • Zonal data with the following fields: (i) “TOTEMP”, total employment; (ii) “RETEMPN”, retail trade employment per the NAICS classification.

Outputs

  • taz, travel analysis zone number

  • autoPeakRetail, the accessibility by automobile during peak conditions to retail employment for this TAZ

  • autoPeakTotal, the accessibility by automobile during peak conditions to all employment

  • autoOffPeakRetail, the accessibility by automobile during off-peak conditions to retail employment

  • autoOffPeakTotal, the accessibility by automobile during off-peak conditions to all employment

  • transitPeakRetail, the accessibility by transit during peak conditions to retail employment

  • transitPeakTotal, the accessibility by transit during peak conditions to all employment

  • transitOffPeakRetail, the accessibility by transit during off-peak conditions to retail employment

  • transitOffPeakTotal, the accessibility by transit during off-peak conditions to all employment

  • nonMotorizedRetail, the accessibility by walking during all time periods to retail employment

  • nonMotorizedTotal, the accessibility by walking during all time periods to all employment

The main interface to the accessibility model is the compute_accessibility() function. This function is registered as an orca step in the example Pipeline.

Core Table: skims | Result Table: accessibility | Skims Keys: O-D, D-O

activitysim.abm.models.accessibility.compute_accessibility(accessibility, network_los, land_use, trace_od)

Compute accessibility for each zone in the land use file using expressions from accessibility_spec.

The actual results depend on the expressions in accessibility_spec, but this is initially intended to permit implementation of the mtc accessibility calculation as implemented by Accessibility.job

Compute measures of accessibility used by the automobile ownership model. The accessibility measure first multiplies an employment variable by a mode-specific decay function. The product reflects the difficulty of accessing the activities the farther (in terms of round-trip travel time) the jobs are from the location in question. The products to each destination zone are next summed over each origin zone, and the logarithm of the product mutes large differences. The decay function on the walk accessibility measure is steeper than automobile or transit. The minimum accessibility is zero.

School Location

The usual school location choice models assign a usual school location for the primary mandatory activity of each child and university student in the synthetic population. The models are composed of a set of accessibility-based parameters (including one-way distance between home and primary destination and the tour mode choice logsum - the expected maximum utility in the mode choice model which is given by the logarithm of the sum of exponentials in the denominator of the logit formula) and size terms, which describe the quantity of grade-school or university opportunities in each possible destination.

The school location model is made up of four steps:
  • sampling - selects a sample of alternative school locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative school location.

  • simulate - starts with the table created above and chooses a final school location, this time with the mode choice logsum included.

  • shadow prices - compare modeled zonal destinations to target zonal size terms and calculate updated shadow prices.

These steps are repeated until shadow pricing convergence criteria are satisfied or a max number of iterations is reached. See Shadow Pricing.
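The four steps can be sketched as a simple convergence loop. The callables and MAX_ITERATIONS here are illustrative placeholders, not the actual ActivitySim API:

```python
MAX_ITERATIONS = 10  # stand-in for the configured shadow pricing iteration limit

def iterate_location_choice(sample, add_logsums, simulate, update_shadow_prices):
    """Sketch of the sample / logsums / simulate / shadow-price loop."""
    for _ in range(MAX_ITERATIONS):
        alternatives = sample()                   # sample X zones with a simple utility
        alternatives = add_logsums(alternatives)  # attach mode choice logsums
        choices = simulate(alternatives)          # final choice, logsums included
        # compare modeled destinations to target size terms and update prices
        if update_shadow_prices(choices):
            break                                 # convergence criteria satisfied
    return choices
```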

The main interface to the model is the school_location() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: persons | Result Field: school_taz | Skims Keys: TAZ, alt_dest, AM time period, MD time period

Work Location

The usual work location choice models assign a usual work location for the primary mandatory activity of each employed person in the synthetic population. The models are composed of a set of accessibility-based parameters (including one-way distance between home and primary destination and the tour mode choice logsum - the expected maximum utility in the mode choice model which is given by the logarithm of the sum of exponentials in the denominator of the logit formula) and size terms, which describe the quantity of work opportunities in each possible destination.

The work location model is made up of four steps:
  • sample - selects a sample of alternative work locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative work location.

  • simulate - starts with the table created above and chooses a final work location, this time with the mode choice logsum included.

  • shadow prices - compare modeled zonal destinations to target zonal size terms and calculate updated shadow prices.

These steps are repeated until shadow pricing convergence criteria are satisfied or a max number of iterations is reached. See Shadow Pricing.

The main interface to the model is the workplace_location() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: persons | Result Field: workplace_taz | Skims Keys: TAZ, alt_dest, AM time period, PM time period

activitysim.abm.models.location_choice.iterate_location_choice(model_settings, persons_merged, persons, households, network_los, estimator, chunk_size, trace_hh_id, locutor, trace_label)

Iterate run_location_choice, updating shadow pricing, until convergence criteria are satisfied or max_iterations is reached.

(If use_shadow_pricing is not enabled, just iterate once.)

Parameters
model_settings : dict
persons_merged : injected table
persons : injected table
network_los : los.Network_LOS
chunk_size : int
trace_hh_id : int
locutor : bool
    whether this process is the privileged logger of shadow_pricing when multiprocessing
trace_label : str
Returns
adds choice column model_settings[‘DEST_CHOICE_COLUMN_NAME’]
adds logsum column model_settings[‘DEST_CHOICE_LOGSUM_COLUMN_NAME’] - if provided
adds annotations to persons table
activitysim.abm.models.location_choice.run_location_choice(persons_merged_df, network_los, shadow_price_calculator, want_logsums, want_sample_table, estimator, model_settings, chunk_size, trace_hh_id, trace_label)

Run the three-part location choice algorithm to generate a location choice for each chooser

Handle the various segments separately and in turn for simplicity of expression files

Parameters
persons_merged_df : pandas.DataFrame
    persons table merged with households and land_use
network_los : los.Network_LOS
shadow_price_calculator : ShadowPriceCalculator
    to get size terms
want_logsums : boolean
want_sample_table : boolean
estimator : Estimator object
model_settings : dict
chunk_size : int
trace_hh_id : int
trace_label : str
Returns
choices : pandas.DataFrame indexed by persons_merged_df.index
    ‘choice’ : location choices (zone ids)
    ‘logsum’ : float logsum of choice utilities across alternatives
    logsums optional & only returned if DEST_CHOICE_LOGSUM_COLUMN_NAME specified in model_settings
activitysim.abm.models.location_choice.run_location_logsums(segment_name, persons_merged_df, network_los, location_sample_df, model_settings, chunk_size, trace_hh_id, trace_label)

Add logsum column to existing location_sample table.

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in location_sample, and computing the logsum of all the utilities.

PERID  dest_zone_id  rand            pick_count  logsum (added)
23750  14            0.565502716034  4           1.85659498857
23750  16            0.711135838871  6           1.92315598631
23751  12            0.408038878552  1           2.40612135416
23751  14            0.972732479292  2           1.44009018355
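Numerically, the logsum is the log of the summed exponentiated mode utilities. A minimal sketch (the utilities themselves come from the mode choice expression files):

```python
import numpy as np

def mode_choice_logsum(utilities):
    """Expected maximum utility across modes: log of the summed exponentials."""
    u = np.asarray(utilities, dtype=float)
    m = u.max()  # subtract the max before exponentiating for numerical stability
    return m + np.log(np.exp(u - m).sum())
```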

activitysim.abm.models.location_choice.run_location_sample(segment_name, persons_merged, network_los, dest_size_terms, estimator, model_settings, chunk_size, trace_label)

Select a sample of alternative locations.

Logsum calculations are expensive, so we build a table of persons * all zones and then select a sample subset of potential locations.

The sample subset is generated by making multiple choices (<sample_size> number of choices), which results in a sample containing up to <sample_size> choices for each chooser (e.g. person) and a pick_count indicating how many times that choice was selected for that chooser.

person_id  dest_zone_id  rand            pick_count
23750      14            0.565502716034  4
23750      16            0.711135838871  6
…
23751      12            0.408038878552  1
23751      14            0.972732479292  2
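The pick_count mechanics can be sketched like this. The zone ids and probabilities are made up, and the real sampler works from utilities rather than fixed probabilities:

```python
import numpy as np
import pandas as pd

def sample_alternatives(zone_probs, sample_size, seed=42):
    """Draw sample_size zones with replacement, collapsing repeats into pick_count."""
    rng = np.random.default_rng(seed)
    draws = rng.choice(zone_probs.index, size=sample_size, p=zone_probs.values)
    counts = pd.Series(draws).value_counts().sort_index()
    return counts.rename_axis('dest_zone_id').rename('pick_count')

# illustrative choice probabilities for three candidate zones
probs = pd.Series([0.5, 0.3, 0.2], index=[12, 14, 16])
sample = sample_alternatives(probs, sample_size=10)
```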

activitysim.abm.models.location_choice.run_location_simulate(segment_name, persons_merged, location_sample_df, network_los, dest_size_terms, want_logsums, estimator, model_settings, chunk_size, trace_label)

Run the location model on location_sample, annotated with the mode_choice logsum, to select a dest zone from the sample alternatives.

Returns
choices : pandas.DataFrame indexed by persons_merged_df.index
    choice : location choices (zone ids)
    logsum : float logsum of choice utilities across alternatives
    logsums optional & only returned if DEST_CHOICE_LOGSUM_COLUMN_NAME specified in model_settings
activitysim.abm.models.location_choice.school_location(persons_merged, persons, households, network_los, chunk_size, trace_hh_id, locutor)

School location choice model

iterate_location_choice adds location choice column and annotations to persons table

activitysim.abm.models.location_choice.workplace_location(persons_merged, persons, households, network_los, chunk_size, trace_hh_id, locutor)

Workplace location choice model

iterate_location_choice adds location choice column and annotations to persons table

activitysim.abm.models.location_choice.write_estimation_specs(estimator, model_settings, settings_file)

write sample_spec, spec, and coefficients to estimation data bundle

Parameters
model_settings
settings_file

Shadow Pricing

The shadow pricing calculator used by work and school location choice.

activitysim.abm.tables.shadow_pricing.add_size_tables()

Inject tour_destination_size_terms tables for each model_selector (e.g. school, workplace).

Size tables are pandas dataframes with location counts for each model_selector by zone and segment (see tour_destination_size_terms).

If using shadow pricing, we scale size_table counts to the sample population (in which case they have to be created while single-process).

Scaling is problematic as it breaks household result replicability across sample sizes. It also changes the magnitude of the size terms, so if they are used as utilities in expression files, their importance will diminish relative to other utilities as the sample size decreases.

Scaling makes most sense for a full sample in conjunction with shadow pricing, where shadow prices can be adjusted iteratively to bring modelled counts into line with desired (size table) counts.
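A hypothetical sketch of the scaling (the real logic lives inside add_size_tables()):

```python
import pandas as pd

def scale_size_table(size_table, full_population, sample_population):
    """Scale zone-by-segment size counts down to the modeled sample population.

    Illustrative helper only; note this is exactly the scaling that shrinks
    size terms relative to other utilities as the sample gets smaller.
    """
    return size_table * (sample_population / full_population)

sizes = pd.DataFrame({'elementary': [100.0, 40.0], 'university': [20.0, 80.0]})
scaled = scale_size_table(sizes, full_population=1000, sample_population=100)
```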

activitysim.abm.tables.shadow_pricing.block_name(model_selector)

Return canonical block name for model_selector.

Ordinarily and ideally this would just be model_selector, but since mp_tasks saves all shared data blocks in a common dict to pass to sub-tasks, we want to be able to override the block naming convention to handle any collisions between model_selector names and skim names. Until and unless that happens, we just use the model_selector name.

Parameters
model_selector
Returns
block_name : str
    canonical block name

activitysim.abm.tables.shadow_pricing.buffers_for_shadow_pricing(shadow_pricing_info)

Allocate shared_data buffers for multiprocess shadow pricing

Allocates one buffer per model_selector. Buffer datatype and shape specified by shadow_pricing_info

Buffers are multiprocessing.Array (a RawArray protected by a multiprocessing.Lock wrapper). We don’t actually use the wrapped version, as it slows access down and doesn’t provide protection for numpy-wrapped arrays, but it does provide a convenient way to bundle a RawArray and an associated lock. (ShadowPriceCalculator uses the lock to coordinate access to the numpy-wrapped RawArray.)
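The bundling described above can be illustrated in a few lines; the shape and dtype are illustrative:

```python
import multiprocessing
import numpy as np

# multiprocessing.Array bundles a RawArray with a multiprocessing.Lock
num_zones, num_segments = 4, 2
buffer = multiprocessing.Array('d', num_zones * (num_segments + 1))

# wrap the underlying RawArray in a numpy array for convenient indexing
shared = np.frombuffer(buffer.get_obj(), dtype=np.float64)
shared = shared.reshape(num_zones, num_segments + 1)

# use the bundled lock directly rather than the slower wrapped accessors
with buffer.get_lock():
    shared[:] = 0.0
    shared[2, 1] += 1.0
```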

Parameters
shadow_pricing_info : dict
Returns
data_buffers : dict
    dict of multiprocessing.Array keyed by model_selector

activitysim.abm.tables.shadow_pricing.get_shadow_pricing_info()

Return dict with info about dtype and shapes of desired and modeled size tables.

block shape is (num_zones, num_segments + 1)

Returns
shadow_pricing_info : dict
    dtype: <sp_dtype>, block_shapes: dict {<model_selector>: <block_shape>}

activitysim.abm.tables.shadow_pricing.load_shadow_price_calculator(model_settings)

Initialize ShadowPriceCalculator for model_selector (e.g. school or workplace)

If multiprocessing, get the shared_data buffer to coordinate global_desired_size calculation across sub-processes

Parameters
model_settings : dict
Returns
spc : ShadowPriceCalculator
activitysim.abm.tables.shadow_pricing.logger = <Logger activitysim.abm.tables.shadow_pricing (WARNING)>

ShadowPriceCalculator and associated utility methods

See docstrings for documentation on:

update_shadow_prices
    how shadow_price coefficients are calculated
synchronize_choices
    interprocess communication to compute aggregate modeled_size
check_fit
    convergence criteria for shadow_price iteration

Important concepts and variables:

model_selector : str
    Identifies a specific location choice model (e.g. ‘school’, ‘workplace’). The various models work similarly, but use different expression files, model settings, etc.

segment : str
    Identifies a specific demographic segment of a model (e.g. the ‘elementary’ segment of ‘school’). Models can have different size term coefficients (in the destination_choice_size_terms file) and different utility coefficients in the model’s location and location_sample csv expression files.

size_table : pandas.DataFrame

activitysim.abm.tables.shadow_pricing.shadow_price_data_from_buffers(data_buffers, shadow_pricing_info, model_selector)
Parameters
data_buffers : dict of {<model_selector>: multiprocessing.Array}
    multiprocessing.Array is simply a convenient way to bundle Array and Lock. We extract the lock and wrap the RawArray in a numpy array for convenience in indexing. The shared data buffer has shape (<num_zones>, <num_segments> + 1); the extra column is for reverse semaphores with TALLY_CHECKIN and TALLY_CHECKOUT.
shadow_pricing_info : dict
    dict of useful info: dtype: sp_dtype, block_shapes: OrderedDict({<model_selector>: <shape tuple>}) mapping model_selector to block shape (including the extra column for semaphores), e.g. {‘school’: (num_zones, num_segments + 1)}
model_selector : str
    location type model_selector (e.g. school or workplace)
Returns
shared_data : numpy array wrapping multiprocessing.RawArray, or None (if single process)
shared_data_lock : multiprocessing.Lock, or None (if single process)

activitysim.abm.tables.shadow_pricing.size_table_name(model_selector)

Returns canonical name of injected destination desired_size table

Parameters
model_selector : str
    e.g. school or workplace
Returns
table_name : str

Auto Ownership

The auto ownership model selects a number of autos for each household in the simulation. The primary model components are household demographics, zonal density, and accessibility.

The main interface to the auto ownership model is the auto_ownership_simulate() function. This function is registered as an orca step in the example Pipeline.

Core Table: households | Result Field: auto_ownership | Skims Keys: NA

activitysim.abm.models.auto_ownership.auto_ownership_simulate(households, households_merged, chunk_size, trace_hh_id)

Auto ownership is a standard model which predicts how many cars a household with given characteristics owns

Free Parking Eligibility

The Free Parking Eligibility model predicts the availability of free parking at a person’s workplace. It is applied for people who work in zones that have parking charges, which are generally located in the Central Business Districts. The purpose of the model is to adequately reflect the cost of driving to work in subsequent models, particularly in mode choice.

The main interface to the free parking eligibility model is the free_parking() function. This function is registered as an orca step in the example Pipeline.

Core Table: persons | Result Field: free_parking_at_work | Skims Keys: NA

activitysim.abm.models.free_parking.free_parking(persons_merged, persons, chunk_size, trace_hh_id)

Coordinated Daily Activity Pattern

The Coordinated Daily Activity Pattern (CDAP) model predicts the choice of daily activity pattern (DAP) for each member of the household, simultaneously. The DAP is categorized into three types as follows:

  • Mandatory: the person engages in travel to at least one out-of-home mandatory activity - work, university, or school. The mandatory pattern may also include non-mandatory activities such as separate home-based tours or intermediate stops on mandatory tours.

  • Non-mandatory: the person engages in only maintenance and discretionary tours, which, by definition, do not contain mandatory activities.

  • Home: the person does not travel outside the home.

The CDAP model is a sequence of vectorized table operations:

  • create a person level table and rank each person in the household for inclusion in the CDAP model. Priority is given to full time workers (up to two), then to part time workers (up to two workers, of any type), then to children (youngest to oldest, up to three). Additional members up to five are randomly included for the CDAP calculation.

  • solve individual M/N/H utilities for each person

  • take as input an interaction coefficients table and then programmatically produce and write out the expression files for household size 1, 2, 3, 4, and 5 models, independent of one another

  • select households of size 1, join all required person attributes, and then read and solve the automatically generated expressions

  • repeat for households of size 2, 3, 4, and 5. Each model is independent of the others.
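The ranking step might look roughly like the following. The column names, ptype codes, and tie-breaking are assumptions, and the per-category caps (two workers, three children) are omitted for brevity:

```python
import pandas as pd

MAX_HHSIZE = 5  # only up to five household members enter the CDAP interaction terms

def rank_cdap_persons(persons, seed=0):
    """Rank persons within each household and keep the top MAX_HHSIZE.

    Hypothetical sketch: full-time workers first, then part-time workers,
    then children (youngest first), then everyone else at random.
    """
    df = persons.copy()
    priority = pd.Series(3, index=df.index)          # everyone else last
    priority[df.ptype == 'full_time'] = 0
    priority[df.ptype == 'part_time'] = 1
    priority[df.is_child] = 2
    df['rank_key'] = priority
    df = df.sample(frac=1, random_state=seed)        # random tie-break
    df = df.sort_values(['household_id', 'rank_key', 'age'], kind='mergesort')
    df['cdap_rank'] = df.groupby('household_id').cumcount() + 1
    return df[df.cdap_rank <= MAX_HHSIZE]
```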

The main interface to the CDAP model is the run_cdap() function. This function is called by the orca step cdap_simulate which is registered as an orca step in the example Pipeline. There are two cdap modules in ActivitySim: one contains the orca wrapper for running CDAP as part of the model pipeline, and the other contains the core CDAP model logic.

Core Table: persons | Result Field: cdap_activity | Skims Keys: NA

activitysim.abm.models.cdap.cdap_simulate(persons_merged, persons, households, chunk_size, trace_hh_id)

CDAP stands for Coordinated Daily Activity Pattern, which is a choice of high-level activity pattern for each person, in a coordinated way with other members of a person’s household.

Because Python requires vectorization of computation, there are some specialized routines in the cdap directory of activitysim for this purpose. This module simply applies those utilities using the simulation framework.

Mandatory Tour Frequency

The individual mandatory tour frequency model predicts the number of work and school tours taken by each person with a mandatory DAP. The primary drivers of mandatory tour frequency are demographics, accessibility-based parameters such as drive time to work, and household automobile ownership. It also creates mandatory tours in the data pipeline.

The main interface to the mandatory tour purpose frequency model is the mandatory_tour_frequency() function. This function is registered as an orca step in the example Pipeline.

Core Table: persons | Result Fields: mandatory_tour_frequency | Skims Keys: NA

activitysim.abm.models.mandatory_tour_frequency.mandatory_tour_frequency(persons_merged, chunk_size, trace_hh_id)

This model predicts the frequency of making mandatory trips (see the alternatives above) - these trips include work and school in some combination.

Mandatory Tour Scheduling

The mandatory tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each mandatory tour. The primary drivers in the model are accessibility-based parameters such as the mode choice logsum for the departure/arrival hour combination, demographics, and time pattern characteristics such as the time windows available from previously scheduled tours. This model uses Person Time Windows.

The main interface to the mandatory tour purpose scheduling model is the mandatory_tour_scheduling() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, workplace_taz, school_taz, start, end

activitysim.abm.models.mandatory_scheduling.mandatory_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)

This model predicts the departure time and duration of each activity for mandatory tours

Joint Tour Frequency

The joint tour generation models are divided into three sub-models: the joint tour frequency model, the party composition model, and the person participation model. In the joint tour frequency model, the household chooses the purposes and number (up to two) of its fully joint travel tours. It also creates joint tours in the data pipeline.

The main interface to the joint tour purpose frequency model is the joint_tour_frequency() function. This function is registered as an orca step in the example Pipeline.

Core Table: households | Result Fields: num_hh_joint_tours | Skims Keys: NA

activitysim.abm.models.joint_tour_frequency.joint_tour_frequency(households, persons, chunk_size, trace_hh_id)

This model predicts the frequency of making fully joint trips (see the alternatives above).

Joint Tour Composition

In the joint tour party composition model, the makeup of the travel party (adults, children, or mixed - adults and children) is determined for each joint tour. The party composition determines the general makeup of the party of participants in each joint tour in order to allow the micro-simulation to faithfully represent the prevalence of adult-only, children-only, and mixed joint travel tours for each purpose while permitting simplicity in the subsequent person participation model.

The main interface to the joint tour composition model is the joint_tour_composition() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Fields: composition | Skims Keys: NA

activitysim.abm.models.joint_tour_composition.joint_tour_composition(tours, households, persons, chunk_size, trace_hh_id)

This model predicts the makeup of the travel party (adults, children, or mixed).

Joint Tour Participation

In the joint tour person participation model, each eligible person sequentially makes a choice to participate or not participate in each joint tour. Since the party composition model determines what types of people are eligible to join a given tour, the person participation model can operate in an iterative fashion, with each household member choosing to join or not to join a travel party independent of the decisions of other household members. In the event that the constraints posed by the result of the party composition model are not met, the person participation model cycles through the household members multiple times until the required types of people have joined the travel party.

This step also creates the joint_tour_participants table in the pipeline, which stores the person ids for each person on the tour.
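The retry mechanism can be sketched as follows. This simplified stand-in redraws the whole tour’s party, whereas the real participants_chooser() rechooses only the tours that fail the composition test:

```python
import numpy as np

MAX_ITERATIONS = 20  # guard against a logic failure; should converge long before this

def choose_participants(participate_prob, is_adult, seed=0):
    """Redraw participation for a mixed joint tour until at least one adult
    and one child have joined. All names here are illustrative."""
    rng = np.random.default_rng(seed)
    is_adult = np.asarray(is_adult)
    for _ in range(MAX_ITERATIONS):
        # each eligible person independently chooses to join or not
        joins = rng.random(len(is_adult)) < participate_prob
        # a mixed tour needs at least one adult and one child participant
        if (joins & is_adult).any() and (joins & ~is_adult).any():
            return joins
    raise RuntimeError('no valid participation after MAX_ITERATIONS')
```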

The main interface to the joint tour participation model is the joint_tour_participation() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Fields: number_of_participants, person_id (for the point person) | Skims Keys: NA

activitysim.abm.models.joint_tour_participation.joint_tour_participation(tours, persons_merged, chunk_size, trace_hh_id)

Predicts, for each eligible person, whether to participate in each joint tour.

activitysim.abm.models.joint_tour_participation.participants_chooser(probs, choosers, spec, trace_label)

Custom alternative to logit.make_choices for simulate.simple_simulate.

Choosing participants for mixed tours is trickier than for adult or child tours because we need at least one adult and one child participant in a mixed tour. We call logit.make_choices and then check to see if the tour satisfies this requirement, and rechoose for any that fail until all are satisfied.

In principle, this should always occur eventually, but we fail after MAX_ITERATIONS, just in case there is some failure in program logic (we haven’t seen this occur).

Parameters
probs : pandas.DataFrame
    Rows for choosers and columns for the alternatives from which they are choosing. Values are expected to be valid probabilities across each row, e.g. they should sum to 1.
choosers : pandas.DataFrame
    simple_simulate choosers df
spec : pandas.DataFrame
    simple_simulate spec df. We only need spec so we can know the column index of the ‘participate’ alternative indicating that the participant has been chosen to participate in the tour.
trace_label : str
Returns
choices, rands
    choices, rands as returned by logit.make_choices (in same order as probs)

Joint Tour Destination Choice

The joint tour destination choice model operates similarly to the usual work and school location choice model, selecting the primary destination for travel tours. The only procedural difference between the models is that the usual work and school location choice model selects the usual location of an activity whether or not the activity is undertaken during the travel day, while the joint tour destination choice model selects the location for an activity which has already been generated.

The tour’s primary destination is the location of the activity that is assumed to provide the greatest impetus for engaging in the travel tour. In the household survey, the primary destination was not asked, but rather inferred from the pattern of stops in a closed loop in the respondents’ travel diaries. The inference was made by weighing multiple criteria including a defined hierarchy of purposes, the duration of activities, and the distance from the tour origin. The model operates in the reverse direction, designating the primary purpose and destination and then adding intermediate stops based on spatial, temporal, and modal characteristics of the inbound and outbound journeys to the primary destination.

The joint tour destination choice model is made up of three model steps:
  • sample - selects a sample of alternative locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative location.

  • simulate - starts with the table created above and chooses a final location, this time with the mode choice logsum included.

The main interface to the model is the joint_tour_destination() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Fields: destination | Skims Keys: TAZ, alt_dest, MD time period

activitysim.abm.models.joint_tour_destination.joint_tour_destination(tours, persons_merged, households_merged, network_los, chunk_size, trace_hh_id)

Given the tour generation from the above, each tour needs to have a destination, so in this case tours are the choosers (with the associated person that’s making the tour)

activitysim.abm.models.joint_tour_destination.run_destination_logsums(tour_purpose, persons_merged, destination_sample, model_settings, network_los, chunk_size, trace_hh_id, trace_label)

add logsum column to existing tour_destination_sample table

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in destination_sample, and computing the logsum of all the utilities

activitysim.abm.models.joint_tour_destination.run_destination_simulate(spec_segment_name, tours, persons_merged, destination_sample, want_logsums, model_settings, network_los, destination_size_terms, estimator, chunk_size, trace_label)

run destination_simulate on tour_destination_sample annotated with mode_choice logsum to select a destination from sample alternatives

Joint Tour Scheduling

The joint tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each joint tour. This model uses person Person Time Windows. The primary drivers in the models are accessibility-based parameters such as the auto travel time for the departure/arrival hour combination, demographics, and time pattern characteristics such as the time windows available from previously scheduled tours. The joint tour scheduling model does not use mode choice logsums.

The main interface to the joint tour purpose scheduling model is the joint_tour_scheduling() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, destination, MD time period, MD time period

activitysim.abm.models.joint_tour_scheduling.joint_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)

This model predicts the departure time and duration of each joint tour

Non-Mandatory Tour Frequency

The non-mandatory tour frequency model selects the number of non-mandatory tours made by each person on the simulation day. It also adds non-mandatory tours to the tours in the data pipeline. The individual non-mandatory tour frequency model operates in two stages:

  • A choice is made using a random utility model between combinations of tours containing zero, one, and two or more escort tours, and between zero and one or more tours of each other purpose.

  • Up to two additional tours of each purpose are added according to fixed extension probabilities.
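The second stage can be sketched as a draw against fixed extension probabilities (illustrative numbers, not the configured values):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical extension probabilities P(add 0, 1, or 2 extra tours);
# the actual values come from the model's probability table.
extension_probs = [0.80, 0.15, 0.05]

base_count = 1  # tours chosen in the utility-based first stage
extra = rng.choice([0, 1, 2], p=extension_probs)
total_tours = base_count + int(extra)
```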

The main interface to the non-mandatory tour purpose frequency model is the non_mandatory_tour_frequency() function. This function is registered as an orca step in the example Pipeline.

Core Table: persons | Result Fields: non_mandatory_tour_frequency | Skims Keys: NA

activitysim.abm.models.non_mandatory_tour_frequency.extend_tour_counts(persons, tour_counts, alternatives, trace_hh_id, trace_label)

extend tour counts based on a probability table

Counts can only be extended if the original count is between 1 and 4, and tours can only be extended if their count is at the maximum possible (e.g. 2 for escort, 1 otherwise), so escort tours might be increased to 3 or 4 and other tour types might be increased to 2 or 3

Parameters
persons: pandas dataframe

(need this for join columns)

tour_counts: pandas dataframe

one row per person, one column per tour_type

alternatives

alternatives from nmtv interaction_simulate; only needed to determine the max possible frequency for a tour type

trace_hh_id
trace_label
Returns
extended tour_counts
tour_counts looks like this:

           escort  shopping  othmaint  othdiscr  eatout  social
parent_id
2588676         2         0         0         1       1       0
2588677         0         1         0         1       0       0
activitysim.abm.models.non_mandatory_tour_frequency.non_mandatory_tour_frequency(persons, persons_merged, chunk_size, trace_hh_id)

This model predicts the frequency of making non-mandatory trips (alternatives for this model come from a separate csv file which is configured by the user) - these trips include escort, shopping, othmaint, othdiscr, eatout, and social trips in various combinations.

Non-Mandatory Tour Destination Choice

The non-mandatory tour destination choice model chooses a destination zone for non-mandatory tours. The three step (sample, logsums, final choice) process also used for mandatory tour destination choice is used for non-mandatory tour destination choice.

The main interface to the non-mandatory tour destination choice model is the non_mandatory_tour_destination() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Field: destination | Skims Keys: TAZ, alt_dest, MD time period, MD time period

activitysim.abm.models.non_mandatory_destination.non_mandatory_tour_destination(tours, persons_merged, network_los, chunk_size, trace_hh_id)

Given the tour generation from the above, each tour needs to have a destination, so in this case tours are the choosers (with the associated person that’s making the tour)

Non-Mandatory Tour Scheduling

The non-mandatory tour scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each non-mandatory tour. This model uses person Person Time Windows. The non-mandatory tour scheduling model does not use mode choice logsums.

The main interface to the non-mandatory tour purpose scheduling model is the non_mandatory_tour_scheduling() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: TAZ, destination, MD time period, MD time period

activitysim.abm.models.non_mandatory_scheduling.non_mandatory_tour_scheduling(tours, persons_merged, tdd_alts, chunk_size, trace_hh_id)

This model predicts the departure time and duration of each activity for non-mandatory tours

Tour Mode Choice

The mandatory, non-mandatory, and joint tour mode choice model assigns to each tour the “primary” mode that is used to get from the origin to the primary destination. The tour-based modeling approach requires a reconsideration of the conventional mode choice structure. Instead of a single mode choice model used in a four-step structure, there are two different levels where the mode choice decision is modeled: (a) the tour mode level (upper-level choice); and, (b) the trip mode level (lower-level choice conditional upon the upper-level choice).

The mandatory, non-mandatory, and joint tour mode level represents the decisions that apply to the entire tour, and that will affect the alternatives available for each individual trip or joint trip. These decisions include the choice to use a private car versus using public transit, walking, or biking; whether carpooling will be considered; and whether transit will be accessed by car or by foot. Trip-level decisions correspond to details of the exact mode used for each trip, which may or may not change over the trips in the tour.

The mandatory, non-mandatory, and joint tour mode choice structure is a nested logit model which separates similar modes into different nests to more accurately model the cross-elasticities between the alternatives. The eighteen modes are incorporated into the nesting structure specified in the model settings file. The first level of nesting represents the use a private car, non-motorized means, or transit. In the second level of nesting, the auto nest is divided into vehicle occupancy categories, and transit is divided into walk access and drive access nests. The final level splits the auto nests into free or pay alternatives and the transit nests into the specific line-haul modes.
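A hedged sketch of the nesting described above as a Python dict; the actual mode names and nesting coefficients are defined in the model settings file, so the names below are illustrative. Counting the leaves recovers the eighteen modes (6 auto + 2 non-motorized + 10 transit):

```python
# Illustrative three-level nesting structure; not the actual settings file.
nests = {
    "auto": {
        "drive_alone": ["DRIVEALONEFREE", "DRIVEALONEPAY"],
        "shared_ride_2": ["SHARED2FREE", "SHARED2PAY"],
        "shared_ride_3plus": ["SHARED3FREE", "SHARED3PAY"],
    },
    "nonmotorized": ["WALK", "BIKE"],
    "transit": {
        "walk_access": ["WALK_LOC", "WALK_LRF", "WALK_EXP", "WALK_HVY", "WALK_COM"],
        "drive_access": ["DRIVE_LOC", "DRIVE_LRF", "DRIVE_EXP", "DRIVE_HVY", "DRIVE_COM"],
    },
}

def count_modes(node):
    """Count leaf (elemental) modes in a nesting tree."""
    if isinstance(node, list):
        return len(node)
    return sum(count_modes(child) for child in node.values())
```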

The primary variables are in-vehicle time, other travel times, cost (the influence of which is derived from the automobile in-vehicle time coefficient and the persons’ modeled value of time), characteristics of the destination zone, demographics, and the household’s level of auto ownership.

The main interface to the mandatory, non-mandatory, and joint tour mode model is the tour_mode_choice_simulate() function. This function is called in the orca step tour_mode_choice_simulate and is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tours | Result Field: mode | Skims Keys: TAZ, destination, start, end

activitysim.abm.models.tour_mode_choice.logger = <Logger activitysim.abm.models.tour_mode_choice (WARNING)>

Tour mode choice is run for all tours to determine the transportation mode that will be used for the tour

activitysim.abm.models.tour_mode_choice.tour_mode_choice_simulate(tours, persons_merged, network_los, chunk_size, trace_hh_id)

Tour mode choice simulate

At-work Subtours Frequency

The at-work subtour frequency model selects the number of at-work subtours made for each work tour. It also creates at-work subtours by adding them to the tours table in the data pipeline. These at-work sub-tours are travel tours taken during the workday with their origin at the work location, rather than from home. Explanatory variables include employment status, income, auto ownership, the frequency of other tours, characteristics of the parent work tour, and characteristics of the workplace zone.

Choosers: work tours Alternatives: none, 1 eating out tour, 1 business tour, 1 maintenance tour, 2 business tours, 1 eating out tour + 1 business tour Dependent tables: household, person, accessibility Outputs: work tour subtour frequency choice, at-work tours table (with only tour origin zone at this point)

The main interface to the at-work subtours frequency model is the atwork_subtour_frequency() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: atwork_subtour_frequency | Skims Keys: NA

activitysim.abm.models.atwork_subtour_frequency.atwork_subtour_frequency(tours, persons_merged, chunk_size, trace_hh_id)

This model predicts the frequency of making at-work subtours (alternatives for this model come from a separate csv file which is configured by the user).

At-work Subtours Destination Choice

The at-work subtours destination choice model is made up of three model steps:

  • sample - selects a sample of alternative locations for the next model step. This selects X locations from the full set of model zones using a simple utility.

  • logsums - starts with the table created above and calculates and adds the mode choice logsum expression for each alternative location.

  • simulate - starts with the table created above and chooses a final location, this time with the mode choice logsum included.

Core Table: tours | Result Field: destination | Skims Keys: workplace_taz, alt_dest, MD time period

The main interface to the at-work subtour destination model is the atwork_subtour_destination() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

activitysim.abm.models.atwork_subtour_destination.atwork_subtour_destination_logsums(persons_merged, destination_sample, model_settings, network_los, chunk_size, trace_label)

add logsum column to existing atwork_subtour_destination_sample table

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in atwork_subtour_destination_sample, and computing the logsum of all the utilities

person_id  dest_zone_id  rand            pick_count  logsum (added)
23750      14            0.565502716034  4           1.85659498857
23750      16            0.711135838871  6           1.92315598631
23751      12            0.408038878552  1           2.40612135416
23751      14            0.972732479292  2           1.44009018355

activitysim.abm.models.atwork_subtour_destination.atwork_subtour_destination_simulate(subtours, persons_merged, destination_sample, want_logsums, model_settings, network_los, destination_size_terms, estimator, chunk_size, trace_label)

run the atwork_subtour_destination model on atwork_subtour_destination_sample, annotated with the mode_choice logsum, to select a destination from the sample alternatives

At-work Subtour Scheduling

The at-work subtours scheduling model selects a tour departure and duration period (and therefore a start and end period as well) for each at-work subtour. This model uses person Person Time Windows.

This model is the same as the mandatory tour scheduling model except it operates on the at-work tours and constrains the alternative set to available person Person Time Windows. The at-work subtour scheduling model does not use mode choice logsums. The at-work subtour frequency model can choose multiple tours so this model must process all first tours and then second tours since isFirstAtWorkTour is an explanatory variable.

Choosers: at-work tours Alternatives: alternative departure time and arrival back at origin time pairs WITHIN the work tour departure time and arrival time back at origin AND the person time window. If no time window is available for the tour, make the first and last time periods within the work tour available, make the choice, and log the number of times this occurs. Dependent tables: skims, person, land use, work tour Outputs: at-work tour departure time and arrival back at origin time, updated person time windows

The main interface to the at-work subtours scheduling model is the atwork_subtour_scheduling() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: start, end, duration | Skims Keys: workplace_taz, alt_dest, MD time period, MD time period

activitysim.abm.models.atwork_subtour_scheduling.atwork_subtour_scheduling(tours, persons_merged, tdd_alts, skim_dict, chunk_size, trace_hh_id)

This model predicts the departure time and duration of each activity for at-work subtours

At-work Subtour Mode

The at-work subtour mode choice model assigns a travel mode to each at-work subtour using the Tour Mode Choice model.

The main interface to the at-work subtour mode choice model is the atwork_subtour_mode_choice() function. This function is called in the orca step atwork_subtour_mode_choice and is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: tour | Result Field: tour_mode | Skims Keys: workplace_taz, destination, start, end

activitysim.abm.models.atwork_subtour_mode_choice.atwork_subtour_mode_choice(tours, persons_merged, network_los, chunk_size, trace_hh_id)

At-work subtour mode choice simulate

Intermediate Stop Frequency

The stop frequency model assigns to each tour the number of intermediate destinations a person will travel to on each leg of the tour from the origin to tour primary destination and back. The model incorporates the ability for more than one stop in each direction, up to a maximum of 3, for a total of 8 trips per tour (four on each tour leg).

Intermediate stops are not modeled for drive-transit tours because doing so can have unintended consequences because of the difficulty of tracking the location of the vehicle. For example, consider someone who used a park and ride for work and then took transit to an intermediate shopping stop on the way home. Without knowing the vehicle location, it cannot be determined if it is reasonable to allow the person to drive home. Even if the tour were constrained to allow driving only on the first and final trip, the trip home from an intermediate stop may not use the same park and ride where the car was dropped off on the outbound leg, which is usually as close as possible to home because of the impracticality of coding drive access links from every park and ride lot to every zone.

This model also creates a trips table in the pipeline for later models.

The main interface to the intermediate stop frequency model is the stop_frequency() function. This function is registered as an orca step in the example Pipeline.

Core Table: tours | Result Field: stop_frequency | Skims Keys: NA

activitysim.abm.models.stop_frequency.stop_frequency(tours, tours_merged, stop_frequency_alts, network_los, chunk_size, trace_hh_id)

stop frequency model

For each tour, choose a number of intermediate inbound stops and outbound stops. Create a trip table with inbound and outbound trips.

Thus, a tour with stop_frequency ‘2out_0in’ will have two outbound and zero inbound stops, and four corresponding trips: three outbound, and one inbound.
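The mapping from a stop_frequency alternative to trip counts can be sketched as follows (trips_from_stop_frequency is a hypothetical helper, not an ActivitySim function):

```python
import re

def trips_from_stop_frequency(alt):
    """Derive trip counts per leg from a stop_frequency alternative such as
    '2out_0in'. Each leg has one more trip than it has intermediate stops."""
    match = re.fullmatch(r"(\d)out_(\d)in", alt)
    out_stops, in_stops = int(match.group(1)), int(match.group(2))
    return out_stops + 1, in_stops + 1

outbound_trips, inbound_trips = trips_from_stop_frequency("2out_0in")
# outbound_trips == 3 (origin -> stop -> stop -> destination)
# inbound_trips == 1 (destination -> origin)
```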

Adds a stop_frequency str column to tours and creates a trips table with columns:

- person_id
- household_id
- tour_id
- primary_purpose
- atwork
- trip_num
- outbound
- trip_count

Trip Purpose

For each trip other than the last trip outbound or inbound, assign a purpose based on an observed frequency distribution. The distribution is segmented by tour purpose, tour direction, and person type. Work tours are also segmented by departure or arrival time period.

The main interface to the trip purpose model is the trip_purpose() function. This function is registered as an orca step in the example Pipeline.

Core Table: trips | Result Field: purpose | Skims Keys: NA

Note

Trip purpose and trip destination choice can be run iteratively together via Trip Purpose and Destination.

activitysim.abm.models.trip_purpose.choose_intermediate_trip_purpose(trips, probs_spec, trace_hh_id, trace_label)

choose a purpose for intermediate trips based on probs_spec, which assigns relative weights (summing to 1) to the possible purpose choices

Returns
purpose: pandas.Series of purpose (str) indexed by trip_id
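The weighted purpose draw can be sketched as a Monte Carlo selection against the cumulative weights (hypothetical weights; the real probs_spec is segmented as described above):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical relative weights for one chooser segment (must sum to 1)
weights = pd.Series({"shopping": 0.5, "eatout": 0.3, "social": 0.2})

# Draw a uniform random number and find the first purpose whose
# cumulative weight exceeds it
r = rng.random()
purpose = weights.index[int(weights.cumsum().searchsorted(r, side="right"))]
```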
activitysim.abm.models.trip_purpose.run_trip_purpose(trips_df, chunk_size, trace_hh_id, trace_label)

trip purpose - main functionality separated from model step so it can be called iteratively

Each intermediate stop on a tour (i.e. each trip other than the last trip outbound or inbound) is assigned a purpose based on an observed frequency distribution.

The distribution is segmented by tour purpose, tour direction, person type, and, optionally, trip depart time.

Returns
purpose: pandas.Series of purpose (str) indexed by trip_id
activitysim.abm.models.trip_purpose.trip_purpose(trips, chunk_size, trace_hh_id)

trip purpose model step - calls run_trip_purpose to run the actual model

adds purpose column to trips

activitysim.abm.models.trip_purpose.trip_purpose_calc_row_size(choosers, spec, trace_label)

rows_per_chunk calculator for trip_purpose

Trip Destination Choice

The trip (or stop) location choice model predicts the location of trips (or stops) along the tour other than the primary destination. The stop-location model is structured as a multinomial logit model using a zone attraction size variable and route deviation measure as impedance. The alternatives are sampled from the full set of zones, subject to availability of a zonal attraction size term. The sampling mechanism is also based on accessibility between tour origin and primary destination, and is subject to certain rules based on tour mode.

All destinations are available for auto tour modes, so long as there is a positive size term for the zone. Intermediate stops on walk tours must be within X miles of both the tour origin and primary destination zones. Intermediate stops on bike tours must be within X miles of both the tour origin and primary destination zones. Intermediate stops on walk-transit tours must either be within X miles walking distance of both the tour origin and primary destination, or have transit access to both the tour origin and primary destination. Additionally, only short and long walk zones are available destinations on walk-transit tours.

The intermediate stop location choice model works by cycling through stops on tours. The level-of-service variables (including mode choice logsums) are calculated as the additional utility between the last location and the next known location on the tour. For example, the LOS variable for the first stop on the outbound direction of the tour is based on additional impedance between the tour origin and the tour primary destination. The LOS variable for the next outbound stop is based on the additional impedance between the previous stop and the tour primary destination. Stops on return tour legs work similarly, except that the location of the first stop is a function of the additional impedance between the tour primary destination and the tour origin. The next stop location is based on the additional impedance between the first stop on the return leg and the tour origin, and so on.
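The "additional impedance" idea can be illustrated with a toy distance skim: the route deviation of a candidate stop is the extra distance incurred by inserting it between the last and next known locations on the tour (all values below are made up):

```python
import numpy as np

# Toy symmetric distance skim between four zones (illustrative values)
dist = np.array([[0., 2., 5., 4.],
                 [2., 0., 3., 6.],
                 [5., 3., 0., 2.],
                 [4., 6., 2., 0.]])

origin, primary_dest = 0, 2

# route deviation for the first outbound stop: additional impedance from
# inserting the stop between the last known location (the tour origin)
# and the next known location (the tour primary destination)
candidate_stop = 1
deviation = (dist[origin, candidate_stop]
             + dist[candidate_stop, primary_dest]
             - dist[origin, primary_dest])        # 2 + 3 - 5 = 0.0

other_stop = 3
deviation_b = (dist[origin, other_stop]
               + dist[other_stop, primary_dest]
               - dist[origin, primary_dest])      # 4 + 2 - 5 = 1.0
```

Zone 1 lies directly on the way, so it incurs no deviation, while zone 3 adds a mile of impedance; the model's utilities penalize the latter.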

The main interface to the trip destination choice model is the trip_destination() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: trips | Result Field: (trip) destination | Skims Keys: origin, (tour primary) destination, dest_taz, trip_period

Note

Trip purpose and trip destination choice can be run iteratively together via Trip Purpose and Destination.

activitysim.abm.models.trip_destination.compute_logsums(primary_purpose, trips, destination_sample, tours_merged, model_settings, skims, chunk_size, trace_label)

Calculate mode choice logsums using the same recipe as for trip_mode_choice, but do it twice for each alternative since we need out-of-direction logsums (i.e. origin to alt_dest, and alt_dest to half-tour destination)

Returns
adds od_logsum and dp_logsum columns to trips (in place)
activitysim.abm.models.trip_destination.compute_ood_logsums(choosers, logsum_settings, od_skims, locals_dict, chunk_size, trace_label)

Compute one (of two) out-of-direction logsums for destination alternatives

Will either be trip_origin -> alt_dest or alt_dest -> primary_dest

activitysim.abm.models.trip_destination.run_trip_destination(trips, tours_merged, chunk_size, trace_hh_id, trace_label, fail_some_trips_for_testing=False)

trip destination - main functionality separated from model step so it can be called iteratively

Run the trip_destination model, assigning destinations for each (intermediate) trip (last trips already have a destination - either the tour primary destination or Home)

Set trip destination and origin columns, and a boolean failed flag for any failed trips (destination for flagged failed trips will be set to -1)

Parameters
trips
tours_merged
want_sample_table
chunk_size
trace_hh_id
trace_label
activitysim.abm.models.trip_destination.trip_destination(trips, tours_merged, chunk_size, trace_hh_id)

Choose a destination for all ‘intermediate’ trips based on trip purpose.

Final trips already have a destination (the primary tour destination for outbound trips, and home for inbound trips.)

activitysim.abm.models.trip_destination.trip_destination_sample(primary_purpose, trips, alternatives, model_settings, size_term_matrix, skims, chunk_size, trace_hh_id, trace_label)
Returns
destination_sample: pandas.dataframe

choices_df from interaction_sample with (up to) sample_size alts for each chooser row; the index (non-unique) is trip_id from trips (duplicated for each alt), with columns dest_zone_id, prob, and pick_count

dest_zone_id: int

alt identifier from alternatives[<alt_col_name>]

prob: float

the probability of the chosen alternative

pick_count: int

number of duplicate picks for chooser, alt

activitysim.abm.models.trip_destination.trip_destination_simulate(primary_purpose, trips, destination_sample, model_settings, want_logsums, size_term_matrix, skims, chunk_size, trace_hh_id, trace_label)

Chose destination from destination_sample (with od_logsum and dp_logsum columns added)

Returns
choices - pandas.Series

destination alt chosen

activitysim.abm.models.trip_destination.wrap_skims(model_settings, trace_label)

wrap skims for trip destination using origin and dest column names from the model settings. Several of these are used by destination_sample, compute_logsums, and destination_simulate, so we create them all here with canonical names.

Note that compute_logsums aliases their names so it can use the same equations to compute logsums from origin to alt_dest, and from alt_dest to primary destination

- odt_skims - Skim3dWrapper: trip origin, trip alt_dest, time_of_day
- dot_skims - Skim3dWrapper: trip alt_dest, trip origin, time_of_day
- dpt_skims - Skim3dWrapper: trip alt_dest, trip primary_dest, time_of_day
- pdt_skims - Skim3dWrapper: trip primary_dest, trip alt_dest, time_of_day
- od_skims - SkimWrapper: trip origin, trip alt_dest
- dp_skims - SkimWrapper: trip alt_dest, trip primary_dest

Parameters
model_settings
Returns
dict containing skims, keyed by canonical names relative to tour orientation

Trip Purpose and Destination

After running trip purpose and trip destination separately, the two models can be run together in an iterative fashion on the remaining failed trips (i.e. trips that cannot be assigned a destination). Each iteration uses new random numbers.

The main interface to the trip purpose model is the trip_purpose_and_destination() function. This function is registered as an orca step in the example Pipeline.

Core Table: trips | Result Field: purpose, destination | Skims Keys: origin, (tour primary) destination, dest_taz, trip_period

Trip Scheduling (Probabilistic)

For each trip, assign a departure hour based on an input lookup table of percents by tour purpose, direction (inbound/outbound), tour hour, and trip index.

  • The tour hour is the tour start hour for outbound trips and the tour end hour for inbound trips. The trip index is the trip sequence on the tour, with up to four trips per half tour

  • For outbound trips, the trip depart hour must be greater than or equal to the previously selected trip depart hour

  • For inbound trips, trips are handled in reverse order from the next-to-last trip in the leg back to the first. The tour end hour serves as the anchor time point from which to start assigning trip time periods.

  • Outbound trips on at-work subtours are assigned the tour depart hour and inbound trips on at-work subtours are assigned the tour end hour.

The assignment of trip depart time is run iteratively up to a max number of iterations since it is possible that the time period selected for an earlier trip in a half-tour makes selection of a later trip time period impossible (or very low probability). Thus, the sampling is re-run until a feasible set of trip time periods is found. If a trip can’t be scheduled after the max iterations, then the trip is assigned the previous trip’s choice (i.e. assumed to happen right after the previous trip) or dropped, as configured by the user. The trip scheduling model does not use mode choice logsums.
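The iterative re-sampling described above can be sketched as follows (a simplified illustration with a hypothetical iteration cap, not ActivitySim's actual scheduling logic):

```python
import numpy as np

rng = np.random.default_rng(7)

MAX_ITERATIONS = 5            # illustrative cap on re-sampling attempts
tour_start, tour_end = 8, 12  # tour window in hours

# Re-sample departure hours for two consecutive outbound trips until the
# second trip does not depart before the first (a feasible ordering).
for _ in range(MAX_ITERATIONS):
    depart_1 = int(rng.integers(tour_start, tour_end + 1))
    depart_2 = int(rng.integers(tour_start, tour_end + 1))
    if depart_2 >= depart_1:
        break
else:
    # fallback comparable to assigning the previous trip's choice
    depart_2 = depart_1
```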

Alternatives: Available time periods in the tour window (i.e. tour start and end period). When processing stops on work tours, the available time periods are constrained by the at-work subtour start and end period as well.

The main interface to the trip scheduling model is the trip_scheduling() function. This function is registered as an orca step in the example Pipeline.

Core Table: trips | Result Field: depart | Skims Keys: NA

activitysim.abm.models.trip_scheduling.clip_probs(trips, probs, model_settings)

zero out probs before trips.earliest or after trips.latest

Parameters
trips: pd.DataFrame
probs: pd.DataFrame

one row per trip, one column per time period, with float prob of picking that time period

depart_alt_base: int

int to add to probs column index to get time period it represents. e.g. depart_alt_base = 5 means first column (column 0) represents 5 am

Returns
probs: pd.DataFrame

clipped version of probs
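The clipping behavior can be sketched with a one-trip example (illustrative values; the real clip_probs reads earliest/latest from the trips table):

```python
import numpy as np
import pandas as pd

depart_alt_base = 5   # column 0 represents the 5 am period (per the docstring)

# one row per trip, one column per time period
probs = pd.DataFrame([[0.1, 0.2, 0.3, 0.4]], index=[101])
earliest, latest = 6, 7   # allowed depart window for trip 101

periods = probs.columns.to_numpy() + depart_alt_base   # [5, 6, 7, 8]
outside = (periods < earliest) | (periods > latest)

clipped = probs.copy()
clipped.loc[:, outside] = 0.0   # zero out periods outside the window
```

Note that the surviving probabilities are not renormalized here; a trip whose entire window is clipped to zero becomes a failed trip handled by the iteration logic above.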

activitysim.abm.models.trip_scheduling.logger = <Logger activitysim.abm.models.trip_scheduling (WARNING)>

StopDepartArrivePeriodModel

StopDepartArriveProportions.csv: tourpurp, isInbound, interval, trip, p1, p2, p3, p4, p5 … p40

activitysim.abm.models.trip_scheduling.report_bad_choices(bad_row_map, df, filename, trace_label, trace_choosers=None)
Parameters
bad_row_map
df: pandas.DataFrame

utils or probs dataframe

trace_choosers: pandas.DataFrame

the choosers df (for interaction_simulate) to facilitate the reporting of hh_id because we can’t deduce hh_id from the interaction_dataset which is indexed on index values from alternatives df

activitysim.abm.models.trip_scheduling.schedule_nth_trips(trips, probs_spec, model_settings, first_trip_in_leg, report_failed_trips, trace_hh_id, trace_label)

We join each trip with the appropriate row in probs_spec by joining on probs_join_cols, which should exist in both the trips and probs_spec dataframes.

Parameters
trips: pd.DataFrame
probs_spec: pd.DataFrame

Dataframe of probs for choice of depart times and join columns to match them with trips. Depart column names are irrelevant; they are position dependent, and the time period chosen is the column index + depart_alt_base

depart_alt_base: int

int to add to probs column index to get time period it represents. e.g. depart_alt_base = 5 means first column (column 0) represents 5 am

report_failed_trips: bool
trace_hh_id
trace_label
Returns
choices: pd.Series

depart time period choices, one per trip (except for trips with zero probs)

activitysim.abm.models.trip_scheduling.schedule_trips_in_leg(outbound, trips, probs_spec, model_settings, last_iteration, trace_hh_id, trace_label)
Parameters
outbound
trips
probs_spec
depart_alt_base
last_iteration
trace_hh_id
trace_label
Returns
choices: pd.Series

depart choice for trips, indexed by trip_id

activitysim.abm.models.trip_scheduling.set_tour_hour(trips, tours)

add columns ‘tour_hour’, ‘earliest’, ‘latest’ to trips

Parameters
trips: pd.DataFrame
tours: pd.DataFrame
Returns
modifies trips in place
activitysim.abm.models.trip_scheduling.trip_scheduling(trips, tours, chunk_size, trace_hh_id)

Trip scheduling assigns depart times for trips within the start, end limits of the tour.

The algorithm is simplistic:

The first outbound trip starts at the tour start time, and subsequent outbound trips are processed in trip_num order, to ensure that subsequent trips do not depart before the trip that precedes them.

Inbound trips are handled similarly, except in reverse order, starting with the last trip, and working backwards to ensure that inbound trips do not depart after the trip that succeeds them.

The probability spec assigns probabilities for depart times, but those possible departs must be clipped to disallow depart times outside the tour limits, the departs of prior trips, and in the case of work tours, the start/end times of any atwork subtours.

Scheduling can fail if the probability table assigns zero probabilities to all the available depart times in a trip’s depart window. (This could be avoided by giving every window a small probability, rather than zero, but the existing mtctm1 prob spec does not do this. I believe this is due to its having been generated from a small household travel survey sample that lacked any departs for some time periods.)

Rescheduling the trips that fail (along with their inbound or outbound leg-mates) can sometimes fix this problem, if it was caused by an earlier trip’s depart choice blocking a subsequent trip’s ability to schedule a depart within the resulting window. But it can also happen if a tour is very short (e.g. one time period) and the prob spec has a zero probability for that tour hour.

Therefore we need to handle trips that could not be scheduled. There are (at least) two ways to solve this problem:

1) CHOOSE_MOST_INITIAL simply assigns a depart time to the trip, even if it has a zero probability. It makes most sense, in this case, to assign the ‘most initial’ depart time, so that subsequent trips are minimally impacted. This can be done in the final iteration, thus affecting only the trips that could not be scheduled by the standard approach

2) drop_and_cleanup drops trips that could not be scheduled, and adjusts their leg mates, as is done for failed trips in trip_destination.

For now we are choosing among these approaches with a manifest constant, but this could be made a model setting…
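The clipped sequential sampling described above can be sketched as follows. This is a simplified single-tour illustration, not ActivitySim's implementation; the helper name and its arguments are hypothetical.

```python
import numpy as np

def schedule_outbound(depart_probs, tour_start, tour_end, rng=None):
    """Assign depart periods to outbound trips in trip_num order.

    depart_probs: list of 1-D arrays, one per trip, giving the probability
    of departing in each time period (period 0 .. n_periods - 1).
    Returns a list of chosen periods, or None if a trip cannot be scheduled
    (all probabilities in its remaining window are zero), in which case the
    caller may retry the leg or fall back to CHOOSE_MOST_INITIAL.
    """
    rng = rng or np.random.default_rng(0)
    earliest = tour_start
    choices = []
    for i, probs in enumerate(depart_probs):
        if i == 0:
            # the first outbound trip departs at the tour start time
            choices.append(tour_start)
            continue
        # clip: disallow departs before the preceding trip's depart
        # or after the tour end
        window = probs.copy()
        window[:earliest] = 0.0
        window[tour_end + 1:] = 0.0
        total = window.sum()
        if total == 0.0:
            return None  # scheduling failed
        choices.append(int(rng.choice(len(window), p=window / total)))
        earliest = choices[-1]
    return choices
```

Inbound trips would be handled with the mirror-image logic, working backwards from the last trip and clipping departs that fall after the succeeding trip.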

Trip Scheduling Choice (Logit Choice)

This model uses a logit-based formulation to determine potential trip windows for the three main components of a tour.

  • Outbound Leg: The time window from leaving the origin location to the time of the second-to-last outbound stop.

  • Main Leg: The time window from the last outbound stop through the main tour destination to the first inbound stop.

  • Inbound Leg: The time window from the first inbound stop to the tour origin location.

Core Table: tours | Result Field: outbound_duration, main_leg_duration, inbound_duration | Skims Keys: NA

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

Trip Departure Choice (Logit Choice)

Used in conjunction with Trip Scheduling Choice (Logit Choice), this model chooses departure time periods consistent with the time windows for the appropriate leg of the trip.

Core Table: trips | Result Field: depart | Skims Keys: NA

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

Trip Mode Choice

The trip mode choice model assigns a travel mode for each trip on a given tour. It operates similarly to the tour mode choice model, but only certain trip modes are available for each tour mode. The correspondence rules are defined according to the following principles:

  • Pay trip modes are only available for pay tour modes (for example, drive-alone pay is only available at the trip mode level if drive-alone pay is selected as a tour mode).

  • The auto occupancy of the tour mode is determined by the maximum occupancy across all auto trips that make up the tour; that is, the auto occupancy for the tour mode is the maximum auto occupancy of any trip on the tour.

  • Transit tours can include auto shared-ride trips for particular legs. Therefore, ‘casual carpool’, wherein travelers share a ride to work and take transit back to the tour origin, is explicitly allowed in the tour/trip mode choice model structure.

  • The walk mode is allowed for any trip.

  • The availability of transit line-haul submodes on transit tours depends on the skimming and tour mode choice hierarchy. Free shared-ride modes are also available in walk-transit tours, albeit with a low probability. Paid shared-ride modes are not allowed on transit tours because no stated preference data is available on the sensitivity of transit riders to automobile value tolls, and no observed data is available to verify the number of people shifting into paid shared-ride trips on transit tours.
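The correspondence rules above amount to a tour-mode-to-trip-mode availability table. A minimal sketch, with illustrative mode names and an intentionally incomplete rule set (the actual correspondences live in the model's specification files):

```python
# Hypothetical tour mode -> allowed trip modes correspondence table.
# Mode names and entries are illustrative, not the example model's full rules.
ALLOWED_TRIP_MODES = {
    "DRIVEALONE": {"DRIVEALONE", "WALK"},
    "DRIVEALONEPAY": {"DRIVEALONE", "DRIVEALONEPAY", "WALK"},
    # transit tours may include shared-ride trips ("casual carpool")
    "WALK_TRANSIT": {"WALK_TRANSIT", "SHARED2FREE", "WALK"},
}

def trip_mode_available(tour_mode, trip_mode):
    # walk is allowed on any trip; otherwise consult the correspondence table
    return trip_mode == "WALK" or trip_mode in ALLOWED_TRIP_MODES.get(tour_mode, set())
```

For example, a pay trip mode is unavailable unless the tour mode is also a pay mode.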

The trip mode choice model’s explanatory variables include household and person variables, level-of-service between the trip origin and destination according to the time period for the tour leg, urban form variables, and alternative-specific constants segmented by tour mode.

The main interface to the trip mode choice model is the trip_mode_choice() function. This function is registered as an orca step in the example Pipeline. See Writing Logsums for how to write logsums for estimation.

Core Table: trips | Result Field: trip_mode | Skims Keys: origin, destination, trip_period

activitysim.abm.models.trip_mode_choice.trip_mode_choice(trips, tours_merged, network_los, chunk_size, trace_hh_id)

Trip mode choice - compute trip_mode (same values as for tour_mode) for each trip.

Modes for each primary tour purpose are calculated separately because they have different coefficient values (stored in the trip_mode_choice_coeffs.csv coefficient file).

Adds trip_mode column to trip table

Parking Location Choice

The parking location choice model selects a parking location for specified trips. While the model does not require that parking locations be assigned to any specific set of trips, it is usually applied to drive trips to specific zones (e.g., the CBD) in the model.

The model provides a filter for both the eligible choosers and the eligible parking location zones. The trips dataframe is the chooser table for this model. The zone selection filter is applied to the land use zones dataframe.

If this model is specified in the pipeline, the Write Trip Matrices step will use the parking location choice results to build trip tables in lieu of the trip destination.
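The two filters can be sketched with pandas. The column names below (trip_mode, parking_zone) are illustrative assumptions, not the example model's actual settings values:

```python
import pandas as pd

# hypothetical chooser (trips) and alternatives (land use) tables
trips = pd.DataFrame({
    "trip_id": [1, 2, 3],
    "trip_mode": ["DRIVEALONE", "WALK", "DRIVEALONE"],
    "destination": [10, 10, 20],
})
land_use = pd.DataFrame({"zone_id": [10, 20], "parking_zone": [True, False]})

# CHOOSER_FILTER_COLUMN_NAME analogue: only drive trips choose a parking location
choosers = trips[trips["trip_mode"] == "DRIVEALONE"]

# ALTERNATIVE_FILTER_COLUMN_NAME analogue: only flagged zones are alternatives
alternatives = land_use[land_use["parking_zone"]]
```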

The main interface to the parking location choice model is the parking_location_choice() function. This function is registered as an orca step, and it is available from the pipeline. See Writing Logsums for how to write logsums for estimation.

Skims

  • odt_skims: Origin to Destination by Time of Day

  • dot_skims: Destination to Origin by Time of Day

  • opt_skims: Origin to Parking Zone by Time of Day

  • pdt_skims: Parking Zone to Destination by Time of Day

  • od_skims: Origin to Destination

  • do_skims: Destination to Origin

  • op_skims: Origin to Parking Zone

  • pd_skims: Parking Zone to Destination

Core Table: trips

Required YAML attributes:

  • SPECIFICATION

    This file defines the logit specification for each chooser segment.

  • COEFFICIENTS

    Specification coefficients

  • PREPROCESSOR:

    Preprocessor definitions to run on the chooser dataframe (trips) before the model is run

  • CHOOSER_FILTER_COLUMN_NAME

    Boolean field on the chooser table defining which choosers are eligible for the parking location choice model. If no filter is specified, all choosers (trips) are eligible for the model.

  • CHOOSER_SEGMENT_COLUMN_NAME

    Column on the chooser table defining the parking segment for the logit model

  • SEGMENTS

    List of eligible chooser segments in the logit specification

  • ALTERNATIVE_FILTER_COLUMN_NAME

    Boolean field used to filter land use zones as eligible parking location choices. If no filter is specified, then all land use zones are considered as viable choices.

  • ALT_DEST_COL_NAME

    The column name to append with the parking location choice results. For choosers (trips) ineligible for this model, a -1 value will be placed in the column.

  • TRIP_ORIGIN

    Origin field on the chooser trip table

  • TRIP_DESTINATION

    Destination field on the chooser trip table

activitysim.abm.models.parking_location_choice.parking_destination_simulate(segment_name, trips, destination_sample, model_settings, skims, chunk_size, trace_hh_id, trace_label)

Choose destination from destination_sample (with od_logsum and dp_logsum columns added)

Returns
choices - pandas.Series

destination alt chosen

activitysim.abm.models.parking_location_choice.parking_location(trips, trips_merged, land_use, network_los, chunk_size, trace_hh_id)

Given a set of trips, each trip needs to have a parking location if it is eligible for remote parking.

activitysim.abm.models.parking_location_choice.wrap_skims(model_settings)

wrap skims of trip destination using origin, dest column names from model settings. Several of these are used by destination_sample, compute_logsums, and destination_simulate, so we create them all here with canonical names.

Note that compute_logsums aliases their names so it can use the same equations to compute logsums from origin to alt_dest, and from alt_dest to primary destination

odt_skims - SkimStackWrapper: trip origin, trip alt_dest, time_of_day
dot_skims - SkimStackWrapper: trip alt_dest, trip origin, time_of_day
dpt_skims - SkimStackWrapper: trip alt_dest, trip primary_dest, time_of_day
pdt_skims - SkimStackWrapper: trip primary_dest, trip alt_dest, time_of_day
od_skims - SkimDictWrapper: trip origin, trip alt_dest
dp_skims - SkimDictWrapper: trip alt_dest, trip primary_dest

Parameters
model_settings
Returns
dict containing skims, keyed by canonical names relative to tour orientation

Write Trip Matrices

Write open matrix (OMX) trip matrices for assignment. This step reads the trips table, post preprocessor, and runs expressions to code additional data fields, with one data field for each matrix specified. The matrices are scaled by a household level expansion factor, which is the household sample rate by default, calculated when households are read in at the beginning of a model run. The main interface to write trip matrices is the write_trip_matrices() function. This function is registered as an orca step in the example Pipeline.

If the Parking Location Choice model is defined in the pipeline, the parking location zone will be used in lieu of the destination zone.

Core Table: trips | Result: omx trip matrices | Skims Keys: origin, destination
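The expansion scaling and OD aggregation can be sketched in pandas. Column names are assumptions for illustration, and the sketch assumes the expansion factor is the inverse of the household sample rate:

```python
import pandas as pd

# hypothetical trips table and per-household sample rates
trips = pd.DataFrame({
    "origin": [1, 1, 2],
    "destination": [2, 2, 1],
    "household_id": [100, 100, 200],
})
sample_rate = pd.Series({100: 0.5, 200: 1.0})

# scale each trip by its household expansion factor (1 / sample rate)
trips["expansion"] = 1.0 / trips["household_id"].map(sample_rate)

# aggregate expanded trips into an origin-destination matrix
od = trips.groupby(["origin", "destination"])["expansion"].sum()
```

Each OD cell of the resulting series would then be written as one cell of an OMX table.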

activitysim.abm.models.trip_matrices.annotate_trips(trips, network_los, model_settings)

Add columns to local trips table. The annotator has access to the origin/destination skims and everything defined in the model settings CONSTANTS.

Pipeline tables can also be accessed by listing them under TABLES in the preprocessor settings.

activitysim.abm.models.trip_matrices.write_matrices(aggregate_trips, zone_index, orig_index, dest_index, model_settings, is_tap=False)

Write aggregated trips to OMX format.

The MATRICES setting lists the new OMX files to write. Each file can contain any number of ‘tables’, each specified by a table key (‘name’) and a trips table column (‘data_field’) to use for aggregated counts.

Any data type may be used for columns added in the annotation phase, but the table ‘data_field’s must be summable types: ints, floats, bools.

activitysim.abm.models.trip_matrices.write_trip_matrices(trips, network_los)

Write trip matrices step.

Adds boolean columns to local trips table via annotation expressions, then aggregates trip counts and writes OD matrices to OMX. Save annotated trips table to pipeline if desired.

Writes taz trip tables for one- and two-zone systems. Writes taz and tap trip tables for three-zone systems. Add is_tap: True to the settings file to identify an output matrix as tap-level trips as opposed to taz-level trips.

For a one-zone system, uses the land use table for the set of possible tazs. For a two-zone system, uses the taz skim zone names for the set of possible tazs. For a three-zone system, uses the taz skim zone names for the set of possible tazs and the tap skim zone names for the set of possible taps.

Util

Additional helper classes

CDAP

activitysim.abm.models.util.cdap.add_interaction_column(choosers, p_tup)

Add an interaction column in place to choosers, listing the ptypes of the persons in p_tup

The name of the interaction column will be determined by the cdap_ranks from p_tup, and the rows in the column contain the ptypes of those persons in that household row.

For instance, for p_tup = (1, 3) the choosers interaction column name will be ‘p1_p3’

For a household where person 1 is a part-time worker (ptype 2) and person 3 is an infant (ptype 8), the corresponding row value interaction code will be 28

We take advantage of the fact that interactions are symmetrical to simplify spec expressions: we name the interaction column in increasing pnum (cdap_rank) order (p1_p3, not p3_p1), and we format row values in increasing ptype order (28, not 82). This simplifies the spec expressions, as we don’t have to test for p1_p3 == 28 | p1_p3 == 82
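The coding scheme above can be sketched in pandas (a simplified illustration, not the library implementation):

```python
import pandas as pd

# hypothetical choosers table with one ptype column per cdap_rank person
choosers = pd.DataFrame({
    "ptype_p1": [2, 4],
    "ptype_p3": [8, 1],
})

p_tup = (1, 3)
# column name from the cdap_ranks, in increasing pnum order: 'p1_p3'
col_name = "_".join(f"p{p}" for p in p_tup)
pair = choosers[[f"ptype_p{p}" for p in p_tup]]
# row values in increasing ptype order (28, not 82)
choosers[col_name] = pair.min(axis=1).astype(str) + pair.max(axis=1).astype(str)
```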

Parameters
choosers : pandas.DataFrame

household choosers, indexed on _hh_index_; choosers should contain columns ptype_p1, ptype_p2 for each cdap_rank person in the hh

p_tup : tuple of int

tuple specifying the cdap_ranks for the interaction column; p_tup = (1, 3) means persons with cdap_rank 1 and 3

activitysim.abm.models.util.cdap.add_pn(col, pnum)

return the canonical column name for the indiv_util column or columns in merged hh_chooser df for individual with cdap_rank pnum

e.g. M_p1, ptype_p2 but leave _hh_id_ column unchanged

activitysim.abm.models.util.cdap.assign_cdap_rank(persons, person_type_map, trace_hh_id=None, trace_label=None)

Assign an integer index, cdap_rank, to each household member. (Starting with 1, not 0)

Modifies persons df in place

The cdap_rank order is important, because cdap only assigns activities to the first MAX_HHSIZE persons in each household.

This will preferentially be two working adults and the three youngest children.

Rank is assigned starting at 1. This necessitates some care in indexing, but is preferred as it follows the convention of 1-based pnums in expression files.

According to the documentation of reOrderPersonsForCdap in mtctm2.abm.ctramp HouseholdCoordinatedDailyActivityPatternModel:

“Method reorders the persons in the household for use with the CDAP model, which only explicitly models the interaction of five persons in a HH. Priority in the reordering is first given to full time workers (up to two), then to part time workers (up to two workers, of any type), then to children (youngest to oldest, up to three). If the method is called for a household with less than 5 people, the cdapPersonArray is the same as the person array.”

We diverge from the above description in that a cdap_rank is assigned to all persons, including ‘extra’ household members, whose activity is assigned subsequently. The pair _hh_id_, cdap_rank will uniquely identify each household member.
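The ranking priority can be sketched as follows. The ptype codes used here (1 = full-time worker, 2 = part-time worker, 8 = child) are assumptions for illustration, and this is a simplification of the actual rules (e.g. the caps on workers are omitted):

```python
import pandas as pd

# hypothetical persons table for one household
persons = pd.DataFrame({
    "person_id": [1, 2, 3, 4],
    "ptype": [2, 1, 8, 8],   # assumed codes: 1=full-time, 2=part-time, 8=child
    "age": [40, 42, 3, 9],
})

# full-time workers first, then part-time workers, then children youngest-first
persons["priority"] = persons["ptype"].map({1: 0, 2: 1}).fillna(2)
persons = persons.sort_values(["priority", "age"])

# cdap_rank is 1-based, matching the pnum convention in expression files
persons["cdap_rank"] = range(1, len(persons) + 1)
```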

Parameters
persons : pandas.DataFrame

Table of persons data. Must contain columns _hh_size_, _hh_id_, _ptype_, _age_

Returns
cdap_rank : pandas.Series

integer cdap_rank of every person, indexed on _persons_index_

activitysim.abm.models.util.cdap.build_cdap_spec(interaction_coefficients, hhsize, trace_spec=False, trace_label=None, cache=True)

Build a spec file for computing utilities of alternative household member interaction patterns for households of specified size.

We generate this spec automatically from a table of rules and coefficients because the interaction rules are fairly simple and can be expressed compactly, whereas there is a lot of redundancy between the spec files for different household sizes, as well as in the vectorized expression of the interaction alternatives within the spec file itself.

interaction_coefficients has five columns:
activity

A single character activity type name (M, N, or H)

interaction_ptypes

List of ptypes in the interaction (in order of increasing ptype) or empty for wildcards (meaning that the interaction applies to all ptypes in that size hh)

cardinality

the number of persons in the interaction (e.g. 3 for a 3-way interaction)

slug

a human friendly, efficient name so we can dump a readable spec trace file for debugging. This slug is replaced with the numerical coefficient value after we dump the trace file

coefficient

The coefficient to apply for all hh interactions for this activity and set of ptypes

The generated spec will have the eval expression in the index, and a utility column for each alternative (e.g. [‘HH’, ‘HM’, ‘HN’, ‘MH’, ‘MM’, ‘MN’, ‘NH’, ‘NM’, ‘NN’] for hhsize 2)

In order to be able to dump the spec in a human-friendly fashion to facilitate debugging the cdap_interaction_coefficients table, we first populate utility columns in the spec file with the coefficient slugs, dump the spec file, and then replace the slugs with coefficients.
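The alternative columns for a given hhsize are the Cartesian product of the three activity codes, as in the hhsize-2 example above. A minimal sketch (the helper name is hypothetical):

```python
from itertools import product

def cdap_alternatives(hhsize):
    # one alternative per combination of activities (H, M, N) across hh members,
    # e.g. 'HH', 'HM', ..., 'NN' for hhsize 2
    return ["".join(p) for p in product("HMN", repeat=hhsize)]
```

The spec for hhsize n therefore has 3**n utility columns.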

Parameters
interaction_coefficients : pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

hhsize : int

household size for which the spec should be built.

Returns
spec: pandas.DataFrame
activitysim.abm.models.util.cdap.extra_hh_member_choices(persons, cdap_fixed_relative_proportions, locals_d, trace_hh_id, trace_label)

Generate the activity choices for the ‘extra’ household members who weren’t handled by cdap

Following the CTRAMP HouseholdCoordinatedDailyActivityPatternModel, “a separate, simple cross-sectional distribution is looked up for the remaining household members”

The cdap_fixed_relative_proportions spec is handled like an activitysim logit utility spec, EXCEPT that the values computed are relative proportions, not utilities (i.e. values are not exponentiated before being normalized to probabilities summing to 1.0)
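The distinction between the two treatments can be shown side by side (a minimal sketch; the function names are illustrative):

```python
import numpy as np

def logit_probs(utils):
    # standard logit: exponentiate utilities, then normalize
    e = np.exp(np.asarray(utils, dtype=float))
    return e / e.sum()

def relative_proportion_probs(values):
    # fixed relative proportions: normalize directly, no exponentiation
    values = np.asarray(values, dtype=float)
    return values / values.sum()
```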

Parameters
persons : pandas.DataFrame

Table of persons data indexed on _persons_index_. We expect, at least, columns [_hh_id_, _ptype_]

cdap_fixed_relative_proportions : pandas.DataFrame

spec to compute/specify the relative proportions of each activity (M, N, H) that should be used to choose activities for additional household members not handled by CDAP

locals_d : dict

dictionary of local variables that eval_variables adds to the environment for an evaluation of an expression that begins with @

Returns
choices : pandas.Series

list of alternatives chosen for all extra members, indexed by _persons_index_

activitysim.abm.models.util.cdap.hh_choosers(indiv_utils, hhsize)

Build a chooser table for calculating house utilities for all households of specified hhsize

The choosers table will have one row per household, with columns containing the indiv_utils for all non-extra (i.e. cdap_rank <= MAX_HHSIZE) persons. That makes 3 columns for each individual, e.g. the utilities of the person with cdap_rank 1 will be included as M_p1, N_p1, H_p1

The chooser table will also contain interaction columns for all possible interactions involving from 2 to 3 persons (actually MAX_INTERACTION_CARDINALITY, which is currently 3).

The interaction columns list the ptypes of the persons in the interaction set, sorted by ptype. For instance, the interaction between persons with cdap_rank 1 and 3 will be listed in a column named ‘p1_p3’, and for a household where persons p1 and p3 have ptypes 2 and 4, the p1_p3 column will have a row value of 24.

Parameters
indiv_utils : pandas.DataFrame

CDAP utilities for each individual, ignoring interactions. indiv_utils has an index of _persons_index_ and a column for each alternative, i.e. three columns ‘M’ (Mandatory), ‘N’ (NonMandatory), ‘H’ (Home)

hhsize : int

household size for which the choosers table should be built. Households with more than MAX_HHSIZE members will be included with MAX_HHSIZE choosers, since they are handled the same, and the activities of the extra members are assigned afterwards

Returns
choosers : pandas.DataFrame

households of hhsize, with activity utility columns and interaction columns for all (non-extra) household members

activitysim.abm.models.util.cdap.household_activity_choices(indiv_utils, interaction_coefficients, hhsize, trace_hh_id=None, trace_label=None)

Calculate household utilities for each activity pattern alternative for households of hhsize. The resulting activity pattern for each household will be coded as a string of activity codes, e.g. ‘MNHH’ for a 4 person household with activities Mandatory, NonMandatory, Home, Home

Parameters
indiv_utils : pandas.DataFrame

CDAP utilities for each individual, ignoring interactions. indiv_utils has an index of _persons_index_ and a column for each alternative, i.e. three columns ‘M’ (Mandatory), ‘N’ (NonMandatory), ‘H’ (Home)

interaction_coefficients : pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

hhsize : int

the size of household for which the activity pattern should be calculated (1..MAX_HHSIZE)

Returns
choices : pandas.Series

the chosen cdap activity pattern for each household, represented as a string (e.g. ‘MNH’) with the same index (_hh_index_) as utils

activitysim.abm.models.util.cdap.individual_utilities(persons, cdap_indiv_spec, locals_d, trace_hh_id=None, trace_label=None)

Calculate CDAP utilities for all individuals.

Parameters
persons : pandas.DataFrame

DataFrame of individual persons data.

cdap_indiv_spec : pandas.DataFrame

CDAP spec applied to individuals.

Returns
utilities : pandas.DataFrame

Will have an index of persons and columns for each of the alternatives, plus some ‘useful columns’ [_hh_id_, _ptype_, ‘cdap_rank’, _hh_size_]

activitysim.abm.models.util.cdap.preprocess_interaction_coefficients(interaction_coefficients)

The input cdap_interaction_coefficients.csv file has three columns:

activity

A single character activity type name (M, N, or H)

interaction_ptypes

List of ptypes in the interaction (in order of increasing ptype) Stars (***) instead of ptypes means the interaction applies to all ptypes in that size hh.

coefficient

The coefficient to apply for all hh interactions for this activity and set of ptypes

To facilitate building the spec for a given hh size, we add two additional columns:

cardinality

the number of persons in the interaction (e.g. 3 for a 3-way interaction)

slug

a human friendly, efficient name so we can dump a readable spec trace file for debugging. This slug is then replaced with the numerical coefficient value prior to evaluation

activitysim.abm.models.util.cdap.run_cdap(persons, person_type_map, cdap_indiv_spec, cdap_interaction_coefficients, cdap_fixed_relative_proportions, locals_d, chunk_size=0, trace_hh_id=None, trace_label=None)

Choose individual activity patterns for persons.

Parameters
persons : pandas.DataFrame

Table of persons data. Must contain at least a household ID, household size, person type category, and age, plus any columns used in cdap_indiv_spec

cdap_indiv_spec : pandas.DataFrame

CDAP spec for individuals without taking any interactions into account.

cdap_interaction_coefficients : pandas.DataFrame

Rules and coefficients for generating interaction specs for different household sizes

cdap_fixed_relative_proportions : pandas.DataFrame

Spec for the relative proportions of each activity (M, N, H) used to choose activities for additional household members not handled by CDAP

locals_d : dict

This is a dictionary of local variables that will be the environment for an evaluation of an expression that begins with @ in either the cdap_indiv_spec or cdap_fixed_relative_proportions expression files

chunk_size : int

Chunk size or 0 for no chunking

trace_hh_id : int

hh_id to trace or None if no hh tracing

trace_label : str

label for tracing or None if no tracing

Returns
choices : pandas.DataFrame

dataframe is indexed on _persons_index_ and has two columns:

cdap_activity : str

activity for that person expressed as ‘M’, ‘N’, ‘H’

activitysim.abm.models.util.cdap.unpack_cdap_indiv_activity_choices(persons, hh_choices, trace_hh_id, trace_label)

Unpack the household activity choice list into choices for each (non-extra) household member

Parameters
persons : pandas.DataFrame

Table of persons data indexed on _persons_index_. We expect, at least, columns [_hh_id_, ‘cdap_rank’]

hh_choices : pandas.Series

household activity pattern, encoded as a string (of length hhsize) of activity codes, e.g. ‘MNHH’ for a 4 person household with activities Mandatory, NonMandatory, Home, Home

Returns
cdap_indiv_activity_choices : pandas.Series

series contains one activity per individual hh member, indexed on _persons_index_

Estimation

See Estimation for more information.

Logsums

activitysim.abm.models.util.logsums.compute_logsums(choosers, tour_purpose, logsum_settings, model_settings, network_los, chunk_size, trace_label)
Parameters
choosers
tour_purpose
logsum_settings
model_settings
network_los
chunk_size
trace_hh_id
trace_label
Returns
logsums : pandas.Series

computed logsums with same index as choosers

Mode

activitysim.abm.models.util.mode.mode_choice_simulate(choosers, spec, nest_spec, skims, locals_d, chunk_size, mode_column_name, logsum_column_name, trace_label, trace_choice_name, trace_column_names=None, estimator=None)

common method for both tour_mode_choice and trip_mode_choice

Parameters
choosers
spec
nest_spec
skims
locals_d
chunk_size
mode_column_name
logsum_column_name
trace_label
trace_choice_name
estimator
activitysim.abm.models.util.mode.run_tour_mode_choice_simulate(choosers, tour_purpose, model_settings, mode_column_name, logsum_column_name, network_los, skims, constants, estimator, chunk_size, trace_label=None, trace_choice_name=None)

This is a utility to run a mode choice model for each segment (usually segments are tour/trip purposes). Pass in the tours/trip that need a mode, the Skim object, the spec to evaluate with, and any additional expressions you want to use in the evaluation of variables.

Overlap

activitysim.abm.models.util.overlap.p2p_time_window_overlap(p1_ids, p2_ids)
Parameters
p1_ids
p2_ids
activitysim.abm.models.util.overlap.rle(a)

Compute run lengths of values in rows of a two dimensional ndarray of ints.

We assume the first and last columns are buffer columns (because this is the case for time windows) and so don’t include them in results.

Return arrays giving row_id, start_pos, run_length, and value of each run of any length.

Parameters
a : numpy.ndarray of int, shape (n, <num_time_periods_in_a_day>)

The input array would normally only have values of 0 or 1 to detect overlapping time period availability but we don’t assume this, and will detect and report runs of any values. (Might prove useful in future?…)

Returns
row_id : numpy.ndarray of int, shape (<num_runs>)
start_pos : numpy.ndarray of int, shape (<num_runs>)
run_length : numpy.ndarray of int, shape (<num_runs>)
run_val : numpy.ndarray of int, shape (<num_runs>)
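A minimal sketch of this run-length computation, dropping the first and last (buffer) columns as the docstring specifies. This is a plain-loop illustration, not the vectorized implementation:

```python
import numpy as np

def rle_rows(a):
    """Run lengths per row of a 2-D int array, ignoring buffer columns."""
    a = np.asarray(a)[:, 1:-1]  # drop the first and last (buffer) columns
    row_ids, starts, lengths, vals = [], [], [], []
    for r, row in enumerate(a):
        pos = 0
        while pos < len(row):
            end = pos
            # extend the run while the value repeats
            while end < len(row) and row[end] == row[pos]:
                end += 1
            row_ids.append(r)
            starts.append(pos)
            lengths.append(end - pos)
            vals.append(int(row[pos]))
            pos = end
    return (np.array(row_ids), np.array(starts),
            np.array(lengths), np.array(vals))
```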

Tour Destination

class activitysim.abm.models.util.tour_destination.SizeTermCalculator(size_term_selector)

convenience object to provide size_terms for a selector (e.g. non_mandatory) for various segments (e.g. tour_type or purpose); returns size terms for the specified segment in df or series form

activitysim.abm.models.util.tour_destination.run_destination_logsums(tour_purpose, persons_merged, destination_sample, model_settings, network_los, chunk_size, trace_hh_id, trace_label)

add logsum column to existing tour_destination_sample table

logsum is calculated by running the mode_choice model for each sample (person, dest_zone_id) pair in destination_sample, and computing the logsum of all the utilities

activitysim.abm.models.util.tour_destination.run_destination_simulate(spec_segment_name, tours, persons_merged, destination_sample, want_logsums, model_settings, network_los, destination_size_terms, estimator, chunk_size, trace_label)

run destination_simulate on tour_destination_sample annotated with mode_choice logsum to select a destination from sample alternatives

Tour Frequency

activitysim.abm.models.util.tour_frequency.canonical_tours()

create labels for every possible tour by combining tour_type and tour_num.

Returns
list of canonical tour labels in alphabetical order
activitysim.abm.models.util.tour_frequency.create_tours(tour_counts, tour_category, parent_col='person_id')

This method processes the tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the tours that were generated

Parameters
tour_counts : DataFrame

table specifying how many tours of each type to create; one row per person (or parent_tour for atwork subtours), one (int) column per tour_type, with the number of tours to create

tour_category : str

one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’

Returns
tours : pandas.DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

tours.tour_type - tour type (e.g. school, work, shopping, eat)
tours.tour_type_num - if there are two ‘school’ type tours, they will be numbered 1 and 2
tours.tour_type_count - number of tours of tour_type the parent has (parent’s max tour_type_num)
tours.tour_num - index of tour (of any type) for the parent
tours.tour_count - number of tours (of any type) for the parent (parent’s max tour_num)
tours.tour_category - one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’
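Canonical tour labels combine tour_type and tour_type_num as described above. A hypothetical sketch (the helper and its argument are illustrative, not the actual canonical_tours() signature):

```python
from itertools import chain

def canonical_labels(tour_type_counts):
    """tour_type_counts: dict mapping tour_type -> max number of such tours."""
    labels = chain.from_iterable(
        (f"{tour_type}{n}" for n in range(1, count + 1))
        for tour_type, count in tour_type_counts.items()
    )
    # alphabetical order, as canonical_tours() returns
    return sorted(labels)
```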

activitysim.abm.models.util.tour_frequency.process_atwork_subtours(work_tours, atwork_subtour_frequency_alts)

This method processes the atwork_subtour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the subtours that were generated

Parameters
work_tours : DataFrame

A DataFrame which has the parent work tour tour_id as the index and columns with person_id and atwork_subtour_frequency.

atwork_subtour_frequency_alts : DataFrame

A DataFrame with a unique index of atwork_subtour_frequency values and frequency counts for the subtours to be generated for that choice

Returns
tours : DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

activitysim.abm.models.util.tour_frequency.process_joint_tours(joint_tour_frequency, joint_tour_frequency_alts, point_persons)

This method processes the joint_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the joint tours that were generated

Parameters
joint_tour_frequency : pandas.Series

household joint_tour_frequency (which came out of the joint tour frequency model) indexed by household_id

joint_tour_frequency_alts : DataFrame

A DataFrame with a unique index of joint_tour_frequency values and frequency counts for the tours to be generated for that choice

point_persons : pandas.DataFrame

table with columns for (at least) person_ids and home_zone_id, indexed by household_id

Returns
tours : DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a tour identifier, a household_id column, a tour_type column, and tour_type_num and tour_num columns which are set to 1 or 2 depending on whether it is the first or second joint tour made by the household.

activitysim.abm.models.util.tour_frequency.process_mandatory_tours(persons, mandatory_tour_frequency_alts)

This method processes the mandatory_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the mandatory tours that were generated

Parameters
persons : DataFrame

Persons is a DataFrame which has a column called mandatory_tour_frequency (which came out of the mandatory tour frequency model) and a column is_worker which indicates the person’s worker status. The only valid values of the mandatory_tour_frequency column are “work1”, “work2”, “school1”, “school2” and “work_and_school”

Returns
tours : DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a tour identifier, a person_id column, a tour_type column which is “work” or “school”, and a tour_num column which is set to 1 or 2 depending on whether it is the first or second mandatory tour made by the person. The logic for whether the work or school tour comes first given a “work_and_school” choice depends on the is_worker column: work tours come first for workers, second for non-workers
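The ordering rule for the "work_and_school" choice can be sketched as follows (a simplified per-person illustration, not the vectorized implementation):

```python
def mandatory_tour_types(mandatory_tour_frequency, is_worker):
    """Return the ordered list of mandatory tour types for one person."""
    if mandatory_tour_frequency == "work1":
        return ["work"]
    if mandatory_tour_frequency == "work2":
        return ["work", "work"]
    if mandatory_tour_frequency == "school1":
        return ["school"]
    if mandatory_tour_frequency == "school2":
        return ["school", "school"]
    if mandatory_tour_frequency == "work_and_school":
        # work tour comes first for workers, second for non-workers
        return ["work", "school"] if is_worker else ["school", "work"]
    raise ValueError(mandatory_tour_frequency)
```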

activitysim.abm.models.util.tour_frequency.process_non_mandatory_tours(persons, tour_counts)

This method processes the non_mandatory_tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the non mandatory tours that were generated

Parameters
persons: pandas.DataFrame

persons table containing a non_mandatory_tour_frequency column which has the index of the chosen alternative as the value

non_mandatory_tour_frequency_alts: DataFrame

A DataFrame which has as a unique index which relates to the values in the series above typically includes columns which are named for trip purposes with values which are counts for that trip purpose. Example trip purposes include escort, shopping, othmaint, othdiscr, eatout, social, etc. A row would be an alternative which might be to take one shopping trip and zero trips of other purposes, etc.

Returns
tours : DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

activitysim.abm.models.util.tour_frequency.process_tours(tour_frequency, tour_frequency_alts, tour_category, parent_col='person_id')

This method processes the tour_frequency column that comes out of the model of the same name and turns it into a DataFrame that represents the tours that were generated

Parameters
tour_frequency: Series

A series which has <parent_col> as the index and the chosen alternative index as the value

tour_frequency_alts: DataFrame

A DataFrame with a unique index that relates to the values in the series above. It typically includes columns named for trip purposes, with values that are counts of tours for that purpose. Example trip purposes include escort, shopping, othmaint, othdiscr, eatout, social, etc. A row is an alternative, which might be, for example, one shopping trip and zero trips of other purposes.

tour_category : str

one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’

parent_col: str

the name of the index (parent_tour_id for atwork subtours, otherwise person_id)

Returns
tours : pandas.DataFrame

An example of a tours DataFrame is supplied as a comment in the source code - it has an index which is a unique tour identifier, a person_id column, and a tour type column which comes from the column names of the alternatives DataFrame supplied above.

  • tours.tour_type - tour type (e.g. school, work, shopping, eat)

  • tours.tour_type_num - if there are two ‘school’ type tours, they will be numbered 1 and 2

  • tours.tour_type_count - number of tours of tour_type parent has (parent’s max tour_type_num)

  • tours.tour_num - index of tour (of any type) for parent

  • tours.tour_count - number of tours (of any type) for parent (parent’s max tour_num)

  • tours.tour_category - one of ‘mandatory’, ‘non_mandatory’, ‘atwork’, or ‘joint’
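
A minimal sketch of this expansion; the alternatives table, ids, and column names below are made up for illustration:

```python
import pandas as pd

# Hypothetical alternatives table: each row is an alternative, each
# column a tour purpose, each cell a count of tours of that purpose.
alts = pd.DataFrame(
    {"shopping": [0, 1, 2], "eatout": [1, 0, 1]},
    index=pd.Index([0, 1, 2], name="alt_id"))

# tour_frequency: person_id -> chosen alternative index
tour_frequency = pd.Series({101: 1, 102: 2})

def sketch_process_tours(tour_frequency, alts):
    rows = []
    for person_id, alt_id in tour_frequency.items():
        # expand the chosen alternative's counts into one row per tour
        for tour_type, count in alts.loc[alt_id].items():
            for tour_type_num in range(1, int(count) + 1):
                rows.append({"person_id": person_id,
                             "tour_type": tour_type,
                             "tour_type_num": tour_type_num})
    return pd.DataFrame(rows)

tours = sketch_process_tours(tour_frequency, alts)
```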

activitysim.abm.models.util.tour_frequency.set_tour_index(tours, parent_tour_num_col=None, is_joint=False)

The new index values are stable based on the person_id, tour_type, and tour_num. The existing index is ignored and replaced.

This gives us a stable (predictable) tour_id with tours in canonical order (when tours are sorted by tour_id, tours for each person of the same type will be adjacent and in increasing tour_type_num order)

It also simplifies attaching random number streams to tours that are stable (even across simulations)

Parameters
tours : DataFrame

Tours dataframe to reindex.
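
One way to picture such a stable id scheme; the offsets and per-person capacity below are illustrative assumptions, not ActivitySim's actual encoding:

```python
# A sketch of deriving a stable tour_id from (person_id, tour_type,
# tour_type_num). The offsets and capacity are assumed for illustration.
TOUR_TYPE_OFFSET = {"work": 0, "school": 2, "shopping": 4}  # assumed
MAX_TOURS_PER_PERSON = 10  # assumed capacity of each person's id block

def stable_tour_id(person_id, tour_type, tour_type_num):
    # slot within the person's block of ids, in canonical tour-type order
    slot = TOUR_TYPE_OFFSET[tour_type] + (tour_type_num - 1)
    return person_id * MAX_TOURS_PER_PERSON + slot
```

Because the id depends only on (person_id, tour_type, tour_type_num), sorting by tour_id yields the canonical order described above, and random number streams keyed on tour_id stay stable across simulations.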

Trip

activitysim.abm.models.util.trip.cleanup_failed_trips(trips)

drop failed trips and cleanup fields in leg_mates:

  • trip_num - assign new ordinal trip num after failed trips are dropped

  • trip_count - assign new count of trips in leg, sans failed trips

  • first - update first flag as we may have dropped first trip (last trip can’t fail)

  • next_trip_id - assign id of next trip in leg after failed trips are dropped
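
The renumbering can be sketched like this for a single leg; the column names follow the fields listed above, and next_trip_id is omitted for brevity:

```python
import pandas as pd

# Trips in one leg (same tour and direction), one of which failed.
trips = pd.DataFrame({
    "trip_num": [1, 2, 3],
    "failed": [False, True, False],
}, index=pd.Index([10, 11, 12], name="trip_id"))

def sketch_cleanup_failed_trips(trips):
    ok = trips[~trips["failed"]].copy()
    # renumber the surviving trips and refresh leg-level fields
    ok["trip_num"] = range(1, len(ok) + 1)
    ok["trip_count"] = len(ok)
    ok["first"] = ok["trip_num"] == 1
    return ok.drop(columns="failed")

cleaned = sketch_cleanup_failed_trips(trips)
```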

activitysim.abm.models.util.trip.flag_failed_trip_leg_mates(trips_df, col_name)

set boolean flag column of specified name to identify failed trip leg_mates in place

activitysim.abm.models.util.trip.generate_alternative_sizes(max_duration, max_trips)

Builds a NumPy lookup array of pattern sizes based on the number of trips in the leg and the duration available to the leg.

activitysim.abm.models.util.trip.get_time_windows(residual, level)
Parameters
  • residual

  • level

Returns

Vectorize Tour Scheduling

activitysim.abm.models.util.vectorize_tour_scheduling.compute_logsums(alt_tdd, tours_merged, tour_purpose, model_settings, skims, trace_label)

Compute logsums for the tour alt_tdds, which will differ based on their different start, stop times of day, which translate to different odt_skim out_period and in_periods.

In mtctm1, tdds are hourly, but there are only 5 skim time periods, so some of the tdd_alts will be the same once converted to skim time periods. With 5 skim time periods there are 15 unique (out-period, in-period) pairs but 190 tdd alternatives.

For efficiency, rather than compute a lot of redundant logsums, we compute logsums for the unique (out-period, in-period) pairs and then join them back to the alt_tdds.
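
The deduplication pattern looks roughly like this; the period labels and the logsum stand-in are assumptions for illustration:

```python
import pandas as pd

# alt_tdds for one tour: many hourly alternatives collapse to only a
# few skim period pairs (the period mapping here is made up)
alt_tdd = pd.DataFrame({
    "out_period": ["AM", "AM", "MD", "AM"],
    "in_period":  ["MD", "MD", "PM", "PM"],
})

def dummy_logsum(out_period, in_period):
    # stand-in for the (expensive) real logsum calculation
    return hash((out_period, in_period)) % 100

# compute once per unique (out_period, in_period) pair ...
unique_pairs = alt_tdd[["out_period", "in_period"]].drop_duplicates().copy()
unique_pairs["logsum"] = [
    dummy_logsum(o, i)
    for o, i in zip(unique_pairs["out_period"], unique_pairs["in_period"])
]

# ... then join the results back to all alternatives
alt_tdd = alt_tdd.merge(unique_pairs, on=["out_period", "in_period"], how="left")
```

Here the logsum is evaluated 3 times instead of 4; with 190 tdd alternatives collapsing to 15 period pairs, the saving is much larger.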

activitysim.abm.models.util.vectorize_tour_scheduling.get_previous_tour_by_tourid(current_tour_window_ids, previous_tour_by_window_id, alts)

Matches current tours with attributes of previous tours for the same person. See the return value below for more information.

Parameters
current_tour_window_ids : Series

A Series of parent ids for the tours we’re about to make the choice for - index should match the tours DataFrame.

previous_tour_by_window_id : Series

A Series where the index is the parent (window) id and the value is the index of the alternatives of the scheduling.

alts : DataFrame

The alternatives of the scheduling.

Returns
prev_alts : DataFrame

A DataFrame with an index matching the CURRENT tours we’re making a decision for, but with columns from the PREVIOUS tour of the person associated with each of the CURRENT tours. Columns listed in PREV_TOUR_COLUMNS from the alternatives will have “_previous” added as a suffix to keep them differentiated from the current alternatives that will be part of the interaction.
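
A sketch of the join; the ids and alternative columns are illustrative:

```python
import pandas as pd

# scheduling alternatives: index is the alternative id
alts = pd.DataFrame({"start": [8, 14], "end": [12, 18]},
                    index=pd.Index([0, 1], name="alt_id"))

# previously chosen alternative per person (window) id
previous_tour_by_window_id = pd.Series({1: 0, 2: 1})

# person (window) id for each current tour, indexed by tour_id
current_tour_window_ids = pd.Series(
    [1, 2, 2], index=pd.Index([100, 101, 102], name="tour_id"))

# look up each person's previous alternative, then its attributes,
# suffixed "_previous" so they don't collide with current alternatives
prev_alt_ids = current_tour_window_ids.map(previous_tour_by_window_id)
prev_alts = alts.loc[prev_alt_ids].add_suffix("_previous")
prev_alts.index = current_tour_window_ids.index
```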

activitysim.abm.models.util.vectorize_tour_scheduling.schedule_tours(tours, persons_merged, alts, spec, logsum_tour_purpose, model_settings, timetable, timetable_window_id_col, previous_tour, tour_owner_id_col, estimator, chunk_size, tour_trace_label)

chunking wrapper for _schedule_tours

While interaction_sample_simulate provides chunking support, the merged tours and persons dataframe and the tdd_interaction_dataset are very big, so we want to create them inside the chunking loop to minimize the memory footprint. So we implement the chunking loop here, and pass a chunk_size of 0 to interaction_sample_simulate to disable its own chunking support.

activitysim.abm.models.util.vectorize_tour_scheduling.tdd_interaction_dataset(tours, alts, timetable, choice_column, window_id_col, trace_label)

interaction_sample_simulate expects the alts index to be the same as the choosers index (e.g. tour_id).

Parameters
tours : pandas DataFrame

must have person_id column and index on tour_id

alts : pandas DataFrame

alts index must be timetable tdd id

timetable : TimeTable object
choice_column : str

name of the column in which to store the alt index in the alt_tdd DataFrame (since alt_tdd has a duplicated index on person_id but is unique on (person_id, alt_id))

Returns
alt_tdd : pandas DataFrame

columns: start, end, duration, <choice_column>; index: tour_id
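
A rough sketch of building such an interaction dataset; the real function also drops alternatives the timetable marks unavailable, and the choice column name below is assumed:

```python
import pandas as pd

# tours to schedule (index tour_id) and time-slot alternatives (index tdd)
tours = pd.DataFrame({"person_id": [1, 2]},
                     index=pd.Index([100, 101], name="tour_id"))
alts = pd.DataFrame({"start": [8, 9], "end": [12, 17]},
                    index=pd.Index([0, 1], name="tdd"))

# cross-join: every tour paired with every alternative, keeping tour_id
# as the (duplicated) index that interaction_sample_simulate expects,
# and recording the original alt id in a choice column
alt_tdd = (
    tours.reset_index()
         .merge(alts.reset_index(), how="cross")
         .set_index("tour_id")
         .rename(columns={"tdd": "tdd_choice"})  # choice_column assumed
)
```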

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_joint_tour_scheduling(joint_tours, joint_tour_participants, persons_merged, alts, persons_timetable, spec, model_settings, estimator, chunk_size=0, trace_label=None)

Like vectorize_tour_scheduling but specifically for joint tours

joint tours have a few peculiarities necessitating separate treatment:

Timetable has to be initialized to set all timeperiods…

Parameters
tours : DataFrame

DataFrame of tours containing tour attributes, as well as a person_id column to define the nth tour for each person.

persons_merged : DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts : DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec : DataFrame

The spec which will be passed to interaction_simulate. (or dict of specs keyed on tour_type if tour_types is not None)

model_settings : dict
Returns
choices : Series

A Series of choices where the index is the index of the tours DataFrame and the values are the index of the alts DataFrame.

persons_timetable : TimeTable

timetable updated with joint tours (caller should replace_table for it to persist)

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_subtour_scheduling(parent_tours, subtours, persons_merged, alts, spec, model_settings, estimator, chunk_size=0, trace_label=None)

Like vectorize_tour_scheduling but specifically for atwork subtours

subtours have a few peculiarities necessitating separate treatment:

Timetable has to be initialized to set all timeperiods outside the parent tour footprint as unavailable, so atwork subtour time windows are limited to the footprint of the parent work tour. And the parent_tour_id column of tours is used instead of parent_id as the timetable row_id.

Parameters
parent_tours : DataFrame

parent tours of the subtours (because we need to know the tdd of the parent tour to assign_subtour_mask of the timetable, indexed by parent_tour_id)

subtours : DataFrame

atwork subtours to schedule

persons_merged : DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts : DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec : DataFrame

The spec which will be passed to interaction_simulate. (all subtours share same spec regardless of subtour type)

model_settings : dict
chunk_size
trace_label
Returns
choices : Series

A Series of choices where the index is the index of the subtours DataFrame and the values are the index of the alts DataFrame.

activitysim.abm.models.util.vectorize_tour_scheduling.vectorize_tour_scheduling(tours, persons_merged, alts, timetable, tour_segments, tour_segment_col, model_settings, chunk_size=0, trace_label=None)

The purpose of this method is fairly straightforward - it takes tours and schedules them into time slots. Alternatives should be specified so as to define those time slots (usually with start and end times).

schedule_tours adds variables that can be used in the spec which have to do with the previous tours per person. Every column in the alternatives table is appended with the suffix “_previous” and made available. So if your alternatives table has columns for start and end, then start_previous and end_previous will be set to the start and end of the most recent tour for a person. The first time through, start_previous and end_previous are undefined, so make sure to protect with a tour_num >= 2 in the variable computation.
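
For example, a spec expression referencing the previous tour's start time might be guarded like this; the DataFrame below is a made-up stand-in for the interaction dataset, not real spec machinery:

```python
import pandas as pd

# interaction rows for one person's tours: the first tour has no
# previous tour, so start_previous is undefined (NaN)
df = pd.DataFrame({
    "tour_num": [1, 2],
    "start": [9, 14],
    "start_previous": [float("nan"), 9.0],
})

# analogous to a spec expression: only compare against the previous
# tour from the second tour onward, guarding the undefined first row
adjacent = (df["tour_num"] >= 2) & (df["start"] > df["start_previous"])
```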

FIXME - fix docstring: tour_segments, tour_segment_col

Parameters
tours : DataFrame

DataFrame of tours containing tour attributes, as well as a person_id column to define the nth tour for each person.

persons_merged : DataFrame

DataFrame of persons containing attributes referenced by expressions in spec

alts : DataFrame

DataFrame of alternatives which represent time slots. Will be passed to interaction_simulate in batches for each nth tour.

spec : DataFrame

The spec which will be passed to interaction_simulate. (or dict of specs keyed on tour_type if tour_types is not None)

model_settings : dict
Returns
choices : Series

A Series of choices where the index is the index of the tours DataFrame and the values are the index of the alts DataFrame.

timetable : TimeTable

persons timetable updated with tours (caller should replace_table for it to persist)

Tests

See activitysim.abm.test and activitysim.abm.models.util.test