Sharrow Basics#

This notebook provides a short walkthrough of some of the basic features of the sharrow library.

from io import StringIO

import numba as nb
import numpy as np
import pandas as pd
import xarray as xr

import sharrow as sh

sh.__version__
'2.12.0'

Example Data#

We’ll begin by importing some example data to work with. We’ll be using some test data taken from the MTC example in the ActivitySim project, including tables of data for households and persons, as well as a set of skims containing transportation level of service information for travel around a tiny slice of San Francisco.

The households and persons are typical tabular data, and each can be read in and stored as a pandas.DataFrame.

households = sh.example_data.get_households()
households.head()
TAZ SERIALNO PUMA5 income PERSONS HHT UNITTYPE NOC BLDGSZ TENURE ... hschpred hschdriv htypdwel hownrent hadnwst hadwpst hadkids bucketBin originalPUMA hmultiunit
HHID
2717868 25 2715386 2202 361000 2 1 0 0 9 1 ... 0 0 2 1 0 0 0 3 2202 1
763899 6 5360279 2203 59220 1 4 0 0 9 3 ... 0 0 2 2 0 0 0 4 2203 1
2222791 9 77132 2203 197000 2 2 0 0 9 1 ... 0 0 2 1 0 0 1 5 2203 1
112477 17 3286812 2203 2200 1 6 0 0 8 3 ... 0 0 2 2 0 0 0 7 2203 1
370491 21 6887183 2203 16500 3 1 0 1 8 3 ... 1 0 2 2 0 0 0 7 2203 1

5 rows × 46 columns

persons = sh.example_data.get_persons()
persons.head()
household_id age RELATE ESR GRADE PNUM PAUG DDP sex WEEKS HOURS MSP POVERTY EARNS pagecat pemploy pstudent ptype padkid
PERID
25671 25671 47 1 6 0 1 0 0 1 0 0 6 39 0 6 3 3 4 2
25675 25675 27 1 6 7 1 0 0 2 52 40 2 84 7200 5 3 2 3 2
25678 25678 30 1 6 0 1 0 0 2 0 0 6 84 0 5 3 3 4 2
25683 25683 23 1 6 0 1 0 0 1 0 0 6 1 0 4 3 3 4 2
25684 25684 52 1 6 0 1 0 0 1 0 0 6 94 0 7 3 3 4 2

The skims, on the other hand, are not just simple tabular data, but rather a multi-dimensional representation of the transportation system, indexed by origin, destination, and time of day. Rather than using a single DataFrame for this data, we store it as a multi-dimensional xarray.Dataset.

skims = sh.example_data.get_skims()
skims
<xarray.Dataset> Size: 2MB
Dimensions:               (otaz: 25, dtaz: 25, time_period: 5)
Coordinates:
  * dtaz                  (dtaz) int64 200B 1 2 3 4 5 6 7 ... 20 21 22 23 24 25
  * otaz                  (otaz) int64 200B 1 2 3 4 5 6 7 ... 20 21 22 23 24 25
  * time_period           (time_period) <U2 40B 'EA' 'AM' 'MD' 'PM' 'EV'
Data variables: (12/170)
    DIST                  (otaz, dtaz) float32 2kB dask.array<chunksize=(25, 25), meta=np.ndarray>
    DISTBIKE              (otaz, dtaz) float32 2kB dask.array<chunksize=(25, 25), meta=np.ndarray>
    DISTWALK              (otaz, dtaz) float32 2kB dask.array<chunksize=(25, 25), meta=np.ndarray>
    DRV_COM_WLK_BOARDS    (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    DRV_COM_WLK_DDIST     (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    DRV_COM_WLK_DTIM      (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    ...                    ...
    WLK_TRN_WLK_IVT       (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    WLK_TRN_WLK_IWAIT     (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    WLK_TRN_WLK_WACC      (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    WLK_TRN_WLK_WAUX      (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    WLK_TRN_WLK_WEGR      (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>
    WLK_TRN_WLK_XWAIT     (otaz, dtaz, time_period) float32 12kB dask.array<chunksize=(25, 25, 5), meta=np.ndarray>

For tabular data, sharrow accepts either pandas DataFrames or xarray Datasets, but to ensure consistency the former are converted into the latter automatically when they are used with sharrow. You can also easily make the conversion manually:

xr.Dataset(persons)
<xarray.Dataset> Size: 1MB
Dimensions:       (PERID: 8212)
Coordinates:
  * PERID         (PERID) int64 66kB 25671 25675 25678 ... 7554887 7554903
Data variables: (12/19)
    household_id  (PERID) int64 66kB 25671 25675 25678 ... 2863552 2863568
    age           (PERID) int64 66kB 47 27 30 23 52 19 54 ... 82 68 68 93 76 82
    RELATE        (PERID) int64 66kB 1 1 1 1 1 1 1 1 ... 22 22 22 22 22 22 22 22
    ESR           (PERID) int64 66kB 6 6 6 6 6 6 6 6 6 6 ... 6 6 6 6 6 6 6 6 6 6
    GRADE         (PERID) int64 66kB 0 7 0 0 0 6 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
    PNUM          (PERID) int64 66kB 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1 1
    ...            ...
    EARNS         (PERID) int64 66kB 0 7200 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    pagecat       (PERID) int64 66kB 6 5 5 4 7 4 7 6 7 7 ... 8 8 9 9 9 8 8 9 8 9
    pemploy       (PERID) int64 66kB 3 3 3 3 3 3 3 3 3 3 ... 3 3 3 3 3 3 3 3 3 3
    pstudent      (PERID) int64 66kB 3 2 3 3 3 2 3 3 3 3 ... 3 3 3 3 3 3 3 3 3 3
    ptype         (PERID) int64 66kB 4 3 4 4 4 3 4 4 4 4 ... 5 5 5 5 5 5 5 5 5 5
    padkid        (PERID) int64 66kB 2 2 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2

Suppose we want to simulate a tour mode choice. Normally we’d have run through a bunch of different models to generate these tours and their destinations first, but let’s skip that for now and make up some random data to work with. We’ll randomly choose (with replacement) 100,000 people, send each to a randomly chosen destination zone, and assign random outbound and inbound time periods.

def random_tours(n_tours=100_000, seed=42):
    rng = np.random.default_rng(seed)
    n_zones = skims.sizes["dtaz"]
    return pd.DataFrame(
        {
            "PERID": rng.choice(persons.index, size=n_tours),
            "dest_taz_idx": rng.choice(n_zones, size=n_tours),
            "out_time_period": rng.choice(skims.time_period, size=n_tours),
            "in_time_period": rng.choice(skims.time_period, size=n_tours),
        }
    ).rename_axis("TOURIDX")


tours = random_tours()
tours.head()
PERID dest_taz_idx out_time_period in_time_period
TOURIDX
0 111378 18 EV AM
1 5058053 22 EV EV
2 3608229 14 EV EV
3 1874724 10 EV PM
4 1774303 0 EV PM

Of note in this table, we include destination TAZs by index (position), not label, so we can observe a TAZ index of 0 even though the first TAZ ID is 1.
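As a quick check of that distinction in plain xarray (just an illustration, not a sharrow feature): selecting position 0 on the dtaz dimension returns the same values as selecting the label 1.

# dest_taz_idx holds zero-based positions, while the dtaz coordinate holds labels 1..25
skims.DIST.isel(dtaz=0).equals(skims.DIST.sel(dtaz=1))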

Spec Files#

Now that we’ve got our tours to work with, we’ll also need an expression “spec” file that defines the utility function terms and coefficients. Following the ActivitySim format, we can write a mini-spec file as appears below. Each line of this CSV file has an expression that can be evaluated in the context of the various tables and datasets shown above, plus a set of coefficients that apply for that expression across various modal alternatives (drive, walk, and transit in this example).

mini_spec = """
Label,Expression,DRIVE,WALK,TRANSIT
Drive Time,odt_skims['SOV_TIME'] + dot_skims['SOV_TIME'],-0.0134,,
Transit IVT,(odt_skims['WLK_LOC_WLK_TOTIVT']/100 + dot_skims['WLK_LOC_WLK_TOTIVT']/100),,,-0.0134
Transit Wait Time,short_i_wait_mult * ((odt_skims['WLK_LOC_WLK_IWAIT']/100).clip(upper=shortwait) + (dot_skims['WLK_LOC_WLK_IWAIT']/100).clip(upper=shortwait)),,,-0.0134
Income,hh.income > income_breakpoints[2],,-0.2,
Constant,one,,-0.4,-0.55
"""

We’ll use pandas to load these values into a DataFrame.

spec = pd.read_csv(StringIO(mini_spec), index_col="Label")
spec
Expression DRIVE WALK TRANSIT
Label
Drive Time odt_skims['SOV_TIME'] + dot_skims['SOV_TIME'] -0.0134 NaN NaN
Transit IVT (odt_skims['WLK_LOC_WLK_TOTIVT']/100 + dot_ski... NaN NaN -0.0134
Transit Wait Time short_i_wait_mult * ((odt_skims['WLK_LOC_WLK_I... NaN NaN -0.0134
Income hh.income > income_breakpoints[2] NaN -0.2 NaN
Constant one NaN -0.4 -0.5500

Data Trees and Flows#

Then, it’s time to prepare our data. We’ll create a DataTree that defines the relationships among all the datasets we’re working with. This is a tree in the mathematical sense, with nodes referencing the datasets and edges representing the relationships.

income_breakpoints = nb.typed.Dict.empty(nb.types.int32, nb.types.int32)
income_breakpoints[0] = 15000
income_breakpoints[1] = 30000
income_breakpoints[2] = 60000

tree = sh.DataTree(
    tour=tours,
    person=persons,
    hh=households,
    odt_skims=skims,
    dot_skims=skims,
    relationships=(
        "tour.PERID @ person.PERID",
        "person.household_id @ hh.HHID",
        "hh.TAZ @ odt_skims.otaz",
        "tour.dest_taz_idx -> odt_skims.dtaz",
        "tour.out_time_period @ odt_skims.time_period",
        "tour.dest_taz_idx -> dot_skims.otaz",
        "hh.TAZ @ dot_skims.dtaz",
        "tour.in_time_period @ dot_skims.time_period",
    ),
    extra_vars={
        "shortwait": 3,
        "one": 1,
    },
    aux_vars={
        "short_i_wait_mult": 0.75,
        "income_breakpoints": income_breakpoints,
    },
)
/home/runner/miniconda3/envs/testing-env/lib/python3.10/site-packages/numba/typed/typeddict.py:34: NumbaTypeSafetyWarning: unsafe cast from int64 to int32. Precision may be lost.
  d[key] = value

The first named dataset we include, tour, is by default the root node of this data tree. We can then define an arbitrary number of other named data nodes. Here, we add person, hh, odt_skims, and dot_skims. Note that these last two are actually two different names for the same underlying dataset, and for each name we will next define a unique set of relationships.

All data nodes in this tree are stored as Dataset objects. We can give a pandas DataFrame in this constructor instead, but it will be automatically converted into a one-dimensional Dataset. The conversion is no-copy if possible (and it is usually possible), so no additional memory is consumed in the conversion.

The relationships argument defines the links of the data tree. Each relationship maps a particular variable in a named upstream dataset to a particular dimension of a named downstream dataset. For example, "person.household_id @ hh.HHID" tells the tree that the household_id variable in the person dataset contains labels (@) that map to the HHID dimension of the hh dataset.

In addition to mapping by label, we can also map by position, by using the -> operator in the relationship string instead of @. In the example above, we map the tour destination TAZs in this manner, as the dest_taz_idx variable in the tours dataset contains positional references instead of labels.
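Roughly speaking, and in plain xarray terms rather than sharrow internals, the two operators correspond to label-based versus integer-position-based lookup along the target dimension:

# label-based (@): the household's TAZ value is looked up among the otaz coordinate labels
skims.DIST.sel(otaz=households.TAZ.iloc[0], dtaz=1)

# position-based (->): the tour's dest_taz_idx is used directly as an integer offset
skims.DIST.isel(dtaz=tours.dest_taz_idx.iloc[0])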

A special case for the relationship mapping is available when the source variable in the upstream dataset is explicitly categorical. In this case, sharrow checks that the categories exactly match the labels in the referenced downstream dataset dimension and that there are no missing categorical values. If so, the code points of the categories are used as positional mapping references, which is both memory and runtime efficient. If they don’t match, an error is raised, as it is presumed that the user has made a mistake; in theory sharrow could unravel the category values and do the mapping by label, but this would be a cumbersome operation, contrary to the efficiency goals of the library.
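For example, one way to make out_time_period explicitly categorical (an illustrative sketch, not required for this walkthrough) is to encode it with categories that exactly match the time_period dimension labels:

# build a copy of the tours table with a categorical time period column
# whose categories are exactly the skims' time_period labels
tours_cat = tours.assign(
    out_time_period=pd.Categorical(
        tours.out_time_period, categories=skims.time_period.values
    )
)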

Lastly, our tree definition includes a few named constants that are just fixed values defined in a separate dictionary. These are shown in two groups, extra_vars and aux_vars. The values in extra_vars get hard-coded into the compiled results, effectively the same as if their values were expanded and written into expressions in the spec directly. This is generally most efficient if the values will never change. On the other hand, aux_vars will be passed by reference into the compiled results. These values need to be numba-safe objects, so for instance a regular Python dictionary can’t be used, but a numba typed Dict is acceptable. So long as the data type and dimensionality of the values in aux_vars remains constant, the actual values can be changed later (i.e. after compilation).
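For instance, because income_breakpoints is passed by reference, its entries can be updated in place later and the new values take effect without recompiling. This is a minimal sketch; 75000 is an arbitrary illustrative value, and we restore the original value so the rest of this walkthrough is unaffected.

income_breakpoints[2] = 75000  # takes effect on subsequent evaluations, no recompilation
income_breakpoints[2] = 60000  # restore the value used in this walkthrough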

Once we have defined our data tree, we can use it, along with the spec, to compute the utility for various alternatives in the choice model. Sharrow allows us to compile this utility function into a Flow, which can be reused for massive speed gains on later utility evaluations.

flow = tree.setup_flow(spec.Expression)

To use a Flow for preparing the array of data that backs the utility function, we can call the load() method. The first time we call load(), it takes a (relatively) long time to evaluate, as the expressions are compiled and that compiled code is cached to disk.

%time flow.load()
CPU times: user 1.35 s, sys: 15.8 ms, total: 1.36 s
Wall time: 1.36 s
array([[ 9.4      , 16.9572   ,  4.5      ,  0.       ,  1.       ],
       [ 9.32     , 14.3628   ,  4.5      ,  1.       ,  1.       ],
       [ 7.62     , 11.0129   ,  4.5      ,  1.       ,  1.       ],
       ...,
       [ 8.52     , 11.6154995,  3.260325 ,  0.       ,  1.       ],
       [11.74     , 16.2798   ,  3.440325 ,  0.       ,  1.       ],
       [10.48     , 13.3974   ,  3.942825 ,  0.       ,  1.       ]],
      dtype=float32)

Subsequent calls to load() are much faster.

%time flow.load()
CPU times: user 21.7 ms, sys: 0 ns, total: 21.7 ms
Wall time: 21.4 ms
array([[ 9.4      , 16.9572   ,  4.5      ,  0.       ,  1.       ],
       [ 9.32     , 14.3628   ,  4.5      ,  1.       ,  1.       ],
       [ 7.62     , 11.0129   ,  4.5      ,  1.       ,  1.       ],
       ...,
       [ 8.52     , 11.6154995,  3.260325 ,  0.       ,  1.       ],
       [11.74     , 16.2798   ,  3.440325 ,  0.       ,  1.       ],
       [10.48     , 13.3974   ,  3.942825 ,  0.       ,  1.       ]],
      dtype=float32)

It’s not faster because it’s cached the data, but because it’s cached the compiled code. (Setting the compile_watch argument to a truthy value will trigger a check of the cache files and emit a warning message if recompilation was triggered.) We can swap out the tour node in the tree for a different set of (similarly formatted) tours, and re-evaluate at that fast speed.
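Before swapping in new tours, here is a minimal illustration of that compile_watch check (a sketch; the returned array is the same as above and is omitted here):

flow.load(compile_watch=True)  # warns if any cached flow code had to be recompiled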

tours_2 = random_tours(seed=43)
tours_2.head()
PERID dest_taz_idx out_time_period in_time_period
TOURIDX
0 2566803 6 EA AM
1 3596408 6 MD MD
2 1631117 8 EV EA
3 29658 13 EA EA
4 3138090 5 PM EA

Note that the flow requires not just a base dataset but a whole DataTree to operate, so to re-evaluate with a new set of tours we need to make a DataTree with replace_datasets. Fortunately, this operation is no-copy so it doesn’t consume much memory. If all the datasets in a tree were linked by position (instead of by label) this would be almost instantaneous, but since our example tree here has tours linked by label it takes just a moment to rebuild the linkages.

tree_2 = tree.replace_datasets(tour=tours_2)
%time flow.load(tree_2)
CPU times: user 21.4 ms, sys: 0 ns, total: 21.4 ms
Wall time: 21.2 ms
array([[ 6.5299997,  9.7043   ,  4.3533   ,  0.       ,  1.       ],
       [ 4.91     ,  2.6404002,  1.16565  ,  1.       ,  1.       ],
       [ 4.8900003,  2.2564   ,  4.1078253,  0.       ,  1.       ],
       ...,
       [ 4.25     ,  7.6692   ,  2.50065  ,  0.       ,  1.       ],
       [ 7.3999996, 12.2662   ,  3.0513   ,  0.       ,  1.       ],
       [ 8.91     , 10.9822   ,  4.5      ,  0.       ,  1.       ]],
      dtype=float32)

When the spec is large, the compile time can be significant, as the compiler can spend a lot of time trying to optimize the large resulting function. In this case, it may be better to use a less optimized code path, which does not involve the compiler as much. This can be done by setting the use_array_maker argument of the load function to True. This will still compile the expressions to create individual array columns, but it will not try to optimize the concatenation of the columns into one large array. Similar to regular loading, the first call to the load function with the use_array_maker option set will have some (but less) compiler overhead.

%time flow.load(use_array_maker=True)
CPU times: user 12.6 ms, sys: 3 μs, total: 12.6 ms
Wall time: 12.5 ms
array([[ 9.4      , 16.9572   ,  4.5      ,  0.       ,  1.       ],
       [ 9.32     , 14.3628   ,  4.5      ,  1.       ,  1.       ],
       [ 7.62     , 11.0129   ,  4.5      ,  1.       ,  1.       ],
       ...,
       [ 8.52     , 11.6154995,  3.260325 ,  0.       ,  1.       ],
       [11.74     , 16.2798   ,  3.440325 ,  0.       ,  1.       ],
       [10.48     , 13.3974   ,  3.942825 ,  0.       ,  1.       ]],
      dtype=float32)

Subsequent runs are faster because the compiled function is cached.

%time flow.load(use_array_maker=True)
CPU times: user 10.9 ms, sys: 0 ns, total: 10.9 ms
Wall time: 10.6 ms
array([[ 9.4      , 16.9572   ,  4.5      ,  0.       ,  1.       ],
       [ 9.32     , 14.3628   ,  4.5      ,  1.       ,  1.       ],
       [ 7.62     , 11.0129   ,  4.5      ,  1.       ,  1.       ],
       ...,
       [ 8.52     , 11.6154995,  3.260325 ,  0.       ,  1.       ],
       [11.74     , 16.2798   ,  3.440325 ,  0.       ,  1.       ],
       [10.48     , 13.3974   ,  3.942825 ,  0.       ,  1.       ]],
      dtype=float32)

The load function also has some other features, like nicely formatting the output into a DataFrame.

df = flow.load_dataframe()
df
Drive Time Transit IVT Transit Wait Time Income Constant
0 9.40 16.957199 4.500000 0.0 1.0
1 9.32 14.362800 4.500000 1.0 1.0
2 7.62 11.012900 4.500000 1.0 1.0
3 4.25 7.669200 2.500650 0.0 1.0
4 6.16 8.218600 3.387825 0.0 1.0
... ... ... ... ... ...
99995 4.86 4.928800 4.500000 0.0 1.0
99996 1.07 0.000000 0.000000 0.0 1.0
99997 8.52 11.615499 3.260325 0.0 1.0
99998 11.74 16.279800 3.440325 0.0 1.0
99999 10.48 13.397400 3.942825 0.0 1.0

100000 rows × 5 columns

Linear-in-Parameters Functions#

When the spec represents a linear-in-parameters utility function, the data we get out of the load() function represents one matrix in a dot-product, and the coefficients in the spec provide the other matrix. We might use the efficient linear algebra algorithms embedded in np.dot to compute the utility, like this:

x = flow.load()
b = spec.iloc[:, 1:].fillna(0).astype(np.float32).values
np.dot(x, b)
array([[-0.12595999, -0.4       , -0.83752644],
       [-0.124888  , -0.6       , -0.80276155],
       [-0.10210799, -0.6       , -0.7578729 ],
       ...,
       [-0.114168  , -0.4       , -0.74933606],
       [-0.157316  , -0.4       , -0.8142497 ],
       [-0.14043199, -0.4       , -0.782359  ]], dtype=float32)

But sharrow provides a substantially faster option, by embedding the dot product directly into the compiled code and never instantiating the full x array in memory at all.

%time flow.dot(b)
CPU times: user 2.04 s, sys: 12.1 ms, total: 2.05 s
Wall time: 1.73 s
array([[-0.12595999, -0.4       , -0.83752644],
       [-0.124888  , -0.6       , -0.80276155],
       [-0.10210799, -0.6       , -0.7578729 ],
       ...,
       [-0.114168  , -0.4       , -0.74933606],
       [-0.157316  , -0.4       , -0.8142497 ],
       [-0.14043199, -0.4       , -0.782359  ]], dtype=float32)
u = flow.dot(b)
u
array([[-0.12595999, -0.4       , -0.83752644],
       [-0.124888  , -0.6       , -0.80276155],
       [-0.10210799, -0.6       , -0.7578729 ],
       ...,
       [-0.114168  , -0.4       , -0.74933606],
       [-0.157316  , -0.4       , -0.8142497 ],
       [-0.14043199, -0.4       , -0.782359  ]], dtype=float32)

As before, the compiler runs only the first time we apply this function with this structure, and subsequent runs are faster, even with different source data.

%time flow.dot(b, source=tree_2)
CPU times: user 127 ms, sys: 0 ns, total: 127 ms
Wall time: 31.4 ms
array([[-0.087502  , -0.4       , -0.73837185],
       [-0.065794  , -0.6       , -0.6010011 ],
       [-0.065526  , -0.4       , -0.6352806 ],
       ...,
       [-0.05695   , -0.4       , -0.68627596],
       [-0.09915999, -0.4       , -0.7552545 ],
       [-0.119394  , -0.4       , -0.7574615 ]], dtype=float32)

As with the plain load method, the dot method also has some formatted output versions. For example, dot_dataarray returns a DataArray.

flow.dot_dataarray(b, source=tree_2)
<xarray.DataArray (TOURIDX: 100000, ALT_COL: 3)> Size: 1MB
array([[-0.087502  , -0.4       , -0.73837185],
       [-0.065794  , -0.6       , -0.6010011 ],
       [-0.065526  , -0.4       , -0.6352806 ],
       ...,
       [-0.05695   , -0.4       , -0.68627596],
       [-0.09915999, -0.4       , -0.7552545 ],
       [-0.119394  , -0.4       , -0.7574615 ]], dtype=float32)
Coordinates:
  * TOURIDX  (TOURIDX) int64 800kB 0 1 2 3 4 5 ... 99995 99996 99997 99998 99999
Dimensions without coordinates: ALT_COL

This works even better if the coefficients are given as a DataArray too, so it can harvest dimension names and coordinates as appropriate.

B = xr.DataArray(
    spec.iloc[:, 1:].fillna(0).astype(np.float32), dims=("expressions", "modes")
)
flow.dot_dataarray(B, source=tree_2)
<xarray.DataArray (TOURIDX: 100000, modes: 3)> Size: 1MB
array([[-0.087502  , -0.4       , -0.73837185],
       [-0.065794  , -0.6       , -0.6010011 ],
       [-0.065526  , -0.4       , -0.6352806 ],
       ...,
       [-0.05695   , -0.4       , -0.68627596],
       [-0.09915999, -0.4       , -0.7552545 ],
       [-0.119394  , -0.4       , -0.7574615 ]], dtype=float32)
Coordinates:
  * TOURIDX  (TOURIDX) int64 800kB 0 1 2 3 4 5 ... 99995 99996 99997 99998 99999
Dimensions without coordinates: modes

Multinomial Logit Simulation#

The next level of flow evaluation treats the dot-product as a linear-in-parameters multinomial logit (MNL) utility function and makes simulated choices based on that model. To do this, we’ll need to provide the random draws as a function input (which also lets us attach any randomization engine we prefer, e.g. a reproducible random generator). For this example, we’ll create one random (uniform) draw for each tour.

draws = np.random.default_rng(321).random(size=tree.shape[0])

Given those draws, we use the logit_draws method to build and apply an MNL simulator, which returns to us both the choices and the probability that was computed for each chosen alternative. As previously, the first call to this function triggers compilation that may take a noticeable amount of time.

%time choices, choice_probs = flow.logit_draws(b, draws)
CPU times: user 7.01 s, sys: 43.5 ms, total: 7.05 s
Wall time: 7.02 s

Subsequent calls to the same method are much faster.

%time choices, choice_probs = flow.logit_draws(b, draws)
CPU times: user 57.8 ms, sys: 0 ns, total: 57.8 ms
Wall time: 17.8 ms

As this is the most complex flow processor, it takes the longest to compile, but after compilation it runs quite efficiently. We can see here the whole MNL simulation process for this data requires only a few milliseconds more time than just computing the utilities. The same is true even after we change the data source, as we are using cached compiled code, not cached data:

%time choices2, choice_probs2 = flow.logit_draws(b, draws, source=tree_2)
CPU times: user 91.4 ms, sys: 0 ns, total: 91.4 ms
Wall time: 25.6 ms

The resulting choices are the index positions of the chosen alternatives, not their labels.

choices
array([1, 2, 1, ..., 0, 1, 0], dtype=int8)

But if we want the labels, it’s easy enough to convert these indexes into labels.

B.modes[choices]
<xarray.DataArray 'modes' (modes: 100000)> Size: 800kB
array(['WALK', 'TRANSIT', 'WALK', ..., 'DRIVE', 'WALK', 'DRIVE'], dtype=object)
Coordinates:
  * modes    (modes) object 800kB 'WALK' 'TRANSIT' 'WALK' ... 'WALK' 'DRIVE'

Nested Logit Simulation#

Sharrow can also apply nested logit models. To do so, you’ll also need to install a recent version of larch (e.g. conda install "larch>=5.7.1" -c conda-forge).

The nesting tree can be defined as usual in Larch, or you can use the construct_nesting_tree convenience function to read in a nesting tree definition according to the usual ActivitySim yaml notation, like this:

nesting_settings = """
NESTS:
  name: root
  coefficient: coef_nest_root
  alternatives:
      - name: MOTORIZED
        coefficient: coef_nest_motor
        alternatives:
            - DRIVE
            - TRANSIT
      - WALK
"""

import yaml

from sharrow.nested_logit import construct_nesting_tree

nesting_settings = yaml.safe_load(nesting_settings)["NESTS"]
nest_tree = construct_nesting_tree(
    alternatives=spec.columns[1:], nesting_settings=nesting_settings
)
nest_tree

Once the nesting tree is defined, it needs to be converted to operating arrays, using the as_arrays method (available in larch 5.7.1 and later). Since we are not estimating a nested logit model but just applying one, we can give the parameter values as a dictionary instead of a larch.Model to link against.

nesting = nest_tree.as_arrays(
    trim=True, parameter_dict={"coef_nest_motor": 0.5, "coef_nest_root": 1.0}
)

This dictionary of arrays can be passed to the logit_draws function to compile a nested logit model instead of a simple MNL.

%time choices_nl, choice_probs_nl = flow.logit_draws(b, draws, nesting=nesting)
%time choices2_nl, choice2_probs_nl = flow.logit_draws(b, draws, source=tree_2, nesting=nesting)

For nested logit models, computing just the logsums is faster than generating probabilities (and making choices), so the logsums=1 argument allows you to short-circuit the computations if you only want the logsums.

flow.logit_draws(b, draws, source=tree_2, nesting=nesting, logsums=1)

Note that for consistency, the choices, probabilities of choices, and pick count arrays are still returned as the first three elements of the returned tuple, but they’re all empty (i.e., None).
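Following the same indexing pattern used in the batch simulation example below, the logsums can be taken from the end of the returned tuple (logsums_nl here is just an illustrative name):

# choices, probabilities, and pick counts are None when logsums=1;
# the logsums array is the last element of the returned tuple
logsums_nl = flow.logit_draws(b, draws, source=tree_2, nesting=nesting, logsums=1)[-1]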

To get both the logsums and the choices, set logsums=2.

flow.logit_draws(b, draws, source=tree_2, nesting=nesting, logsums=2)

Batch Simulation#

Suppose we want to compute logsums not just for one destination, but for many destinations. We can construct a Dataset with two dimensions to use at the top of our DataTree, one for the tours and one for the candidate destinations.

tour_by_dest = tree.subspaces["tour"]
tour_by_dest = tour_by_dest.assign_coords(
    {"CAND_DEST": xr.DataArray(np.arange(25), dims="CAND_DEST")}
)
tour_by_dest

Then we can create a DataTree very similar to the one above, using this two-dimensional root Dataset, but pointing to our candidate destination zones from the new CAND_DEST dimension, and then create a flow from that.

wide_tree = sh.DataTree(
    tour=tour_by_dest,
    person=persons,
    hh=households,
    odt_skims=skims,
    dot_skims=skims,
    relationships=(
        "tour.PERID @ person.PERID",
        "person.household_id @ hh.HHID",
        "hh.TAZ @ odt_skims.otaz",
        "tour.CAND_DEST -> odt_skims.dtaz",
        "tour.out_time_period @ odt_skims.time_period",
        "tour.CAND_DEST -> dot_skims.otaz",
        "hh.TAZ @ dot_skims.dtaz",
        "tour.in_time_period @ dot_skims.time_period",
    ),
    extra_vars={
        "shortwait": 3,
        "one": 1,
    },
    aux_vars={
        "short_i_wait_mult": 0.75,
        "income_breakpoints": income_breakpoints,
    },
    dim_order=("TOURIDX", "CAND_DEST"),
)
wide_flow = wide_tree.setup_flow(spec.Expression)
wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch="simple")[-1]
%time wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch="simple")[-1]
wide_logsums