Using Sharrow#
This page walks through an exercise of running a model with sharrow.
How it Works#
Sharrow accelerates ActivitySim in part by using numba to create optimized and pre-compiled versions of utility specification files, and caching those bits of code to disk.
Important
The compiler must be run in single-process mode. Otherwise, the various processes all attempt the compilation and compete to write to the same cache location on disk, which is likely to fail. You can safely run in multiprocessing mode after the compilation for all model components is complete.
Model Design Requirements#
Activating the sharrow optimizations also requires using the new SkimDataset
interface for skims instead of the legacy SkimDict, and internally recoding
zones into a zero-based contiguous indexing scheme.
Zero-based Recoding of Zones#
Using sharrow requires recoding zone ids to be zero-based contiguous index
values, at least for internal usage. This recoding needs to be written into
the input table list explicitly. For example, the following snippet of a
settings.yaml settings file shows the process of recoding zone ids in
the input files.
input_table_list:
  - tablename: land_use
    filename: land_use.csv
    index_col: zone_id
    recode_columns:
      zone_id: zero-based
  - tablename: households
    filename: households.csv
    index_col: household_id
    recode_columns:
      home_zone_id: land_use.zone_id
For the land_use table, the zone_id field is marked for recoding explicitly
as zero-based, which will turn whatever nominal ids appear in that column into
zero-based index values, as well as store a mapping of the original values that
is used to recode and decode zone ids when used elsewhere.
The first “elsewhere” recoding is in the households input table, where we will
map the home_zone_id to the new zone ids by pointing the recoding instruction
at the land_use.zone_id field. If zone ids appear in other input files, they
need to be recoded in those fields as well, using the same process.
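The recoding described above can be pictured with a few lines of plain Python. This is only an illustrative sketch, not ActivitySim's implementation: the zone ids and variable names are hypothetical, and assigning indexes in order of appearance is an assumption made for the example.

```python
# Hypothetical nominal zone ids, as they might appear in land_use.csv.
nominal_zone_ids = [101, 105, 110, 250]

# Zero-based recoding: each nominal id maps to its position in the list
# (order of appearance is an assumption of this sketch).
zone_to_index = {zone: i for i, zone in enumerate(nominal_zone_ids)}
land_use_index = [zone_to_index[z] for z in nominal_zone_ids]

# The households table reuses the same stored mapping for home_zone_id,
# which is what pointing at land_use.zone_id expresses in the settings.
home_zone_id = [110, 101, 250, 110]
home_zone_recoded = [zone_to_index[z] for z in home_zone_id]

print(land_use_index)     # [0, 1, 2, 3]
print(home_zone_recoded)  # [2, 0, 3, 2]
```

The key point is that one mapping is built once from the land use table and then reused for every other column that holds zone ids.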
The other place where we need to handle zone ids is in output files. The
following snippet of a settings.yaml settings file shows how those ids are
decoded in various output files. Generally, for columns that are fully populated
with zone ids (e.g. tour and trip ends) we can apply the decode_columns settings
to reverse the mapping and restore the nominal zone ids globally for the entire
column of data. For columns where there is some missing data flagged by negative
values, the “nonnegative” filter is prepended to the instruction.
output_tables:
  action: include
  tables:
    - tablename: land_use
      decode_columns:
        zone_id: land_use.zone_id
    - tablename: households
      decode_columns:
        home_zone_id: land_use.zone_id
    - tablename: persons
      decode_columns:
        home_zone_id: land_use.zone_id
        school_zone_id: nonnegative | land_use.zone_id
        workplace_zone_id: nonnegative | land_use.zone_id
    - tablename: tours
      decode_columns:
        origin: land_use.zone_id
        destination: land_use.zone_id
    - tablename: trips
      decode_columns:
        origin: land_use.zone_id
        destination: land_use.zone_id
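To make the role of the “nonnegative” filter concrete, here is a minimal Python sketch of decoding. It is an illustration under assumptions, not the ActivitySim code: the `decode` helper, the `index_to_zone` list, and the sample values are all hypothetical.

```python
# Hypothetical stored mapping: position = zero-based index, value = nominal id.
index_to_zone = [101, 105, 110, 250]

def decode(values, nonnegative=False):
    """Map zero-based indexes back to nominal zone ids.

    With nonnegative=True, negative values (missing-data flags) pass
    through unchanged instead of being looked up in the mapping.
    """
    decoded = []
    for v in values:
        if nonnegative and v < 0:
            decoded.append(v)
        else:
            decoded.append(index_to_zone[v])
    return decoded

# Fully populated column: every value is a valid index.
home_zone_id = decode([2, 0, 3])

# Column with a missing value flagged as -1 (e.g. a person with no school).
school_zone_id = decode([1, -1, 3], nonnegative=True)

print(home_zone_id)    # [110, 101, 250]
print(school_zone_id)  # [105, -1, 250]
```

In this sketch, decoding -1 without the filter would silently return the last zone through Python's negative indexing, which shows why columns with missing-data flags need the filter.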
Measuring Performance#
Testing with sharrow requires two steps: test mode and production mode.
In test mode, the code is run to compile all the spec files and ascertain whether the functions are working correctly. Test mode is expected to be slow, potentially much slower than older versions of ActivitySim, especially for models with small populations and zone systems, because the compile time depends on the complexity of the utility functions and not on the number of households or zones. Once compiling and testing are complete, production mode can simply run the pre-compiled functions with sharrow, which is much faster.
It is possible to run test mode and production mode independently using the
existing activitysim run command line tool, pointing that tool to the test
and production configurations directories as appropriate.
To generate a meaningful measure of performance enhancement, it is necessary
to compare the runtimes in production mode against equivalent runtimes with
sharrow disabled. This is facilitated by the activitysim workflow command
line tool, which permits the use of pre-made batches of activitysim runs, as
well as automatic report generation from the results. For more details on the
use of this tool, see workflows.
Digital Encoding#
Sharrow is compatible with and able to efficiently use digital encoding.
These encodings are applied to data either prospectively (i.e. before ActivitySim
ever sees the skim data), or dynamically within a run using the
taz_skims.digital-encoding or taz_skims.zarr-digital-encoding settings in
the network_los.yaml settings file. The only difference between these two
settings is that the former applies this digital encoding internally every
time you run the model, while the latter applies it prior to caching encoded
skims to disk in Zarr format (and then reuses those values without re-encoding
on subsequent runs with the same data). Dictionary encoding (especially joint
dictionary encodings) can take a long time to prepare, so caching the values can
be useful. As read/write speeds for zarr files are fast, you usually won’t
notice a meaningful performance degradation when caching, so the default is
generally to use zarr-digital-encoding.
Very often, data can be expressed adequately with far less memory than is needed to store a standard 32-bit floating point representation. There are two simple ways to reduce the memory footprint for data: fixed point encoding, or dictionary encoding.
Fixed Point Encoding#
In fixed point encoding, which is also sometimes called scaled integers, data is multiplied by some factor, then rounded to the nearest integer. The integer is stored in memory instead of a floating point value, but the original value can be (approximately) restored by reversing the process. An offset factor can also be applied, so that the domain of values does not need to start at zero.
For example, instead of storing matrix table values as 32-bit floating point values, they could be multiplied by a scale factor (e.g., 100) and then converted to 16-bit integers. This uses half the RAM and can still express any value (to two decimal places) from -327.68 to 327.67. If the lowest values in that range are never needed, the range can also be shifted, moving both the bottom and top limits by a fixed amount. Then, for a particular scale $\mu$ and shift $\xi$ (stored in metadata), the implied (original) value $x$ can quickly be recovered from any stored array element $i$ by evaluating $(i / \mu) - \xi$.
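The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than sharrow's implementation: the `encode` and `decode` helpers are hypothetical, and the encoding direction, `round((x + shift) * scale)`, is assumed here because it is the inverse of the decoding formula $(i / \mu) - \xi$ given in the text.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def encode(x, scale=100, shift=0.0):
    """Encode x as a scaled integer, checking that it fits in 16 bits."""
    i = round((x + shift) * scale)
    if not INT16_MIN <= i <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in int16 at scale {scale}")
    return i

def decode(i, scale=100, shift=0.0):
    """Recover the (approximate) original value: (i / scale) - shift."""
    return i / scale - shift

# A travel time is stored to two decimal places.
stored = encode(12.3456)
restored = decode(stored)
print(stored, restored)  # 1235 12.35

# Without a shift, 500.0 is outside the representable range. With a shift
# of -300 the range becomes about -27.68 to 627.67, so it fits.
shifted = encode(500.0, shift=-300.0)
print(shifted, decode(shifted, shift=-300.0))  # 20000 500.0
```

The round trip is lossy by design: 12.3456 comes back as 12.35, which is the precision given up in exchange for halving the memory.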
Fixed point digital encoding can be applied to matrix tables in the skims
using options in the network_los.yaml settings file. Making transformations
currently also requires shifting the data from OMX to ZARR file formats;
future versions of ActivitySim may accept digitally encoded data directly
from external sources.
To apply the default 16-bit encoding to individual named skim variables in the
TAZ skims, just give their names under the zarr-digital-encoding setting
like this:
taz_skims:
  omx: skims.omx
  zarr: skims.zarr
  zarr-digital-encoding:
    - name: SOV_TIME
    - name: HOV2_TIME
If some variables can use less RAM and still be represented adequately with only 8-bit integers, you can specify the bitwidth as well:
taz_skims:
  omx: skims.omx
  zarr: skims.zarr
  zarr-digital-encoding:
    - name: SOV_TIME
    - name: HOV2_TIME
    - name: SOV_TOLL
      bitwidth: 8
    - name: HOV2_TOLL
      bitwidth: 8
If groups of similarly named variables should have the same encoding applied, they can be identified by regular expressions (“regex”) instead of explicitly giving each name. For example:
taz_skims:
  omx: skims.omx
  zarr: skims.zarr
  zarr-digital-encoding:
    - regex: .*_TIME
    - regex: .*_TOLL
      bitwidth: 8
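To check which skim variables a pattern would select before running a model, the regex rules can be tried against a list of names in plain Python. This is a hedged sketch only: the skim names, the `select_bitwidths` helper, and the use of full-string matching are assumptions for illustration, and ActivitySim's own matching logic may differ.

```python
import re

# Hypothetical skim variable names.
skim_names = ["SOV_TIME", "HOV2_TIME", "SOV_TOLL", "HOV2_TOLL", "DIST"]

# The same rules as in the settings snippet, written as Python dicts.
rules = [
    {"regex": ".*_TIME"},
    {"regex": ".*_TOLL", "bitwidth": 8},
]

DEFAULT_BITWIDTH = 16  # the default 16-bit encoding mentioned in the text

def select_bitwidths(names, rules):
    """Return {name: bitwidth} for every name matched by some rule.

    Assumes whole-name matching and that the first matching rule wins.
    """
    selected = {}
    for rule in rules:
        pattern = re.compile(rule["regex"])
        for name in names:
            if pattern.fullmatch(name) and name not in selected:
                selected[name] = rule.get("bitwidth", DEFAULT_BITWIDTH)
    return selected

bitwidths = select_bitwidths(skim_names, rules)
print(bitwidths)
```

Here DIST matches no rule and is left out of the result, so it would stay unencoded under the assumptions of this sketch.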
Dictionary Encoding#
For dictionary encoding, a limited number of unique values are stored in a lookup array, and then each encoded value is stored as the position of the value (or its closest approximation) in the lookup array. If there are fewer than 256 unique values, this can allow the storage of those values to any level of precision (even float64 if needed) while using only a single byte per array element, plus a small fixed amount of overhead for the dictionary itself. The overhead memory doesn’t scale with the dimensions of the array, so this works particularly well for models with thousands of zones.
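The idea can be shown with a small pure-Python sketch. It is illustrative only and not how sharrow stores its data: the fare values are hypothetical, and this version handles exact matches only, not the closest-approximation case mentioned above.

```python
# Hypothetical fare values with only a few distinct levels.
fares = [2.25, 0.0, 2.25, 3.5, 0.0, 2.25]

# The lookup array holds each unique value once, at full precision.
lookup = sorted(set(fares))
assert len(lookup) <= 256  # required for single-byte codes

# Each element is stored as the position of its value in the lookup array.
position = {value: i for i, value in enumerate(lookup)}
codes = bytes(position[v] for v in fares)  # one byte per element

# Decoding is a plain index into the lookup array.
decoded = [lookup[c] for c in codes]

print(lookup)       # [0.0, 2.25, 3.5]
print(list(codes))  # [1, 0, 1, 2, 0, 1]
```

Only `codes` grows with the size of the matrix. The lookup array stays the same size however many zones there are.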
Dictionary encoding can be applied to a single variable in a similar fashion to
fixed point encoding, giving the dictionary bit width either directly in the
by_dict setting or as an additional bitwidth setting value.
taz_skims:
  omx: skims.omx
  zarr: skims.zarr
  zarr-digital-encoding:
    - name: TRANSIT_FARE
      by_dict: 8
    - name: TRANSIT_XFERS
      by_dict: true
      bitwidth: 8
The most dramatic memory savings can be found when the categorical correlation (also known as Cramér’s V) between multiple variables is high. In this case, we can encode more than one matrix table using the same dictionary lookup indexes. There may be some duplication in the lookup table: for example, if FARE and XFER are jointly encoded, and a FARE of 2.25 can be matched with either 0 or 1 XFER, then 2.25 would appear twice in the lookup array for FARE, once for each value of XFER.
Since it is the lookup indexes that scale with the number of zones and consume most of the memory for large zone systems, putting multiple variables together into one set of indexes can save a ton of memory, so long as the overhead of the lookup array does not combinatorially explode (hence the need for categorical correlation).
Practical testing for large zone systems suggests this method of encoding can reduce the footprint of some low-variance data tables (especially transit data) by 95% or more.
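Joint dictionary encoding can be sketched the same way, using the FARE and XFER example from the text. This is an illustration under assumptions, not sharrow's implementation: the data values and variable names are hypothetical.

```python
# Hypothetical, strongly related variables defined over the same cells.
fare = [2.25, 2.25, 0.0, 3.5, 2.25]
xfer = [0, 1, 0, 1, 0]

# One lookup entry per unique (fare, xfer) combination.
pairs = list(zip(fare, xfer))
joint_lookup = sorted(set(pairs))
position = {pair: i for i, pair in enumerate(joint_lookup)}

# A single shared array of codes serves both variables.
codes = bytes(position[p] for p in pairs)

# Each variable keeps its own lookup array, aligned on the shared codes.
fare_lookup = [f for f, _ in joint_lookup]
xfer_lookup = [x for _, x in joint_lookup]

decoded_fare = [fare_lookup[c] for c in codes]
decoded_xfer = [xfer_lookup[c] for c in codes]

print(fare_lookup)  # [0.0, 2.25, 2.25, 3.5]
print(xfer_lookup)  # [0, 0, 1, 1]
print(list(codes))  # [1, 2, 0, 3, 1]
```

As described above, 2.25 appears twice in the FARE lookup array, once for each XFER value it is paired with. The saving is that two variables share one array of codes. The lookup stays small only while the number of distinct combinations is small, which is why the variables need to be correlated.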
Applying joint dictionary encoding requires more than one variable name, so only
the regex form works here. Use wildcards to match on name patterns, or select a
few specific names by joining them with the pipe operator (|).
taz_skims:
  omx: skims.omx
  zarr: skims.zarr
  zarr-digital-encoding:
    - regex: .*_FARE|.*_WAIT|.*_XFERS
      joint_dict: true
    - regex: FERRYTIME|FERRYFARE|FERRYWAIT
      joint_dict: true
For more details on all the settings available for digital encoding, see DigitalEncoding.