FileSystem
- settings activitysim.core.configuration.FileSystem
Manage finding and loading files for ActivitySim’s command line interface.
- Config:
validate_assignment: bool = True
- Fields:
cache_dir (pathlib.Path)
configs_dir (tuple[pathlib.Path, ...])
data_dir (tuple[pathlib.Path, ...])
data_model_dir (tuple[pathlib.Path, ...])
output_dir (pathlib.Path)
pipeline_file_name (str)
profile_dir (pathlib.Path)
settings_file_name (str)
sharrow_cache_dir (pathlib.Path)
working_dir (pathlib.Path)
- field cache_dir: Path = None
Name of the output directory for general cache files.
If not given, a directory named “cache” will be created inside the usual output directory.
- field output_dir: Path = 'output'
Name of the output directory.
This directory will be created on access if it does not exist.
- field profile_dir: Path = None
Name of the output directory for pyinstrument profiling files.
If not given, a unique time-stamped directory will be created inside the usual output directory.
- field sharrow_cache_dir: Path = None
Name of the output directory for sharrow cache files.
If not given, the sharrow cache is stored in a run-independent persistent location, according to platformdirs.user_cache_dir. See persist_sharrow_cache.
- field working_dir: DirectoryPath = None
Name of the working directory.
All other directories (configs, data, output, cache), when given as relative paths, are assumed to be relative to this working directory. If it is not provided, the usual Python current working directory is used.
- Constraints:
path_type = dir
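The rule for relative paths described under working_dir can be sketched with pathlib alone. This is a minimal illustration under stated assumptions, not ActivitySim's implementation: `resolve_dir` is a hypothetical helper name, and the directory names are made up for the example.

```python
from pathlib import Path


def resolve_dir(path, working_dir=None):
    """Hypothetical helper (not part of ActivitySim): resolve a directory setting."""
    # Fall back to the Python current working directory when none is given.
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    path = Path(path)
    # Absolute paths are used as given; relative paths hang off the base.
    return path if path.is_absolute() else base / path


work = Path.cwd() / "run1"  # illustrative working directory
relative = resolve_dir("output", working_dir=work)
absolute = resolve_dir(work.parent / "shared_data", working_dir=work)
fallback = resolve_dir("configs")
```

Here `relative` lands inside the working directory, `absolute` is returned unchanged, and `fallback` is resolved against the current working directory.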
- classmethod configs_dirs_must_exist(configs_dir, values)
- classmethod data_dirs_must_exist(data_dir, values)
- classmethod data_model_dirs_must_exist(data_model_dir, values)
- find_trace_file_path(file_name, trace_dir=None, return_all=False, file_type=None)
Find the complete path to one or more existing trace file(s).
- Parameters:
- file_name : str
Base name of the trace file.
- trace_dir : path-like, optional
Construct the trace file path within this directory. If not provided (typically for normal operation) the “trace” sub-directory of the normal output directory given by get_output_dir is used. The option to give a different location is primarily used to conduct trace file validation testing.
- return_all : bool, default False
By default, exactly one file must match and its path is returned; if zero or several files match, an exception is raised. Set this to True to return all matches instead.
- file_type : str, optional
If provided, ensure that the located file path(s) have this extension.
- Returns:
- Path or list[Path]
A single Path if return_all is False, otherwise a list of Path objects.
- Raises:
- FileNotFoundError
If there are zero OR multiple matches.
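The return/raise contract above can be illustrated with a small stand-in. This is only a sketch: `find_one_or_all` is a hypothetical function, and the matching rule (file name starts with the base name) and the behaviour of return_all with no matches are assumptions, since the reference does not spell them out.

```python
import tempfile
from pathlib import Path


def find_one_or_all(directory, file_name, return_all=False, file_type=None):
    """Hypothetical stand-in for the documented lookup contract."""
    # Assumption: a file "matches" when its name starts with `file_name`.
    matches = sorted(
        p for p in Path(directory).iterdir() if p.name.startswith(file_name)
    )
    if file_type is not None:
        # Keep only paths with the requested extension.
        matches = [p for p in matches if p.suffix == f".{file_type}"]
    if return_all:
        # Assumption: with return_all the list is returned as-is, even if empty.
        return matches
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly 1 match for {file_name!r}, found {len(matches)}"
        )
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("tours-a1.csv", "trips-a1.csv", "trips-b2.csv"):
        (Path(tmp) / name).touch()
    single = find_one_or_all(tmp, "tours").name
    everything = [p.name for p in find_one_or_all(tmp, "trips", return_all=True)]
    try:
        find_one_or_all(tmp, "trips")  # two candidates: ambiguous
        ambiguous_raises = False
    except FileNotFoundError:
        ambiguous_raises = True
```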
- get_cache_dir(subdir=None) → Path
Get the cache directory, creating it if needed.
- The cache directory is used to store:
skim memmaps created by the skim dict factories
tvpb tap_tap table cache
pre-compiled sharrow modules
- Parameters:
- subdir : Path-like, optional
If given, get this subdirectory of the cache directory.
- Returns:
- Path
- get_config_file_path(file_name: Path | str, mandatory: bool = True, allow_glob: bool = False) → Path
Find the first matching file among config directories.
- Parameters:
- file_name : Path-like
The name of the file to match.
- mandatory : bool, default True
Raise a FileNotFoundError if no match is found. If set to False, this method returns None when there is no match.
- allow_glob : bool, default False
Allow glob-style matches.
- Returns:
- Path or None
- get_data_file_path(file_name, mandatory=True, allow_glob=False, alternative_suffixes=()) → Path
Find the first matching file among data directories.
- Parameters:
- file_name : Path-like
The name of the file to match.
- mandatory : bool, default True
Raise a FileNotFoundError if no match is found. If set to False, this method returns None when there is no match.
- allow_glob : bool, default False
Allow glob-style matches.
- alternative_suffixes : Iterable[str], optional
Other file suffixes to search for, if the expected filename is not found. This allows, for example, the data files to be stored as compressed csv (“*.csv.gz”) without changing the config files.
- Returns:
- Path or None
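The first-match search with alternative suffixes might look roughly like the following. It is a sketch under assumptions: `first_match` is a hypothetical helper, and the search order (exact name in every directory first, then each alternative suffix) is a guess at reasonable behaviour rather than a statement about ActivitySim's code.

```python
import tempfile
from pathlib import Path


def first_match(dirs, file_name, mandatory=True, alternative_suffixes=()):
    """Hypothetical helper: return the first existing candidate, or None."""
    # Assumption: the exact name is tried in every directory before any
    # alternative suffix is considered.
    names = [Path(file_name)]
    names += [Path(file_name).with_suffix(s) for s in alternative_suffixes]
    for name in names:
        for d in dirs:
            candidate = Path(d) / name
            if candidate.exists():
                return candidate
    if mandatory:
        raise FileNotFoundError(str(file_name))
    return None


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "data_a"
    second = Path(tmp) / "data_b"
    first.mkdir()
    second.mkdir()
    # Only a compressed copy exists, and only in the second directory.
    (second / "land_use.csv.gz").touch()
    found = first_match(
        [first, second], "land_use.csv", alternative_suffixes=[".csv.gz"]
    )
    found_name = found.name
    found_parent = found.parent.name
    missing = first_match([first, second], "households.csv", mandatory=False)
```

With this layout the config files can keep referring to `land_use.csv` while the file on disk is `land_use.csv.gz`.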
- get_log_file_path(file_name) → Path
Get the complete path to a log file.
- Parameters:
- file_name : str
Base name of the log file.
- Returns:
- Path
- get_output_dir(subdir=None) → Path
Get an output directory, creating it if needed.
- Parameters:
- subdir : Path-like, optional
If given, get this subdirectory of the output_dir.
- Returns:
- Path
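The "creating it if needed" behaviour corresponds to a common pathlib idiom. The sketch below uses a hypothetical `ensure_dir` helper and a temporary directory; it illustrates the idiom and is not taken from ActivitySim.

```python
import tempfile
from pathlib import Path


def ensure_dir(base, subdir=None):
    """Hypothetical helper: return `base` (or `base/subdir`), creating it if missing."""
    out = Path(base) if subdir is None else Path(base) / subdir
    # parents=True creates intermediate directories; exist_ok=True makes
    # repeated calls harmless.
    out.mkdir(parents=True, exist_ok=True)
    return out


with tempfile.TemporaryDirectory() as tmp:
    trace_dir = ensure_dir(Path(tmp) / "output", "trace")
    created = trace_dir.is_dir()
    # A second call returns the same path without raising.
    again = ensure_dir(Path(tmp) / "output", "trace") == trace_dir
```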
- get_pipeline_filepath() → Path
Get the complete path to the pipeline file or directory.
- Returns:
- Path
- get_profiling_file_path(file_name) → Path
Get the complete path to a profile output file.
- Parameters:
- file_name : str
Base name of the profiling output file.
- Returns:
- Path
- get_sharrow_cache_dir() → Path
Get the sharrow cache directory, creating it if needed.
The sharrow cache directory is used to store only sharrow’s cache of pre-compiled functions.
- Returns:
- Path
- get_trace_file_path(file_name, tail=None, trace_dir=None, create_dirs=True, file_type=None)
Get the complete path to a trace file.
- Parameters:
- file_name : str
Base name of the trace file.
- tail : str or False, optional
Add this suffix to filenames. If not given, a quasi-random short string is derived from the current time. Set to False to omit the suffix entirely. Having a unique suffix makes it easier to open multiple comparable trace files side-by-side in Excel, which doesn’t allow identically named files to be open simultaneously. Omitting the suffix can be valuable for using automated tools to find file differences across many files simultaneously.
- trace_dir : path-like, optional
Construct the trace file path within this directory. If not provided (typically for normal operation) the “trace” sub-directory of the normal output directory given by get_output_dir is used. The option to give a different location is primarily used to conduct trace file validation testing.
- create_dirs : bool, default True
If the path to the containing directory of the trace file does not yet exist, create it.
- file_type : str, optional
If provided, ensure that the generated file path has this extension.
- Returns:
- Path
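The effect of the tail and file_type options can be sketched as follows. Everything here is illustrative: `sketch_trace_path` is a hypothetical function, the time-derived suffix format is an assumption, and so is the choice to place the suffix between the file stem and its extension.

```python
import time
from pathlib import Path


def sketch_trace_path(output_dir, file_name, tail=None, file_type=None):
    """Hypothetical sketch of building a trace file path with an optional suffix."""
    if tail is None:
        # Assumption: derive a short quasi-random suffix from the current time.
        tail = "-" + hex(time.time_ns())[-6:]
    elif tail is False:
        tail = ""  # omit the suffix entirely
    name = Path(file_name)
    stem, suffix = name.stem, name.suffix
    if file_type is not None and suffix != f".{file_type}":
        # Ensure the requested extension by appending it to the full name.
        stem, suffix = name.name, f".{file_type}"
    # Assumption: the suffix goes between the stem and the extension.
    return Path(output_dir) / "trace" / f"{stem}{tail}{suffix}"


fixed = sketch_trace_path("output", "tours.csv", tail="-run1")
bare = sketch_trace_path("output", "tours", tail=False, file_type="csv")
auto = sketch_trace_path("output", "tours.csv")
```

A fixed or omitted tail gives reproducible names, which suits automated diffing; the default gives each run a distinct name so comparable files can be opened side-by-side.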
- open_log_file(file_name, mode, header=None, prefix=False)
- classmethod parse_args(args)
- parse_settings(settings)
- persist_sharrow_cache() → None
Change the sharrow cache directory to a persistent, user-global location.
The change is made in-place to sharrow_cache_dir for this object. The location for the cache is selected by platformdirs.user_cache_dir. An extra directory layer based on the current numba version is also added to the cache directory, which allows different sets of cache files to co-exist for different versions of numba (i.e. different conda envs). This location is not configurable; to select a different location, change the value of FileSystem.sharrow_cache_dir itself.
See also: FileSystem.sharrow_cache_dir
- read_model_coefficients(model_settings: LogitComponentSettings | dict[str, Any] | None = None, file_name: str | None = None) → DataFrame
- read_model_settings(file_name, mandatory=False)
- read_settings_file(file_name: str, mandatory: bool = True, include_stack: bool = False, configs_dir_list: tuple[Path] | None = None, validator_class: type[BaseModel] | None = None) → BaseModel | dict
Load settings from one or more yaml files.
This method looks for the first occurrence of a yaml file named <file_name> in the directories in the configs_dir list, and reads settings from that yaml file.
A settings file may contain directives that affect which settings are returned:
- inherit_settings (boolean)
If found and set to true, this method will backfill settings in the current file with values from the next settings file in the configs_dir list (if any).
- include_settings: string <include_file_name>
Read settings from specified include_file in place of the current file. To avoid confusion, this directive must appear ALONE in the target file, without any additional settings or directives.
- Parameters:
- file_name : str
- mandatory : boolean, default True
If true, raise SettingsFileNotFoundError if no matching settings file is found in any config directory, otherwise this method will return an empty dict or an all-default instance of the validator class.
- include_stack : boolean or list
Only used for recursive calls, provides a list of files included so far to detect and prevent cycles.
- validator_class : type[pydantic.BaseModel], optional
This model is used to validate the loaded settings.
- Returns:
- dict or validator_class
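The inherit_settings and include_settings directives can be illustrated without touching the file system. In this sketch each config directory is modelled as a plain dict from file name to already-parsed settings; `resolve_settings`, the setting keys, and the details of how included files are looked up are all assumptions made for illustration, not ActivitySim's implementation.

```python
def resolve_settings(file_name, config_dirs, mandatory=True, include_stack=()):
    """Hypothetical sketch of directive handling over in-memory config 'directories'."""
    if file_name in include_stack:
        # The include stack records files already visited, to catch cycles.
        raise RuntimeError(f"circular include involving {file_name!r}")
    merged = {}
    found = False
    for configs in config_dirs:
        if file_name not in configs:
            continue
        found = True
        settings = dict(configs[file_name])
        if "include_settings" in settings:
            # The directive stands alone, so the included file replaces this one.
            # Assumption: the included file is searched from the first directory.
            settings = resolve_settings(
                settings["include_settings"],
                config_dirs,
                include_stack=include_stack + (file_name,),
            )
        inherit = settings.pop("inherit_settings", False)
        # Earlier files win; later files only backfill missing keys.
        for key, value in settings.items():
            merged.setdefault(key, value)
        if not inherit:
            break
    if not found and mandatory:
        raise FileNotFoundError(file_name)
    return merged


config_dirs = [
    {
        "settings.yaml": {"inherit_settings": True, "sample_size": 100},
        "model.yaml": {"include_settings": "shared.yaml"},
        "shared.yaml": {"alpha": 1},
    },
    {"settings.yaml": {"sample_size": 0, "chunk_size": 0}},
]
backfilled = resolve_settings("settings.yaml", config_dirs)
included = resolve_settings("model.yaml", config_dirs)
```

In this example the first directory's `sample_size` takes precedence, `chunk_size` is backfilled from the second directory, and `model.yaml` resolves to the contents of `shared.yaml`.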