mirror of
https://github.com/envmodules/modules.git
synced 2026-06-07 00:25:05 +08:00
398 lines
16 KiB
ReStructuredText
398 lines
16 KiB
ReStructuredText
.. _module-cache:
|
|
|
|
Module cache
|
|
============
|
|
|
|
Objective is to consolidate available module information in a few number of
|
|
files in order to process these files when doing global module queries like
|
|
done by :subcmd:`avail` or :subcmd:`whatis` sub-commands instead of walking
|
|
through the whole modulepath directories to compute this information.
|
|
|
|
With cache files, it is expected to have lesser IO load and more efficiently
|
|
handle environments with thousands modulefiles stored in a parallel
|
|
filesystem.
|
|
|
|
How to store cache
|
|
------------------
|
|
|
|
Module cache will be a file aggregating available module information. Module
|
|
are organized into modulepaths that are enabled in user environment. Goal is
|
|
to have one module cache file per modulepath. This file will be stored under
|
|
the modulepath root with the ``.modulecache`` file name.
|
|
|
|
Depending on modulepath ownership, modulepath may or may not have a cache file
|
|
built. It seems more efficient to save cache on a per-modulepath basis as a
|
|
user may enable diverse modulepaths, some of them not maintained by the admins
|
|
of the machine they are running on.
|
|
|
|
Having a central system cache may miss some user-managed modulepaths.
|
|
|
|
Having a per-user cache file does not seem a good idea. If built for the user,
|
|
a lot more information may be pre-processed: dynamic information from the
|
|
modulerc files for instance may be processed at cache file creation, not when
|
|
using the cache file. But it cannot be expected from the user to build its
|
|
cache and keep it up to date. Having the cache on the user side can lead
|
|
to additional issue with users complaining to the support that they cannot
|
|
find a given module, but due to an outdated cache file.
|
|
|
|
In definitive, there will be one cache file per modulepath directory.
|
|
|
|
Cache content
|
|
-------------
|
|
|
|
As cache files should be used instead of walking through modulepath directory
|
|
to search for valid modulercs and modulefiles. Cache file should contain
|
|
everything to be autonomous to proceed ``avail`` or ``whatis`` queries.
|
|
|
|
Yet module search can be very dynamic as things set for instance in modulerc
|
|
files may dynamically produce new modulefiles or define properties specific to
|
|
some users or groups.
|
|
|
|
So the produced cache file cannot be the static view generated for the user
|
|
who has built the cache file. Thus dynamic shaping is preserved by storing
|
|
scripts and commands in the cache file.
|
|
|
|
Modulerc files may be highly dynamic so their full content is purely saved
|
|
into the cache file. An entry is set for each valid modulerc file. Same goes
|
|
for modulefiles as some conditional structure in their code can dynamically
|
|
change dependency, whatis or other definitions.
|
|
|
|
It is expected that modulefile will be evaluated on ``avail`` or ``whatis``
|
|
commands for complex queries. So for such use case it is important to have
|
|
modulefile content cached.
|
|
|
|
Thus by reading the cache file, in one IO the content of all the modulerc and
|
|
modulefiles of the modulepath will be fetched.
|
|
|
|
Goal is to record in cache file every information obtained as a result of
|
|
``findModules`` procedure. Thus cache file contains modulefile modification
|
|
times, which is reported on ``--long`` output. Modulefile validity check
|
|
result is also recorded if modulefile is not valid. Modulerc validity check
|
|
is not recorded as it is not an information produced by ``findModules``.
|
|
|
|
Note: modulercs and modulefiles are not tested when generating cache file to
|
|
see if they are valid, if their code is correctly executed. Their bare content
|
|
is recorded. Same result is obtained this way whether a cache file is used or
|
|
not.
|
|
|
|
Cache format
|
|
------------
|
|
|
|
Goal is to make the cache file a Tcl script, like modulefiles. It will not
|
|
be exactly a modulefile as specific commands will be setup to evaluate cache
|
|
file and cache file will not make use of modulefile commands. See `Cache
|
|
evaluation`_ section for details on how a cache file is evaluated.
|
|
|
|
Cache file starts with a ``#%Module`` string like modulefiles. A version
|
|
number is appended next to these first characters (for instance
|
|
``%Module5.3``). See `Cache validity`_ section for details on the cache file
|
|
*prefix* string.
|
|
|
|
Modulefile content in modulepath is defined in cache with the
|
|
``modulefile-content`` command. Which takes in this order:
|
|
|
|
* relative file path
|
|
* file last modification date (as a Unix epoch time)
|
|
* file module header
|
|
* file body content.
|
|
|
|
For instance:
|
|
|
|
* ``modulefile-content foo/1.0 1234567890 #%Module {body}``
|
|
* or ``modulefile-content {foo/w s} 1234567890 #%Module5.2 {body}``
|
|
|
|
Similar command for modulerc: ``modulerc-content`` which accepts following
|
|
syntax:
|
|
|
|
* relative file path
|
|
* file module header
|
|
* file body content.
|
|
|
|
For instance: ``modulerc-content foo/.modulerc #%Module {body}``. Modification
|
|
time is not needed for modulerc as this information is not reported on an
|
|
``avail`` sub-command in long mode.
|
|
|
|
Every modulercs and modulefiles file contained in modulepath are recorded
|
|
in cache with ``modulerc-content`` or ``modulefile-content`` command.
|
|
Modulefiles whose name start with a dot character are also recorded in cache.
|
|
|
|
Invalid modulefiles are recorded in cache file through ``modulefile-invalid``
|
|
command rather ``modulefile-content``. It accepts following arguments:
|
|
|
|
* relative file path
|
|
* invalidity kind
|
|
* error message
|
|
|
|
For instance: ``modulefile-invalid foo/2.0 invalid {Magic cookie '#%Module'
|
|
missing}``
|
|
|
|
Files or directories that have limited access are recorded with specific
|
|
commands:
|
|
|
|
* ``limited-access-file foo/1.0``
|
|
* ``limited-access-directory foo``
|
|
|
|
Limited access means for a file that it cannot be read by user that builds
|
|
cache or other users. For a directory it means that it cannot be either read
|
|
or searched by user that builds cache or other users.
|
|
|
|
A modulefile or a modulerc is not recorded with ``modulefile-content`` or
|
|
``modulerc-content`` if file or one of its parent directory has limited
|
|
access. This way only content that can be read by everyone is recorded into
|
|
cache file. Sensitive information are excluded from cache file.
|
|
|
|
Note: a cache file generated by a privileged user (which has access to every
|
|
thing) and a cache file generated by a less privileged user will be the same
|
|
as limited-access information will not be included in cache file.
|
|
|
|
Files or directories recorded as limited access will need to be tested (and
|
|
walked down for directories) when cache file will be evaluated. It is
|
|
important to distinguish files from directories to save some file stat test
|
|
to determine if an element is a directory when limited access elements will be
|
|
tested. No need for files to distinguish modulefile from modulerc as this
|
|
difference is visible with file name.
|
|
|
|
Note: Limited access tests are skipped on Windows platform as Unix-style file
|
|
permission cannot be tested there.
|
|
|
|
Recording full modulefile content or subset of elements
|
|
-------------------------------------------------------
|
|
|
|
It was initially drafted that only a subset of element of modulefiles would be
|
|
recorded in cache file, to reduce size of this file and reduce its evaluation
|
|
time.
|
|
|
|
Recording full modulefile content is in the end preferred as:
|
|
|
|
* this solution is simpler to implement
|
|
* cache file size is not too big in the end (~ hundreds of KB for a thousand
|
|
of modulefiles)
|
|
* evaluation time of large cache file is acceptable (time taken to evaluate
|
|
cannot be noticed by user)
|
|
* simpler to understand and manage for sites
|
|
|
|
Producing a cache entry for a modulefile with only a subset of commands
|
|
recorded (like variant or requirement) is only feasible for modulefiles not
|
|
using conditionals or specific evaluation scheme. Recording full content will
|
|
work in any scenario whereas recording a subset limits cache usage. Moreover
|
|
it is hard to determine, depending of the modulefile set, where cache can be
|
|
used or not if only a subset of elements is recorded.
|
|
|
|
Cache validity
|
|
--------------
|
|
|
|
Cache file header indicates a Modules version number. It corresponds to the
|
|
Modules version:
|
|
|
|
* the cache file has been built with
|
|
* the cache file is compatible with
|
|
|
|
It seems reasonable to ask for a cache file update every time Modules is
|
|
upgraded to a newer minor version (for instance from 5.3 to 5.4):
|
|
|
|
* it is simpler to understand for staff and user when the cache is taken into
|
|
account, when it is ignored
|
|
* better to ensure cache file is accurate for the Modules version as modulerc
|
|
and modulefile commands may evolve from one version to another
|
|
|
|
Cache usage
|
|
-----------
|
|
|
|
Any time a modulepath directory is opened to get its content, the module cache
|
|
file will be used instead if available.
|
|
|
|
Modulepath content analysis is performed by ``findModules`` procedures. So
|
|
any sub-command calling it (directly or through ``getModules`` or
|
|
``getPathToModule``) will use the cache file. It corresponds to the following
|
|
sub-commands:
|
|
|
|
* lint
|
|
* paths
|
|
* search
|
|
* whatis
|
|
* aliases
|
|
* avail
|
|
* switch
|
|
* restore
|
|
* save
|
|
* display
|
|
* path
|
|
* source
|
|
* load
|
|
* test
|
|
* edit
|
|
* help
|
|
|
|
It may also occur during other sub-commands that evaluates modulefiles using
|
|
the ``is-avail`` command: like unload or refresh.
|
|
|
|
Cache files are ignored if :mconfig:`ignore_cache` configuration option is
|
|
enabled. This option can also be enabled just for one execution with the
|
|
:option:`--ignore-cache` command line option.
|
|
|
|
Cache files are ignored if cache expiry mechanism is enabled through the
|
|
:mconfig:`cache_expiry_secs` configuration option. When this option is set to
|
|
0, it means a cache file never expires. This is the default behavior. If set
|
|
to something else, cache file is expired if its last modification time is
|
|
older than the number of seconds defined in ``cache_expiry_secs``. Option
|
|
value is an integer between 1 and 31536000, which is the number of seconds
|
|
during 1 year.
|
|
|
|
Is there an impact at evaluating the full cache file rather making a directory
|
|
walk-through to find a module? Cache file is fully read, but not all the files
|
|
described in it are evaluated. Just those corresponding to the search, like it
|
|
is done when walking modulepath directory and evaluating only the modulerc
|
|
files corresponding to the query. So results between using cache file or not
|
|
should be the same: compared to a search without cache, no extra modulefile or
|
|
modulerc evaluation will be performed when a cache file is used.
|
|
|
|
As cache is recorded with both ``mcookie_check`` and ``mcookie_version_check``
|
|
options enabled, these two options are not honored (if disabled) when a cache
|
|
file is used. They are primarily useful to skip I/O tests when walking through
|
|
the content of a modulepath directory. As these I/O tests are done during the
|
|
cache build process, the options are useless when using cache files.
|
|
|
|
When cache file is not in sync
|
|
------------------------------
|
|
|
|
Files or directories are freely available through cache when used even if
|
|
after cache being built:
|
|
|
|
* their access is limited
|
|
* they are deleted
|
|
* their content changes and is not anymore valid
|
|
|
|
When files or directories have their access limited prior building cache, but
|
|
afterward these access limitations are lifted. These elements will require
|
|
an access test to check if they are available. This test will always be
|
|
successful as element accesses are not anymore limited.
|
|
|
|
If files or directories do not exist when cache is built, they will not be
|
|
found when cache is used.
|
|
|
|
If modulefile is recorded in cache as invalid, it will stay invalid if cache
|
|
is used even the modulefile is fixed. Cache need to be regenerated.
|
|
|
|
Read/write performances
|
|
-----------------------
|
|
|
|
:mconfig:`cache_buffer_bytes` configuration option defines size of the buffer
|
|
when reading or writing cache files.
|
|
|
|
With a bigger buffer, fewer read or write system calls are needed to read or
|
|
write cache file. On busy storage systems it can improve I/O performances.
|
|
|
|
Cache evaluation
|
|
----------------
|
|
|
|
A Tcl sub-interpreter is created to analyze cache files. This sub-interp is
|
|
setup to evaluate cache file-specific commands, like ``modulefile-content``.
|
|
|
|
When evaluated, ``modulefile-content``, ``modulerc-content`` and
|
|
``modulefile-invalid`` commands populate the read cache structure of modulerc
|
|
and modulefiles. This way when the modulefile for instance need to be read,
|
|
its content is already found in memory cache structure. It corresponds to the
|
|
following global variables:
|
|
|
|
* ``::g_modfileContent``
|
|
* ``::g_fileMtime`` (only for valid modulefile)
|
|
* ``::g_modfileValid`` (only for modulefile, valid or not)
|
|
|
|
In addition a ``::g_cacheModpath`` array is filled with an entry dedicated for
|
|
each modulepath. The content of this entry mimics the result list returned by
|
|
``findModules`` procedures with information for the whole content of the
|
|
modulepath.
|
|
|
|
Limited access files and directories described in cache by
|
|
``limited-access-file`` and ``limited-access-directory`` commands populate
|
|
specific structures to indicate some entries in modulepath have to be tested
|
|
(and walked down for directories) to determine if they are available to
|
|
current user:
|
|
|
|
* ``g_cacheFLimitedModpath``
|
|
* ``g_cacheDLimitedModpath``
|
|
|
|
These two structures are arrays with one entry per cached modulepath. Limited
|
|
access entries are tested if they match search query. Test is done through
|
|
``findModulesFromDirsAndFiles`` procedure which corresponds to the walk down
|
|
code extracted from ``findModules``.
|
|
|
|
This specific interpreter is re-used between different cache file evaluations.
|
|
As for modulefile interpreter, a consistency check is performed before each
|
|
reuse to test that the cache file-specific commands have not be rewritten
|
|
during previous cache file evaluation.
|
|
|
|
Cache file evaluation is tracked to avoid evaluating twice the same cache
|
|
file.
|
|
|
|
Cache evaluation stops if an erroneous command or syntax is encountered. Like
|
|
for erroneous modulerc, error is not reported during ``avail`` or ``whatis``
|
|
commands unless if ran in debug mode. Error is reporting during a ``load``
|
|
evaluation. Cache evaluation is considered failed if there is an error in the
|
|
cache file, thus a non-cache module search will occur instead of relying on
|
|
cache module listing. However descriptions of modulefile and modulerc
|
|
evaluated in cache prior the error occurs are retained.
|
|
|
|
cachebuild sub-command
|
|
----------------------
|
|
|
|
:subcmd:`cachebuild` sub-command creates a module cache file in modulepaths.
|
|
Without arguments, it attempts to create cache in every enabled modulepaths
|
|
where running user has the right to write. If arguments are provided, cache
|
|
is build in the directories pointed by these arguments.
|
|
|
|
General properties:
|
|
|
|
* Shortcut name: none
|
|
* Accepted option: none
|
|
* Expected number of argument: 0 to N
|
|
* Accept boolean variant specification: no
|
|
* Parse module version specification: no
|
|
* Fully read modulefile when checking validity: yes
|
|
* Sub-command only called from top level: yes
|
|
* Lead to modulefile evaluation: yes (``cachebuild``)
|
|
|
|
An error is returned for each specified directories where current user has no
|
|
write access.
|
|
|
|
An error is returned if a modulefile or a modulerc cannot be read. This error
|
|
ends cache content generation for current modulepath. Build continues with
|
|
next modulepath after this error.
|
|
|
|
Modulepaths where current user has no write rights are skipped and reported
|
|
with a warning notice.
|
|
|
|
Reports a ``Creating <modulepath>`` block header message for each cache file
|
|
created or updated. This report is made when verbosity is set to ``normal``
|
|
or higher mode.
|
|
|
|
:mconfig:`mcookie_check` and :mconfig:`mcookie_version_check` options are both
|
|
enabled when recording cache. This is produced with exact same content whether
|
|
these options are enabled or not.
|
|
|
|
cacheclear sub-command
|
|
----------------------
|
|
|
|
:subcmd:`cacheclear` sub-command deletes all module cache file in enabled
|
|
modulepaths.
|
|
|
|
General properties:
|
|
|
|
* Shortcut name: none
|
|
* Accepted option: none
|
|
* Expected number of argument: 0
|
|
* Accept boolean variant specification: no
|
|
* Parse module version specification: no
|
|
* Fully read modulefile when checking validity: no
|
|
* Sub-command only called from top level: yes
|
|
* Lead to modulefile evaluation: no
|
|
|
|
Modulepaths where current user has no write rights on the modulepath directory
|
|
are skipped and reported with a warning notice.
|
|
|
|
Reports a ``Deleting <modulepath>`` block header message for each cache file
|
|
created or updated. This report is made when verbosity is set to ``normal``
|
|
or higher mode.
|
|
|
|
.. vim:set tabstop=2 shiftwidth=2 expandtab autoindent:
|