/**
|
||
|
* \page H5AC-cache-config-t Metadata Cache Configuration
|
||
|
* \tableofcontents
|
||
|
*
|
||
|
* \section gcf General configuration fields
|
||
|
*
|
||
|
* \par version
|
||
|
* Integer field containing the version number of this version
|
||
|
* of the H5AC_cache_config_t structure. Any instance of
|
||
|
* H5AC_cache_config_t passed to the cache must have a known
|
||
|
* version number, or an error will be flagged.
|
||
|
*
|
||
|
* \par rpt_fcn_enabled
|
||
|
* \parblock
|
||
|
* Boolean field used to enable and disable the default
|
||
|
* reporting function. This function is invoked every time the
|
||
|
* automatic cache resize code is run, and reports on its activities.
|
||
|
*
|
||
|
* This is a debugging function, and should normally be turned off.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par open_trace_file
|
||
|
* \parblock
|
||
|
* Boolean field indicating whether the trace_file_name
|
||
|
* field should be used to open a trace file for the cache.
|
||
|
*
|
||
|
* \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead
|
||
|
*
|
||
|
* The trace file is a debugging feature that allows the capture of
|
||
|
* top level metadata cache requests for purposes of debugging and/or
|
||
|
* optimization. This field should normally be set to \c FALSE, as
|
||
|
* trace file collection imposes considerable overhead.
|
||
|
*
|
||
|
* This field should only be set to \c TRUE when the trace_file_name
|
||
|
* contains the full path of the desired trace file, and either
|
||
|
* there is no open trace file on the cache, or the \c close_trace_file
|
||
|
* field is also \c TRUE.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par close_trace_file
|
||
|
* \parblock
|
||
|
* Boolean field indicating whether the current trace
|
||
|
* file (if any) should be closed.
|
||
|
*
|
||
|
* \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead
|
||
|
*
|
||
|
* See the above comments on the open_trace_file field. This field
|
||
|
* should be set to \c FALSE unless there is an open trace file on the
|
||
|
* cache that you wish to close.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par trace_file_name
|
||
|
* \parblock
|
||
|
* Full path of the trace file to be opened if the
|
||
|
* open_trace_file field is \c TRUE.
|
||
|
*
|
||
|
* \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead
|
||
|
*
|
||
|
* In the parallel case, an ASCII representation of the MPI rank of
|
||
|
* the process will be appended to the file name to yield a unique
|
||
|
* trace file name for each process.
|
||
|
*
|
||
|
* The length of the path must not exceed #H5AC__MAX_TRACE_FILE_NAME_LEN
|
||
|
* characters.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par evictions_enabled
|
||
|
* \parblock
|
||
|
* Boolean field used to either report the current
|
||
|
* evictions enabled status of the cache, or to set the cache's
|
||
|
* evictions enabled status.
|
||
|
*
|
||
|
* In general, the metadata cache should always be allowed to
|
||
|
* evict entries. However, in some cases it is advantageous to
|
||
|
* disable evictions briefly, and thereby postpone metadata
|
||
|
* writes. However, this must be done with care, as the cache
|
||
|
* can grow quickly. If you do this, re-enable evictions as
|
||
|
* soon as possible and monitor cache size.
|
||
|
*
|
||
|
* At present, evictions can only be disabled if automatic
|
||
|
* cache resizing is also disabled (that is, \Code{(incr_mode ==
|
||
|
* H5C_incr__off ) && ( decr_mode == H5C_decr__off )}). There
|
||
|
* is no logical reason why this should be so, but it simplifies
|
||
|
* implementation and testing, and I can't think of any reason
|
||
|
* why it would be desirable. If you can think of one, I'll
|
||
|
* revisit the issue. (JM)
|
||
|
* \endparblock
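*
* The constraint above can be written as a small check. This is only an
* illustrative sketch: the enum and function names are hypothetical
* stand-ins, not the HDF5 types, and the library's own validation may
* differ in detail.

```c
#include <stdbool.h>

// Hypothetical stand-ins for the resize modes; these are not the HDF5 enums.
typedef enum { SKETCH_INCR_OFF, SKETCH_INCR_THRESHOLD } sketch_incr_mode;
typedef enum {
    SKETCH_DECR_OFF,
    SKETCH_DECR_THRESHOLD,
    SKETCH_DECR_AGE_OUT,
    SKETCH_DECR_AGE_OUT_WITH_THRESHOLD
} sketch_decr_mode;

// Returns true when the requested evictions_enabled value is compatible
// with the resize modes: evictions may be disabled only if both automatic
// increase and automatic decrease are off.
static bool sketch_evictions_setting_ok(bool evictions_enabled,
                                        sketch_incr_mode incr_mode,
                                        sketch_decr_mode decr_mode)
{
    if (evictions_enabled)
        return true;
    return incr_mode == SKETCH_INCR_OFF && decr_mode == SKETCH_DECR_OFF;
}
```

* A caller would run such a check before handing a configuration with
* evictions disabled to the cache.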
|
||
|
*
|
||
|
* \par set_initial_size
|
||
|
* Boolean flag indicating whether the
|
||
|
* initial size of the cache is to be set to the value given in
|
||
|
* the initial_size field. If set_initial_size is \c FALSE, the
|
||
|
* initial_size field is ignored.
|
||
|
*
|
||
|
* \par initial_size
|
||
|
* If enabled, this field contains the size the cache is
|
||
|
* to be set to upon receipt of this structure. Needless to say,
|
||
|
* initial_size must lie in the closed interval \Code{[min_size, max_size]}.
|
||
|
*
|
||
|
* \par min_clean_fraction
|
||
|
* \c double in the range 0 to 1 indicating the fraction
|
||
|
* of the cache that is to be kept clean. This field is only used
|
||
|
* in parallel mode. Typical values are 0.1 to 0.5.
|
||
|
*
|
||
|
* \par max_size
|
||
|
* Maximum size to which the cache can be adjusted. The
|
||
|
* supplied value must fall in the closed interval
|
||
|
* \Code{[H5C__MIN_MAX_CACHE_SIZE, H5C__MAX_MAX_CACHE_SIZE]}. Also, \c max_size must
|
||
|
* be greater than or equal to \c min_size.
|
||
|
*
|
||
|
* \par min_size
|
||
|
* Minimum size to which the cache can be adjusted. The
|
||
|
* supplied value must fall in the closed interval
|
||
|
* \Code{[H5C__MIN_MAX_CACHE_SIZE, H5C__MAX_MAX_CACHE_SIZE]}. Also, \c min_size
|
||
|
* must be less than or equal to \c max_size.
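*
* The ordering rules for \c min_size, \c max_size and \c initial_size can be
* summarized as follows. This is a minimal sketch under stated assumptions:
* the function name is hypothetical, and \c lo_bound and \c hi_bound are
* placeholders for the library limits named above, whose numeric values are
* not assumed here.

```c
#include <stdbool.h>
#include <stddef.h>

// lo_bound and hi_bound stand in for the library's size limits.
static bool sketch_sizes_ok(size_t min_size, size_t max_size,
                            bool set_initial_size, size_t initial_size,
                            size_t lo_bound, size_t hi_bound)
{
    if (min_size < lo_bound || min_size > hi_bound)
        return false;
    if (max_size < lo_bound || max_size > hi_bound)
        return false;
    if (min_size > max_size)
        return false;
    // initial_size is only checked when set_initial_size is TRUE.
    if (set_initial_size && (initial_size < min_size || initial_size > max_size))
        return false;
    return true;
}
```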
|
||
|
*
|
||
|
* \par epoch_length
|
||
|
* \parblock
|
||
|
* Number of accesses on the cache over which to collect
|
||
|
* hit rate stats before running the automatic cache resize code,
|
||
|
* if it is enabled.
|
||
|
*
|
||
|
* At the end of an epoch, we discard prior hit rate data and start
|
||
|
* collecting afresh. The epoch_length must lie in the closed
|
||
|
* interval \Code{[H5C__MIN_AR_EPOCH_LENGTH, H5C__MAX_AR_EPOCH_LENGTH]}.
|
||
|
* \endparblock
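*
* As an illustration of how hit rate data is gathered per epoch, the sketch
* below counts accesses and hits and reports the rate once \c epoch_length
* accesses have been seen. The struct and function names are hypothetical;
* this is a model of the description above, not the cache's own code.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t epoch_length; // accesses per epoch
    size_t accesses;     // accesses seen so far in this epoch
    size_t hits;         // hits seen so far in this epoch
} sketch_epoch;

// Records one access. Returns -1.0 while the epoch is still running.
// When the epoch ends, returns its hit rate and discards the counters
// so that the next epoch starts afresh.
static double sketch_record_access(sketch_epoch *e, bool hit)
{
    e->accesses++;
    if (hit)
        e->hits++;
    if (e->accesses < e->epoch_length)
        return -1.0;
    double rate = (double)e->hits / (double)e->accesses;
    e->accesses = 0;
    e->hits     = 0;
    return rate;
}
```

* The returned rate is what the resize code would compare against the hit
* rate thresholds described in the following sections.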
|
||
|
*
|
||
|
*
|
||
|
* \section csicf Cache size increase control fields
|
||
|
*
|
||
|
* \par incr_mode
|
||
|
* Instance of the \c H5C_cache_incr_mode enumerated type whose
|
||
|
* value indicates how we determine whether the cache size should be
|
||
|
* increased. At present there are two possible values:
|
||
|
* \li \c H5C_incr__off: Don't attempt to increase the size of the cache
|
||
|
* automatically.\n
|
||
|
* When this increment mode is selected, the remaining fields
|
||
|
* in the cache size increase section are ignored.
|
||
|
* \li \c H5C_incr__threshold: Attempt to increase the size of the cache
|
||
|
* whenever the average hit rate over the last epoch drops
|
||
|
* below the value supplied in the \c lower_hr_threshold
|
||
|
* field.\n
|
||
|
* Note that this attempt will fail if the cache is already
|
||
|
* at its maximum size, or if the cache is not already using
|
||
|
* all available space.
|
||
|
*
|
||
|
* Note that you must set \c incr_mode to \c H5C_incr__off if you
|
||
|
* disable metadata cache entry evictions.
|
||
|
*
|
||
|
* \par lower_hr_threshold
|
||
|
* \parblock
|
||
|
* Lower hit rate threshold. If the increment mode
|
||
|
* (\c incr_mode) is \c H5C_incr__threshold and the hit rate drops below the
|
||
|
* value supplied in this field in an epoch, increment the cache size by
|
||
|
* the \c increment multiplier. Note that cache size may not be incremented above
|
||
|
* \c max_size, and that the increment may be further restricted by the
|
||
|
* \c max_increment field if it is enabled.
|
||
|
*
|
||
|
* When enabled, this field must contain a value in the range [0.0, 1.0].
|
||
|
* Depending on the \c incr_mode selected, it may also have to be less than
|
||
|
* \c upper_hr_threshold.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par increment
|
||
|
* \parblock
|
||
|
* Double containing the multiplier used to derive the new
|
||
|
* cache size from the old if a cache size increment is triggered.
|
||
|
* The increment must be greater than 1.0, and should not exceed 2.0.
|
||
|
*
|
||
|
* The new cache size is obtained by multiplying the current max cache
|
||
|
* size by the increment, and then clamping to \c max_size and to stay
|
||
|
* within the \c max_increment as necessary.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par apply_max_increment
|
||
|
* Boolean flag indicating whether the \c max_increment
|
||
|
* field should be used to limit the maximum cache size increment.
|
||
|
*
|
||
|
* \par max_increment
|
||
|
* If enabled by the \c apply_max_increment field described
|
||
|
* above, this field contains the maximum number of bytes by which the
|
||
|
* cache size can be increased in a single re-size.
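*
* Taken together, \c increment, \c apply_max_increment, \c max_increment and
* \c max_size determine the size after a threshold-triggered increase. The
* sketch below models that arithmetic; the function name is hypothetical,
* and the order in which the two limits are applied is an assumption (the
* result is the same either way, since both are upper bounds).

```c
#include <stdbool.h>
#include <stddef.h>

// Computes the cache size after an increase is triggered.
// Overflow of cur_size + max_increment is not handled in this sketch.
static size_t sketch_increased_size(size_t cur_size, double increment,
                                    size_t max_size,
                                    bool apply_max_increment,
                                    size_t max_increment)
{
    size_t new_size = (size_t)((double)cur_size * increment);
    // Limit the step if max_increment is enabled.
    if (apply_max_increment && new_size > cur_size + max_increment)
        new_size = cur_size + max_increment;
    // Never exceed max_size.
    if (new_size > max_size)
        new_size = max_size;
    return new_size;
}
```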
|
||
|
*
|
||
|
* \par flash_incr_mode
|
||
|
* \parblock
|
||
|
* Instance of the \c H5C_cache_flash_incr_mode enumerated
|
||
|
* type whose value indicates whether and by which algorithm we should
|
||
|
* make flash increases in the size of the cache to accommodate insertion
|
||
|
* of large entries and large increases in the size of a single entry.
|
||
|
*
|
||
|
* The addition of the flash increment mode was occasioned by performance
|
||
|
* problems that appear when a local heap is increased to a size in excess
|
||
|
* of the current cache size. While the existing re-size code dealt with
|
||
|
* this eventually, performance was very bad for the remainder of the
|
||
|
* epoch.
|
||
|
*
|
||
|
* At present, there are two possible values for the \c flash_incr_mode:
|
||
|
*
|
||
|
* \li \c H5C_flash_incr__off: Don't perform flash increases in the size of the cache.
|
||
|
*
|
||
|
* \li \c H5C_flash_incr__add_space: Let \c x be either the size of a newly
|
||
|
* inserted entry, or the number of bytes by which the
|
||
|
* size of an existing entry has been increased.\n
|
||
|
* If \Code{x > flash_threshold * current max cache size},
|
||
|
* increase the current maximum cache size by \Code{x * flash_multiple}
|
||
|
* less any free space in the cache, and start a new epoch. For
|
||
|
* now at least, pay no attention to the maximum increment.
|
||
|
*
|
||
|
* In both of the above cases, the flash increment pays no attention to
|
||
|
* the maximum increment (at least in this first incarnation), but DOES
|
||
|
* stay within max_size.
|
||
|
*
|
||
|
* With a little thought, it should be obvious that the above flash
|
||
|
* cache size increase algorithm is not sufficient for all circumstances
|
||
|
* -- for example, suppose the user round robins through
|
||
|
* \Code{(1/flash_threshold) +1} groups, adding one data set to each on each
|
||
|
* pass. Then all will increase in size at about the same time, requiring
|
||
|
* the max cache size to at least double to maintain acceptable
|
||
|
* performance; however, the above flash increment algorithm will not be
|
||
|
* triggered.
|
||
|
*
|
||
|
* Hopefully, the add space algorithms detailed above will be sufficient
|
||
|
* for the performance problems encountered to date. However, we should
|
||
|
* expect to revisit the issue.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par flash_multiple
|
||
|
* Double containing the multiple described above in the
|
||
|
* \c H5C_flash_incr__add_space section of the discussion of the
|
||
|
* \c flash_incr_mode field. This field is ignored unless \c flash_incr_mode
|
||
|
* is \c H5C_flash_incr__add_space.
|
||
|
*
|
||
|
* \par flash_threshold
|
||
|
* Double containing the factor by which current max cache
|
||
|
* size is multiplied to obtain the size threshold for the add_space flash
|
||
|
* increment algorithm. The field is ignored unless \c flash_incr_mode is
|
||
|
* \c H5C_flash_incr__add_space.
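*
* The add_space rule described under \c flash_incr_mode can be modeled as
* follows. This is a sketch under assumptions: the function name is
* hypothetical, \c used is taken to be the number of bytes in the cache
* before the new space is accounted for, and the maximum increment is
* ignored, as the text states.

```c
#include <stddef.h>

// x is the size of a newly inserted entry, or the number of bytes by
// which an existing entry grew. Returns the new maximum cache size.
static size_t sketch_flash_size(size_t cur_max, size_t used, size_t x,
                                double flash_threshold, double flash_multiple,
                                size_t max_size)
{
    // Below the threshold: no flash increase.
    if ((double)x <= flash_threshold * (double)cur_max)
        return cur_max;
    double free_space = used < cur_max ? (double)(cur_max - used) : 0.0;
    double increase   = (double)x * flash_multiple - free_space;
    if (increase <= 0.0)
        return cur_max;
    double new_size = (double)cur_max + increase;
    // The flash increase still stays within max_size.
    if (new_size > (double)max_size)
        return max_size;
    return (size_t)new_size;
}
```

* For example, with a current maximum of 1000 bytes, 900 bytes in use, a
* threshold of 0.25 and a multiple of 1.0, a 500 byte entry exceeds the
* 250 byte threshold and raises the maximum by 500 - 100 = 400 bytes.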
|
||
|
*
|
||
|
*
|
||
|
* \section csdcf Cache size decrease control fields
|
||
|
*
|
||
|
* \par decr_mode
|
||
|
* \parblock
|
||
|
* Instance of the \c H5C_cache_decr_mode enumerated type whose
|
||
|
* value indicates how we determine whether the cache size should be
|
||
|
* decreased. At present there are four possibilities.
|
||
|
*
|
||
|
* \li \c H5C_decr__off: Don't attempt to decrease the size of the cache
|
||
|
* automatically.\n
|
||
|
* When this decrement mode is selected, the remaining fields
|
||
|
* in the cache size decrease section are ignored.
|
||
|
* \li \c H5C_decr__threshold: Attempt to decrease the size of the cache
|
||
|
* whenever the average hit rate over the last epoch rises
|
||
|
* above the value supplied in the \c upper_hr_threshold
|
||
|
* field.
|
||
|
* \li \c H5C_decr__age_out: At the end of each epoch, search the cache for
|
||
|
* entries that have not been accessed for at least the number
|
||
|
* of epochs specified in the epochs_before_eviction field, and
|
||
|
* evict these entries. Conceptually, the maximum cache size
|
||
|
* is then decreased to match the new actual cache size. However,
|
||
|
* this reduction may be modified by the \c min_size, the
|
||
|
* \c max_decrement, and/or the \c empty_reserve.
|
||
|
* \li \c H5C_decr__age_out_with_threshold: Same as age_out, but we only
|
||
|
* attempt to reduce the cache size when the hit rate observed
|
||
|
* over the last epoch exceeds the value provided in the
|
||
|
* \c upper_hr_threshold field.
|
||
|
*
|
||
|
* Note that you must set \c decr_mode to \c H5C_decr__off if you
|
||
|
* disable metadata cache entry evictions.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par upper_hr_threshold
|
||
|
* \parblock
|
||
|
* Upper hit rate threshold. The use of this field
|
||
|
* varies according to the current \c decr_mode :
|
||
|
*
|
||
|
* \li \c H5C_decr__off or \c H5C_decr__age_out: The value of this field is
|
||
|
* ignored.
|
||
|
*
|
||
|
* \li \c H5C_decr__threshold: If the hit rate exceeds this threshold in any
|
||
|
* epoch, attempt to decrement the cache size using the \c decrement multiplier.\n
|
||
|
* Note that cache size may not be decremented below \c min_size.\n
|
||
|
* Note also that if the \c upper_hr_threshold is 1.0, the cache size
|
||
|
* will never be reduced.
|
||
|
*
|
||
|
* \li \c H5C_decr__age_out_with_threshold: If the hit rate exceeds this
|
||
|
* threshold in any epoch, attempt to reduce the cache size
|
||
|
* by evicting entries that have not been accessed for more
|
||
|
* than the specified number of epochs.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par decrement
|
||
|
* \parblock
|
||
|
* This field is only used when the decr_mode is
|
||
|
* \c H5C_decr__threshold.
|
||
|
*
|
||
|
* The field is a double containing the multiplier used to derive the
|
||
|
* new cache size from the old if a cache size decrement is triggered.
|
||
|
* The decrement must be in the range 0.0 (in which case the cache will
|
||
|
* try to contract to its minimum size) to 1.0 (in which case the
|
||
|
* cache will never shrink).
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par apply_max_decrement
|
||
|
* Boolean flag used to determine whether decrements
|
||
|
* in cache size are to be limited by the \c max_decrement field.
|
||
|
*
|
||
|
* \par max_decrement
|
||
|
* Maximum number of bytes by which the cache size can be
|
||
|
* decreased in a single re-size. Note that decrements may also be
|
||
|
* restricted by the \c min_size of the cache, and (in age out modes) by
|
||
|
* the \c empty_reserve field.
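*
* For the \c H5C_decr__threshold mode, the combined effect of \c decrement,
* \c apply_max_decrement, \c max_decrement and \c min_size can be sketched
* as below. The function name is hypothetical, and \c decrement is assumed
* to lie in the range [0.0, 1.0] as required above.

```c
#include <stdbool.h>
#include <stddef.h>

// Computes the cache size after a decrease is triggered.
static size_t sketch_decreased_size(size_t cur_size, double decrement,
                                    size_t min_size,
                                    bool apply_max_decrement,
                                    size_t max_decrement)
{
    size_t new_size = (size_t)((double)cur_size * decrement);
    // Limit the step if max_decrement is enabled.
    if (apply_max_decrement && cur_size - new_size > max_decrement)
        new_size = cur_size - max_decrement;
    // Never shrink below min_size.
    if (new_size < min_size)
        new_size = min_size;
    return new_size;
}
```

* A \c decrement of 0.0 therefore yields \c min_size (subject to
* \c max_decrement), and a \c decrement of 1.0 leaves the size unchanged.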
|
||
|
*
|
||
|
* \par epochs_before_eviction
|
||
|
* \parblock
|
||
|
* Integer field used in \c H5C_decr__age_out and
|
||
|
* \c H5C_decr__age_out_with_threshold decrement modes.
|
||
|
*
|
||
|
* This field contains the number of epochs an entry must remain
|
||
|
* unaccessed before it is evicted in an attempt to reduce the
|
||
|
* cache size. If applicable, this field must lie in the range
|
||
|
* \Code{[1, H5C__MAX_EPOCH_MARKERS]}.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par apply_empty_reserve
|
||
|
* Boolean field controlling whether the empty_reserve
|
||
|
* field is to be used in computing the new cache size when the
|
||
|
* decr_mode is H5C_decr__age_out or H5C_decr__age_out_with_threshold.
|
||
|
*
|
||
|
* \par empty_reserve
|
||
|
* \parblock
|
||
|
* To avoid a constant ratcheting down of cache size by small
|
||
|
* amounts in the \c H5C_decr__age_out and \c H5C_decr__age_out_with_threshold
|
||
|
* modes, this field allows one to require that any cache size
|
||
|
* reductions leave the specified fraction of unused space in the cache.
|
||
|
*
|
||
|
* The value of this field must be in the range [0.0, 1.0]. I would
|
||
|
* expect typical values to be in the range of 0.01 to 0.1.
|
||
|
* \endparblock
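*
* One way to read the empty reserve is sketched below: after aged-out
* entries are evicted, the new maximum size is chosen so that the stated
* fraction of it remains unused. The function name and the exact formula
* are assumptions made for illustration, and \c max_decrement is left out
* for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

// used is the number of bytes still in the cache after eviction.
static size_t sketch_age_out_size(size_t used, size_t cur_max, size_t min_size,
                                  bool apply_empty_reserve,
                                  double empty_reserve)
{
    double target = (double)used;
    if (apply_empty_reserve) {
        // A reserve of 1.0 can never be satisfied, so do not shrink.
        if (empty_reserve >= 1.0)
            return cur_max;
        target = target / (1.0 - empty_reserve);
    }
    size_t new_size = (size_t)target;
    if (new_size > cur_max)
        new_size = cur_max; // a decrease never grows the cache
    if (new_size < min_size)
        new_size = min_size;
    return new_size;
}
```

* With 900 bytes in use and an empty reserve of 0.25, the new maximum would
* be 1200 bytes, leaving 300 bytes (one quarter) free.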
|
||
|
*
|
||
|
*
|
||
|
* \section pcf Parallel Configuration Fields
|
||
|
*
|
||
|
* In PHDF5, all operations that modify metadata must be executed collectively.
|
||
|
*
|
||
|
* We used to think that this was enough to ensure consistency across the
|
||
|
* metadata caches, but since we allow processes to read metadata individually,
|
||
|
* the order of dirty entries in the LRU list can vary across processes,
|
||
|
* which can result in inconsistencies between the caches.
|
||
|
*
|
||
|
* PHDF5 uses several strategies to prevent such inconsistencies in metadata,
|
||
|
* all of which use the fact that the same stream of dirty metadata is seen
|
||
|
* by all processes for purposes of synchronization. This is done by
|
||
|
* having each process count the number of bytes of dirty metadata generated,
|
||
|
* and then running a "sync point" whenever this count exceeds a user
|
||
|
* specified threshold (see \c dirty_bytes_threshold below).
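*
* The dirty byte counting can be pictured with the sketch below. The names
* are hypothetical, and resetting the count once a sync point runs is an
* assumption made for illustration. Because every process sees the same
* stream of dirty metadata, every process reaches the sync point at the
* same place in that stream.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t dirty_bytes_threshold; // must match on all processes
    size_t dirty_bytes;           // dirty bytes since the last sync point
} sketch_sync_counter;

// Adds newly dirtied bytes; returns true when a sync point should run.
static bool sketch_note_dirty(sketch_sync_counter *c, size_t nbytes)
{
    c->dirty_bytes += nbytes;
    if (c->dirty_bytes <= c->dirty_bytes_threshold)
        return false;
    c->dirty_bytes = 0; // assumed: the count restarts after the sync point
    return true;
}
```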
|
||
|
*
|
||
|
* The current metadata write strategy is indicated by the
|
||
|
* \c metadata_write_strategy field. The possible values of this field, along
|
||
|
* with the associated metadata write strategies are discussed below.
|
||
|
*
|
||
|
* \par dirty_bytes_threshold
|
||
|
* \parblock
|
||
|
* Threshold of dirty byte creation used to
|
||
|
* synchronize updates between caches. (See above for outline and
|
||
|
* motivation.)
|
||
|
*
|
||
|
* This value MUST be consistent across all processes accessing the
|
||
|
* file. This field is ignored unless HDF5 has been compiled for
|
||
|
* parallel.
|
||
|
* \endparblock
|
||
|
*
|
||
|
* \par metadata_write_strategy
|
||
|
* Integer field containing a code indicating the
|
||
|
* desired metadata write strategy. The valid values of this field
|
||
|
* are enumerated and discussed below:
|
||
|
*
|
||
|
* \li #H5AC_METADATA_WRITE_STRATEGY__PROCESS_0_ONLY\n
|
||
|
* When metadata_write_strategy is set to this value, only process
|
||
|
* zero is allowed to write dirty metadata to disk. All other
|
||
|
* processes must retain dirty metadata until they are informed at
|
||
|
* a sync point that the dirty metadata in question has been written
|
||
|
* to disk.\n
|
||
|
* When the sync point is reached (or when there is a user generated
|
||
|
* flush), process zero flushes sufficient entries to bring it into
|
||
|
* compliance with its min clean size (or flushes all dirty entries in
|
||
|
* the case of a user generated flush), broadcasts the list of
|
||
|
* entries just cleaned to all the other processes, and then exits
|
||
|
* the sync point.\n
|
||
|
* Upon receipt of the broadcast, the other processes mark the indicated
|
||
|
* entries as clean, and leave the sync point as well.
|
||
|
*
|
||
|
* \li #H5AC_METADATA_WRITE_STRATEGY__DISTRIBUTED\n
|
||
|
* In the distributed metadata write strategy, process zero still makes
|
||
|
* the decisions as to what entries should be flushed, but the actual
|
||
|
* flushes are distributed across the processes in the computation to
|
||
|
* the extent possible.\n
|
||
|
* In this strategy, when a sync point is triggered (either by dirty
|
||
|
* metadata creation or manual flush), all processes enter a barrier.\n
|
||
|
* On the other side of the barrier, process 0 constructs an ordered
|
||
|
* list of the entries to be flushed, and then broadcasts this list
|
||
|
* to the caches in all the processes.\n
|
||
|
* All processes then scan the list of entries to be flushed, flushing
|
||
|
* some, and marking the rest as clean. The algorithm for this purpose
|
||
|
* ensures that each entry in the list is flushed exactly once, and
|
||
|
* all are marked clean in each cache.\n
|
||
|
* Note that in the case of a flush of the cache, no message passing
|
||
|
* is necessary, as all processes have the same list of dirty entries,
|
||
|
* and all of these entries must be flushed. Thus in this case it is
|
||
|
* sufficient for each process to sort its list of dirty entries after
|
||
|
* leaving the initial barrier, and use this list as if it had been
|
||
|
* received from process zero.\n
|
||
|
* To avoid possible messages from the past/future, all caches must
|
||
|
* wait until all caches are done before leaving the sync point.
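*
* The property that each listed entry is flushed exactly once and marked
* clean everywhere else can be illustrated with a simple partition. The
* round-robin assignment below is an assumption chosen for brevity, not the
* algorithm the library uses, and all names are hypothetical.

```c
#include <stddef.h>

typedef enum { SKETCH_FLUSH, SKETCH_MARK_CLEAN } sketch_action;

// Decides what the process with the given rank does with the entry at
// position entry_index in the ordered list shared by all processes.
static sketch_action sketch_entry_action(size_t entry_index, int rank,
                                         int nprocs)
{
    if (entry_index % (size_t)nprocs == (size_t)rank)
        return SKETCH_FLUSH;
    return SKETCH_MARK_CLEAN;
}
```

* Since every process evaluates the same rule on the same ordered list, no
* further communication is needed to agree on which process writes which
* entry.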
|
||
|
*/
|