utils_data

Utilities for the Dataset module.

utils_data.bisectionConcat(index_start, index_end, dim_input, cache_label_path, data_name)

Concatenate the fragments of a dataset by a bisection method, with computation cost \(O(\log N)\). For example, if the directory Cache contains the sequence of datasets subdata0.npy, subdata1.npy, …, subdata99.npy, then data_name is subdata and cache_label_path is Cache.

Parameters
  • index_start (int) – The start index of the dataset sequence.

  • index_end (int) – The end index of the dataset sequence.

  • dim_input (int) – The dimension of the dataset.

  • cache_label_path (str) – The folder containing the sub-datasets.

  • data_name (str) – Common name prefix of the sequential .npy files (subdata in the example above).

Returns

data – The final dataset, concatenated from all the fragments.

Return type

numpy.ndarray
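
The bisection idea described above can be sketched as follows. This is a minimal illustration, not the module's own implementation: the helper name bisection_concat_sketch is hypothetical, the file pattern data_name + index + .npy is taken from the example above, index_end is assumed to be inclusive, concatenation is assumed to run along the first axis, and dim_input is left out because its exact role is not described here.

```python
import os
import numpy as np


def bisection_concat_sketch(index_start, index_end, cache_label_path, data_name):
    """Hypothetical sketch: join data_name{i}.npy for i in [index_start, index_end]."""
    # Base case: a single fragment is simply loaded from disk.
    if index_start == index_end:
        file_path = os.path.join(cache_label_path, f"{data_name}{index_start}.npy")
        return np.load(file_path)

    # Split the index range in half and concatenate each half recursively.
    index_mid = (index_start + index_end) // 2
    left = bisection_concat_sketch(index_start, index_mid, cache_label_path, data_name)
    right = bisection_concat_sketch(index_mid + 1, index_end, cache_label_path, data_name)

    # Assumption: fragments are stacked along the first (sample) axis.
    return np.concatenate((left, right), axis=0)
```

With the files from the example above, bisection_concat_sketch(0, 99, "Cache", "subdata") would return one array holding the rows of all 100 fragments in index order.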

utils_data.mpiSplitData(sample_num, cpu_id, cpu_size)

The dataset of size sample_num is split into cpu_size parts, and the indices of the cpu_id-th part are returned.

Parameters
  • sample_num (int) – The size of the given dataset.

  • cpu_id (int) – The ID of a CPU.

  • cpu_size (int) – The total number of CPUs.

Returns

Split_array_index – The index of the cpu_id-th part.

Return type

numpy.ndarray
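
One way to picture this kind of split is with numpy.array_split. The sketch below is an assumption about the behaviour, not the module's code: the helper name mpi_split_sketch is hypothetical, and the real function may distribute a remainder differently (for instance, by giving it to the last part).

```python
import numpy as np


def mpi_split_sketch(sample_num, cpu_id, cpu_size):
    """Hypothetical sketch: return the sample indices assigned to part cpu_id."""
    # All sample indices, 0 .. sample_num - 1.
    all_indices = np.arange(sample_num)

    # array_split tolerates uneven division: the leading parts get one extra item.
    parts = np.array_split(all_indices, cpu_size)
    return parts[cpu_id]
```

In an MPI program, cpu_id and cpu_size would typically be the rank and size of the communicator, so each process works only on its own slice of the samples.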

utils_data.allowConcatForMPI(batch_num, cache_path, data_name, time_out=72000)

Repeatedly check whether all the sub-datasets have been created. If so, return True.

Parameters
  • batch_num (int) – Total number of sub-datasets expected to be created.

  • cache_path (str) – The folder containing all the sub-datasets.

  • data_name (str) – Common name prefix of the sequential .npy files.

  • time_out (float, optional) – Maximum time to wait for the sub-datasets to be created, in seconds. Default 72000 (20 h).

Returns

flag – Whether all the sub-datasets have been created.

Return type

bool
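
The polling behaviour can be sketched as below. Several details are assumptions rather than documented behaviour: the helper name wait_for_subdatasets_sketch and the poll_interval parameter are hypothetical, the files are assumed to be named data_name + index + .npy with indices starting at 0, time_out is assumed to be in seconds, and returning False on timeout is a guess, since only the True case is described above.

```python
import os
import time


def wait_for_subdatasets_sketch(batch_num, cache_path, data_name,
                                time_out=72000, poll_interval=1.0):
    """Hypothetical sketch: poll until all expected .npy files exist or time runs out."""
    # Assumed naming scheme: <data_name>0.npy ... <data_name>{batch_num - 1}.npy
    expected_files = [
        os.path.join(cache_path, f"{data_name}{i}.npy") for i in range(batch_num)
    ]
    deadline = time.monotonic() + time_out

    while True:
        if all(os.path.exists(path) for path in expected_files):
            return True
        # Assumption: give up and report failure once the timeout has elapsed.
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A check like this lets one process wait for the others to finish writing their fragments before it starts concatenating them.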

utils_data.writeMech(file_name, mech_path)

Change the mechanism in the target file. Used for CanteraTools.py and SampleMethod.py.

Parameters
  • file_name (str) – The file name of target file.

  • mech_path (str) – Mechanism file path.

utils_data.setGlobalMech(mech_path)

Set the mechanism in SampleMethod.py and CanteraTools.py.

utils_data.mpiClearCache()

Use MPI parallelization to clear the cache folders, including CacheManifold*, CacheManifoldBatch*, CacheBatchData* and CacheLabel Data*.

utils_data.clearLog()

Clear the /log/ folder.