oneflow.one_embedding.make_cached_ssd_store_options
oneflow.one_embedding.make_cached_ssd_store_options(cache_budget_mb, persistent_path, capacity=None, size_factor=1, physical_block_size=512, host_cache_budget_mb=0)
Build the store_options param of MultiTableEmbedding for an SSD-backed store that uses GPU memory and host memory as cache. If cache_budget_mb > 0 and host_cache_budget_mb > 0, GPU and host memory are used together as a multi-level cache.
- Parameters
cache_budget_mb (int) – the budget, in MB, of GPU memory used as cache on each GPU.
persistent_path (str, list) – persistent storage path of the Embedding. It must be on a fast SSD because training performs frequent random disk access. If a str is passed, the current rank's Embedding is saved under path/rank_id-num_ranks. If a list is passed, its length must equal num_ranks, and each element is the path for the Embedding of the corresponding rank_id.
capacity (int) – total capacity of Embedding
size_factor (int, optional) – storage size factor relative to embedding_dim. For SGD, it should be 1 if momentum = 0 and 2 if momentum > 0. For Adam, it should be 3. Defaults to 1.
physical_block_size (int, optional) – physical block size; it should be the sector size. Defaults to 512.
host_cache_budget_mb (int, optional) – the budget, in MB, of host memory used as cache per rank. Defaults to 0.
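The size_factor rule and the single-path layout described above can be sketched in plain Python. This is only an illustration: pick_size_factor and rank_dir are hypothetical helpers, not part of OneFlow, and the directory string simply mirrors the path/rank_id-num_ranks wording in the persistent_path description.

```python
def pick_size_factor(optimizer: str, momentum: float = 0.0) -> int:
    """Hypothetical helper: map an optimizer choice to size_factor.

    Follows the rule stated for size_factor:
    SGD without momentum -> 1, SGD with momentum -> 2, Adam -> 3.
    """
    name = optimizer.lower()
    if name == "sgd":
        return 2 if momentum > 0 else 1
    if name == "adam":
        return 3
    # Other optimizers are not covered by the documented rule.
    raise ValueError(f"no documented size_factor for optimizer {optimizer!r}")


def rank_dir(path: str, rank_id: int, num_ranks: int) -> str:
    """Hypothetical helper: the per-rank directory implied by a str persistent_path."""
    return f"{path}/{rank_id}-{num_ranks}"


sgd_plain = pick_size_factor("sgd")
sgd_momentum = pick_size_factor("sgd", momentum=0.9)
adam = pick_size_factor("adam")
example_dir = rank_dir("/your_path_to_ssd", rank_id=0, num_ranks=4)
```

The resulting integer would be passed as the size_factor argument; optimizers other than SGD and Adam are rejected here because the parameter description does not cover them.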
- Returns
the store_options param of MultiTableEmbedding for an SSD store that uses GPU and host memory as cache
- Return type
For example:
>>> import oneflow as flow
>>> store_options = flow.one_embedding.make_cached_ssd_store_options(
>>>     cache_budget_mb=8192, persistent_path="/your_path_to_ssd", capacity=vocab_size,
>>> )
>>> # pass the store_options to the "store_options" param of flow.one_embedding.MultiTableEmbedding
>>> # ...
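A multi-level cache setup with one path per rank might be prepared as in the sketch below. It only assembles the keyword arguments, so it runs without OneFlow installed. build_store_kwargs, the rank_{r} directory names, and the budget values are assumptions chosen for illustration; only the keyword names come from the signature above.

```python
def build_store_kwargs(ssd_root, num_ranks, vocab_size,
                       gpu_cache_mb=8192, host_cache_mb=16384):
    """Hypothetical helper: assemble kwargs for make_cached_ssd_store_options."""
    # One path per rank; the list length must equal num_ranks.
    # The "rank_{r}" naming is an arbitrary choice for this sketch.
    paths = [f"{ssd_root}/rank_{r}" for r in range(num_ranks)]
    return {
        "cache_budget_mb": gpu_cache_mb,        # > 0: GPU memory used as cache
        "persistent_path": paths,
        "capacity": vocab_size,
        "size_factor": 3,                       # assumes Adam is the optimizer
        "host_cache_budget_mb": host_cache_mb,  # > 0: host memory as second level
    }


kwargs = build_store_kwargs("/your_path_to_ssd", num_ranks=4, vocab_size=1_000_000)

# With OneFlow installed, the kwargs would be passed on like this:
# import oneflow as flow
# store_options = flow.one_embedding.make_cached_ssd_store_options(**kwargs)
```

Keeping both cache_budget_mb and host_cache_budget_mb above zero is what selects the multi-level cache described at the top of this page.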