This repository was archived by the owner on Feb 24, 2026. It is now read-only.
[Dev] Enhance Operator Cache to support multi-thread environments #205
Merged
LeiWang1999 merged 14 commits into microsoft:main from LeiWang1999:tl_ops_dynamic on Oct 1, 2024
Fix ref to #204 #186
This pull request introduces several changes to improve thread safety, enhance scheduler functionality, and refactor code for better readability and maintainability. The most important changes include adding a lock to the OperatorCache class, modifying the ThreadPoolExecutor to use a variable number of workers, and introducing a new fine-grained matrix multiplication scheduler.

Thread Safety Enhancements:
bitblas/cache/operator.py: Added a cache_locker using threading.RLock to synchronize access to the cache in methods like add, get, clear, and save_into_database.

Scheduler Improvements:
bitblas/base/utils.py: Modified ThreadPoolExecutor to use a variable number of workers (max_workers) instead of a fixed number (4).
bitblas/ops/base_scheduler.py: Added a method get_hardware_aware_configs to raise a NotImplementedError for hardware-aware tuning.
bitblas/ops/general_matmul/tilelang/dense/matmul_simt.py: Introduced a new MatmulFineGrainSIMTScheduler class for fine-grained matrix multiplication scheduling.

Code Refactoring:
bitblas/ops/general_matmul/tilelang/dense/matmul_tensorcore.py: Renamed from matmul.py and added imports and methods for hardware-aware configurations.
bitblas/ops/operator.py: Refactored multiple methods for better readability, including apply_fast_tuning, hardware_aware_finetune, and _build_default_module.

Additional Changes:
bitblas/ops/general_matmul/tilelang/dense/__init__.py: Updated imports to include matmul_simt and matmul_tensorcore.
bitblas/ops/operator.py: Added import for tl_apply_and_build from bitblas.tl.tuner.
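
For readers unfamiliar with the pattern, the locking and worker-pool changes described above can be sketched roughly as follows. This is a minimal illustration, not the BitBLAS code: the class LockedCache, the method get_or_build, and the helper build_all are hypothetical names chosen for the example. Only threading.RLock, ThreadPoolExecutor, and the max_workers parameter come from the description.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedCache:
    """Hypothetical stand-in for an operator cache (not the BitBLAS class)."""

    def __init__(self):
        self._entries = {}
        # RLock is re-entrant: a method that already holds the lock can call
        # another locked method on the same thread without deadlocking.
        self._lock = threading.RLock()

    def add(self, key, value):
        with self._lock:
            self._entries[key] = value

    def get(self, key):
        with self._lock:
            return self._entries.get(key)

    def clear(self):
        with self._lock:
            self._entries.clear()

    def get_or_build(self, key, builder):
        # Assumed helper: holds the lock across lookup and build, so each
        # key is built at most once even under concurrent requests.
        with self._lock:
            value = self.get(key)  # re-acquires the RLock on the same thread
            if value is None:
                value = builder(key)
                self.add(key, value)
            return value


def build_all(cache, keys, max_workers=4):
    """Resolve keys through the cache using a pool of configurable size."""
    built = []  # records which keys actually triggered a build

    def builder(key):
        built.append(key)  # runs while the cache lock is held
        return f"compiled:{key}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda k: cache.get_or_build(k, builder), keys))
    return results, built


if __name__ == "__main__":
    cache = LockedCache()
    results, built = build_all(cache, ["matmul_a", "matmul_b"] * 16, max_workers=8)
    print(len(results), sorted(built))
```

A plain threading.Lock would deadlock in get_or_build, because get and add try to take the lock again on the same thread; that is one common reason to choose RLock. Holding the lock for the whole build keeps the example simple but serializes builds, so a real cache might release it during compilation.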