This repository was archived by the owner on Feb 24, 2026. It is now read-only.
When using TileLang, some layout transformations, such as swizzling or padding, are applied implicitly. This approach is efficient and powerful, but it can sometimes cause vectorization to crash.
On Volta, applying a swizzle rearranges the memory layout in groups of 4 elements instead of 8. The swizzle improves memory coalescing and data locality, but it also means that a vectorized access wider than 4 elements may no longer map to contiguous memory.
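To illustrate why an 8-wide vector access can break under a 4-element swizzle, here is a minimal Python sketch. It does not use TileLang's real layout functions: the XOR-based `swizzle`, the row width of 32, and the helper names are all assumptions made for illustration. It only shows the general effect of a swizzle granularity that is smaller than the vector width.

```python
def swizzle(row, col, group=4, row_width=32):
    """Hypothetical XOR swizzle (not TileLang's actual layout).

    Columns are split into chunks of `group` elements, and the chunk
    index is XOR-ed with the row index. Elements inside a chunk keep
    their relative order.
    """
    num_chunks = row_width // group
    chunk, within = divmod(col, group)
    swizzled_chunk = chunk ^ (row % num_chunks)
    return row * row_width + swizzled_chunk * group + within


def is_contiguous(addrs):
    """True if the addresses increase by exactly 1 at every step."""
    return all(b - a == 1 for a, b in zip(addrs, addrs[1:]))


def max_safe_vector_width(row, start_col, requested, group=4, row_width=32):
    """Halve the requested width until every sub-vector is contiguous."""
    width = requested
    while width > 1:
        pieces = range(start_col, start_col + requested, width)
        if all(
            is_contiguous(
                [swizzle(row, c, group, row_width) for c in range(s, s + width)]
            )
            for s in pieces
        ):
            return width
        width //= 2
    return 1


# Physical addresses of 8 logically consecutive elements in row 1.
addrs_group4 = [swizzle(1, c, group=4) for c in range(8)]
addrs_group8 = [swizzle(1, c, group=8) for c in range(8)]

print(is_contiguous(addrs_group4))            # 8-wide access under a 4-element swizzle
print(is_contiguous(addrs_group8))            # 8-wide access under an 8-element swizzle
print(max_safe_vector_width(1, 0, 8, group=4))
```

Under these assumptions, the two 4-element halves of the 8-wide access land in swapped chunks, so a single 8-wide vector load or store is not valid, while two 4-wide accesses are. A lowering pass that checks contiguity in this way could fall back to the narrower width instead of crashing.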
We should enhance the lower vectorize pass to automatically convert the vectorize stage into: