This repository was archived by the owner on Feb 24, 2026. It is now read-only.

[Dev][AMD] Implement conditional async load for AMD HIP Backend #250

Merged
LeiWang1999 merged 5 commits into microsoft:main from LeiWang1999:amd_fix_20241127 on Nov 28, 2024

Conversation

@LeiWang1999
Contributor

Usage:

  #pragma unroll
  for (int i_3 = 0; i_3 < 2; ++i_3) {
    tl::cp_async_gs_conditional<16>(
      // Destination address in dynamic shared memory
      buf_dyn_shmem+((((i_3 * 4096) + ((((int)threadIdx.x) >> 2) * 64)) + (((((int)threadIdx.x) & 3) ^ (((((int)threadIdx.x) >> 2) / 2) & 3)) * 16)) + 8192),
      // Source address in global memory
      data+(((((((((int)blockIdx.y) * 16384) + (((k_iter + 1) / 12) * 8192)) + (i_3 * 8192)) + ((((int)threadIdx.x) >> 2) * 128)) + (((k_iter + 1) % 12) * 32)) + ((((int)threadIdx.x) & 3) * 8)) - 8320),
      // Predicate guarding the load (bounds checks)
      ((((1 <= ((((((int)blockIdx.y) & 31) * 2) + ((k_iter + 1) / 12)) + i_3)) && (1 <= ((((int)threadIdx.x) >> 2) + (((k_iter + 1) % 12) >> 2)))) && (((((((int)blockIdx.y) & 31) * 2) + ((k_iter + 1) / 12)) + i_3) < 65)) && (((((int)threadIdx.x) >> 2) + (((k_iter + 1) % 12) >> 2)) < 65)));
  }
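
For readers unfamiliar with predicated loads, here is a rough host-side sketch of what a call like the one above is expected to do. It is not the code from this PR. The name `copy_conditional_model` is hypothetical, and zero-filling the destination when the predicate is false is an assumption about the intended behavior, not something stated in this conversation.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical host-side model of a predicated N-byte copy.
// Assumption: when `cond` is false the destination is zero-filled
// and the source pointer is never dereferenced. The real device
// helper in the PR may behave differently.
template <std::size_t N>
void copy_conditional_model(void* dst, const void* src, bool cond) {
  if (cond) {
    std::memcpy(dst, src, N);  // predicate holds: copy N bytes
  } else {
    std::memset(dst, 0, N);    // predicate fails: write zeros, skip src
  }
}
```

Under this model, an out-of-bounds source address (such as the `- 8320` offset in the usage above can produce) is harmless as long as the predicate is false for that thread.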

@LeiWang1999 LeiWang1999 merged commit 645ccd7 into microsoft:main Nov 28, 2024