This repository was archived by the owner on Feb 24, 2026. It is now read-only.
While running CI tests for GPTQModel, we found that with CUDA 12.5, BitBLAS generates broken compiled code via apache/tvm. No errors are raised, either at compile time or at runtime, but the cached GPU code for the model (in the .cached folder) produces nonsense output when using backend=BACKEND.BITBLAS, and the model fails our PPL sanity test. We copied over the .cache/generated files built with CUDA < 12.5 and everything works, which isolates the issue to CUDA 12.5 / apache/tvm.
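For readers unfamiliar with this kind of check, here is a minimal sketch of what a PPL sanity test can look like. It is not the GPTQModel CI test itself: the function names `perplexity` and `passes_ppl_sanity`, the `max_ppl` threshold, and the assumption that per-token natural-log probabilities have already been collected from the model are all hypothetical and chosen for illustration only.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) over per-token natural-log probabilities.

    Hypothetical helper: assumes the log-probabilities were already
    gathered from the model under test.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    try:
        return math.exp(mean_nll)
    except OverflowError:
        # Extremely unlikely tokens push the result past float range.
        return math.inf


def passes_ppl_sanity(token_log_probs, max_ppl):
    """True only if perplexity is finite and at most max_ppl.

    max_ppl is an assumed, model-specific threshold. A kernel that
    silently produces garbage typically yields a very large or
    non-finite perplexity, so both cases are rejected.
    """
    ppl = perplexity(token_log_probs)
    return math.isfinite(ppl) and ppl <= max_ppl
```

A check of this shape catches the failure mode described above, where nothing raises an error but the output is wrong: the generated kernels run, yet the resulting perplexity lands far outside the expected range.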