
ggml-vulkan: limit the number of command buffers to transfer data to 4 #19956

Closed
mklimenk wants to merge 1 commit into ggml-org:master from mklimenk:mklimenk/transfer_cmdlist_cleanup

Conversation

@mklimenk

This PR limits the number of command buffers so that they are reused, instead of a new command buffer being created for each megabyte of data to transfer.

I stumbled upon this limitation while trying to launch GPT-OSS-120B MXFP4 via the Vulkan backend, which failed during the model loading stage.

The proposed fix reuses command buffers instead of creating a new one for every transfer; the latter produces a large number of command buffers for bigger models and can hit GPU hardware limitations. The limit of 4 is taken from the number of staging buffers for async uploads, but can be adjusted. No performance or accuracy implications were observed.

@mklimenk mklimenk requested a review from 0cc4m as a code owner February 27, 2026 16:31
@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 27, 2026
@0cc4m
Collaborator

0cc4m commented Feb 27, 2026

We've been discussing possible solutions in #19420. This is one of them, yes.

@mklimenk
Author

@0cc4m I checked the PRs for duplicates, but didn't look into the issues. Thank you for looking into it; closing this as a duplicate.
If you need any additional info or help, feel free to ping me.

@mklimenk mklimenk closed this Feb 27, 2026