
ggml : add metal backend registry / device #9713


Merged — 17 commits merged into master on Oct 7, 2024

Conversation

ggerganov (Member)

Target: #9707

Adapt the Metal backend to the new registry and device interfaces.

@github-actions github-actions bot added ggml changes relating to the ggml tensor library for machine learning Apple Metal https://en.wikipedia.org/wiki/Metal_(API) labels Oct 2, 2024
@ggerganov ggerganov changed the title ggml-backend : add device and backend reg interfaces ggml : add metal backend registry / devic Oct 2, 2024
@ggerganov ggerganov changed the title ggml : add metal backend registry / devic ggml : add metal backend registry / device Oct 2, 2024
slaren (Member) commented Oct 2, 2024

ggml_backend_metal_buffer_type() also needs to be updated to set the device field.

slaren (Member) commented Oct 3, 2024

There have been a few minor changes to the interfaces:

  • The get_backend_reg function of the device interface has been removed; instead, a pointer is stored directly in ggml_backend_device: d0c4954
  • Some functions have been renamed: cfef355

@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch from 37de34c to a62ea59 Compare October 4, 2024 11:11
@github-actions github-actions bot added script Script related testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment Vulkan Issues specific to the Vulkan backend examples python python script changes devops improvements to build systems and github actions server SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Kompute https://github.com/KomputeProject/kompute/ labels Oct 4, 2024
@ggerganov ggerganov changed the base branch from sl/backend-registry-2 to master October 4, 2024 11:11
@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch 2 times, most recently from 058430f to ae56ec2 Compare October 4, 2024 12:15
@mmtmn mmtmn left a comment


lgtm

slaren (Member) commented Oct 5, 2024

This seems to be working now.

@slaren slaren force-pushed the sl/backend-registry-2-add-metal branch 2 times, most recently from 7e8d2a9 to 84c3b2a Compare October 5, 2024 22:47
@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch from 84c3b2a to 6dcb899 Compare October 6, 2024 10:16
@ggerganov ggerganov marked this pull request as ready for review October 6, 2024 10:16
@ggerganov ggerganov requested a review from slaren October 6, 2024 10:17
Review thread on the following diff context:

```objc
ggml_backend_t ggml_backend_reg_metal_init(const char * params, void * user_data) {

static const char * ggml_backend_metal_device_get_description(ggml_backend_dev_t dev) {
    return [[g_state.mtl_device name] UTF8String];
```
slaren (Member):

I don't think there is a guarantee that mtl_device is initialized here, it probably needs a call to ggml_backend_metal_get_device/ggml_backend_metal_free_device like in ggml_backend_metal_device_get_memory. However, I imagine that could cause issues with the lifetime of the string returned by MTLDevice, so it may be necessary to keep a copy of the string in the context instead.

ggerganov (Member Author):

Should be fixed now.

I also reworked the implementation to avoid accessing g_state when we can get the device context locally. Should be much cleaner now and easier to add multi-GPU support in the future if needed.

Comment on lines 496 to 502 (new version shown; the diff renames ctx->device to device):

```objc
#if TARGET_OS_OSX || (TARGET_OS_IOS && __clang_major__ >= 15)
    if (@available(macOS 10.12, iOS 16.0, *)) {
        GGML_LOG_INFO("%s: recommendedMaxWorkingSetSize = %8.2f MB\n", __func__, device.recommendedMaxWorkingSetSize / 1e6);
    }
#elif TARGET_OS_OSX
    if (device.maxTransferRate != 0) {
        GGML_LOG_INFO("%s: maxTransferRate = %8.2f MB/s\n", __func__, device.maxTransferRate / 1e6);
```
slaren (Member):

I don't think this #if/#elif is correct; the #elif branch can never be taken, so this will never be printed.

slaren (Member) commented Oct 7, 2024

> Should we put a deprecation notice on these API calls?

I think it may be too early for that; it's probably better to wait a bit until all the backends and the ggml examples are updated.

@ggerganov ggerganov merged commit d5ac8cf into master Oct 7, 2024
53 checks passed
@ggerganov ggerganov deleted the sl/backend-registry-2-add-metal branch October 7, 2024 15:27
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* ggml : add metal backend registry / device

ggml-ci

* metal : fix names [no ci]

* metal : global registry and device instances

ggml-ci

* cont : alternative initialization of global objects

ggml-ci

* llama : adapt to backend changes

ggml-ci

* fixes

* metal : fix indent

* metal : fix build when MTLGPUFamilyApple3 is not available

ggml-ci

* fix merge

* metal : avoid unnecessary singleton accesses

ggml-ci

* metal : minor fix [no ci]

* metal : g_state -> g_ggml_ctx_dev_main [no ci]

* metal : avoid reference of device context in the backend context

ggml-ci

* metal : minor [no ci]

* metal : fix maxTransferRate check

* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024