Error when retrieving `zesDeviceGetProperties` on Windows MTL iGPU device #713

Open
sgwhat opened this issue Mar 25, 2024 · 3 comments
Labels: in queue, L0 Sysman (Issue related to L0 Sysman)

Comments

@sgwhat

sgwhat commented Mar 25, 2024

Hi all, I encountered an issue where I cannot retrieve the device modelName using zesDeviceGetProperties on a Windows MTL iGPU (the same code works on Linux with an Arc A770).

Platform: Core Ultra5 iGPU (Arc Graphics)
OS: Windows 11
iGPU Driver: 31.0.101.5333

Normally, after initializing the GPU driver with zesInit, I can discover all driver instances, enumerate their devices, and read each device's properties (for example props.modelName).

For example, this code can run correctly on Linux Arc770, where I can get the following output:

discovered 1 Level-Zero drivers
discovered 1 Level-Zero devices
[0] oneAPI device name: Intel(R) Arc(TM) A770 Graphics
[0] oneAPI brand: unknown
[0] oneAPI vendor: Intel(R) Corporation
[0] oneAPI S/N: unknown
[0] oneAPI board number: unknown
discovered 1 Level-Zero memory modules

However, on the Windows MTL iGPU, props.modelName is empty (as are brandName and vendorName), and no Level-Zero memory modules are found.

discovered 1 Level-Zero drivers
discovered 1 Level-Zero devices
[0] oneAPI device name:
[0] oneAPI brand:
[0] oneAPI vendor:
[0] oneAPI S/N: unknown
[0] oneAPI board number: unknown
discovered 0 Level-Zero memory modules

For more details about my implementation, see my comment below.

@sgwhat
Author

sgwhat commented Mar 25, 2024

Below is my implementation for initializing the GPU driver:

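  // Initialize the GPU driver (the library path is elided in this report)
  resp->oh.handle = LoadLibrary("C:\\Windows\\System32\\...\\ze_loader.dll");
  
  // l[] is a NULL-terminated table of {symbol name, destination pointer}
  for (i = 0; l[i].s != NULL; i++) {
    *l[i].p = GetProcAddress(resp->oh.handle, l[i].s);
    if (!*l[i].p) {  // test the resolved address, not the table slot pointer
      char *msg = LOAD_ERR();
      UNLOAD_LIBRARY(resp->oh.handle);  // unload before clearing the handle
      resp->oh.handle = NULL;
      free(msg);
      resp->err = strdup(buf);
      return;
    }
  }

  // zesInit takes an init-flags argument; 0 means no flags
  ret = (*resp->oh.zesInit)(0);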
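
A note on the symbol-loading loop above: the value to test after each lookup is the resolved address (`*l[i].p`), since the table slot pointer `l[i].p` itself is never NULL. The following is only a minimal, self-contained sketch of that table-driven pattern, not the code used in this report. The names `symbol_entry_t`, `lookup_fn`, `resolve_symbols`, and `fake_lookup` are hypothetical, and `fake_lookup` stands in for `GetProcAddress` so the example does not depend on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per required symbol: its name and where to store its address. */
typedef struct {
    const char *name;
    void **slot;
} symbol_entry_t;

/* Lookup callback; on Windows this would wrap GetProcAddress (assumption). */
typedef void *(*lookup_fn)(void *lib, const char *name);

/* Resolves every entry of a NULL-terminated table.
 * Returns NULL on success, or the name of the first symbol that failed. */
static const char *resolve_symbols(void *lib, lookup_fn lookup,
                                   const symbol_entry_t *table) {
    assert(lookup != NULL && table != NULL);
    for (size_t i = 0; table[i].name != NULL; i++) {
        *table[i].slot = lookup(lib, table[i].name);
        if (*table[i].slot == NULL) {   /* check the resolved address */
            return table[i].name;
        }
    }
    return NULL;
}

/* Demonstration stand-in for a real loader: it only knows "zesInit". */
static void *fake_lookup(void *lib, const char *name) {
    static int known_symbol;
    (void)lib;
    return strcmp(name, "zesInit") == 0 ? (void *)&known_symbol : NULL;
}
```

Returning the name of the missing symbol lets the caller put it into the error message, which makes it easier to tell a loader that lacks one Sysman entry point from one that failed to load at all.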

Below is my implementation for discovering the GPU driver and reading the device properties:

  // Discover the gpu driver instance
  void oneapi_check_vram(oneapi_handle_t h, mem_info_t *resp) {
	  ze_result_t ret;
	  resp->err = NULL;
	  resp->igpu_index = -1;
	  uint64_t totalMem = 0;
	  uint64_t usedMem = 0;
	  const int buflen = 256;
	  char buf[buflen + 1];
	  int i, d, m;
	  uint32_t driversCount = 0;
	  ret = (*h.zesDriverGet)(&driversCount, NULL);
	  
	  zes_driver_handle_t *allDrivers =
	      malloc(driversCount * sizeof(zes_driver_handle_t));
	  (*h.zesDriverGet)(&driversCount, allDrivers);
	
	  resp->total = 0;
	  resp->free = 0;
	
	  for (d = 0; d < driversCount; d++) {
	    uint32_t deviceCount = 0;
	    ret = (*h.zesDeviceGet)(allDrivers[d], &deviceCount, NULL);
	
	    zes_device_handle_t *devices =
	        malloc(deviceCount * sizeof(zes_device_handle_t));
	    (*h.zesDeviceGet)(allDrivers[d], &deviceCount, devices);
	
	    for (i = 0; i < deviceCount; ++i) {
	      uint32_t globalDeviceIndex = resp->count;
	      resp->count++;
	
	      zes_device_ext_properties_t ext_props;
	      ext_props.stype = ZES_STRUCTURE_TYPE_DEVICE_EXT_PROPERTIES;
	      ext_props.pNext = NULL;
	
	      zes_device_properties_t props;
	      props.stype = ZES_STRUCTURE_TYPE_DEVICE_PROPERTIES;
	      props.pNext = &ext_props;
	
	      ret = (*h.zesDeviceGetProperties)(devices[i], &props);
	      if (ret != ZE_RESULT_SUCCESS) {
	        snprintf(buf, buflen, "unable to get device properties: %d", ret);
	        resp->err = strdup(buf);
	        free(allDrivers);
	        free(devices);
	        return;
	      }
	
	      if (h.verbose) {
	        LOG(h.verbose, "[%d] oneAPI device name: %s\n", globalDeviceIndex,
	            props.modelName);
	        LOG(h.verbose, "[%d] oneAPI brand: %s\n", globalDeviceIndex,
	            props.brandName);
	        LOG(h.verbose, "[%d] oneAPI vendor: %s\n", globalDeviceIndex,
	            props.vendorName);
	        LOG(h.verbose, "[%d] oneAPI S/N: %s\n", globalDeviceIndex,
	            props.serialNumber);
	        LOG(h.verbose, "[%d] oneAPI board number: %s\n", globalDeviceIndex,
	            props.boardNumber);
	      }
	      
	      uint32_t memCount = 0;
	      ret = (*h.zesDeviceEnumMemoryModules)(devices[i], &memCount, NULL);
	      LOG(h.verbose, "discovered %d Level-Zero memory modules\n", memCount);
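
One thing the last log line cannot show is whether `zesDeviceEnumMemoryModules` succeeded: `ret` is not checked before `memCount` is printed, so a failed call and a device that genuinely reports zero modules both log `discovered 0`. Below is a minimal sketch of the usual "query the count, then fill the array" pattern with both calls checked. It is written against stand-in types so it compiles without the Level Zero headers: `result_t`, `handle_t`, `enum_fn`, `enumerate_all`, and `fake_enum` are hypothetical names, `RESULT_SUCCESS` mirrors `ZE_RESULT_SUCCESS` (0), and `RESULT_OUT_OF_MEMORY` is an arbitrary value chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int result_t;                 /* stand-in for ze_result_t */
#define RESULT_SUCCESS 0              /* mirrors ZE_RESULT_SUCCESS */
#define RESULT_OUT_OF_MEMORY (-1)     /* hypothetical code for this sketch */
typedef void *handle_t;               /* stand-in for a Sysman handle type */

/* Same shape as the zes*Enum* calls: out == NULL means "only report count". */
typedef result_t (*enum_fn)(void *ctx, uint32_t *count, handle_t *out);

/* Runs the two-call pattern and reports failures instead of hiding them.
 * On success the caller owns *items and must free() it. */
static result_t enumerate_all(enum_fn fn, void *ctx,
                              handle_t **items, uint32_t *count) {
    assert(fn != NULL && items != NULL && count != NULL);
    *items = NULL;
    *count = 0;

    result_t ret = fn(ctx, count, NULL);          /* first call: count only */
    if (ret != RESULT_SUCCESS || *count == 0) {
        return ret;                               /* error, or truly empty */
    }

    handle_t *buf = malloc(*count * sizeof(*buf));
    if (buf == NULL) {
        *count = 0;
        return RESULT_OUT_OF_MEMORY;
    }

    ret = fn(ctx, count, buf);                    /* second call: fill */
    if (ret != RESULT_SUCCESS) {
        free(buf);
        *count = 0;
        return ret;
    }
    *items = buf;
    return RESULT_SUCCESS;
}

/* Demonstration stand-in: reports *ctx items, or fails when ctx is NULL. */
static result_t fake_enum(void *ctx, uint32_t *count, handle_t *out) {
    if (ctx == NULL) {
        return 1;                                 /* arbitrary failure code */
    }
    uint32_t n = *(uint32_t *)ctx;
    if (out == NULL) {
        *count = n;
        return RESULT_SUCCESS;
    }
    for (uint32_t i = 0; i < *count && i < n; i++) {
        out[i] = ctx;
    }
    return RESULT_SUCCESS;
}
```

With a wrapper of this kind, the caller can log the returned code next to the count, so "0 memory modules" and "enumeration failed" no longer look the same in the output.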

@eero-t

eero-t commented Mar 25, 2024

What's your MTL model number, and which version of compute-runtime is included with your Windows driver package (the latest compute-runtime versions start with 24.)?

PS. I'm a compute-runtime Linux user, not one of its developers, but grepping the model ID from the sources is trivial, and one can then check whether a given release tag contains that commit.

JablonskiMateusz added the "L0 Sysman" label (Issue related to L0 Sysman) on Mar 25, 2024
@saik-intel
Contributor

We will check internally whether we can support the model name on the MTL platform and come back to you.
