5.6.1. SVM Sharing Granularity: Coarse- and Fine-Grained Sharing

OpenCL maintains memory consistency in a coarse-grained fashion in regions of buffers. We call this coarse-grained sharing. Many platforms such as those with integrated CPU-GPU processors and ones using the SVM-related PCI-SIG IOMMU services can do better, and can support sharing at a granularity smaller than a buffer. We call this fine-grained sharing.

  • Coarse-grained sharing: Coarse-grain sharing may be used for memory and virtual pointer sharing between multiple devices as well as between the host and one or more devices. The shared memory region is a memory buffer allocated using clSVMAlloc. Memory consistency is guaranteed at synchronization points and the host can use calls to clEnqueueSVMMap and clEnqueueSVMUnmap or create a cl_mem buffer object using the SVM pointer and use OpenCL’s existing host API functions clEnqueueMapBuffer and clEnqueueUnmapMemObject to update regions of the buffer. What coarse-grain buffer SVM adds to OpenCL’s earlier buffer support are the ability to share virtual memory pointers and a guarantee that concurrent access to the same memory allocation from multiple kernels on a single device is valid. The coarse-grain buffer SVM provides a memory consistency model similar to the global memory consistency model described in sections 3.3.1 and 3.4.3 of the OpenCL 1.2 specification. This memory consistency applies to the regions of buffers being shared in a coarse-grained fashion. It is enforced at the synchronization points between commands enqueued to command-queues in a single context with the additional consideration that multiple kernels concurrently running on the same device may safely share the data.

  • Fine-grained sharing: Shared virtual memory where memory consistency is maintained at a granularity smaller than a buffer. How fine-grained SVM is used depends on whether the device supports SVM atomic operations.

    • If SVM atomic operations are supported, they provide memory consistency for loads and stores by the host and kernels executing on devices supporting SVM. This means that the host and devices can concurrently read and update the same memory. The consistency provided by SVM atomics is in addition to the consistency provided at synchronization points. There is no need for explicit calls to clEnqueueSVMMap and clEnqueueSVMUnmap or clEnqueueMapBuffer and clEnqueueUnmapMemObject on a cl_mem buffer object created using the SVM pointer.

    • If SVM atomic operations are not supported, the host and devices can concurrently read the same memory locations and can concurrently update non-overlapping memory regions, but attempts to update the same memory locations are undefined. Memory consistency is guaranteed at synchronization points without the need for explicit calls to clEnqueueSVMMap and clEnqueueSVMUnmap or clEnqueueMapBuffer and clEnqueueUnmapMemObject on a cl_mem buffer object created using the SVM pointer.

  • There are two kinds of fine-grain sharing support. Devices may support either fine-grain buffer sharing or fine-grain system sharing.

    • Fine-grain buffer sharing provides fine-grain SVM only within buffers and is an extension of coarse-grain sharing. To support fine-grain buffer sharing in an OpenCL context, all devices in the context must support CL_DEVICE_SVM_FINE_GRAIN_BUFFER.

    • Fine-grain system sharing enables fine-grain sharing of the host’s entire virtual memory, including memory regions allocated by the system malloc API. OpenCL buffer objects are unnecessary and programmers can pass pointers allocated using malloc to OpenCL kernels.

As an illustration of fine-grain SVM using SVM atomic operations to maintain memory consistency, consider the following example. The host and a set of devices can simultaneously access and update a shared work-queue data structure holding work-items to be done. The host can use atomic operations to insert new work-items into the queue at the same time as the devices use similar atomic operations to remove work-items for processing.
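
The work-queue pattern described above can be sketched in plain C11. This is only a host-side analogy, not OpenCL code: it uses <stdatomic.h> on ordinary memory, whereas a real fine-grain SVM version would place the structure in an allocation created with CL_MEM_SVM_FINE_GRAIN_BUFFER and CL_MEM_SVM_ATOMICS and use the kernel language's atomics on the device side. The names work_queue, wq_init, wq_push, wq_pop and WQ_CAPACITY are hypothetical, and a single producer is assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define WQ_CAPACITY 64u  /* hypothetical capacity; must be a power of two */

typedef struct {
    atomic_int  items[WQ_CAPACITY]; /* work-item payloads */
    atomic_uint head;               /* next slot a consumer will claim */
    atomic_uint tail;               /* next slot the producer will fill */
} work_queue;

static void wq_init(work_queue *q) {
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Producer side (the "host" role). Assumes exactly one producer. */
static bool wq_push(work_queue *q, int item) {
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == WQ_CAPACITY) {
        return false; /* queue is full */
    }
    atomic_store_explicit(&q->items[tail % WQ_CAPACITY], item,
                          memory_order_relaxed);
    /* Release: the payload becomes visible before the new tail value. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer side (the "device" role). Any number of consumers may call it. */
static bool wq_pop(work_queue *q, int *out) {
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    for (;;) {
        unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
        if (head == tail) {
            return false; /* queue is empty */
        }
        int item = atomic_load_explicit(&q->items[head % WQ_CAPACITY],
                                        memory_order_relaxed);
        /* Claim the slot; on failure, head is reloaded and we retry. */
        if (atomic_compare_exchange_weak_explicit(&q->head, &head, head + 1u,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
            *out = item;
            return true;
        }
    }
}
```

The producer publishes a slot with a release store and consumers claim slots with a compare-and-swap, so inserts and removals can proceed concurrently without map or unmap calls, which is the property the paragraph above attributes to SVM atomics.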

It is the programmer’s responsibility to ensure that no host code or executing kernels attempt to access a shared memory region after that memory is freed. We require the SVM implementation to work with either 32- or 64- bit host applications subject to the following requirement: the address space size must be the same for the host and all OpenCL devices in the context.

To allocate a shared virtual memory buffer (referred to as a SVM buffer) that can be shared by the host and all devices in an OpenCL context that support shared virtual memory, call the function

// Provided by CL_VERSION_2_0
void* clSVMAlloc(
    cl_context context,
    cl_svm_mem_flags flags,
    size_t size,
    cl_uint alignment);

clSVMAlloc is missing before version 2.0.

  • context is a valid OpenCL context used to create the SVM buffer.

  • flags is a bit-field that is used to specify allocation and usage information. The SVM Memory Flags table describes the possible values for flags.

  • size is the size in bytes of the SVM buffer to be allocated.

  • alignment is the minimum alignment in bytes that is required for the newly created buffer’s memory region. It must be a power of two up to the size of the largest data type supported by the OpenCL device. For the full profile, the largest data type is long16. For the embedded profile, it is long16 if the device supports 64-bit integers; otherwise it is int16. If alignment is 0, a default alignment will be used that is equal to the size of the largest data type supported by the OpenCL implementation.
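
The alignment rule can be checked before calling clSVMAlloc. The following is a minimal sketch, assuming the caller already knows the size in bytes of the largest supported data type (128 for long16, 64 for int16). The helper names sk_is_power_of_two and sk_resolve_alignment are hypothetical and are not part of OpenCL, and an implementation may reject an alignment for other reasons as well.

```c
#include <stdbool.h>

/* Assumption: long16 is 16 x 8 bytes and int16 is 16 x 4 bytes. */
#define SK_SIZEOF_LONG16 128u
#define SK_SIZEOF_INT16   64u

static bool sk_is_power_of_two(unsigned x) {
    return x != 0u && (x & (x - 1u)) == 0u;
}

/*
 * Returns the alignment that would be in effect for the request,
 * or 0 if the request breaks the rules described above.
 */
static unsigned sk_resolve_alignment(unsigned alignment,
                                     unsigned largest_type_size) {
    if (alignment == 0u) {
        return largest_type_size; /* default alignment */
    }
    if (!sk_is_power_of_two(alignment)) {
        return 0u;
    }
    if (alignment > largest_type_size) {
        return 0u;
    }
    return alignment;
}
```

For example, a request of 0 on a full-profile device resolves to 128 under these assumptions, while 48 is rejected because it is not a power of two.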

Table 40. List of supported SVM memory flag values

SVM Memory Flags and their descriptions:

  • CL_MEM_READ_WRITE: This flag specifies that the SVM buffer will be read and written by a kernel. This is the default.

  • CL_MEM_WRITE_ONLY: This flag specifies that the SVM buffer will be written but not read by a kernel. Reading from a SVM buffer created with CL_MEM_WRITE_ONLY inside a kernel is undefined. CL_MEM_READ_WRITE and CL_MEM_WRITE_ONLY are mutually exclusive.

  • CL_MEM_READ_ONLY: This flag specifies that the SVM buffer object is a read-only memory object when used inside a kernel. Writing to a SVM buffer created with CL_MEM_READ_ONLY inside a kernel is undefined. CL_MEM_READ_WRITE or CL_MEM_WRITE_ONLY and CL_MEM_READ_ONLY are mutually exclusive.

  • CL_MEM_SVM_FINE_GRAIN_BUFFER (missing before version 2.0): This specifies that the application wants the OpenCL implementation to do a fine-grained allocation.

  • CL_MEM_SVM_ATOMICS (missing before version 2.0): This flag is valid only if CL_MEM_SVM_FINE_GRAIN_BUFFER is specified in flags. It is used to indicate that SVM atomic operations can control visibility of memory accesses in this SVM buffer.

If CL_MEM_SVM_FINE_GRAIN_BUFFER is not specified, the buffer can be created as a coarse-grained SVM allocation. Similarly, if CL_MEM_SVM_ATOMICS is not specified, the buffer can be created without support for SVM atomic operations (refer to the OpenCL kernel language specifications).

Calling clSVMAlloc does not itself provide consistency for the shared memory region. When the host cannot use the SVM atomic operations, it must rely on OpenCL’s guaranteed memory consistency at synchronization points.

For SVM to be used efficiently, the host and any devices sharing a buffer containing virtual memory pointers should have the same endianness. If the context passed to clSVMAlloc has devices with mixed endianness and the OpenCL implementation is unable to implement SVM because of that mixed endianness, clSVMAlloc will fail and return NULL.

Although SVM is generally not supported for image objects, clCreateImage and clCreateImageWithProperties may create an image from a buffer (a 1D image from a buffer or a 2D image from a buffer) if the buffer specified in its image description parameter is a SVM buffer. Such images have a linear memory representation so their memory can be shared using SVM. However, fine-grained sharing and atomics are not supported for image reads and writes in a kernel.

clSVMAlloc returns a valid non-NULL shared virtual memory address if the SVM buffer is successfully allocated. Otherwise, like malloc, it returns a NULL pointer value. clSVMAlloc will fail if:

  • context is not a valid context, or no devices in context support SVM.

  • flags does not contain CL_MEM_SVM_FINE_GRAIN_BUFFER but does contain CL_MEM_SVM_ATOMICS.

  • Values specified in flags do not follow rules described for supported values in the SVM Memory Flags table.

  • CL_MEM_SVM_FINE_GRAIN_BUFFER or CL_MEM_SVM_ATOMICS is specified in flags and these are not supported by at least one device in context.

  • The values specified in flags are not valid, i.e. do not match those defined in the SVM Memory Flags table.

  • size is 0 or greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE value for any device in context.

  • alignment is not a power of two or the OpenCL implementation cannot support the specified alignment for at least one device in context.

  • There was a failure to allocate resources.
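
Because clSVMAlloc reports every failure as a NULL return, it can help to pre-check the flag combination on the host. The sketch below covers only the flag rules stated in the table and the list above (unknown bits, mutually exclusive access flags, and CL_MEM_SVM_ATOMICS without CL_MEM_SVM_FINE_GRAIN_BUFFER). The SK_ constants are local stand-ins that are assumed to mirror the bit positions in the Khronos cl.h header; real code should include that header and use the CL_ names. sk_svm_flags_valid is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; assumed to match the corresponding CL_MEM_* bits. */
#define SK_MEM_READ_WRITE            (UINT64_C(1) << 0)
#define SK_MEM_WRITE_ONLY            (UINT64_C(1) << 1)
#define SK_MEM_READ_ONLY             (UINT64_C(1) << 2)
#define SK_MEM_SVM_FINE_GRAIN_BUFFER (UINT64_C(1) << 10)
#define SK_MEM_SVM_ATOMICS           (UINT64_C(1) << 11)

static bool sk_svm_flags_valid(uint64_t flags) {
    const uint64_t access_mask =
        SK_MEM_READ_WRITE | SK_MEM_WRITE_ONLY | SK_MEM_READ_ONLY;
    const uint64_t known_mask =
        access_mask | SK_MEM_SVM_FINE_GRAIN_BUFFER | SK_MEM_SVM_ATOMICS;

    /* Bits that are not listed in the SVM Memory Flags table. */
    if ((flags & ~known_mask) != 0) {
        return false;
    }
    /* At most one access flag; none at all means the read-write default. */
    uint64_t access = flags & access_mask;
    if ((access & (access - 1)) != 0) {
        return false;
    }
    /* Atomics are only meaningful for fine-grain buffer allocations. */
    if ((flags & SK_MEM_SVM_ATOMICS) != 0 &&
        (flags & SK_MEM_SVM_FINE_GRAIN_BUFFER) == 0) {
        return false;
    }
    return true;
}
```

This check says nothing about device support: a combination that passes here can still make clSVMAlloc return NULL if a device in the context lacks fine-grain or atomics capability.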

To free a shared virtual memory buffer allocated using clSVMAlloc, call the function

// Provided by CL_VERSION_2_0
void clSVMFree(
    cl_context context,
    void* svm_pointer);

clSVMFree is missing before version 2.0.

  • context is a valid OpenCL context used to create the SVM buffer. If no devices in context support SVM, no action occurs.

  • svm_pointer must be the value returned by a call to clSVMAlloc. If a NULL pointer is passed in svm_pointer, no action occurs.

Note that clSVMFree does not wait for previously enqueued commands that may be using svm_pointer to finish before freeing svm_pointer. It is the responsibility of the application to make sure that enqueued commands that use svm_pointer have finished before freeing svm_pointer. This can be done by using a blocking operation such as clFinish, clWaitForEvents, or clEnqueueReadBuffer, or by registering a callback with the events associated with enqueued commands and freeing svm_pointer when the last enqueued command has finished.

The behavior of using svm_pointer after it has been freed is undefined. In addition, if a buffer object is created using clCreateBuffer or clCreateBufferWithProperties with svm_pointer, the buffer object must first be released before the svm_pointer is freed.

The clEnqueueSVMFree API can also be used to enqueue a callback to free the shared virtual memory buffer allocated using clSVMAlloc or a shared system memory pointer.

To enqueue a command to free the shared virtual memory allocated using clSVMAlloc or a shared system memory pointer, call the function

// Provided by CL_VERSION_2_0
cl_int clEnqueueSVMFree(
    cl_command_queue command_queue,
    cl_uint num_svm_pointers,
    void* svm_pointers[],
    void (CL_CALLBACK* pfn_free_func)(cl_command_queue queue, cl_uint num_svm_pointers, void* svm_pointers[], void* user_data),
    void* user_data,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMFree is missing before version 2.0.

  • command_queue is a valid host command-queue.

  • svm_pointers and num_svm_pointers specify shared virtual memory pointers to be freed. Each pointer in svm_pointers that was allocated using clSVMAlloc must have been allocated from the same context from which command_queue was created. The memory associated with svm_pointers can be reused or freed after the function returns.

  • pfn_free_func specifies the callback function to be called to free the SVM pointers. This callback function may be called asynchronously by the OpenCL implementation. It is the application’s responsibility to ensure that the callback function is thread-safe. pfn_free_func takes four arguments: queue which is the command-queue in which clEnqueueSVMFree was enqueued, the count and list of SVM pointers to free and user_data which is a pointer to user specified data. If pfn_free_func is NULL, all pointers specified in svm_pointers must be allocated using clSVMAlloc and the OpenCL implementation will free these SVM pointers. pfn_free_func must be a valid callback function if any SVM pointer to be freed is a shared system memory pointer i.e. not allocated using clSVMAlloc. If pfn_free_func is a valid callback function, the OpenCL implementation will call pfn_free_func to free all the SVM pointers specified in svm_pointers.

  • user_data will be passed as the user_data argument when pfn_free_func is called. user_data can be NULL.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before clEnqueueSVMFree can be executed. If event_wait_list is NULL, then clEnqueueSVMFree does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.
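
The shape of a pfn_free_func callback for shared system memory pointers can be sketched without the OpenCL headers. In the sketch below, sk_command_queue is a hypothetical stand-in for cl_command_queue, unsigned stands in for cl_uint, and the CL_CALLBACK calling-convention macro is omitted; a real callback must use the exact signature shown in the prototype above. The sketch assumes every pointer came from malloc and that user_data points to an atomic counter, which keeps the bookkeeping thread-safe if the implementation calls the function asynchronously.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for cl_command_queue. */
typedef struct sk_command_queue_s *sk_command_queue;

/*
 * Same parameter order as pfn_free_func: queue, pointer count,
 * pointer list, user data. Frees malloc-allocated pointers and
 * counts them in the atomic counter passed through user_data.
 */
static void sk_free_system_pointers(sk_command_queue queue,
                                    unsigned num_svm_pointers,
                                    void *svm_pointers[],
                                    void *user_data) {
    (void)queue; /* unused in this sketch */
    atomic_uint *freed_count = user_data;
    for (unsigned i = 0; i < num_svm_pointers; ++i) {
        free(svm_pointers[i]);
        if (freed_count != NULL) {
            atomic_fetch_add(freed_count, 1u);
        }
    }
}
```

When a callback like this is supplied, it is responsible for all pointers in the list, so a list that mixes clSVMAlloc allocations with malloc allocations would need the callback to distinguish them and call clSVMFree for the former.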

clEnqueueSVMFree returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_VALUE if num_svm_pointers is 0 and svm_pointers is non-NULL, or if svm_pointers is NULL and num_svm_pointers is not 0.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
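
Two of the errors above are pure count-versus-pointer consistency checks, and the same pairing rule for event_wait_list recurs in every enqueue function in this section. A minimal sketch of that rule, using a hypothetical sk_status enum in place of the real cl_int error codes and a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CL_* status codes named above. */
typedef enum {
    SK_OK,
    SK_INVALID_VALUE,
    SK_INVALID_EVENT_WAIT_LIST
} sk_status;

/*
 * A count and its list must agree: either both are "empty"
 * (0 and NULL) or both are "present" (non-zero and non-NULL).
 */
static sk_status sk_check_svm_free_args(unsigned num_svm_pointers,
                                        void *const *svm_pointers,
                                        unsigned num_events_in_wait_list,
                                        const void *event_wait_list) {
    if ((num_svm_pointers == 0u) != (svm_pointers == NULL)) {
        return SK_INVALID_VALUE;
    }
    if ((num_events_in_wait_list == 0u) != (event_wait_list == NULL)) {
        return SK_INVALID_EVENT_WAIT_LIST;
    }
    return SK_OK;
}
```

The sketch does not validate the event objects themselves or the command-queue, which only the OpenCL runtime can do.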

To enqueue a command to do a memcpy operation, call the function

// Provided by CL_VERSION_2_0
cl_int clEnqueueSVMMemcpy(
    cl_command_queue command_queue,
    cl_bool blocking_copy,
    void* dst_ptr,
    const void* src_ptr,
    size_t size,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMMemcpy is missing before version 2.0.

  • command_queue refers to the host command-queue in which the read / write command will be queued. If either dst_ptr or src_ptr is allocated using clSVMAlloc, then the OpenCL context it was allocated against must match that of command_queue.

  • blocking_copy indicates if the copy operation is blocking or non-blocking.

  • If blocking_copy is CL_TRUE i.e. the copy command is blocking, clEnqueueSVMMemcpy does not return until the buffer data has been copied into memory pointed to by dst_ptr.

  • size is the size in bytes of data being copied.

  • dst_ptr is the pointer to a host or SVM memory allocation where data is copied to.

  • src_ptr is the pointer to a host or SVM memory allocation where data is copied from.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before this particular command can be executed. If event_wait_list is NULL, then this particular command does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this read / write command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.

If blocking_copy is CL_FALSE i.e. the copy command is non-blocking, clEnqueueSVMMemcpy queues a non-blocking copy command and returns. The contents of the buffer that dst_ptr points to cannot be used until the copy command has completed. The event argument returns an event object which can be used to query the execution status of the copy command. When the copy command has completed, the contents of the buffer that dst_ptr points to can be used by the application.

If the memory allocation(s) containing dst_ptr and/or src_ptr are allocated using clSVMAlloc and either is not allocated from the same context from which command_queue was created, the behavior is undefined.

clEnqueueSVMMemcpy returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST if the copy operation is blocking and the execution status of any of the events in event_wait_list is a negative integer value.

  • CL_INVALID_VALUE if dst_ptr or src_ptr is NULL.

  • CL_MEM_COPY_OVERLAP if the values specified for dst_ptr, src_ptr and size result in an overlapping copy.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
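
The CL_MEM_COPY_OVERLAP condition can be tested on the host before enqueuing the copy. A minimal sketch, assuming both pointers live in one flat address space so that their integer values can be compared; sk_ranges_overlap is a hypothetical helper, not an OpenCL function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when [dst, dst + size) and [src, src + size)
 * share at least one byte.
 */
static bool sk_ranges_overlap(const void *dst, const void *src, size_t size) {
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    if (size == 0) {
        return false; /* empty ranges cannot overlap */
    }
    /* The ranges overlap when the start addresses are closer than size. */
    return d < s ? (s - d) < size : (d - s) < size;
}
```

Two ranges that merely touch, such as the first and second halves of one allocation, are not reported as overlapping.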

To enqueue a command to fill a region in memory with a pattern of a given pattern size, call the function

// Provided by CL_VERSION_2_0
cl_int clEnqueueSVMMemFill(
    cl_command_queue command_queue,
    void* svm_ptr,
    const void* pattern,
    size_t pattern_size,
    size_t size,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMMemFill is missing before version 2.0.

  • command_queue refers to the host command-queue in which the fill command will be queued. The OpenCL context associated with command_queue and SVM pointer referred to by svm_ptr must be the same.

  • svm_ptr is a pointer to a memory region that will be filled with pattern. It must be aligned to pattern_size bytes. If svm_ptr is allocated using clSVMAlloc then it must be allocated from the same context from which command_queue was created. Otherwise the behavior is undefined.

  • pattern is a pointer to the data pattern of size pattern_size in bytes. pattern will be used to fill a region of size bytes starting at svm_ptr. The data pattern must be a scalar or vector integer or floating-point data type supported by OpenCL as described in Shared Application Scalar Data Types and Supported Application Vector Data Types. For example, if the region pointed to by svm_ptr is to be filled with a pattern of float4 values, then pattern will be a pointer to a cl_float4 value and pattern_size will be sizeof(cl_float4). The maximum value of pattern_size is the size of the largest integer or floating-point vector data type supported by the OpenCL device. The memory associated with pattern can be reused or freed after the function returns.

  • size is the size in bytes of region being filled starting with svm_ptr and must be a multiple of pattern_size.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before this particular command can be executed. If event_wait_list is NULL, then this particular command does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.

clEnqueueSVMMemFill returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.

  • CL_INVALID_VALUE if svm_ptr is NULL.

  • CL_INVALID_VALUE if svm_ptr is not aligned to pattern_size bytes.

  • CL_INVALID_VALUE if pattern is NULL or if pattern_size is 0 or if pattern_size is not one of {1, 2, 4, 8, 16, 32, 64, 128}.

  • CL_INVALID_VALUE if size is not a multiple of pattern_size.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
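The CL_INVALID_VALUE conditions above can be checked on the host before a fill is enqueued. The following is a minimal sketch in plain C, assuming a caller that wants to catch these mistakes early; the name svm_fill_args_valid is hypothetical and not part of OpenCL, and an implementation may perform further checks that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-flight check, not an OpenCL function. Returns true when
 * the arguments avoid the CL_INVALID_VALUE conditions listed for
 * clEnqueueSVMMemFill. */
bool svm_fill_args_valid(const void *svm_ptr, const void *pattern,
                         size_t pattern_size, size_t size)
{
    if (svm_ptr == NULL || pattern == NULL)
        return false;

    /* pattern_size must be one of {1, 2, 4, 8, 16, 32, 64, 128},
     * i.e. a power of two no larger than 128. */
    if (pattern_size == 0 || pattern_size > 128 ||
        (pattern_size & (pattern_size - 1)) != 0)
        return false;

    /* svm_ptr must be aligned to pattern_size bytes. */
    if ((uintptr_t)svm_ptr % pattern_size != 0)
        return false;

    /* size must be a whole number of patterns. */
    return size % pattern_size == 0;
}
```

A caller would typically run such a check in debug builds only and still inspect the cl_int returned by clEnqueueSVMMemFill.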

To enqueue a command that will allow the host to update a region of an SVM buffer, call the function

// Provided by CL_VERSION_2_0
cl_int clEnqueueSVMMap(
    cl_command_queue command_queue,
    cl_bool blocking_map,
    cl_map_flags flags,
    void* svm_ptr,
    size_t size,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMMap is not available before version 2.0.

  • command_queue must be a valid host command-queue.

  • blocking_map indicates whether the map operation is blocking or non-blocking.

  • flags (also written map_flags) is a bit-field and is described in the Memory Map Flags table.

  • svm_ptr and size are a pointer to a memory region and its size in bytes that will be updated by the host. If svm_ptr is allocated using clSVMAlloc then it must be allocated from the same context from which command_queue was created. Otherwise the behavior is undefined.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before this particular command can be executed. If event_wait_list is NULL, then this particular command does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.
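The two rules governing event_wait_list, num_events_in_wait_list and event can be expressed as small host-side checks. This is a sketch under stated assumptions: both helper names are hypothetical, and events are modelled as opaque pointers so that the example compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of OpenCL: returns true when the wait-list
 * pointer and its count agree, i.e. NULL with 0 or non-NULL with > 0. */
bool wait_list_args_consistent(unsigned num_events_in_wait_list,
                               const void *event_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}

/* Hypothetical helper: returns true when the output event pointer does not
 * refer to an element of the wait-list array. */
bool event_not_in_wait_list(void *const *event_wait_list,
                            unsigned num_events_in_wait_list,
                            void *const *event)
{
    if (event_wait_list == NULL || event == NULL)
        return true;
    for (unsigned i = 0; i < num_events_in_wait_list; ++i) {
        if (&event_wait_list[i] == event)
            return false;   /* event aliases slot i of the wait list */
    }
    return true;
}
```

The first check corresponds to the pointer and count cases of CL_INVALID_EVENT_WAIT_LIST; whether the listed events are themselves valid can only be determined by the OpenCL implementation.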

If blocking_map is CL_TRUE, clEnqueueSVMMap does not return until the application can access the contents of the SVM region specified by svm_ptr and size on the host.

If blocking_map is CL_FALSE, i.e. the map operation is non-blocking, the region specified by svm_ptr and size cannot be used until the map command has completed. The event argument returns an event object which can be used to query the execution status of the map command. When the map command is completed, the application can access the contents of the region specified by svm_ptr and size.

Note that since we are enqueuing a command with an SVM buffer, the region is already mapped in the host address space.

clEnqueueSVMMap returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.

  • CL_INVALID_VALUE if svm_ptr is NULL.

  • CL_INVALID_VALUE if size is 0 or if values specified in flags (map_flags) are not valid.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST if the map operation is blocking and the execution status of any of the events in event_wait_list is a negative integer value.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
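The CL_INVALID_VALUE conditions for clEnqueueSVMMap can be sketched as a host-side check. Several things here are assumptions rather than statements from this section: the SK_ constants are hypothetical local mirrors of the cl_map_flags bits, the rule that CL_MAP_WRITE_INVALIDATE_REGION cannot be combined with CL_MAP_READ or CL_MAP_WRITE is taken from the Memory Map Flags table, and svm_map_args_valid is not an OpenCL function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local mirrors of CL_MAP_READ, CL_MAP_WRITE and
 * CL_MAP_WRITE_INVALIDATE_REGION, so the sketch needs no OpenCL headers. */
#define SK_MAP_READ                    (UINT64_C(1) << 0)
#define SK_MAP_WRITE                   (UINT64_C(1) << 1)
#define SK_MAP_WRITE_INVALIDATE_REGION (UINT64_C(1) << 2)

/* Hypothetical pre-flight check for the map arguments. */
bool svm_map_args_valid(const void *svm_ptr, size_t size, uint64_t flags)
{
    const uint64_t known =
        SK_MAP_READ | SK_MAP_WRITE | SK_MAP_WRITE_INVALIDATE_REGION;

    if (svm_ptr == NULL || size == 0)
        return false;

    /* Reject bits that are not map flags at all. */
    if ((flags & ~known) != 0)
        return false;

    /* Assumption from the Memory Map Flags table: write-invalidate is
     * mutually exclusive with read and write. */
    if ((flags & SK_MAP_WRITE_INVALIDATE_REGION) != 0 &&
        (flags & (SK_MAP_READ | SK_MAP_WRITE)) != 0)
        return false;

    /* A flags value of 0 is not rejected here; this section does not say
     * whether it is valid. */
    return true;
}
```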

To enqueue a command to indicate that the host has completed updating the region given by svm_ptr, which was specified in a previous call to clEnqueueSVMMap, call the function

// Provided by CL_VERSION_2_0
cl_int clEnqueueSVMUnmap(
    cl_command_queue command_queue,
    void* svm_ptr,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMUnmap is not available before version 2.0.

  • command_queue must be a valid host command-queue.

  • svm_ptr is a pointer that was specified in a previous call to clEnqueueSVMMap. If svm_ptr is allocated using clSVMAlloc then it must be allocated from the same context from which command_queue was created. Otherwise the behavior is undefined.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before clEnqueueSVMUnmap can be executed. If event_wait_list is NULL, then clEnqueueSVMUnmap does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.

clEnqueueSVMMap and clEnqueueSVMUnmap act as synchronization points for the region of the SVM buffer specified in these calls.

clEnqueueSVMUnmap returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.

  • CL_INVALID_VALUE if svm_ptr is NULL.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or if event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.

If a coarse-grained SVM buffer is currently mapped for writing, the application must ensure that the SVM buffer is unmapped before any enqueued kernels or commands that read from or write to this SVM buffer or any of its associated cl_mem buffer objects begin execution; otherwise the behavior is undefined.

If a coarse-grained SVM buffer is currently mapped for reading, the application must ensure that the SVM buffer is unmapped before any enqueued kernels or commands that write to this memory object or any of its associated cl_mem buffer objects begin execution; otherwise the behavior is undefined.

An SVM buffer is considered mapped if there are one or more active mappings for the SVM buffer, irrespective of whether the mapped regions span the entire SVM buffer.
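One way an application might keep track of these rules for a coarse-grained SVM buffer is with a pair of counters. The sketch below is illustrative only: the svm_map_state type and every function name are hypothetical, and it assumes the application calls the bookkeeping functions itself whenever a map or unmap command completes.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one coarse-grained SVM buffer. Each counter
 * tracks active mappings, regardless of which sub-region they cover. */
typedef struct {
    unsigned read_maps;   /* active mappings made for reading */
    unsigned write_maps;  /* active mappings made for writing */
} svm_map_state;

/* Record a completed map command. */
void svm_note_map(svm_map_state *s, bool for_write)
{
    if (for_write)
        s->write_maps++;
    else
        s->read_maps++;
}

/* Record a completed unmap of a mapping made with the same for_write value. */
void svm_note_unmap(svm_map_state *s, bool for_write)
{
    unsigned *n = for_write ? &s->write_maps : &s->read_maps;
    if (*n > 0)
        (*n)--;
}

/* The buffer counts as mapped while at least one mapping is active. */
bool svm_is_mapped(const svm_map_state *s)
{
    return s->read_maps + s->write_maps > 0;
}

/* May a kernel or command that touches the buffer begin execution?
 * Mapped for writing: no device reads or writes.
 * Mapped for reading only: the rules above forbid device writes. */
bool svm_device_access_allowed(const svm_map_state *s, bool device_writes)
{
    if (s->write_maps > 0)
        return false;
    if (s->read_maps > 0)
        return !device_writes;
    return true;
}
```

An application could assert svm_device_access_allowed before each kernel enqueue in debug builds; the OpenCL runtime itself does not report violations, since the behavior is simply undefined.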

The above requirements do not apply to fine-grained SVM buffers (fine-grained buffers allocated using clSVMAlloc or fine-grained system allocations).

To enqueue a command that indicates which device a set of ranges of SVM allocations should be associated with, call the function

// Provided by CL_VERSION_2_1
cl_int clEnqueueSVMMigrateMem(
    cl_command_queue command_queue,
    cl_uint num_svm_pointers,
    const void** svm_pointers,
    const size_t* sizes,
    cl_mem_migration_flags flags,
    cl_uint num_events_in_wait_list,
    const cl_event* event_wait_list,
    cl_event* event);

clEnqueueSVMMigrateMem is not available before version 2.1.

  • command_queue is a valid host command-queue. The specified set of allocation ranges will be migrated to the OpenCL device associated with command_queue.

  • num_svm_pointers is the number of pointers in the specified svm_pointers array, and the number of sizes in the sizes array, if sizes is not NULL.

  • svm_pointers is a pointer to an array of pointers. Each pointer in this array must be within an allocation produced by a call to clSVMAlloc.

  • sizes is an array of sizes. The pair svm_pointers[i] and sizes[i] together define the starting address and number of bytes in a range to be migrated. sizes may be NULL, indicating that every allocation containing any svm_pointers[i] is to be migrated. Also, if sizes[i] is zero, then the entire allocation containing svm_pointers[i] is migrated.

  • flags is a bit-field that is used to specify migration options. The Memory Migration Flags table describes the possible values for flags.

  • event_wait_list and num_events_in_wait_list specify events that need to complete before this particular command can be executed. If event_wait_list is NULL, then this particular command does not wait on any event to complete. If event_wait_list is NULL, num_events_in_wait_list must be 0. If event_wait_list is not NULL, the list of events pointed to by event_wait_list must be valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as synchronization points. The context associated with events in event_wait_list and command_queue must be the same. The memory associated with event_wait_list can be reused or freed after the function returns.

  • event returns an event object that identifies this command and can be used to query or queue a wait for this command to complete. If event is NULL or the enqueue is unsuccessful, no event will be created and therefore it will not be possible to query the status of this command or to wait for this command to complete. If event_wait_list and event are not NULL, event must not refer to an element of the event_wait_list array.

Once the event returned by clEnqueueSVMMigrateMem has become CL_COMPLETE, the ranges specified by svm_pointers and sizes have been successfully migrated to the device associated with command_queue.

The user is responsible for managing the event dependencies associated with this command in order to avoid overlapping access to SVM allocations. Improperly specified event dependencies passed to clEnqueueSVMMigrateMem could result in undefined results.

clEnqueueSVMMigrateMem returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_COMMAND_QUEUE if command_queue is not a valid host command-queue.

  • CL_INVALID_OPERATION if the device associated with command_queue does not support SVM.

  • CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.

  • CL_INVALID_VALUE if num_svm_pointers is zero or svm_pointers is NULL.

  • CL_INVALID_VALUE if sizes[i] is non-zero and the range [svm_pointers[i], svm_pointers[i]+sizes[i]) is not contained within an existing clSVMAlloc allocation.

  • CL_INVALID_EVENT_WAIT_LIST if event_wait_list is NULL and num_events_in_wait_list > 0, or if event_wait_list is not NULL and num_events_in_wait_list is 0, or if event objects in event_wait_list are not valid events.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
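The range rule for clEnqueueSVMMigrateMem, that each half-open range [svm_pointers[i], svm_pointers[i]+sizes[i]) must lie within a clSVMAlloc allocation, can be sketched as follows. This is a simplified illustration that assumes all pointers are checked against a single allocation known to the caller; the svm_allocation type and both function names are hypothetical and not part of OpenCL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one clSVMAlloc allocation known to the caller. */
typedef struct {
    const void *base;  /* pointer returned by the allocation call */
    size_t      size;  /* size in bytes that was requested */
} svm_allocation;

/* Returns true when [ptr, ptr + len) lies inside alloc. A len of zero means
 * the whole allocation containing ptr, so only ptr itself is checked. */
bool svm_range_in_allocation(const svm_allocation *alloc,
                             const void *ptr, size_t len)
{
    uintptr_t base = (uintptr_t)alloc->base;
    uintptr_t p = (uintptr_t)ptr;

    if (p < base || p - base >= alloc->size)
        return false;                       /* ptr is outside the allocation */
    if (len == 0)
        return true;
    /* Cannot overflow because p - base < alloc->size. */
    return len <= alloc->size - (p - base);
}

/* Checks every (svm_pointers[i], sizes[i]) pair; sizes may be NULL. */
bool svm_migrate_ranges_valid(const svm_allocation *alloc,
                              unsigned num_svm_pointers,
                              const void *const *svm_pointers,
                              const size_t *sizes)
{
    if (num_svm_pointers == 0 || svm_pointers == NULL)
        return false;
    for (unsigned i = 0; i < num_svm_pointers; ++i) {
        size_t len = (sizes != NULL) ? sizes[i] : 0;
        if (!svm_range_in_allocation(alloc, svm_pointers[i], len))
            return false;
    }
    return true;
}
```

A real application with several SVM allocations would need to look up the allocation containing each pointer first; that lookup is left out here.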
