Element-wise matrix multiplication for Apple Metal (Hadamard product)

Is it possible to perform a Hadamard product using Apple's Metal Performance Shaders? I see that an ordinary matrix multiplication can be performed using this, but I am specifically looking for an element-wise multiplication, or a clever way to construct one. (For example, is it possible to convert an MPSMatrix into an MPSVector and then perform the product using vectors?)

UPDATE: I appreciate the recommendation to use shaders! I'm working on an implementation and it looks promising! I'll post the solution once I have something working.

mrplants 07.01.2019

Why not write your own shader? - Hamid Yusifli 07.01.2019

MPSMatrix is backed by an MTLBuffer, so you could write a tiny kernel to do the multiplication. The kernel function would take three buffer parameters (two in, one out). You might need to be careful with indexing if your row stride doesn't match your number of columns, but this is pretty trivial (and embarrassingly parallel). - warrenm 08.01.2019
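
The indexing caveat in the comment above can be illustrated with a small CPU reference in Swift. This is only a sketch, not the commenter's code: the function name `hadamardStrided` and the `rowStride` parameter are hypothetical, and `rowStride` is assumed to be the matrix's row length in elements including padding (for an MPSMatrix, presumably the descriptor's `rowBytes` divided by the element size). A GPU kernel would compute the same index per thread.

```swift
// CPU reference for a strided, row-major Hadamard product.
// `rowStride` is the number of Float elements between the starts of
// consecutive rows (hypothetical parameter; it may exceed `columns`).
func hadamardStrided(_ a: [Float], _ b: [Float],
                     rows: Int, columns: Int, rowStride: Int) -> [Float] {
    precondition(rowStride >= columns, "stride must cover a full row")
    precondition(a.count >= rows * rowStride && b.count >= rows * rowStride,
                 "inputs must hold rows * rowStride elements")
    var c = [Float](repeating: 0, count: rows * rowStride)
    for row in 0..<rows {
        for col in 0..<columns {
            // Index by stride, not by column count, so padding is skipped.
            let i = row * rowStride + col
            c[i] = a[i] * b[i]
        }
    }
    return c
}

// Two 2x3 matrices, each row padded to a stride of 4 elements.
let paddedA: [Float] = [1, 2, 3, 0,
                        4, 5, 6, 0]
let paddedB: [Float] = [10, 10, 10, 0,
                        2, 2, 2, 0]
let paddedC = hadamardStrided(paddedA, paddedB, rows: 2, columns: 3, rowStride: 4)
print(paddedC)
```

When the stride equals the column count, the index reduces to a flat element index, which is the case the kernels in the answers below rely on.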

There's also MPSImageMultiply. - Matthijs Hollemans 08.01.2019

Answers (2)

Score: 2

Alright, answering my own question here based on the commenters' recommendations. Try writing your own shader!

Here's the shader code:

#include <metal_stdlib>
using namespace metal;

/*
 hadamardProduct:
 Perform an element-wise multiplication (hadamard product) of the two input matrices A and B, store the result in C
 */
kernel void hadamardProductKernel(
    texture_buffer<float, access::read> A [[texture(0)]],
    texture_buffer<float, access::read> B [[texture(1)]],
    texture_buffer<float, access::write> C [[texture(2)]],
    uint gid [[thread_position_in_grid]]) {
    // C[i,j] = A[i,j] * B[i,j]
    C.write(A.read(gid) * B.read(gid), gid);
}

And the Swift code that runs the shader on two 4x4 matrices:

import Foundation
import Metal
import MetalKit

guard
    let gpu = MTLCreateSystemDefaultDevice(),
    let commandQueue = gpu.makeCommandQueue(),
    let commandBuffer = commandQueue.makeCommandBuffer(),
    let defaultLibrary = gpu.makeDefaultLibrary(),
    let kernelFunction = defaultLibrary.makeFunction(name: "hadamardProductKernel")
else {exit(1)}

// Create the matrices to multiply (as row-major matrices)
var A:[Float] = [2,0,0,0,
                 0,2,0,0,
                 0,0,2,0,
                 0,0,0,2]

var B:[Float] = [1,0,0,0,
                 0,2,0,0,
                 0,0,3,0,
                 0,0,0,4]

// Each matrix lives in a 16-element texture buffer of 32-bit floats.
let A_buffer = gpu.makeTexture(descriptor: MTLTextureDescriptor.textureBufferDescriptor(
    with: .r32Float,
    width: 16,
    resourceOptions: .storageModeManaged,
    usage: .shaderRead))
let B_buffer = gpu.makeTexture(descriptor: MTLTextureDescriptor.textureBufferDescriptor(
    with: .r32Float,
    width: 16,
    resourceOptions: .storageModeManaged,
    usage: .shaderRead))
let C_buffer = gpu.makeTexture(descriptor: MTLTextureDescriptor.textureBufferDescriptor(
    with: .r32Float,
    width: 16,
    resourceOptions: .storageModeManaged,
    usage: .shaderWrite))

// Upload the inputs (16 floats * 4 bytes = 64 bytes per row).
// Passing the arrays directly avoids creating a dangling UnsafeRawPointer.
A_buffer?.replace(region: MTLRegionMake1D(0, 16),
                  mipmapLevel: 0,
                  withBytes: A,
                  bytesPerRow: 64)
B_buffer?.replace(region: MTLRegionMake1D(0, 16),
                  mipmapLevel: 0,
                  withBytes: B,
                  bytesPerRow: 64)

let computePipelineState = try gpu.makeComputePipelineState(function: kernelFunction)
let computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.setComputePipelineState(computePipelineState)
computeEncoder?.setTexture(A_buffer, index: 0)
computeEncoder?.setTexture(B_buffer, index: 1)
computeEncoder?.setTexture(C_buffer, index: 2)
let threadGroupSize = MTLSize(width: 16, height: 1, depth: 1)
let threadGroupCount = MTLSize(width: 1, height: 1, depth: 1)
computeEncoder?.dispatchThreadgroups(threadGroupCount, threadsPerThreadgroup: threadGroupSize)
computeEncoder?.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

print("done")
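
The program above only prints "done" and never inspects the result. As a sanity check, here is a minimal CPU sketch of what the kernel should produce for these two matrices, with a plain loop standing in for the 16 GPU threads. The names `lhs`, `rhs`, `expected`, and `diagonal` are illustrative only and are not part of the answer's code.

```swift
// Same 4x4 row-major inputs as in the answer, flattened to 16 elements.
let lhs: [Float] = [2, 0, 0, 0,
                    0, 2, 0, 0,
                    0, 0, 2, 0,
                    0, 0, 0, 2]
let rhs: [Float] = [1, 0, 0, 0,
                    0, 2, 0, 0,
                    0, 0, 3, 0,
                    0, 0, 0, 4]

// One loop iteration per "thread": C[gid] = A[gid] * B[gid].
var expected = [Float](repeating: 0, count: lhs.count)
for gid in 0..<lhs.count {
    expected[gid] = lhs[gid] * rhs[gid]
}

// In a 4x4 row-major matrix the diagonal sits at indices 0, 5, 10, 15.
let diagonal = stride(from: 0, to: 16, by: 5).map { expected[$0] }
print(diagonal) // [2.0, 4.0, 6.0, 8.0]
```

Every off-diagonal entry is zero, so comparing the GPU output against `expected` element by element is enough to confirm that the kernel ran correctly.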

I'd appreciate any comments linking to resources for learning more about this.

mrplants 09.01.2019

Score: 0

Another option is to use MTLBuffers (in my example I store the result in the first input buffer):

#include <metal_stdlib>
using namespace metal;

kernel void hadamardProductKernel(
    device float *a [[ buffer(0) ]],
    const device float *b [[ buffer(1) ]],
    uint id [[ thread_position_in_grid ]]
)
{
    a[id] = a[id] * b[id];
}

Here is the Objective-C code that performs the Hadamard product on two float32 arrays (a->data and b->data):

id<MTLLibrary> library = [device newDefaultLibrary];
id<MTLFunction> function = [library newFunctionWithName:@"hadamardProductKernel"];
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLComputePipelineState> computePipelineState = [device newComputePipelineStateWithFunction:function error:NULL];
id<MTLComputeCommandEncoder> computeCommandEncoder = [commandBuffer computeCommandEncoder];
[computeCommandEncoder setComputePipelineState:computePipelineState];
id<MTLBuffer> buffer_a = (__bridge id<MTLBuffer>)(a->data);
[computeCommandEncoder setBuffer:buffer_a offset:0 atIndex:0];
id<MTLBuffer> buffer_b = (__bridge id<MTLBuffer>)(b->data);
[computeCommandEncoder setBuffer:buffer_b offset:0 atIndex:1];
// One thread per threadgroup: the grid holds one threadgroup per element.
MTLSize threadgroupCount = MTLSizeMake(<<ELEMENTS COUNT HERE>>, 1, 1);
MTLSize threadsPerThreadgroup = MTLSizeMake(1, 1, 1);
[computeCommandEncoder dispatchThreadgroups:threadgroupCount threadsPerThreadgroup:threadsPerThreadgroup];
[computeCommandEncoder endEncoding];
[commandBuffer commit];
[commandBuffer waitUntilCompleted];
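
The dispatch above launches one thread per threadgroup, which is simple but leaves most of each threadgroup idle. A common alternative is to use wider threadgroups and round the threadgroup count up. The Swift sketch below shows only that arithmetic; the helper name `threadgroupCount(forElements:threadsPerThreadgroup:)` is hypothetical, and the width of 256 is an arbitrary assumption (in practice it would be chosen from the pipeline state's limits). With this scheme the kernel would also need a guard such as `if (id >= count) return;`, with the element count passed in as an extra argument, because the last threadgroup can contain surplus threads.

```swift
// Number of threadgroups needed to cover `count` elements when each
// threadgroup runs `width` threads (ceiling division).
func threadgroupCount(forElements count: Int, threadsPerThreadgroup width: Int) -> Int {
    precondition(count >= 0 && width > 0, "count must be >= 0 and width > 0")
    return (count + width - 1) / width
}

let elementCount = 1000   // illustrative size, not from the answer
let width = 256           // assumed threadgroup width
let groups = threadgroupCount(forElements: elementCount, threadsPerThreadgroup: width)
let surplusThreads = groups * width - elementCount

// The surplus threads are the ones the kernel's bounds check must skip.
print(groups, surplusThreads) // 4 24
```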

Kyrylo Polezhaiev 24.07.2020
