
Unpredictable return time for currentDrawable

I’m writing a view to plot real-time data in Metal. I’m drawing the samples using point primitives, and I’m triple buffering both the vertices and the uniform data. The issue I’m having is that the time it takes for a call to currentDrawable to return seems to be unpredictable. It’s almost as if sometimes there are no drawables ready, and I have to wait a whole frame for one to become available. Usually currentDrawable returns in ~0.07 ms (which is about what I would expect), but other times it takes a full 1/60 s. This blocks the whole main thread, which is, to say the least, not very desirable.

I’m seeing this issue on an iPhone 6S Plus and an iPad Air. I have not yet seen this behavior on the Mac (I have a 2016 MBP with an AMD 460 GPU). My guess is that this somehow has to do with the fact that the GPUs in iOS devices are TBDR-based. I don’t think I’m bandwidth constrained, because I get the exact same behavior no matter how many or how few samples I’m drawing.

To illustrate the issue I wrote a minimal example that draws a static sine wave. It is simplified: normally I would memcpy the samples into the current vertexBuffer each frame, just like I do with the uniforms, which is why I’m triple buffering the vertex data as well as the uniforms. It’s still enough to illustrate the problem though. Just set this view as your base view in a storyboard, and run. On some runs it works just fine. Other times currentDrawable starts out with a return time of 16.67 ms, then after a few seconds jumps to 0.07 ms, then after a while goes back to 16.67 ms. For some reason it seems to jump from 16.67 ms to 0.07 ms if you rotate the device.

MTKView Subclass

import MetalKit

let N = 500

class MetalGraph: MTKView {
    typealias Vertex = Int32

    struct Uniforms {
        var offset: UInt32
        var numSamples: UInt32
    }

    // Data
    var uniforms = Uniforms(offset: 0, numSamples: UInt32(N))

    // Buffers
    var vertexBuffers  = [MTLBuffer]()
    var uniformBuffers = [MTLBuffer]()
    var inflightBufferSemaphore = DispatchSemaphore(value: 3)
    var inflightBufferIndex = 0

    // Metal State
    var commandQueue: MTLCommandQueue!
    var pipeline: MTLRenderPipelineState!


    // Setup

    override func awakeFromNib() {
        super.awakeFromNib()

        device = MTLCreateSystemDefaultDevice()
        commandQueue = device?.makeCommandQueue()
        colorPixelFormat = .bgra8Unorm

        setupPipeline()
        setupBuffers()
    }

    func setupPipeline() {
        let library = device?.newDefaultLibrary()

        let descriptor = MTLRenderPipelineDescriptor()
        descriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
        descriptor.vertexFunction   = library?.makeFunction(name: "vertexFunction")
        descriptor.fragmentFunction = library?.makeFunction(name: "fragmentFunction")

        pipeline = try! device?.makeRenderPipelineState(descriptor: descriptor)
    }

    func setupBuffers() {
        // Produces a dummy sine wave with N samples, 2 periods, with a range of [0, 1000]
        let vertices: [Vertex] = (0..<N).map {
            let periods = 2.0
            let scaled = Double($0) / (Double(N)-1) * periods * 2 * .pi
            let value = (sin(scaled) + 1) * 500 // Transform from range [-1, 1] to [0, 1000]
            return Vertex(value)
        }

        let vertexBytes  = MemoryLayout<Vertex>.size * vertices.count
        let uniformBytes = MemoryLayout<Uniforms>.size

        for _ in 0..<3 {
            vertexBuffers .append(device!.makeBuffer(bytes: vertices,  length: vertexBytes))
            uniformBuffers.append(device!.makeBuffer(bytes: &uniforms, length: uniformBytes))
        }
    }



    // Drawing

    func updateUniformBuffers() {
        uniforms.offset = (uniforms.offset + 1) % UInt32(N)

        memcpy(
            uniformBuffers[inflightBufferIndex].contents(),
            &uniforms,
            MemoryLayout<Uniforms>.size
        )
    }

    override func draw(_ rect: CGRect) {
        _ = inflightBufferSemaphore.wait(timeout: .distantFuture)

        updateUniformBuffers()

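        let start = CACurrentMediaTime()
        guard let drawable = currentDrawable else {
            // No command buffer will signal for this frame, so release the slot here
            inflightBufferSemaphore.signal()
            return
        }
        print(String(format: "Grab Drawable: %.3f ms", (CACurrentMediaTime() - start) * 1000))

        guard let passDescriptor = currentRenderPassDescriptor else {
            // Same as above: give the in-flight slot back before bailing out
            inflightBufferSemaphore.signal()
            return
        }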

        passDescriptor.colorAttachments[0].loadAction = .clear
        passDescriptor.colorAttachments[0].storeAction = .store
        passDescriptor.colorAttachments[0].clearColor = MTLClearColorMake(0.2, 0.2, 0.2, 1)

        let commandBuffer = commandQueue.makeCommandBuffer()

        let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor)
        encoder.setRenderPipelineState(pipeline)
        encoder.setVertexBuffer(vertexBuffers[inflightBufferIndex],  offset: 0, at: 0)
        encoder.setVertexBuffer(uniformBuffers[inflightBufferIndex], offset: 0, at: 1)
        encoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: N)
        encoder.endEncoding()

        commandBuffer.addCompletedHandler { _ in
            self.inflightBufferSemaphore.signal()
        }
        commandBuffer.present(drawable)
        commandBuffer.commit()

        inflightBufferIndex = (inflightBufferIndex + 1) % 3
    }
}

Shaders

#include <metal_stdlib>
using namespace metal;

struct VertexIn {
    int32_t value;
};

struct VertexOut {
    float4 pos [[position]];
    float pointSize [[point_size]];
};

struct Uniforms {
    uint32_t offset;
    uint32_t numSamples;
};

vertex VertexOut vertexFunction(device   VertexIn *vertices [[buffer(0)]],
                                constant Uniforms *uniforms [[buffer(1)]],
                                uint vid [[vertex_id]])
{
    // I'm using the vertex index to evenly spread the
    // samples out in the x direction
    float xIndex = float((vid + (uniforms->numSamples - uniforms->offset)) % uniforms->numSamples);
    float x = (float(xIndex) / float(uniforms->numSamples - 1)) * 2.0f - 1.0f;

    // Transforming the values from the range [0, 1000] to [-1, 1]
    float y = (float)vertices[vid].value / 500.0f - 1.0f ;

    VertexOut vOut;
    vOut.pos = {x, y, 1, 1};
    vOut.pointSize = 3;

    return vOut;
}

fragment half4 fragmentFunction() {
    return half4(1, 1, 1, 1);
}

Possibly related to this: In all the examples I’ve seen, inflightBufferIndex is incremented inside the commandBuffer’s completed handler, just before the semaphore is signaled (which makes sense to me). When I have that line there I get a weird jittering effect, almost as if the framebuffers are being displayed out of order. Moving this line to the bottom of the draw function fixes the issue, although it doesn’t make a lot of sense to me. I’m not sure if this is related to currentDrawable’s return time being so unpredictable, but I have a feeling these two issues are emerging from the same underlying problem.

Any help would be very much appreciated!

vegather asked Feb 05 '17 01:02


2 Answers

[T]he time it takes for a call to currentDrawable to return seems to be unpredictable. It’s almost as if sometimes there are no drawables ready, and I have to wait a whole frame for one to become available.

Um, yes. This is explicitly documented. From the Metal Programming Guide:

Important: There are only a small set of drawable resources, so a long frame rendering time could temporarily exhaust those resources and cause a nextDrawable method call to block its CPU thread until the method is completed. To avoid expensive CPU stalls, perform all per-frame operations that do not need a drawable resource before calling the nextDrawable method of a CAMetalLayer object.

From the docs for CAMetalLayer.nextDrawable():

Calling this method blocks the current CPU thread until a new drawable object is available. There are only a small set of drawable resources, so a long GPU frame time could temporarily exhaust those resources, forcing this call to block until GPU rendering is complete. For best results, schedule your nextDrawable() call as late as possible relative to other per-frame CPU work.

Beyond that, there's something weird about your code. You are requesting the currentDrawable, but you're not doing anything with it. The currentRenderPassDescriptor is automatically configured to use the texture of the currentDrawable. So, what happens if you simply don't request the currentDrawable yourself?
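
As a rough illustration of the "as late as possible" advice quoted above, here is a minimal sketch of a draw callback that finishes its CPU-only work before it touches the drawable. It assumes the current Swift names of the Metal and MetalKit API, which differ slightly from the Swift 3 names used in the question. The class name LateDrawableRenderer and the placeholder comments are hypothetical, and the sketch relies on the assumption that reading currentRenderPassDescriptor is what makes MTKView fetch a drawable.

```swift
import MetalKit

// Hypothetical renderer for illustration; only the Metal/MetalKit calls are real API.
final class LateDrawableRenderer: NSObject, MTKViewDelegate {
    private let commandQueue: MTLCommandQueue
    private let pipeline: MTLRenderPipelineState
    private let inflightSemaphore = DispatchSemaphore(value: 3)

    init(commandQueue: MTLCommandQueue, pipeline: MTLRenderPipelineState) {
        self.commandQueue = commandQueue
        self.pipeline = pipeline
        super.init()
    }

    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

    func draw(in view: MTKView) {
        // Wait for one of the three in-flight buffer slots.
        inflightSemaphore.wait()
        let semaphore = inflightSemaphore

        // 1. CPU-only per-frame work (uniform and vertex updates) goes here,
        //    before anything that needs a drawable.

        guard let commandBuffer = commandQueue.makeCommandBuffer() else {
            semaphore.signal() // nothing was submitted, so free the slot
            return
        }
        commandBuffer.addCompletedHandler { _ in semaphore.signal() }

        // 2. Only now touch the drawable. If none is free, this is where
        //    the thread may block.
        if let passDescriptor = view.currentRenderPassDescriptor,
           let drawable = view.currentDrawable,
           let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor) {
            encoder.setRenderPipelineState(pipeline)
            // Bind buffers and issue draw calls here.
            encoder.endEncoding()
            commandBuffer.present(drawable)
        }

        // Committing even an empty command buffer lets the completed
        // handler run, so the semaphore is always signaled.
        commandBuffer.commit()
    }
}
```

This ordering does not remove the wait when every drawable is still in use. It only ensures that the time spent waiting is not stacked on top of CPU work that could have been done first.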

Ken Thomases answered Oct 13 '22 00:10


I solved this by adding an @autoreleasepool so the drawable is released as soon as I am done with it.

Note the use of @autoreleasepool here:

https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/MTLBestPracticesGuide/Drawables.html
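
@autoreleasepool is the Objective-C spelling; in Swift the equivalent is the autoreleasepool { } function. A minimal sketch of what this could look like in an MTKView subclass follows. The class name PooledGraphView and its commandQueue property are assumptions for illustration, not code from the question or from the linked guide.

```swift
import MetalKit

// Hypothetical MTKView subclass; commandQueue is assumed to be set during setup.
final class PooledGraphView: MTKView {
    var commandQueue: MTLCommandQueue?

    override func draw(_ rect: CGRect) {
        // Scope the drawable to this pool so that autoreleased references
        // to it are dropped when the frame's encoding is finished.
        autoreleasepool {
            guard let commandBuffer = commandQueue?.makeCommandBuffer(),
                  let passDescriptor = currentRenderPassDescriptor,
                  let drawable = currentDrawable,
                  let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor)
            else { return }

            // Set pipeline state, bind buffers and issue draw calls here.
            encoder.endEncoding()

            commandBuffer.present(drawable)
            commandBuffer.commit()
        }
    }
}
```

The idea, as I understand it, is that a drawable kept alive past the end of the frame cannot be recycled, so a later request for a drawable may have to wait for it.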

TJez answered Oct 13 '22 00:10