• generic
    • os_x

      Currently we create a new MTLCommandBuffer each time Prism makes a draw call, which results in several MTLCommandBuffers being committed per frame.
      We should instead have one or two MTLCommandBuffers per frame, as per the best practices suggested by Apple here:
      https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/MTLBestPracticesGuide/CommandBuffers.html

      We need to follow this guideline to improve performance.
      Below are the performance observations from a basic trial of this change.
      Changes:
      1. Single CommandBuffer for each frame.
      2. Remove [commandBuffer waitUntilCompleted]
      3. Remove the for loop that copies RTT data (in Java_com_sun_prism_mtl_MTLRTTexture_nReadPixelsFromContextRTT). Note: with this change the window displays a black screen.

      Test execution and observations:
      40,000 Rectangles:
      ES2: Rectangle (Objects Frames FPS), 40000, 148, 14.741
      MTL: Rectangle (Objects Frames FPS), 40000, 174, 17.376 (with all 3 changes above)
      MTL: Rectangle (Objects Frames FPS), 40000, 172, 17.102 (with only changes 1 and 2 above)

      10,000 Rectangles:
      ES2: Rectangle (Objects Frames FPS), 10000, 548, 54.793
      MTL: Rectangle (Objects Frames FPS), 10000, 552, 55.114 (with all 3 changes above)
      MTL: Rectangle (Objects Frames FPS), 10000, 454, 45.370 (with only changes 1 and 2 above)
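
To make the numbers above easier to compare, here is a small sketch that computes the relative FPS difference between MTL and ES2. The FPS values are copied from the results above; the class and method names (FpsComparison, percentChange, report) are made up for illustration and are not part of Prism.

```java
import java.util.Locale;

// Illustrative helper only: compares the FPS figures reported in this issue.
final class FpsComparison {

    /** Percent change of candidate relative to baseline (positive means faster). */
    static double percentChange(double baselineFps, double candidateFps) {
        return (candidateFps - baselineFps) / baselineFps * 100.0;
    }

    /** Formats one comparison line with a signed, one-decimal percentage. */
    static String report(String label, double es2Fps, double mtlFps) {
        return String.format(Locale.ROOT, "%s: MTL vs ES2 %+.1f%%",
                label, percentChange(es2Fps, mtlFps));
    }

    public static void main(String[] args) {
        // FPS values taken from the test runs listed in this issue.
        System.out.println(report("40,000 rects, changes 1-3", 14.741, 17.376));
        System.out.println(report("40,000 rects, changes 1-2", 14.741, 17.102));
        System.out.println(report("10,000 rects, changes 1-3", 54.793, 55.114));
        System.out.println(report("10,000 rects, changes 1-2", 54.793, 45.370));
    }
}
```

By this calculation MTL is roughly 18% and 16% ahead of ES2 at 40,000 rectangles, while at 10,000 rectangles it is about level with all three changes and roughly 17% behind with only changes 1 and 2.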

      More about the above three tasks:
      1. Single CommandBuffer for each frame -> We might need more than one CommandBuffer per frame. We need to identify all the scenarios that would lead to committing a CommandBuffer.
      2. Remove [commandBuffer waitUntilCompleted] -> We need to use the completion handler provided by MTLCommandBuffer. This will need some synchronisation logic among the blit from the RTT to the CAMetalLayer, the handling of completion handlers, and the committing of a new command buffer.
      3. Remove the for loop for copying RTT data -> We will eventually use a BlitEncoder.
      With the above changes done correctly, Metal should be close to ES2 for large data.
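
The synchronisation described in task 2 can be sketched as a plain-Java model. This is not Prism or Metal code and makes no Metal calls: a single-threaded executor stands in for the GPU queue, a Runnable stands in for the encoded command buffer, and releasing the semaphore plays the role of the completion handler. All names (FrameInFlightModel, renderFrame, finish) and the limit of 2 frames in flight are assumptions chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: one "command buffer" per frame, no blocking wait after commit.
final class FrameInFlightModel {
    private final Semaphore inFlight;                       // bounds frames pending on the "GPU"
    private final ExecutorService gpu =
            Executors.newSingleThreadExecutor();            // stands in for the GPU command queue
    private final AtomicInteger completed = new AtomicInteger();

    FrameInFlightModel(int maxFramesInFlight) {
        this.inFlight = new Semaphore(maxFramesInFlight);
    }

    /** "Commits" one frame; blocks only when too many frames are still pending. */
    void renderFrame(Runnable encodedWork) {
        inFlight.acquireUninterruptibly();
        gpu.execute(() -> {
            try {
                encodedWork.run();                          // the frame's encoded work
            } finally {
                completed.incrementAndGet();
                inFlight.release();                         // the "completion handler"
            }
        });
    }

    /** Drains the queue and returns how many frames completed. */
    int finish() {
        gpu.shutdown();
        try {
            gpu.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        FrameInFlightModel model = new FrameInFlightModel(2);
        for (int frame = 0; frame < 5; frame++) {
            model.renderFrame(() -> { /* draw calls for this frame would be encoded here */ });
        }
        System.out.println("frames completed: " + model.finish());
    }
}
```

The point of the design is that the render thread never waits for a specific frame to finish; it only stalls when the allowed number of frames is already pending, which is what removing waitUntilCompleted is meant to achieve.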

      After that we can re-evaluate FPS for smaller data (which should definitely be higher than the current numbers).

            arapte Ambarish Rapte