VkCmdDrawIndexedIndirectCount Func… | Apple Developer Forums

vkCmdDrawIndexedIndirectCount functionality under Metal

frustumo (OP), May ’21

Hello,

It looks like my previous question was closed without being resolved:
https://developer.apple.com/forums/thread/668171

Here are the FPS values from our new benchmark. Indirect command buffers are not working properly, so there is no way to emulate multi-draw indirect count functionality other than a loop of draw indirect commands. As you can see below, the same hardware runs three times slower under Metal because of it, and Apple M1 performance is worse than AMD integrated graphics performance.

We have a buffer with multiple draw commands. How should we render it efficiently under Metal?

AMD Vega 56 eGPU:
  • Direct3D12: 94.0
  • Direct3D11: 87.2
  • Vulkan: 91.1
  • Metal: 35.8

AMD Ryzen™ 7 4800H:
  • Direct3D12: 21.1
  • Direct3D11: 19.4
  • Vulkan: 20.5

Apple M1:
  • Metal: 16.9

Thank you

Graphics and Games Engineer (Apple), May ’21

What problems are you seeing with indirect command buffers? It would be helpful if you created a request via Feedback Assistant so we can see what might be going on.

frustumo (OP), May ’21

Everything is described in the original post. The main problem is performance, because even a loop of draw indirect calls is faster than an indirect command buffer:
https://www.icloud.com/iclouddrive/0ICuhBkHgGuLjCxaJwRyHoLmw#execute_commands_in_buffer
https://www.icloud.com/iclouddrive/0hDo_q0oXs4uzC25yZdKmL83A#multiple_draw_indirect

I filed a Feedback Assistant report more than half a year ago. There was no answer, so I wrote here.

Thank you!

Graphics and Games Engineer (Apple), Jun ’21

Hi frustumo,

I checked on each of the tickets mentioned in the last thread.

  • FB8254449 - Still under investigation, but as mentioned in the other thread, you should be able to use ICBs, although it sounds like you weren't able to get the performance you wanted.
  • FB8638856 - Closed because you created 2 other FBA requests for the separate issues there.
  • FB8928674 - Got stuck because I guess the driver engineer thought he needed an Xcode project. I just pointed out that you attached a reproducer and am trying to get the driver team to look at it again.
  • FB8928678 - Was looked at some. One engineer suspects the perf issue is due to the low bandwidth to the eGPU and the GPU fetching the ICB over the Thunderbolt bus. There is no way for you to control the location of the ICB though, so this would be something the GPU driver team needs to handle. I have pushed this to the AMD driver team to look at. I'm asking what else you may be able to do in the meantime.

frustumo (OP), Jun ’21

Thank you for your answer.

I have created a new issue, FB9127527, with the benchmark and more information inside.

Graphics and Games Engineer (Apple), Jun ’21

Would you be able to provide an Xcode project to reproduce the issue?

frustumo (OP), Jun ’21

There are links inside FB9127527 to a notarized application for macOS, Windows, and Linux. The other Feedback Assistant reports contain multiple simple tests that reproduce the problem on macOS. Thank you.

Alecazam, Jun ’21

Quoting frustumo: "So there is no way to emulate multi-draw indirect count"

There is. But you have to call drawPrimitives or drawIndexedPrimitives multiple times, each call pointing at the next indirect draw in the buffer. I don't know why Metal left the drawCount out of the API, but the current implementation effectively has a draw limit of 1. The nice thing is that indirect draw works back to iPhone 5S.
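
The loop described above comes down to issuing one indirect draw per record, each with a byte offset that steps through the argument buffer. Below is a minimal CPU-side sketch of that offset arithmetic in C++. The struct mirrors the field layout of Metal's MTLDrawIndexedPrimitivesIndirectArguments; the helper name `indirectDrawOffsets` and the 4-byte alignment check are assumptions made for illustration and are not part of any Metal API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the layout of Metal's MTLDrawIndexedPrimitivesIndirectArguments:
// five tightly packed 32-bit fields, so each record is 20 bytes.
struct DrawIndexedIndirectArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t indexStart;
    int32_t  baseVertex;
    uint32_t baseInstance;
};
static_assert(sizeof(DrawIndexedIndirectArgs) == 20, "expected a 20-byte stride");

// Hypothetical helper: returns the byte offset to pass as the indirect buffer
// offset of each draw call, for maxDrawCount consecutive records that start
// at baseOffset inside the argument buffer.
std::vector<std::size_t> indirectDrawOffsets(std::size_t baseOffset,
                                             std::size_t maxDrawCount) {
    // Assumption: offsets are kept 4-byte aligned to match the 32-bit fields.
    assert(baseOffset % 4 == 0);
    std::vector<std::size_t> offsets;
    offsets.reserve(maxDrawCount);
    for (std::size_t i = 0; i < maxDrawCount; ++i) {
        offsets.push_back(baseOffset + i * sizeof(DrawIndexedIndirectArgs));
    }
    return offsets;
}
```

In a real renderer, each returned offset would be handed to one indirect drawIndexedPrimitives call on the render command encoder, so the CPU cost grows with the maximum draw count rather than with the number of draws that survive culling.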

You can even fill the indirect argument buffers on the GPU with compute, and then draw from them indirectly. But you still have to call draw 10 times, even if the compute pass culls and produces only 5 results, so make sure to set instanceCount to 0 in the remaining draw arguments. Or, if you can wait a frame, you could return the count to the CPU.
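
The culling approach described above can be sketched on the CPU as a compaction step: surviving draws are packed at the front of the argument buffer, and every remaining record is left with an instance count of 0 so that a fixed-length loop of indirect draws does nothing for them. This is only a stand-in for what a compute kernel would do on the GPU; the function name `compactVisibleDraws` and the per-draw visibility flags are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same field layout as Metal's MTLDrawIndexedPrimitivesIndirectArguments.
struct DrawIndexedIndirectArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t indexStart;
    int32_t  baseVertex;
    uint32_t baseInstance;
};

// CPU stand-in for a compute culling pass (hypothetical helper).
// Writes the visible draws to the front of `out` and leaves the rest zeroed,
// so their instanceCount is 0 and the corresponding draw calls are no-ops.
// Returns the number of live draws (the value a GPU pass would write to a
// count buffer, or report back to the CPU a frame later).
std::size_t compactVisibleDraws(const std::vector<DrawIndexedIndirectArgs>& candidates,
                                const std::vector<bool>& visible,
                                std::vector<DrawIndexedIndirectArgs>& out) {
    assert(candidates.size() == visible.size());
    // Start with every record zeroed, i.e. instanceCount == 0.
    out.assign(candidates.size(), DrawIndexedIndirectArgs{0, 0, 0, 0, 0});
    std::size_t live = 0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (visible[i]) {
            out[live++] = candidates[i];
        }
    }
    return live;
}
```

The encoder still issues one indirect draw per slot in `out`, which is the cost the thread is discussing; the returned count only helps if it can be read back on the CPU a frame later, as suggested above.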

frustumo (OP), Jul ’21

This is the same eGPU hardware with 3 times lower performance under Metal:
https://gravitymark.tellusim.com/report/?id=bc453e851c5dede3cedef6c3ac9caca2f8dffa47
https://gravitymark.tellusim.com/report/?id=7f1b799adc588938fc02f140a2ee48dbd4f36e69

frustumo (OP), Nov ’21

ICBs are working stably with the latest OS updates. We have updated our macOS benchmark and released the iOS version:
https://apps.apple.com/us/app/gravitymark-gpu-benchmark/id1595186532

ICBs give a 2.5x performance boost in comparison with the previous version. Thank you for the great improvements.
