PERFORMANT REFLECTIVE BEAUTY: HYBRID RAY TRACED REFLECTIONS IN FAR CRY 6
STEPHANIE BRENHAM, UBISOFT TORONTO
IHOR SZLACHTYCZ, AMD
[AMD Official Use Only]
AMD PUBLIC | PERFORMANT REFLECTIVE BEAUTY IN FC 6 | STEPHANIE AND IHOR | GDC - MARCH 2022

REFLECTIONS BACK ON THE UBISOFT TEAM
Stephanie Brenham
· 3D Team Lead Programmer
· Far Cry 6
· The Game Awards Future Class 2021
· Maya & mental ray for Maya
Anton Remezenko
· Senior 3D programmer
· Far Cry 6
· Far Cry 5
Aleksei Shevchenko
· 3D programmer
Mikhail Shostak
· 3D programmer

FAR CRY 6
· FPS set on a fictional tropical island
· Humid tropical setting
  · See "Simulating Tropical Weather in Far Cry 6"
· Lots of wet and highly reflective surfaces

REFLECTIONS IN FAR CRY 6
Pipeline stages: Generate Rays → *SSLR → Ray Trace & Lighting → Particles Ray Trace → Integrate
1. Generate Ray Buffer
2. SSLR
3. HW Ray trace
4. Particles HW Ray trace
5. Integrate results
*SSLR based on Tomasz Stachowiak, "Stochastic Screen Space Reflections", SIGGRAPH 2015

IMPORTANCE SAMPLING
[Diagram labels: N, V, BRDF]
· GGX BRDF to obtain reflection lobe
· Importance sampling reduction
· Pre-generated Hammersley sequence
· Store rays in buffer for later use
Problem: Still too many rays per texel

SOLUTION: REUSE NEIGHBOURS
[Diagram labels: N, V, BRDF]
· Wanted: one ray per texel per frame
· Close neighbors use similar IS with Hammersley distribution
· Collect hit points from neighbors' rays
· Reuse these values in the BRDF integration

RAY BUFFER TO BE USED LATER
[Diagram labels: N, V, L; Shrink; Hemisphere; Rays; Buffer magnification]
· Store in half resolution ray buffer

SCREEN SPACE LOCAL REFLECTIONS
· SS trace faster than HW ray tracing
· SSLR is fallback while BVH builds
· Green shows where we use SSLR
Problem: SSLR found to be unstable with the limited trace steps

SOLUTION: LINEAR TRACE WITH REFINE
[Diagram labels: N, V]
· Replaced Hierarchical Z with Linear Trace
· 64 large steps, 8 refine steps
· Stable with camera rotation
· Same performance as Hierarchical Z
Problem: Edges were being missed
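As a rough illustration of the linear trace with refine, the sketch below marches a ray in 64 coarse steps and, once the ray passes behind the depth buffer, narrows the hit with 8 refinement steps. The slides give only the step counts. The bisection used for refinement, the function name `linear_trace_with_refine`, and the two depth callbacks are assumptions made for this example, not the shipped shader.

```python
from typing import Callable, Optional


def linear_trace_with_refine(
    ray_depth: Callable[[float], float],    # hypothetical: ray depth at parameter t in [0, 1]
    scene_depth: Callable[[float], float],  # hypothetical: depth buffer value under the ray at t
    coarse_steps: int = 64,
    refine_steps: int = 8,
) -> Optional[float]:
    """Return the ray parameter t of the first hit, or None if nothing is hit."""
    prev_t = 0.0
    for i in range(1, coarse_steps + 1):
        t = i / coarse_steps
        # The ray is "behind" the visible surface once its depth exceeds the buffer.
        if ray_depth(t) >= scene_depth(t):
            lo, hi = prev_t, t
            # Assumed refinement: bisect between the last miss and the first hit.
            for _ in range(refine_steps):
                mid = 0.5 * (lo + hi)
                if ray_depth(mid) >= scene_depth(mid):
                    hi = mid
                else:
                    lo = mid
            return hi
        prev_t = t
    return None


if __name__ == "__main__":
    # Ray depth grows linearly; the surface is a flat wall at depth 0.4037.
    hit = linear_trace_with_refine(lambda t: t, lambda t: 0.4037)
    print(hit)
```

With these inputs the coarse loop brackets the hit between steps 25 and 26, and eight bisections shrink that bracket to 1/(64·256) of the ray length, which is why a fixed, small step budget can still give a precise hit point.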

SSLR LINEAR TRACE EDGE CASES
[Diagram labels: N, V]
· The larger linear traces could step over edges
· No geometry detected: edges are lost

SOLUTION: RANDOM RAY OFFSETS
[Diagram labels: N, V]
· Random ray start offsets
· Edges more stable
Wanted: Better performance for glossy reflections

SS TILE CLASSIFICATIONS
· Smooth surfaces like chrome need higher precision
· Fast trace green tiles
· High precision trace red tiles

SS FAST CONE TRACE
[Diagram labels: N, V]
· Cone angle varies by roughness
· Select the mip of the depth map according to number of pixels in the circle
Performant glossy reflection

HW RAY TRACE ACCELERATION STRUCTURES
How to divide up all this space?
· BVH is the industry standard
Traversing the BVH
· TLAS and BLAS

HW RAY TRACING ONLY IN NECESSARY PLACES
· Far distances: Only SSLR
  · BVH stops far from player
· Close distances: Only HW RT
  · Likely to be offscreen reflections
· Mid distances: it depends...
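Looking back at the SS fast cone trace, the mip selection can be sketched as follows: the cone widens with distance, and the depth mip is chosen so that one texel roughly covers the cone's circle. The slides do not give the roughness-to-angle mapping or the mip limit, so the linear mapping, the 30 degree maximum half-angle, `max_mip`, and both function names are assumptions for illustration only.

```python
import math


def cone_half_angle(roughness: float,
                    max_half_angle_rad: float = math.radians(30.0)) -> float:
    """Assumed mapping: half-angle grows linearly with roughness in [0, 1]."""
    return max(0.0, min(1.0, roughness)) * max_half_angle_rad


def depth_mip_for_cone(distance_px: float, roughness: float, max_mip: int = 6) -> int:
    """Pick the depth-map mip whose texel size is close to the cone diameter in pixels."""
    diameter_px = 2.0 * distance_px * math.tan(cone_half_angle(roughness))
    if diameter_px <= 1.0:
        return 0  # the circle fits inside one full-resolution pixel
    # Each mip halves resolution, so a circle of 2^k pixels maps to mip k.
    return min(max_mip, int(math.floor(math.log2(diameter_px))))


if __name__ == "__main__":
    for roughness in (0.0, 0.25, 0.5, 1.0):
        print(roughness, depth_mip_for_cone(100.0, roughness))
```

Rougher surfaces and longer distances land on coarser mips, so a single sample stands in for many full-resolution depth reads, which is where the saving for glossy reflections would come from.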

REFLECTING WITH CONFIDENCE: SSLR
[Diagram labels: N, V]
· Changes in neighboring pixel depth are normally small
· Large depth changes in neighbouring pixels
  · Missing offscreen details
  · Low SSLR confidence

REFLECTING WITH CONFIDENCE: HW RAY TRACING
· HW Ray trace confidence 100% close to player
· Confidence gradually decreases to 0% at edge of BVH
· Prevents popping

REFLECTING WITH CONFIDENCE: HYBRID
· Green is full confidence SSLR
  · Use only SSLR
· Teal is mid confidence SSLR
  · Hybrid: use trace with highest confidence
· Blue is no confidence SSLR
  · Use only HW RT
· Other reflections: no confidence in SSLR nor HW RT
  · Use environment map reflections
See "Simulating Tropical Weather in Far Cry 6" for more details on environment maps

PEEK BEHIND THE CURTAIN

RAYTRACE REFLECTION GRAPH
[Frame graph diagram across CPU threads, the Render Thread, GPU and GPU Async. Labels: Visibility; Collect Clusters; Collect Trees; Collect Instances; Collect Terrain; Collect Particles; BLAS requests; TLAS request; Prepare; Skinning; Primitives; Build AS; IS Gen Rays; Particle AS; Sync AS; SSLR Trace; DXR Trace; DXR Particles; Integrate]

BVH TREE
[Diagram: TLAS with instances pointing at BLAS 0 ... BLAS N; each BLAS holds VB0 ... VB31 and IB0 ... IB31; Frame N-X, Frame N-1 and Frame N each show BLAS 0 ... BLAS 11 with VB Reskin, BLAS Rebuild]
· One BLAS per Object (or Material)
  · Tip: Per object is preferred
· Static BLASes are built once
· Skinned BLASes need endless rebuild loop
  · First person player character rebuilds each frame
  · Player weapon & attachments as well
· 16 BLASes rebuild per frame (static/skinned)
  · Max 12 skinned

HW RAY TRACING & LIGHTING TOGETHER
[Diagram labels: GPU: HW Ray Trace, Lighting each Material; GPU: Prebuild & Unify primitives, HW Ray Trace & Lighting]
· Lighting shader and ray trace shader in one
· Parameters for lighting unified and stored per vertex
· Each material has specific shader to map its lighting to the unified lighting parameters
· Object material parsed on BLAS creation stage
· Performance hit of shader variations is a one-time cost during BLAS creation

LIGHTING IN THE HW RAY TRACING PIPELINE
· Graphics pipeline: interpolated values provided by the Rasterizer to Pixel Shaders
· Ray tracing pipeline: interpolation done in the lighting shader
· Unified lighting parameters stored in DXR primitives

PRIMITIVE DATA
DXR Prim, 24 bytes
  DXR Vertex 0
    - albedo, RGB8
    - smoothness, A8
    - normal (oct), RG8
    - metallic, B8
    - reflectance, A8
  DXR Vertex 1
  DXR Vertex 2
· Unwrap Vertex and Index buffers to Primitives before BLAS created
  · Causes duplication but ensures better coherency for interpolation
· Index buffer not needed for lighting computations
  · Removes an additional indirection
· Vertex data is packed

PRIMITIVE POOL
· Primitive Buffer holds all primitive information from every BLAS
· Primitive Offset Buffer redirection to primitives for BLAS instances in TLAS
  · Required to locate primitive data using the PrimitiveIndex supplied by DXR
· Fixed Geometry count per BLAS max 32
[Diagram: BLAS (VB0 ... VB31, IB0 ... IB31); Primitive Offset Buffer (BLAS instance N, N+1, N+2, each with G0 ... G31); Primitive Buffer, 192MB (Primitives0 ... Primitives31)]

LIGHTING THE PRIMITIVE
[Diagram labels: V0, V1, V2, N, V]
· Interpolate primitive vertex data using barycentric coordinates
· Contributors to lighting the primitive
  · Direct Lighting (Sun)
  · Indirect Lighting (GI, Env Maps)
  · Shadows
Future Work: support dynamic lights

HW RAY TRACE BARYCENTRIC COORDINATES
[Image]

HW RAY TRACE PRIMITIVES LIGHTING
[Image]

PARTICLES IN FAR CRY 6
· Far Cry games are known for
  · explosions
  · dynamic systemic fire
  · New for Far Cry 6: Poison
· Particle reflections important for Far Cry 6
Problem: Particles are just 2D billboards

PARTICLES HARDWARE RAY TRACE
[Diagram labels: N, V]
· Reuse ray length from SSLR/HW RT
  · Clips invisible particles
· BLAS Instance per Particle
  · 3 BLAS instances in this diagram
· Collect all particles along the ray for blending
Performance Tip: Build separate TLAS for particles

PARTICLES: THE CASE OF THE MISSING SPRITES
[Diagram labels: N, V]
2D sprites are oriented to be visible to camera
Problem: Sprites are often not visible to the reflection ray

SOLUTION: EXTRA QUADS
[Diagram labels: N, V]
Extra quads aligned by local space coordinate system planes (X, Y) for each particle
Problem: Expensive to test intersection on all the quads

SOLUTION: MASK QUADS BY VIEW SPACE AXIS
[Diagram labels: N, V]
· Quads masked by view space axis direction in BVH
· Select quad perpendicular to ray
· Simplified lighting
  - No Env Map
  - Low Quality Shadows

[Image comparison: SSLR PARTICLES, RAY TRACED PARTICLES]

INTEGRATE
[Diagram labels: V, R, L; Frame 0,2,4...; Frame 1,3,5...]
· Reuse neighbors' color results to integrate rays from multiple directions
· 9 neighbors per frame
· Central pixel required for mirror reflections
  · Used every frame
· Weighted by BRDF coefficient

REFLECTION MOTION
[Diagram labels: Current frame; N, V, R]

REFLECTION MOTION: REPROJECTION
[Diagram labels: Current frame; N, V, R]
· Usually calculated with reprojection
  - Reproject reflection result through surface plane and get position in World Space

REFLECTION MOTION: PREVIOUS FRAME
[Diagram labels: Current frame; Previous frame; N, V, R]
· Usually calculated with reprojection
  - Reproject reflection result through surface plane and get position in World Space
  - Put World Space position into Screen Space using previous frame's camera

REFLECTION MOTION: BACK IN SCREEN SPACE
[Diagram labels: Current frame; Previous frame; Current; Previous; N, V, R]
· Usually calculated with reprojection
  - Reproject reflection result through surface plane and get position in World Space
  - Put World Space position into Screen Space using previous frame's camera
  - Take the difference in Screen Space
· For objects that move with camera
  · Use GBuffer motion

TEMPORAL ACCUMULATION
[Diagram labels: INVALID; Previous frame reflection motion; VALID; Current frame reflection motion]
· Unlike TAA: history is not rejected based on motion vector length
  · Ignoring large reflection motion vectors can result in specular explosions on rough surfaces
· History validated by reflection motion derivative/differential
  · Discard incoherent reflection motion

SSLR
[Image]

SSLR/HW RAY TRACE MASK
[Image]

HYBRID RAY TRACED REFLECTIONS
[Image]

OPTIMIZING RAYTRACING IN FAR CRY 6

REFLECTIONS AT THE AMD TEAM
· Ihor Szlachtycz (presenter)
  · AMD GPU Dev Tech for Far Cry 6
· Zhuo Chen
  · AMD GPU Dev Tech for Far Cry 6
· John Hartwig
  · AMD CPU Dev Tech for Far Cry 6

OPTIMIZATION OVERVIEW
· We will be going over how Hybrid Reflections raytracing was optimized for Far Cry 6
  · Shader table management
  · Hit shader design
  · BVH building
· All performance captures for Far Cry 6 were done on a Radeon 6800XT GPU with a Ryzen Threadripper 3970X
· But first, we will give a quick overview of Radeon GPU Profiler (RGP)

RADEON GPU PROFILER (RGP) OVERVIEW
[RGP screenshots with the following callouts]
· Wavefront occupancy
  · Graphics wavefronts
  · Async compute wavefronts
· High Frequency Counters
· Selected event
· Event timeline (draws, dispatches, barriers etc)
· Event details

GENERAL RAYTRACING PERFORMANCE
· Here we have an overview of what an RGP trace looks like with Hybrid RayTraced Reflections
[Trace callouts: Shadows; BVH Building; DXR RayTrace]

RAY TRACING DIVERGENCE
· Ray tracing is a latency heavy operation, very affected by divergence
· We look at the forms of divergence below and how Far Cry 6 tackles them
  · Ray Divergence
  · Shader Table Divergence
  · Resource Divergence
[Diagram labels: Traverse (x3); Get Shader Entry; Read Resource Inputs; Shade; Write Output]

RAY DIVERGENCE
· Different rays can hit different BLASes, each with a different number of levels that the ray has to traverse
· Executing shader will have to traverse for the worst case
[Diagram: remaining traversal levels per lane in a wave, counting down over four Traverse steps: 243..., 132..., 021..., 010..., 000...]

SHADER TABLE DIVERGENCE
· Occurs when different rays evaluate different shader table entries for hit/miss
· We have to evaluate all the hit materials in our thread group sequentially, like a branch
[Diagram labels: Wave ...; Eval (x4)]

RESOURCE DIVERGENCE
· Occurs when a shader issues a resource read, but different threads in the wave can be accessing different resources
· We need to loop through all unique resources in our wave and have each issue the read for the threads using that resource
· In HLSL, this is done via NonUniformResourceIndex
[Diagram: Texture Descriptor Array (A, B, C, D, ...); Threads 0, 1, 2, 3, ...; Textures[textureId].Sample issues Read Texture A, Read Texture B, Read Texture D, ...; Texture.Sample issues a single Read Texture]

RAY TRACING HIT EXECUTION
· Small shader table:
  · Everything gets inlined and the code behaves like a series of if statements
· Large shader table:
  · Too many shaders to inline, compiler will emulate a function calling convention with loading/storing function arguments/return values in LDS
· hit_table_index does not have to be uniform for a thread group

SHADER TABLES (NON INLINED)
· Example of what you will see in RGP when your shader table isn't inlined
· Lots of LDS usage, instruction cache misses, wasted instructions for writing/loading function arguments
· On a Radeon™ 6800XT at 4K, forcing the shader table to not be inlined is about 2-3x slower
[Trace callout: Excess instructions and LDS Traffic]
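Shader table divergence and resource divergence, as described in the preceding slides, share one pattern: the wave makes one serialized pass per unique index (shader table entry or texture) held by its lanes. The sketch below simulates that loop on the CPU to show how the pass count grows with divergence. The function name `serialized_passes` and the list-based lane model are assumptions for illustration and do not reflect actual compiler output or hardware behaviour.

```python
from typing import List, Sequence, Tuple


def serialized_passes(per_lane_index: Sequence[int]) -> List[Tuple[int, List[int]]]:
    """Group the lanes of one wave by the index each lane wants to use.

    Returns one (index, lanes) entry per pass: every pass handles a single
    unique index for all lanes that requested it, while other lanes sit idle.
    """
    remaining = list(range(len(per_lane_index)))  # lanes still waiting
    passes: List[Tuple[int, List[int]]] = []
    while remaining:
        # Take the index of the first waiting lane as the pass's uniform value.
        current = per_lane_index[remaining[0]]
        lanes = [lane for lane in remaining if per_lane_index[lane] == current]
        passes.append((current, lanes))
        remaining = [lane for lane in remaining if per_lane_index[lane] != current]
    return passes


if __name__ == "__main__":
    divergent = [0, 1, 0, 3, 1, 0, 3, 3]   # e.g. texture ids per lane
    print(len(serialized_passes(divergent)))      # three unique ids, three passes
    print(len(serialized_passes([2] * 32)))       # uniform wave, one pass
```

A wave whose lanes all use the same index needs a single pass, which is the situation a single hit shader reading from global buffers aims for.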

FAR CRY 6 SHADER TABLE
· Far Cry 6 has a single hit shader which interpolates vertex attributes that store all material properties
· Avoids lots of shader table divergence with only 1 hit and 1 empty miss shader
· Avoids resource divergence since all DXR Prims are in global buffers
  · No per material textures get accessed
· Ray divergence is partially mitigated by BVH building
  · We will get to this in the next section

AMD RAYTRACING PERFORMANCE GUIDE
· You can find our performance recommendations at https://gpuopen.com/performance/
· We have raytracing and BVH recommendations in the Ray Tracing section

BVH OVERVIEW

BLAS VERTEX GENERATION
· Far Cry 6 uses a compute shader to generate vertices for BLAS building instead of a VS/GS/HS/DS
  · Can't do per vertex/object culling if done in a geometry pass
  · Avoids redundancy with vertex cache miss
· Compute shader in Far Cry 6 stores material data per vertex, calculated by reading the material textures of the simplified models at the vertices
· Vertices generated by CS for BLAS can be reused for shadows and potentially other passes

                Culled Geometry    Vertex Cache Miss
VS Write        No                 Yes
GS Write        No                 Yes
HS/DS Write     No                 Yes
CS Write        N/A                No

BVH BUILDING
· Looking at the RGP trace again, you can see that BVH building is executed on async compute
[Trace callouts: Async BVH Building; Reflection Raytracing]

ASYNC BVH BUILDING
· BLAS/TLAS generation doesn't usually depend on other parts of the frame until you start ray tracing
· Can overlap with Gbuffer pass/shadows/early frame workloads
· Below is an image of non overlapped BVH generation (0.4-0.8 ms slower):
[Trace callouts: BVH Building; Reflection Raytracing]

ASYNC BVH BUILDING
· Low amounts of synchronization are needed between graphics and compute queue
  · On PC, the cost of syncing between graphics and compute queues is higher than on console
  · Far Cry 6 has one async compute command list with all the BVH building calls and syncs to just before SSLR

ASYNC BVH BUILDING
· Some frames have much higher BVH building times vs others, which async helps hide
· Make sure you launch async early enough to cover the variance

BVH TRAVERSAL COST
· Rays that miss geometry still pay the traversal cost for the TLAS hierarchy
  · Part of why having particles in separate TLAS gave good perf wins
· Best to avoid putting objects into TLAS if they are far away or unlikely to affect rendering result
· Camera is usually deep in a tree, so all rays can be affected by the higher traversal cost
[Diagram label: Ray Divergence]

BVH TRAVERSAL COST
· Far Cry 6 limits the objects that are in the TLAS to only a radius around the player
  · Clip-mapped terrain is included out to a longer range
· BLAS objects are at a lower LOD, fewer nodes to traverse to get to triangles
  · More correct rendering for distant objects, fully detailed objects would alias
  · Helps address ray divergence

HYBRID REFLECTIONS BLAS/TLAS TIPS
· Consider using CS for vertex generation for BLAS
  · Far Cry 6 generates BLAS positions and DXR Prim attributes in a single CS pass
· We recommend trying async compute for BLAS/TLAS building
  · Easy to overlap with early/mid parts of your frame, big perf win for Far Cry 6
· Try to limit amount of geometry in your TLAS to avoid cost of traversing large TLASes per frame
  · For Far Cry 6 reflections, only objects within a radius of 100 are put into the BVH
· Consider having lower LODs for reflection BLASes for faster tracing and lower VRAM usage
  · Far Cry 6 uses a constant lower LOD for objects in BLASes

HYBRID REFLECTIONS SAMPLE

HYBRID REFLECTIONS OVERVIEW
[Diagram comparing two pipelines]
· Far Cry 6: SSLR Confidence; Reflections Tracing (SSLR, HybridRays, DXR Rays); Far Cry 6 Denoiser
· Hybrid Sample: Hit/Miss counters; Variance; Reflections Tracing (SSLR, HybridRays, DXR Rays); FidelityFX-Denoiser

FEEDBACK COUNTERS
· We store a feedback counter for each set of 8x8 pixels (1 tile)
· The counter consists of one U32 with hit/miss counters for 2 frames of statistics
  U32 new_hitcounter:8 | old_hitcounter:8 | new_misscounter:8 | old_misscounter:8
· When deciding if a ray should be Hybrid or DXR, we use reprojected counters from previous frame
[Diagram label: 1 U32 per 8x8 tile of pixels]

FEEDBACK COUNTERS
· Reproject counters by picking a random motion vector in the 8x8 tile
· We can switch pixels from SSLR to DXR, also need a way to switch from DXR to SSLR
  · Problem is all our DXR rays will always hit something with high confidence
  · We solved this by randomly converting pixels in tiles from DXR to Hybrid
[Diagram label: Reprojection]

CLASSIFICATION
[Flow diagram]
· For HW rays, roll a die (0..1) to turn them into hybrid
· Hybrid pixels; RT pixels
· Successful screen space intersection: Increment hit counter
· Terminate or switch to raytracing: Increment miss counter
· Reproject using motion vectors

CONCLUSION
· Overall, Hybrid Raytraced Reflections is a powerful technique that gives a good quality boost over SSLR
· Keeping raytraced reflections hybrid really helps in making the technique viable for real time
  · Lots of performance saved by using SSLR where applicable
· Plan ahead for asset system integration and handling materials in ray tracing shaders
· Important to keep in mind how ray tracing runs on HW and its performance characteristics
· AMD's Hybrid Reflection sample is currently available at https://gpuopen.com/learn/hybrid-reflections/
  · DX12 implementation with docs and MIT licensed public source code

DISCLAIMER AND ATTRIBUTION
DISCLAIMER
The information contained herein is for informational purposes only and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document.
Terms and limitations applicable to the purchase or use of AMD products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale. GD-18
© 2022 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, Ryzen, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective owners.