Efficient Spatial Resampling Using the PDF Similarity
Author: Yusuke Tokuyoshi, Advanced Micro Devices, Inc., Japan
Abstract
In real-time rendering, spatiotemporal reservoir resampling (ReSTIR) is a powerful technique to increase the number of candidate samples for resampled importance sampling. However, reusing spatiotemporal samples is not always efficient when target PDFs for the reused samples are dissimilar to the integrand. Target PDFs are often spatially different for highly detailed scenes due to geometry edges, normal maps, spatially varying materials, and shadow edges. This paper introduces a new method of rejecting spatial reuse based on the similarity of PDF shapes for single-bounce path connections (e.g., direct illumination). While existing rejection methods for ReSTIR do not support arbitrary materials and shadow edges, our PDF similarity takes them into account because target PDFs include BSDFs and shadows. In this paper, we present a rough estimation of PDF shapes using von Mises-Fisher distributions and temporal resampling. We also present a stable combination of our rejection method and the existing rejection method, considering estimation errors due to temporal disocclusions and moving light sources. This combination efficiently reduces the error around shadow edges with temporal continuities. By using our method for a ReSTIR variant that reuses shadow ray visibility for the integrand, we can reduce the number of shadow rays while preserving shadow edges.
Introduction
Hardware ray tracing and Monte Carlo integration are used for recent real-time applications as well as offline renderers. However, to achieve real-time frame rates, the number of rays must be limited to a few per pixel. Therefore, importance sampling is vital to render high-quality images with such a limited ray count. Recent resampling techniques based on resampled importance sampling (RIS) generate samples approximately according to a target distribution by selecting samples from candidate samples. Spatiotemporal reservoir resampling (ReSTIR) is one of the most powerful RIS-based techniques. It significantly increases candidates by reusing samples from past frames and neighboring pixels.
Although ReSTIR can reuse thousands of samples, spatial reuse is not always efficient for highly detailed scenes, because each pixel has a different target distribution due to geometry edges, normal maps, spatially varying materials, and shadow edges. When the target distribution for the reused pixel is significantly different from the integrand (i.e., path contribution for lighting) at the current pixel, the reuse can increase variance. This mismatch also increases the bias of biased ReSTIR variants that reuse visibility to reduce the number of shadow rays for real-time applications. While existing ReSTIR methods reject samples from reuse using heuristics such as the similarity of geometries, these heuristics do not support shadow edges and arbitrary materials. Thus, ReSTIR with visibility reuse using two rays per pixel produces a darkening bias around shadow edges, and aggressive visibility reuse using fewer than two rays per pixel loses shadow edges.
This paper introduces a new rejection method for spatial reuse in single-bounce path connections (e.g., direct illumination). Our method uses similarity in shapes of normalized target distributions (i.e., target PDFs) in the light direction domain. Since the target PDF includes shadows and the bidirectional scattering distribution function (BSDF), we can detect shadow edges and material boundaries by using this PDF similarity. Although it is infeasible to get the exact PDF shape for each pixel at real-time frame rates, we roughly estimate the PDF with the von Mises-Fisher (vMF) distribution using temporal resampling. Thus, our method reduces error around shadow edges with temporal continuities. Our temporal estimation for the PDF shape does not trace additional shadow rays by reusing the initial sample from lighting estimation, while it can introduce a negligibly small bias. We show that this bias is barely perceptible in our experimental results. By applying our method for a biased ReSTIR with aggressive visibility reuse, we can render high-quality shadow edges with a small number of shadow rays.
Our contributions are as follows:
- We introduce a rejection method based on the similarity of PDF shapes for ReSTIR.
- To perform our method at real-time frame rates, we roughly approximate the PDF shape using a vMF. We also present a temporal estimation for this vMF approximation.
- To handle the estimation error of the PDF shape, we combine an existing rejection heuristic and our PDF similarity based on the temporal continuity between frames.
- We demonstrate the effectiveness of our method for several ReSTIR variants with different visibility computation methods.
Background
Related Work
Recent resampling techniques are built upon sampling importance resampling (SIR). SIR generates samples approximately according to a target distribution using a two-pass sampling algorithm. The first pass generates candidate samples according to a source PDF, and then the second pass selects samples from the candidates according to the ratio of the target distribution to the source PDF. For Monte Carlo integration, Talbot introduced resampled importance sampling (RIS), which unbiasedly normalizes the distribution of samples selected by SIR. We can perform SIR and RIS in a streaming manner using weighted reservoir sampling.
Bitterli et al. introduced spatiotemporal reservoir resampling (ReSTIR) to reuse samples across pixels and frames based on RIS. ReSTIR puts a sample into a reservoir for each pixel and then applies weighted reservoir sampling to spatiotemporal neighboring pixels. They also showed both biased and unbiased variants of their algorithm for direct illumination. In their biased variant, they reused the visibility of the initial sample for target distributions but did not reuse the visibility for the integrand. The reason is that their visibility reuse ignores high-frequency shadow edges in spatial reuse. Thus, the shadow edges in reused visibility are blurred and disappear during resampling. Wyman and Panteleev rearchitected the biased ReSTIR variant for production use. In this improved method, they reused the visibility for the integrand separately from Bitterli et al.'s visibility reuse. For their visibility reuse, they also proposed an adaptive shadow ray tracing based on the distance of current and reused pixels. Their adaptive approach allowed us to control the tradeoff between the performance and detailed shadows. Recently, ReSTIR has been extended to world-space reservoirs and multi-bounce path samples for global illumination. Lin et al. generalized RIS and ReSTIR. They also improved ReSTIR for path tracing by resampling similar paths using different domains.
To reduce variance and bias for ReSTIR and its biased variants, it is desirable to reuse only similar pixels for real-time rendering. Bitterli et al. heuristically rejected dissimilar pixels based on the similarity of geometries (i.e., depth and normal) as in an edge-stopping function for bilateral image denoising. This rejection heuristic prevents the propagation of samples and the blurring of reused visibility across geometry edges. Lin et al. used roughness parameters of surfaces and edge length for their connectivity of paths. Unlike these approaches, our rejection method uses the similarity of target PDF shapes to take shadow edges and arbitrary materials into account for single-bounce path connections.
Algorithm of ReSTIR
Spatiotemporal reservoir resampling (ReSTIR) builds upon resampled importance sampling (RIS). RIS first generates candidate samples xi according to a source PDF pi(xi) and then randomly selects a sample X from the candidates according to the weight of each candidate wi. For the one-sample case, this RIS estimator is written as:
$$ \int_{\Omega_s} f(x)dx \approx f(X)W_X $$
where Wx is an unbiased contribution weight which is an estimated reciprocal PDF given by:
$$ W_X = \frac{1}{p_s(X)} \sum_{i} w_i $$
where ps(x) ≈ f(x) is the target distribution and its shape is more similar to the integrand f(x) than the source PDF. For notations used in this paper, please see Table 1. In generalized RIS, the candidate weight is given by:
$$ w_i = m_i(T_i(x_i)) p_s(T_i(x_i)) W_i \left|\frac{\partial T_i}{\partial x_i}\right| $$
where mi(·) is the weight of multiple importance sampling (MIS) that satisfies Σi mi(x) = 1, Ti is a bijective shift mapping from the candidate's domain Ωi to the integral domain Ωs, and Wi is the contribution weight for the candidate i (e.g., Wi = 1/pi(xi) for classic RIS). One high-quality MIS weight for RIS is Talbot MIS, and it is generalized by Lin et al. as follows:
$$ m_i(x) = \frac{p_{s_i}(x)}{\sum_j p_{s_j}(x)} $$
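where the target distribution of the candidate i, shifted into the integral domain Ωs, is
$$ p_{s_i}(x) = \begin{cases} p_i(T_i^{-1}(x)) \left|\frac{\partial T_i^{-1}}{\partial x}\right| & \text{if } x \in T_i(\mathrm{supp}(p_i)) \\ 0 & \text{otherwise} \end{cases} $$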
For other MIS weights, please refer to Lin et al. In this generalized RIS, a sample X is selected from shifted candidates Ti(xi). For an infinite number of candidate samples, the resulting sample X follows the normalized target PDF p̄s(X) = ps(X)/||ps||, and Wx converges to 1/p̄s(X).
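As an illustration of Eqs. 1 to 3 in the classic setting, the following minimal Python sketch performs one-sample RIS with streaming weighted reservoir sampling. It assumes an identity shift mapping (Jacobian of 1), the uniform MIS weight mi = 1/M, and Wi = 1/pi(xi); the function names and the 1D test integrand are hypothetical and are not taken from the paper.

```python
import random

def ris_estimate(f, target, source_sample, source_pdf, num_candidates, rng):
    """One-sample RIS estimate of the integral of f (assumed setup, see lead-in)."""
    selected = None
    w_sum = 0.0
    for _ in range(num_candidates):
        x = source_sample(rng)
        # Candidate weight (Eq. 3) with m_i = 1/M, W_i = 1/p_i(x_i), Jacobian = 1.
        w = target(x) / (source_pdf(x) * num_candidates)
        w_sum += w
        # Weighted reservoir sampling: keep x with probability w / w_sum.
        if rng.random() * w_sum < w:
            selected = x
    if selected is None:
        return 0.0  # every candidate had zero weight
    contribution_weight = w_sum / target(selected)  # Eq. 2
    return f(selected) * contribution_weight        # Eq. 1

def demo(trials=20000, seed=1):
    """Average many RIS estimates of the integral of x^2 over [0, 1]."""
    rng = random.Random(seed)
    f = lambda x: x * x          # integrand
    target = lambda x: x         # unnormalized target distribution
    uniform = lambda r: r.random()
    total = 0.0
    for _ in range(trials):
        total += ris_estimate(f, target, uniform, lambda x: 1.0, 8, rng)
    return total / trials
```

Calling `demo()` averages many independent estimates, which should approach the exact integral 1/3 because this configuration of RIS is unbiased.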
ReSTIR is a chained form of the generalized RIS, and it increases the number of candidates by reusing spatiotemporal neighboring samples stored in each pixel. This algorithm (shown in Algorithm 1) first performs classic RIS for each pixel using a target distribution without shadow visibility. Then, a sample X is resampled from spatiotemporal neighboring pixels with a visibility test (or visibility reuse). This spatiotemporal resampling is performed using the candidate weight given by Eq. 3, where i is a reused pixel. The contribution weight Wx for each pixel is also updated using Eq. 2. In this chained RIS, we use an accumulated candidate count Mi for the MIS weight as follows:
$$ m_i(x) = \frac{M_i p_{s_i}(x)}{\sum_j M_j p_{s_j}(x)} $$
Sawhney et al. used a similar MIS weight for their temporal resampling. For the explicit form of the MIS weight used in this paper, please refer to Appendix A.
To improve efficiency, ReSTIR rejects dissimilar pixels from reuse by using some heuristics (e.g., geometry similarity). This rejection can be implemented by zeroing, reducing, or limiting the accumulated candidate count Mi for the reused pixel if the MIS weight takes the candidate count into account (as shown in Eq. 4). Bitterli et al. clamped the candidate count for past frames, which may be different from the current frame for dynamic scenes. This clamping is also one of the rejection heuristics. Such a reduction of candidate counts does not introduce a bias if the reduction rate is determined independently from samples. The rejection of dissimilar pixels is important to reduce error, especially for spatial reuse, because neighboring pixels can have detailed geometry, different materials, and shadow edges. The difference between pixels is often more significant than the difference between frames when lighting changes continuously. Therefore, we introduce a new rejection heuristic for spatial reuse.
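The effect of reducing a candidate count on the MIS weight of Eq. 4 can be sketched as follows. This is a minimal illustration, assuming the shifted target distributions have already been evaluated at the sample in question; `mis_weights` and `reject` are hypothetical helper names.

```python
def mis_weights(counts, shifted_targets):
    """Count-weighted MIS weights in the style of Eq. 4, evaluated at one sample.

    counts[i] is the accumulated candidate count M_i and shifted_targets[i] is
    the shifted target distribution of pixel i evaluated at that sample.
    """
    products = [m * p for m, p in zip(counts, shifted_targets)]
    total = sum(products)
    if total <= 0.0:
        return [0.0] * len(products)
    return [value / total for value in products]

def reject(count, h):
    """Scale a reused pixel's candidate count by a heuristic h in [0, 1]."""
    return count * h
```

With `h = 0` the reused pixel receives a zero MIS weight and no longer influences the resampling, while `h = 1` leaves the weights unchanged.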
Our Rejection Method for ReSTIR
As mentioned in Sect. 2.2, the reuse of similar pixels is desirable to reduce error for ReSTIR. While Lin et al. described such similar pixels as similar path contributions: f(x) ≈ f(Ti(x)) and |∂Ti/∂x| ≈ 1, we propose to reuse pixels with similar normalized target PDFs instead of unnormalized target distributions or path contributions (Sect. 3.1). Then, we introduce a similarity of normalized target PDFs between pixels for our rejection heuristic. This approach takes shadows and arbitrary materials into account since target PDFs include them. In this section, we present an efficient method to compute the PDF similarity for spatial reuse in single-bounce path connections. The pseudo code of our method is shown in Algorithm 2.
Resampling with Similar PDFs
ReSTIR reduces error by making the contribution weight Wx converge to 1/p̄s(X) for many candidate samples, where p̄s = ps/||ps|| is the normalized target PDF. In this case, substituting Wx ≈ 1/p̄s(X) and Eq. 3 into Eq. 2 yields:
$$ \sum_{i} m_i(T_i(x_i)) \bar{p}_s(T_i(x_i)) W_i \left|\frac{\partial T_i}{\partial x_i}\right| \approx 1 $$
When ReSTIR converges, we also obtain Wi ≈ 1/p̄i(xi). Therefore, we can rewrite Eq. 5 into:
$$ \sum_{i} m_i(T_i(x_i)) \frac{\bar{p}_s(T_i(x_i))}{\bar{p}_i(x_i)} \left|\frac{\partial T_i}{\partial x_i}\right| \approx 1 $$
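The substitution behind this step can be spelled out as follows, writing the normalized target PDF as \bar{p}_s = p_s / \|p_s\| (this intermediate derivation is our own expansion of the argument, under the convergence assumption stated above). Inserting W_X ≈ 1/\bar{p}_s(X) into Eq. 2 gives

$$ \frac{1}{\bar{p}_s(X)} \approx \frac{1}{p_s(X)} \sum_i w_i \quad \Rightarrow \quad \sum_i w_i \approx \frac{p_s(X)}{\bar{p}_s(X)} = \|p_s\| $$

and expanding w_i with Eq. 3 and dividing both sides by \|p_s\| gives

$$ \sum_i m_i(T_i(x_i)) \frac{p_s(T_i(x_i))}{\|p_s\|} W_i \left|\frac{\partial T_i}{\partial x_i}\right| = \sum_i m_i(T_i(x_i)) \bar{p}_s(T_i(x_i)) W_i \left|\frac{\partial T_i}{\partial x_i}\right| \approx 1 $$

Replacing W_i by its converged value 1/\bar{p}_i(x_i) then leaves only the ratio of normalized target PDFs and the Jacobian inside the sum.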
Algorithm 2: ReSTIR with our rejection method
DirectionTemporalResampling is the same as TemporalResampling except for the rejection of moving lights and the calculation of the average light direction vX. Using the average directions, OurRejectionHeuristic approximately computes the PDF similarity between pixels.
function ReSTIR(s)
    [xs, Ws, Ms] ← RIS(s);
    if Shadowed(xs) then Ws ← 0;
    [xs, Ws, Ms] ← TemporalResampling(s, [xs, Ws, Ms]);
    [xs, Ws, Ms] ← SpatialResampling(s, [xs, Ws, Ms], [W's, M's, vs]);
    StoreReservoir(s, [xs, Ws, Ms]);
    return f(xs) Ws;
function TemporalResampling(s, [xs, Ws, Ms])
    i ← PickTemporalNeighbor(s);
    [xi, Wi, Mi] ← GetReservoir(i);
    h ← TemporalRejectionHeuristic(s, i);
    Mi ← min(Mi, Mmax) * h; // Reduce the candidate count based on heuristics
    return Resampling(s, [xs, Ws, Ms], i, [xi, Wi, Mi]);
function SpatialResampling(s, [xs, Ws, Ms], [W's, M's, vs])
    i ← PickSpatialNeighbor(s);
    [xi, Wi, Mi] ← GetReservoir(i);
    hprev ← ExistingRejectionHeuristic(s, i);
    [x'i, W'i, M'i, vi] ← GetDirectionReservoir(i);
    hour ← OurRejectionHeuristic(vs, vi);
    t ← max((min(M's, M'i) − M) / Mmax, 0);
    if W's > 0 ∧ W'i > 0 then h ← lerp(hprev, hour, t) else h ← hprev;
    Mi ← Mi * h; // Reduce the candidate count based on heuristics
    return Resampling(s, [xs, Ws, Ms], i, [xi, Wi, Mi]);
function Resampling(s, [xs, Ws, Ms], i, [xi, Wi, Mi])
    [ms, mi] ← MISWeights(s, xs, Ms, i, xi, Mi);
    ws ← ms * ps(xs) * Ws; wi ← mi * ps(Ti(xi)) * Wi * |∂Ti/∂xi|;
    wsum ← ws + wi;
    rand ← GenerateRandomNumber();
    if rand < wi/wsum then X ← Ti(xi) else X ← xs;
    Wx ← wsum/ps(X);
    Mx ← Ms + Mi;
    return [X, Wx, Mx];
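The Resampling step of Algorithm 2 can be sketched in Python as follows. This is a simplified illustration rather than the paper's implementation: it assumes an identity shift mapping (so Ti(xi) = xi and the Jacobian is 1), scalar samples, and the count-weighted MIS weight of Eq. 4; the function and parameter names are hypothetical.

```python
import random

def resample(rng, target_s, target_i, reservoir_s, reservoir_i):
    """Merge the reservoir of a reused pixel i into the current pixel s.

    Each reservoir is a tuple (x, W, M). target_s and target_i are the
    unnormalized target distributions of the two pixels.
    """
    x_s, W_s, M_s = reservoir_s
    x_i, W_i, M_i = reservoir_i

    def mis(count_own, target_own, count_other, target_other, x):
        # Count-weighted MIS weight (Eq. 4) for two candidates.
        own = count_own * target_own(x)
        total = own + count_other * target_other(x)
        return own / total if total > 0.0 else 0.0

    m_s = mis(M_s, target_s, M_i, target_i, x_s)
    m_i = mis(M_i, target_i, M_s, target_s, x_i)
    w_s = m_s * target_s(x_s) * W_s
    w_i = m_i * target_s(x_i) * W_i  # Jacobian is 1 under the identity shift
    w_sum = w_s + w_i
    if w_sum <= 0.0:
        return (x_s, 0.0, M_s + M_i)
    x = x_i if rng.random() * w_sum < w_i else x_s
    p = target_s(x)
    W = w_sum / p if p > 0.0 else 0.0
    return (x, W, M_s + M_i)
```

Setting the reused count `M_i` to zero (full rejection) makes `m_i` vanish, so the current pixel's sample and contribution weight pass through unchanged.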
Since Σi mi(Ti(xi)) = 1, ReSTIR can have a small error in the following case:
$$ \bar{p}_s(T_i(x_i)) \left|\frac{\partial T_i}{\partial x_i}\right| \approx \bar{p}_i(x_i) $$
Since this condition is on the ratio of normalized target PDFs (denoted by a bar) instead of unnormalized target distributions, we use the similarity of normalized PDFs for our rejection heuristic. Although Eq. 6 is a necessary condition for convergence and not a sufficient condition, we show that our method reduces error in our experimental results (Sect. 4).
Similarity Computation for Spherical PDFs
vMF Approximation
It is difficult to obtain the exact shape of the normalized target PDF p̄s(ω) in practice. Therefore, as shown in Fig. 2b, we roughly approximate the PDF with the von Mises-Fisher (vMF) distribution on the unit sphere S² (a.k.a. normalized spherical Gaussian):
$$ \bar{p}_s(\omega) \approx g(\omega; \mu_s, \kappa_s) = \frac{\kappa_s}{4 \pi \sinh \kappa_s} \exp (\kappa_s (\omega \cdot \mu_s)) $$
where μs and κs are the lobe axis and sharpness that represent p̄s(ω). This vMF distribution is obtained from the average direction of the PDF, v̄s = ∫S² ω p̄s(ω) dω, using Banerjee et al.'s conversion:
$$ \mu_s = \frac{\bar{v}_s}{||\bar{v}_s||}, \quad \kappa_s = \frac{3||\bar{v}_s|| - ||\bar{v}_s||^3}{1 - ||\bar{v}_s||^2} $$
For this vMF approximation, we roughly estimate the average direction v̄s at real-time frame rates.
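The conversion from an average direction to a vMF lobe can be sketched as follows. The clamp on the vector length and the fallback axis for a zero vector are our own assumptions to keep the sketch numerically safe; they are not described in the paper, and the function names are hypothetical.

```python
import math

def vmf_from_average_direction(v, max_length=0.9999):
    """Return (mu, kappa) of a vMF lobe from an average direction v."""
    r = math.sqrt(sum(c * c for c in v))
    if r == 0.0:
        # Assumption: a zero average direction maps to a uniform lobe.
        return (0.0, 0.0, 1.0), 0.0
    mu = tuple(c / r for c in v)
    # Assumption: clamp the length so the denominator cannot reach zero.
    r = min(r, max_length)
    kappa = (3.0 * r - r ** 3) / (1.0 - r * r)
    return mu, kappa

def vmf_pdf(omega, mu, kappa):
    """Evaluate the vMF density on the unit sphere."""
    if kappa < 1e-6:
        return 1.0 / (4.0 * math.pi)  # uniform limit
    cos_angle = sum(a * b for a, b in zip(omega, mu))
    # Algebraically equal to kappa / (4 pi sinh(kappa)) * exp(kappa * cos_angle),
    # rearranged to avoid overflow for large kappa.
    norm = kappa / (2.0 * math.pi * (1.0 - math.exp(-2.0 * kappa)))
    return norm * math.exp(kappa * (cos_angle - 1.0))
```

A short average direction yields a small sharpness (a broad lobe), while a length close to 1 yields a sharp lobe, as expected for a nearly delta-like PDF.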
Temporal Estimation of the Average Direction
For single-bounce path connections, we can rewrite the spherical integral v̄s = ∫S² ω p̄s(ω) dω into a path-space integral: v̄s = ∫Ωs ωx p̄s(x) dx, where ωx is the light direction of the path x. Therefore, to estimate the average direction v̄s, this paper uses a biased variant of ReSTIR which resamples a light direction ωX according to the target distribution ps(X). In our average-direction estimation, we reuse the visibility over time similar to existing biased ReSTIR methods. Unlike regular ReSTIR, we reuse only temporally neighboring pixels and do not reuse spatially neighboring pixels to preserve shadow edges and material boundaries. In addition, we reuse the initial sample from the lighting estimation (see Algorithm 2). Thus, our average-direction estimation does not trace additional shadow rays. In ReSTIR, a selected sample direction ωX is used to estimate the integral as follows: v̄s ≈ ωX p̄s(X) WX, but one sample direction is insufficient to estimate the average direction. Therefore, instead of selecting one direction ωX according to the candidate weight, we temporally accumulate candidate directions as follows:
$$ v_X = \frac{v_s w_s + v_i w_i}{w_s + w_i} $$
where vs = ωxs is the initial candidate direction, vi is the weighted average direction at the previous frame, ws and wi are candidate weights for initial and reused samples, and vX will be reused for the next frame as vi. Since the accumulated candidate count is clamped for temporal resampling (as mentioned in Sect. 2.2), vX can be an exponential moving average of sample directions. If the scene is not animated and thus Ti(xi) = xi between frames, we can rewrite the Monte Carlo estimator with an exponential MIS weight m′j into the following temporal estimator:
$$ \bar{v}_s \approx \sum_{j} \frac{\omega_{x_j} \bar{p}_s(x_j) m'_{j}(x_j)}{p_j(x_j)} \approx v_X \bar{p}_s(X) W_X $$
where xj and ωxj are the initial sample and its direction for each frame. However, it is infeasible to evaluate the normalized target PDF p̄s(X) = ps(X)/||ps|| analytically. By substituting WX ≈ 1/p̄s(X) in Eq. 11, we obtain the following simple approximation:
$$ \bar{v}_s \approx v_X $$
Since vX is a weighted average (Eq. 10), this estimator is a variant of weighted importance sampling (or ratio estimator). Thus, it has a bias due to normalization, but the bias reduces quickly. In addition, this approximation satisfies ||vX|| ≤ 1, which is required for Eq. 9. Fig. 3 shows a visualization of the estimated average direction vX for each pixel. In this scene, average directions are different between lit and shadowed pixels.
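The weighted accumulation of candidate directions can be sketched as follows. This is a minimal illustration of the weighted average only; the zero-vector fallback for a vanishing total weight is our assumption, and the function name is hypothetical.

```python
import math

def accumulate_direction(v_s, w_s, v_i, w_i):
    """Weighted average of the initial direction v_s and the previous average v_i."""
    total = w_s + w_i
    if total <= 0.0:
        # Assumption: no valid candidate, so return an undefined (zero) direction.
        return (0.0, 0.0, 0.0)
    return tuple((a * w_s + b * w_i) / total for a, b in zip(v_s, v_i))

def length(v):
    return math.sqrt(sum(c * c for c in v))
```

Because the result is a convex combination of vectors whose lengths are at most 1, its length never exceeds 1. For example, averaging two orthogonal unit directions with equal weights gives a vector of length about 0.707, which corresponds to a broad vMF lobe.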
Although our approximation is efficient, it can produce a temporal delay especially for moving shadows. To reduce the delay for shadows, we reject rapidly moving lights from temporal reuse by using the following light-direction-based heuristic:
$$ h_{dir} = h_{prev} \exp (\lambda ((\omega_{i,i} \cdot \omega_{s,i}) - 1)) $$
where hprev ∈ [0, 1] is an existing rejection heuristic (e.g., geometry similarity), ωi,i and ωs,i are light directions at previous and current frames, and λ ∈ (0, ∞) is a user-specified parameter to control the sensitivity for moving lights (λ = 1000 is used in this paper). We multiply the accumulated candidate count by hdir to reject past candidate directions. When lighting changes, although the rejection of moving lights introduces a bias and variance for our average-direction estimation, we obtain the temporal continuity of lighting from the reduced candidate count. We use this temporal continuity in Sect. 3.3 to handle the estimation error of the average direction.
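The light-direction-based heuristic can be sketched as follows, assuming both light directions are unit vectors; the function name is hypothetical and the default sensitivity follows the value stated above.

```python
import math

def light_direction_heuristic(h_prev, dir_prev, dir_curr, sensitivity=1000.0):
    """Attenuate an existing heuristic h_prev when the light direction changes.

    dir_prev and dir_curr are assumed to be unit vectors (3-tuples).
    """
    cos_angle = sum(a * b for a, b in zip(dir_prev, dir_curr))
    return h_prev * math.exp(sensitivity * (cos_angle - 1.0))
```

With a sensitivity of 1000, a cosine of 0.999 between the two directions (an angle of roughly 2.6 degrees) already scales the candidate count by about exp(-1).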
Similarity of vMFs
Once the vMF distribution (i.e., approximate target PDF) is estimated for each pixel, we compute the similarity between them. In this paper, we use a product integral-based similarity to evaluate the overlaps of PDFs. However, if a scene has only one point light source, the PDFs are delta functions and there are no overlaps when the lobe axes are shifted, as shown in Fig. 4a. We cannot evaluate the similarity for this case. Therefore, to obtain the similarity based on the distance between shifted lobe axes for such high-frequency PDFs, we smooth each target PDF p̄s(ω) (Fig. 4b) using a smoothing kernel g(ω'; ω, α) as follows:
$$ p'_s(\omega) = \int_{\mathcal{S}^2} \bar{p}_s(\omega') g(\omega'; \omega, \alpha) d\omega' \approx \int_{\mathcal{S}^2} g(\omega'; \mu_s, \kappa_s) g(\omega'; \omega, \alpha) d\omega' \approx g(\omega; \mu_s, \kappa'_s) $$
where α ∈ (0,∞) is a user-specified kernel sharpness to control the sensitivity for the shift (α = 100 is used in this paper), and κ′s = κs α/(κs + α) is derived in Iwasaki et al. Then, we compute the similarity of the smoothed vMFs between pixels. In this paper, we use an analytical product integral-based similarity derived in Tokuyoshi [2015] as follows:
$$ h_{our} = \frac{\int_{\mathcal{S}^2} p'_s(\omega) p'_i(\omega) d\omega}{\sqrt{\int_{\mathcal{S}^2} (p'_s(\omega))^2 d\omega} \sqrt{\int_{\mathcal{S}^2} (p'_i(\omega))^2 d\omega}} \approx \frac{\int_{\mathcal{S}^2} g(\omega; \mu_s, \kappa'_s) g(\omega; \mu_i, \kappa'_i) d\omega}{\sqrt{\int_{\mathcal{S}^2} (g(\omega; \mu_s, \kappa'_s))^2 d\omega} \sqrt{\int_{\mathcal{S}^2} (g(\omega; \mu_i, \kappa'_i))^2 d\omega}} = \frac{2 \sqrt{\kappa'_s \kappa'_i}}{\kappa'_s + \kappa'_i} \exp \left( \frac{\beta \kappa'_s \kappa'_i}{\kappa'_s + \kappa'_i} (\mu_s \cdot \mu_i - 1) \right) $$
where β ∈ (0,∞) is a user-specified parameter to control the sensitivity for our rejection heuristic (β = 10 is used in this paper). Using this PDF similarity hour ∈ [0, 1], we can prevent spatial reuse across shadow edges and material boundaries if estimated vMFs have small errors.
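The closed-form similarity can be sketched as follows. The sketch follows the formula as printed in this text (with β placed inside the exponential) and the default values α = 100 and β = 10 stated above; the handling of two uniform lobes is our own assumption, and the function names are hypothetical.

```python
import math

def smoothed_sharpness(kappa, alpha=100.0):
    """Sharpness of a vMF lobe after smoothing with a kernel of sharpness alpha."""
    return kappa * alpha / (kappa + alpha)

def pdf_similarity(mu_s, kappa_s, mu_i, kappa_i, alpha=100.0, beta=10.0):
    """Approximate similarity in [0, 1] between two smoothed vMF lobes."""
    ks = smoothed_sharpness(kappa_s, alpha)
    ki = smoothed_sharpness(kappa_i, alpha)
    if ks + ki <= 0.0:
        # Assumption: two uniform lobes are treated as identical.
        return 1.0
    cos_angle = sum(a * b for a, b in zip(mu_s, mu_i))
    scale = 2.0 * math.sqrt(ks * ki) / (ks + ki)
    return scale * math.exp(beta * ks * ki / (ks + ki) * (cos_angle - 1.0))
```

Identical lobes give a similarity of 1. The value drops when either the lobe axes diverge (the exponential term) or the sharpness values differ (the leading factor), which is how lit and shadowed pixels or different materials are separated.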
Combination with Existing Heuristics
Although our PDF similarity can detect shadow edges and material boundaries, it has a variance caused by the temporal estimation of the average direction. In addition, since our average-direction estimation shares the initial sample with lighting estimation, the variance of our PDF similarity can correlate to the variance of lighting. This correlation results in a bias in the rejection of spatial reuse. Although the variance is decorrelated by using different random numbers in every resampling routine, the variance and its correlation may be noticeable when the number of candidate samples is small due to temporal disocclusions and rapidly moving lights. Therefore, we use our rejection heuristic only when the candidate count is sufficient. In this paper, we interpolate our heuristic hour and existing heuristic hprev using the temporally accumulated candidate count as follows:
$$ h = \begin{cases} t h_{our} + (1-t) h_{prev} & \text{if } W'_s > 0 \land W'_i > 0 \\ h_{prev} & \text{otherwise} \end{cases} $$
$$ t = \max \left( \frac{\min(M'_s, M'_i) - M}{M_{max}}, 0 \right) $$
where W's, W'i and M's, M'i are the contribution weights and accumulated candidate counts for our average-direction estimation, M is the initial candidate count, and Mmax is the maximum candidate count for temporal reuse (we set Mmax = 20M as in Bitterli et al.). If the contribution weight W's or W'i for the average-direction estimation is zero, the corresponding light direction is undefined. Thus, we use only the existing heuristic for this case. Using this combination, our heuristic is effective only for temporally continuous pixels and static or slowly moving lights.
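The interpolation between the two heuristics can be sketched as follows. The parameter names are hypothetical; the values in the usage note assume an initial candidate count of 32, which is an illustrative choice and not a value stated in the paper.

```python
def combined_heuristic(h_prev, h_our, W_dir_s, W_dir_i,
                       M_dir_s, M_dir_i, M_init, M_max):
    """Blend the existing heuristic and the PDF similarity by temporal continuity.

    W_dir_* and M_dir_* are the contribution weights and accumulated candidate
    counts of the average-direction reservoirs for the current pixel s and the
    reused pixel i.
    """
    if W_dir_s <= 0.0 or W_dir_i <= 0.0:
        return h_prev  # a light direction is undefined, fall back
    t = max((min(M_dir_s, M_dir_i) - M_init) / M_max, 0.0)
    return t * h_our + (1.0 - t) * h_prev
```

For example, with `M_init = 32` and `M_max = 640`, a freshly disoccluded pixel with an accumulated count of 32 gives `t = 0` and uses only the existing heuristic, while two pixels with fully accumulated counts of 672 give `t = 1` and use only the PDF similarity.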
Experimental Results
Here we show results using our rejection method and the previous geometry-based rejection method for several ReSTIR variants. We implement these methods on Microsoft MiniEngine using DirectX Raytracing. All images are rendered with 1920×1080 screen resolution on an AMD Radeon™ RX 6900 XT GPU. The image quality is evaluated with the symmetric mean absolute percentage error (SMAPE) metric. For direct illumination, we generate 1048576 virtual point lights (1 M VPLs) on area light sources, and then we sample one VPL using ReSTIR from them. For the first RIS pass in the ReSTIR algorithm (Algorithm 2), we use an unbiased tile-based light culling to improve the efficiency. For spatial reuse, we sample neighboring pixels according to the Gaussian distribution of variance 64. In our experimental implementation, we use 16 bytes per pixel to store reservoirs for our average-direction estimation (32-bit integer for a VPL index, 32-bit floating point for W, 16-bit floating point for M, and 16-bit floating point for each dimension of vs).
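The 16-byte reservoir layout described above can be illustrated with Python's `struct` module. The field order and little-endian packing are our assumptions for illustration; the paper only specifies the field sizes.

```python
import struct

# Assumed layout: uint32 VPL index, float32 W, float16 M, 3 x float16 direction.
RESERVOIR_FORMAT = "<If4e"

def pack_direction_reservoir(light_index, W, M, v):
    """Pack one average-direction reservoir into 16 bytes."""
    return struct.pack(RESERVOIR_FORMAT, light_index, W, M, v[0], v[1], v[2])

def unpack_direction_reservoir(data):
    """Inverse of pack_direction_reservoir."""
    light_index, W, M, vx, vy, vz = struct.unpack(RESERVOIR_FORMAT, data)
    return light_index, W, M, (vx, vy, vz)
```

Storing M and the direction in half precision keeps the reservoir at 16 bytes, at the cost of quantizing the average direction; values that are exactly representable in half precision round-trip without loss.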
ReSTIR with two rays per pixel
Fig. 5 shows ReSTIR using two shadow rays per pixel for visibility reuse (i.e., one shadow ray is reused for the target distribution pi, and the other shadow ray is reused for the integrand f). Although this visibility reuse is efficient for real-time applications, it casts shadows twice on shadow edges and thus produces a darkening bias. Using our rejection method, we reduce both bias and variance on shadow edges for temporally continuous pixels. Our method also reduces variance around temporally continuous glossy highlights, since the target PDF includes the BSDF. While the computational complexity of our method is constant for each pixel, it samples more visible lights than the previous method. Thus, our method can affect the shadow ray tracing cost, which depends on the complexity of the scene geometry. For scenes in Fig. 5, the total overhead for our method is about 0.2 milliseconds.
ReSTIR with one ray per pixel
Fig. 6 shows ReSTIR that reuses the visibility of the initial sample for the integrand f as well as the target distribution pi. When using the previous rejection heuristic for this case, shadow edges blur and disappear due to spatial reuse. By using our rejection method with an overhead of about 0.3 milliseconds, we obtain hard contact shadows for temporally continuous pixels without tracing two rays per pixel. On the other hand, our method produces almost the same results as the previous heuristic for temporal disocclusions (Fig. 7).
ReSTIR with adaptive ray tracing
Whether or not to trace a shadow ray for the integrand can be determined based on the distance between current and reused pixels. For this adaptive ray tracing (Fig. 8), we can control the tradeoff between the performance and detailed shadows by using a threshold for the distance. Even with a small number of rays per pixel, our rejection method preserves hard contact shadows more than the previous method for temporally continuous pixels. To obtain hard shadow edges for temporal disocclusions, we should trace a shadow ray for such pixels. In Fig. 9, we stochastically trace a ray according to a probability 1 - t (where t is the temporal continuity given by Eq. 16) in addition to the distance-based approach. When the camera moves, although this approach traces more shadow rays than using only the pixel distance and can increase a darkening bias, it produces more highly detailed and temporally coherent shadows.
ReSTIR with exact visibility test
Fig. 10 shows ReSTIR using an exact visibility test for target distributions. In our implementation, it requires five rays per pixel. For this case, the MIS weight using target distributions already takes the shadow edges into account. Even using this MIS, our rejection method reduces variance with an overhead of 0.5 milliseconds in our experiment. This is because the above MIS weight ignores the normalization of target distributions, while our PDF similarity takes this normalization into account. The normalization factor for the PDF ||ps|| ≈ ∫ f(x)dx is approximately equal to the expected value of the pixel luminance. Therefore, the normalization factors are significantly different between a lit pixel and a shadowed pixel. We take this difference into account for our PDF similarity. Our method can have a bias due to the correlation of variance.
Limitations
Bias: Since the average-direction estimation for our PDF similarity shares the initial sample with lighting estimation, our rejection heuristic based on the PDF similarity can introduce a bias due to the correlation of samples. Although spatiotemporal resampling decorrelates samples in every frame, the proposed method is still an inconsistent estimator. This is because the accumulated candidate count is limited and initial samples are combined every frame in ReSTIR. However, the bias is negligibly small for temporally continuous pixels. Enabling our heuristic only for non-zero contribution weights also introduces a sample correlation if the expected value of the lighting integral is not zero. However, such zero weights are rare for temporally continuous pixels. We can decorrelate samples by using different initial samples, though this approach increases the computational overhead.
Temporal discontinuities: Our rejection heuristic works only for temporally continuous pixels. Therefore, it is not always effective in motion. Although our method renders high-frequency shadow edges using only one ray per pixel for static scenes, we have to trace an additional shadow ray to render shadows for animated scenes as in previous work. To preserve shadow edges while using fewer than two rays per pixel, we use spatiotemporal adaptive ray tracing for animated scenes.
False positives: Our method can approximate different PDFs with identical vMF lobes. In this case, our method cannot reject samples that should be rejected. However, this case does not occur often enough to be a problem in our experimental results.
Multiple bounces: Since our method uses the PDF similarity in a spherical domain, it does not support multiple bounces whose PDF is the product of spherical PDF sequences. Extension for multi-bounce illumination such as glossy-to-glossy interreflections is left for future work.
Memory overhead: Our average-direction estimation stores reservoirs in memory. Thus, it has a memory transfer cost. In our experimental implementation, we use 16 bytes per pixel for these reservoirs. We consider reduction of the reservoir data size as future work.
Highly glossy surfaces: Our PDF similarity estimation has a temporal delay for highly glossy surfaces when the view direction changes. Thus, this delay can produce variance around glossy highlights with a moving camera (Fig. 13). To reduce this delay, we can add an analytic glossy lobe similarity [Tokuyoshi 2015] between frames to the rejection heuristic (Eq. 12) in our average-direction estimation. Another approach to avoid the view-dependent delay is to decouple incoming radiance and the BSDF from the PDF. This decoupling approach separately approximates incoming radiance and the cosine-weighted BSDF using two vMFs, and then computes a vMF representing the PDF by using the product of the two vMFs as in spherical Gaussian lighting. Since we can obtain the vMF for the BSDF analytically or using lookup tables, we can avoid the view-dependent delay for highly glossy surfaces while increasing the vMF approximation error. We would like to investigate the efficiency of these approaches in the future.
Conclusion
This paper has presented a new rejection method based on the PDF shape similarity between pixels for single-bounce ReSTIR (e.g., direct illumination). Using this PDF similarity, we alleviated undesirable spatial resampling across shadow edges and material boundaries. To perform at real-time frame rates, our method roughly approximates the PDF with a vMF by using the temporal average of sample light directions for each pixel. We have also presented a stable combination of an existing rejection heuristic and our PDF similarity considering the estimation error of the temporal average direction. Using our method, we improved the image quality for temporally continuous lighting while using a smaller number of rays than the previous method. On the other hand, our method yields quality comparable to the existing heuristic for temporal disocclusions and rapidly moving lights.
Although our method takes into account the estimation error when lighting changes, it ignores error due to the temporal changes of view directions for highly glossy surfaces. To accurately handle such view-direction changes, we are currently considering the integration of a glossy lobe similarity or decoupling of incoming radiance and BSDFs into our method. We would like to investigate the efficiency of these techniques in the future. Our PDF similarity is limited to single bounce, but it is applicable to indirect illumination by using VPLs. We would also like to investigate the efficiency of our method for VPL-based ReSTIR algorithms such as ReSTIR GI.
Acknowledgments
We would like to thank the Amazon Lumberyard team for the BISTRO scene, M. Winkelmann and K. Anderson for the ZERO-DAY scene, and G. M. Leal Llaguno for the SAN MIGUEL scene. The SAN MIGUEL scene is distributed by McGuire. We would also like to thank the anonymous reviewers for their valuable comments and constructive suggestions.
References
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, and Suvrit Sra. 2005. Clustering on the Unit Hypersphere Using von Mises-Fisher Distributions. J. Mach. Learn. Res. 6 (2005), 1345–1382.
Philippe Bekaert, Mateu Sbert, and Yves D. Willems. 2000. Weighted Importance Sampling Techniques for Monte Carlo Radiosity. In EGWR '00. 35–46.
Benedikt Bitterli, Chris Wyman, Matt Pharr, Peter Shirley, Aaron Lefohn, and Wojciech Jarosz. 2020. Spatiotemporal Reservoir Resampling for Real-Time Ray Tracing with Dynamic Direct Lighting. ACM Trans. Graph. 39, 4, Article 148 (2020), 17 pages. https://doi.org/10.1145/3386569.3392481
Guillaume Boissé. 2021. World-Space Spatiotemporal Reservoir Reuse for Ray-Traced Global Illumination. In SIGGRAPH Asia '21 Tech. Commun. Article 22, 4 pages. https://doi.org/10.1145/3478512.3488613
Jakub Boksansky, Paula Jukarainen, and Chris Wyman. 2021. Rendering Many Lights with Grid-Based Reservoirs. In Ray Tracing Gems II: Next Generation Real-Time Rendering with DXR, Vulkan, and OptiX. Apress, 351–365. https://doi.org/10.1007/978-1-4842-7185-8_23
Min-Te Chao. 1982. A General Purpose Unequal Probability Sampling Plan. Biometrika 69, 3 (1982), 653–656. https://doi.org/10.1093/biomet/69.3.653
Carsten Dachsbacher and Marc Stamminger. 2005. Reflective Shadow Maps. In I3D '05. 203–231.
Elmar Eisemann and Frédo Durand. 2004. Flash photography enhancement via intrinsic relighting. ACM Trans. Graph. 23, 3 (2004), 673–678.
Ronald Aylmer Fisher. 1953. Dispersion on a sphere. Proc. R. Soc. Lond. Ser. A 217, 1130 (1953), 295–305. https://doi.org/10.1098/rspa.1953.0064
Eric Heitz, Stephen Hill, and Morgan McGuire. 2018. Combining Analytic Direct Illumination and Stochastic Shadows. In I3D '18. Article 2, 11 pages. https://doi.org/10.1145/3190834.3190852
Kei Iwasaki, Yoshinori Dobashi, and Tomoyuki Nishita. 2012. Interactive Bi-Scale Editing of Highly Glossy Materials. ACM Trans. Graph. 31, 6, Article 144 (2012), 7 pages. https://doi.org/10.1145/2366145.2366163
Alexander Keller. 1997. Instant Radiosity. In SIGGRAPH '97. 49–56. https://doi.org/10.1145/258734.258769
Daqi Lin, Markus Kettunen, Benedikt Bitterli, Jacopo Pantaleoni, Cem Yuksel, and Chris Wyman. 2022. Generalized Resampled Importance Sampling: Foundations of ReSTIR. ACM Trans. Graph. 41, 4, Article 75 (2022), 23 pages. https://doi.org/10.1145/3528223.3530158
Daqi Lin, Chris Wyman, and Cem Yuksel. 2021. Fast Volume Rendering with Spatiotemporal Reservoir Resampling. ACM Trans. Graph. 40, 6, Article 279 (2021), 18 pages. https://doi.org/10.1145/3478513.3480499
Amazon Lumberyard. 2017. Amazon Lumberyard Bistro, Open Research Content Archive (ORCA). http://developer.nvidia.com/orca/amazon-lumberyard-bistro
Morgan McGuire. 2017. Computer Graphics Archive. https://casual-effects.com/data
Yaobin Ouyang, Shiqiu Liu, Markus Kettunen, Matt Pharr, and Jacopo Pantaleoni. 2021. ReSTIR GI: Path Resampling for Real-Time Path Tracing. Comput. Graph. Forum 40, 8 (2021), 17–29. https://doi.org/10.1111/cgf.14378
Georg Petschnigg, Richard Szeliski, Maneesh Agrawala, Michael Cohen, Hugues Hoppe, and Kentaro Toyama. 2004. Digital photography with flash and no-flash image pairs. ACM Trans. Graph. 23, 3 (2004), 664–672.
Donald B. Rubin. 1987. A Noniterative Sampling/Importance Resampling Alternative to the Data Augmentation Algorithm for Creating a Few Imputations When Fractions of Missing Information Are Modest: The SIR Algorithm. J. Amer. Statist. Assoc. 82, 398 (1987), 543–546.
Rohan Sawhney, Daqi Lin, Markus Kettunen, Benedikt Bitterli, Ravi Ramamoorthi, Chris Wyman, and Matt Pharr. 2022. Decorrelating ReSTIR Samplers via MCMC Mutations. https://doi.org/10.48550/ARXIV.2211.00166
Justin F. Talbot. 2005. Importance Resampling for Global Illumination. Master's thesis. Brigham Young U.
Yusuke Tokuyoshi. 2015. Specular Lobe-Aware Filtering and Upsampling for Interactive Indirect Illumination. Comput. Graph. Forum 34, 6 (2015), 135–147. https://doi.org/10.1111/cgf.12525
Yusuke Tokuyoshi. 2022. Tiled Reservoir Sampling for Many-Light Rendering. Technical Report No. 21-11-ecdc. Advanced Micro Devices, Inc.
Yu-Ting Tsai and Zen-Chung Shih. 2006. All-Frequency Precomputed Radiance Transfer Using Spherical Radial Basis Functions and Clustered Tensor Approximation. ACM Trans. Graph. 25, 3 (2006), 967–976. https://doi.org/10.1145/1141911.1141981
Eric Veach. 1998. Robust Monte Carlo Methods for Light Transport Simulation. Ph. D. Dissertation. Stanford U.
Eric Veach and Leonidas J. Guibas. 1995. Optimally Combining Sampling Techniques for Monte Carlo Rendering. In SIGGRAPH '95. 419–428. https://doi.org/10.1145/218380.218498
Jiaping Wang, Peiran Ren, Minmin Gong, John Snyder, and Baining Guo. 2009. All-Frequency Rendering of Dynamic, Spatially-Varying Reflectance. ACM Trans. Graph. 28, 5 (2009), 133:1–133:10. https://doi.org/10.1145/1618452.1618479
Mike Winkelmann and Kate Anderson. 2019. Zero-Day, Open Research Content Archive (ORCA). https://developer.nvidia.com/orca/beeple-zero-day
Chris Wyman and Alexey Panteleev. 2021. Rearchitecting Spatiotemporal Resampling for Production. In HPG '21. 19 pages. https://doi.org/10.2312/hpg.20211281
A MIS WEIGHTS USED IN THIS PAPER
In this paper, we reuse one sample at a neighboring pixel i and combine it into the sample at the current pixel s in each resampling routine as in an existing practical implementation [Wyman and Panteleev 2021]. For this case, we weight target distributions by using accumulated candidate counts Ms and Mi. In our implementation for area light sources, we use the following equation:
$$ m_s(x_s) = \frac{M_s p_{s_s}(x_s)}{M_s p_{s_s}(x_s) + M_i p_{s_i}(x_i)} $$
$$ m_i(T_i(x_i)) = \frac{M_i p_{s_i}(T_i(x_i))}{|\partial T_i / \partial x_i|} \frac{1}{M_s p_{s_s}(T_i(x_i)) + M_i p_{s_i}(T_i(x_i))}$$
where the spherical target distribution ps(·) is the product of the incoming radiance and the cosine-weighted BSDF at zs as follows:
$$ p_s(\omega) = L(z_s, \omega) \rho(z_s, \omega) \frac{|n(z_s) \cdot \omega|}{||c_s - z_s||^2} $$