Physically-based path tracer in CUDA.
Author: Lu M.
Tested System:
- Windows 11 Home
- AMD Ryzen 7 5800HS @ 3.20GHz, 16GB RAM
- NVIDIA GeForce RTX 3060 Laptop GPU 6GB (Compute Capability 8.6)
This project implements a GPU path tracer written in CUDA that renders images. The implementation focuses on both visual correctness and GPU performance. The tracer supports multiple sampling strategies and a thin-lens camera model for depth-of-field.
- Clone the repository:
git clone https://github.com/lu-m-dev/CUDA-path-tracer.git
- Navigate to the project directory:
cd CUDA-path-tracer - Build with CMake:
cmake -B build -S . -G "Visual Studio 17 2022"
- Open the solution in Visual Studio:
cd build start ./cis565_path_tracer.sln
- Diffuse (Lambertian): cosine-weighted hemisphere sampling with albedo-based BRDF.
- Specular reflectance: perfect mirror reflection for specular materials.
Grouping intersections by material ID creates contiguous memory reads in the shading kernel, improving cache behavior and reducing memory latency on the GPU.
After shading, rays that are terminated (dead) are removed from the active ray list using parallel stream compaction. This reduces the number of rays traced in future iterations.
During shading, rays that have many remaining bounces and insignificant color intensities are randomly terminated. This reduces the number of rays traced in future iterations. We compensate surviving paths by dividing throughput by the survival probability.
| Original | Russian roulette |
|---|---|
![]() |
![]() |
Antialiasing is performed by jittering sample positions within each pixel across multiple samples per pixel. This reduces aliasing and helps Monte Carlo convergence.
| Original | Anti-aliasing |
|---|---|
![]() |
![]() |
Dielectric materials use Snell's law to compute transmission directions and Schlick's approximation for Fresnel reflectance. Total internal reflection is handled explicitly.
Implemented using a thin-lens camera: sample the lens disk and aim rays at a point on the focal plane located at the focus distance. The aperture controls blur amount.
| No DoF (sharp) | Focused on front glass sphere | Focused on back blue sphere |
|---|---|---|
![]() |
![]() |
![]() |
Direct lighting samples scene lights per hit point. We compute a visibility test toward sampled light points and add contribution scaled by distance (and normalized by the light intensity to emphasize contrast in lit regions). This requires a separate pass / kernel for sampling lights and visibility checks, so it is more expensive but increases visual fidelity.
for each lightSample {
Lp = samplePointOnLight(light);
wi = normalize(Lp - hit.pos);
if (!occluded(hit.pos, Lp)) {
Li = light.power / distanceSquared(hit.pos, Lp);
contribution += brdf * Li * max(0, dot(wi, normal)) / pdf;
}
}
contribution *= normalizeByLightIntensity(light);
| Original | With direct lighting |
|---|---|
![]() |
![]() |
Supported 2D random number generators: uniform, purely random, jittered, Halton, and Sobol. Samplers are used for pixel jitter, lens sampling, and BRDF sampling. Low-discrepancy sequences often improve convergence but can introduce structured artifacts unless scrambled.
Representative performance numbers measured on the development machine (scene dependent):
- Baseline (no extra features): 12.0 FPS
- With Russian roulette enabled: 13.1 FPS
- With Direct Lighting enabled: 10.8 FPS (extra kernel invocation for direct lighting visibility computations)
Discussions:
- Sorting intersections by material did reduced memory stalls in the shading kernel, but the speed improvement was cancelled out by the additional sorting process, yielding no performance improvement overall.
- Stream compaction reduces traced rays in later bounces, improving throughput; however, the relative benefit depends on scene complexity and depth distribution.
- Direct lighting increases light directionality visuals but requires additional computation; use it when higher fidelity is required.
This is what happens when inner/outer surface determination is flipped during refraction handling. Inner surfaces are treated as outer surfaces, and vice versa. This only impacted the dielectric material object (in the figure above, the rightmost sphere and cube). Those objects are partially specular and partially "diffusive" to darkness. Thus, an interesting and complex pattern is created when external and internal interactions are flipped.
This project adapted methods and pseudocode from these references:
- Physically Based Rendering
- Ray Tracing in One Weekend
- The path to path-traced movies
- University of Pennsylvania CIS 5650
Acknowledgements: Many thanks to the authors and open-source materials. Code structure and many algorithmic choices were influenced by the references above.









