Being a primarily Unreal-focused developer, I don’t really spend that much time in standard C++. Ya, technically I work a lot in C++, but C++ using the STL and related things is very different from the custom containers and macro-heavy nature of working in Unreal. Part of the fallout of that is that I generally miss new features of the language for quite a while. I didn’t get into working in C++11 and newer until I was off of UE3, and even after moving to UE4 I don’t get exposed to things like the STL or the standard implementation of threads. It’s one of those things where, ya, I’ve used threads and I get the concepts behind them, but creating a worker thread to offload a specific task is much different than architecting code to properly and efficiently support a threadable workload. That’s where my screwing around here comes into play.
For this screwing around, I decided to thread an implementation of a raytracer. It’s a workload that is inherently parallelizable. You’ve got a bunch of rays going out that can independently resolve themselves. Ya, you may need to have a ray spawn further rays, but that can live within the worker thread as it chews through the work. From a naive implementation standpoint, each pixel could be its own thread and run in parallel, and that’s basically where I started.
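To make the naive starting point concrete, here is a minimal sketch of the "one thread per pixel" idea using `std::thread`. It is not the code from this project: `Color`, `render_one_thread_per_pixel`, and the `shade_pixel` callback are hypothetical names made up for illustration, and the callback stands in for whatever actually traces the rays for a pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical pixel type, used only for this illustration.
struct Color { float r, g, b; };

// Naive approach: spawn one std::thread per pixel. Each thread writes only to
// its own slot in the output buffer, so no locking is needed.
std::vector<Color> render_one_thread_per_pixel(
    int width, int height,
    const std::function<Color(int, int)>& shade_pixel)
{
    assert(width > 0 && height > 0);
    std::vector<Color> image(static_cast<std::size_t>(width) * height);
    std::vector<std::thread> workers;
    workers.reserve(image.size());

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // image and shade_pixel outlive the threads because every thread
            // is joined before this function returns.
            workers.emplace_back([&image, &shade_pixel, width, x, y] {
                image[static_cast<std::size_t>(y) * width + x] = shade_pixel(x, y);
            });
        }
    }

    // Wait for every pixel to finish before handing the image back.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return image;
}
```

This is fine for a tiny image, but at 800×400 it would create 320,000 threads, which is far more than the hardware can run at once, so it should be read as the conceptual baseline rather than something to ship.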
- For the purposes of this, I started with a sample implementation from Ray Tracing in One Weekend by Peter Shirley. This series of books is a fantastic quick look at the basic concepts behind ray tracing, and it gave me a fast way to get to a point where I could investigate threading.
- For my CPU, I’m running this on an AMD Ryzen 9 3950X (16 cores, 32 threads) at stock speeds. I’m not doing anything to minimize background processes, but it shouldn’t be a huge issue for where I’m at.
- I’m currently using Visual Studio 2019’s built-in performance profiler. I don’t particularly like it compared to other tools, but my profiler of choice on my current hardware (AMD uProf) currently has a bug on some installs of the May 2020 version of Windows 10 that prevents profile captures. The VS profiler is basic, but it gives me enough information for the basics I’m starting with.
- This is running in release with default optimizations purely out of laziness.
- I’ll post some code samples along the way. These won’t generally compile because I’m stripping out unnecessary stuff from the samples (e.g. you don’t need to care about me setting image dimensions and writing the image to disk).
For the purposes of my current testing, this is the output image. It’s two spheres where each pixel color represents the surface normal hit by a ray. The image is 800×400 resolution and each pixel casts 100 slightly randomized rays to give an anti-aliased result. In the basic current pass, I’m not doing any bounced rays on collisions. The final image is therefore the result of 800 × 400 × 100 = 32 million ray casts. In some future tests, I’ll be adapting the rest of the book to the multithreaded version, supporting reflection/refraction and increasing the workload through that process.
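As a rough sketch of the per-pixel work described above: jitter each sample inside the pixel, trace it, and average the results. The names here (`Vec3`, `normal_to_color`, `sample_pixel`, `primary_ray_count`) are hypothetical and the `trace` callback is a stand-in for the real ray/sphere test. The normal-to-color mapping shown, which remaps each component from [-1, 1] to [0, 1], is a common way to visualize normals; the exact code in this renderer may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical vector type, used here for both normals and colors.
struct Vec3 { double x, y, z; };

// Remap a unit normal with components in [-1, 1] to a color in [0, 1].
Vec3 normal_to_color(const Vec3& n) {
    return Vec3{0.5 * (n.x + 1.0), 0.5 * (n.y + 1.0), 0.5 * (n.z + 1.0)};
}

// Average `samples` jittered rays through pixel (px, py).
// trace(u, v) is assumed to return the color seen along the ray through
// normalized image coordinates (u, v), each in [0, 1).
template <typename TraceFn>
Vec3 sample_pixel(int px, int py, int width, int height, int samples,
                  std::mt19937& rng, TraceFn trace) {
    assert(width > 0 && height > 0 && samples > 0);
    std::uniform_real_distribution<double> jitter(0.0, 1.0);
    Vec3 sum{0.0, 0.0, 0.0};
    for (int s = 0; s < samples; ++s) {
        // Random offset within the pixel gives the anti-aliased result.
        const double u = (px + jitter(rng)) / width;
        const double v = (py + jitter(rng)) / height;
        const Vec3 c = trace(u, v);
        sum.x += c.x;
        sum.y += c.y;
        sum.z += c.z;
    }
    return Vec3{sum.x / samples, sum.y / samples, sum.z / samples};
}

// Total primary rays for an image with no bounces.
constexpr std::int64_t primary_ray_count(std::int64_t width,
                                         std::int64_t height,
                                         std::int64_t samples) {
    return width * height * samples;
}

// The test image: 800 x 400 pixels at 100 samples each.
static_assert(primary_ray_count(800, 400, 100) == 32000000,
              "800 * 400 * 100 should be 32 million rays");
```

Since each call to `sample_pixel` only reads shared scene data and produces one output value, the calls are independent of each other, which is what makes the workload easy to split across threads. Note that `std::mt19937` is not safe to share between threads without synchronization, so a threaded version would likely want one generator per worker.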