Hi, my name is Brad, and I'm currently a student at Brown University studying computer science and physics, writing music and taking multimedia classes in between problem sets and CS projects.

Here at Brown, I do computer graphics research in the visual computing lab, and I work with STEAM, a group focused on breaking down the barriers between disciplines.

Most recently, I worked at the intersection of geometry processing and sound transport for Call of Duty: Modern Warfare, but I've also interned at startups working on wireless charging at a distance and cloud-based music editing.

As much as I like to dive deep into isolated subjects, we too often sort the things we study into artificial categories that prevent us from thinking outside the box. The best innovations lie at the intersection of creative thinking and analytical understanding, of art and science. I try to live and work at this crossroads.

For more about me and the stuff I've done, check out my projects and my resume.

"U V. Captcha" is my debut work as the indie-electronic artist SN0WCRASH. The EP explores themes of love, relationships, and self-identity in the context of the digital age across three danceable indie-pop songs. From the soaring vocal choruses of "Silicon Juliette" to the more nuanced bit-crushed madness of "Switching Off", the record covers a wide swath of sonic space while remaining catchy and accessible.

"Singularity" is my second work as the indie-electronic artist SN0WCRASH. A departure from the upbeat indie disco of "U V. Captcha", this record's wistful garage-pop anthems brood on the growth and decay of relationships over time. From the dark analogue madness of the interpretation of Bowie's "Heroes" to the vivid future-trap-dream-pop beats of "Polaroids", the songs reflect on the immediacy of now, the rose-tint of yesterday, and the apocalyptic tomorrow.

"Don't Get Your Hopes Up" is my third work as the indie-electronic artist SN0WCRASH. This EP branches even further into the territory of moody synthwave, leveraging compositional elements ranging from deep chorused guitars to lush analog synth patches. The songs are situated in a more mature place thematically, diving into topics like distrust and anxiety.

As project leaders for Brown cyberSTEAM (STEM + Art), Eddie Jiao and I organized students from Brown and RISD to build an interactive art installation. Each lantern is a meter tall and uses ultrasonic and sound sensors to change color depending on human presence and environmental sound.

- Check out the code on **github**

Escape is the debut record I produced with my high school band 37th Parallel. After raising over $12,000 on Kickstarter, we spent a week at 25th Street Studios in Oakland recording an album of 11 original songs exploring everything from alienation, to growing up, to escapism (as the name suggests).

While in my first year at Brown, I took the intro CS classes CS17 and CS18. As part of the coursework, I had to build a few cool projects.

One of the big focuses of CS18 was fundamental data structures. Given a set of data and a use case, we'd be asked to choose the most effective data structure.

This part of the course culminated in building a search engine designed to respond rapidly to free-text queries on a 220 MB Wikipedia corpus.

Some of the challenges posed by this task were efficiently reading files into a data structure ("indexing"), writing that data structure to a file in an unambiguous format, coming up with a way to rank pages given a query, and implementing Google's PageRank algorithm.
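As a flavor of that last piece, here's a minimal sketch of PageRank as damped power iteration on a toy link graph. The graph, function name, and damping factor d = 0.85 are illustrative assumptions, not the course code (which was written in Scala-style coursework, not Python).

```python
def pagerank(links, d=0.85, iters=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            if not outs:                       # dangling page: spread evenly
                for q in pages:
                    new[q] += d * ranks[p] / n
            else:
                for q in outs:
                    new[q] += d * ranks[p] / len(outs)
        ranks = new
    return ranks

# "A" is linked to by everyone, so it should come out on top.
ranks = pagerank({"A": ["B"], "B": ["A", "C"], "C": ["A"]})
```

Each iteration redistributes every page's rank across its outgoing links, damped by d, so the total rank stays fixed at 1.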

The final project I completed for CS17 was a console-based Connect-4 game, along with an AI player based on a recursive game-tree lookahead algorithm.
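The lookahead idea is plain minimax with a depth cutoff. Below is a hedged sketch with a toy game stubbed in (the real project used its own Connect-4 board representation; the `moves`/`apply_move`/`score` interface here is hypothetical).

```python
def minimax(state, depth, maximizing, moves, apply_move, score):
    """Depth-limited game-tree lookahead: maximize on our turns,
    assume the opponent minimizes on theirs."""
    options = moves(state)
    if depth == 0 or not options:              # cutoff or game over
        return score(state)
    values = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, score) for m in options]
    return max(values) if maximizing else min(values)

# Toy game: the state is a number, each move adds 1 or 2, play stops at 3+,
# and the maximizing player wants the final number to be large.
toy_moves = lambda s: [1, 2] if s < 3 else []
toy_value = minimax(0, 2, True, toy_moves, lambda s, m: s + m, lambda s: s)
```

For Connect-4, `score` would be a board heuristic (or win/loss at terminal states) and the depth cutoff keeps the exponential tree tractable.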

The final project for CS18 was actually split into three different projects—a text-based web browser ("sparkzilla"), a GUI-based web browser ("GUIzilla"), and a server supporting dynamic web content.

Ultimately, the greatest challenge of the three was constructing the server to handle dynamic page content and multiple GET/POST requests. To deal with this, the server stored a session map of unique ID-page pairs.

Want to see the code? These are all private repos on github for academic code reasons. Contact me and I can give you access!

I took Brown's CS33 "Computer Systems" course that explores the architecture of Unix machines. Here are a few of the projects from that course.

A big chunk of the course focused on the concept of the shell. This culminated in the "shell" project, where we were tasked with writing our own shell in C. The shell supports input and output redirection, signal handling, foreground/background process management, and a few commands (e.g. cd, rm).

One of the final projects of the course was to implement a dynamic memory allocator—effectively to reimplement the standard library functions malloc, free, and realloc. I employed a first-fit algorithm, with an optimization that coalesced adjacent free blocks, reducing external fragmentation.
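To make the two ideas concrete, here's a toy model of first-fit allocation and coalescing over a free list of (offset, size) blocks. This is an illustrative sketch only; the real allocator managed raw heap memory with block headers in C.

```python
def first_fit(free_list, size):
    """Allocate from the first (offset, size) block that fits.
    Returns (offset, new_free_list), or (None, free_list) if nothing fits."""
    for i, (off, sz) in enumerate(free_list):
        if sz >= size:
            leftover = [(off + size, sz - size)] if sz > size else []
            return off, free_list[:i] + leftover + free_list[i + 1:]
    return None, free_list

def coalesce(free_list):
    """Merge adjacent free blocks to fight external fragmentation."""
    merged = []
    for off, sz in sorted(free_list):
        if merged and merged[-1][0] + merged[-1][1] == off:
            merged[-1] = (merged[-1][0], merged[-1][1] + sz)
        else:
            merged.append((off, sz))
    return merged
```

Without coalescing, a sequence of small frees leaves the heap riddled with fragments that no large request can use, even though the total free space would suffice.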

The culminating project of the course was to construct a thread-safe database, a server to host this database, and clients to interact with the server.

Understanding synchronization techniques such as condition variables, mutexes, and barriers was essential to completing the project. Additionally, knowledge of the C networking library and a general understanding of sockets were required to implement the networking portion.

Want to see the code? These are all private repos on github for academic code reasons. Contact me and I can give you access!

I took Brown's CS1230 "Computer Graphics" course, a fast-paced introduction to core concepts in 2D and 3D graphics including image processing, 3D modeling/viewing/rendering, and shader programming on the GPU.

The culminating assignment (prior to the final project) was to implement a CPU raytracer in C++. We did this in stages, starting with the code to construct and traverse the scenegraph (and render it with OpenGL), then moving on to rendering this scene with a simple non-recursive raytracer, and finally augmenting this raytracer with additional features like recursive reflections, shadows, and texture mapping.

I also opted to implement some additional features that interested me. Some were just in the pursuit of making nicer looking images—simple anti-aliasing and brute-force super-sampling were among these.

Others were aimed at speeding up the intersection process. Because ray-tracing is "embarrassingly parallel", I added an option to multithread the rendering process.

This was somewhat helpful, but did not improve efficiency nearly as much as my other optimization: using an axis-aligned KD-tree to cut the number of intersection tests per ray from O(n) to O(log n). With this feature, images rendered about four times as quickly (depending, of course, on the number of primitives in the scene).
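The primitive that makes the KD-tree pay off is a constant-time "slab test" that decides whether a ray can possibly hit a node's axis-aligned bounding box, letting whole subtrees be skipped. A sketch (the real raytracer did this in C++; here `inv_dir` is the componentwise reciprocal of the ray direction, assumed nonzero):

```python
def ray_hits_aabb(origin, inv_dir, box_min, box_max):
    """Slab test: intersect the ray's t-interval with each axis slab;
    the box is hit iff the intervals still overlap at the end."""
    tmin, tmax = 0.0, float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t1, t2 = (lo - o) * inv, (hi - o) * inv
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax
```

If the box is missed, none of the primitives inside it need to be tested, which is where the O(log n) behavior comes from.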

Want to see the code? This is a private repo on github for academic code reasons. Contact me and I can give you access!

For our final project in Brown's CS1230 course, Leo Ko, Eddie Jiao, and I decided to tackle procedural generation of branch structures and leaf patterns for realistic trees.

While many tactics exist for modeling self-similar branch structures, one of the most established uses the mathematical object known as the L-System. L-Systems are, more or less, nice and compact ways of representing fractals.

As we saw it, there were two main problems we had to solve.

The first of these was how to modify the classic 2D turtle graphics formulation of L-Systems to work for generating 3D meshes. As it turned out, the only change that was necessary was to generalize any rotations in the 2D plane to rotations in 3-space. One slight nuance is that it's really only possible to get good looking branching structures if you allow your second angle to represent orientation changes relative to the axis of the current branch, as opposed to the global z-axis. This way, you end up with branches that shoot out from different places around their parent branch.
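The one mathematical ingredient this requires is rotating the turtle's heading about an arbitrary axis (the current branch direction) rather than in a fixed plane. A sketch using Rodrigues' rotation formula; the function name and tuple-based vectors are illustrative, not our project's C++ API:

```python
import math

def rotate_about_axis(v, axis, angle):
    """Rotate vector v about an arbitrary axis using Rodrigues' formula:
    v' = v cos(a) + (k x v) sin(a) + k (k . v)(1 - cos(a))."""
    n = math.sqrt(sum(a * a for a in axis))
    k = tuple(a / n for a in axis)             # normalized rotation axis
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    dot = sum(ki * vi for ki, vi in zip(k, v))
    cross = (k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0])
    return tuple(vi * cos_t + ci * sin_t + ki * dot * (1 - cos_t)
                 for vi, ci, ki in zip(v, cross, k))
```

Setting the axis to the parent branch's direction is exactly what lets child branches shoot out from different orientations around it.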

The other main hurdle we encountered was the uniformity of L-Systems. Because L-Systems are effectively just fractals, they are, more or less, exactly self-similar. Even so-called stochastic L-Systems, which incorporate randomness into their branching structure, still look very uniform when translated directly to a 3D mesh.

A big step in finding the solution was to introduce stochasticity to the translation process from abstract L-System to 3D mesh. Introducing random perturbations to the otherwise uniform length, orientation, and thickness of each branch significantly increased the realism of the branching structures.

Even with this modification, the trees still didn't look quite right. We realized this was because certain branches "stuck out" so much that they looked physically impossible. To remedy this, we implemented a variable "gravity" parameter that bends branches toward the ground (in a way that's more or less physically accurate, using basic torque/moment-of-inertia calculations).

We expected leaves to be quite a challenge, but it turned out that the structure of our code allowed us to treat leaves just like other branches that, when translated to a mesh, became small hexagons instead of long branches.

One of the last features we added, but possibly the most important, is the ability to export the branch structure and leaf pattern (separately) to .obj files. This allowed us to produce some nice scenes in Maya/Blender, and we hope it will allow others to use this tool for asset generation. We plan to make this project open-source soon.

Want to see the code? This is a private repo on github for academic code reasons. Contact me and I can give you access!

The first project in Brown's CS2240 course, "Advanced Computer Graphics", was to implement a CPU path tracer, a renderer that models physically accurate light transport. The baseline requirements for the project were to implement a basic brute-force path tracer that supports four kinds of BSDFs: Lambertian, Phong, perfect mirror, and refractive dielectric with Fresnel reflection.

Russian-roulette path termination was a required feature, though how to implement it was left to our discretion. I chose to make the survival probability proportional to the square root of the norm of the BSDF, so that paths were more likely to terminate when bouncing off surfaces that were minimally reflective. However, I had to clamp this probability within a fixed range to avoid paths that bounce for too long and paths that don't bounce long enough. I ended up clamping it to [0.7, 0.9], which empirically produced the images with the least noise.

We were also required to implement direct lighting to reduce the noise in our images. All the images below were rendered with direct lighting; without it, there's so much noise that the path tracer is almost unusable!

I also decided to implement a few extra features. I had initially intended to implement HDRs and some kind of simple denoising algorithm, but got caught up implementing other things instead. There was just too much cool stuff to do! I figured I would start with features that would reduce the variance in my images (other than post-processing denoising algorithms), so that I could spend less time rendering tests. I implemented two features toward this end.

The first of these was "stratified sampling". The idea is simple: to get a more uniform sample of rays per pixel, divide the pixel into grid cells and sample one ray through each cell, guaranteeing that the rays roughly cover the whole pixel area. Although I expected the differences to be noticeable, the image comparisons suggest this sampling method isn't much of an improvement.
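The whole technique fits in one function: jitter one sample inside each cell of an n-by-n grid over the unit pixel. A sketch (illustrative, not the course code; `rng` is any callable returning uniforms in [0, 1)):

```python
def stratified_samples(n, rng):
    """n*n jittered sample positions in the unit pixel, one per grid cell,
    so samples can't all clump in one corner."""
    return [((i + rng()) / n, (j + rng()) / n)
            for i in range(n) for j in range(n)]
```

Compared with n*n independent uniform samples, the stratified set has the same mean but strictly lower clumping, which is where any variance reduction comes from.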

The second was "importance sampling". The idea is to sample directions where the BSDF is large more often and directions where it is small less often, which reduces variance in the image. There is a huge difference in noise between the importance-sampled and uniformly sampled images, and it's particularly visible in scenes that use the Phong BRDF. The comparison is a testament to the fact that you really can't render a clean image of a glossy object without some kind of importance sampling.
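As one concrete instance of the idea (illustrative, not the course code), here is cosine-weighted hemisphere sampling for the Lambertian case; importance-sampling a Phong lobe follows the same recipe with an exponent-weighted mapping instead.

```python
import math

def sample_cosine_hemisphere(u1, u2):
    """Map two uniforms to a direction with pdf = cos(theta) / pi,
    where z is 'up' (the surface normal in local shading space)."""
    r, phi = math.sqrt(u1), 2 * math.pi * u2
    x, y = r * math.cos(phi), r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))          # cos(theta)
    return (x, y, z), z / math.pi              # direction, pdf
```

Because the pdf is proportional to the cosine term in the rendering equation, the cosine cancels out of the Monte Carlo estimator, removing one source of variance entirely.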

I also tried my hand at implementing the Cook-Torrance BSDF, a microfacet model commonly used in PBR workflows (and the one used in Blender's Principled BSDF shader). I used the Beckmann distribution function. I wasn't able to get the importance sampler working in time, which made producing clean renders difficult.

The final extra feature I implemented was depth of field—it's not so much a physically based model as a hack where I scatter the ray origins around the eye point within some radius (the "aperture") and then move the virtual film plane to where I want to place the focus plane.

Want to see the code? This is a private repo on github for academic code reasons. Contact me and I can give you access!

To complement the computer graphics coursework I was doing during my junior fall semester, I decided to learn a 3D modeling program—Blender. Here are some images of some original projects and some tutorials.

I decided to challenge myself to model one of my favorite guitars. Some people might notice that the guitar has an Epiphone headstock—this is because I was working with Epiphone reference photos.

This was a modeling tutorial done by Andrew Price, "blender guru". I'm pretty pleased with the result.

Ever since playing the game "Morphite", I've been interested in low-poly art. This tutorial was a simple introduction.

This was a fun/whimsical attempt at modeling the sword "Aerondight" from "The Witcher III", in a low-poly style.

I started playing around with emissive shaders and the wireframe modifier when talking about an idea for a game using an old-school vector graphics aesthetic. This was a fun result.

After completing Martin Finke's tutorial on designing and building audio plugins with Oli Larkin's WDL-OL interface, I decided to try my hand at writing my own.

"The Replicant" Analog Delay Engine is meant to emulate the sound and color of old-school analog delay circuits. It sports some key features to this end—rich channel-independent modulation, distortion that can be shaped with high-pass and low-pass filters, and "analog" meters that mimic the repeat-to-repeat high-frequency loss present in classic analog delay units.

Despite having a vintage sound color, the Replicant is designed to fit comfortably into a modern production workflow. Features like tempo sync of delay time and modulation, tap tempo, and L/R sync for each control allow for the precision you'd expect from a modern effects unit.

From a development standpoint, I ran into quite a few stumbling blocks before getting to the finished product.

For one, I was bogged down for quite a while by digital artifacts that came through when the delay time was modulated. The solution to getting rid of them (and the key to thick, continuous modulation) was interpolation to cope with fractional delay times. Linear interpolation reduced the artifacts, but I ultimately used Hermite (polynomial) interpolation to eradicate them completely.
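For reference, here is a sketch of reading a delay buffer at a fractional index with a standard 4-point, 3rd-order Hermite (Catmull-Rom) kernel, a common choice in delay lines. This is illustrative (the plugin is C++), and circular-buffer wrap-around is elided:

```python
def hermite_read(buf, pos):
    """Read buf at fractional index pos using 4-point cubic Hermite
    interpolation over the samples at pos-1, pos, pos+1, pos+2."""
    i = int(pos)
    t = pos - i
    xm1, x0, x1, x2 = buf[i - 1], buf[i], buf[i + 1], buf[i + 2]
    c0 = x0
    c1 = 0.5 * (x1 - xm1)
    c2 = xm1 - 2.5 * x0 + 2.0 * x1 - 0.5 * x2
    c3 = 0.5 * (x2 - xm1) + 1.5 * (x0 - x1)
    return ((c3 * t + c2) * t + c1) * t + c0   # Horner evaluation
```

Unlike linear interpolation, the cubic kernel has a continuous first derivative across sample boundaries, which is what kills the zipper-like artifacts under delay-time modulation.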

I also encountered issues when attempting to implement classic analog delay feedback-loop functionality. It was pretty apparent that to get baseline functionality, content already in the delay buffer had to be resampled, resulting in that swooshing time-stretching sound that anyone with a delay stompbox knows so well.

The problem with this implementation—and it was quite a problem—was that it sounded awful. Harsh digital artifacts and aliasing abounded. I tried quite a few solutions before hitting on the correct one: applying a smoothing algorithm while resampling the existing buffer. Even within this solution there was room for experimentation; after testing a few different methods, I (to my surprise) found that a single pass-through with a simple 3-point algorithm sounded best.

A goal from the beginning was to make the plugin skinnable—and it's still just that, a goal. WDL-OL, for all its usefulness, does not have great support for GUI manipulation. Perhaps in a future build, I'd like to return to this problem and find a good way to solve it.

A demo video is available here!

Download the Replicant Delay for free here!

Want to see the code? Shoot me a message and I'll give you access to the github.

I had the idea to do a simulation of Newtonian gravity in VR using AFrame.io, but figured it would be a good idea to implement it as a 2D project first and work out the kinks in that environment.

For the modeling element, the program is spartan enough that all the computation can be done client-side. For the graphics element, I used p5.js. Stay tuned for a WebVR rehash.

This project lives on the web! You can find it here.

Want to see the code? Check it out on github.

David Schurman (my friend, fellow Brown student, and fellow STEAM leader) and I were both frustrated with the lack of visual music discovery tools out there. After searching for a virtual 'music map' of sorts, we were disappointed with the results. Most maps either went only one relational level deep or—even worse—were text based!

We realized that, for a related-artists graph to be meaningful, two layers of depth are necessary, and relations between artists other than the center artist have to be represented. This quite literally ties similar artists together in the visualization.

This is particularly useful for seeing what different genres artists draw from. For instance, if you search "BØRNS" on Musicfind, you'll see two groups form. One has artists from the new wave (well, one definition of new wave) scene, like Hippo Campus, Bleachers and WALK THE MOON. Another is populated with female-dominant indie pop groups, like Banks, Broods, and Wet. These arguably represent two different important facets of BØRNS' musical style and influence.

As much as we'd love for this to be a fully functioning music discovery tool, it is still just an experiment. Cross-platform support is a goal for the future—it currently only works on Chrome due to some optimization requirements involving moving around and styling images in SVG. It would also be wonderful to give users full access to their Spotify accounts from the webpage so they can manage playlists while searching around the map. While we have abandoned work on Musicfind for now, we hope that sometime before the end of the year we can come back to it and turn it into the beautiful visual artist discovery tool that music enthusiasts deserve.

This project lives on the web! You can find it here.

Want to see the code? Shoot me a message and I'll give you access to the github.

This was a project for a course I took in the second semester of my sophomore year that focused on Galois theory—in short, the study of the relationship between groups, fields, and polynomials. I worked with some friends here at Brown—David Halpern and Adam Shelby—over the course of about a week and a half to put it together.

One of the topics we covered was the notion of constructibility. A question of interest to the geometers of classical times was "what numbers can you construct with a compass and a straightedge?" In other words, starting with the points (0, 0) and (1, 0) in the plane and allowing yourself to draw only a finite number of lines between two points and circles centered at one point and passing through another, what points in the plane can you reach?

This may seem like a trivial question, but it turns out that it is not so easy to answer. In fact, it wasn't shown until 1796 that the regular 17-gon was constructible (by Gauss, when he was 18, naturally). To provide a true criterion for constructibility, one has to draw a connection between the idea of "geometrically constructible" and "algebraically constructible".

We've already mentioned what it means to be geometrically constructible. An algebraically constructible number is one which can be constructed from performing field operations (addition, multiplication, subtraction, division) and a finite number of square roots with rational numbers. If you know some algebra, you'll recognize this as performing successive field extensions to the rationals.

We won't go into the proof here, but it turns out that these two definitions are actually the same—this is because there is an algorithm for performing field operations and taking square roots in the plane, and this algorithm involves only drawing lines and circles, starting with two rational points.

So what does this website do? It visualizes this construction process. You enter an algebraically constructible number, and it will draw out the geometric construction process for you!

This project lives on the web! You can find it here.

Want to see the code? Check it out on github.

For the second project in Brown's CS2240 course, "Advanced Computer Graphics", we were tasked with implementing some fundamental mesh operations: subdivision, greedy decimation using a quadric error metric, and bilateral filtering.

The bulk of the work of the project was building an acceleration data structure that allowed for fast mesh editing and traversal. I chose to implement a half-edge data structure, which, though ultimately more challenging to get right than something like a simple adjacency table, allows for cleaner implementations of the unit topology-altering operations needed by the more complex processing algorithms.

In particular, these operations are edge split, edge flip, and edge collapse. These were all fairly straightforward to puzzle out, though they required keeping track of a lot of pointers to implement correctly. Writing them was akin to writing a doubly linked-list deletion, but ten times more complicated.

Something that's important to note is that the half-edge data structure is only valid if the mesh it represents is "manifold". Roughly, a manifold mesh is one that perfectly encloses a volume. To guarantee this invariant, you need to prevent operations that would destroy it from occurring. Luckily, there are constant-time ways of detecting whether one of the three operations above will violate the manifold property.

Here's an example of multiple iterations of bilateral filtering on a high poly but very noisy bunny mesh. Note how the noise disappears, but much of the relevant detail is preserved.

Here's an example of multiple rounds of decimation on an initially high-poly cow mesh.

Here are two examples of subdivision across multiple iterations.

Raytracing is a classic algorithm that leverages the properties of geometric light transport to calculate very convincing specular reflections. Nearly every student of graphics has implemented raytracing in one form or another—it's arguably the most well-known algorithm in computer graphics.

The algorithm is simple: treat the image you're trying to render as a virtual "film plane", place it in your 3D scene, and send a ray of light through each pixel, bouncing it around the scene and accumulating color values with each bounce. This ends up being pretty easy, because light travels in a straight line.
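The innermost step of that loop is a ray-primitive intersection. As a flavor, here's a hedged sketch of the simplest case, ray vs. sphere, solved by finding the nearest positive root of |o + t·d − c|² = r² (illustrative Python; our actual projects were C++):

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Nearest positive hit distance t along the ray, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c                 # quadratic discriminant
    if disc < 0:
        return None                            # ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)     # nearer of the two roots
    return t if t > 0 else None
```

Everything else in a raytracer (shading, reflection, shadows) is built on top of queries like this one.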

For our final project for Brown's PHYS1100 course, "General Relativity", my fellow Brown student Alex Lawson and I decided to answer the question, "what if light doesn't travel in a straight line?".

In particular, we decided to try to raytrace scenes that had a black hole in them. The black hole warps spacetime in such a way that light bends around its singularity, creating a beautifully haunting fisheye-looking effect known as "gravitational lensing".

To understand how we implemented this project, you'll need to know a bit of General Relativity, the brainchild of Albert Einstein that changed our notions of space, time, and gravity forever.

Einstein kicks off the theory of General Relativity with the statement that space and time are really one thing, "spacetime", which is a four-dimensional object called a manifold. It's called four-dimensional because it has three spatial dimensions and one time dimension. To know where you are in spacetime, you have to know where you are as an (x, y, z) point in space, and what time "t" it is. In this sense, your location in spacetime is (t, x, y, z).

You can think of a manifold kind of like a surface, like a sphere or a plane. A sphere is a two dimensional manifold, but to see a sphere, you need to have three dimensions—how could you put all of a sphere on a piece of paper without unwrapping it? In this same way, four-dimensional spacetime can only be seen as a surface from a five-dimensional perspective. As lowly beings who only experience three of the dimensions in a way that isn't just "always moving forward" like the way we perceive time, we'd be really hard-pressed to visualize what our 4D spacetime looks like as a geometric object. Nonetheless, we can come up with examples that, while not entirely accurate, get the idea across.

If there was no matter in our universe—nothing with mass—then spacetime would be flat, like the surface of a trampoline. The key statement of General Relativity is that when you put an object with mass into spacetime, spacetime "warps" around it. To continue the trampoline analogy, this is like what happens when you put a heavy ball in the middle of the trampoline and see the trampoline surface bend under its weight.

The mathematical object that encodes the curvature of a manifold is known as the metric. The Einstein equation relates the metric to something called the stress-energy tensor, which you can think of as a way of representing the matter distributed across the universe. When you solve Einstein's equation, you are looking for a metric that satisfies the equation, given a particular stress-energy tensor.

The simplest solution to the Einstein equation is the Schwarzschild metric, and when it was discovered (three weeks after Einstein published his theory of General Relativity) it already represented remarkable new physics: a non-rotating black hole.

It turns out that things on the manifold of spacetime like matter and light move along particular paths. These paths are called "geodesics", and they're the shortest paths between two points on the manifold. For instance, a geodesic on a plane is just a straight line.

The geodesics on a manifold are solutions to a second-order differential equation known as the geodesic equation. To figure out how an object moves in curved spacetime, you just need to solve the geodesic equation with the right initial conditions.

Light follows a particular subset of geodesics called "null" geodesics—these are geodesics along which the spacetime interval is zero (the tangent vector has zero norm under the metric). So, to figure out how light moves around a black hole, we can write down the geodesic equation for the Schwarzschild metric and solve it numerically to find the null geodesics—this is the analogue of "tracing rays" in the usual flat-space raytracing algorithm.

Of course, this ends up being wildly computationally expensive, so to make things go faster, we set up a static scene with an image placed some distance behind a black hole. For each pixel in the output image, we marched a ray of light through the scene using the geodesic equation until it either fell into the black hole or hit the image behind it. We recorded which source pixel each output pixel landed on, and used this data to generate a lookup table. This way, we run the rendering algorithm once to build the table, then use it to place any image behind the black hole.
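The remapping step itself is cheap once the table exists. A sketch of the idea, with a hypothetical table layout (each entry is a (src_row, src_col) pair, or None for rays swallowed by the hole); our actual implementation differed in the details:

```python
def apply_lookup(table, image, hole_color="black"):
    """Remap a background image through a precomputed lensing table.
    table[row][col] indexes into image, or is None for captured rays."""
    return [[image[src[0]][src[1]] if src is not None else hole_color
             for src in row] for row in table]

# Toy 2x2 example: one ray is captured, the others land on source pixels.
warped = apply_lookup([[(0, 1), None], [(1, 0), (0, 0)]],
                      [["a", "b"], ["c", "d"]])
```

All the expensive geodesic integration is amortized into building the table; swapping in a new background image is then a single pass over the pixels.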

The results? Pretty cool!

For the culminating project of Brown's CS2240 course, "Advanced Computer Graphics", I worked with my fellow Brown students Vicki Tan and Kelly Wang to implement R. Miyazaki's seminal cloud simulation paper. Over the course of six weeks, we studied, understood, and implemented the algorithm described in the paper from scratch. In fact, we even fixed a part of it.

Before I dig into the details, I should note that this was all done in C++ with Qt Creator.

Physically-based cloud simulation starts with physically-based fluid simulation: after all, clouds are a product of atmospheric fluid dynamics. The famous Navier-Stokes equations govern the macroscopic motion of a fluid. They are based on a key observation: to completely describe the state of a body of fluid at some point in time, all you need to know is its velocity at every point in space. Because of this, they model the fluid as a vector field of velocities, and describe how that vector field evolves with time.

The Navier-Stokes equations, in their full form, are a huge tensorial monster that can get very complex (in fact, it's still not known if solutions always exist in certain configurations). To make things easier on ourselves, we make two key assumptions.

First, we assume the fluid is "isotropic", meaning that the way it responds to stress is the same in every direction. This lets us reduce the stress tensor to essentially one number.

Second, we assume the fluid is "incompressible", meaning that, roughly, the density of the fluid in a unit volume doesn't change unless more mass is added to the volume.

The first Navier-Stokes equation states that the field is divergenceless, which, coupled with the incompressibility assumption, can be shown to imply that the flow conserves mass.

The second Navier-Stokes equation is basically a list of the accelerations that the fluid experiences. These include a) external accelerations, like gravity, b) an acceleration that moves fluid from areas of high to low pressure, c) a "diffusive" acceleration that reduces the differences between neighboring velocities in accordance with a "viscosity" parameter, and d) an "advective" acceleration that compensates for the fluid velocity being transported through space by itself during the interval used to calculate the time derivative.

To simulate the Navier-Stokes equations for incompressible isotropic flow, we used the (now classic) semi-Lagrangian scheme developed by Jos Stam in the late 1990s. This isn't necessarily the most accurate strategy for fluid simulation, but it is preferred for simulating atmospheric fluid dynamics because it is unconditionally stable: you can make the time step as big as you want without the simulation "blowing up", which is necessary to make the large time scales of atmospheric fluid dynamics computationally tractable.

We represent the simulation space with a voxel grid—you can think of this as a big grid of little cubes, each of which has an associated velocity value (a natural discretization of a vector field). During a single time-step, Stam's method separates out each of the accelerations and applies them to the velocity field sequentially. External accelerations are applied via a simple first-order forward integration. Pressure and diffusion are formulated as sparse linear systems, which are solved using a sparse solver.

Where Stam's method really gets innovative is in the advection step. To account for the velocity of the fluid in a voxel being transported by itself, Stam proposes that we imagine a particle from that voxel being traced backward in time to a previous place in the grid. We then sample the velocity at that place in the grid and use it as the velocity for our voxel. From this description, it's clear why the method is unconditionally stable: it only ever interpolates velocity values already in the grid, so it can never "blow up" to hugely positive or negative values.
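The backtrace-and-interpolate step above can be sketched in one dimension (our simulator did this in 3D in C++ with trilinear interpolation; linear interpolation plays that role here, and boundary handling is reduced to a simple clamp):

```python
def advect_1d(field, velocity, dt, dx=1.0):
    """One semi-Lagrangian advection step on a 1D grid: trace back from
    each cell along its own velocity, then interpolate the old field."""
    n = len(field)
    out = []
    for i in range(n):
        x = i - velocity[i] * dt / dx          # backtrace, in grid units
        x = min(max(x, 0.0), n - 1.0)          # clamp to the grid
        i0 = int(x)
        i1 = min(i0 + 1, n - 1)
        t = x - i0
        out.append((1 - t) * field[i0] + t * field[i1])
    return out
```

Because each output value is a convex combination of existing grid values, no new extrema can ever be created, which is precisely the unconditional-stability argument in miniature.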

To implement semi-Lagrangian advection, we had to implement a voxel grid class in C++ that could be sampled at non-integer points and store its data in a way that was conducive to being used in sparse-matrix computations. Using the Eigen math library, we stored the entire grid as a one-dimensional vector in x-major/y-middle/z-minor order, and wrote an accompanying discrete derivatives class that generated the appropriate discretized vector calculus operators as matrices that could multiply this 1D vector. We then used the Eigen SparseLU solver to implement the diffusion and pressure steps. We exposed access to individual points in the grid via a sampling function that tri-linearly interpolated between grid cells. This made back-sampling for the advection step easy and readable.
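As a rough illustration of that layout, here's how the x-major/y-middle/z-minor flattening and tri-linear sampling might look in Python (our implementation was C++/Eigen; the names and grid dimensions here are made up for the example):

```python
import numpy as np

NX, NY, NZ = 4, 3, 2  # illustrative grid dimensions

def flat_index(x, y, z):
    """x-major / y-middle / z-minor flattening of a 3D voxel index:
    z varies fastest, then y, then x."""
    return (x * NY + y) * NZ + z

def trilinear_sample(grid_1d, px, py, pz):
    """Sample the flattened grid at a non-integer point by tri-linearly
    interpolating between the 8 surrounding voxels."""
    x0, y0, z0 = int(np.floor(px)), int(np.floor(py)), int(np.floor(pz))
    fx, fy, fz = px - x0, py - y0, pz - z0
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((1 - fx) if dx == 0 else fx) * \
                    ((1 - fy) if dy == 0 else fy) * \
                    ((1 - fz) if dz == 0 else fz)
                value += w * grid_1d[flat_index(x0 + dx, y0 + dy, z0 + dz)]
    return value
```

Storing the grid as one long vector is what lets the discretized vector calculus operators act on it as ordinary sparse matrix-vector products.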

Here's a fun little animation we rendered of a high-vorticity fluid to demonstrate our implementation of Stam's method. To generate it, we wrote out the voxel grid to a file using OpenVDB, and rendered it using Maya/Arnold.

Semi-Lagrangian advection is pretty cool, but it's not without its faults. It is not inherently mass-conserving (though it can be made to be), and it also can have trouble handling particular boundary conditions, as we found out painfully during our implementation process.

Once you've got a fluid simulator, making the transition to cloud simulation is only a matter of making a few extensions.

The broad strokes explanation for how clouds form is easy enough to understand. The sun heats the ground. This creates a buoyant force, transporting water vapor up higher into the atmosphere. As it moves upwards, its temperature decreases—a process known as "adiabatic cooling". Once its temperature decreases enough, it is too cold to remain as water vapor, so it condenses into water droplets: clouds. When water vapor is converted into water droplets, it releases latent heat, which results in additional buoyant force that transports the water droplets and remaining vapor even further skyward, creating the puffy cloud tops we're all familiar with.

Ok, cool, so that's how it works. Now how do we simulate it?

To start, we're going to need to store more data. In particular, we'll need voxel grids to store cloud density, vapor density, and temperature. These will be transported around the simulation space via the velocity field, using semi-Lagrangian advection.

To model buoyancy, we'll apply an additional upward force to the velocity field at each point, proportional to the difference between the temperature and the "ambient temperature" (which we set for each cell in the grid beforehand and which remains constant over time). We'll simulate the heating of the ground by placing a heat source along the bottom of the simulation space.
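In code, the buoyancy update is a one-liner per cell. This sketch is illustrative; the coefficient `alpha` is a made-up tuning constant, not a value from our implementation:

```python
import numpy as np

def apply_buoyancy(v_up, temperature, ambient, dt, alpha=0.1):
    """Add upward acceleration proportional to (T - T_ambient) per cell.
    `alpha` is an illustrative buoyancy coefficient."""
    return v_up + dt * alpha * (temperature - ambient)
```

Cells hotter than ambient accelerate upward; cells colder than ambient sink.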

Adiabatic cooling is easy enough—just increase or decrease the temperature of a grid cell in a way that's proportional to its upward velocity.
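A hedged sketch of that update, with a made-up default lapse rate (the real atmospheric value depends on humidity and isn't the constant we actually tuned):

```python
def adiabatic_cooling(temperature, v_up, dt, lapse_rate=0.0065):
    """Cool (or warm) a cell in proportion to its upward (or downward)
    velocity. `lapse_rate` is an illustrative constant in K per meter."""
    return temperature - lapse_rate * v_up * dt
```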

The phase transition gets a little trickier. We can define a quantity called the "saturation vapor density" that depends on temperature. If there's more vapor in a cell than its saturation vapor density, then some of that excess vapor starts getting converted to clouds. If there's less, then whatever cloud density is there starts getting converted back to vapor. Seems simple enough, right? Turns out the Miyazaki paper got it wrong, and the way they calculated saturation vapor density guaranteed that no clouds would ever form. We had to make a slight modification to the equation to ensure that it modeled the correct behavior.
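Here's a sketch of the vapor/cloud exchange. Note the saturation formula below uses the standard Tetens approximation for saturation vapor pressure converted to a density, not the exact (corrected) expression from our implementation:

```python
import numpy as np

def phase_transition(vapor, cloud, temperature, dt, rate=1.0):
    """Move mass between vapor and cloud density based on a saturation
    threshold. Saturation here uses the Tetens approximation (T in Kelvin,
    densities in kg/m^3); `rate` is an illustrative transition speed."""
    e_sat = 610.78 * np.exp(17.27 * (temperature - 273.15) / (temperature - 35.85))
    saturation = e_sat / (461.5 * temperature)  # ideal gas law for water vapor
    excess = vapor - saturation
    # condensation limited by available vapor, evaporation by available cloud
    transfer = np.clip(rate * dt * excess, -cloud, vapor)
    return vapor - transfer, cloud + transfer
```

Mass is conserved: whatever leaves the vapor grid lands in the cloud grid, and vice versa.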

Finally, to ensure that we don't lose small-scale detail to the smoothing effect of semi-Lagrangian advection, we'll apply a force that encourages areas of the fluid with high curl to continue swirling—a well-known augmentation to Stam's method known as "vorticity confinement".
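A numpy sketch of vorticity confinement on a voxel grid (the strength `epsilon` is illustrative, and `np.gradient` stands in for the discrete derivative operators from our grid class):

```python
import numpy as np

def vorticity_confinement(vx, vy, vz, epsilon=0.5):
    """Vorticity-confinement force for a 3D velocity field (sketch)."""
    # curl of the velocity field (the vorticity, omega)
    wx = np.gradient(vz, axis=1) - np.gradient(vy, axis=2)
    wy = np.gradient(vx, axis=2) - np.gradient(vz, axis=0)
    wz = np.gradient(vy, axis=0) - np.gradient(vx, axis=1)
    mag = np.sqrt(wx**2 + wy**2 + wz**2) + 1e-10
    # unit vector N pointing toward regions of higher vorticity
    nx, ny, nz = np.gradient(mag)
    norm = np.sqrt(nx**2 + ny**2 + nz**2) + 1e-10
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    # force = epsilon * (N x omega): pushes existing swirls to keep swirling
    fx = epsilon * (ny * wz - nz * wy)
    fy = epsilon * (nz * wx - nx * wz)
    fz = epsilon * (nx * wy - ny * wx)
    return fx, fy, fz
```

Because the force scales with the local vorticity, it adds nothing to smooth regions of the flow and only reinforces swirls that are already there.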

So, if you manage to do all that correctly, you'll be able to generate physically based clouds. Of course, chances are, instead you'll run into...

We thought we were close to being done once we'd written all the code. In reality, we spent more time debugging the simulation than we did writing it. Such is the nature of code that takes an incredibly long time to run and can't really be unit-tested. Truth be told, there were more bugs than I have space to write about here, but I'll document some of the key ones.

One bug we encountered was a twist on the classic off-by-one error: we had a mistake in the way we were converting from physical "world space" (in meters) to non-physical index space (grid cell numbers). We had forgotten to account for the fact that the representative point for each grid cell was at its center, not its bottom-left corner.
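The fix amounts to a half-cell offset in the conversion. A sketch (function names are illustrative):

```python
def world_to_index(p, origin, cell_size):
    """Convert a world-space coordinate (meters) to a fractional grid index,
    treating each cell's *center* as its representative point -- the half-cell
    offset our off-by-one bug was missing."""
    return (p - origin) / cell_size - 0.5

def index_to_world(i, origin, cell_size):
    """Inverse mapping: the world-space position of fractional index i."""
    return origin + (i + 0.5) * cell_size
```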

Another issue stemmed from the way we handled advection steps outside the simulation space. We initially just clamped the sample point to lie inside the grid, and used the value at that point. Upon further consideration, we realized this was completely wrong—you have to calculate the intersection of the advection path with the boundary it passes through, and use that point, or, better yet, bounce it off all boundaries it intersects until the full backward time-step is completed.
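For the bouncing approach, each axis of the back-traced point can be handled independently. A one-axis sketch:

```python
def reflect_into_range(x, lo, hi):
    """Bounce a back-traced sample coordinate off the grid boundaries until
    it lies inside [lo, hi]. Apply per axis in 3D. Illustrative sketch; a
    real implementation would also handle degenerate ranges."""
    while x < lo or x > hi:
        if x < lo:
            x = lo + (lo - x)  # reflect off the lower boundary
        else:
            x = hi - (x - hi)  # reflect off the upper boundary
    return x
```

Compared to naive clamping, this preserves how far past the boundary the trace traveled instead of collapsing it onto the wall.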

Perhaps the worst difficulty—though one that you might not necessarily call a bug—was determining and implementing the best boundary conditions for the voxel grids. We tried a number of different options, but all suffered from either artificial decrease or increase of total density. Ultimately, we opted to use a method that modeled collisions with the boundaries (a "reflective" boundary condition), and account for the density dissipation by adding a water vapor source at the bottom of the simulation space.

All in all, this was a very challenging project; debugging the various boundary condition bugs and trying to make sense of the paper tested my patience more than once. Ultimately, though, it was massively rewarding when we were able to render some beautiful, realistic clouds!

Interactive fluid simulations have been a sort of “holy grail” for real-time graphics applications for some time now. The main issue is that they’re very slow to compute. We (myself, Natalie Lindsay, Michael Cosgrove, and Purvi Goel) hypothesized that it’s possible to compute a physically accurate fluid simulation at low resolution in real-time, and use deep learning to efficiently up-sample the resulting fluid simulation in a way that is aesthetically convincing, even if it's not physically accurate.

As a reminder, the state of a fluid is described by,

- t: global time
- **v**(**x**, t): the velocity of the fluid at time t and position **x**
- ⍴(**x**, t): the density of the fluid at time t and position **x**
- Some additional parameters (viscosity, external forces, etc.)

The Navier-Stokes equations tell us how such a fluid’s state evolves with time. The job of a fluid sim is to discretize the fluid’s state somehow, and then approximate a solution to the Navier-Stokes equations to time evolve the fluid. We represent **v**(**x**, t) and ⍴(**x**, t) as discretized grids that we can sample at any point via interpolation.

Then, a really, really high-level diagram of a traditional fluid sim might look something like:

Our goal was to augment such a fluid simulation such that, given hi-res **V**(**x**, t) and Ρ(**x**, t), we downsample them to a lower resolution to get **v**(**x**, t) and ⍴(**x**, t), use a fluid sim to get **v**(**x**, t + 1) and ⍴(**x**, t + 1), and run the result through Wetnet to upsample the grids to get an approximation of **V**(**x**, t + 1) and Ρ(**x**, t + 1). Using yet another high-level diagram:

Our model architecture is an extended autoencoder that shrinks frames of the velocity grid down to features using convolutional layers, and then upsamples them with transposed convolutional layers to predict a modification to the velocity field. We then add this predicted offset to the original lo-res field to get our estimate of the hi-res field.

The key difference between our model and a vanilla autoencoder is the use of RNN cells at the bottleneck. Every frame of a fluid simulation is entirely dependent on preceding frames. In order to capture that important causal behavior, we need something in the network that can maintain state across frames of the simulation---in our case, a series of RNNs.

More precisely, we concatenated each encoded frame's features and passed each feature into a separate RNN. The idea here was that every encoded high-detail feature will time-evolve in a different way, and each per-feature RNN can capture this behavior and use it to improve the temporal consistency of the predictions.

You may be wondering where the architecture diagram is. If you scroll down a bit, you'll find it, but it's useful to talk about one more thing before we get to that: the loss.

The natural loss that comes to mind when working with regular grid data is a summation over the L2 norm of the grid differences. Our loss terms mostly conform to this standard, but we made a few key choices about what grids to compare.

Our first loss term is just the vanilla L2 loss between the predicted frame and the actual high-res frame; straightforward. However, we also introduce a second L2 loss term, **bidirectional physically based consequence loss**. That's quite a mouthful, but the idea is simple: to improve temporal consistency, we regularize the usual L2 loss with the L2 loss between the actual high-res frame at time t+1 and the **upsampled frame** put through a traditional high-res fluid sim to get the predicted upsampled frame at t+1. In other words, we enforce that our prediction, once plugged back into a regular fluid sim, produces reasonable results. We also do this for t-1 by running the sim backwards.
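Schematically, the combined L2 terms look something like this numpy sketch, where `sim_forward` and `sim_backward` stand in for one step of a (differentiable) fluid sim and all names are illustrative rather than our actual code:

```python
import numpy as np

def consequence_loss(pred_hi, true_hi_prev, true_hi, true_hi_next,
                     sim_forward, sim_backward):
    """Vanilla L2 loss plus the bidirectional 'consequence' terms: the
    prediction, pushed one sim step forward or backward, should match the
    real high-res frames at t+1 and t-1."""
    direct = np.sum((pred_hi - true_hi) ** 2)
    forward = np.sum((sim_forward(pred_hi) - true_hi_next) ** 2)
    backward = np.sum((sim_backward(pred_hi) - true_hi_prev) ** 2)
    return direct + forward + backward
```

A prediction that matches the target frame *and* behaves correctly when re-simulated incurs zero loss; one that only matches pixel-wise but evolves wrongly is still penalized.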

To optimize the parameters of the network to minimize this consequence loss, the fluid sim you use needs to be differentiable... so we built a differentiable fluid sim in numpy!

Our third loss term is not an L2 loss at all, but rather a discriminator loss. We train the discriminator in conjunction with the autoencoder to recognize if frames of a high-res fluid sim are really high-res or results of the upsampling process---in other words, if they're real or fake. The idea here is that, at the end of the day, we don't really care if the upsampled frame is exactly the same as the real high-res frame, we care if it appears real or not.

So, with that, we have our final architecture diagram, discriminator included.

The image above is a screen capture of our results. On the left is the low-resolution fluid simulation, and on the right is the detailed target fluid simulation. In the middle is our upsampled simulation.

While we don’t achieve the fine-grained details of the high-resolution fluid simulation, our simulation still displays some of the curls and vortices that the low-resolution simulation could not capture. We also found that the deep-learning-upsampled simulation took about 25% less time to run than the high-resolution simulation, which was one of our ultimate goals. We expect this speedup to grow as the target simulations become higher and higher resolution, a promising sign for production fluid simulations with demanding speed and compute budgets.

All in all, we feel pretty good about the chances of this being possible, even if our method wasn’t a complete success. Some possible further directions include,

- Using a conditional GAN, as opposed to an autoencoder with a discriminator loss. This is the approach taken by TempoGAN and it seems to work quite well.
- Projecting the sim into a higher-dimensional sparser representation to make it easier to separate distinct features—so-called “dictionary learning”, described in Dynamic Upsampling of Smoke Through Dictionary-based Learning.
- Passing in more prior frames as input to the network. Realistically, only 2 prior frames should be necessary, so that finite differences that are second order in time can be approximated more accurately.
- Applying the same forces at the same time to the high-res simulation and our upsampled simulation, so we can compare them side by side.

Want to see the code? It's in a public repo on GitHub!