Understanding Perlin Noise
Technical Writeup Posted on 09 August 2014 by Flafla2
The objective of this article is to present an easytounderstand analysis of Ken Perlin's Improved Perlin Noise. The code in this article is written in C# and is free to use. If you would prefer to just look at the final result, you can view the final source here.
Perlin Noise is an extremely powerful algorithm that is used often in procedural content generation. It is especially useful for games and other visual media such as movies. The man who created it, Ken Perlin, won an academy award for the original implementation. In this article I will be exploring his Improved Perlin Noise, published in 2002. In game development, Perlin Noise can be used for any sort of wavelike, undulating material or texture. For example, it could be used for procedural terrain (Minecraftlike terrain can be created with Perlin Noise, for example), fire effects, water, and clouds. These effects mostly represent Perlin noise in the 2^{nd} and 3^{rd} dimensions, but it can be extended into the 4^{th} dimension rather trivially. Additionally Perlin Noise can be used in only 1 dimension for purposes such as sidescrolling terrain(such as in Terraria or Starbound) or to create the illusion of handwritten lines.
Also, if you extend Perlin Noise into an additional dimension and consider the extra dimension as time, you can animate it. For example, 2D Perlin Noise can be interpreted as Terrain, but 3D noise can similarly be interpreted as undulating waves in an ocean scene. Below are some pictures of Noise in different dimensions and some of their uses at runtime:
Noise Dimension  Raw Noise (Grayscale)  Use Case 

1  Using noise as an offset to create handwritten lines. 

2  By applying a simple gradient, a procedural fire texture can be created. 

3  Perhaps the quintessential use of Perlin noise today, terrain can be created with caves and caverns using a modified Perlin Noise implementation. 
So as you can see, Perlin Noise has an application to many naturallyoccurring phenomenon. Now let's look into the mathematics and logic of the Perlin Noise Algorithm.
Logical Overview
NOTE: I would like to preface this section by mentioning that a lot of it is taken from this wonderful article by Matt Zucker. However, that article is based on the original Perlin Noise algorithm written in the early 1980s. In this post I will be using the Improved Perlin Noise Algorithm written in 2002. Thus, there are some key differences between my version and Zucker's.
Let's start off with the basic Perlin Noise function:
So we have an x, y and z coordinate as input, and as the output we get a double
between 0.0 and 1.0. So what do we do with this input? First, we divide the x, y, and z coordinates into unit cubes. In other words, find [x,y,z] % 1.0
to find the coordinate's location within the cube. Below is a representation of this concept in 2 dimensions:
Figure 1: The blue dot here represents an input coordinate, and the other 4 points are the surrounding integral unit coordinates. (Source)
On each of the 4 unit coordinates (8 in 3D), we generate what's called a pseudorandom gradient vector. This gradient vector defines a positive direction (in the direction that it points to) and of course a negative direction (in the direction opposite that it points to). Pseudorandom means that, for any set of integers inputted into the gradient vector equation, the same result will always come out. Thus, it seems random, but it isn't in reality. Additionally this means that each integral coordinate has its "own" gradient that will never change if the gradient function doesn't change.
Figure 2: Based on the above image, here are some example gradient vectors. (Source)
The image above is not completely accurate, however. In Ken Perlin's Improved Noise, which we are using in this article, these gradients aren't completely random. Instead, they are picked from the vectors of the point in the center of a cube to the edges of the cube:
The reasoning behind these specific gradient vectors is described in Ken Perlin's SIGGRAPH 2002 paper: Improving Noise. NOTE: Many other articles about Perlin Noise refer to the original Perlin Noise algorithm, which does not use these vectors. For example, Figure 2 represents the original algorithm because its source was written before the improved algorithm was released. However, the basic idea is the same.
Next, we need to calculate the 4 vectors (8 in 3D) from the given point to the 8 surrounding points on the grid. An example case of this in 2D is shown below.
Figure 3: Example distance vectors. (Source)
Next, we take the dot product between the two vectors (the gradient vector and the distance vector). This gives us our final influence values:
grad.x * dist.x + grad.y * dist.y + grad.z * dist.z
This works because the dot product of 2 vectors is equal to the cosine of the angle between the two vectors, multiplied by the magnitude of those vectors:
dot(vec1,vec2) = cos(angle(vec1,vec2)) * vec1.length * vec2.length
In other words, if the 2 vectors are pointing in the same direction the dot product would equal:
1 * vec1.length * vec2.length
..and if the two vectors are pointing in opposite directions the dot product would equal:
1 * vec1.length * vec2.length
If the two vectors are perpendicular, the dot product is 0.
Thus, the result of the dot product would be positive when it is in the direction of the gradient, and negative when it is in the opposite. This is how the gradient vectors define positive and negative directions. Here is a graphic representing these positive/negative influences:
Figure 4: A representation of these influences in 2D noise. (Source)
So now all we need to do is interpolate between these 4 values so that we get a sort of weighted average in between the 4 grid points (8 in 3D). The solution to this is easy: average the averages like so (this example is in 2D):
There is one final piece to this puzzle: with the above weighted average, the final result would look bad because linear interpolation, while computationally cheap, looks unnatural. We need a smoother transition between gradients. So, we use a fade function, also called an ease curve:
Figure 5: This is an ease curve. (Source)
This ease curve is applied to the u
and v
values in the above code example. This makes changes more gradual as one approaches integral coordinates. The fade function for the improved perlin noise implementation is this:
6t^{5}15t^{4}+10t^{3}
Logically, that's it! We now have all of the components needed to generate Perlin Noise. Now let's jump into some code.
Code Implementation
Once again, this code is written in C#. The code is a slightly modified version of Ken Perlin's Java Implementation. It was modified for additional clarity and deobfuscation, as well as adding the ability to repeat (tile) noise. The code is of course entirely free to use (considering I didn't really write it in the first place  Ken Perlin did!).
Setting Up
The first thing we need to do is set up our permutation table, or the p[]
array for short. This is simply a length 256 array of random values from 0  255 inclusive. We also repeat this array (for a total size of 512) to avoid buffer overflow later on:
The p[]
array is used in a hash function that will determine what gradient vector to use later on. The specifics of this will be explained later.
Next, we begin our perlin noise function:
This code is pretty self explanatory. First, we use the modulo (remainder) operator to have our input coordinates overflow if they are over the repeat
variable. Next, we create variables xi, yi, zi
. These represent the unit cube that our coordinate is in. We also bind our coordinates to the range [0,255] inclusive so that we won't run into overflow errors later on when we access the p[]
array. This also has an unfortunate side effect: Perlin noise always repeats every 256 coordinates. This isn't a problem though because decimal coordinates are possible with perlin noise. Finally we find the location of our coordinate inside its unit cube. This is essentially n = n % 1.0
where n is a coordinate.
The Fade Function
Now we need to define our fade function, described above (Figure 5). As mentioned earlier, this is the Perlin Noise fade function:
6t^{5}15t^{4}+10t^{3}
In code, it is defined like this:
The u / v / w
values will be used later with interpolation.
The Hash Function
The Perlin Noise hash function is used to get a unique value for every coordinate input. A hash function, as defined by wikipedia, is:
…any function that can be used to map data of arbitrary size to data of fixed size, with slight differences in input data producing very big differences in output data.
This is the hash function that Perlin Noise uses. It uses the p[]
table that we created earlier:
The above hash function hashes all 8 unit cube coordinates surrounding the input coordinate. inc()
is simply used to increment the numbers and make sure that the noise still repeats. If you didn’t care about the ability to repeat, inc(xi)
can be replaced by xi+1
. The result of this hash function is a value between 0 and 255 (inclusive) because of our p[]
array.
The Gradient Function
I have always thought that Ken Perlin's original grad()
function is needlessly complicated and confusing. Remember, the goal of grad()
is to calculate the dot product of a randomly selected gradient vector and the 8 location vectors. Ken Perlin used some fancy bitflipping code to accomplish this:
Below is an alternate way of writing the above code in a much more easytounderstand way (and actually faster in many languages):
The source of the above code can be found here. In any case, both versions do the same thing. They pick a random vector from the following 12 vectors:
This is determined by the last 4 bits of the hash function value (the first parameter of grad()
). The other 3 parameters represent the location vector (that will be used for the dot product).
Putting it all Together
Now, we take all of these functions, use them for all 8 surrounding unit cube coordinates, and interpolate them together:
Working with Octaves
One final thing I would like to discuss is how to process perlin noise to look more natural. Even though perlin noise does provide a certain degree of natural behavior, it doesn’t fully express the irregularities that one might expect in nature. For example, a terrain has large, sweeping features such as mountains, smaller features such as hills and depressions, even smaller ones such as boulders and large rocks, and very small ones like pebbles and minute differences in the terrain. The solution to this is simple: you take multiple noise functions with varying frequencies and amplitudes, and add them together. Of course, frequency refers to the period at which data is sampled, and amplitude refers to the range at which the result can be in.
Figure 6: 6 Example noise results with differing frequencies and amplitudes. (Source)
Add all of these results together, and you get this:
Figure 7: The addition of the 7 above results. (Source)
Obviously this result is much more convincing. The above 6 sets of noise are called different octaves of noise. Each successive octave has less and less influence on the final result. Of course, with each octave there is a linear increase in code execution time, so you should try not to use more than just a few octaves at runtime (for example, with a procedural fire effect running at 60fps). However, octaves are great when preprocessing data (such as generating terrain).
But how much influence should each successive octave have, quantitatively? This question is answered by another value called persistence. Hugo Elias defines Persistence as follows:
frequency = 2^{i}
amplitude = persistence^{i}
Where i is the octave in question. In code, the implementation is simple:
Conclusion
Finally, that's it! We can now make noise. Once again, you can find the full source code here. If you have any questions, please ask in the comments section below.
References
Here are some references that you can check out if you are interested:
 Ken Perlin's "Official" Improved Perlin Noise  This is the original algorithm as written by Ken Perlin.
 "The Perlin Noise Math FAQ"  This is excellent as a theoretical reference about the algorithm. Keep in mind however that it uses the original Perlin Noise algorithm from the 80s, not the one that I used in this tutorial.
 Hugo Elias' article  One of the most popular Perlin Noise articles. This is a good reference about octaves, persistence, and some uses of perlin noise in the real world. However, this is not true perlin noise! Hugo's algorithm is not based on gradients like Perlin Noise is. Instead it is what's known as value noise, which is essentially blurred white noise. Do not confuse yourself!
 "How to use Perlin Noise in Your Games"  Devmag  A cool article about some potential uses of Perlin Noise. A very interesting read, but once again this is NOT true perlin noise! Devmag uses value noise in that article.
 GPU Gems  "Implementing Improved Perlin Noise"  This article taps into the awesome power of the GPU to render stunning ocean scenes in realtime using Perlin Noise. Shaders are cool!
Thank you for reading!
Updates
8/9/14  I have updated the article to include a better explanation about dot products, and I fixed some typos in the article. Huge thanks to all of the advice given to me by reddit /r/programming, /r/gamedev, and /r/Unity3D.