One way to define a surface in three-dimensions is through a function $f:\mathbb{R}^3\rightarrow\mathbb{R}$ which satisfies
$$f(\boldsymbol{x})=0,$$ for every point $\boldsymbol{x}\in \mathbb{R}^3$ that belongs to the surface. Thus, the surface $S$ can be defined as
$$S=\{\boldsymbol{x} \in \mathbb{R}^3\,|\,f(\boldsymbol{x})=0\}.$$ For one to visualize the surface $S$, the intersection point $\boldsymbol{p} \in \mathbb{R}^3$ of a ray---starting from the camera position $\boldsymbol{c} \in \mathbb{R}^3$ to some direction $\boldsymbol{d} \in \mathbb{R}^3$, $\|\boldsymbol{d}\|=1$---and the surface must be sought. After these intersection points are known for sufficiently many rays, a visualization can be constructed (some examples follow).
One way to find the intersection point $\boldsymbol{p}$ is through analytical calculations: the surface defined by $S$ is sometimes so simple that this calculation can be done by hand. An example that considers the intersection of a ray and a sphere can be found
from Wikipedia. This is exactly what the traditional ray tracing is all about.
In a more general setting, no analytical relationship exists for finding the intersection point $\boldsymbol{p}$. In such a case, one must rely to numerical methods.
A proficient applied mathematician would immediately attempt to apply, e.g., Secant method to the problem: find $t \in \mathbb{R}$ such that
$$f(\boldsymbol{c}+t\boldsymbol{d}) \approx 0.$$ There is, however, a caveat. For a general $f$ the function
$$g(t):=f(\boldsymbol{c}+t\boldsymbol{d})$$ might have several roots and in Secant method (or similar) nothing guarantees that a root nearest to the camera is recovered. You might end up drawing the wrong side of the object of interest! Thus, the applied numerical method should be guaranteed to find a root nearest to the camera.
A somewhat trivial method to find the nearest (or approximately nearest) root is to fix a sufficiently small constant step size $\Delta t$ and then find the smallest $n \in \mathbb{N} \cup \{0\}$ such that
$$f(\boldsymbol{c}+n \Delta t \boldsymbol{d})\approx 0.$$ This is known as
ray marching and is indeed a valid method for visualizing the surface $S$. The problem is that it's
tremendously slow since one must try every possible $n$ starting from zero!
Fortunately, there exists a clever trick to make the computation faster by ensuring that $f$ defines a
distance field of $S$. This means that at every point $\boldsymbol{x}$ in the space $\mathbb{R}^3$ the value $f(\boldsymbol{x})$ represents the
minimum distance from $\boldsymbol{x}$
to the surface $S$. This sounds like a hard restriction to satisfy but for many fractals the distance fields (or approximate distance fields) have already been derived by enthusiasts.
The algorithm goes as follows:
1. $\boldsymbol{p}:=\boldsymbol{c}$
2. $\boldsymbol{p}:=\boldsymbol{p}+f(\boldsymbol{p})\boldsymbol{d}$
3. If $f(\boldsymbol{p})$ is larger than some tolerance go back to step 2. Otherwise, break.
Output of the prescribed algorithm is the intersection point $\boldsymbol{p}$.
If you happen to own a sufficiently fast graphics processing unit and know the function $f$ for some fractal then this method can be implemented in real time using, e.g., OpenGL Shading Language. See the following Shadertoy entry I constructed as an example:
Getting the coloring and other visuals right requires of course lots of experience (or trial-and-error) but this is the basic idea for generating such animations. I tried to add some comments to the shader code and attempted to use same symbols as in this post.