Improving performance of raytracing hit function

Tags:

python

I have a simple raytracer in python. rendering an image 200x200 takes 4 minutes, which is definitely too much for my taste. I want to improve the situation.

Some points: I shoot multiple rays per each pixel (to provide antialiasing) for a grand total of 16 rays per pixel. 200x200x16 is a grand total of 640000 rays. Each ray must be tested for impact on multiple Sphere objects in the scene. Ray is also a rather trivial object

class Ray(object):
    def __init__(self, origin, direction):
        self.origin = numpy.array(origin)
        self.direction = numpy.array(direction)

Sphere is slightly more complex, and carries the logic for hit/nohit:

class Sphere(object):
    def __init__(self, center, radius, color):
        self.center = numpy.array(center)
        self.radius = numpy.array(radius)
        self.color = color

    @profile 
    def hit(self, ray):
        temp = ray.origin - self.center
        a = numpy.dot(ray.direction, ray.direction)
        b = 2.0 * numpy.dot(temp, ray.direction)
        c = numpy.dot(temp, temp) - self.radius * self.radius
        disc = b * b - 4.0 * a * c

        if (disc < 0.0):
            return None
        else:
            e = math.sqrt(disc)
            denom = 2.0 * a
            t = (-b - e) / denom 
            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius
                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal, 
                                               hit_point=hit_point, 
                                               parameter=t, 
                                               color=self.color)           

            t = (-b + e) / denom

            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal, 
                                               hit_point=hit_point, 
                                               parameter=t, 
                                               color=self.color)       

        return None

Now, I ran some profiling, and it appears that the longest processing time is in the hit() function

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2560000  118.831    0.000  152.701    0.000 raytrace/objects/Sphere.py:12(hit)
  1960020   42.989    0.000   42.989    0.000 {numpy.core.multiarray.array}
        1   34.566   34.566  285.829  285.829 raytrace/World.py:25(render)
  7680000   33.796    0.000   33.796    0.000 {numpy.core._dotblas.dot}
  2560000   11.124    0.000  163.825    0.000 raytrace/World.py:63(f)
   640000   10.132    0.000  189.411    0.000 raytrace/World.py:62(hit_bare_bones_object)
   640023    6.556    0.000  170.388    0.000 {map}

This does not surprise me, and I want to reduce this value as much as possible. I pass to line profiling, and the result is

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    12                                               @profile
    13                                               def hit(self, ray):
    14   2560000     27956358     10.9     19.2          temp = ray.origin - self.center
    15   2560000     17944912      7.0     12.3          a = numpy.dot(ray.direction, ray.direction)
    16   2560000     24132737      9.4     16.5          b = 2.0 * numpy.dot(temp, ray.direction)
    17   2560000     37113811     14.5     25.4          c = numpy.dot(temp, temp) - self.radius * self.radius
    18   2560000     20808930      8.1     14.3          disc = b * b - 4.0 * a * c
    19                                                   
    20   2560000     10963318      4.3      7.5          if (disc < 0.0):
    21   2539908      5403624      2.1      3.7              return None
    22                                                   else:
    23     20092        75076      3.7      0.1              e = math.sqrt(disc)
    24     20092       104950      5.2      0.1              denom = 2.0 * a
    25     20092       115956      5.8      0.1              t = (-b - e) / denom
    26     20092        83382      4.2      0.1              if (t > 1.0e-7):
    27     20092       525272     26.1      0.4                  normal = (temp + t * ray.direction) / self.radius
    28     20092       333879     16.6      0.2                  hit_point = ray.origin + t * ray.direction
    29     20092       299494     14.9      0.2                  return ShadeRecord.ShadeRecord(normal=normal, hit_point=hit_point, parameter=t, color=self.color)

So, it appears that most of the time is spent in this chunk of code:

        temp = ray.origin - self.center
        a = numpy.dot(ray.direction, ray.direction)
        b = 2.0 * numpy.dot(temp, ray.direction)
        c = numpy.dot(temp, temp) - self.radius * self.radius
        disc = b * b - 4.0 * a * c

Where I don't really see a lot to optimize. Do you have any idea how to make this code more performant without going C ?

976

asked Jun 29 '11 22:06

Stefano Borini

2 Answers

Looking at your code, it looks like your main problem is that you have lines of code that are being called 2560000 times. That will tend to take a lot of time regardless what kind of work you are doing in that code. However, using numpy, you can aggregate alot of this work into a small number of numpy calls.

The first thing to do is to combine your rays together into large arrays. Instead of using a Ray object that has 1x3 vectors for origin and direction use Nx3 arrays that have all of the rays you need for the hit detection. The top of your hit function will end up looking like this:

temp = rays.origin - self.center
b = 2.0 * numpy.sum(temp * rays.direction,1)
c = numpy.sum(numpy.square(temp), 1) - self.radius * self.radius
disc = b * b - 4.0 * c

For the next part you can use

possible_hits = numpy.where(disc >= 0.0)
a = a[possible_hits]
disc = disc[possible_hits]
...

to continue with just the values that pass the discriminant test. You can easily get orders of magnitude performance improvements this way.

answered Oct 03 '22 14:10

sholte

1) Ray tracing is fun but if you care at all about performance, dump python and switch to C. Not C++ unless you are some kind of super expert, just C.

2) The big win in scenes with multiple (20 or more) objects is to use a spatial index to reduce the number of intersection tests. Popular options are kD-trees, OctTrees, AABB.

3) If you're serious, check out ompf.org - it is the resource for this.

4) Don't go to ompf with python asking about optimization - most people there can shoot 1 Million to 2 Million rays per second through an indoor scene with 100 thousand triangles... Per core.

I love Python and ray tracing, but would never consider putting them together. In this case, the correct optimization is to switch languages.

answered Oct 03 '22 13:10

phkahler

Related questions
                            
                                Convert dictionaries with list of values into a dataframe
                            
                                Calculate percentage of similar values in pandas dataframe
                            
                                How do i separate overlapping cards from each other using python opencv?
                            
                                Writing to the windows logs in Python
                            
                                Is there something like Python's getattr() in C#?
                            
                                Is there a Python MTA (Mail transfer agent) [closed]
                            
                                Checking if a domain name is registered
                            
                                Python urllib3 and how to handle cookie support?
                            
                                Accessing relative path in Python
                            
                                How to make a simple table in ReportLab
                            
                                Differences Between Python and C++ Constructors
                            
                                Trouble using Vim's syn-include and syn-region to embed syntax highlighting
                            
                                Catching an exception while using a Python 'with' statement - Part 2
                            
                                List.append() changing all elements to the appended item [duplicate]
                            
                                How to save figures to pdf as raster images in matplotlib
                            
                                Django DateTimeField auto_now_add not working
                            
                                Working with multiple code files and folders in Python
                            
                                How to initialize classes (not instances) in Python?
                            
                                How to reload Python module in IDLE?
                            
                                Can I run GLU (OpenGL) on a headless server?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With