mhouston: On Raytracing


Nerve-Damage
12-22-2006, 11:13 AM
mhouston: On Raytracing (http://www.beyond3d.com/forum/showthread.php?p=896065#post896065)

By the way, mhouston is a programmer, GPU guru (IMO), PhD student at Stanford University, author, and a pretty cool guy.

There is also an upcoming I3D paper about GPU raytracing using optimized kd-tree algorithms and static scenes. (I happen to be one of the authors, and the work is an evolution of the Foley et al. work). The short answer is that GPUs are faster than a single CPU, but they aren't great at raytracing because of divergence in execution between rays. As the execution traces diverge in the acceleration structures, you end up with a lot of SIMD execution stalls. GPUs also currently have to do a bunch of extra work because there isn't an effective way to do a stack, so it has to be emulated or worked around via algorithm modifications. Sadly, the G80's 16KB of global memory between the threads isn't very helpful, as it's too small to really do a stack for the number of parallel execution contexts to run efficiently; however, there might be fruit here. We currently are talking ~19Mray/s on an X1900XTX (Conference room, shadow rays), and about the same on a G80 with DirectX and the current state of the drivers and shader compilers. Using simpler scenes, we can execute at much faster rates, but those aren't realistic. (All the current published fast raytracing numbers also do not do anything but trivial shading, but GPUs obviously do well here...) With heavier tuning via CTM/CUDA, we might be able to squeeze out a little more, but unless we can regain ray coherence, it's difficult to do leaps and bounds better.

Cell is actually a raytracing monster, compared to other non-custom architectures, in certain situations. The Saarland folks (and others including Stanford) have Cell raytracers >60Mrays/s for primary rays. Multi-core CPUs are also showing great promise as people are showing >5Mrays/s per processor for comparable designs (i.e. no tricks that only really work for primary rays), and there is impressive work from Intel on really optimizing the heck out of raytracing on CPUs. My main concern about CPU implementations is their ability to shade fast. It's going to be interesting to see hybrid CPU/GPU implementations here...

In our I3D paper, we argue that what you would likely do on a GPU is rasterize the primary hits, and raytrace secondary effects such as shadows, reflections, and refractions depending on your ray/frame rate budget. We have a demonstration implementation in Direct3D (Brook + D3D) as well as CTM+GL that demonstrates this hybrid system, and it was running in the ATI booth at SIGGRAPH and shown during the CTM presentations. The paper should go up when finalized in late January and will be presented at I3D 2007 by Daniel Horn. For raytracing to really get faster on GPUs, we need a way to deal with the cost of instruction divergence, and more importantly perhaps, ways to really build complex data structures.

Regardless, we are still quite a ways away from the projected 400Mray/s needed to approach the performance of current games. (I can't remember who stated this as a rough lower bound, but it was in the SIGGRAPH 2006 raytracing course.) We need a few more years of Moore's Law and a few algorithm tricks, mostly involving dynamic scenes, and things will start to get interesting. But rasterization will also increase at or above Moore's Law, and game developers will continue to find tricks and hacks to make things look right, or use partial raytracing solutions like POM.
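
A rough way to picture the SIMD divergence cost described in the quoted post. This is a toy model and not anything taken from the I3D paper: it assumes a group of rays runs in lockstep, so the whole group takes as many traversal steps as its slowest ray. The function name and the step counts are made up for illustration.

```python
def simd_utilization(steps_per_ray):
    """Fraction of SIMD lane-steps doing useful work when rays run in lockstep.

    Toy model (an assumption): every lane stays occupied until the slowest
    ray in the group finishes its traversal.
    """
    if not steps_per_ray:
        raise ValueError("need at least one ray")
    width = len(steps_per_ray)
    longest = max(steps_per_ray)
    if longest == 0:
        return 1.0  # nothing to do, so nothing is wasted
    return sum(steps_per_ray) / (width * longest)


# Coherent rays: all four need the same number of traversal steps.
coherent = simd_utilization([10, 10, 10, 10])
# Divergent rays: one ray walks much deeper into the kd-tree than the others.
divergent = simd_utilization([10, 2, 2, 2])
print(coherent, divergent)
```

In this model the divergent group wastes more than half of its lane-steps, which is one way to read the remark about regaining ray coherence.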

Applefiend
12-22-2006, 12:22 PM
And you can do all the stuff they're doing on PS3 Linux. No RSX needed.

Garfunkel
12-22-2006, 01:05 PM
We don't know if they are talking about the PS3 Cell or the other revisions.

We still need RSX.

Nice info ND.

frosty
12-22-2006, 01:39 PM
Not for raytracing, it can be done entirely on Cell. Real time raytracing, not so much.

False_Messiah
12-22-2006, 07:22 PM
OK, Cell is good at ray tracing, but it won't be doing only that. AI, physics, and all the rest will probably have a huge impact on performance (I guess...).

Lekko
12-23-2006, 01:39 AM
I wonder... for that 400M rays/sec, is that for HD? What resolution are they rendering at? If 1080p is 2 million pixels, then each pixel would need about 200 rays? From my limited understanding of raytracing, that doesn't sound quite right...

I wonder if realtime raytracing could be pulled off on PS3 in 480p, maybe even 480i.

hmmm... the PS3 store needs to offer these tech demo programs. Or have a site to download from to try out yourself on Linux, it would be cool to see how it looks.

xbdestroya
12-23-2006, 01:56 AM
We don't know if they are talking about the PS3 Cell or the other revisions.

All versions of Cell at the present time are in the PS3 revision family, so performance would be the same.

That said... even though we're all fans of raytracing - and it does show what sort of a beast Cell is - graphics in the games we buy at retail are not going to be raytracing-based, at least not as the primary rendering method.

Garfunkel
12-23-2006, 03:35 AM
Cell might be able to do raytracing, but it would kill performance; that is why we need RSX.

cpiasminc
12-23-2006, 04:39 AM
I wonder... for that 400M rays/sec, is that for HD? What resolution are they rendering at? If 1080p is 2 million pixels, then each pixel would need about 200 rays? From my limited understanding of raytracing, that doesn't sound quite right...
I would expect that to be an aggregate sum of all types of rays -- primary rays, (recursive) secondary rays, shadow rays, (recursive) reflected/refracted rays. Everything put together can probably easily add up, also keeping in mind that there will be multiple primary rays per pixel for AA purposes. And a single lightsource could get several shadow ray samples per pixel in order to generate soft shadows.
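
To make the "everything put together adds up" point concrete, here is a minimal counting sketch. The model is an assumption for illustration only: every traced ray hits a surface, every hit spawns the same number of secondary rays up to a fixed depth, and every hit fires the same number of shadow rays per light. All parameter names are hypothetical.

```python
def rays_per_pixel(aa_samples, secondary_per_hit, max_depth,
                   lights, shadow_samples_per_light):
    """Count rays traced for one pixel under a simple Whitted-style model.

    Assumptions: every ray hits something, and each hit spawns
    `secondary_per_hit` reflected/refracted rays until `max_depth`.
    """
    # Primary ray plus all recursive secondary rays: 1 + b + b^2 + ... + b^d
    traced = sum(secondary_per_hit ** level for level in range(max_depth + 1))
    # Each of those rays ends in a hit that is shaded with shadow rays.
    shadow = traced * lights * shadow_samples_per_light
    return aa_samples * (traced + shadow)


# Plain raycasting with one hard shadow: 1 primary + 1 shadow ray.
simple = rays_per_pixel(1, 0, 0, 1, 1)
# 4x AA, reflection + refraction to depth 2, two soft-shadowed lights.
heavy = rays_per_pixel(4, 2, 2, 2, 4)
print(simple, heavy)
```

Under these made-up settings the count goes from 2 rays to a few hundred per pixel, so the aggregate figure depends heavily on which effects are switched on.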

I wonder if realtime raytracing could be pulled off on PS3 in 480p, maybe even 480i.
Simply put... No.
Rays per second is not the problem -- Cell has the computational power. It's texture accesses and arbitrarily located scene geometry accesses that are the problem. You're going to get memory-access bound long, long, long before you're CPU bound. If you look at it from the point of how much complexity there is in a given in-game scene, there's a hell of a lot of data, none of which can be used in immediate mode renders (unlike a rasterizer path on a GPU), and a raytracer has really unpredictable access patterns, which is utter poison to any and all memory architectures.

In the magical land of Oz where any point in memory can be read in 1/3 of a nanosecond, and XDR can be widened to a 512kbit-wide bus for a dollar, and the government has increased the speed limit (of light) by about 800x, we might not have an issue.
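
For a sense of scale on the speed-of-light joke, a quick calculation of how far light travels in 1/3 of a nanosecond and in one clock cycle at 3.2 GHz (the Cell clock in the PS3). This is an idealized vacuum figure; electrical signals in real wires travel slower than this.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by SI definition


def light_travel_cm(seconds):
    """Distance light covers in a vacuum in the given time, in centimetres."""
    return SPEED_OF_LIGHT_M_PER_S * seconds * 100.0


per_third_ns = light_travel_cm(1e-9 / 3)    # the "1/3 of a nanosecond" above
per_cell_cycle = light_travel_cm(1 / 3.2e9)  # one cycle at 3.2 GHz
print(round(per_third_ns, 2), round(per_cell_cycle, 2))
```

Both come out just under 10 cm, which is why a round trip to off-chip memory cannot fit in a single cycle no matter how fast the memory itself is.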

Arnaud_M
12-23-2006, 07:02 AM
I wonder... for that 400M rays/sec, is that for HD? What resolution are they rendering at? If 1080p is 2 million pixels, then each pixel would need about 200 rays? From my limited understanding of raytracing, that doesn't sound quite right...

Do not forget that you would have more than one frame per second :-) So for 30 fps, you would have 400/(2*30) ≈ 6.7 rays on average per pixel. Some pixels would use more (reflection and/or refraction), some less, and around 6 on average seems perfectly plausible.
Arnaud
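
The same arithmetic as a small sketch, using the thread's round figures (400M rays/s, 2 million pixels, 30 fps). The function name is just for illustration.

```python
def rays_per_pixel_per_frame(rays_per_second, pixels, fps):
    """Average ray budget for one pixel in one frame."""
    return rays_per_second / (pixels * fps)


budget_30fps = rays_per_pixel_per_frame(400e6, 2_000_000, 30)
budget_60fps = rays_per_pixel_per_frame(400e6, 2_000_000, 60)
print(round(budget_30fps, 2), round(budget_60fps, 2))
```

At 60 fps the same ray rate leaves only about half as many rays per pixel.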

Lekko
12-23-2006, 08:13 AM
true.. but still, that has to be enough to render 30 fps at some resolution.

One thing that I was wondering, that I previously thought, was that raytracing is somewhat static in nature. With conventional rendering, you need exponentially more power to render more things happening. In theory, one would think that a good raytracer would be able to render any scene regardless.

Lemme put it this way: If Cell could render 60 M rays per second, it wouldn't really matter what it was rendering, it would always be 60 M per second. If there were more lighting sources and far more reflective surfaces, it would require more rays/pixel, but it would still process 60 M per second. Or is that not quite right?

Arnaud_M
12-23-2006, 08:24 AM
Lemme put it this way: If Cell could render 60 M rays per second, it wouldn't really matter what it was rendering, it would always be 60 M per second. If there were more lighting sources and far more reflective surfaces, it would require more rays/pixel, but it would still process 60 M per second. Or is that not quite right?

That's broadly correct. The number of rays per second would decrease only if more complex computations were used per ray (such as a more complex illumination or shadowing algorithm). More complex calculations for each ray = fewer rays per second. But of course, if you keep rendering the same number of frames per second with more reflections and light sources, and the same number of rays per second, then the quality of each frame would degrade.

Arnaud
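
Turning that around: with a fixed ray rate, the frame rate is whatever is left after dividing by resolution and rays per pixel. The sketch below assumes 640x480 for "480p" and takes the 60M rays/s figure at face value; it ignores memory access costs entirely, so treat it as an upper bound at best.

```python
def frames_per_second(rays_per_second, width, height, rays_per_pixel):
    """Frame rate achievable if ray throughput is the only limit."""
    return rays_per_second / (width * height * rays_per_pixel)


fps_6_rays = frames_per_second(60e6, 640, 480, 6)
fps_12_rays = frames_per_second(60e6, 640, 480, 12)
print(round(fps_6_rays, 1), round(fps_12_rays, 1))
```

Doubling the rays per pixel (more lights, more reflective surfaces) halves the frame rate at the same ray throughput, or forces a drop in quality to hold the frame rate.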

Garfunkel
12-23-2006, 08:35 AM
In any case I don't think we will be seeing this in any major way. Maybe on GT6 ;)

LaLiLuLeLo
12-23-2006, 08:41 AM
I find it interesting (and disheartening) that the speed of light is too slow for processors these days.

Garfunkel
12-23-2006, 10:09 AM
I find it interesting (and disheartening) that the speed of light is too slow for processors these days.

It has marked a point in time where human innovation and intelligence have grown too strong for nature to keep up with.

MegaGrid
12-23-2006, 12:25 PM
Simply put... No.
Rays per second is not the problem -- Cell has the computational power. It's texture accesses and arbitrarily located scene geometry accesses that are the problem. You're going to get memory-access bound long, long, long before you're CPU bound. If you look at it from the point of how much complexity there is in a given in-game scene, there's a hell of a lot of data, none of which can be used in immediate mode renders (unlike a rasterizer path on a GPU), and a raytracer has really unpredictable access patterns, which is utter poison to any and all memory architectures.


They are proposing a hybrid CPU/GPU scheme. It will be interesting seeing their paper in the new year.
Anyway I am glad to see that the interest of some people here is also on how to do things and not just how to play things. :-)

Also, next time someone mentions the wonderful work of a person, he/she should mention that person's name, not just a nickname which may or may not be the real name. I just happen to know where to search in order to find him. Others don't.
For anyone interested, this specific scientist is Mike Houston ( http://graphics.stanford.edu/~mhouston/ )
Please do not spam him. :-)
Which Cell-based system are they using? A Cell processor-based blade system.
It would be interesting if he could port some of his work to a PS3 system.

I said it and I am going to say it a thousand times. The lack of RSX access is a serious bottleneck for PS3 Linux.

Garfunkel
12-23-2006, 12:59 PM
I said it and I am going to say it a thousand times. The lack of RSX access is a serious bottleneck for PS3 Linux.

Exactly.

cpiasminc
12-23-2006, 06:31 PM
Do not forget that you would have more than one frame per second :-) So for 30 fps, you would have 400/(2*30) ≈ 6.7 rays on average per pixel. Some pixels would use more (reflection and/or refraction), some less, and around 6 on average seems perfectly plausible.
D'oh... now that you mention it, I overlooked that. In my defense, I had about 6 beers prior to posting that, so I'll shift the blame there... In that context, it sounds like that is a scale for more or less raycasting with shadows -- which pretty much covers the entirety of what rasterizers can do. When they say "raytracing", I always end up thinking full-blown Whitted raytracing.

Though I will say that when that's taken into account, the figure of 400M Rays sounds way, way too low. For a game running at 30 fps, you're more than likely CPU-limited (well, more like limited "on the CPU side"), so you don't want rendering to take up more than a medium-small fraction of that 33 ms. Especially not for a raytracer which doesn't render in immediate mode, and needs to buffer the whole scene.
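
A back-of-the-envelope sketch of that argument: if rendering may only use part of each frame, the ray rate has to go up to fit the same rays into the smaller slice. The 25% figure below is a made-up example of a "medium-small fraction", not a number from the thread.

```python
def required_ray_rate(rays_per_frame, fps, render_fraction):
    """Rays/s needed to trace a frame's rays in a fraction of the frame time."""
    if not 0.0 < render_fraction <= 1.0:
        raise ValueError("render_fraction must be in (0, 1]")
    frame_time_s = 1.0 / fps
    return rays_per_frame / (frame_time_s * render_fraction)


rays_per_frame = 400e6 / 30  # what 400M rays/s delivers per frame at 30 fps
whole_frame = required_ray_rate(rays_per_frame, 30, 1.0)
quarter_frame = required_ray_rate(rays_per_frame, 30, 0.25)
print(round(whole_frame / 1e6), round(quarter_frame / 1e6))
```

With only a quarter of the 33 ms available, the same image needs roughly 1600M rays/s rather than 400M.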

I find it interesting (and disheartening) that the speed of light is too slow for processors these days.
I always get a good laugh when people talk of optical interconnect techniques "finally enabling us to send signals at the speed of light." The speed of light is slow, and it's not as if we haven't already been doing that. The real value of optical is being able to do lots of simultaneous asynchronous signaling.

They are proposing a hybrid CPU/GPU scheme. It will be interesting seeing their paper in the new year.
Well, you can easily use the GPU to fill in a g-buffer (akin to deferred rendering schemes) to accelerate the first hit, and assuming you're doing regular raycasting with shadows (no reflection/refraction on CPU-side), the CPU only needs to access the scene data for shadow tests. From an algorithms side, that's not particularly interesting -- it's been done a million times before... It'd only be of interest to see how well Cell does perform for such a situation.
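
A minimal sketch of that split, under stated assumptions: the g-buffer is stood in for by a plain list of world-space hit positions (in practice the GPU would produce it), the scene is a list of spheres, and there is one point light. All names here are hypothetical; this is not the setup from the I3D paper.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def in_shadow(point, light, spheres, eps=1e-4):
    """True if any sphere blocks the segment from point to light."""
    to_light = sub(light, point)
    dist = math.sqrt(dot(to_light, to_light))
    if dist == 0.0:
        return False  # the point sits on the light itself
    direction = tuple(c / dist for c in to_light)
    for center, radius in spheres:
        # Solve |point + t*direction - center|^2 = radius^2 for t.
        oc = sub(point, center)
        b = dot(oc, direction)
        c = dot(oc, oc) - radius * radius
        disc = b * b - c
        if disc < 0.0:
            continue  # the ray misses this sphere entirely
        root = math.sqrt(disc)
        # A hit strictly between the surface point and the light occludes it.
        if any(eps < t < dist for t in (-b - root, -b + root)):
            return True
    return False


def shadow_mask(gbuffer_positions, light, spheres):
    """One shadow ray per g-buffer sample; no primary rays are traced."""
    return [in_shadow(p, light, spheres) for p in gbuffer_positions]


gbuffer = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]   # stand-in for rasterized hits
light = (0.0, 10.0, 0.0)
spheres = [((0.0, 5.0, 0.0), 1.0)]              # one occluder under the light
mask = shadow_mask(gbuffer, light, spheres)
print(mask)
```

The only scene data the shadow pass touches is the occluder list, which is the point being made about the CPU side.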

MegaGrid
12-23-2006, 08:14 PM
Well, you can easily use the GPU to fill in a g-buffer (akin to deferred rendering schemes) to accelerate the first hit, and assuming you're doing regular raycasting with shadows (no reflection/refraction on CPU-side), the CPU only needs to access the scene data for shadow tests. From an algorithms side, that's not particularly interesting -- it's been done a million times before... It'd only be of interest to see how well Cell does perform for such a situation.

A small correction to what you said:
No shadows/reflection/refraction on the CPU side.

lips
12-24-2006, 04:41 AM
Would ray-tracing instead of conventional rasterizing help to reduce the number of artifacts due to digitization, or would the artifacts be exactly the same? Does ray-tracing an image give a large increase in visual fidelity over conventional techniques?

Lucent Beam
12-24-2006, 05:38 AM
PhD student in CompSci at Stanford?
I wonder if he knows my ex.