Cell Rendering Graphics - video inside
Insane Metal
03-25-2006, 08:13 PM
Much has been said about Cell's presumed inability to texture map well. Given the small (256KB) local stores and DMA memory access, the SPEs were relegated by many to only handle nice streaming geometry type workloads. This seemed like an issue ripe for a little prototyping.
First, colleague Mark Nutter implemented a software cache abstraction layer for the SPE, giving us the ability to both hide the complexity of DMAs and benefit from transparent data reuse. Next, given the lessons learned from this paper, we tiled our textures, optimized our access patterns, and implemented several cache replacement policies. We then rewrote the shader in the Quaternion Julia Set Raytracer to add five cubemap texture lookup passes: 3 refraction lookups, a reflection lookup, plus a background lookup. These five texture lookups were then blended together with a fresnel calculation and modulated with the base lighting computation to form the final sample color.
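To make "we tiled our textures" a little more concrete, here is a minimal sketch of tiled texel addressing. It is not the authors' code: the 8×8 tile size and the function names are assumptions chosen for illustration. The idea is that texels which are neighbours on screen end up close together in memory, so one fetched block covers a small 2D patch instead of a thin horizontal strip.

```python
def linear_offset(x, y, width):
    """Texel index in a plain row-major layout."""
    return y * width + x


def tiled_offset(x, y, width, tile=8):
    """Texel index when the image is stored as tile x tile blocks.

    The tile size (8) is a hypothetical choice, not a figure from the post.
    Assumes width is a multiple of tile.
    """
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    within_tile = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + within_tile


# Moving one texel down jumps a whole row in the linear layout,
# but only one tile row (8 texels) in the tiled layout.
linear_step = linear_offset(0, 1, 1024) - linear_offset(0, 0, 1024)
tiled_step = tiled_offset(0, 1, 1024) - tiled_offset(0, 0, 1024)
```

With a 1024-texel-wide face, `linear_step` is 1024 texels while `tiled_step` is 8, which is why a small cache line is far more likely to already hold the vertical neighbour in the tiled layout.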
The results were very pleasing.
We found that even with small 4-way set associative software cache sizes (8 KB), miss rates for this renderer were a low 7% and hit access times were only 12 SPE cycles.
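For anyone unfamiliar with the term, a 4-way set-associative cache can be sketched roughly as below. This is only a toy model to show the lookup and replacement logic, not the SPE implementation described in the post. The 8 KB size and 4 ways come from the post; the 128-byte line size, the LRU policy, and every name in the code are assumptions.

```python
class SoftwareCache:
    """Toy 4-way set-associative cache with LRU replacement."""

    def __init__(self, size_bytes=8 * 1024, line_bytes=128, ways=4):
        # line_bytes=128 is an assumed value, not one given in the post.
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # Each set holds up to `ways` tags, least recently used first.
        self.sets = [[] for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record one access; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            # Hit: move the tag to the most-recently-used position.
            entries.remove(tag)
            entries.append(tag)
            self.hits += 1
            return True
        # Miss: a real implementation would DMA the line in here.
        if len(entries) == self.ways:
            entries.pop(0)  # evict the least recently used line
        entries.append(tag)
        self.misses += 1
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0
```

Feeding a renderer's texel addresses through `access` and reading `miss_rate()` afterwards is the kind of measurement the 7% figure refers to, although the real number depends entirely on the access pattern.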
Using only seven 3.2 GHz SPEs we were able to raytrace 15 frames per second with a frame resolution of 1024×1024. The texture buffer held a cubemap with 1024×1024 faces of 16-bit texels, resulting in a 12.5 MB texture buffer in XDR system memory. The performance penalty for using the five pass texture shader vs the lighting only shader was just 13%.
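The figures above can be sanity-checked with a few lines of arithmetic. This is just a back-of-the-envelope sketch: the six faces follow from it being a cubemap, and the one-sample-per-pixel figure is an assumption, since the post does not say how many samples were traced per pixel.

```python
FACES = 6                  # a cubemap has six faces
FACE_WIDTH = 1024
FACE_HEIGHT = 1024
BYTES_PER_TEXEL = 2        # 16-bit texels

texture_bytes = FACES * FACE_WIDTH * FACE_HEIGHT * BYTES_PER_TEXEL
texture_mib = texture_bytes / 2**20   # binary megabytes
texture_mb = texture_bytes / 10**6    # decimal megabytes

FRAME_PIXELS = 1024 * 1024
FRAMES_PER_SECOND = 15
LOOKUPS_PER_SAMPLE = 5     # 3 refraction + 1 reflection + 1 background

# Assumes one sample per pixel, which the post does not state.
samples_per_second = FRAME_PIXELS * FRAMES_PER_SECOND
lookups_per_second = samples_per_second * LOOKUPS_PER_SAMPLE

print(texture_bytes, texture_mib, round(texture_mb, 2))
print(samples_per_second, lookups_per_second)
```

The buffer works out to 12,582,912 bytes, which is exactly 12 MiB or about 12.58 decimal MB, so the quoted 12.5 MB is in the right range. Under the one-sample-per-pixel assumption the demo performs roughly 78.6 million cubemap lookups per second across the seven SPEs.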
Our miss handler was implemented as a blocking function, and we still have ideas pending to further reduce the 12-cycle software cache hit access time, so we believe the 13% performance gap between the two shaders will continue to close.
Video: http://www.gametomorrow.com/minor/barry/julia.mov
~15MB
Source http://gametomorrow.com/blog/index.php/2006/03/24/cell-cant-texture/
Very cool :pirate:
Sephiroth_VII
03-25-2006, 08:17 PM
Is this the PS3 cell, or just some random IBM tech demo?
LiquidEagle
03-25-2006, 08:18 PM
A lot of this is over my head, but I got the part where they said the results are pleasing :-p. Overall though, these are Cell-only tests, so the RSX in conjunction with the Cell could definitely make the raytracing & such a possibility, right? Or are these tests they're performing exclusive only to cell and not dependent on what GPU you use?
Old_Timer!
03-25-2006, 08:26 PM
Well one thing is for sure, some devs are picking up on the system very quickly. Second generation games on PS3 are gonna be lovely...:'(
Viper
03-25-2006, 08:38 PM
That was incredible. 15 fps real time is nothing to laugh at either. I like how they opened holes periodically so you could see the background.
Hey Old timer, Ralph invented video gaming and his last video game innovation was in 1975. Nintendo is credited with innovating so much because they still do it. (in regards to your sig)
Old_Timer!
03-25-2006, 09:13 PM
Regardless of whether Nintendo is still in the business, they haven't developed any new hardware. It's all rehashed tech that they put a spin on and say look world we are truly innovative... please can someone name a new technology Nintendo developed... Anyways, sorry for going off topic guys, this demo is definitely looking great.
Sephiroth_VII
03-25-2006, 09:35 PM
Reading through the article again, I noticed that the CELL they used for the demo seems equivalent to the one in PS3. So with RSX as support, this should be a piece of cake for the PS3.
Domination
03-25-2006, 09:47 PM
Regardless of whether Nintendo is still in the business, they haven't developed any new hardware. It's all rehashed tech that they put a spin on and say look world we are truly innovative... please can someone name a new technology Nintendo developed... Anyways, sorry for going off topic guys, this demo is definitely looking great.
To be honest with you, it has less to do with pioneering the idea than with realising its potential in a way it has never been realised before. For instance, the stylus is a wonderful example. Using it for games has been incredibly successful and innovative, IMO. Yet, it has been out for years now.
But if we need to get technical, Sony is planning on using a GPS attachment to make games more innovative. :smile:
jaxmkii
03-25-2006, 11:01 PM
sucks that it's in QT, for some reason my comp doesn't agree with it
LiquidEagle
03-25-2006, 11:01 PM
Reading through the article again, I noticed that the CELL they used for the demo seems equivalent to the one in PS3. So with RSX as support, this should be a piece of cake for the PS3.
That's what I was thinking :)
unless this is the kind of stuff the RSX doesn't even have an influence on? Can a resident techie here help us out? :-D
Viper
03-25-2006, 11:07 PM
sucks that it's in QT, for some reason my comp doesn't agree with it
Copy the link and paste it in other media players until one works.
jaxmkii
03-25-2006, 11:13 PM
^^^ i just watched it on one of my lesser systems...
i hope this tech finds its way into games
BlueTsunami
03-25-2006, 11:21 PM
I believe this demo was originally meant for a GPU. The code has been rewritten to work on the Cell. That's just amazing stuff.
Note: Apparently it IS a raytracing demo for GPUs, BUT it's been rewritten for Cell, with texturing implemented as a proof of concept for Cell.
LiquidEagle
03-26-2006, 12:08 AM
So this is the Cell acting as a GPU, without a stand-alone GPU for the Cell to assist (or to assist the Cell)?
So Tsunami (or anybody else with knowledge who wants to comment), what do you think we can expect with the RSX thrown into the mix?
BlueTsunami
03-26-2006, 12:20 AM
So Tsunami (or anybody else with knowledge who wants to comment), what do you think we can expect with the RSX thrown into the mix?
I can't really answer that myself (I lack the knowledge of some of the more elite members here), but all in all we should be getting amazing results with mixed rendering, with Cell through software and RSX through hardware. Apparently Warhawk is already doing this in some form ;)
cpiasminc
03-26-2006, 04:40 AM
unless this is the kind of stuff the RSX doesn't even have an influence on? Can a resident techie here help us out?
This isn't the kind of thing that any GPU was really made to do. It's just that the original version that did run on a GPU depended on the fact that it didn't use any sort of texturing, so solving for implicit surface ray intersection was a purely computational issue. And raw computational power is something both GPUs and CELL have in spades.
There was a predecessor to this one on CELL that also simply relied on no texture to get really fast results at much higher precision (many more iterations to the surface solver), and of course, CELL ran faster than the G70. Largely because data streaming and locality of framebuffer block caches in software meant that the raw computational advantage went to CELL because of its much higher clock rate and much lower latency to dump out to local stores. Particularly in branch-heavy components of the test, where GPUs are disgustingly slow.
The thing is that in a game setting, CELL is well suited to vertex processing and not pixel processing because the SPEs just don't have the thread granularity to cover up texture read latencies. And that is still true regardless of what the authors of this demo might say. This demo is showing that for a single object of concern and a single texture to render, utilising low iterations on the refinement of the surface means you get a lot of temporal locality to your texture accesses and then CELL can handle it through software texture caching.
This has precisely zero impact on any game that will ever exist other than to say, "predictable access patterns and streaming computational tasks run well on the SPEs" which is something we all knew already.
Copy the link and paste it in other media players until one works.
Interestingly, I can play it in VLC, but not in QT, even after a complete update. Makes me wonder if it was an SGI .mov rather than a QT .mov.
but all in all we should be getting amazing results with mixed rendering, with Cell through software and RSX through hardware. Apparently Warhawk is already doing this in some form
Cell won't really be rendering anything in practice. It will basically preprocess and postprocess. The rendering phase will be RSX all the way. What Warhawk does is much along those lines as well, just that it happens to be using raytests to do that. And it's dealing with a more explicit problem where textures are not a concern. Though I'll admit that's based on an educated guess.
CreativeWriter
03-26-2006, 06:01 AM
This isn't the kind of thing that any GPU was really made to do. It's just that the original version that did run on a GPU depended on the fact that it didn't use any sort of texturing, so solving for implicit surface ray intersection was a purely computational issue. And raw computational power is something both GPUs and CELL have in spades.
There was a predecessor to this one on CELL that also simply relied on no texture to get really fast results at much higher precision (many more iterations to the surface solver), and of course, CELL ran faster than the G70. Largely because data streaming and locality of framebuffer block caches in software meant that the raw computational advantage went to CELL because of its much higher clock rate and much lower latency to dump out to local stores. Particularly in branch-heavy components of the test, where GPUs are disgustingly slow.
The thing is that in a game setting, CELL is well suited to vertex processing and not pixel processing because the SPEs just don't have the thread granularity to cover up texture read latencies. And that is still true regardless of what the authors of this demo might say. This demo is showing that for a single object of concern and a single texture to render, utilising low iterations on the refinement of the surface means you get a lot of temporal locality to your texture accesses and then CELL can handle it through software texture caching.
This has precisely zero impact on any game that will ever exist other than to say, "predictable access patterns and streaming computational tasks run well on the SPEs" which is something we all knew already.
Interestingly, I can play it in VLC, but not in QT, even after a complete update. Makes me wonder if it was an SGI .mov rather than a QT .mov.
Cell won't really be rendering anything in practice. It will basically preprocess and postprocess. The rendering phase will be RSX all the way. What Warhawk does is much along those lines as well, just that it happens to be using raytests to do that. And it's dealing with a more explicit problem where textures are not a concern. Though I'll admit that's based on an educated guess.
Wow, so far above my tech understanding it's not even funny, but a fun read. Thanks for the answers. I'm starting to pick up some of the jargon (SPEs, GPU/CPU, etc), but I've still got a ways to go...
Smokey
03-26-2006, 11:22 AM
it worked as qt on mine just took a while to start
yeah, it automatically played with QT.
in any case, what is the significance of this video? all I see is reflecting light. this was to show what exactly?
xbdestroya
03-26-2006, 05:02 PM
Just watched it - pretty awesome.
Thanks for that great rundown by the way Cpi.
yeah, but forgive my limited tech knowledge. I see nice reflections. I have been seeing this kind of stuff in screen savers, commercials, etc. I know this is being done real time, but what does this show exactly? Cell calculation of light rays, the GPU's ability in a certain task, what?
cpiasminc
03-27-2006, 05:40 PM
It's actually a refraction demo, and moreover, one that actually simulates spectral dispersion (though you can clearly see that they're just using 3 different IORs for red, green, and blue). But, yeah, this is a demo completely about Cell, and the main point isn't so much about the calculation of refracted light rays, but about Cell being able to perform texture accesses efficiently.
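To illustrate the "3 different IORs" trick, here is a rough sketch of how one incoming ray can be refracted three times, once per colour channel, with each resulting direction then used for its own texture lookup. This is not taken from the demo's source: the IOR values and all names are made up for illustration, and the refraction formula is the standard Snell's-law vector form.

```python
import math


def refract(incident, normal, eta):
    """Refract a unit vector through a surface with unit normal.

    eta is the ratio n_outside / n_inside. Returns None on total
    internal reflection.
    """
    cos_i = sum(i * n for i, n in zip(incident, normal))
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None
    scale = eta * cos_i + math.sqrt(k)
    return tuple(eta * i - scale * n for i, n in zip(incident, normal))


# Hypothetical per-channel indices of refraction (not from the demo).
CHANNEL_IOR = {"red": 1.50, "green": 1.52, "blue": 1.54}


def dispersed_directions(incident, normal):
    """One refracted direction per colour channel, entering from air."""
    return {
        channel: refract(incident, normal, 1.0 / ior)
        for channel, ior in CHANNEL_IOR.items()
    }


# A ray hitting a surface facing +z at 45 degrees.
ray = (math.sin(math.pi / 4), 0.0, -math.cos(math.pi / 4))
directions = dispersed_directions(ray, (0.0, 0.0, 1.0))
```

Because the blue channel uses the highest index, its direction bends furthest toward the normal, so the three lookups land on slightly different texels and you get the coloured fringes visible in the video.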
koldfuzion
03-27-2006, 06:11 PM
whoa
But, yeah, this is a demo completely about Cell, and the main point isn't so much about the calculation of refracted light rays, but about Cell being able to perform texture accesses efficiently.
now I see. so how do other current CPUs handle it compared to this?
cpiasminc
03-27-2006, 11:58 PM
now I see. so how do other current CPUs handle it compared to this?
As far as the computational side of it, Cell is far, far superior. As far as the texture memory access side of things, assuming you do some decently planned out prefetches, a modern CPU might do a little better overall -- mainly because it has a totally automated cache, whereas with the SPEs, you have to do all the caching explicitly.
That's not entirely surprising given the extent to which we could rasterize things in software back in the day. Obviously, the texture volume we could handle was a lot more than one texture, even though they were lower resolution. The thing is that when it comes down to caching and memory access patterns, larger contiguous blocks are less of a problem than multiple separate ones that could be scattered about in memory space.
thanks for clearing that one up :)
venomv
03-28-2006, 03:29 AM
a modern CPU might do a little better overall -- mainly because it has a totally automated cache, whereas with the SPEs, you have to do all the caching explicitly.
Does that mean that the CELL does have a higher potential if they get the SPE's working 'perfectly' within reason, or would that just equal it out?
cpiasminc
03-28-2006, 05:33 AM
Does that mean that the CELL does have a higher potential if they get the SPE's working 'perfectly' within reason, or would that just equal it out?
Well, caches by their nature aren't really "perfect," so having human knowledge of the access patterns as input means that the software caching of the SPEs has greater potential towards becoming "perfect" insofar as avoiding misses. But there's a hell of a lot of work to doing that (and a type of work that young programmers aren't really being taught to do in this day and age -- god I feel old having said that), and it's highly unlikely that it would be possible outside a limited domain.