Programming/Development Discussion Thread


Matt
12-19-2004, 11:09 PM
Discuss anything programming-related here, whether it be XNA or Collada. Just remember to keep it on-topic please.

Pumster
12-20-2004, 06:05 AM
Well, this is the first time I've visited this section of the forum in months - I'm surprised by all the closed threads! Anyway, this is definitely great news - especially after hearing the anti-PS2 remarks from Mikami, the creator of Resident Evil. Assuming the PS3 gains as much momentum as the PS2, we can expect more high-quality titles (or more games in general) and, hopefully, more praise from popular third-party developers.

Rallyracr420
12-20-2004, 06:25 AM
I know deadmeat and cpiasminc both disagree with me here, but I still say the PS3 programmers won't have to feed each Cell core separately. The Cell is supposed to be scalable and feed off of other Cells. As a programmer you're never going to know how many Cells are going to be at your disposal. There's no way you can create code to send data to Cells that may or may not be there. I believe all this code routing will take place within the hardware itself, to be aided by the development environment that Sony will provide (i.e. Collada and more). Without having to code in assembly, the PS3 should be just as easy to develop for as the Xbox, Xbox2, PC or any other platform programmed in a high-level language.

The_One
12-22-2004, 04:57 AM
Well, this is the first time I've visited this section of the forum in months - I'm surprised by all the closed threads! Anyway, this is definitely great news - especially after hearing the anti-PS2 remarks from Mikami, the creator of Resident Evil. Assuming the PS3 gains as much momentum as the PS2, we can expect more high-quality titles (or more games in general) and, hopefully, more praise from popular third-party developers.

What? When did Mikami make those remarks? Got any links?!?! :shock:

Z
12-22-2004, 12:25 PM
The Cell is supposed to be scalable and feed off of other Cells. As a programmer you're never going to know how many Cells are going to be at your disposal. There's no way you can create code to send data to Cells that may or may not be there. I believe all this code routing will take place within the hardware itself, to be aided by the development environment that Sony will provide (i.e. Collada and more). Without having to code in assembly, the PS3 should be just as easy to develop for as the Xbox, Xbox2, PC or any other platform programmed in a high-level language.

Like PSP, PS3 will have built-in hardware and software that do a lot of the work. That way, developers will have more time to focus on more important things rather than routine coding for the little parts. A person working on a PS3 title told PSM that PS3 is VERY easy to develop for. Whether it will be easier than Xenon or not remains to be seen. In any case, such minor differences won’t be a major factor - look at the crazy PS2 support despite the complaining of numerous developers.

Marjoh
12-22-2004, 08:41 PM
Well, this is the first time I've visited this section of the forum in months - I'm surprised by all the closed threads! Anyway, this is definitely great news - especially after hearing the anti-PS2 remarks from Mikami, the creator of Resident Evil. Assuming the PS3 gains as much momentum as the PS2, we can expect more high-quality titles (or more games in general) and, hopefully, more praise from popular third-party developers.

What? When did Mikami make those remarks? Got any links?!?! :shock:

I think it has more to do with the Sony-Capcom drama rumor. Mikami seemed happy with the PS2 before he ditched Devil May Cry and went to Nintendo. He was complaining about how hard the PS2 is to program, and he even went as far as saying Sony makes defective hardware (he said his PSX broke fast). I don't know how much Nintendo paid Capcom, but now they're moving back to Sony because they learned it's more profitable.

I can't provide any link, but I'll look into it. BTW, this was a few years ago.

EDIT:

Capcom's Shinji Mikami criticizes Sony and Square

Tuesday, August 27, 2002 - Shinji Mikami, producer of Resident Evil and Devil May Cry, strongly criticized Sony and Square in a recent radio interview in Japan.

Mikami accused Sony of purposely designing their consoles to break easily so that gamers will have to buy a replacement. He also said that Sony's high sales figures are helped by the fact that many gamers, himself included, have had to buy a second PlayStation and PlayStation 2.

From there, Mikami went on to accuse Sony of doing the same thing with their line of PCs, Walkmans, and cell phones. He asked why no one has complained about this and said that it was almost like cheating and committing a crime. The radio DJ tried to interrupt Mikami and shift the conversation to another topic, but when Mikami was asked if he thought Sony's customers are foolish, he replied, "Yes."

Next, Mikami said that Kingdom Hearts' strong sales are due to "aura purchase," meaning that customers are buying the game because their friends have bought it or like it, regardless of whether they like the game themselves. He said that Kingdom Hearts does not deserve its 780,000 sales or 6800 yen price tag.

Finally, he apologized to Square, saying he is upset that Kingdom Hearts has sold more copies than Biohazard for the Gamecube. He said that Biohazard is the better game, but Kingdom Hearts is not a bad game either.

Neither Square nor Sony has made any comments regarding Mikami's statements.

Source (http://www.gameforms.com/news/?414)

I guess he was just pissed that Sony was beating the crap out of Nintendo in sales figures during that period, meaning less money and stock for Capcom's so-called "Big 5". Anyone remember that announcement? And Kingdom Hearts was for PS2 and has nothing to do with Nintendo whatsoever. So what is he complaining about? Besides, it's an entirely new game, whereas the REmake is almost the same game as the original except for the viewtiful graphics.

BTW, I'm not a Capcom hater. I have been a big supporter of their series (RE, DMC, Oni, etc). They just need to stop being Nintendo-nuthuggers, which I think they already have, because they are starting to support Sony again (the VJ series, Killer 7, and RE4 are no longer exclusive to Nintendo). And hopefully this will continue on to PS3 development.

cpiasminc
12-24-2004, 12:38 AM
I know deadmeat and cpiasminc both disagree with me here, but I still say the PS3 programmers won't have to feed each Cell core separately. The Cell is supposed to be scalable and feed off of other Cells. As a programmer you're never going to know how many Cells are going to be at your disposal. There's no way you can create code to send data to Cells that may or may not be there. I believe all this code routing will take place within the hardware itself, to be aided by the development environment that Sony will provide (i.e. Collada and more). Without having to code in assembly, the PS3 should be just as easy to develop for as the Xbox, Xbox2, PC or any other platform programmed in a high-level language.
I don't completely disagree with that. And in some ways, I expect it to be easier to code for than XBox2, because with the lower scaling limits of Xbox2, I'd expect the multithreading to be explicit in nature... e.g. actually managing your own thread pools and synchronization. While .NET languages (Comega in particular) provide some nice API tools for that, it's not a radical change, so much as a cleaner interface than say, pthreads.
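To make "explicit" concrete, here is a minimal Python sketch of managing your own pool and synchronization by hand. Everything in it is illustrative: `run_pool`, `num_workers` and the squaring workload are made-up names, and three workers is only a nod to a three-core chip, not any real console API.

```python
import queue
import threading


def run_pool(jobs, num_workers=3):
    """Hand-managed worker pool: we create the threads, feed them, and lock shared state."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    results = []
    results_lock = threading.Lock()  # explicit synchronization around shared data

    def worker():
        while True:
            try:
                n = work.get_nowait()
            except queue.Empty:
                return  # no work left, thread exits
            value = n * n  # stand-in for real per-job work
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_pool(range(5)))
```

The point is how much bookkeeping (queue, lock, start/join) lands on the programmer even for a trivial job.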

I'm not so sure about the code routing taking place at the *hardware* level so much as the *OS* level. It's implicit at the game code level and rather than containing a series of low-level code chunks, it would probably be more like normal procedures issued to different APUs as apulets, but the key here is that the programmer doesn't have to worry about which resources are available and which aren't. It would instead depend on the OS to do that, and just point to which data and which code to send. Now you could make everything completely transparent, which is what languages like Cilk do... and I would indeed love the idea of a Cilk interface to C++ or C#... but having distinct issue-points does make debugging a little cleaner.
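A rough sketch of what "just point to which data and which code to send" could look like from the game-code side. This is only an analogy: `run_apulets` is a hypothetical name, and a standard-library thread pool sized from whatever the machine reports stands in for the OS deciding which units are free.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_apulets(procedure, chunks):
    """Issue one 'apulet' (procedure + data chunk) per chunk; the caller never names a unit."""
    # The runtime, not the caller, decides how many workers exist.
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in issue order, which gives a distinct issue-point per chunk.
        return list(pool.map(procedure, chunks))


if __name__ == "__main__":
    print(run_apulets(sum, [[1, 2], [3, 4], [5]]))
```

The same call works whether one worker or thirty-two are available, which is the property being argued for.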

In fact, I think the transparent Cilk-like approach would make more sense for a Xenon-like CPU or a dual-core CMP chip than for CELL in general, primarily because CELL can scale up much, much farther in the long run than 1 PE w/ 8 APUs, and explicit multithreading would be far too much of a load on the programmer's shoulders. Something like a dual-core K8 is only ever going to go slowly up to quad-core and then maybe 8 cores, and with the complexity of each core comes a lot of die space for each one you add, so it can only scale so far up at only so fast a rate. The APUs that CELL would use, on the other hand, would be very simple and low-cost, albeit individually poor performers in practice, so the individual cost is small, and CELL MPUs of several sizes can be made for various purposes even within the same process generation.

Domination
12-24-2004, 03:45 AM
I know deadmeat and cpiasminc both disagree with me here, but I still say the PS3 programmers won't have to feed each Cell core separately. The Cell is supposed to be scalable and feed off of other Cells. As a programmer you're never going to know how many Cells are going to be at your disposal. There's no way you can create code to send data to Cells that may or may not be there. I believe all this code routing will take place within the hardware itself, to be aided by the development environment that Sony will provide (i.e. Collada and more). Without having to code in assembly, the PS3 should be just as easy to develop for as the Xbox, Xbox2, PC or any other platform programmed in a high-level language.

To tell you the truth, before XNA was brought up, I honestly thought the PS3 could be a lot easier to develop for than the Xenon. But all of that seems to have changed. With Microsoft involved, the odds are a little uncertain at this time. I say that because this is an area in which I feel Microsoft is at its best. In spite of that, I feel that one does not necessarily have to outdo the other (at least not by as much as it would seem on paper) to remain a strong competitor. It's more than likely that if the APIs Sony and Microsoft are working on are at least in the same vicinity, developers won't be nearly as inclined to write either side off. So it won't really matter too much in the end once you think about it.

5ysT3m cR45h3r
12-27-2004, 09:44 PM
Yeah it doesn't make much of a difference.

VU_fan_boy
01-04-2005, 10:16 AM
Programming the PS3 efficiently will be hard, no doubt. But the same goes for the Xbox2. It won't be as hard because it only has 3 cores, but your average programmer rarely worries about where data is in memory. Your average programmer will spam the bus with requests, far too many times, not realizing what they are doing. The processors will all sit there stalling because of the horrendous number of memory accesses going on.

We'll need to shift the way we organize code and data to make it run efficiently on a PS3 and Xbox2.

Initially I imagine we'll tap only a small percentage of the power from both platforms, but especially the PS3 with its apulets. Both platforms will appear to be equally matched as most developers know how to use shaders these days and I'm guessing the GPU performance will be similar (PS3 will be better though :wink: ). But as time goes on and more parallel processing makes it into games, we'll see the PS3 fly past the Xbox2, which means they'll have to bring out the Xbox3 even earlier than planned :D .

I think the efficient use of apulets will be determined by the language used to program them. I can envisage a stream processor type approach where apulets work like shader stream processors but not specific to graphics. Knowing Sony though, that will be just one way you can use them, and I'm sure they will leave it relatively flexible. So you could utilize the apulets as threads spawned from your main code stream, describing the parallel sections using OpenMP or something similar, but I personally hope that is not possible. As I said before, most programmers don't structure data correctly in terms of locality. So if you can easily spawn apulet threads from main code (C/C++) to work on a loop or something, bus stalls will occur too frequently. The system will be used incorrectly and therefore inefficiently. The only way to force good structure is to take an approach similar to stream processors in shaders. You can only access very local data, i.e. stream input data and constants. And you can only output locally too; that local output is essentially buffered. To make these systems even more efficient, "pipeline style", they must use double buffering, so the 128K on an APU will need to be slashed in half. Now will Sony enforce a memory/buffer structure or allow the user to specify the structure? I'm guessing the latter; it is more difficult to manage but much more flexible. It once again comes down to flexibility over ease of use.
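The double-buffering point can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real APU runtime works: `stream_process`, `kernel` and the 4-byte item size are hypothetical, the 128K figure is the one quoted above, and the "DMA" here is an ordinary sequential slice where real hardware would overlap the transfer with the compute.

```python
LOCAL_STORE_BYTES = 128 * 1024          # local RAM figure quoted in the thread
BUFFER_BYTES = LOCAL_STORE_BYTES // 2   # double buffering halves what one batch may occupy


def stream_process(data, kernel, item_bytes=4):
    """Run `kernel` over `data` one buffer at a time, alternating between two buffers.

    The kernel only ever sees one stream item, mimicking the 'very local data only' rule.
    """
    items_per_buffer = BUFFER_BYTES // item_bytes
    buffers = [data[:items_per_buffer], []]   # prime the first buffer
    pos = items_per_buffer
    active = 0
    out = []
    while buffers[active]:
        # Fill the idle buffer with the next chunk (stand-in for an overlapped DMA).
        buffers[1 - active] = data[pos:pos + items_per_buffer]
        pos += items_per_buffer
        # Compute on the active buffer, then swap roles.
        out.extend(kernel(x) for x in buffers[active])
        active = 1 - active
    return out


if __name__ == "__main__":
    result = stream_process(list(range(40000)), lambda x: x + 1)
    print(len(result), BUFFER_BYTES)
```

Each batch is capped at 64K because the other 64K is reserved for the chunk being loaded next.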

Alejux
01-04-2005, 11:11 AM
I don't think that programming the PS3 will be particularly hard, but more like different. They have a wonderful opportunity here. It's a brand new chip, with a brand new design. It's not a patched up and repatched CPU with a million backward compatibility issues. We're talking about an opportunity to create an entirely new programming model that could have a simplicity we have not seen in a very long time.

Alejux
01-04-2005, 11:37 AM
Initially I imagine we'll tap only a small percentage of the power from both platforms, but especially the PS3 with its apulets. Both platforms will appear to be equally matched as most developers know how to use shaders these days and I'm guessing the GPU performance will be similar (PS3 will be better though :wink: ). But as time goes on and more parallel processing makes it into games, we'll see the PS3 fly past the Xbox2, which means they'll have to bring out the Xbox3 even earlier than planned :D .

I agree with you. It will take some time for programmers to mature in taking advantage of this kind of parallel programming. When that happens, the PS3 should clearly distance itself from the Xbox2.


Now will Sony enforce a memory/buffer structure or allow the user to specify the structure? I'm guessing the latter; it is more difficult to manage but much more flexible. It once again comes down to flexibility over ease of use.

Don't forget this chip is also developed by IBM. They would, IMO, try to adopt a more flexible model. Just a hunch.

VU_fan_boy
01-04-2005, 11:46 AM
Don't forget this chip is also developed by IBM.
When I say Sony, I mean all of them, Sony, IBM and Toshiba. But Sony is the main client as far as I can tell. As for memory structure, that could be enforced via the OS.


Programming the PS3 won't be that hard, but doing it efficiently will be. A big shift is needed in the way most developers structure their programs, there is a lot of legacy code out there which will take time to shed.

The CPUs (PUs) in the PS3 will not be that different from the CPU in the PS2: both are RISC, 32-bit (128-bit regs), with a few extras. You'll program them in C/C++, and the Xbox2 CPUs will be even more similar to the PS3's than the Xbox's. But the PS3 has these APUs, which I consider auxiliary processors, not the central or CPU part, although they are all part of the BE. I doubt you will program the APUs with C/C++, well, at least not in a conventional way. Learning how to harness these APUs and apulets effectively will be the hard part.

If the Xbox2 can do about 32GFLOPS, and the PS3 about 1TFLOP, I bet no one will actually get 32 times more processing out of the PS3. But if these figures are close to true it should be easy to process 4-8 times more on a PS3. But exactly what you can process is limited by the massively parallel architecture. It should be very well suited to realtime physics imho.
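For what it's worth, the ratio arithmetic above can be checked in a few lines. Both GFLOPS figures are the rumored numbers from the post, not measurements, and `ps3_utilisation_needed` is a made-up helper that additionally assumes the Xbox2 runs at its full peak.

```python
XBOX2_GFLOPS = 32.0    # rumored figure from the post
PS3_GFLOPS = 1000.0    # "about 1 TFLOP", also a rumor


def peak_ratio():
    """Paper advantage of the PS3 over the Xbox2."""
    return PS3_GFLOPS / XBOX2_GFLOPS


def ps3_utilisation_needed(speedup):
    """Fraction of PS3 peak needed to beat an Xbox2 running at full peak by `speedup` times."""
    return speedup * XBOX2_GFLOPS / PS3_GFLOPS


if __name__ == "__main__":
    print(peak_ratio())
    print(ps3_utilisation_needed(4), ps3_utilisation_needed(8))
```

Under those assumptions, a 4-8x real-world advantage only requires sustaining roughly an eighth to a quarter of the PS3's paper peak.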

Domination
01-04-2005, 04:01 PM
I don't think that programming the PS3 will be particularly hard, but more like different. They have a wonderful opportunity here. It's a brand new chip, with a brand new design. It's not a patched up and repatched CPU with a million backward compatibility issues. We're talking about an opportunity to create an entirely new programming model that could have a simplicity we have not seen in a very long time.

I totally agree with you. I'm not sure how different the Cell's architecture is from the GScube's, but I do know that the make-up of the GScube's architecture is what inspired the Cell's multi, parallel functionality. Yet developers ran into very few problems.

amod20002004
01-04-2005, 04:27 PM
Oh my god. PS3 is not going to support OpenGL 1.5 or 2.0. Read this news. :shock: The news below says that the PS3 will use Sony’s API. I want to know more about Sony’s API. Can anyone help me? :shock:


A report from Japanese web-site PC Watch suggests that the PlayStation 3 graphics processing unit will use NVIDIA’s technologies found in the current NV40 generation of its own chips as well as numerous techniques developed for the next-generation part known under the NV50 code name.

Still, despite the circuitry from the company’s desktop chips found in the GPU, according to NVIDIA’s chief Jen-Hsun Huang, the PlayStation 3 GPU has nothing to do with Microsoft Windows, Microsoft DirectX or OpenGL and will use Sony’s API for the console.

Naturally, the PlayStation 3 graphics processing unit supports XDR DRAM memory developed by Rambus. While there is nothing new in Rambus memory for Sony, NVIDIA has never worked with memory by Rambus.

Source (http://www.ps3portal.com/?view=article&article=77)

Alejux
01-04-2005, 06:27 PM
Sorry Amod, but no one can help you at the moment. Only the good folks that work for nVidia and Sony can. The fact is that no one outside the loop has any clue about anything regarding CELL and PS3 beyond the information released by them.

I've never seen a project so well guarded as this one. I keep having these visions of these geeky engineers entering mysterious phonebooths only to slide down into some top secret facility 2 miles under the earth.

About this OpenGL rumor/news...it conflicts a little with what we heard earlier about nVidia providing the middleware for the PS3. Or not. Whatever it is, I sure hope it's really easy to use, and not too alien for developers, otherwise they'll suffer from heavy competition from the Xbox2/XNA.

amod20002004
01-04-2005, 07:42 PM
Yes, I think you are right. Let's wait and see.

cpiasminc
01-04-2005, 09:11 PM
About this OpenGL rumor/news...it conflicts a little with what we heard earlier about nVidia providing the middleware for the PS3. Or not.
You can also interpret it a bit differently. It's as if to say that the GPU is not designed under the specific constraints of total compliance with certain versions of DirectX or OpenGL. Rather, it will simply support a feature set meant specifically to match what Sony has in mind.

For example, Sony can still use an OpenGL-esque API, but the thing is that there's no point in implementing several non-compliant features as API extensions the way you would on a PC, since the hardware is fixed anyway. So many of what would otherwise be extensions would be rolled in as native features. At that point, it's not really an ARB-ratified version of OpenGL anymore, but based on GL. SonyGL, for lack of a better name.

Alejux
01-04-2005, 10:03 PM
About this OpenGL rumor/news...it conflicts a little with what we heard earlier about nVidia providing the middleware for the PS3. Or not.
You can also interpret it a bit differently. It's as if to say that the GPU is not designed under the specific constraints of total compliance with certain versions of DirectX or OpenGL. Rather, it will simply support a feature set meant specifically to match what Sony has in mind.

For example, Sony can still use an OpenGL-esque API, but the thing is that there's no point in implementing several non-compliant features as API extensions the way you would on a PC, since the hardware is fixed anyway. So many of what would otherwise be extensions would be rolled in as native features. At that point, it's not really an ARB-ratified version of OpenGL anymore, but based on GL. SonyGL, for lack of a better name.


That would make sense. A SonyGL-like (OpenGL-esque) approach would guarantee developers a certain amount of familiarity and at the same time allow certain OpenGL-incompatible features made specifically to take advantage of the PS3.

As for what the lower-level programming will be like, I don't have a clue.


I don't know if people on this board have ever read this paper on CELL. I thought it was incredible, and if I were smart enough, knowledgeable enough in this area, and not as lazy (no on all three counts), I might be able to predict more precisely some possible low-level programming models. It's also kind of old, so it may contain some imprecise information that has been corrected in more recent releases.

The author was one of the VU coders at Naughty Dog ( Jak & Daxter ) for some time. Take a look:


EDIT: Removed the article and put a link to another copy of it.

Danji Edit: The link was taken down by the request of the author of the article.

xbdestroya
01-04-2005, 10:10 PM
Wow, that article's awesome.

In any event, we'll know more or less whether he was accurate in his predictions a month from now. I don't know about the rest of you, but time really seems to be flying!

stanDarsh
01-04-2005, 11:16 PM
Alejux, that is an awesome article, however, could you please remove it (or have a PSINext staff member do it), as it could get the author in a lot of trouble. I know this because it has been posted on this forum several times before.

Danji
01-04-2005, 11:22 PM
Very very good read, thanks for quoting that. I hope we do reach his prediction of 800,000,000 polys a second. It would be an incredible thing to witness...I know that it would render me speechless to see a video of that sort of intensity in graphics. To own such a complex and powerful device for only $300(maybe more) is such a huge bargain that it is beyond words!

Alejux
01-04-2005, 11:48 PM
Alejux, that is an awesome article, however, could you please remove it (or have a PSINext staff member do it), as it could get the author in a lot of trouble. I know this because it has been posted on this forum several times before.

This article was on a thread in B3D for over a year. I couldn't imagine there would be a problem showing it here. Sorry.

I removed the quote and put a link instead. Is that a problem?

5ysT3m cR45h3r
01-05-2005, 12:19 AM
I can't see the trouble it would cause.

VU_fan_boy
01-05-2005, 12:29 AM
That article states that the PUs will not be user programmable (well he questions the statement at the end), but surely they will, it seems like such a waste if they are not. And I'm wondering if the PUs will run at 4.6GHz, I hope they will but then G5s currently only run at 2.5GHz. I suppose by the time the PS3 is released a 4.6GHz PowerPC may be feasible, especially at 65nm.

No dev team is going to adopt the PS3 if it can't easily port existing code, not to mention supporting multiple platforms like the Xbox2 and PC. I really think you will get a stock CPU (PU) that you can program in a standard way but if you want more power then you will have to delve into the world of apulets. If the PS3 can only run apulets it will fall flat on its face and no one will develop for it.

The only hope I can see for an APU-only system is if the 128K local RAM is used like a cache and an APU can have random access to all of memory and run code that spans megabytes of main memory. Most games use 2MB+ of instruction memory, and that is on 32-bit CPUs; if they were running on APUs that were VLIW they might need 2-4 times as much instruction memory. It just doesn't seem feasible. The APUs are essentially DSPs, not CPUs.

stanDarsh
01-05-2005, 01:01 AM
Alejux, that is an awesome article, however, could you please remove it (or have a PSINext staff member do it), as it could get the author in a lot of trouble. I know this because it has been posted on this forum several times before.

This article was on a thread in B3D for over a year. I couldn't imagine there would be a problem showing it here. Sorry.

I removed the quote and put a link instead. Is that a problem?

Thanks for removing it. I see no problem with the link, but if there is I'm sure a staff member will remove it.

I can't see the trouble it would cause.

Some legal thing between the author and Sony, probably an NDA, which would mean he broke it with that article he wrote.

cpiasminc
01-05-2005, 01:46 AM
That article states that the PUs will not be user programmable (well he questions the statement at the end), but surely they will, it seems like such a waste if they are not.
Well, I get that idea to a certain extent from the figures and the patent info as well. If you notice, with the BBE figures, it's all based on a 4 GHz APU that can achieve at best 8 FLOPs per cycle on 4 operands giving 32 GFLOPs and having a total of 4 PUs with 8 APUs each to give you 1 TFLOP... In essence, the PUs aren't even being included in that figure, which gives me the impression that the sole purpose those PUs have is load balancing and code+data distribution... memory access and such. Everything computational would be done on APUs. Yes, it does sound like a bit of a waste, but I imagine that they'd have their hands full doing that much for 8 APUs. If not for that, I think engineers would have no qualms about putting more APUs to a PE.
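The peak figure in that paragraph is straightforward multiplication, sketched below. The constants are the ones quoted from the patent-based speculation (4 GHz, 8 FLOPs per cycle, 8 APUs per PE, 4 PEs); they are assumptions about unreleased hardware, and the PUs are deliberately left out, as in the post.

```python
APU_CLOCK_HZ = 4_000_000_000   # assumed 4 GHz APU clock
FLOPS_PER_CYCLE = 8            # assumed best case per APU
APUS_PER_PE = 8
PES = 4


def apu_gflops():
    """Peak GFLOPS of a single APU."""
    return APU_CLOCK_HZ * FLOPS_PER_CYCLE // 10**9


def chip_gflops():
    """Peak GFLOPS of all APUs combined; the PUs contribute nothing to this figure."""
    return apu_gflops() * APUS_PER_PE * PES


if __name__ == "__main__":
    print(apu_gflops(), chip_gflops())  # 32 1024
```

So the "1 TFLOP" headline is exactly 32 APUs at 32 GFLOPS each, with no room left in the sum for the PUs.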

And I'm wondering if the PUs will run at 4.6GHz, I hope they will but then G5s currently only run at 2.5GHz. I suppose by the time the PS3 is released a 4.6GHz PowerPC may be feasible, especially at 65nm.
I look at it this way. Say you have a transistor that can be switched at speeds up to 20 GHz. Not at all unlikely. There are already THz transistors in the labs of almost every manufacturer. Now if the most complex pipeline stage in your CPU has a critical path length that is, at most, 10 transistors deep, then your CPU will not be able to clock higher than 20 / 10 = 2 GHz. And it will probably be a little less due to the delays of routing signals through interconnects and what not.
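As a back-of-the-envelope sketch of that rule of thumb (the function name is made up, and it ignores the interconnect delays mentioned above, so it gives an upper bound only):

```python
def max_clock_ghz(transistor_switch_ghz, critical_path_depth):
    """Upper bound on clock speed: one cycle must cover the deepest chain of transistors."""
    if critical_path_depth <= 0:
        raise ValueError("critical path must contain at least one transistor")
    return transistor_switch_ghz / critical_path_depth


if __name__ == "__main__":
    print(max_clock_ghz(20, 10))  # 2.0
```

Halving the critical path depth doubles the bound, which is the argument for keeping each pipeline stage simple.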

Anyway, I figure the whole idea of CELL is that the APUs would be very simple in nature to the effect that the APU by itself is not a great performer, but it's cheap to make and very low transistor budget, and a whole collection of APUs makes for some high overall throughput. That suggests to me that these APUs would not be loaded with all nature of extra performance boosters like out-of-order-execution, branch prediction, SMT, multiple issue, etc. They're essentially very basic straight-forward single-issue pipelines. That in turn, makes the pipelines and each pipeline stage very simple. So with that kind of simplicity, high clocks are very easy to achieve. BTW, I don't know how much stock I put in the idea that the APUs would be VLIW in the effect of being able to issue SIMD floating point and integer instructions in parallel. I actually find it easier to believe that you can only issue instructions on separate clock cycles regardless of type.

Now the PUs on the other hand are probably a lot closer to ordinary PPC devices, albeit probably lacking in the SIMD portion since you have all these extra SIMD pipelines attached. My thinking is that since the APUs are essentially all independent anyway, they would also be independent of the PU, which is far more complex in nature. So since the PU is essentially almost a PPC in itself, I'd think that if the APUs run at 4.8 GHz or something, the PUs would run at 2.4. That's certainly a feasible speed for a PPC, and at 65nm, the effective power consumption would be low enough that you may not have to worry.

Alejux
01-05-2005, 02:55 AM
That article states that the PUs will not be user programmable (well he questions the statement at the end), but surely they will, it seems like such a waste if they are not. And I'm wondering if the PUs will run at 4.6GHz, I hope they will but then G5s currently only run at 2.5GHz. I suppose by the time the PS3 is released a 4.6GHz PowerPC may be feasible, especially at 65nm.

I really don't know how fast they'll be. But regarding its programmability, if its main task is synchronizing the APUs, I would much rather they be hard-coded, to avoid software overhead.

No dev team is going to adopt the PS3 if it can't easily port existing code, not to mention supporting multiple platforms like the Xbox2 and PC. I really think you will get a stock CPU (PU) that you can program in a standard way but if you want more power then you will have to delve into the world of apulets. If the PS3 can only run apulets it will fall flat on its face and no one will develop for it.

I agree with you, that if the programming approach becomes too alien, it might scare a lot of developers. But I doubt that will be the case.

Maybe Cell programming will somehow emulate normal linear programming and distribute processing to its APUs in an automatic but sloppy manner, while at the same time providing more control over intrinsic elements of the Cell architecture for those who venture to try...

Maybe they're counting on having a good, powerful and familiar high-level language, and having the lower-level APIs handle most of the peculiarities of Cell programming. Who knows? They didn't do a very good job of creating a developer-friendly platform with the PS2, and according to them, they will correct that mistake with the PS3.

Or maybe they are counting on the PS3 being so damned fast that developers won't mind completely changing their current programming philosophy in order to create games for it. Again...who knows.




The only hope I can see for an APU-only system is if the 128K local RAM is used like a cache and an APU can have random access to all of memory and run code that spans megabytes of main memory. Most games use 2MB+ of instruction memory, and that is on 32-bit CPUs; if they were running on APUs that were VLIW they might need 2-4 times as much instruction memory. It just doesn't seem feasible. The APUs are essentially DSPs, not CPUs.

Are you sure about that? 2MB of instructions seems like a lot! We're not talking about a whole program here, but more like subroutines. Also, according to the article, a "software cell" can use more than one APU, sharing resources between them, albeit it will have to use the PE bus, the same bus the APUs use to access the DRAM. I'm sure there will be some type of "busy/not busy" bit flags to prevent access to certain parts of the memory, since memory access will be a mess with all these independent APUs accessing it whenever they like.

Here's what he wrote (I hope there's no harm in quoting this little segment):


Unlike the VU, which used a Harvard architecture (separate program and data memories), the APU seems to use a traditional (von Neumann) architecture where the 128K of local RAM is shared by code and data. The local RAM appears to be triple-ported so that a single load or store can occur in parallel with an instruction fetch, mitigating the von Neumannism (the other port is for DMA). The connection is 256 bits wide (2 x 128 bits), so only one load or store can occur per cycle - it seems reasonable to assume therefore that the load/store instructions only occur on the integer side of the VLIW instruction, as was the case on the VU. Since there is no distinction between integer and floating point registers this works out just fine. The third RAM port attaches the APU to other components in the system and allows data to be DMAed in or out of the chip 1024 bits at a time. These DMAs can be triggered by the APU itself, which differs from the PS2 where only the host processor could trigger a DMA.

Note that the APU is not a coprocessor but a processor in its own right. Once loaded with a program and data it can sit there for years running it independently of the rest of the system. Cells can be written to use one or more APUs, thus multiple APUs can cooperate to perform a single logical task. A telling example given in the patent is where three APUs convert 3D models into 2D representations, and one APU then converts this into pixels. The implication is that PS3 will perform pure software rendering.

VU_fan_boy
01-05-2005, 10:44 AM
2MB is conservative.

Let's say 1 line of C produces on average 3 instructions. In 2MB of memory you can hold 2MB/4 = 512K instructions (4 bytes each). Divide 512K by 3 to get the number of lines of C, which is about 170K. That is about the average for a console game these days.

It is fair to say that not all of those 2MB of instructions are going to be used in any one part of the execution (graphics, AI, audio). But organizing the code so that it splits into 2MB/128K = 16 discrete sections would not be easy. In fact, you won't get 128K for instructions: if the architecture is von Neumann you need space for data. I'd say you might use 1:3 code:data in this case, which means you may only have 32K of instruction memory. Your code will be full of common functions such as maths functions and matrix and vector operations, and that common code would need to be in the local APU memory. Unless, like you say, each APU can easily access the other 7 APUs' memory on the PE, but I think that will be high latency via the DMA and not designed to fetch instructions. So the room for specialised code and common code is something like 32K. That's enough room for something like (32K/4)/3 = roughly 2600 lines of C; let's say half of that is common code, so you might get 1300 lines of specialised C to actually do something useful.
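
The estimate above can be checked with a few lines of Python. This is only a sketch of the post's arithmetic: the 4-byte instruction size, the 3 instructions per line of C, the 1:3 code:data split and the "half is common code" guess are all the post's assumptions, not measured figures, and the function name is made up for illustration.

```python
# Back-of-envelope budget for code that fits in one APU's local RAM.
# Every ratio below is an assumption taken from the post, not a measurement.
KIB = 1024
BYTES_PER_INSTRUCTION = 4      # assumes fixed 32-bit instructions
INSTRUCTIONS_PER_C_LINE = 3    # rough average assumed in the post


def lines_of_c(code_bytes):
    """Estimate how many lines of C fit in code_bytes of instruction memory."""
    return code_bytes // BYTES_PER_INSTRUCTION // INSTRUCTIONS_PER_C_LINE


game_code = 2 * KIB * KIB                  # 2MB of instructions for a whole game
local_ram = 128 * KIB                      # one APU's local RAM

whole_game_lines = lines_of_c(game_code)   # lines of C in the whole game
sections = game_code // local_ram          # pieces if all 128K held code
code_share = local_ram // 4                # 1:3 code:data split leaves 32K
per_apu_lines = lines_of_c(code_share)     # lines of C resident on one APU
specialised_lines = per_apu_lines // 2     # half assumed to be common code

print(whole_game_lines, sections, code_share // KIB,
      per_apu_lines, specialised_lines)
```

Worked exactly, the figures come out a little higher than the rounded ones in the post (about 175K lines for the whole game and about 2700 per APU), which does not change the conclusion.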

Oh, I'm assuming the APUs are not VLIW, with a 32-bit instruction width. I haven't done the math yet, but anything wider than 32 bits would waste a lot of memory. I'm thinking they might use an instruction compression scheme like Thumb. As stated in that paper, to address 3 regs out of 128 you will need 7+7+7 bits, which is 21, plus 4 for the component mask, which makes 25 and leaves 7 for instruction encoding; that's 128 instructions. But that all seems a bit tight and I have left loads out of my estimation. So maybe they will need 64-bit instruction words. But I doubt VLIW: it just makes the instruction words too long, and in practice you rarely do vector int and float calculations at the same time, so it would be a waste. Also, I would agree that this would mean you can only do int or float at any one time.
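
The bit budget in that paragraph can be written out as a short Python sketch. The field layout (three register operands, a 4-bit component mask, the remainder for the opcode) is the post's guess at an encoding, not a documented instruction format.

```python
# Bit budget for a hypothetical 32-bit APU instruction word.
# The field layout is the post's guess, not a documented format.
WORD_BITS = 32
REGISTERS = 128
OPERANDS = 3        # e.g. two sources and one destination
MASK_BITS = 4       # one bit per vector component (x, y, z, w)

bits_per_register = (REGISTERS - 1).bit_length()   # bits needed to name 1 of 128
operand_bits = OPERANDS * bits_per_register
used_bits = operand_bits + MASK_BITS
opcode_bits = WORD_BITS - used_bits
opcode_count = 2 ** opcode_bits                     # distinct opcodes available

print(bits_per_register, operand_bits, used_bits, opcode_bits, opcode_count)
```

Anything else an instruction might need (immediates, branch offsets, predication) would have to come out of the same 7 opcode bits, which is why the estimate feels tight.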

I suppose having to manage the APUs will be a big task for the PU but I had a feeling that the user would be the one doing the managing via a Sony API. Not Sony doing the managing in a closed off chip.
I'm having my doubts now that the PU is user programmable. If the PU was designed just to manage the APUs it will have no vector unit or FPU. Therefore it would be quite small but not very powerful, and of no real use in games.

Sony had better not screw this up, I'm getting worried now.

Alejux
01-05-2005, 12:43 PM
2MB is conservative.

Let's say 1 line of C produces on average 3 instructions. In 2MB of memory you can hold 2MB/4 = 512K instructions (4 bytes each). Divide 512K by 3 to get the number of lines of C, which is about 170K. That is about the average for a console game these days.

It is fair to say that not all of those 2MB of instructions are going to be used in any one part of the execution (graphics, AI, audio). But organizing the code so that it splits into 2MB/128K = 16 discrete sections would not be easy. In fact, you won't get 128K for instructions: if the architecture is von Neumann you need space for data. I'd say you might use 1:3 code:data in this case, which means you may only have 32K of instruction memory. Your code will be full of common functions such as maths functions and matrix and vector operations, and that common code would need to be in the local APU memory. Unless, like you say, each APU can easily access the other 7 APUs' memory on the PE, but I think that will be high latency via the DMA and not designed to fetch instructions. So the room for specialised code and common code is something like 32K. That's enough room for something like (32K/4)/3 = roughly 2600 lines of C; let's say half of that is common code, so you might get 1300 lines of specialised C to actually do something useful.


But even with your big 2MB chunks of code, you'll separate it into smaller functions, procedures and objects. At least I can't possibly imagine a single chunk of code with 170,000 lines. I would have nightmares from it!

What I imagine you'd want to do on the PS3 is separate the game elements that usually take a good deal of CPU power into smaller semi-independent cells, and use a main cell as the main/outer loop. For example, you could have a cell controlling procedural cloud generation, another cell solely for character animation, another for certain physics effects, and another for character AI/logic. Some of these things do not use 1000's of lines of code, but 10's or 100's.
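
As a rough illustration of that split, here is a minimal Python sketch in which a main loop hands each frame's jobs to a pool of workers and waits for all of them before moving on. The job names and the thread pool are stand-ins chosen for the example; nothing here reflects how Sony's tools or the Cell hardware actually schedule work.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical per-frame jobs; each one stands in for a small "cell".
def clouds(frame):
    return ("clouds", frame)


def animation(frame):
    return ("animation", frame)


def physics(frame):
    return ("physics", frame)


def ai(frame):
    return ("ai", frame)


JOBS = [clouds, animation, physics, ai]


def run_frames(frame_count, workers=4):
    """Main/outer loop: farm out each frame's jobs, then wait for all of them."""
    log = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for frame in range(frame_count):
            futures = [pool.submit(job, frame) for job in JOBS]
            # Synchronisation point: the frame ends once every job is done.
            log.append([future.result() for future in futures])
    return log


frames = run_frames(2)
print(frames)
```

The only coordination in this sketch is the wait at the end of each frame; jobs that need to talk to each other mid-frame are where the real difficulty would start.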

I agree the bitch here would be to coordinate/synchronize these threads. But given that the CELL chip is made for this kind of thing, it will probably have many resources for it.


As for porting games to Cell... and the current developers who are NOT willing to deal with this parallel processing stuff...



9) To achieve these goals, Sony needs to solve several problems: network latency, scalability, hardware abstraction. PS3 will work much like the original PS, with SCE providing high-level libraries to abstract the hardware for the applications.


Even if the PS3 architecture turns out horribly complex, I honestly don't think it will be too much of a problem.

As long as Sony can create a reasonable number of foundation technologies (engines, discrete renderers, physics engines, etc.) to cover an acceptably wide range of different game types, the developers won't have to invent their own wheels in order to get their games made.

I'm pretty sure Sony realizes this and has been making an aggressive effort in the middleware department over the last year or so and probably will continue to do so until launch. They'll want to release the PS3 having most of the developers' needs covered.

Don't lose faith, my friend.

Personally, speaking as a programmer, I think I would love to get my hands on a chip like this, and try to squeeze every single cycle out of every APU in it. It would be a fun challenge. :)

amod20002004
01-05-2005, 04:46 PM
You are a programmer. Wow, that means your job is related to this field. Nice to hear that. Now we have plenty of experts in this area.

Alejux
01-05-2005, 05:28 PM
You are a programmer. Wow, that means your job is related to this field. Nice to hear that. Now we have plenty of experts in this area.

No, NO, NO!!! :shock:

I am a programmer, but games are NOT my area of expertise. The closest I came to making games was an educational 2D adventure game I made 10 years ago. I'm afraid I was seduced long ago by the boring side of the force (i.e. corporate applications, case development, blah blah blah).

But I'm still an enthusiast, and I'm seriously thinking of getting back on this track sometime in the future, if life allows. :)

amod20002004
01-05-2005, 05:34 PM
In other words, you have detailed knowledge of this field, don't you? :D That's why you have provided different links with enough explanation.

Alejux
01-05-2005, 07:06 PM
hehe...yeah, right. :)

cpiasminc
01-05-2005, 07:14 PM
Your code will be full of common functions such as maths functions and matrix and vector operations, and that common code would need to be in the local APU memory. Unless, like you say, each APU can easily access the other 7 APUs' memory on the PE, but I think that will be high latency via the DMA and not designed to fetch instructions. So the room for specialised code and common code is something like 32K. That's enough room for something like (32K/4)/3 = roughly 2600 lines of C; let's say half of that is common code, so you might get 1300 lines of specialised C to actually do something useful.
Even then, that's a lot. It isn't very often that a single function has 1300 lines of purely computational code. It's perfectly normal to have a whole file contain less than 1300 lines of actual code excluding comments and whitespace and all. And really, I do expect that the ratio of data to code will go up. Previously simpler tasks may take routes that are more complex simply because you've got the computational power to do so. AI could now work on much larger scales... physics models could become that much more complex, and maybe even move towards soft bodies. For that matter, models will probably be animated with more bones, often times demanding individual node control.

Well, if the actual size of an apulet is sufficiently small, though, that may not be true. But it's not often normal for us to think in terms of breaking things down to a level beyond mere local ownership. Often times, we like to break it down to a point where we can keep pertinent code in one place so that it's easy to track while debugging. Still, the fact that likely all 3 consoles will be multiprocessor and benefit from multi threading may just force us to think a little differently and keep things as independent as possible and pass messages and throw events around like there's no tomorrow. It just leads to a fair bit of pain when trying to develop both on consoles and on PC when multi-core at the same scale is still a long ways off.

I suppose having to manage the APUs will be a big task for the PU but I had a feeling that the user would be the one doing the managing via a Sony API. Not Sony doing the managing in a closed off chip.
I still think it does mean that. It's just that the nature of the API and how it works is still very hidden from the developer. Rather, the developer just asks the API for resources and delivers apulets... and that would basically be the limit of your programming of the PU. Well, maybe that, and exception handling, message passing, event triggering and that sort of thing... but in essence, nothing computationally significant.

That does suggest a few annoyances in that an apulet probably needs to be entirely self-contained... i.e. no function calls, excluding perhaps forceinlined. Technically, you'd still be capable of it, but because of horrid latency and stall potentials, it just wouldn't be a good idea.

nAo
01-05-2005, 08:35 PM
If people would just read CELL patents .. ;)
A|SPUs support code overlays, so the amount of code one can fit in a single APU is not really an issue.
A|SPUs also support, in a preferred embodiment, up to 32 outstanding DMA transactions (external reads or writes) that can be grouped and prioritized.
An SPU doesn't stall when it issues one or more DMA requests. DMA request completion can be checked per single request or per group of requests.
Moreover, SPUs have special registers that hold information that external devices, like other SPUs or PUs, can read or write. So one doesn't even have to read/write to external mem to share info or pass messages between SPUs!
IBM has done a lot of astounding work on the SPU architecture.
SPUs are just much, much more than PS2 VUs in terms of power and flexibility.
SPUs also have a (maybe shared) L1 cache that holds prefetched data, because the DMA engine can auto-prefetch data from memory by examining memory access patterns.
One can even specify how much bandwidth an SPU can use, which is very useful in realtime applications.
I believe the SPUs' language of choice will be C/C++.
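
To make the non-stalling DMA idea concrete, here is a toy Python model of a queue with tagged groups. It is a sketch for illustration only: the class, its methods and the cycle-by-cycle latency model are invented for this example and are not a real Cell API; only the limit of 32 outstanding requests and the per-request or per-group completion check come from the description above.

```python
class DmaQueue:
    """Toy model of non-blocking DMA with tag groups (not a real Cell API)."""

    MAX_OUTSTANDING = 32   # limit quoted from the patent's preferred embodiment

    def __init__(self):
        self.pending = {}  # request id -> (group tag, cycles left)
        self.next_id = 0

    def issue(self, group, latency):
        """Queue a transfer and return at once; the caller never stalls."""
        if len(self.pending) >= self.MAX_OUTSTANDING:
            raise RuntimeError("too many outstanding DMA requests")
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = (group, latency)
        return request_id

    def tick(self):
        """Advance one cycle; transfers whose latency has elapsed complete."""
        self.pending = {
            request_id: (group, left - 1)
            for request_id, (group, left) in self.pending.items()
            if left > 1
        }

    def done(self, request_id):
        """Completion check for a single request."""
        return request_id not in self.pending

    def group_done(self, group):
        """Completion check for every request issued under one group tag."""
        return all(tag != group for tag, _ in self.pending.values())


dma = DmaQueue()
a = dma.issue(group=0, latency=2)
b = dma.issue(group=0, latency=4)
c = dma.issue(group=1, latency=1)

work_done = 0
while not dma.group_done(0):
    work_done += 1     # useful computation overlaps the transfers
    dma.tick()

print(work_done, dma.done(a), dma.done(b), dma.done(c))
```

The point of the pattern is the loop at the end: the program keeps computing while the transfers are in flight and only checks for completion when it actually needs the data.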

Dr. Gschwind has been working on a SPU compiler as one can read in his homepage (http://www.research.ibm.com/people/m/mikeg/):
Dr. Gschwind is one of the originators of the STI CELL Broadband Architecture and was a member of the joint architecture exploration and definition team. He was also the leading architect of the SPU architecture and the instigator of its compiled code SIMD focus. Dr. Gschwind was the developer of the first compiler targeting the BPA.
ciao,
Marco

Alejux
01-05-2005, 09:12 PM
If people would just read CELL patents .. ;)

Why would we, when we have you to do it for us? ;)



A|SPUs support code overlays, so the amount of code one can fit in a single APU is not really an issue.

Nice! Where is the overlayed code stored? In the main RAM?


Moreover, SPUs have special registers that hold information that external devices, like other SPUs or PUs, can read or write. So one doesn't even have to read/write to external mem to share info or pass messages between SPUs!
IBM has done a lot of astounding work on the SPU architecture.
SPUs are just much, much more than PS2 VUs in terms of power and flexibility.
SPUs also have a (maybe shared) L1 cache that holds prefetched data, because the DMA engine can auto-prefetch data from memory by examining memory access patterns.
One can even specify how much bandwidth an SPU can use, which is very useful in realtime applications.
I believe the SPUs' language of choice will be C/C++.

Thanks for this information.
Where do you consult these patents? Do you have a link?

VU_fan_boy
01-05-2005, 09:52 PM
Someone is going to have to explain to me what a "code overlay" is because I'm drawing a blank (and I've skimmed over the patent, but mostly looked at the pretty pictures :) )

Well that all sounds quite promising Marco, I'm guessing that the idea is to hide the limitations of the APUs as much as possible.

Alejux
01-05-2005, 10:04 PM
Someone is going to have to explain to me what a "code overlay" is because I'm drawing a blank (and I've skimmed over the patent, but mostly looked at the pretty pictures :) )

Well that all sounds quite promising Marco, I'm guessing that the idea is to hide the limitations of the APUs as much as possible.

Usually, code overlay refers to dynamic loading/allocation of code. It was done a lot in the Triassic era, when we had to count the kilobytes of code stored in RAM. Back then we had to manage the overlays by hand; maybe CELL will do them automatically.
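
For anyone who hasn't met overlays before, here is a minimal Python sketch of the idea. It assumes one fixed region of local RAM shared by several code segments that live in main memory; the class name, the segment table and the sizes are made up for the example and say nothing about how the patent's overlay manager is actually built.

```python
class OverlayManager:
    """Toy overlay manager: one fixed region of local RAM shared by segments."""

    def __init__(self, main_memory, region_bytes):
        self.main_memory = main_memory   # segment name -> (size, functions)
        self.region_bytes = region_bytes
        self.resident = None             # segment currently in the region
        self.loads = 0                   # number of fetches from main memory

    def call(self, segment, function, *args):
        """Run a function, first swapping its segment in if it isn't resident."""
        size, functions = self.main_memory[segment]
        if self.resident != segment:
            if size > self.region_bytes:
                raise MemoryError(segment + " does not fit in the overlay region")
            self.resident = segment      # overwrites whatever was there before
            self.loads += 1
        return functions[function](*args)


# Hypothetical segments, each small enough for a 32K overlay region.
main_memory = {
    "physics": (24 * 1024, {"step": lambda x: x + 1}),
    "ai": (30 * 1024, {"think": lambda x: x * 2}),
}
manager = OverlayManager(main_memory, region_bytes=32 * 1024)

first = manager.call("physics", "step", 1)    # loads the physics segment
second = manager.call("physics", "step", 5)   # already resident, no load
third = manager.call("ai", "think", 4)        # evicts physics, loads ai

print(first, second, third, manager.loads, manager.resident)
```

The cost shows up in the load counter: code that bounces between segments pays for a fetch from main memory each time, so the grouping of functions into segments matters a lot.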

I haven't read the patent yet. Where is it? I think I'll google it.

nAo
01-05-2005, 11:02 PM
Nice! Where is the overlayed code stored? In the main RAM?
yes

Thanks for this information.
Where do you consult these patents? Do you have a link?
I advise you to search the beyond3d message boards (console forum). You'll find a lot of stuff posted by me and other guys.
Otherwise you can look on the USPTO site (http://www.uspto.gov) or on the European Patent Office site (http://www.espacenet.com)

This is the patent about overlay management on the SPUs
Method and apparatus for overlay management within an integrated executable for a heterogeneous architecture (http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=9&f=G&l=50&co1=AND&d=PG01&s1=Gschwind&OS=Gschwind&RS=Gschwind)
ciao,
Marco

Vince
01-15-2005, 02:57 AM
If people would just read CELL patents .. ;)

No shit. I had an interesting PM conversation with Dave about our on-going debate between what comprises an X2 ALU and an S|APU and he had no clue -- in fact, he just never responded when I explained basically the same information as you posted above concerning masking latencies. AFAIK, the SPC is much more flexible from a programming PoV than the X2 ALUs in terms of fully arbitrary set querying and execution based on arbitrary command rules or metrics (like bandwidth). The SPU complex (SPC by right) built around the APU is really quite impressive -- actually the entire custom logic block is impressive -- many of those on B3D will be surprised IMHO.

Anyways, I believe Erik Altman, who is a colleague of Gschwind's, is a place to look for compiler technology.

RichardCypher101
01-15-2005, 07:52 AM
Vince from b3d, right? I read the b3d forums (Console mainly) daily, and I tend to see you there quite a bit (along with nAo's patent finding abilities). Nice to have you around the boards, even if it's for a short stay.

-Rich

5ysT3m cR45h3r
01-16-2005, 11:09 PM
Yeah cool havin you around.