View Full Version : IBM cpu article (PS3, Xbox 360)


diOndOrAntt
06-29-2005, 10:57 PM
http://anandtech.com/video/showdoc.aspx?i=2461

xbdestroya
06-29-2005, 11:10 PM
It's not bad news for anyone, because that article approaches things from a direction game developers will need to learn not to approach them from. Basically, the 'terrible performance' ascribed to Cell and XeCPU is in running, essentially, Windows and its apps. Single-threaded code written with the forgiveness of out-of-order processing in mind just isn't the plan this gen; I don't know why Anand focused on it so much.

KKS
06-29-2005, 11:11 PM
The article states that the developers who were interviewed were from studios working on multiplatform games, and being multiplatform developers, they cannot really take advantage of the REAL power of the systems given the time, budget, etc. I'll believe them if they interview console-only developers (mainly Japanese) such as Hideo Kojima (maker of the Metal Gear Solid series). Most multiplatform games don't look as good as exclusive titles anyway (examples being FF10, MGS3, ZOE2, etc.).

Also, consoles do not need to be as powerful as PCs. I mean, could a PC with a similar spec to the PS2 deliver MGS3-level graphics? I think not.

makeitlookreal
06-30-2005, 12:59 AM
The scariest part of that article is that the SPE's of the PS3 are more or less useless. Now, I don't believe that. However, could it be true that they are limited in what they can do? Have you heard anything from developers about the utilization of SPEs?

Danielhq
06-30-2005, 04:09 AM
Is it just me, or has the article been removed?

xbdestroya
06-30-2005, 04:20 AM
It's been removed.

Metal Sphere
06-30-2005, 04:45 AM
Apparently because Anand doesn't want his anonymous insider to be traced by Microsoft. It has me wondering if he really only had 1 insider, or that was just worded wrong.

Lumine
06-30-2005, 06:09 AM
can anyone find a google cache of the article?

Metal Sphere
06-30-2005, 07:15 AM
can anyone find a google cache of the article?

What do you mean? Explain it and I'll see what I can do. I wouldn't be surprised if a new article pops up or it comes back revised or summat.

Key
06-30-2005, 09:02 AM
Apparently because Anand doesn't want his anonymous insider to be traced by Microsoft. It has me wondering if he really only had 1 insider, or that was just worded wrong.

It does seem that way. Correct me if I am wrong, but I didn't think any developers had the final dev kit yet for either the Xbox 360 or the PS3. If so, the only ones who could possibly give out accurate information on this type of stuff are Sony and Microsoft employees who have worked with the systems in one way or another.

Question is, do you think someone working for Sony or Microsoft would give this kind of information out, with basically nothing to gain and everything to lose (His/Her job, possible lawsuits, and more)?

I do find it hard to believe the excuse they have given for pulling the article; if anything, the source may have ended up being reliable and they pulled the article to cover their butts.

Applefiend
06-30-2005, 09:16 AM
Right now, from what we’ve heard, the real-world performance of the Xenon CPU is about twice that of the 733MHz processor in the first Xbox. Considering that this CPU is supposed to power the Xbox 360 for the next 4 - 5 years, it’s nothing short of disappointing. To put it in perspective, floating point multiplies are apparently 1/3 as fast on Xenon as on a Pentium 4.

Being an evil Sony fanboy I've been belly laughing about that article all day, if they had gone down to a PC warehouse and bought some cheap knock off Athlon 64s the XBox 360 would have been faster.

But it's wrong to laugh at others' misfortune. But fun. After all the abuse the CELL has taken over the last year, this article brings a much-needed reality check. The CELL abuse in the article is the same abuse it's always gotten; the XBox CPU abuse is all fresh.

Lesson of the article is don't believe the hype.

Here's a google cache of the article:

http://groups-beta.google.com/group/alt.comp.periphs.videocards.ati/messages/5c3dc95130442d77

KlawHammer
06-30-2005, 03:20 PM
Well marketing does help - but put it this way...do you honestly think Sony & M$ would produce consoles with the claimed performance and sell them for under $500? Think about it.

Fats
06-30-2005, 03:30 PM
Well marketing does help - but put it this way...do you honestly think Sony & M$ would produce consoles with the claimed performance and sell them for under $500? Think about it.

I actually think that they will. It's basically the principle of, "Economy of scale". Because both of the consoles are mass produced in theory production costs should drop.

xbdestroya
06-30-2005, 03:35 PM
Well marketing does help - but put it this way...do you honestly think Sony & M$ would produce consoles with the claimed performance and sell them for under $500? Think about it.

Of course I think that - I mean what do you think that $1000 Athlon FX 57 is costing to make? $60 a chip? Probably....

Danielhq
06-30-2005, 03:51 PM
Well IF the IBM processors now are as weak as they state, then why have Nintendo, Sony AND Microsoft decided to go with them and not Intel or AMD, which they state are so superior? I find it hard to believe that all three companies would make such a mistake and that the fellows at AT are the ones who have it right. Especially when Sony has invested so much money in the Cell processor.

Fats
06-30-2005, 03:52 PM
Well IF the IBM processors now are as weak as they state, then why have Nintendo, Sony AND Microsoft decided to go with them and not Intel or AMD, which they state are so superior? I find it hard to believe that all three companies would make such a mistake and that the fellows at AT are the ones who have it right.

When did IBM ever state that their processors were "weak"? Am I missing out on something here? :?

Danielhq
06-30-2005, 03:53 PM
When did IBM ever state that their processors were "weak"? Am I missing out on something here? :?

No, I was referring to the AT article.

Fats
06-30-2005, 03:55 PM
When did IBM ever state that their processors were "weak"? Am I missing out on something here? :?

No, I was referring to the AT article.

Oh, sorry. Care for a link as I haven't read that particular article!

eva2112
06-30-2005, 04:17 PM
Oh, sorry. Care for a link as I haven't read that particular article!

The article was deleted from their database, but Applefiend found a good Google cache. *points up to Applefiend's post*

koldfuzion
06-30-2005, 06:25 PM
A little off topic, and this article is a few days old now, but this seems to explain the M$ PR about using only "33% of its power":

The problem is that today, all games are single threaded, meaning that in the case of the Xbox 360, only one out of its three cores would be utilized when running present day game engines. The PlayStation 3 would fair no better, as the Cell CPU has a very similar general purpose execution core to one of the Xbox 360 cores. The reason this is a problem is because these general purpose cores that make up the Xbox 360’s Xenon CPU or the single general purpose PPE in Cell are extremely weak cores, far slower than a Pentium 4 or Athlon 64, even running at much lower clock speeds.

http://www.anandtech.com/video/showdoc.aspx?i=2453&p=6

Just thought that was interesting. :)

Edit: Sorry this should have gone here, just found the right thread. Sorry mods. Duh.

http://www.psinext.com/forums/viewtopic.php?t=7310

KlawHammer
07-01-2005, 10:10 AM
That FX-57 only costs as much to make as a San Diego core, which is a slightly more efficient version of the Venice, which is a slightly modified Winchester... all they've done is up the clock speeds further. So yes, they do overprice their chips.

As for the thing about costs going down as time goes on - I certainly believe that. However, early on in the PS3's life cycle, they have to charge a premium to cover certain costs, hence I doubt it will be under $400 at launch.

Key
07-02-2005, 09:20 AM
As for the thing about costs going down as time goes on - I certainly believe that. However, early on in the PS3's life cycle, they have to charge a premium to cover certain costs, hence I doubt it will be under $400 at launch.

Keep in mind Sony makes most of their money back on software, add-ons, and other products for the system.

Probably a great deal of the cost will be made up by using Cell chips with 6 or fewer working SPEs in other products; even the RSX might be used in a similar way.

PhYmon
07-02-2005, 07:53 PM
Is it just me, or has the article been removed?

Nope, it wasn't removed. Here is the link: http://anandtech.com/video/showdoc.aspx?i=2453

Tacitblue
07-02-2005, 08:50 PM
Ah, let it die, Anand really dropped the ball on that article. It's a dead horse

Schmeh
07-03-2005, 12:20 AM
Nope, it wasn't removed. Here is the link: http://anandtech.com/video/showdoc.aspx?i=2453


Wrong article.

rpgamer_2k5
07-03-2005, 12:51 AM
I never saw sites like AnandTech or even Beyond3D as too trustworthy. You just have a lot of individuals who tend to throw their emotional issues into the mix. Seeing that I do not like any of that mickey mouse rhetoric, I just turned away.

If anyone actually believes that the Cell is a slug, then they are just fooling themselves quite heavily. The Cell, which was planned between IBM, Toshiba and Sony, had two candidate designs: one was an 8-core CPU (IBM) and the other was an 8-RISC-core design (Toshiba). In the end, after all the debates, after all the experimentation, the trio concluded that a 1 PPE-8 SPE configuration is better than either.
.
What does this tell us? It tells us that the Cell is a beast, hence the high floating-point figure (218 GFLOPS).
..
What does IBM allowing the Cell to be open, and the trio putting a lot of resources into getting Linux working on the Cell, tell us? It tells us that the Cell is truly a threat to the current CPU giants.
...
What about the military and medical industries having actual plans to employ the Cell in their computers? Yup, the Cell is great, it is so great it is already reaching the closest stars, next it will reach other galaxies. :D

PhYmon
07-03-2005, 04:39 AM
Can somebody tell me which is the better processor between the 360 and the PS3? I get all confused by all the technical specs and I don't know yet which is the best!! Also tell me why! I want to LEARN! hehehe

Hrama
07-03-2005, 04:50 AM
Well, the usability of both of these processors is still up in the air, as neither has seen anything close to its full usage. From what I have gleaned from articles and other threads, though, the Xbox 360's tri-core is much more general purpose, while the PS3's Cell and its SPEs, while generally touted as being the superior of the two CPUs, are much more narrowly defined as far as the things the SPEs can do. As of right now, though, devs have not been able to unlock the true potential of either processor, as there are still many technologies and advantages of both that have not been fully exploited. The incredible bandwidth, the multithreading tech on each...

rpgamer_2k5
07-03-2005, 05:48 AM
Hrama:
The term 'general purpose' doesn't seem to mean much anymore.

These days PCs are used for multimedia purposes. We see more graphics, videos and what not. We have videocards dedicated to improve our multimedia experience and even so CPUs still lag when processing multimedia. To run a Word processor (which also became very graphic intensive) isn't a problem, but we notice a slow down when running Adobe products. The Cell is intended to end this multimedia nightmare, it will allow us to convert avi to mpeg without the constant slow downs. CPUs of today do not have such an edge in the multimedia sector, the Cell will help the GPU allowing us to have a true multimedia experience.

The Cell has huge floating-point throughput; it can really do some intense operations, whatever you call it, mathematics or physics. The Cell will be able to run all our current wares; it doesn't take much to run a word processor, and we see the slowdown because of all the fancy graphics. The Cell will actually speed things up even more; a PowerPoint or Flash presentation could be done without any lag. :)

The Cell will be really good at physics, as good as the dedicated PhysX chip, hence AGEIA is developing middleware for the Cell. The Cell should be giving us some very impressive physics along with the impressive graphics capability of the Cell and the RSX (dedicated graphics unit).

AI is another issue. However, let us note that the aim of artificial intelligence is to emulate the intelligence of the human. Regardless of your ideology, the reason why we have intelligence is because of the design of our brain. The Cell sort of copies the human brain. The PPE can be seen as the thalamus. It will receive information and refine it by sending it to the other modules. The PPE will watch over the SPEs, which are independent. The PPE will manage and assign tasks to the SPEs. The SPEs could be seen as the cortex or the other modules that will process and refine the information (graphics, sound, physics, or artificial intelligence). For AI to work, it seems like the structure of the machine would have to match that of an intelligent biological device (the brain). With the PS3 one will be able to use the SPEs to manage certain AI (e.g. characters, enemies, 'life' in the environment). The PPE will be able to watch over the AI processing and then send it off to the video out. Heck, this can mean that sound can also trigger the AI. When one speaks, the character will actually react. The sound from the webcam goes to the PPE, and the SPE dedicated to the AI of that character(s) will then receive the sound and be able to determine how to react to the user(s). This would also mean that voice recognition would need to improve, but since the Cell is geared towards multimedia, I expect that it should perform very beautifully in this area. :)
.
.
.
.
Maybe I'm a fanboy or not, but the XeCPU does not seem too impressive. This design is quite modern but the extra core does seem neat. Expect some nice physics, sound processing and AI. The graphics component will be managed by the Xenos.

PhYmon
07-03-2005, 06:22 AM
You know, I read in some other forums that the PS3 is going to have a big bottleneck because of the type of processor it has on it! They also claimed that the 360's processor doesn't have any bottleneck!! And I'd think the 360 is going to have the biggest bottleneck because of the 3 processors it has on it! I don't know, who can tell me?

Schmeh
07-03-2005, 08:29 AM
You know, I read in some other forums that the PS3 is going to have a big bottleneck because of the type of processor it has on it! They also claimed that the 360's processor doesn't have any bottleneck!! And I'd think the 360 is going to have the biggest bottleneck because of the 3 processors it has on it! I don't know, who can tell me?

When people say the PS3 will be bottlenecked by the Cell, they are comparing the PS3 and its architecture to the architecture of a PC. Until recently, cpus were designed to perform 1 task at a time at extremely high speeds, so all games for PCs and most console games were programmed this way. With the cpu manufacturers hitting the glass ceiling on clock speeds, they have been forced to find other ways to improve performance. This is why we are seeing dual-core processors from all of the manufacturers. The Cell, in very simple terms, is a cpu with 1 general core and 7 (in the case of the PS3) smaller, very specialized cores. The problem is that current games and software are not written to take advantage of multiple cores. This is the bottleneck that people refer to. The problem with this kind of thinking is that, although it will take a while, games and programs will be written to use multiple cores.

The cpu for Xbox 360 is, like the PS3's cpu, a multicore design. Unlike the PS3, it uses 3 general purpose cores and no specialized cores. In the beginning it will suffer from the same "bottleneck" as the PS3 due to its multicore design. However, if there is one thing that worries me about the 360's cpu, it is that all 3 cores share 1MB of cache. For reference, new Pentium 4's have 2MB of cache for one core. This by itself wouldn't be a huge problem, if MS had put a memory controller in the cpu. The problem is that they didn't do this, they put the memory controller on the gpu, so every time there is a cache miss or the cpu needs something from memory it will have to go through the gpu. This will cause latency problems and could possibly cause delays in processing code.

In the end, we will just have to wait and see what developers are able to do with either cpu.
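
To make the point about rewriting software for multiple cores a bit more concrete, here is a minimal Python sketch (not from anyone in this thread) of the same job done the single-threaded way and then split into three chunks, one per core as on the 360. The function names and the choice of three workers are made up for illustration, and Python threads won't actually run this faster because of the interpreter lock; it only shows the shape of the change a developer has to make.

```python
from concurrent.futures import ThreadPoolExecutor

def total_single(values):
    # Traditional approach: one thread walks the whole list.
    return sum(values)

def total_split(values, workers=3):
    # Multicore approach: cut the list into one chunk per worker,
    # sum each chunk independently, then combine the partial sums.
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)
```

Both functions give the same answer; the second one just has to be written with the split in mind from the start, which is the extra work being discussed.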

rpgamer_2k5
07-03-2005, 08:36 AM
The Xbox 360 processor seems quite basic; it is meant to operate like a dual-core Athlon. The Cell uses a completely different procedure. The PPE is like an Xbox 360 PPC core, but it is meant to manage the SPEs. The SPEs are very powerful and should not be underestimated. The PPE with the VMX alone is quite powerful, but the SPEs are the real gems.

In addition, one shouldn't compare the two apples to apples. There are no 'bottlenecks' on the Cell except for branch prediction... I guess. The PPE does have branch prediction capability and the SPEs lack it, since it takes transistor space and does not do so well in performance per transistor. The SPEs will be doing it in software and I feel that it should be comparable to hardware+software.

The Cell will be doing quite handsomely when it comes to multimedia tasks, and that is the performance most need. Whether it is movies, gaming, voice recognition, flashy vector-based GUIs, etc., the Cell will perform. If one doubts the Cell's ability to run a word processor then they are just fooling themselves. Word processors can run quite nicely even with all the features. The flashy GUIs are what make word processors slow on older PCs.

Cell is the true multimedia processor, which EVERY microprocessor manufacturer is trying to develop. The Cell will really make a good tag partner with the RSX. :)

Rukawa
07-03-2005, 09:41 AM
An SPE can be more powerful than the PPE at number-crunching calculations, according to Sony.
I really don't know the flexibility of VMX vs. SPE.
Can anyone explain it to me?
GDDR3 also has high latency, and the GPU always needs this memory.
The CPU will hardly get any data from memory.
The GPU using the L2 cache is really a joke to me; why would the GPU use it when the CPU needs it so badly?

I read an interview that said the PS3 and Xbox 360 can have full physics like a PC with a PPU. It's weird, isn't it? Can a VMX unit perform physics as well as a PPU?

Tacitblue
07-03-2005, 05:29 PM
The vector units are the heart of the SPE, but that doesn't mean it cannot handle scalar type calculations as well. Programmers look at what can be shifted to the SPEs and what should be kept on the PPE for optimal performance. Vectorizing the code and getting it to fit on SPE resources is the trick it seems, but if done right the performance will come.
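
As a rough illustration of what "vectorizing" means here, below is a small Python sketch comparing a one-element-at-a-time loop with a version that works on four elements per step, which is how a 128-bit vector register treats four 32-bit values. The function names are invented for this example and plain Python obviously doesn't use SPE or VMX hardware; it only shows how the loop gets restructured.

```python
def scale_add_scalar(a, xs, ys):
    # Scalar style: one multiply-add per loop step.
    return [a * x + y for x, y in zip(xs, ys)]

def scale_add_vec4(a, xs, ys):
    # Vector style: each loop step handles a group of four elements,
    # standing in for a single 4-wide vector instruction.
    # Assumes len(xs) == len(ys) and the length is a multiple of 4.
    out = []
    for i in range(0, len(xs), 4):
        x4, y4 = xs[i:i + 4], ys[i:i + 4]
        out.extend(a * x + y for x, y in zip(x4, y4))
    return out
```

The results are identical; the vector version just takes a quarter of the loop steps, which is where the speedup would come from on real vector hardware, provided the data can be laid out in groups like this.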

rpgamer_2k5
07-03-2005, 07:36 PM
An SPE can be more powerful than the PPE at number-crunching calculations, according to Sony.
I really don't know the flexibility of VMX vs. SPE.


The SPE can do anything the VMX can do, but it has a larger instruction set.


GDDR3 also has high latency, and the GPU always needs this memory.
The CPU will hardly get any data from memory.
The GPU using the L2 cache is really a joke to me; why would the GPU use it when the CPU needs it so badly?

L2 cache is a buffer. It is used to link the L1 to the system memory. Possibly the GPU will use the L2 as well to access the 512MB of system RAM.

I read an interview that said the PS3 and Xbox 360 can have full physics like a PC with a PPU. It's weird, isn't it? Can a VMX unit perform physics as well as a PPU?

The PPE-SPE design is actually pretty good at physics. Such operations usually require meticulous calculations. The PPE will be able to assign the SPEs to physics calculations, while initial calculations can be done on the VMX too. No wonder AGEIA is developing middleware for the PS3. The same can be said about the Xbox 360's tri-core design; it should be very good at physics. The PC requires a dedicated chip because those CPUs are just weak compared to the beasts on the PS3 and Xbox 360.

MegaGrid
07-04-2005, 09:56 PM
On the contrary, I believe that this article:
http://www.anandtech.com/video/showdoc.aspx?i=2453
is far more interesting than the article that was erased. No wonder: in this article Epic claims that the 7 SPEs can be very good for physics, because efficient algorithms for simulating physics parallelize well.
Also, I liked the brain approach presented here in this forum. Actually the Cell is a perfect agent, right? Decision making in the brain (general core) and the sensors/effectors (SPEs)... A sensor could be a video/audio/control (well, maybe this also) capture from an external system, processed by an SPE.
The main core applies the knowledge base of the world and decides on an action.
The effectors are the polygon list sent to the GPU and the audio output, again created by the SPEs.
It is interesting to read that Intel is planning a similar approach at about 2015. The move from dual core to multi core (expected to happen) with the addition of scalarity (now it makes sense why).
So CELL is probably a prodrome of a new age in CPUs...
So maybe this is what the guy in the erased article was trying to say, that PCs will do real AI, physics, etc., but he did not count on CELL evolving as well. Anyway.
Also guys, the RSX differs from the G70 (GeForce 7800 GTX). First, the bus connecting the RSX with the CELL, at 35GB/s, is much faster than the 8GB/s of the CPU(PC)-G70 link. Also the RSX can take advantage of the 256MB of the CELL processor, and believe me, if some latencies are taken care of it is extremely useful. The RSX clock is faster. So RSX is better than G70. Period. In the long run the figures are on SONY's side, and it is surely going to be a better system than any PC in 2006 for games, at an affordable price also.
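
For anyone who wants the bus comparison above as a number, here is a tiny Python sketch of the arithmetic. It uses the two figures exactly as quoted in the post (35 for the RSX-Cell link and 8 for the PC CPU-G70 link, in the same units); they are the poster's numbers and are not verified here, and the function name is made up for this example.

```python
def link_ratio(console_link, pc_link):
    # How many times wider the first link is than the second.
    return console_link / pc_link

# Figures as quoted in the post above; the poster's numbers, not verified.
ratio = link_ratio(35.0, 8.0)
print(f"RSX-Cell link is about {ratio:.1f}x the PC link")
```

On those figures the console link works out to 4.375 times the PC link, though raw link width alone doesn't say how much of it a game would actually use.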

xbdestroya
07-04-2005, 10:43 PM
LOL, ok RPGGamer, let's calm down for a second. :)

Cell is a very impressive design, but let's not act as if the XeCPU is terrible; they both have certain advantages. In the end it's not just how powerful the chips are, but how much of that power can be extracted and applied to useful tasks.

Likewise @Phymon, Cell is hardly what I would call constrained or bottlenecked. Best not to subscribe to any one ideology unless/until you understand how those conclusions are being reached, and if the conclusions make sense.

saxdawg00
07-04-2005, 11:56 PM
In my opinion, this is just another case of a new idea being rejected. Increasing clockspeeds and cache can only get you but so far, so there has to be a new or alternative way to reach higher levels of performance.

Just my 2 cents

PhYmon
07-05-2005, 12:19 AM
LOL, ok RPGGamer, let's calm down for a second. :)

Cell is a very impressive design, but let's not act as if the XeCPU is terrible; they both have certain advantages. In the end it's not just how powerful the chips are, but how much of that power can be extracted and applied to useful tasks.

Likewise @Phymon, Cell is hardly what I would call constrained or bottlenecked. Best not to subscribe to any one ideology unless/until you understand how those conclusions are being reached, and if the conclusions make sense.
Well man, I don't know for sure! As a matter of fact, I read it on some forum and I came here for some answers! That's the purpose of the forums!!

rpgamer_2k5
07-05-2005, 12:24 AM
xbdestroya:
Ah, at least somebody bombards my comment. :aim:-D:

In my previous posts I did state that the XeCPU will be good at AI, physics and sound, with the XeGPU doing the graphics work. I do not fall back from that stance, but seeing the ever-growing anti-Sony rhetoric, I decided to become a Sony extremist. :rant:

To be technically correct, the XeCPU will be a good CPU, heck it will kill all those dual core Athlons we see on the market. However, the Cell is a notch better. :)

saxdawg00: Totally agree. The old x86 architecture needs to go if we want to see real progress in the microprocessor realm.

PhYmon
07-05-2005, 12:26 AM
Hey guys this link has some good info about the Cell http://www.devhardware.com/c/a/Computer-Processors/Cell-Inside-the-Future-of-Processor-Architecture/
Check it out!

xbdestroya
07-05-2005, 12:28 AM
Ok phymon let's just start it all from the beginning so we're not all confusing each other and can help to address any questions anyone might have. :cool:

Anyway, what was the problem you had heard that Cell had? I get the feeling that somewhere or other you were given the impression that the PS3 has a bandwidth issue, but you seem to be talking in terms of the Cell specifically.

Is it the Cell or the RSX you are saying has the bandwidth issue? Or the system as a whole?

We'll see if we can take things from there. :)

PhYmon
07-05-2005, 12:43 AM
I meant the Cell itself! somebody wrote something about that the Cell will have a big bottleneck, so tell me man what u know about the Cell! to prove them wrong! or to prove myself wrong for that matter!

xbdestroya
07-05-2005, 01:01 AM
Well I mean I just don't understand where this bottleneck would be if it's exclusively Cell - it makes so little sense to me. Could you find that argument and reiterate it here?

rpgamer_2k5
07-05-2005, 01:17 AM
There is no bottle neck!

Phymon, you see, there are various deranged or maybe even paid individuals out there on the net trying their hardest to malign the Cell. You see, the Cell is actually not just competing against the XeCPU but also against the Intel and AMD lines of CPUs. The Cell has the potential to dominate the 'hi-end business' market (military, medical, graphics dev't, etc.), and this is where the big $$$ comes from. From there the Cell can move on to the personal computer and commercial markets. Many want to make sure that the Cell dies for financial reasons or err... emotional issues (i.e. Xbox and PC fanboys).

The Workstations and the HDTVs will be the first to carry the Cell, the PS3 will be the first console to carry Cell and it shall then slowly crawl into the rest of the consumer markets. Remember, there is a reason we have Nagasaki 2 fab up and 1 going through an upgrade. Along with the joint Oita fab and Fishkill plant. Something big will certainly happen. :)

[Added Later]
I think the bottleneck you are speaking of is the lack of branch predictors on the SPEs and the inability to perform out-of-order execution. The design of the Cell is not like the typical multi-core CPUs. The cortex does not receive information from the sensors before the thalamus.

RE: Branch Prediction
The PPE will always initiate all operations within the Cell. It will retrieve info, refine a bit, break up and then send it over to the SPEs. Since the PPEs have hardware predictors, it would mean that the code sent off to the SPEs will be optimized to ensure low amount of mis-predictions. Also, the SPEs will employ software branch predictors which will be using the "Swing Modulo Scheduling" pipeline algorithm.

RE: OOOE
Just because the Cell does not use OOOE does not mean anything. Also keep in mind that the XeCPU cores are pretty much identical to the PPE. This is why the XeCPU also uses OOOE which means they're even in this regard. :laugh: Anyways most CPUs today use OOOE, which was effectively able to ensure that cycles are not lost. However, with today's OOOE processors all we're seeing is an increase in pipeline sizes (note: OOOE has a lot more stages) and therefore an increase in latency. An increase in latency means a huge bog down in performance.

So you see, x86 needs to go, the Cell will be the first CPU to help ensure that it dies. :tada:

saxdawg00
07-05-2005, 01:21 AM
Hey guys this link has some good info about the Cell http://www.devhardware.com/c/a/Computer-Processors/Cell-Inside-the-Future-of-Processor-Architecture/
Check it out!

The author of the article pointed out what I think a lot of people are forgetting:


....The way programs are handled on Cell will not be the same as on current processors....
Analyzing cell from a "traditional" processor standpoint does not do it any justice.

version
07-05-2005, 01:45 AM
anand and m$ suxxx

SPEs use a local store (LS) rather than a cache, with software caching. If you divide the 128 registers among 4 threads, that's 32 registers per thread, so an SPE can run 4 general-purpose threads.
If we have a cache miss, then switch to another thread, similar to Xenon.

8 SPEs and the PPE can run 34 general-purpose threads at the same time!!!

cell faster than suxy xenon :LOL
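
The thread count in the post above can be checked with a few lines of Python. The four-threads-per-SPE split is the poster's own proposal and not a hardware feature, and counting two threads for the PPE is an assumption here (its two hardware threads) that makes the total come out to the quoted 34.

```python
SPE_REGISTERS = 128     # registers per SPE, as stated in the post
THREADS_PER_SPE = 4     # the poster's proposed software split (hypothetical)
SPE_COUNT = 8
PPE_THREADS = 2         # assumption: the PPE's two hardware threads

regs_per_thread = SPE_REGISTERS // THREADS_PER_SPE
total_threads = SPE_COUNT * THREADS_PER_SPE + PPE_THREADS
print(regs_per_thread, total_threads)
```

That gives 32 registers per thread and 34 threads in total, matching the post; whether software-switched threads on an SPE would perform well is a separate question the arithmetic doesn't answer.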

rpgamer_2k5
07-05-2005, 01:55 AM
saxdawg00: Different techniques like the "Swing Modulo Scheduling" software pipelining approach will be used. Allow me to be repetitive once again, the Cell will be immensely powerful because of its unconventional architecture. :)

[Added Later]
The XeCPU isn't obsolete in any way; I hope no one takes my minor comment seriously. :laugh: The core of the XeCPU will be very similar, if not the same as the PPE of the Cell. The XeCPU will floor all of the PC counterparts hence the emotional outbursts by various PC enthusiast websites. :laugh:

PhYmon
07-05-2005, 04:24 AM
Thanks guys! I got to learn something new now! You see, every time I read something new about the capabilities of the Cell I get all crazy and stuff, because I haven't seen a breakthrough like this since Nvidia showed off the GeForce 3. The big difference here is that the Cell will last a couple of years, and it's going to be 2 or 3 times faster than the upcoming processors! So this is a big prediction. Has anybody seen the landscape made by the Cell itself? What do you think about it? In my opinion it's pretty amazing.

MegaGrid
07-05-2005, 08:49 AM
I liked the article at : http://www.devhardware.com/c/a/Computer-Processors/Cell-Inside-the-Future-of-Processor-Architecture
Thanks for mentioning it.
76.8GB/s duplex bandwidth for FlexIO, cool. It had to be, given the numbers that Sony presented to us at E3.
Sony wanted to compete with the CPU(PC)-GPU architecture. Many things make sense now. Very nice thinking from the SONY people.
It is 4 times faster than the current PCIe. If the AMD64 can give only 8GB/s duplex bandwidth, then bandwidth widening has reached its full potential with the PCIe x16 bridge between the CPU & GPU. I would dare to say that PCIe x16 is not fully utilised by current PCs. I guess that the erased article did not mention this either...
So there also needs to be a change in PCIe and the CPU in the future for PCs to stand a chance in gaming.
I would also like to see the FlexIO bandwidth being used to connect to other CELLs. You can't have it all, I guess.

xbdestroya
07-05-2005, 05:20 PM
Well, PCI Express will only affect the communication between the CPU and the graphics card - something that may lead to the cards being CPU starved at one point or another, but it seems that they haven't reached that wall yet (rather for pure CPU weakness reasons). It also proves a bottleneck for the various turbo-cache implementations, but as far as the graphics card's performance goes, it's not too much of a limiting factor as long as the card has sufficient local bandwidth. So I'm not one who believes the PC will find itself in graphics limbo anytime soon, though I do think that this gen of consoles will be able to hold its own for quite a while, and I find the technology much more interesting.

cpiasminc
07-05-2005, 06:07 PM
There is no bottle neck!
Never say "no bottleneck"... anything can be a bottleneck depending on the software written for it. I mean, you can point to bandwidth, in-order, EIB, cache, fillrate, blah, blah blah, and all of them can be potential bottlenecks. There is no such thing as a system architecture that has no bottlenecks -- There are more flying pandas than bottleneck-free system architectures.

Phymon, you see, there are various deranged or maybe even paid individuals out there on the net trying their hardest to malign the Cell.
I find it funny that you put "paid" as something more severe than "deranged."

The PPE will always initiate all operations within the Cell. It will retrieve info, refine it a bit, break it up and then send it over to the SPEs. Since the PPEs have hardware predictors, it would mean that the code sent off to the SPEs will be optimized to ensure a low number of mispredictions.
The "breaking up" and "refinement" of which you speak is totally manual. The codebase itself has to be broken up to that level. The only thing preventing you from branching within SPE code is you. Typically, conditional moves are going to be a lot faster than branches on the PPE as well (and MS's whitepapers say the same about XeCPU). The branch prediction on the PPE is pretty poor, actually. What the PPE does is it keeps running that API which tracks available resources for execution and distributes apulets. You still have to explicitly write code for the SPEs and explicitly write code that requests SPEs and issues apulets and the corresponding data. None of that just magically happens.

When IBM refers to the process as "automated," it means you don't have to do your own thread pooling, resource management, message passing, etc. That's one thing you definitely have to worry about on Xenon.
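To make the "manual" part concrete, here is a minimal Python sketch of the division of labour described above. It is only an analogy and not the Cell SDK or any real IBM/Sony API: `scale_chunk`, `split` and `run_on_workers` are hypothetical names, and a standard-library thread pool stands in for whatever runtime hands work to the SPEs. The point is that the programmer writes the self-contained kernel and decides how the data is cut up, while the pool only handles scheduling.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(chunk, factor):
    """Hypothetical 'kernel': a self-contained piece of work on its own data.

    The programmer writes this explicitly; nothing generates it automatically.
    """
    return [value * factor for value in chunk]


def split(data, n_chunks):
    """Manually cut the data into at most n_chunks contiguous pieces."""
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_on_workers(data, factor, n_workers=7):
    """Hand each chunk to a worker; the pool does only the scheduling part."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves chunk order, so results can simply be concatenated.
        parts = list(pool.map(scale_chunk, chunks, [factor] * len(chunks)))
    return [value for part in parts for value in part]


result = run_on_workers(list(range(20)), 2)
```

The pool corresponds to the "automated" piece (resource tracking and dispatch); the kernel and the split are the pieces that stay with the developer.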

Just because the Cell does not use OOOE does not mean anything. Also keep in mind that the XeCPU is pretty much identical to the PPE. This is why the XeCPU also uses OOOE, which means they're even in this regard.
I'll assume that you made a typo in saying XeCPU uses OOOE, since you said the two are even. The lack of OOOE greatly decreases complexity and transistor count per unit, really -- and you also end up being able to clock higher at lower power consumption because of that simplicity.

Anyways, most CPUs today use OOOE, which was effectively able to ensure that cycles are not lost. However, with today's OOOE processors all we're seeing is an increase in pipeline length (note: OOOE has a lot more stages) and therefore an increase in latency. An increase in latency means a huge bog-down in performance.
In-order will always have higher net latency than OOOE. The thing about OOO is that it *hides* latencies well. Say you have an in-order and an OOO chip that both have 20 stages... and you have a small codeblock that starts with a RAW (Read After Write) data dependency followed by a bunch of otherwise independent ops. An in-order execution will show some latency on that RAW dependency and then execute those following ops. An OOO execution will fill in the independent ops in between that RAW dependency so everything else executes during that latency. End result is that while the latencies are about the same, it was hidden in the OOO case.
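The RAW example above can be played out with a toy issue model in Python. This is a sketch under loud assumptions, not a model of any real chip: one instruction issues per cycle, pipeline depth is ignored, every op writes a unique register (so no renaming is needed), and the latencies in `PROGRAM` are made-up numbers. `Op`, `run` and `PROGRAM` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    dest: str        # register written (unique per op in this toy model)
    srcs: tuple      # registers read
    latency: int     # cycles from issue until the result is usable


def run(ops, out_of_order):
    """Issue at most one op per cycle; return (finish_cycle, issue_order)."""
    # Results of ops that have not issued yet are "never ready" for now.
    ready_at = {op.dest: float("inf") for op in ops}
    pending = list(ops)
    order = []
    cycle = 0
    while pending:
        # In-order may only look at the oldest op; OOO scans the whole window.
        window = pending if out_of_order else pending[:1]
        for op in window:
            if all(ready_at.get(src, 0) <= cycle for src in op.srcs):
                ready_at[op.dest] = cycle + op.latency
                order.append(op.dest)
                pending.remove(op)
                break
        cycle += 1
    return max(ready_at.values()), order


# A 4-cycle load, an op that depends on it (the RAW), then independent ops.
PROGRAM = [
    Op("r1", ("a",), 4),
    Op("r2", ("r1",), 1),   # RAW dependency on r1
    Op("r3", ("x",), 1),
    Op("r4", ("y",), 1),
    Op("r5", ("z",), 1),
]

in_order_cycles, in_order_issue = run(PROGRAM, out_of_order=False)
ooo_cycles, ooo_issue = run(PROGRAM, out_of_order=True)
```

In this model the load takes 4 cycles either way. The in-order run waits it out before touching the independent ops and finishes at cycle 8, while the OOO run slots r3, r4 and r5 into the wait and finishes at cycle 5. The latency is the same in both cases; it is only hidden in the second.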

There's no rule at all that OOOE = longer pipelines. OOOE = more complex pipeline stages. P4 is the only OOO processor that has longer pipelines than CELL/XeCPU. A64 is 12 stages, G5 is 7-10, iirc, compared to the PPE's 21 stages. Conversely, though, the complexity does mean that A64 and G5 (and actually, for that matter P4, in spite of its 28 stages) cannot clock as high as CELL or XeCPU at the same voltage and current draw. Possibly not at all.

This is really an example of things coming full circle. Prior to the Pentium Pro, we had in-order CPUs on the PC, and people wrote code with that in mind. When OOO first showed up, game programmers were experiencing headaches because it became harder to keep track of what's going on internal to the CPU. Now we've gotten used to OOOE, and going back is becoming a headache. Surprise, surprise.

So you see, x86 needs to go, the Cell will be the first CPU to help ensure that it dies. :tada:
If anything, that would make Intel happy -- they've been wanting to get away from x86 for a while. Itanium was made with just that purpose. There are going to be a hundred more designs before x86 really dies. And there are plenty of cases prior that people hope will do something about x86. Hell, PPC first showed up with exactly that purpose. DEC also tried to sell Alpha on the PC as well.

rpgamer_2k5
07-05-2005, 06:27 PM
FlexIO matches Cell's philosophy: it is very simple, not complex like its counterparts, and for this reason it can be fast and efficient like the Cell. But as xbdestroya has just stated, the GPUs on PCs will continue doing most of the work, and a dedicated PhysX part will also join in. PCI-E should be sufficient, but PC gaming will become a lot more expensive.

Now the Xbox 360 might have a problem. I have read (or is it just speculation?) that the XeCPU will have a similar ~39,000 active body count to the PhysX and Cell. Any chipset will have trouble managing that many objects. I have no doubt that the XeCPU has sufficient power, since it is three Cell PPEs (so 3 VMX units). However, I wonder about the bandwidth. The XeCPU will be doing all the general CPU operations, sound and physics. Doing this is possible, as I stated earlier, but moving all this data is going to be trouble. This explains why Sony has opted for the FlexIO data bus: the Cell requires it to move all that heavy data. This is likely the same reason why we see AGEIA eyeing XDR DRAM.

The Cell will not only generate a high active body count but it should be able to manage all of its operations nicely.

rpgamer_2k5
07-05-2005, 11:00 PM
Never say "no bottleneck"... anything can be a bottleneck depending on the software written for it. I mean, you can point to bandwidth, in-order, EIB, cache, fillrate, blah, blah blah, and all of them can be potential bottlenecks. There is no such thing as a system architecture that has no bottlenecks -- There are more flying pandas than bottleneck-free system architectures.

I'm aware of that. Nothing is perfect and one can even say that 'perfection' is unattainable.

I find it funny that you put "paid" as something more severe than "deranged."
Well, you see, the kind of claims we see generated by certain sites is getting quite low. I don't really mind them earning money, but hey, they have to be more constructive.

The "breaking up" and "refinement" of which you speak is totally manual. The codebase itself has to be broken up to that level. The only thing preventing you from branching within SPE code is you. Typically, conditional moves are going to be a lot faster than branches on the PPE as well (and MS's whitepapers say the same about XeCPU). The branch prediction on the PPE is pretty poor, actually. What the PPE does is it keeps running that API which tracks available resources for execution and distributes apulets. You still have to explicitly write code for the SPEs and explicitly write code that requests SPEs and issues apulets and the corresponding data. None of that just magically happens.

I never said that the PPE-SPE configuration is automatic. IBM, Sony and Toshiba will be developing tools that will allow this radical multi-core to be easily tapped. Such support is really needed to allow such a CPU to fly to greater heights.

I'll assume that you made a typo in saying XeCPU uses OOOE, since you said the two are even. The lack of OOOE greatly decreases complexity and transistor count per unit, really -- and you also end up being able to clock higher at lower power consumption because of that simplicity.

That was a typo. Yes, and this is why the SPEs are very simple RISC cores without hardware predictors.

In-order will always have higher net latency than OOOE. The thing about OOO is that it *hides* latencies well. Say you have an in-order and an OOO chip that both have 20 stages... and you have a small codeblock that starts with a RAW (Read After Write) data dependency followed by a bunch of otherwise independent ops. An in-order execution will show some latency on that RAW dependency and then execute those following ops. An OOO execution will fill in the independent ops in between that RAW dependency so everything else executes during that latency. End result is that while the latencies are about the same, it was hidden in the OOO case.


Interesting...take a look at this:

The large performance advantage offered by out-of-order execution on singlethreaded workload shrinks when several threads are executed on an architecture featuring SMT. This can be explained by several converging reasons: in the in-order pipeline, the opportunity to find independent instructions to fire increases when several threads are executing in parallel. In the out-of-order pipeline, the speculative execution of instructions which will be later canceled consumes resources potentially useful for the execution of valid instructions from the other threads.

Source:
Out-Of-Order Execution May Not Be Cost-Effective on Processors Featuring Simultaneous Multithreading; Sebastien Hily, Andre Seznec
© Copyright 2005, IEEE

Since the Cell and XeCPU will eventually be multithreaded, it would make more sense to opt for in-order execution than OOOE.

This is really an example of things coming full circle. Prior to the Pentium Pro, we had in-order CPUs on the PC, and people wrote code with that in mind. When OOO first showed up, game programmers were experiencing headaches because it became harder to keep track of what's going on internal to the CPU. Now we've gotten used to OOOE, and going back is becoming a headache. Surprise, surprise.

We'll be going in a 360 for sure. =) All new technologies have a very difficult transition phase. Many just cannot go back since they are so fixated on the technology they are using. I feel that the Cell will bring about CPUs from Intel and AMD that have a similar architecture.

If anything, that would make Intel happy -- they've been wanting to get away from x86 for a while. Itanium was made with just that purpose.

Intel wanted to develop their own proprietary architecture so that they could kill AMD. The Cell architecture can be licensed by IBM/Sony/Toshiba to any company, including AMD. Anyways, even though the Itanium was so much better than x86, it had poor x86 support. The Itanium emulated x86 code poorly, so obviously the new AMD CPUs with AMD64 extensions would do better. Cell, however, will be able to support all 32-bit applications and would not experience performance lags like the Itanium.

MegaGrid
07-08-2005, 03:18 PM
It is not just a magnificent GPU that saves the day. Sufficient local bandwidth is not enough. I have done a lot of thinking about this story. I would never have posted here unless I was certain about the technological breakthrough that CELL offers.
It is simply impossible to run a game on the GPU alone. Make a GPU, put an OS on it and run the game there, and then I will discuss the sole significance of the GPU.
If I see a CPU-GPU combination the way we know it now, I will just say no, it's not enough, no matter how magnificent the GPU is. SONY knew this very well. A true multimedia system has LARGE pipes. That is the nature of multimedia. The PC motherboard and CPU need a redesign to reach the level of the PS3. Right now it is impossible. Also, to buy a good quality PC multimedia system you need lots of $s, and afterwards you can donate it to an Eskimo for entertainment and heating.
Anyway, this is the First Rule when you enter multimedia: do you have big pipes? Are they used properly, so that it is not a rock grinder? I think that SONY passed these two questions successfully.

cpiasminc
07-08-2005, 06:51 PM
Since the Cell and XeCPU will eventually be multithreaded, it would make more sense to opt for in-order execution than OOOE.
Well, yes, assuming you have enough TLP in the codebase to fill in latencies. I guess I should have said that single-threaded performance will always be rife with greater latency on in-order vs. OOO. With an OOOE CPU, it tries to extract ILP on each thread and fills in gaps -- so there wouldn't be as big of gaps for SMT to fill in. Which is why, for instance, Hyperthreading on P4 doesn't give you a linear improvement per thread.

However, there's still the question of how many ways of SMT is enough. Personally, looking at some profiles we've been seeing on 360 since we got our beta kit (last week), I don't think 2 hardware threads per core is enough to hide latencies very well at all. And I think that applies to CELL as well, but not having any PS3 kits here, I couldn't say for sure.

You look at, for instance, Niagara, which is the closest established thing in the server market to CELL, and each of its small cores has 4-way SMT for a reason. Profiles of the small in-order cores showed that latencies vs. execution times of each thread tended to be about 3:1 (i.e., an average of 3 out of 4 cycles per thread were stalls).

At the same time, in a server environment, it's easy to find that amount of TLP -- not very likely in games, if that.
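A back-of-the-envelope way to see why the 3:1 figure points at 4 hardware threads is a tiny issue-slot model in Python. It is a sketch with strong assumptions: every thread follows a rigid pattern of 1 busy cycle followed by 3 stall cycles (real stalls are irregular), the core has one issue slot per cycle, and there is always enough TLP to keep every hardware thread supplied with work. `core_utilization` is a hypothetical name.

```python
def core_utilization(n_threads, busy=1, stall=3, cycles=1200):
    """Fraction of cycles in which a single-issue in-order core issues work.

    Each hardware thread repeats: `busy` cycles of work, then `stall` cycles
    waiting (e.g. on memory). Each cycle the core issues from the first
    thread that is ready, or idles if none is.
    """
    ready_at = [0] * n_threads        # cycle at which each thread can issue
    work_left = [busy] * n_threads    # busy cycles left before the next stall
    issued = 0
    for cycle in range(cycles):
        for t in range(n_threads):
            if ready_at[t] <= cycle:
                issued += 1
                work_left[t] -= 1
                if work_left[t] == 0:
                    # Thread stalls; it may issue again after `stall` cycles.
                    ready_at[t] = cycle + 1 + stall
                    work_left[t] = busy
                break
    return issued / cycles


utilization = {n: core_utilization(n) for n in (1, 2, 3, 4)}
```

Under these assumptions one thread keeps the core busy 25% of the time, two threads 50%, and it takes four to fill every slot, which lines up with the 4-way choice on Niagara and with 2 threads per core leaving a lot of latency exposed.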

Of course, some of the stalls we are seeing right now could be better covered given time. My officemate was mentioning a little while ago how for so long, as CPUs have been getting faster, the SoA over AoS approach (structure of arrays rather than array of structures) has been pushed. And while everybody understood the concept and how it can help streamline memory accesses, the repercussions of NOT doing it were never really clear until recently. With 360, oh boy, does it become achingly clear, and PS3 is likely not to be an exception either.
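For anyone who hasn't seen the two layouts side by side, here is a small Python sketch of why SoA streamlines memory accesses. The particle record, the field names and the 128-byte cache line are assumptions made for the example, and both buffers are treated as if they started on a cache-line boundary. The code packs the same data both ways and counts how many cache lines a pass over just the `x` field would touch.

```python
import struct
from array import array

CACHE_LINE = 128                      # bytes per cache line (assumed)
FIELDS = ("x", "y", "z", "health")    # hypothetical particle record
N = 64

particles = [(float(i), 2.0 * i, 3.0 * i, 100.0) for i in range(N)]

# AoS: one 16-byte record per particle, fields interleaved in memory.
RECORD = struct.Struct("<4f")
aos = b"".join(RECORD.pack(*p) for p in particles)

# SoA: one contiguous float array per field.
soa = {name: array("f", [p[k] for p in particles])
       for k, name in enumerate(FIELDS)}


def cache_lines_touched(byte_offsets):
    """Number of distinct cache lines containing the given byte offsets."""
    return len({offset // CACHE_LINE for offset in byte_offsets})


# Byte offsets of every particle's x value in each layout.
aos_x_offsets = [i * RECORD.size for i in range(N)]
soa_x_offsets = [i * soa["x"].itemsize for i in range(N)]

# Same data either way; only the memory traffic differs.
sum_x_aos = sum(struct.unpack_from("<f", aos, off)[0] for off in aos_x_offsets)
sum_x_soa = sum(soa["x"])

aos_lines = cache_lines_touched(aos_x_offsets)
soa_lines = cache_lines_touched(soa_x_offsets)
```

With these numbers the AoS pass drags in 8 cache lines to read 64 floats, three quarters of which is data it never uses, while the SoA pass touches 2. On an in-order core with nothing to hide the extra misses, that difference shows up directly as stalls.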

Intel wanted to develop their own proprietary architecture so that they could kill AMD. The Cell architecture can be licensed by IBM/Sony/Toshiba to any company, including AMD. Anyways, even though the Itanium was so much better than x86, it had poor x86 support. The Itanium emulated x86 code poorly, so obviously the new AMD CPUs with AMD64 extensions would do better. Cell, however, will be able to support all 32-bit applications and would not experience performance lags like the Itanium.
In the beginning, HP had so many things in mind for Itanium whereas Intel pretty much didn't. And the problem was that they tried to make 1 product that did everything for everybody and committed to certain things that not many people within Intel entirely agreed with. e.g. predication -- yes, it's nice and all in theory, but it's entirely useless until compilers are good at dealing with it. And just because Intel can make a compiler that beautifully optimizes SPEC, doesn't mean much if it can't do the same for everything else.

Also, there's a lot of argument over which direction the future is really taking, and I really don't think Itanium filled that niche.

PhYmon
07-10-2005, 06:27 AM
Hey people, I found some interesting stuff at the IBM website, check this out: http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/D9439D04EA9B080B87256FC00075CC2D/$file/MPR-Cell-details-article-021405.pdf

PhYmon
07-10-2005, 06:28 AM
Hey people, I found some interesting stuff about the Cell at the IBM website, check this out: http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/D9439D04EA9B080B87256FC00075CC2D/$file/MPR-Cell-details-article-021405.pdf