Sony Research: Deferred Shading pushes PS3 performance to state-of-the-art level


Arnaud_M
09-12-2007, 06:19 AM
Here is a very interesting paper from Sony Research. It shows how intelligent use of the SPUs and the GPU together can give the PS3 a performance level on par with (or close to) a top-of-the-line PC GPU. It also shows the complexity of programming this beast, and as such makes for very interesting reading. I suggest you at least try to read it, even if you do not understand all of it (I do not understand all of it either); it will enlighten you. And if you have any questions, I will gladly let CPI take them :-D :-D

Paper link (free access, no login required):
http://research.scea.com/ps3_deferred_shading.pdf

Some sentences from the paper:

This paper studies a deferred pixel shading algorithm implemented on a Cell/B.E.-based computer entertainment system. The pixel shader runs on the Synergistic Processing Elements (SPEs) of the Cell/B.E. and works concurrently with the GPU to render images. The system's unified memory architecture allows the Cell/B.E. and GPU to exchange data through shared textures. The SPEs use the Cell/B.E. DMA list capability to gather irregular fine-grained fragments of texture data generated by the GPU. They return resultant shadow textures the same way. The shading computation ran at up to 85 Hz at HDTV 720p resolution on 5 SPEs and generated 30.72 gigaops of performance. This is comparable to the performance of the algorithm running on a state of the art high end GPU. These results indicate that the Cell/B.E. can effectively enhance the throughput of a GPU in this hybrid system by alleviating the pixel shading bottleneck.

(...)

We chose an extreme test case that stresses the memory subsystem and generates a significant amount of DMA waiting. Despite this waiting the algorithm scaled efficiently with speedup of 4.33 on 5 SPEs. This indicates the Cell/B.E. can be effective in speeding up this sort of irregular fine-grained shader. These results would carry over to less extreme shaders that have more regular data access patterns.

(...)

We study variations of a Cone Culled Soft Shadow algorithm. This algorithm belongs to a class of algorithms known as shadow mapping algorithms.

(...)

The algorithm is not physically correct and we accept many approximations for the sake of real time performance.

(...)

the dandelion is a challenging test for shadow algorithms. The algorithm correctly reproduced the fine detail at the base of the plant as well as the internal self-shadowing within the leaves.

(...)

The key to running any algorithm on the SPEs is to develop a streaming formulation in which data can be moved through the processor in blocks. We move eye data in scanline order and double buffer the scanline input. While one scanline of pixels is being processed we prefetch the next scanline. As each scanline is completed it is written to the shadow texture. We have measured the DMA waiting for the scanline data and it was negligible.
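A minimal sketch of the double-buffered scanline streaming described in the excerpt above, in plain Python. The names `stream_scanlines`, `fetch_scanline` and `shade` are hypothetical and not from the paper. On the real hardware the prefetch would be an asynchronous DMA transfer; here it is only imitated by fetching the next line into the second buffer before the current line is processed.

```python
def stream_scanlines(fetch_scanline, shade, num_lines):
    """Process scanlines in order while keeping the next one prefetched.

    fetch_scanline(y) returns the pixels of line y (stand-in for a DMA get).
    shade(pixel) computes one output value (stand-in for the shader).
    """
    if num_lines == 0:
        return []
    buffers = [None, None]            # the two halves of the double buffer
    buffers[0] = fetch_scanline(0)    # prime the first buffer
    output = []
    for y in range(num_lines):
        current = y % 2
        following = (y + 1) % 2
        if y + 1 < num_lines:
            # Assumption: on an SPE this fetch would overlap with the shading below.
            buffers[following] = fetch_scanline(y + 1)
        output.append([shade(p) for p in buffers[current]])
    return output


image = [[1, 2], [3, 4], [5, 6]]
result = stream_scanlines(lambda y: image[y], lambda p: p * 2, len(image))
print(result)
```

The point of the two buffers is that the processor never has to wait on the line it is about to shade, which is consistent with the excerpt's remark that DMA waiting for scanline data was negligible.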

(...)

By having multiple DMA lists in flight concurrently we buffer fragment data in order to minimize DMA waiting. We experimented with the number and size of the DMA lists in order to minimize runtime. We found that having four DMA lists was optimal and that larger numbers did not reduce the runtime. We found similarly that fetching 128 pixels per DMA list was optimal and that longer DMA lists did not reduce runtime. We parallelized the computation across multiple SPEs by distributing scanlines to processors. This is straightforward and provides balanced workloads. We scheduled tasks using an event queue abstraction provided by the operating system that is based on one of the Cell/B.E. synchronization primitives, the mailbox. We measured the cost of this abstraction at less than 100 microseconds per frame. When running in parallel on multiple SPEs the individual processors completed their work within 100 microseconds of each other. Each SPE computes a set of scanlines for the shadow texture. They deliver their result directly into GPU memory in order to minimize the final render time.
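To make the work distribution in the excerpt above concrete, here is a small Python sketch. It is an illustration only: the paper says scanlines are distributed to processors but the excerpt does not say how, so the round-robin scheme is an assumption, and the function names `assign_scanlines` and `split_into_batches` are hypothetical. The 128-pixel batch size is the figure quoted for one DMA list, and 1280x720 is the 720p resolution.

```python
def assign_scanlines(num_lines, num_workers):
    """Round-robin assignment (assumed): worker w gets lines w, w + n, w + 2n, ..."""
    return [list(range(w, num_lines, num_workers)) for w in range(num_workers)]


def split_into_batches(num_pixels, batch_size):
    """Split one scanline into (start, end) pixel ranges of at most batch_size."""
    return [(start, min(start + batch_size, num_pixels))
            for start in range(0, num_pixels, batch_size)]


work = assign_scanlines(720, 5)            # 720 scanlines over 5 SPEs
batches = split_into_batches(1280, 128)    # one 1280-pixel line in 128-pixel batches
print([len(lines) for lines in work])      # 144 lines per worker
print(len(batches))                        # 10 batches per line
```

With 720 lines and 5 workers every worker gets exactly 144 lines, which fits the excerpt's claim that the split gives balanced workloads.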

(...)

1 SPE: 19 Hz (= frames per second)
2 SPEs: 34 Hz
3 SPEs: 59 Hz
4 SPEs: 75 Hz
5 SPEs: 85 Hz

(...)

The monochromatic shader ran at 85 Hz using 5 SPEs and at 34 Hz using 2 SPEs. Videogames are typically rendered at 30 or 60 frames per second. Shading calculations should generally run at these rates, but for shadow generation it is possible to use lower frame rates without affecting image quality. It would also be possible to use shadows generated at 720p resolution with a base image rendered at a higher 1080p resolution (1920x1080 pixels).

(...)

We implemented the same algorithm on a high end state of the art GPU, the NVIDIA GeForce 7800 GTX running in a Linux workstation. This GPU has 24 fragment shader pipelines running at 430 MHz and processes 24 fragments in parallel. By comparison the 5 SPEs that we used process 20 pixels in parallel in quad-SIMD form. The GeForce required 11.1 ms to complete the shading operation. In comparison the Cell/B.E. required 11.65 ms including the DMA waiting time, and would require only 8.56 ms if the DMA waiting were eliminated. The performance of the Cell/B.E. with 5 SPEs was thus comparable to one of the fastest GPUs currently available, even though our implementation spent 27% of its time waiting for DMA. Results would presumably be even better on 7 SPEs, or on fewer SPEs if we could reduce or eliminate the DMA waiting.
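The timings quoted above can be cross-checked with a few lines of Python. This is only arithmetic on the numbers in the excerpt (11.65 ms, 8.56 ms, 11.1 ms); the variable names are made up for the illustration.

```python
cell_ms = 11.65          # Cell/B.E. with 5 SPEs, including DMA waiting
cell_no_dma_ms = 8.56    # same, if DMA waiting were eliminated
gpu_ms = 11.1            # GeForce 7800 GTX

# Fraction of the Cell time spent waiting on DMA.
dma_fraction = (cell_ms - cell_no_dma_ms) / cell_ms

# Convert per-frame times to frame rates.
cell_hz = 1000.0 / cell_ms
cell_no_dma_hz = 1000.0 / cell_no_dma_ms
gpu_hz = 1000.0 / gpu_ms

print(f"DMA waiting: {dma_fraction:.1%}")
print(f"Cell: {cell_hz:.1f} Hz, Cell without DMA waiting: {cell_no_dma_hz:.1f} Hz")
print(f"GPU: {gpu_hz:.1f} Hz")
```

The DMA share comes out at roughly 27%, and the two Cell timings correspond to about 85 Hz and about 116 Hz, so the figures in the excerpt are consistent with each other.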

(...)

Our initial results are encouraging as they show it is feasible to attain scalable speedup and high performance even for shaders with irregular fine-grained data access patterns. Removing the computation from the GPU effectively increases the frame rate, or more likely, the geometric complexity of the models that can be rendered in real time. We can also conclude that the performance of the Cell/B.E. is superior to a current state of the art high end GPU in that we achieved comparable performance despite performance limitations and despite using only part of the available processing power.

Arnaud

curryking1
09-12-2007, 06:29 AM
Interesting.... I wish... I could understand.

Something cool I saw was rendering shadows at 720p but having the base image at 1080p, that's interesting.

Pluto
09-12-2007, 06:53 AM
^
You were brave enough to post a response to this? Wow.

section
09-12-2007, 07:51 AM
Arnaud man, thanks for the heads up!

Will have to read through that paper on better time. Publications at research.scea.com aren't released so often so every new paper is good.

Nova, this is what this part of the forums used to be all about, info, metadata about technologies behind the Playstation brand. You may not be interested in it but there used to be a good bunch of people who did / still are.

Lost|Identity
09-12-2007, 08:41 AM
After a first read through, it seems this article shows that there is a large bottleneck in DMA gather functions which is limiting the effectiveness of deferred rendering (due to fine-grain irregular memory accesses) using Cell. This means that Cell is waiting for data to come in from memory for about 27% of the time. Really, this article all in all is more about the advantage of moving the shading operations in a deferred model over to Cell from the RSX (or as they keep referring to it "the GPU" ;)) than it is about deferred rendering on the PS3 in general.

(oh and yea, they estimate a 4.33 times increase when using 5-SPEs for shading a ~100,000 polygonal model of a tree using a monochromatic shader :) which the researchers think they can push up to 4.90 if they can take care of the bottleneck, taking it from 85Hz (fps) to 116. I'm a bit skeptical of that gain being so clear cut)

makeitlookreal
09-12-2007, 09:34 AM
What is so awesome is that we now know the CELL can do just about anything. It can help with pixel shading, vertex shading, geometry, physics, animation, etc. It is simply awesome!

yoshaw
09-12-2007, 09:51 AM
Interesting.... I wish... I could understand.

Something cool I saw was rendering shadows at 720p but having the base image at 1080p, that's interesting.


^
You were brave enough to post a response to this? Wow.

:lol: @ Curry and Nova!!



I mean seriously, :laugh:

VG Aficionado
09-12-2007, 12:14 PM
Interesting :)

yoshaw
09-12-2007, 12:17 PM
What caught my attention was like 5 spu's working in unison to match the power of a 7800GTX graphics card. Now I may not know the itsy bitsy details that might throw the whole comparison off if taken seriously but maaaayen, on paper it sounds frikkin awesome! LOL

Fazares
09-12-2007, 01:04 PM
only cell i care about...

kaphwan
09-12-2007, 01:05 PM
Oh, god.

I'm reading the paper now.


THIS IS THE PS3INSIDER OF MY CHILDHOOD

+REP TIMES A MILLION, I LOVE YOU

makeitlookreal
09-12-2007, 01:12 PM
This is an awesome paper. The fact that the CELL can perform pixel shading this well is in direct contradiction to what some people said a year ago. They were saying that the CELL could not perform pixel shading well. To know 5 SPEs can match the pixel shading power of a 7800GTX is pretty stunning!

yoshaw
09-12-2007, 01:54 PM
Yea, but the thing is, how many developers are even gonna do it?

makeitlookreal
09-12-2007, 02:06 PM
I don't think every developer will use SPEs for pixel shading. It would cost more money that many developers would not want to spend. However, I would expect at least some big budget games would utilize this to some extent. Of course after it's been used several times and it is more understood more developers will be utilizing it.

My guess is that we will see this used in games later on in the life of the PS3, but not immediately.

frosty
09-12-2007, 02:39 PM
Killzone is already using it.

makeitlookreal
09-12-2007, 03:16 PM
Hello Frosty,

I don't think Killzone is already using deferred pixel shading as in this method. They are just adding post processing effects with the CELL.

Sephiroth_VII
09-12-2007, 03:32 PM
Ah, now this what I want to see. :cloud9:

I've only just skimmed through the paper, but now I'm going to print it and take my good time reading it this evening.

+rep!!!

cpiasminc
09-12-2007, 06:08 PM
Geez, people love to jump to conclusions. It can help with deferred shadow rendering at 85 Hz if you do absolutely nothing else per frame, but to make the leap to say that you can do any and all pixel shading on Cell is absurd. Sure, if all pixel shading is to amount to sampling 1 texture and rendering 1 screen-sized quad in grayscale.

I'll have to look at this in more detail after I get into work, but even at a glance, I can guess that the filter kernel will be a major hell.

Arnaud_M
09-12-2007, 06:15 PM
This means that Cell is waiting for data to come in from memory for about 27% of the time.

I am not a graphics programmer, but this means to me (as they suggest in their conclusion) that their method is simply a first approach that needs to be refined over time, and that with refinements, they may lower this waiting time and increase the performance even more.


(oh and yea, they estimate a 4.33 times increase when using 5-SPEs for shading a ~100,000 polygonal model of a tree using a monochromatic shader :) which the researchers think they can push up to 4.90 if they can take care of the bottleneck, taking it from 85Hz (fps) to 116. I'm a bit skeptical of that gain being so clear cut)

It is a "monochromatic" shader because it only calculates the intensity of shadow, from light to dark, it is not a limitation. And a scale up of 4.33 for 5 SPU is very good, I have seen far worse scale ups than that in my programming history :-D (but granted, this is a task that is well suited to parallelization). I am not sure though that using 5 SPU to reach a shadow calculation rate of 85 frames per second would be an ideal scenario for a game. As they suggest, they can use only 2 SPU and reach 30fps for shadow calculation, which would be sufficient even for a 60 fps game. And that would leave the other 3 SPU free for other tasks.
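For reference, the scaling figures discussed in this thread translate into parallel efficiency (speedup divided by processor count) as follows. This is a small Python sketch; the function name `parallel_efficiency` is just for illustration, and the 4.33 and 4.90 speedups are the ones quoted from the paper earlier in the thread.

```python
def parallel_efficiency(speedup, num_processors):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / num_processors


measured = parallel_efficiency(4.33, 5)    # measured speedup on 5 SPEs
projected = parallel_efficiency(4.90, 5)   # projected speedup without DMA waiting
print(f"measured: {measured:.1%}, projected: {projected:.1%}")
```

So the measured result is a bit under 87% of ideal linear scaling, and the projection without DMA waiting is 98%.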

This paper and others show that SPUs are really fascinating little beasts. With them, a PS3 game could be made to either be graphically spectacular, or to have a physical simulation system with an incredibly high number of interacting elements, or to have a powerful AI system (still to be seen, that one, but I am confident ;-)). Or a little bit of everything. Or a varying amount of everything, depending on the level in the game. The PS3 is an extraordinarily customizable machine, with a very different perspective on development than either a 360 or a regular PC, platforms on which you do not have such raw, customizable power that can be allocated to any task at hand. The geek in me likes this :-D Now, let's hope that Sony will integrate all of these approaches into an easy-to-use framework for game developers, because if not, I doubt a lot of them will spend that much time optimizing them by hand.

Arnaud

cpiasminc
09-12-2007, 07:37 PM
And a scale up of 4.33 for 5 SPU is very good, I have seen far worse scale ups than that in my programming history (but granted, this is a task that is well suited to parallelization).
Moreover, it's a parallelizable task that doesn't demand a lot of data, a lot of which is shared and read-only. There's a shared writable datablock, but all of it can be tiled to subsections so that nobody steps on anybody's feet. You certainly could never achieve that with a more complex shader that depends on more data -- you'll hit memory access bounds in no time (that's the key difference between doing this on Cell and doing it on a GPU).

As they suggest, they can use only 2 SPU and reach 30fps for shadow calculation, which would be sufficient even for a 60 fps game. And that would leave the other 3 SPU free for other tasks.
Huh? How is 30 fps for shadow calculation sufficient for a 60 fps game? You're taking twice the time you have to spare in a frame just for shadows. And the moment you mix it in with other stuff dealing with other data and other unrelated jobs to be managed, there's no way you can maintain that 30 fps... Really, you should be aiming at 180+ fps if you're going to defer shadowing to Cell.
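The frame-budget arithmetic behind this objection can be written out in a few lines of Python. It is only a sketch of the argument, assuming the shadow pass has to finish inside every displayed frame; the helper name `frame_budget_ms` is made up for the illustration.

```python
def frame_budget_ms(fps):
    """Time available (or consumed) per frame at a given rate, in milliseconds."""
    return 1000.0 / fps


game_budget = frame_budget_ms(60)       # total time per frame for a 60 fps game
shadows_at_30 = frame_budget_ms(30)     # cost of a shadow pass that runs at 30 Hz
shadows_at_180 = frame_budget_ms(180)   # cost of a shadow pass that runs at 180 Hz

print(f"60 fps budget: {game_budget:.2f} ms")
print(f"30 Hz shadows: {shadows_at_30:.2f} ms ({shadows_at_30 / game_budget:.0%} of budget)")
print(f"180 Hz shadows: {shadows_at_180:.2f} ms ({shadows_at_180 / game_budget:.0%} of budget)")
```

Under that assumption, a 30 Hz shadow pass costs twice the whole 60 fps frame budget, while a 180 Hz pass costs one third of it and leaves the rest for everything else.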

Or a little bit of everything. Or a varying amount of everything, depending on the level in the game.
In general, yes, though this particular example is more of an "all or nothing" thing. You couldn't really partition it with the GPU except maybe across different passes/lights -- and if you tried to do that, you could run into issues where the CPU and GPU are fighting for memory. That's a fight that the GPU will always win.

It's a nice academic exercise, but it's a ways off from being useful in practice. I think the biggest thing to gain out of it is really ideas of how to manage common resources across multiple SPEs. It's relatively easy to parallelize stuff that doesn't share any data (and that's what most people will do first), but that alone won't account for much.

LiquidEagle
09-12-2007, 07:49 PM
Well it's a good thing I didn't know what this was in the first place, so I'm not disappointed it isn't very realistic... Thanks for the info cpi :)

GTShotoKen
09-12-2007, 08:20 PM
I find it kind of odd that the high-end "state of the art" GPU they tested this code on happened to be the 7800 GTX. Why not the 7900 instead (I know for sure that the 8000 series of GPUs is out of the question even when dealing with DX9 performance)?

Jay Gee
09-12-2007, 08:26 PM
CPI should really just start all of the tech-related topics on this board before we all start scratching our heads in utter confusion or going "Oooh! I knew PS3 could hack the Matrix!!!"

VG Aficionado
09-12-2007, 08:26 PM
I find it kind of odd that the high-end "state of the art" GPU they tested this code on happened to be the 7800 GTX. Why not the 7900 instead (I know for sure that the 8000 series of GPUs is out of the question even when dealing with DX9 performance)?

Because this document is older than it seems? There are a few things in it that make me think it's not that recent.

And PS3 doesn't use any kind of DirectX anyway. It uses OpenGL.

Lost|Identity
09-12-2007, 08:27 PM
It is a "monochromatic" shader because it only calculates the intensity of shadow, from light to dark, it is not a limitation.

I didn't say it was a limitation... did I? I was saying that they thought they could push the gains to 4.90 times the use of 1 SPE if they took care of all the DMA waiting (14 to 17% of which is gather, and the rest being waiting). The smiley was because calling what is essentially "gray-scale" a "monochromatic shader" was amusing to me :P.

As to this being realistically usable, it is in as much as there can be a use found for it. Some of the gains they say they got were as a viable solution to complex self shadowing.

"the dandelion test for shadow algorithms. The algorithm correctly reproduced the fine detail at the base of the plant as well as the internal self-shadowing within the leaves."

Judge for yourself:

http://i35.photobucket.com/albums/d185/OpenSource/danelion_celldeferred.jpg

I'm sure this isn't the best there is, but I think they are saying that it is good relative to the efficiency of the shader. Maybe this technology is part of EDGE... or not.

As to the method their algorithm uses, they say it is a variation on the Jensen and Christensen extended photon mapping algorithm.

Jensen and Christensen extended photon mapping [10] by prolongating the rays shot from the lights and storing the occluded hit points in a photon map which is typically a kdtree. When rendering a pixel x the algorithm looks up the nearest photons around x and counts the numbers of shadow photons ns and illumination photons ni in the neighborhood. The shadow intensity is then estimated as V = ni / (ns + ni). Our algorithm uses similar concepts to gather fragments and shade pixels, and in addition works with translucent materials.
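The estimate V = ni / (ns + ni) quoted above is simple enough to write down directly. A minimal Python sketch follows; the function name `shadow_visibility` and the handling of the empty-neighborhood case are assumptions, not something the quoted passage specifies.

```python
def shadow_visibility(n_illum, n_shadow):
    """Estimate visibility V = ni / (ns + ni) from photon counts near a pixel.

    n_illum  -- number of illumination photons (ni) in the neighborhood
    n_shadow -- number of shadow photons (ns) in the neighborhood
    """
    total = n_illum + n_shadow
    if total == 0:
        # Assumption: with no photons nearby, treat the pixel as fully lit.
        return 1.0
    return n_illum / total


print(shadow_visibility(3, 1))   # 3 lit photons out of 4 -> 0.75
print(shadow_visibility(0, 4))   # only shadow photons -> 0.0
```

A value of 1.0 means fully lit and 0.0 means fully in shadow; everything in between is a soft-shadow intensity.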

lips
09-12-2007, 09:51 PM
Interesting debate over DMA latency. I would add that DMA limits may be a hard limit, as pipelined streaming imposes its own logarithmic complexities. Not for nothing, IBM has also been known to state that the Cell crossbar has already created some hurdles for them in terms of latency and bandwidth, so we can agree why such piping would be exponentially difficult. I think the paper needs to add that, while it already states dramatic improvement in cases of high cull for especially complex frames, much experimentation is needed to show improvement over any actual existing standard to be built against.

cpiasminc
09-12-2007, 10:33 PM
Some of the gains they say they got were as a viable solution to complex self shadowing.
It's regular shadow mapping with PCF filters (albeit not necessarily fixed-width in this case). Everybody and his brother already does this -- they've simply mapped it to a deferred shadowing pass on the CPU. You already get self-shadowing with this, but as with any shadowmapping, the quality thereof is going to be limited by things like map resolution and filtering quality. Neither of those things can be solved "perfectly."

As to the method their algorithm uses, they say it is a variation on the Jensen and Christensen extended photon mapping algorithm.
There's no direct variation-type relationship between Photon mapping and their algorithm at all. The only thing is that both the photon mapping shadowing estimation and regular old PCF filtering (which is what they're using) ultimately estimate shadowing levels by treating it as a statistical problem, which means the math looks the same. They're not actually storing any photon maps or any irradiance caches or anything at all of the sort.

Technically, this algorithm in its fundamental sense predates Jensen/Christensen's papers on photon mapping by about 8 or 9 years (Reeves et al in 1987), though it wasn't that long ago that it became feasible to do on GPUs. I think it would be fair to say that the dependency is the other way around.
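For readers who have not seen percentage-closer filtering (PCF), here is a minimal Python sketch of the idea: compare the receiver's depth against several neighboring shadow-map samples and average the pass/fail results. It is a generic fixed-width PCF, not the paper's implementation (which, as noted above, is not necessarily fixed-width), and the function name, the `radius` and `bias` parameters and the clamp-to-edge behavior are all assumptions made for the illustration.

```python
def pcf_shadow(shadow_map, x, y, receiver_depth, radius=1, bias=1e-3):
    """Return the lit fraction (0.0 to 1.0) around texel (x, y).

    shadow_map     -- 2D list of depths as seen from the light
    receiver_depth -- depth of the shaded point in light space
    """
    height = len(shadow_map)
    width = len(shadow_map[0])
    lit = 0
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp sample coordinates to the map edges (assumed addressing mode).
            sx = min(max(x + dx, 0), width - 1)
            sy = min(max(y + dy, 0), height - 1)
            total += 1
            # The sample is lit if nothing closer to the light occludes it.
            if receiver_depth <= shadow_map[sy][sx] + bias:
                lit += 1
    return lit / total


# Left column holds an occluder at depth 0.2; everything else is far away (1.0).
shadow_map = [[0.2, 1.0, 1.0],
              [0.2, 1.0, 1.0],
              [0.2, 1.0, 1.0]]
print(pcf_shadow(shadow_map, 1, 1, receiver_depth=0.5))
```

In this example 6 of the 9 samples pass the depth test, so the pixel comes out two-thirds lit. The averaging of binary depth tests is the statistical estimate referred to above, which is why the math resembles the photon-counting formula.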

GTShotoKen
09-12-2007, 10:48 PM
And PS3 doesn't use any kind of DirectX anyway. It uses OpenGL.

Well I was under the impression that the opengl scheme used on the PS3 was equivalent to DX9?

I just assumed that they would compare their applications on the PS3 to a PC gpu running directx, but OpenGL is definitely viable too since nvidia gpus have always excelled at that.

lips
09-12-2007, 11:01 PM
opengl ftw!!!!

Arnaud_M
09-13-2007, 02:23 AM
Huh? How is 30 fps for shadow calculation sufficient for a 60 fps game?


Maybe it is a silly idea :-D But I wonder, do you really need to update the shadows at the same frame rate as the game itself? Would it look that bad to only update shadows one frame out of two, if the game runs at 60fps? Would that even be visible to the player? They discuss in the paper the idea of calculating shadows at a smaller image resolution than the game itself, and I was wondering if you could not also use a smaller "time resolution".


You couldn't really partition it with the GPU except maybe across different passes/lights -- and if you tried to do that, you could run into issues where the CPU and GPU are fighting for memory. That's a fight that the GPU will always win.


That's my next question: if the shadow pass is on the Cell, can the GPU be doing other passes at the same time? Do you think it will only be a nightmare to code :-D, or just plainly impossible?


It's a nice academic exercise, but it's a ways off from being useful in practice. I think the biggest thing to gain out of it is really ideas of how to manage common resources across multiple SPEs. It's relatively easy to parallelize stuff that doesn't share any data (and that's what most people will do first), but that alone won't account for much.

I have a last question. How are game developers informed by Sony of their new research papers? Does Sony systematically send them to you? Offer courses? Or are you on your own to go and browse them regularly? More generally, what are the links between Sony Research and game development studios? (if you are allowed to comment on that)

Arnaud

KRA
09-13-2007, 03:52 AM
Maybe it is a silly idea :-D But I wonder, do you really need to update the shadows at the same frame rate as the game itself? Would it look that bad to only update shadows one frame out of two, if the game runs at 60fps? Would that even be visible to the player? They discuss in the paper the idea of calculating shadows at a smaller image resolution than the game itself, and I was wondering if you could not also use a smaller "time resolution".


yeah it would be visible for me :)

cpiasminc
09-13-2007, 04:47 AM
Maybe it is a silly idea. But I wonder, do you really need to update the shadows at the same frame rate as the game itself? Would it look that bad to only update shadows one frame out of two, if the game runs at 60fps? Would that even be visible to the player? They discuss in the paper the idea of calculating shadows at a smaller image resolution than the game itself, and I was wondering if you could not also use a smaller "time resolution".
How fast is the action? Is it an online turn-based strategy title that happens to be rendering at 60 fps, or a fast-paced single-player action title running at 60 fps? If it's the latter, you can bet it will be visible every single frame without exception. If it's the former, you're maybe 50:50.

And rendering at a different spatial resolution is partly done because it means less data to write, fewer pixels to fill, and when the GPU finally actually applies the shadow pass image to the screen, the GPU's own texture filtering can cover up artifacting for you. That's part of the reason why people even bother with deferring shadows alone (as opposed to a more complete deferred shading).

That's my next question: if the shadow pass is on the Cell, can the GPU be doing other passes at the same time? Do you think it will only be a nightmare to code, or just plainly impossible?
Shouldn't really be a problem unless you're starting to use XDR memory space for textures. Basically, if the CPU is busy with this pass and is largely limited by data access, you don't want to let the GPU get in the way of that by also sharing the main memory pool.

I have a last question. How are game developers informed by Sony of their new research papers? Does Sony systematically send them to you? Offer courses? Or are you on your own to go and browse them regularly? More generally, what are the links between Sony Research and game development studios? (if you are allowed to comment on that)
All of the above.
A lot of stuff we just come across every now and then (and this sort of stuff is often quickly or even immediately out in the public domain).

Sony does send us stuff (though third-parties like us sometimes have to specifically ask) regarding new things, new tools, TRC updates, etc, which isn't really research, but this includes that sometimes the fruits of research come out as something we as developers can use (EDGE being the prime example thereof).

Courses, talks, lectures, mini-conferences, and/or on-site visits happen once every so often. Of course, the content of these is pretty well meant NOT to be in the public domain at that point in time.

Arnaud_M
09-13-2007, 05:34 AM
Thanks a lot for your thorough answer CPI, really appreciated !

Arnaud

Arnaud_M
09-13-2007, 06:04 AM
Hello Frosty,

I don't think Killzone is already using deferred pixel shading as in this method. They are just adding post processing effects with the CELL.

Actually, Frosty is right. I just read the interview linked to in the Killzone 2 thread, and the developers have this to say:

"Another thing we can also do with our engine is a thing called Deferred Rendering, which allows us to draw lots and lots of shadow-casting lights, and that again adds to the immersion and makes everything feel like part of the world because everything's casting shadow and behaving the way it should behave."

I take that to mean they are using the same technique (or same kind thereof) described in the research paper. But I am not a 3D programmer, so my understanding of what they are doing may be incorrect.

Arnaud

cpiasminc
09-13-2007, 08:56 AM
Deferred shading, they're doing all right... though they're not making shadows a separate pass IIRC (i.e. shadows AND lighting are both deferred to combination pass(es)), and they're not actually shading pixels on the CPU, either. A few parts of that are being done via the CPU like generating light-range geometry to cull which pixels actually get lit by each light, but the majority of the work is still GPU-based.

If you want examples of deferred shadows, you can look at Heavenly Sword or pretty much any UE3 title.

lips
09-13-2007, 02:08 PM
Shouldn't really be a problem unless you're starting to use XDR memory space for textures. Basically, if the CPU is busy with this pass and is largely limited by data access, you don't want to let the GPU get in the way of that by also sharing the main memory pool.


It's a good thing the PS3 has multi-channel memory access, unlike the Xbox 360, whose CPU and GPU share memory access by default.

Segitz
09-13-2007, 07:15 PM
I remember reading, posted by numerous highly credited members of B3D, that the CELL cannot be used for shading, as "the latency" (dunno if I remember correctly) is too high or so.

So, is DR so much less intensive for the CELL to be used for shading, or were those people wrong (I know, they mostly are under NDA, I always remember Marcos posts like "you are wrong" :D).

What I like about this is that games like FFXIII could use it to a great extent, as that game usually does not feature masses of enemies, usually no AI at all (in XII there is some rudimentary "run away from foes" AI, but that's about it) and no physics, so they can use the CELL for graphics nearly exclusively. Also cutscenes in games (MGS anyone?) could very much benefit from this.

But for the usual game, this seems a) to be too expensive to use and b) not feasible (too much work for the effort)

cpiasminc
09-13-2007, 09:36 PM
I remember reading, posted by numerous highly credited members of B3D, that the CELL cannot be used for shading, as "the latency" (dunno if I remember correctly) is too high or so.

So, is DR so much less intensive for the CELL to be used for shading, or were those people wrong (I know, they mostly are under NDA, I always remember Marcos posts like "you are wrong" :D).
This means diddly-squat about Cell's ability to shade pixels. You cannot make the leap from this to Deferred rendering in general or pixel shading in general. Shading pixels means lots of textures, lots of data, lots of granularity, lots of threads, and 8 SPUs isn't going to get you there. 80 SPUs is a good starting point, though.

This is one block of data to read, one lower resolution block of data to write, and it simply scans over the pixels, and it's working with ONE field of data for all operations -- it's not rasterizing polygons, it's not blending 20-odd texture layers, it's not alpha blending, it's not interpolating several components of vertex stream data, it's not acquiring information about several aspects of every pixel in a piece of rendered geometry.

The fact that Cell is not meant for pixel shading is tautological on account of the fact that it is a CPU. All it shows is that Cell has computational throughput. Same as every other Cell-based rendering example -- it's just showing that Cell can render in realtime if the entire scene is a single texture on a single scene element. If you'd like to prove that that is a reasonable definition for every scene in every game that can ever be made, then have at it.