polygons per second
by Kory Imaginism · in Torque 3D Professional · 04/26/2009 (11:12 pm) · 38 replies
An artist on my team asked me how many polygons per second T3D renders, and I didn't have the answer. Can somebody "shed some light" on this question?
thanks
#2
04/27/2009 (12:52 am)
From my testing in TGE, I was able to run my main character at 6k polys with a full environment on top of that. I'm sure T3D won't have an issue at least matching that kind of performance. I did have to make a minor decrease to the visibility for a bit more performance, but it worked perfectly fine. Hope this helps.
#3
04/27/2009 (4:38 am)
Polygons per second? Let's just say the answer is a lot. Individual results will vary depending on the system (especially the graphics card) and the in-game artwork. As far as object polygon limits go, I think some of the existing .dts exporters are limited to around 10,000 triangles -- that's always been a factor -- but the ability to use the COLLADA format may be a viable alternative for that.
#4
04/27/2009 (5:06 am)
I read somewhere that the Crysis development team targeted 10,000 vertices for their characters and 25,000 for the vehicles. I say vertices and not polygons because it is the vertex data that gets manipulated via transforms. All polygons in a mesh are "closed", meaning that starting at one vertex you can draw straight lines connecting each vertex in turn until you arrive back at the starting vertex.
I'm leaving requirements (e.g. convex vs. concave) out, but if you think about a square with four vertices connected by lines: you start at one vertex, draw a line to the next, and continue until you get back to the start.
Why is that important? Well, you only have one poly, but you have four vertices defining it. In order to do transformations (rotate, scale, translate) you need to multiply each vertex by a matrix. This provides position and orientation for rendering, and the higher the number of vertices in the model, the longer it takes to transform.
All of that said, I have to think about how this affects shading (whether vertices are relevant in a pixel shader). I would say it has an effect, because from what I know about shading, the per-pixel color value depends on interpolating vertex attributes such as normal and color.
Sorry about the long post.
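The "multiply each vertex by a matrix" step above can be sketched in a few lines of Python (an illustrative 2D version using homogeneous coordinates; all names are made up for the example):

```python
import math

def make_transform(angle_rad, tx, ty):
    # 3x3 homogeneous matrix: rotate by angle_rad, then translate by (tx, ty)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, tx],
            [s,  c, ty],
            [0.0, 0.0, 1.0]]

def transform_vertex(m, vertex):
    # Multiply one homogeneous vertex (x, y, 1) by the matrix
    x, y = vertex
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# One square: four vertices, each of which goes through the multiply,
# so transform cost grows linearly with vertex count.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
m = make_transform(math.radians(90), 5.0, 0.0)
moved = [transform_vertex(m, v) for v in square]
```

One poly, four matrix multiplies: the work scales with vertices, which is the point being made.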
#5
04/27/2009 (5:17 am)
Actually, vertices are rarely the bottleneck on modern engines/hardware; about the only time they matter is on Intel IGP hardware that sometimes does vertex shading in software. Pixel shader performance tends to dominate, because pixel shaders are both long and complex and run at every pixel covered by the triangles, often more than once (overdraw). Two additional costs that are often misunderstood (even by graphics programmers) are triangle setup costs and pixel quad edge costs.
Triangle setup costs come from the movement from vertex output to the rasteriser (depending on hardware, that may include a primitive construction cache, interpolator packing and its limitations, and the actual triangle setup). Pixel quad edge costs occur because hardware fills triangles in groups of pixels rather than pixel by pixel, and at the edge of a triangle some of that work gets wasted...
There are lots of other factors too, of course, which is why accurate numbers really depend on 'what are you actually doing?' rather than picking a rough number out of the air.
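The "pixel quad edge cost" above can be illustrated with a toy model (Python; the 2x2-quad granularity is the real mechanism, everything else here is a made-up sketch):

```python
def quad_shading_cost(covered_pixels):
    # GPUs shade in 2x2 pixel quads: any quad with at least one covered
    # pixel runs the pixel shader for all four lanes, covered or not.
    touched_quads = {(x // 2, y // 2) for x, y in covered_pixels}
    return 4 * len(touched_quads)

# A thin one-pixel-wide diagonal: 4 useful pixels, but it touches 2 quads,
# so 8 shader invocations -- half the work is wasted at the edges.
diagonal = [(0, 0), (1, 1), (2, 2), (3, 3)]
cost = quad_shading_cost(diagonal)
```

A perfectly aligned 2x2 block would cost exactly 4 invocations for 4 pixels; thin or sliver triangles are where the waste shows up.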
#6
04/27/2009 (5:45 am)
It depends on the target minimum system spec of who you want to release to. The island in Crysis, I believe, is around 40 million polygons (based on what I could find out).
I believe they went for around ~500,000 polygons on screen at any given time, tops, for Crysis, which is 500,000 * 30 fps = 15 million polygons per second at 30 fps.
Modern cards, like the 8800 and up, can easily handle that; then come the shader passes, shadows, and whatever else they wish to cram onto your card for calculating.
And, in the end, of course, the engine optimization.
It can't be calculated exactly, and that's the reason you see those NVIDIA and similar logos in games: the vendors provided different-spec test systems for the developers to test on :)
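The arithmetic above is just per-frame budget times frame rate; as a sanity check (the numbers are the poster's estimates, not official Crytek figures):

```python
def polys_per_second(polys_per_frame, fps):
    # throughput per second = per-frame polygon budget * frame rate
    return polys_per_frame * fps

def frame_budget(polys_per_second_rating, target_fps):
    # the inverse: how many polygons a frame can afford at a target rate
    return polys_per_second_rating // target_fps

crysis_estimate = polys_per_second(500_000, 30)   # 15,000,000 per second
budget = frame_budget(15_000_000, 60)             # 250,000 per frame at 60 fps
```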
#7
04/27/2009 (6:13 am)
@dean
Quote: Pixel shader performance tends to dominate, because pixel shaders are both long and complex and run at every pixel covered by the triangles, often more than once (overdraw). Two additional costs that are often misunderstood (even by graphics programmers) are triangle setup costs and pixel quad edge costs.
Can you explain how you can interpolate values in the shader without considering vertices? Yes, I'm sure it takes longer to do bilinear interpolation than it does to perform one transformation (or a vector of them).
If you have 10 vertices in a polygon, I still have to believe it will take longer to process that polygon than it would a polygon with 3. That is what I was addressing from the OP's question.
I will say that I'm not familiar with the hardware details, just the math involved. The post above seems to assume triangles. Does the hardware automatically triangulate polygons?
#8
04/27/2009 (6:49 am)
I haven't used T3D yet, but I've never seen an engine use anything other than triangles or triangle strips/fans (and points/lines, but still, max 3 vertices).
#9
04/27/2009 (6:59 am)
@Michael
What happens if you submit a mesh that contains a primitive that is not a triangle? Do most engines reject it, or do they triangulate? Curious now for my own knowledge.
#10
04/27/2009 (7:06 am)
Thanks all, lots of useful info! Now I have a pretty clear answer to give back to that artist!
#11
04/27/2009 (7:42 am)
@Joshua
Most engines I've used came with tools that made everything triangles, or -- further back -- a single triangle strip (though I believe that became less common as more complex materials came into use, and it just isn't practical with some types of models -- not to mention it's rarely the bottleneck anymore)...
#12
04/27/2009 (8:25 am)
@beating a dead horse
There have been a couple of posts stating that geometric complexity is not the bottleneck (at least, that's how I understand them).
While to some degree I agree, let me pose a scenario.
Scenario 1: You are rendering a cube (8 vertices) with some kind of really complex effect that takes lots of math and multiple passes.
Scenario 2: You are rendering the Death Star (40 bazillion vertices) with the same effect.
I would speculate that scenario 1 would render faster. Now, the shaders in these two scenarios are identical, so what is the bottleneck?
*Edit - In other words, all things being equal aside from geometric complexity.
#13
04/27/2009 (8:51 am)
@Joshua
Well, yes, of course geometric complexity matters if you take the argument far enough, BUT we're talking about the bounds of a reasonable real-time situation. And, as with most things, if that were the situation there would be ways around it too.
It's really important to note that, within certain limits, more vertices/polygons do not make rendering slower; i.e. 10,000 polygons might run at the same speed as 11,000 polygons.
It's all about the bottlenecks, and GPUs are designed to avoid many of the obvious ones like that.
#15
04/27/2009 (9:12 am)
Some other points: almost all hardware only renders triangles; it's never given anything but triangles by the software.
Depending on the hardware, the shader units may be shared between vertex and pixel work or not. In the older, unshared case, saving transforms on a vertex may make NO difference at all, because the vertex shader units are just idling.
Indices allow reuse of the post-vertex-transform cache, which means it's not vertices that matter but connected vertices. So, for example, if that bazillion-vertex Death Star only connects 8 vertices to triangles, it will need AT MOST 8 vertex transforms.
Hardware doesn't interpolate; it generates the homogeneous-space representation of the interpolant. This is passed into the pixel shader unit and then evaluated at every pixel, which allows the pixel processing to be decoupled from the triangle. That is required for the massively parallel systems in high-end GPUs, where hundreds if not thousands of math ops are occurring every GPU cycle.
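The post-transform cache point can be sketched by simulating one (a simplified FIFO model; real cache sizes and replacement policies vary by hardware):

```python
def simulated_transform_count(indices, cache_size=16):
    # FIFO model of the post-vertex-transform cache: an index already in
    # the cache reuses its transformed result; a miss costs one transform.
    cache, transforms = [], 0
    for idx in indices:
        if idx in cache:
            continue  # hit: transformed result reused
        transforms += 1
        cache.append(idx)
        if len(cache) > cache_size:
            cache.pop(0)  # evict the oldest entry
    return transforms

# Two triangles sharing an edge: 6 indices, but only 4 unique vertices,
# so only 4 transforms are needed -- connectivity matters, not index count.
quad_indices = [0, 1, 2, 2, 1, 3]
needed = simulated_transform_count(quad_indices)
```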
#16
04/27/2009 (10:05 am)
@Dean - Let's break this down.
Quote: Depending on the hardware, the shader units may be shared between vertex and pixel work or not. In the older, unshared case, saving transforms on a vertex may make NO difference at all, because the vertex shader units are just idling.
Not sure what sharing shaders between pixel and vertex means. "Shader" is an abstract term; pixel and vertex shaders are implementations of it.
Quote: Indices allow reuse of the post-vertex-transform cache
Post: after
Vertex: (self-defining)
Transform: affine transform, in this case
Cache: storage, in this case for vertex data
In order to cache this data, transforms need to be applied at least once to each vertex to be rendered, per transform, right?
Quote: So, for example, if that bazillion-vertex Death Star only connects 8 vertices to triangles, it will need AT MOST 8 vertex transforms.
If I perform an affine transform on a mesh, which by definition is a connected graph, it must touch every vertex in the mesh. We aren't talking about culling, clipping, or otherwise optimizing geometry. A bazillion vertices means rendering, or at least processing for depth testing, a bazillion vertices.
Quote: Hardware doesn't interpolate; it generates the homogeneous-space representation of the interpolant.
Describe for me how a Phong shader calculates per-pixel lighting but does not interpolate normals. If you are saying the hardware doesn't do it... I can accept that, I guess, because I'm not a graphics card expert. But if software does it, that would seem very slow to me and a waste of the highly specialized vector capabilities of GPUs.
Quote: This is passed into the pixel shader unit and then evaluated at every pixel, which allows the pixel processing to be decoupled from the triangle.
I'm not even sure what that means. The interpolated data is passed to the pixel shader? The pixel shader's job is to determine the correct lighting per pixel. In what stage is this interpolation computed, if not the pixel stage?
This is all in good fun, mind you... no hard feelings.
#17
04/27/2009 (10:30 am)
(note: if someone can tell me how you actually quote, heheh)
Quote: Not sure what sharing shaders between pixel and vertex means. "Shader" is an abstract term; pixel and vertex shaders are implementations of it.
Older hardware had (in the hardware) vertex shaders and separate pixel shaders; they were separate and independent. With recent hardware they are "unified", which means vertex/pixel/etc. shaders all run on the same units. The disadvantage is that it's less specialized, so less optimized, but the advantage is that you never have shader units sitting there doing nothing...
Quote: In order to cache this data, transforms need to be applied at least once to each vertex to be rendered, per transform, right?
A vertex is supposed to always be transformed the same way, so the system can process a vertex (in a vertex shader), store the result, and re-use that result if the same vertex comes up again. Of course it's a bit more complicated, and it depends on how many vertices you have, the cache size on the GPU, etc...
Quote: Hardware doesn't interpolate; it generates the homogeneous-space representation of the interpolant.
Not sure what was meant here, but shaders have interpolation instructions. Also, depending on the semantic of your data, it might be interpolated between the vertex and the pixel shader (let's just skip the geometry shader for now, hehe).
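For the Phong-normal question, the textbook way to describe what arrives at each pixel is barycentric interpolation of the vertex attributes. Here is a rough 2D sketch (ignoring perspective correction, and the names are invented for the example):

```python
def barycentric(p, a, b, c):
    # Barycentric weights of 2D point p with respect to triangle (a, b, c)
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    denom = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w0 = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / denom
    w1 = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / denom
    return w0, w1, 1.0 - w0 - w1

def interpolate_attribute(p, tri, attrs):
    # Blend a per-vertex attribute (e.g. a normal) at point p using the weights
    w0, w1, w2 = barycentric(p, *tri)
    return tuple(w0 * a0 + w1 * a1 + w2 * a2
                 for a0, a1, a2 in zip(*attrs))

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
n = interpolate_attribute((0.25, 0.25), tri, normals)
```

In a real Phong shader the blended normal would then be renormalized before the lighting math; whether this blending runs as dedicated interpolator hardware or as plane-equation evaluation is the hardware detail being debated above.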
#18
04/27/2009 (10:43 am)
@michael
Open Square Bracket = OSB
Close Square Bracket = CSB
OSB Quote CSB (the word Quote, not the content)
Content
OSB /Quote CSB
You have to cut/paste the content to make this work.
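Spelled out with the actual brackets (which the description above avoids so the forum doesn't render it), the markup would look something like:

```
[Quote]the text you want to quote[/Quote]
```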
#19
04/27/2009 (10:48 am)
Yay, I quoteh!
#20
04/27/2009 (10:52 am)
@Joshua
Real-time graphics uses the terms you're using in a specific way. For example, a shader is simply a program; at a number of points through the pipeline there is hardware that executes different programs. It used to be different hardware for different shader types; now, with unified shaders, some of this is shared. As of DX11 we have 5 shader stages in the most complex pipeline. The old RenderMan-style shaders also had multiple types, but people tend to forget the non-surface shaders.
Because "interpolate" isn't quite true in some rare corner cases (MSAA non-centroid sampling), we actually extrapolate as well. What actually happens when hardware 'interpolates' is that it simply solves the homogeneous plane equation at that particular point in space; whether that point is interior or exterior (interpolation or extrapolation) is irrelevant. This also removes all dependencies on differentials, so rather than a traditional DDA-style approach, which would require an ordering of pixel executions, hardware uses the homogeneous plane equation approach.
Now, there are interpolator units in the triangle setup system, which generate the plane equations from the vertex values -- but since it's possible to construct cases where they never interpolate, just extrapolate, they're slightly oddly named ;)
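The plane-equation idea above can be sketched numerically: triangle setup produces coefficients for v = a*x + b*y + c per interpolant, and each pixel (or sample) just evaluates that plane, inside the triangle or out. A 2D sketch, ignoring the perspective/homogeneous divide:

```python
def attribute_plane(tri, values):
    # Fit the plane v = a*x + b*y + c through three screen-space vertices
    # carrying scalar attribute values -- roughly what triangle setup emits.
    (x0, y0), (x1, y1), (x2, y2) = tri
    v0, v1, v2 = values
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    a = ((v1 - v0) * (y2 - y0) - (v2 - v0) * (y1 - y0)) / det
    b = ((v2 - v0) * (x1 - x0) - (v1 - v0) * (x2 - x0)) / det
    return a, b, v0 - a * x0 - b * y0

# Evaluating the plane needs no pixel-to-pixel ordering (unlike a DDA walk),
# and a sample outside the triangle simply extrapolates.
a, b, c = attribute_plane([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [0.0, 1.0, 2.0])
inside = a * 0.25 + b * 0.25 + c   # interpolation at an interior sample
outside = a * 2.0 + b * 2.0 + c    # extrapolation (e.g. a non-centroid MSAA sample)
```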
Torque Owner Dean Calver
Cloud Pixies Ltd
For example, it's not uncommon these days for shadow-map polygons to make up a large percentage of the polygons in a frame -- and yet I bet having those counted isn't what your artist wanted to know :D
What I expect the artist really wanted to know is what kind of polygon counts the main character can have (in, say, a third-person game) -- again a very difficult question to answer, as it depends on the range of hardware you hope to ship on.
That said, T3D shows all the signs of being in the same range as most modern console/PC engines, which, pushed the right way, will render a few million artist-visible polygons per second.