Removing bottleneck from rendering algorithm


BytePtr
April 14th, 2009, 01:05 PM
I'm creating a map editor for a game, and so far it's going OK. But I found a bottleneck in my algorithm.

At the moment this is the situation.

I have 3 for loops (x, y, z) running 0-35, 0-35 and 0-7. Inside these loops I check the block sides (CheckSides) and at the same time draw the block (DrawIt).

procedure Render;
var
  x, y, z: Integer;
begin
  for x := 0 to 35 do
    for y := 0 to 35 do
    begin
      glPushMatrix();
      // Move to the base of the column at (x, y)
      glTranslatef(1.0 * y, 0.0, 1.0 * x);
      // For all blocks in column..
      for z := 0 to 7 do
      begin
        glPushMatrix();
        CheckSides(z, x, y);
        DrawIt(0);
        glPopMatrix();
        // Step one block up for the next iteration
        glTranslatef(0.0, 1.0, 0.0);
      end;
      glPopMatrix();
    end;
end;

The DrawIt() procedure is something like this (basically it contains all the glVertex3f calls).

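procedure DrawIt(Which: Integer);
var
  i: Integer;
begin
  if WireFrame then glBegin(GL_LINE_STRIP) else glBegin(GL_QUADS);
  glColor3f(1, 1, 1);
  for i := 0 to 3 do
  begin
    glTexCoord2f(SLOPE_RAW_DATA[Which, 0, i, 0],
                 SLOPE_RAW_DATA[Which, 0, i, 2]);

    glVertex3f(SLOPE_RAW_DATA[Which, 3, i, 0],
               SLOPE_RAW_DATA[Which, 3, i, 1],
               SLOPE_RAW_DATA[Which, 3, i, 2]);
  end;
  // Repeat the first vertex so the wireframe line strip closes the outline
  // (in GL_QUADS mode this fifth vertex is an incomplete quad and is ignored)
  glVertex3f(SLOPE_RAW_DATA[Which, 3, 0, 0],
             SLOPE_RAW_DATA[Which, 3, 0, 1],
             SLOPE_RAW_DATA[Which, 3, 0, 2]);
  glEnd;
end;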

Yes, it's Pascal, but the code is simple.

So the Render procedure basically checks and draws everything.

So as you see, the for loops are in the MAIN rendering procedure. This is bad.
It eats a lot of CPU: when running I get 70-80% CPU usage.


If I remove the for loops from the main rendering procedure, CPU usage goes down to 03%.



How could I rewrite my algorithm so that I can remove the for loops from the rendering routine?
Should I read and check the data somewhere else and store it somewhere? But then how do I read it back for drawing?
I'm totally lost.



Any ideas, hints or suggestions?

bitshifter420
April 14th, 2009, 09:04 PM
I don't know this language, but from here it looks like all the data is fixed and only recalculated in the loops.
For this kind of situation you can store the information in display lists.
This should really speed up the rendering process since everything is precalculated.

glGenLists
glNewList
glEndList
glCallList(s)

I recommend you get the Blue Book and also the Red Book.
They cover the API in detail and contain some good examples.
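
The four calls listed above could fit together roughly as follows. This is only a sketch, assuming Free Pascal and its `GL` unit (Delphi users would need an equivalent OpenGL header unit), and assuming a valid OpenGL context already exists. The names `SceneLists`, `TDrawProc`, `BuildSceneList`, `RenderScene` and `SceneList` are made up for illustration; the procedure passed in could be the existing loop-based Render.

```pascal
unit SceneLists;

{$mode objfpc}

interface

uses
  GL; // assumption: Free Pascal's OpenGL header unit

type
  TDrawProc = procedure; // hypothetical: any procedure that issues the GL draw calls

procedure BuildSceneList(DrawScene: TDrawProc);
procedure RenderScene;

implementation

var
  SceneList: GLuint = 0; // hypothetical handle of the compiled scene

// Record every GL call made by DrawScene into a display list.
// Call this once after loading the map, and again whenever the map changes.
procedure BuildSceneList(DrawScene: TDrawProc);
begin
  if SceneList = 0 then
    SceneList := glGenLists(1);
  glNewList(SceneList, GL_COMPILE); // record only, do not execute yet
  DrawScene();
  glEndList();
end;

// Per-frame work: replay the recorded calls, no loops over the map.
procedure RenderScene;
begin
  if SceneList <> 0 then
    glCallList(SceneList);
end;

end.
```

Any decision made while recording (which sides to draw, wireframe or not) is frozen into the list, so the list has to be rebuilt when those inputs change.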

BytePtr
April 15th, 2009, 07:29 AM
Yes, the vertex coordinates are fixed. I have hard-coded vertex values in one of my units.
The program reads the slope types from the game map. If it finds a specific slope, it draws it using the hard-coded vertex coordinates.

The only problem is that it loops every frame, forever, even after it has already rendered the whole scene. For example: the program renders block 2, then loops again and renders the same block again.
But this is pointless, since this block is already rendered.

That causes almost 100% CPU usage (mostly 60-70%).


OK, I can store my fixed data in VBOs or even display lists, but here I'm stuck.


I think that I should READ all the data only once into some list, buffer or array, and then draw from that, instead of reading and drawing together over and over again.
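
One way to picture the "read once, draw from a list" idea is a dirty flag: scan the map only when it has changed and keep the result in an array. The sketch below is plain Pascal with no OpenGL in it, assuming Free Pascal. The map layout, `TBlockRef`, `Visible`, `SceneDirty`, `RebuildVisible` and `SetBlock` are all hypothetical names, not the editor's real data structures.

```pascal
program BuildOnce;

{$mode objfpc}

const
  MapSize = 36;   // x and y run 0..35, as in the Render loops
  MapHeight = 8;  // z runs 0..7

type
  TBlockRef = record
    x, y, z: Integer;
    Slope: Integer; // index into the hard-coded slope table
  end;

var
  // Hypothetical map storage: 0 means empty, anything else is a slope type
  Map: array[0..MapSize - 1, 0..MapSize - 1, 0..MapHeight - 1] of Integer;
  Visible: array of TBlockRef; // blocks that actually need drawing
  VisibleCount: Integer = 0;
  SceneDirty: Boolean = True;  // True whenever the map changed since the last scan

// Scan the whole map once and remember every non-empty block.
procedure RebuildVisible;
var
  x, y, z: Integer;
begin
  SetLength(Visible, MapSize * MapSize * MapHeight);
  VisibleCount := 0;
  for x := 0 to MapSize - 1 do
    for y := 0 to MapSize - 1 do
      for z := 0 to MapHeight - 1 do
        if Map[x, y, z] <> 0 then
        begin
          Visible[VisibleCount].x := x;
          Visible[VisibleCount].y := y;
          Visible[VisibleCount].z := z;
          Visible[VisibleCount].Slope := Map[x, y, z];
          Inc(VisibleCount);
        end;
  SetLength(Visible, VisibleCount);
  SceneDirty := False;
end;

// Every edit goes through here so the flag is never forgotten.
procedure SetBlock(x, y, z, Slope: Integer);
begin
  Map[x, y, z] := Slope;
  SceneDirty := True;
end;

begin
  SetBlock(2, 3, 0, 5);
  SetBlock(2, 3, 1, 7);
  if SceneDirty then
    RebuildVisible; // the expensive scan runs only after an edit
  WriteLn('Blocks to draw: ', VisibleCount);
end.
```

The render routine would then walk `Visible` (or a display list or vertex array built from it) and, if the window is only repainted when something changed, nothing runs at all while the editor is idle.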


It's confusing, I know. But that's the situation.



I will look at the books you suggested.

JVene
April 21st, 2009, 11:03 PM
You said that you removed the loops and the speed improved (CPU usage went down). What did you do about displaying all the material? Did you only display 1?

I'd expect quite a difference between 1 item and 1296 columns (36 x 36, each up to 8 blocks tall) being drawn.

The real problem here, though, is hinted at by BytePtr, and I'd take it a little further.

Depending on the version of OpenGL you're using, even display lists, once considered much better than individually calling OpenGL for each vertex, have fallen into disfavor and been replaced by vertex buffers.

The general approach to optimized OpenGL development follows along these lines:

Call OpenGL functions as few times as possible; they're heavy.

Provide as much data as possible up front, so it's in the graphics card and doesn't have to cross the bus to the graphics card every frame.

Use strips when you can, indexed triangles when you can't.



These are basics, and obviously there's lots more detail to it.

Now, an implementation-specific issue is the notion of using GL for linear algebra (the translations and/or rotations of the matrix).

Given the right circumstances it can be much faster to perform the math on a matrix within the application code, and simply provide the resulting matrix to OpenGL. This is highly dependent, however, on several factors, including the nature of your OpenGL implementation, and of your matrix math library.

The OpenGL versions are usually 'client side' implementations, meaning it's like you're using a matrix library in your application code. They will probably be optimized with the best features of your CPU's floating point unit. In some rare circumstances they might be accelerated by your GPU, but not often (at least not yet).

This means that if you have a good implementation of the matrix work which takes advantage of SIMD (often inline assembler), your own library can be much faster.

The push/pop of a matrix is actually quite heavy to perform, especially inside loops. If you simply supply a matrix with glLoadMatrix, which you've prepared yourself, the weight of the push/pop can be eliminated. If your matrix operations are lighter, too - then the result is a general performance improvement.
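
To illustrate, here is a rough sketch of building a block's matrix in application code instead of with glPushMatrix/glTranslatef/glPopMatrix. It assumes Free Pascal; `TMatrix4`, `Identity`, `Translated`, `View` and `BlockMatrix` are hypothetical names. The program itself has no OpenGL dependency; the comment marks where glLoadMatrixf would be called in a real renderer.

```pascal
program BlockMatrixDemo;

{$mode objfpc}

type
  // 4x4 matrix in column-major order, the layout glLoadMatrixf expects
  TMatrix4 = array[0..15] of Single;

const
  Identity: TMatrix4 = (1, 0, 0, 0,
                        0, 1, 0, 0,
                        0, 0, 1, 0,
                        0, 0, 0, 1);

// Returns M * T, where T translates by (tx, ty, tz).
// This is the same result glTranslatef would leave on the matrix stack.
// Only the last column changes, so 12 multiplications are enough.
function Translated(const M: TMatrix4; tx, ty, tz: Single): TMatrix4;
var
  r: Integer;
begin
  Result := M;
  for r := 0 to 3 do
    Result[12 + r] := M[r] * tx + M[4 + r] * ty + M[8 + r] * tz + M[12 + r];
end;

var
  View, BlockMatrix: TMatrix4;
begin
  // Assumption: a real editor would put its camera matrix here
  View := Identity;
  // Block in column (x = 5, y = 3) at height z = 2, following the
  // glTranslatef(1.0 * y, 0.0, 1.0 * x) convention of the Render loop
  BlockMatrix := Translated(View, 3.0, 2.0, 5.0);
  // With a GL context: glLoadMatrixf(@BlockMatrix[0]); then draw the block.
  WriteLn(BlockMatrix[12]:0:1, ' ', BlockMatrix[13]:0:1, ' ', BlockMatrix[14]:0:1);
end.
```

Because glLoadMatrixf replaces the current matrix rather than multiplying onto it, the camera transform has to be folded in first, which is why `Translated` takes the view matrix as input.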

A much bigger improvement is to be found using vertex buffers, as I mentioned. It's too large a subject to go into here, but you're basically skipping all the calls to glVertex3f and glTexCoord2f and replacing them with a single call that uses prepared data, which then stays inside the graphics card's RAM.

BytePtr
April 30th, 2009, 06:29 PM
Sorry for the late reply, busy days.

Very good suggestions. Thanks.


I am now sure that I should learn more about matrices and such, because another problem I now have is generating a ray for creating blocks at the mouse cursor in 3D space. I got the suggestion to look into linear algebra and matrices and how they work. I will.


And someone suggested that I go with vertex arrays. With VBOs I wouldn't see much difference, and vertex arrays are maybe simpler. I should probably just rewrite my hard-coded data for vertex arrays.


I will not even touch any texturing until I get my CPU usage down.

The original editor uses 0% CPU and only 1-2% when it gets some command.


And I want my editor to behave the same.
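
For reference, a vertex array version might look something like the sketch below. It assumes Free Pascal's `GL` unit, OpenGL 1.1 or later, and a valid OpenGL context when `DrawScene` runs; the unit name and `Vertices`, `AddQuad`, `ClearScene` and `DrawScene` are hypothetical. The corner values would come from the existing hard-coded slope data, already offset to the block's position.

```pascal
unit SceneArrays;

{$mode objfpc}

interface

uses
  GL; // assumption: Free Pascal's OpenGL header unit

procedure ClearScene;
procedure AddQuad(const Corners: array of GLfloat);
procedure DrawScene;

implementation

var
  // Flat list of x, y, z triples, four vertices per quad
  Vertices: array of GLfloat;

// Forget all quads, e.g. before rebuilding after a map edit.
procedure ClearScene;
begin
  SetLength(Vertices, 0);
end;

// Append one quad: 12 floats, i.e. 4 corners in world coordinates.
// Anything that is not exactly 12 floats is ignored.
procedure AddQuad(const Corners: array of GLfloat);
var
  i, Base: Integer;
begin
  if Length(Corners) <> 12 then
    Exit;
  Base := Length(Vertices);
  SetLength(Vertices, Base + 12);
  for i := 0 to 11 do
    Vertices[Base + i] := Corners[i];
end;

// Per-frame work: one draw call for the whole scene.
procedure DrawScene;
begin
  if Length(Vertices) = 0 then
    Exit;
  glEnableClientState(GL_VERTEX_ARRAY);
  glVertexPointer(3, GL_FLOAT, 0, @Vertices[0]); // 3 floats per vertex, tightly packed
  glDrawArrays(GL_QUADS, 0, Length(Vertices) div 3);
  glDisableClientState(GL_VERTEX_ARRAY);
end;

end.
```

The map loops would then run only when the map changes (calling `ClearScene` and `AddQuad` for each visible face), while the render routine shrinks to a single `DrawScene` call.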