Qhimm: The real problem with pixel shaders suddenly occurred to me today. In order to run a pixel shader "later", i.e. when you're actually rendering the transparent pixels, you'd need to save the full shader state with each entry in your linked list. In GL fragment programs - fairly closely mapped to pixel shader hardware - there's a set of global parameters shaders can access, as well as a set of local parameters belonging to each individual shader. I think at minimum there are 24 parameters of each type (although obviously a program might not use all of them), each being a vector of four floats.
Worst case, you're pulling along 1.5KB of data with each pixel in your linked list, so the shader runs in the same "state" as it would have if you'd rendered it straight away. It needs that state, because the solid pixels from the same batch of polygons (or even from the same polygon) ran through the pixel shader with those parameters.
Of course, not all shaders would use that many parameters. But ... GL state is also accessible to a shader. That includes the current fog parameters, matrices, texturing environment...
In an *absolute* worst case you could be talking 3KB of data per pixel in your linked list. That's a few gigabytes of storage...
Of course, you might store only the attributes the shader used. However, that's still quite a lot - if code analysis shows it reads one element of a matrix, you've really got to pull in the whole matrix - and worse, you've now got to deal with dynamically allocated memory for each linked list entry!
Optimising storage is possible, so a group of pixels points to a common buffer of "shared state", but the more complex you make storing this data, the more complex retrieving it becomes ... not good for real time rendering. As Cyberman pointed out, the great thing about classical 3d hardware was that a given operation took a fixed amount of time. Even on early pixel shaders that held true - on the first cards to support them, you couldn't abort a shader halfway through; exiting the shader caused all subsequent instructions to run, but the results were discarded, so you didn't update any pixels ... the shader absolutely always took the same amount of time to run, making synchronisation, parallelisation (and other long words) easy to do.
Of course, the whole problem is easily avoided by not using pixel shaders ... but that's not a solution people want to hear nowadays - especially when I hear that a lot of OpenGL 2.0 will be defined in terms of shaders, even for the basic functionality...
Cyberman: Polygons in OGL can have both sides visible, although the environment (meaning the card nowadays) is quite happy to cull back-facing polygons for you. Well, or front facing, if you want it to ... look up glCullFace (IIRC).
Transparency is a bitch to get working WELL, but basic results should be easy; same with texturing.