Friday, March 11, 2011

OpenGL Vertex Buffer Objects

It took me some time, but I've finally got them working the way I want, and now you can too. But first, an example...



Now you can see why that broken foot silhouette was so exciting for me. Now onto the making it happen for you.

VBO's in a nutshell


Traditionally Vertex Buffers were a place you could upload carefully formatted geometry to the video card, and that's about it. Then came the ARB_matrix_palette extension, and things got a bit exciting. Nowadays we have GLSL, and can put whatever we feel like in them. Lets take a look at how that's done...
// VBO init code
GLuint vbo = -1;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, num_vert * sizeof(vertex), vert_ptr, GL_STATIC_DRAW);

After that, you upload the vertex indicies for your polygons...
// Element Array Init Code
GLuint ebo = -1;
glGenBuffers(1, &ebo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, &ebo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, num_quad * 8, quad_ptr, GL_STATIC_DRAW);

And to draw you call...
// Draw Code
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo);
glVertexPointer(3, GL_FLOAT, 0, 0);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawElements(GL_QUADS, num_quad * 8, GL_UNSIGNED_SHORT, 0);

At least, that's what you would expect from reading the spec. And indeed this code may just render your pile of flat untextured polygons. On some cards.

This is where it gets fun. Buckle up.

Thar be dragons ahead


Lets start with the VBO init code. Actually that's almost perfect - however a lone vertex doesn't cut it. Now a float is 4 bytes, so a lone vertex is 12 bytes. We also want Normals - another 12 bytes, and a Texture Coordinate, 8 more bytes bringing us to 32 bytes exactly, which is really convenient as we will find out, but first allow me to segway onto skeletal animation techniques for a moment.

Skeletal Animation


So you have 2 bones connected by a joint, lets imagine a knee. Now when you bend your knee, say 90°, everything from a bit below the knee moves in a straight forward 100% of 90degrees motion. It's the points along your knee that are tricky, otherwise you get nasty artifacts like this...



So we need a method of partially rotating the points along your knee, depending how close they are to the joint, or whatever, we don't care, that's the modellers problem. What we need to care about is how it's done, with Weights. So the bone will have a list of which points it affects, and to what extent it affects each of them. In the bad old scary days when this was all done in software people used Quaternions. Quaternions can be used similar to matrices, in that they can store rotation, and you can bash some vertices against a quaternion and it will spit out rotated vertices. Quaternions have the very cool property that it is possible to interpolate between 2 quaternions, depending on the method you use, it will correctly follow the plane you want it to rotate in. Also Quaternions use less operations that Matrices do. Anyhow, these days we have hardware that does Matrix * vertex multiplication really fast, and the modern day trick is to use these weights to do weighted averaging of the points that are spit out by the two separate matrices.

Back to our Vertex Buffer Object


So our nice friendly lovable artists have given us a model with a bunch of bones, which each reference a bunch of vertices and weights. We need to find out how many bones reference each vertex, and specifically find out what is the most bones referencing a single vertex. For the particular build of MakeHuman I'm using models from, this is 6.

ATI recommends that your vertex entries are multiples of 32 bytes, to speed up fetch operations. This is sound advice no matter what hardware you're using. So the vertex entry I'm uploading looks like this...
12 bytes of Vertex (3 floats)
12 bytes of Normal (3 floats)
8 bytes of TexCoord (2 floats)
24 bytes of Bone Weights (6 floats)
6 bytes of bone indicies (1 byte each)
1 byte of number of bones actually referencing this vert
1 byte of zero padding

Now we have that settled, lets look at the final version of the VBO init code...
// VBO init code - Final
GLuint vbo = -1;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, num_vert * 64, vert_ptr, GL_STATIC_DRAW);

It's nice when things work out like that. Onto the Element Buffer. This is where things get a bit annoying. Here it is again to refresh your memory.
// Element Array Init Code
GLuint ebo = -1;
glGenBuffers(1, &ebo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, &ebo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, num_quad * 8, quad_ptr, GL_STATIC_DRAW);

On my ATI 4870x2, that call to glBufferData(), while it does allocate the appropriate memory on the video card, doesn't actually copy any data. No problem, a quick read of the spec says we can do this...
glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, num_quad * 8, quad_ptr);

No you can't. That call actually crashes the video driver on my card. Yeah not happy at all. All is not lost... here is the Element Buffer init code that actually works...
// Element Array Init Code
GLuint ebo;
void *tmp;
glGenBuffers(1, &ebo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, num_quad * 8, NULL, GL_STATIC_DRAW);
tmp = glMapBuffer(GL_ELEMENT_ARRAY_BUFFER, GL_WRITE_ONLY);
memcpy(tmp, quad_ptr, num_quad * 8);
glUnmapBuffer(GL_ELEMENT_ARRAY_BUFFER);


Last Wave of Dragons, I Promise


Here's the code I'm using to draw my mesh...
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(3, GL_FLOAT, 64, 0);
glNormalPointer(GL_FLOAT, 64, 12);
glTexCoordPointer(2, GL_FLOAT, 64, 24);
glVertexAttribPointer(bw1, 4, GL_FLOAT, GL_FALSE, 64, 32);
glVertexAttribPointer(bw2, 2, GL_FLOAT, GL_FALSE, 64, 48);
glVertexAttribPointer(bi1, 4, GL_BYTE, GL_FALSE, 64, 56);
glVertexAttribPointer(bi2, 4, GL_BYTE, GL_FALSE, 64, 60);

glEnableVertexAttribArray(bw1);
glEnableVertexAttribArray(bw2);
glEnableVertexAttribArray(bi1);
glEnableVertexAttribArray(bi2);
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, obj->ebo);

glDrawElements(GL_QUADS, obj->nquad*8, GL_UNSIGNED_SHORT, 0);

glDisableVertexAttribArray(bw1);
glDisableVertexAttribArray(bw2);
glDisableVertexAttribArray(bi1);
glDisableVertexAttribArray(bi2);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glDisable(GL_TEXTURE_2D);

glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);

And here is the vertex shader I used...
uniform mat4 SkinMat[80];

attribute vec4 weight1;
attribute vec2 weight2;
attribute vec4 boneid1;
attribute vec4 boneid2;

varying vec4 pos;
varying vec3 normal;

void main()
{
 normal = gl_NormalMatrix * gl_Normal;

 pos = vec4(0,0,0,0);

 vec4 weight = weight1;
 vec2 w2 = weight2;
 vec4 bone = boneid1;
 vec4 b2 = boneid2;
 int nbone = int(boneid2.w);

 if(nbone==0)
 {
  pos = gl_Vertex;
 }
 else
 {

  for(int i=0; i<nbone; i++)
  {
   pos += weight.x * (SkinMat[int(bone.x)] * gl_Vertex);
 // Rotate variables
   weight = vec4(weight.yzw, w2.x);
   w2.x = w2.y;
   bone = vec4(bone.yzw, b2.x);
   b2.x = b2.y;
  }

 }
 gl_FrontColor = weight1;
 gl_Position = gl_ModelViewProjectionMatrix * pos;
 pos = gl_ModelViewMatrix * gl_Vertex;
}

So it's important to mention that all of those..
glEnableVertexAttribArray(bw1);
glEnableVertexAttribArray(bw2);
glEnableVertexAttribArray(bi1);
glEnableVertexAttribArray(bi2);

calls were preceded with...
bw1 = glGetAttribLocation(prog, "weight1");
bw2 = glGetAttribLocation(prog, "weight2");
bi1 = glGetAttribLocation(prog, "boneid1");
bi2 = glGetAttribLocation(prog, "boneid2");
SkinMatUnif = glGetUniformLocation(prog, "SkinMat");


Lastly I should mention this... looking at the spec may make you think that you can just throw whatever you want at the shader, and it's all good. Not so! While I'm sure one day there may be a device that supports every conceivable combination out there, something like...
glVertexAttribPointer(foo, 3, GL_BYTE, GL_FALSE, 64, 60);

Is going to end in tears. The short version is this... only expect it to work with the types that can be sent individualy old-school style... as documented here. Want to send 3 GL_SHORT's to the video card? Is there a glVertexAttrib3s() function? no. Hope that helps.

2 comments:

  1. Hello, amazing tutorial, extremely useful. I have a few questions though. I could be mistaken and totally confused, but your code doesn't seem to have a bone hierarchy. Your SkinMat[80] matrices hold transformations independent of the parent bone? The reason I ask this is because all the info I found online about skinning in general recreated the skeleton hierarchy inside their engine. And I wondered why they would do that rather than export the bones' absolute world transformations individually directly from maya (or whatever). So most of their codes involved multiplying a bone's matrix by all its parents matrices. I just wish you would go into the nasty details like bind pose matrices and proper exporting, since there is literally no comprehensive tutorial online for skinning.

    Thanks!

    ReplyDelete
  2. I know it's almost two years since the post but the concept still holds and it's great!

    Thanks!

    ReplyDelete