sil2100//vx developer log

Welcome to my personal web page


The fglrx bug mystery solved

In yesterday's post, I gave an overview of an fglrx-related bug in compiz that I have been working on recently. Today, after consulting the developers from ATI and a joint bug hunt with Sam Spilsbury, we finally found the root cause of the issue, which resulted in a one-line fix for a bug in compiz. So why did this bug only happen with fglrx? Easy: because of implementation differences between the drivers.

2012-04-20 14:25

Quoting the GLX_EXT_texture_from_pixmap specification [here]:

    If dpy and draw are the display and drawable for the calling thread's
    current context, glXBindTexImageEXT performs an implicit glFlush.

    The contents of the texture after the drawable has been bound are defined
    as the result of all rendering that has completed before the call to
    glXBindTexImageEXT.  In other words, the results of any operation which
    has caused damage on the drawable prior to the glXBindTexImageEXT call
    will be represented in the texture.

William (from ATI) pointed out that I should call glXBindTexImageEXT and glXReleaseTexImageEXT around any drawing in my test code. The reason, as we read in the spec, is that the bind call flushes all pending Pixmap modifications to the texture. These modifications are normally buffered and flushed whenever the driver feels like it. That is also why the Pixmap size mattered: the bigger it was, the greater the chance that the changes would get flushed automatically.
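To make the buffering idea concrete, here is a toy C++ model of it. Everything in it is an assumption made for illustration: ToyPixmap, ToyTexture and the auto_flush_at threshold are invented names, the "flush once enough writes pile up" rule is only a guess at why the Pixmap size mattered, and none of it reflects how fglrx or any other driver is actually implemented. The only behaviour taken from the spec is that binding flushes whatever is pending.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: these types do not exist in GLX, compiz or fglrx.
struct ToyPixmap {
    std::size_t committed = 0;   // writes a bound texture can already see
    std::size_t pending = 0;     // writes the "driver" is still buffering
    std::size_t auto_flush_at;   // assumed threshold for a spontaneous flush

    explicit ToyPixmap(std::size_t threshold) : auto_flush_at(threshold) {}

    // Make every buffered write visible.
    void flush() {
        committed += pending;
        pending = 0;
    }

    // Drawing only queues work; the flush happens "whenever the driver
    // feels like it", modelled here as "once enough writes have piled up".
    void draw(std::size_t writes) {
        pending += writes;
        if (pending >= auto_flush_at) flush();
    }
};

struct ToyTexture {
    const ToyPixmap* source = nullptr;

    // Stand-in for glXBindTexImageEXT: flush first, then attach.
    void bind(ToyPixmap& pixmap) {
        assert(source == nullptr);
        pixmap.flush();
        source = &pixmap;
    }

    // Stand-in for glXReleaseTexImageEXT.
    void release() {
        assert(source != nullptr);
        source = nullptr;
    }

    // What rendering with this texture would get to see.
    std::size_t visible_writes() const {
        return source ? source->committed : 0;
    }
};
```

In this model, a small draw after bind () stays invisible until the texture is released and bound again, while a draw big enough to cross the threshold shows up on its own. That is the same pattern I was seeing with small and large Pixmaps.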

In compiz, due to a really, really old mistake in someone's code, the decoration damage events were not setting a flag (damaged = true) that is required to rebind the texture on enable () calls. So the whole glX(Bind/Release)TexImageEXT step was missing.
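The shape of the mistake can be sketched in a few lines of C++. This is not the decor plugin code: ToyDecorTexture and both notify functions are hypothetical names, and the version counters merely stand in for Pixmap contents. It only illustrates how a damage notification that never sets the flag leaves enable () with nothing to do.

```cpp
#include <cassert>

// Hypothetical names throughout; this is not the compiz decor plugin code.
struct ToyDecorTexture {
    int pixmap_version = 0;  // bumped each time the decoration Pixmap is redrawn
    int bound_version = 0;   // Pixmap contents the texture currently reflects
    bool damaged = false;    // "needs rebinding" flag

    // The decorator redraws the Pixmap.
    void draw() { ++pixmap_version; }

    // Stand-in for the glX(Release/Bind)TexImageEXT pair.
    void rebind() {
        assert(damaged);
        bound_version = pixmap_version;
        damaged = false;
    }

    // Models enable (): the rebind only happens when the flag is set.
    void enable() {
        if (!damaged) return;
        rebind();
    }
};

// Damage handling as described above: the event arrives,
// but the flag is never set.
void notify_damage_without_flag(ToyDecorTexture& texture) {
    (void)texture;  // nothing is recorded
}

// The effect of the fix, whatever its exact form: the flag ends up set.
void notify_damage_with_flag(ToyDecorTexture& texture) {
    texture.damaged = true;
}
```

With the first handler, enable () returns without rebinding and the texture keeps whatever the driver happened to flush. With the second, every damage event is followed by a rebind on the next enable ().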

This, however, was not a problem for other drivers at all. It seems to be completely implementation-dependent when the modifications are flushed to the target texture. fglrx seems to be more economical here and does not flush needlessly, which is what exposed this bug. Crazy stuff.

Big thanks to Sam Spilsbury for spotting the 'early return' in the decor plugin code. Also, big thanks to the developers from ATI (AMD) for their help and patience!