Sorry, late to the atomic counters party!
Besides the imageAtomicAdd() function, I had also been using the atomicCounterIncrement() function with success.
The atomicCounter() function had left me baffled as well, so I took another look. Despite what https://www.khronos.org/opengl/wiki/Atomic_Counter says, it seems that just reading the value is not an atomic operation: what you get back depends on what the other threads have completed at that moment.
(Interestingly, the reference page for atomicCounter(), https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/atomicCounter.xhtml, doesn't say it reads the value atomically, whereas the page for atomicCounterIncrement(), https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/atomicCounterIncrement.xhtml, does describe the operation as atomic.)
It seems that calling it after the atomicCounterIncrement() and after a memoryBarrier() (following the advice "You will need to ensure internal visibility if you want to use ordering guarantees within a rendering command.") gives the final result of the counter for all the threads.
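A minimal sketch of the increment, barrier, then read pattern described above, written as generic GLSL 4.60 rather than anything TouchDesigner-specific. The names uCounter and uOut, the binding points and the 8x8 workgroup size are all assumptions made for illustration. Note that memoryBarrier() only orders the calling invocation's own memory accesses, so as far as the spec goes the value read here is not guaranteed to be the final total, even if it happens to look that way in practice.

```glsl
#version 460

// Assumed setup: 8x8 workgroups, one atomic counter on counter binding 0,
// and an rg32ui image on image unit 0 to inspect the results.
layout(local_size_x = 8, local_size_y = 8) in;
layout(binding = 0, offset = 0) uniform atomic_uint uCounter;   // hypothetical name
layout(binding = 0, rg32ui) uniform writeonly uimage2D uOut;    // hypothetical name

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(coord, imageSize(uOut)))) {
        return; // skip invocations that fall outside the output image
    }

    // Returns the value the counter held *before* this increment,
    // so every invocation gets a different number.
    uint ticket = atomicCounterIncrement(uCounter);

    // Orders this invocation's memory accesses. It does not wait for
    // other invocations or other workgroups to reach this point.
    memoryBarrier();

    // Whatever the counter holds at this moment.
    uint seen = atomicCounter(uCounter);

    // Store both values so they can be compared per pixel.
    imageStore(uOut, coord, uvec4(seen, ticket, 0u, 0u));
}
```

Comparing the two channels per pixel shows how far the counter had progressed when each invocation read it: ticket is always unique, while seen can differ between runs.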
Strangely enough, it still seems to give unstable results with atomicCounterAdd() after a texelFetch(), as in Tim’s sum example. That might have to do with the fact that you can’t sync different workgroups no matter what (barrier() only works within a workgroup).
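For reference, a hedged sketch of what such a sum could look like. This is not Tim's actual shader: uSum and uInput are hypothetical names, and it assumes an unsigned integer input texture and GLSL 4.60 (where atomicCounterAdd() is core).

```glsl
#version 460

layout(local_size_x = 8, local_size_y = 8) in;
layout(binding = 0, offset = 0) uniform atomic_uint uSum;   // hypothetical name
layout(binding = 0) uniform usampler2D uInput;              // hypothetical name

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(coord, textureSize(uInput, 0)))) {
        return; // nothing to add for out-of-range invocations
    }

    // Each invocation adds its own texel to the shared counter.
    uint value = texelFetch(uInput, coord, 0).r;
    atomicCounterAdd(uSum, value);

    // Reading atomicCounter(uSum) at this point only returns the sum of
    // whatever invocations have run so far. barrier() cannot help, since
    // it only synchronizes invocations inside one workgroup.
}
```

Under these assumptions the adds themselves should not get lost; it is the read-back inside the same dispatch that is unordered, so the total is better read in a later pass.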
As for David’s question about doing a decrement after the imageStore(): it seems all the increments are completed by the various threads before the write, so anything after it doesn’t matter.
Doing a decrement right after an increment seems to give random results because of a race condition.
Adding memoryBarrier() gives a more predictable outcome, but since all the increments would then be complete before the first decrement, the counter still goes from 0 to max overall.
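The increment-then-decrement case can be sketched like this (again generic GLSL 4.60, with uCounter and uOut as hypothetical names). Each call is atomic on its own, but nothing orders one invocation's decrement against another invocation's increment, which is where the race comes from.

```glsl
#version 460

layout(local_size_x = 8, local_size_y = 8) in;
layout(binding = 0, offset = 0) uniform atomic_uint uCounter;   // hypothetical name
layout(binding = 0, rg32ui) uniform writeonly uimage2D uOut;    // hypothetical name

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(coord, imageSize(uOut)))) {
        return;
    }

    // Value before the increment.
    uint beforeInc = atomicCounterIncrement(uCounter);

    // Value after the decrement. Other invocations may have incremented
    // or decremented in between, so this is rarely equal to beforeInc.
    uint afterDec = atomicCounterDecrement(uCounter);

    imageStore(uOut, coord, uvec4(beforeInc, afterDec, 0u, 0u));
}
```

Since atomic operations never lose updates, the counter is back at its starting value once the whole dispatch has finished; only the intermediate values each invocation sees vary from run to run.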
See the attached file for various tests: compute_shader_atomic_questions_workgroup_sync_issues.1.toe (5.6 KB)
So in short it seems the only reliable thing is one increment per thread and the rest is kinda confusing ;D
Some interesting insights:
In any case, it seems atomic counters are going away with Vulkan; only the imageAtomic operations remain.
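A minimal sketch of the same one-increment-per-thread counter done with imageAtomicAdd() instead of an atomic counter. It assumes a 1x1 r32ui image (hypothetically named uCounterImage here) that has been cleared to 0 before the dispatch.

```glsl
#version 460

layout(local_size_x = 8, local_size_y = 8) in;

// Assumption: a 1x1 r32ui image bound to image unit 0, cleared to 0 beforehand.
layout(binding = 0, r32ui) coherent uniform uimage2D uCounterImage;   // hypothetical name
layout(binding = 1, r32ui) uniform writeonly uimage2D uOut;           // hypothetical name

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(coord, imageSize(uOut)))) {
        return;
    }

    // Atomically adds 1 to the single texel and returns the value it held
    // before the add, much like atomicCounterIncrement().
    uint ticket = imageAtomicAdd(uCounterImage, ivec2(0, 0), 1u);

    imageStore(uOut, coord, uvec4(ticket, 0u, 0u, 0u));
}
```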