In H.264, I-macroblocks depend on neighboring macroblocks for prediction. In other words, the macroblock at (mb_x, mb_y) can only be encoded once the macroblocks at (mb_x - 1, mb_y) [A], (mb_x, mb_y - 1) [B], (mb_x + 1, mb_y - 1) [C] and (mb_x - 1, mb_y - 1) [D] have been encoded.
RenderScript doesn't guarantee any scan order, i.e. we cannot rely on any of the top or left macroblocks being done before we start processing the current one.
The solution to this is wavefront processing (see below). The idea is to schedule macroblock processing in waves, where each wave processes the macroblocks whose dependencies are already done. In the image below, each wave is represented by a different fill color of the rectangles, and one particular wave is highlighted with red borders.
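To make the schedule concrete, here is a minimal sketch in plain Java (no RenderScript involved). It assumes waves are numbered from 0 and that the macroblock at (mb_x, mb_y) belongs to wave mb_x + 2 * mb_y, which is the mapping used further down read in reverse. The class `WaveGrid` and its method names are made up for illustration.

```java
// Hypothetical helper for illustration; not part of RenderScript or any H.264 library.
final class WaveGrid {
    /** Wave index (0-based, by assumption) of the macroblock at (mbX, mbY). */
    static int waveOf(int mbX, int mbY) {
        return mbX + 2 * mbY;
    }

    /** True if every neighbor A, B, C, D that lies inside the frame is in an earlier wave. */
    static boolean dependenciesComeFirst(int mbX, int mbY, int widthMbs) {
        int[][] neighbors = {
            {mbX - 1, mbY},     // A: left
            {mbX, mbY - 1},     // B: top
            {mbX + 1, mbY - 1}, // C: top-right
            {mbX - 1, mbY - 1}  // D: top-left
        };
        for (int[] n : neighbors) {
            boolean inside = n[0] >= 0 && n[0] < widthMbs && n[1] >= 0;
            if (inside && waveOf(n[0], n[1]) >= waveOf(mbX, mbY)) {
                return false;
            }
        }
        return true;
    }

    /** Renders the wave index of every macroblock, one text row per macroblock row. */
    static String render(int widthMbs, int heightMbs) {
        StringBuilder sb = new StringBuilder();
        for (int y = 0; y < heightMbs; y++) {
            for (int x = 0; x < widthMbs; x++) {
                if (x > 0) sb.append(' ');
                sb.append(waveOf(x, y));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

For a frame of 4x3 macroblocks, `WaveGrid.render(4, 3)` gives the rows `0 1 2 3`, `2 3 4 5` and `4 5 6 7`: each macroblock's left, top, top-right and top-left neighbors all carry a smaller number, so they are finished before its wave starts.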
But how do we make RenderScript process the image only at the specified locations? The answer I came up with is to use 'fake allocations' of the right size, purely so that RenderScript iterates the number of times we want. Let's call these 'fake allocations' iterators.
So for the first wave we'll use a 1x1 'fake allocation' (iterator), for the second wave a 1x1 as well, for the third and fourth waves a 2x1 iterator, and so forth (all the iterators are shown to the right of the grid in the picture below, in the color of the corresponding wave).
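The iterator sizes can be computed up front. The sketch below is plain Java and only does the arithmetic; it assumes 0-based wave numbers, a frame of `widthMbs` x `heightMbs` macroblocks, and the rule that a macroblock belongs to wave mb_x + 2 * mb_y. `WaveSizes` and its methods are hypothetical names, and creating the actual allocations is left out.

```java
// Hypothetical helper for illustration; it only computes sizes, it does not touch RenderScript.
final class WaveSizes {
    /** Number of waves needed to cover the whole frame. */
    static int waveCount(int widthMbs, int heightMbs) {
        return widthMbs + 2 * (heightMbs - 1);
    }

    /** Width N of the Nx1 iterator for the given wave (0-based, by assumption). */
    static int iteratorWidth(int wave, int widthMbs, int heightMbs) {
        // Rows are limited from below once the wave has run past the right edge of the frame.
        int overshoot = wave - (widthMbs - 1);
        int firstRow = overshoot > 0 ? (overshoot + 1) / 2 : 0;
        // Rows are limited from above by the frame height and by mb_x >= 0.
        int lastRow = Math.min(heightMbs - 1, wave / 2);
        return Math.max(0, lastRow - firstRow + 1);
    }

    /** Iterator widths for all waves, in scheduling order. */
    static int[] allIteratorWidths(int widthMbs, int heightMbs) {
        int[] widths = new int[waveCount(widthMbs, heightMbs)];
        for (int wave = 0; wave < widths.length; wave++) {
            widths[wave] = iteratorWidth(wave, widthMbs, heightMbs);
        }
        return widths;
    }
}
```

For a 4x3 macroblock frame this yields the widths 1, 1, 2, 2, 2, 2, 1, 1, which add up to the 12 macroblocks of the frame. Note that the late waves shrink again; whether you size them exactly like this or keep them larger and skip out-of-frame positions in the kernel is a design choice.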
The question now is how to derive a macroblock position (mb_x, mb_y) within a wave from the position (it_x, it_y) within the iterator. This is entirely task-dependent, but in this case we'll use the following technique:
- fill the iterator with the current wave number (counting from 0, so that the first wave maps to macroblock (0, 0));
- mb_x = wave_number - it_x * 2;
- mb_y = it_x;
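The mapping above can be tried out in plain Java before moving it into a kernel. In this sketch `macroblockAt` applies the two formulas exactly as written; the `inFrame` guard and the loop bound `waveNumber / 2` are my own assumptions, added so that positions outside a `widthMbs` x `heightMbs` frame are skipped in the later waves. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; names are not taken from RenderScript.
final class WaveMapping {
    /** Applies the formulas above: returns {mb_x, mb_y} for iterator cell it_x. */
    static int[] macroblockAt(int waveNumber, int itX) {
        int mbX = waveNumber - itX * 2;
        int mbY = itX;
        return new int[] {mbX, mbY};
    }

    /** Assumed guard: true when the position lies inside the frame. */
    static boolean inFrame(int[] mb, int widthMbs, int heightMbs) {
        return mb[0] >= 0 && mb[0] < widthMbs && mb[1] >= 0 && mb[1] < heightMbs;
    }

    /** Lists the in-frame macroblocks of one wave as "x,y" strings, in it_x order. */
    static List<String> macroblocksOfWave(int waveNumber, int widthMbs, int heightMbs) {
        List<String> result = new ArrayList<>();
        // it_x beyond waveNumber / 2 would give a negative mb_x, so the loop stops there.
        for (int itX = 0; itX <= waveNumber / 2; itX++) {
            int[] mb = macroblockAt(waveNumber, itX);
            if (inFrame(mb, widthMbs, heightMbs)) {
                result.add(mb[0] + "," + mb[1]);
            }
        }
        return result;
    }
}
```

With a 4x3 macroblock frame, wave 2 maps to the macroblocks (2, 0) and (0, 1), while wave 5 maps to (3, 1) and (1, 2) because it_x = 0 would land at mb_x = 5, outside the frame.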
PS: I even think it would be better to schedule the waves from within the RenderScript function; this way we don't have to go back to Java too many times.