Wednesday, August 19, 2015

Just got my Nexus 6

Just got my Nexus 6! The Adreno (GPU) driver for RenderScript is present at '/vendor/lib/libRSDriver_adreno.so'!

When I run my DCT 4x4 test (see below) I see the message 'Successfully loaded runtime: libRSDriver_adreno.so' -- a good sign.

#pragma version(1)
#pragma rs java_package_name(org.jcodec.codecs.h264.rs)
#pragma rs_fp_relaxed

rs_allocation input_alloc;
rs_allocation output_alloc;

// Horizontal pass of the forward 4x4 H.264 (approximated) DCT, applied block-wise to the input
ushort __attribute__((kernel)) fdcth_264_hor(uchar coeff, uint32_t x, uint32_t y)
{
    uint8_t off_x = x & 0x3;
    uint32_t blk_x = x & ~0x3u; // full width: a uint8_t would wrap for x >= 256

    const static int COEFFS[][4] = {
        {1, 1, 1, 1}, {2, 1, -1, -2}, {1, -1, -1, 1}, {1, -2, 2, -1}
    };

    ushort res =
        rsGetElementAt_uchar(input_alloc, blk_x    , y) * COEFFS[off_x][0] +
        rsGetElementAt_uchar(input_alloc, blk_x + 1, y) * COEFFS[off_x][1] +
        rsGetElementAt_uchar(input_alloc, blk_x + 2, y) * COEFFS[off_x][2] +
        rsGetElementAt_uchar(input_alloc, blk_x + 3, y) * COEFFS[off_x][3];


    return res;
}

// Vertical pass of the forward 4x4 H.264 (approximated) DCT, applied block-wise to the horizontal pass output
ushort __attribute__((kernel)) fdcth_264_vert(ushort coeff, uint32_t x, uint32_t y)
{
    uint8_t off_y = y & 0x3;
    uint32_t blk_y = y & ~0x3u; // full width: a uint8_t would wrap for y >= 256

    const static int COEFFS[][4] = {
        {1, 1, 1, 1}, {2, 1, -1, -2}, {1, -1, -1, 1}, {1, -2, 2, -1}
    };

    ushort res =
        rsGetElementAt_ushort(output_alloc, x, blk_y    ) * COEFFS[off_y][0] +
        rsGetElementAt_ushort(output_alloc, x, blk_y + 1) * COEFFS[off_y][1] +
        rsGetElementAt_ushort(output_alloc, x, blk_y + 2) * COEFFS[off_y][2] +
        rsGetElementAt_ushort(output_alloc, x, blk_y + 3) * COEFFS[off_y][3];

    return res;
}
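
To sanity-check the kernel output on a single block, a plain C reference of the same two passes can help. This is only a sketch: the function names fdct4x4_hor and fdct4x4_vert are made up for illustration and are not part of the RenderScript code above. It also assumes signed 16-bit intermediates, whereas the kernels store results as ushort, so negative coefficients should be compared by their low 16 bits.

```c
#include <stdint.h>

/* Same coefficient rows as COEFFS in the kernels above. */
static const int REF_COEFFS[4][4] = {
    {1, 1, 1, 1}, {2, 1, -1, -2}, {1, -1, -1, 1}, {1, -2, 2, -1}
};

/* Horizontal pass (hypothetical helper): out[y][x] = sum over k of in[y][k] * REF_COEFFS[x][k]. */
static void fdct4x4_hor(uint8_t in[4][4], int16_t out[4][4])
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            int sum = 0;
            for (int k = 0; k < 4; k++)
                sum += in[y][k] * REF_COEFFS[x][k];
            out[y][x] = (int16_t)sum;
        }
    }
}

/* Vertical pass (hypothetical helper): out[y][x] = sum over k of in[k][x] * REF_COEFFS[y][k]. */
static void fdct4x4_vert(int16_t in[4][4], int16_t out[4][4])
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            int sum = 0;
            for (int k = 0; k < 4; k++)
                sum += in[k][x] * REF_COEFFS[y][k];
            out[y][x] = (int16_t)sum;
        }
    }
}
```

Feed a 4x4 block of pixels through fdct4x4_hor and then fdct4x4_vert, and compare the result with the matching block read back from the output allocation. A flat block is a quick check: only the top-left (DC) coefficient should be non-zero.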

It gives me 105fps on a 1920x1080 frame! Unfortunately it's all coming from the CPU: at least Trepn tells me my GPU is not loaded at all, while both CPUs are at 100%.

Looking at Logcat, I found a couple of error messages. Apparently there was a problem and Adreno couldn't execute my kernels, so it fell back to the CPU.

08-19 22:34:21.235  10160-10178/com.example.android.basicrenderscript E/Adreno-RS﹕ : ERROR: Address not found for fdcth_264_vert.COEFFS
08-19 22:34:21.235  10160-10178/com.example.android.basicrenderscript W/Adreno-RS﹕ : ERROR: rsdQueryGlobals returned -30

Turns out my kernels couldn't run on the GPU because I had declared the COEFFS array as non-static and non-const (the code above is corrected). Once I made the array 'const static', the errors from Adreno went away, and I can now see that my GPU is loaded at 50%. It is currently giving me around 65fps on 1920x1080 video. Great success!

Just for reference -- the Moto X (first gen) still gives me 16fps out of its 100% loaded dual-core CPU.
