Months ago, I spent a lot of time getting the Raspberry Pi to perform fast JPEG decoding.
While it is fast, it is not quite fast enough for what I want to do with Dexter, so I am now going to spend another chunk of time optimizing it further.
Here are some current decoding speed statistics. I used a high-bitrate JPEG to stress the decoder to its limit.
--------------------------------
Decoding from a memory buffer and rendering on the screen repeatedly:
62 fps (_fields_ per second)
With the fragment shader that handles interlacing the video disabled:
63 fps
Without transferring the decoded image to GLES2 but leaving everything else the same:
80 fps
Transferring the decoded image but not rendering it:
77 fps
Only decoding the JPEG, not transferring it or rendering it:
103 fps
-------------------------
Bottlenecks:
- Transferring the decoded image costs about 20 fps
- Rendering the image costs about 20 fps (but exercising the fragment shader only costs 1 fps, which is a good sign since I rely on it). It's possible I could optimize the vertex shader to reduce rendering time.
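A rough way to compare these numbers is to convert them to time per field, because differences in fps are not additive while differences in milliseconds are. The sketch below is only an estimate: it assumes the stages run one after another with no overlap, which may not hold on the Pi's GPU, and the names `FPS`, `ms_per_field` and `stage_cost_ms` are made up for illustration.

```python
# Measured throughput from the statistics above, in fields per second.
FPS = {
    "full": 62.0,                  # decode + transfer + render
    "no_interlace_shader": 63.0,   # interlacing fragment shader disabled
    "no_transfer": 80.0,           # decoded image not transferred to GLES2
    "no_render": 77.0,             # image transferred but not rendered
    "decode_only": 103.0,          # JPEG decode only
}


def ms_per_field(fps):
    """Convert a rate in fields per second to milliseconds per field."""
    return 1000.0 / fps


def stage_cost_ms(fps_with, fps_without):
    """Estimate one stage's cost as the time saved when it is skipped.

    Assumption: stages run serially, so their times simply add up.
    """
    return ms_per_field(fps_with) - ms_per_field(fps_without)


costs = {
    "transfer": stage_cost_ms(FPS["full"], FPS["no_transfer"]),
    "render": stage_cost_ms(FPS["full"], FPS["no_render"]),
    "interlace_shader": stage_cost_ms(FPS["full"], FPS["no_interlace_shader"]),
    "decode": ms_per_field(FPS["decode_only"]),
}

for name, ms in costs.items():
    print(f"{name}: {ms:.2f} ms per field")
```

Under that assumption, transfer and rendering each come out at a little over 3 ms per field, the interlacing shader at roughly a quarter of a millisecond, and decoding at a bit under 10 ms.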
Conclusions:
- If I can decode the JPEG directly to a texture, then I don't have to transfer it in memory (this may gain up to 20 fps)
- Rendering may be running suboptimally
- I may be able to increase JPEG decoding speed by removing some internal memcpy calls I am doing right now and by increasing the OpenMAX input buffer size
Goal:
- Increase overall decoding/rendering speed from 62 fps to over 100 fps if I can manage it. At a bare minimum, I think I will need to get it over 80 fps to give me some breathing room.
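To get a feel for how much room each target leaves, the target rate can be turned into a time budget per field and compared with the decode-only measurement of 103 fps. This is a back-of-the-envelope sketch under the assumption that decoding, transfer and rendering run serially; the helper names `frame_budget_ms` and `headroom_ms` are hypothetical.

```python
DECODE_ONLY_FPS = 103.0  # measured: decoding the JPEG, no transfer or render


def frame_budget_ms(target_fps):
    """Time available per field, in milliseconds, at the target rate."""
    return 1000.0 / target_fps


def headroom_ms(target_fps, decode_only_fps=DECODE_ONLY_FPS):
    """Time left for transfer and rendering after decoding one field.

    Assumption: the stages do not overlap, so decode time is subtracted
    directly from the per-field budget.
    """
    return frame_budget_ms(target_fps) - 1000.0 / decode_only_fps


for target in (80, 100):
    print(f"{target} fps: budget {frame_budget_ms(target):.2f} ms, "
          f"headroom after decode {headroom_ms(target):.2f} ms")
```

If the serial assumption holds, 80 fps leaves a little under 3 ms per field for everything besides decoding, while 100 fps leaves only about 0.3 ms, so reaching the higher target would likely need faster decoding as well as cheaper transfer and rendering.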
What was the resolution of these JPEGs?