Thanks for the feedback. Addressed your points and attached an updated patch:

- Simplified the documentation to just point to the pad filter for usage, but kept the examples
- Made the adjustments to vf_pad_cuda.c
- Updated vf_pad_cuda.cu. It should be easier to add 10-12 bit support down the line now.