Hi,

On Sun, Feb 25, 2024 at 11:28 AM James Almer wrote:

> On 2/25/2024 1:22 PM, Ronald S. Bultje wrote:
> > On Sun, Feb 25, 2024 at 10:56 AM Ronald S. Bultje
> > wrote:
> >
> >> Hi,
> >>
> >> On Sun, Feb 25, 2024 at 3:28 AM J. Dekker wrote:
> >>
> >>> Weak filter can overflow in delta0 calculation before >> 4 in int16.
> >>>
> >>> Signed-off-by: J. Dekker
> >>> ---
> >>>
> >>> I do not know x86 simd at all, so this is just an attempt to fix
> >>> the implementation rather than write extremely performant code.
> >>>
> >>> Suggestions welcome.
> >>>
> >>
> >> https://pastebin.com/KvcbQ2nK
> >>
> >
> > Attached a slightly adjusted version which does sse2 in 16bit also.
> >
> > Ronald
>
> > diff --git a/libavcodec/x86/hevc_deblock.asm b/libavcodec/x86/hevc_deblock.asm
> > index 85ee4800bb..869301caff 100644
> > --- a/libavcodec/x86/hevc_deblock.asm
> > +++ b/libavcodec/x86/hevc_deblock.asm
> > @@ -31,6 +31,7 @@ cextern pw_1023
> >  pw_pixel_max_12: times 8 dw ((1 << 12)-1)
> >  pw_m2: times 8 dw -2
> >  pd_1 : times 4 dd 1
> > +pd_8 : times 8 dd 8
>
> This is unused.
>

Fixed.

Ronald
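
For reference, the overflow described in the original commit message can be reproduced in scalar C. This is only a sketch, assuming the weak-filter formula from the HEVC spec, delta0 = (9*(q0 - p0) - 3*(q1 - p1) + 8) >> 4. The helper names wrap16, delta0_32 and delta0_16 are made up for illustration and do not come from the patch or from libavcodec.

```c
#include <stdint.h>

/* Hypothetical helper: wrap a value to the signed 16-bit range, the way
 * a 16-bit SIMD lane would after a non-saturating multiply/add. */
int wrap16(int v)
{
    unsigned u = (unsigned)v & 0xFFFFu;   /* keep the low 16 bits */
    return u >= 0x8000u ? (int)u - 0x10000 : (int)u;
}

/* delta0 with the intermediate kept in 32 bits (no overflow for
 * sample values up to 12 bits). */
int delta0_32(int p1, int p0, int q0, int q1)
{
    return (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4;
}

/* delta0 with the intermediate wrapped to 16 bits before the shift.
 * Note: >> on a negative int is assumed to be an arithmetic shift,
 * as it is on the usual compilers. */
int delta0_16(int p1, int p0, int q0, int q1)
{
    int t = wrap16(9 * (q0 - p0) - 3 * (q1 - p1) + 8);
    return t >> 4;
}
```

With 12-bit samples, p0 = 0 and q0 = 4095 give 9*4095 + 8 = 36863 before the shift, which exceeds INT16_MAX (32767), so delta0_16(0, 0, 4095, 0) disagrees with delta0_32(0, 0, 4095, 0), which is 2303. With 10-bit samples the same pattern peaks at 9*1023 + 8 = 9215, which still fits in int16, so both functions return 575.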