From: Ramiro Polla <ramiro.polla@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 2/4] swscale/x86: add sse4 {lum, chr}ConvertRange
Date: Fri, 14 Jun 2024 17:46:22 +0200
Message-ID: <CALweWgB80WH7o0TfE0NoNYiaA4hL58dKtpu_71AdUqbPQeb=tA@mail.gmail.com>
In-Reply-To: <CALweWgBXFbTu2mdcO8ac-t6AkcJyuY9RBK0stSZtv+j3Eiu_6Q@mail.gmail.com>
On Wed, Jun 12, 2024 at 4:54 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
>
> Hi,
>
> On Tue, Jun 11, 2024 at 8:42 PM James Almer <jamrial@gmail.com> wrote:
> >
> > On 6/11/2024 3:26 PM, Michael Niedermayer wrote:
> > > On Tue, Jun 11, 2024 at 02:28:56PM +0200, Ramiro Polla wrote:
> > >> chrRangeFromJpeg_8_c: 28.7
> > >> chrRangeFromJpeg_8_sse4: 16.2
> > >> chrRangeFromJpeg_24_c: 152.7
> > >> chrRangeFromJpeg_24_sse4: 29.7
> > >> chrRangeFromJpeg_128_c: 366.5
> > >> chrRangeFromJpeg_128_sse4: 233.0
> > >> chrRangeFromJpeg_144_c: 408.0
> > >> chrRangeFromJpeg_144_sse4: 182.5
> > >> chrRangeFromJpeg_256_c: 698.7
> > >> chrRangeFromJpeg_256_sse4: 325.5
> > >> chrRangeFromJpeg_512_c: 1348.7
> > >> chrRangeFromJpeg_512_sse4: 660.2
> > >> chrRangeToJpeg_8_c: 37.7
> > >> chrRangeToJpeg_8_sse4: 16.2
> > >> chrRangeToJpeg_24_c: 115.7
> > >> chrRangeToJpeg_24_sse4: 36.2
> > >> chrRangeToJpeg_128_c: 631.2
> > >> chrRangeToJpeg_128_sse4: 163.7
> > >> chrRangeToJpeg_144_c: 710.7
> > >> chrRangeToJpeg_144_sse4: 183.0
> > >> chrRangeToJpeg_256_c: 1253.0
> > >> chrRangeToJpeg_256_sse4: 343.5
> > >> chrRangeToJpeg_512_c: 2491.2
> > >> chrRangeToJpeg_512_sse4: 654.2
> > >> lumRangeFromJpeg_8_c: 11.7
> > >> lumRangeFromJpeg_8_sse4: 10.5
> > >> lumRangeFromJpeg_24_c: 38.5
> > >> lumRangeFromJpeg_24_sse4: 19.0
> > >> lumRangeFromJpeg_128_c: 237.5
> > >> lumRangeFromJpeg_128_sse4: 79.2
> > >> lumRangeFromJpeg_144_c: 255.7
> > >> lumRangeFromJpeg_144_sse4: 90.5
> > >> lumRangeFromJpeg_256_c: 441.5
> > >> lumRangeFromJpeg_256_sse4: 161.7
> > >> lumRangeFromJpeg_512_c: 879.0
> > >> lumRangeFromJpeg_512_sse4: 333.2
> > >> lumRangeToJpeg_8_c: 20.0
> > >> lumRangeToJpeg_8_sse4: 11.7
> > >> lumRangeToJpeg_24_c: 61.5
> > >> lumRangeToJpeg_24_sse4: 17.7
> > >> lumRangeToJpeg_128_c: 357.5
> > >> lumRangeToJpeg_128_sse4: 80.0
> > >> lumRangeToJpeg_144_c: 371.5
> > >> lumRangeToJpeg_144_sse4: 93.2
> > >> lumRangeToJpeg_256_c: 651.5
> > >> lumRangeToJpeg_256_sse4: 164.5
> > >> lumRangeToJpeg_512_c: 1279.0
> > >> lumRangeToJpeg_512_sse4: 333.7
> > >> ---
> > >> libswscale/swscale_internal.h | 1 +
> > >> libswscale/utils.c | 2 +
> > >> libswscale/x86/Makefile | 1 +
> > >> libswscale/x86/range_convert.asm | 130 +++++++++++++++++++++++++++++++
> > >> libswscale/x86/swscale.c | 36 +++++++++
> > >> 5 files changed, 170 insertions(+)
> > >> create mode 100644 libswscale/x86/range_convert.asm
> > >
> > > breaks x86-32 build
> > >
> > > LD ffmpeg_g
> > > /usr/lib/gcc-cross/i686-linux-gnu/7/../../../../i686-linux-gnu/bin/ld: libswscale/libswscale.a(utils.o): in function `sws_setColorspaceDetails':
> > > ffmpeg/linux32/src/libswscale/utils.c:1086: undefined reference to `ff_sws_init_range_convert_x86'
> > > collect2: error: ld returned 1 exit status
> > > make: *** [Makefile:139: ffmpeg_g] Error 1
> > >
> > > thx
> >
> > The functions are wrapped in ARCH_X86_64 checks for seemingly no reason,
> > so they should be removed in the next iteration.
>
> Fixed.
>
> James walked me through on IRC to optimize and improve the functions
> in a way that they work both with sse2 and avx2. New patch attached.
I'll apply tomorrow if there are no more comments.