From: "Martin Storsjö" <martin@martin.st>
To: Hubert Mazur <hum@semihalf.com>
Cc: gjb@semihalf.com, upstream@semihalf.com, jswinney@amazon.com,
ffmpeg-devel@ffmpeg.org, mw@semihalf.com, spop@amazon.com
Subject: Re: [FFmpeg-devel] [PATCH 0/5] Provide optimized neon implementation
Date: Fri, 9 Sep 2022 10:32:48 +0300 (EEST)
Message-ID: <8d79d28-6336-fe5d-7c15-e5f0aa951731@martin.st>
In-Reply-To: <20220908092507.63319-1-hum@semihalf.com>
On Thu, 8 Sep 2022, Hubert Mazur wrote:
> Fix minor issues in the patches.
> Regarding vsse16 I didn't change saba & umlal to sub & smlal.
> It doesn't affect the performance, so left it as it was.
> The majority of changes refer to nsse16:
> - fixed indentation (thanks for pointing out),
> - applied the patch from Martin which fixes the balance
> within instructions,
> - interleaved instructions - apparently this helped a little
> to achieve better benchmarks.
Thanks! I measured a small further improvement on A53 with this change;
from 377 to 370 cycles.
> I have also updated the benchmark results for each function -
> not a huge performance improvement, but worth the effort.
> The results for nsse and vsse are shown below (these are the
> biggest changes).
> - vsse16 asm from 64.7 to 59.2,
> - nsse16 asm from 120.0 to 116.5.
It's kinda surprising that the difference is so small, since we reduced
the amount of work done in the functions quite significantly (IIRC on A53,
the speedup was something like 1.5x compared with the original), but I
guess it's understandable if the Graviton 3 is so powerful that there are
enough spare execution units that a bunch of redundant instructions
don't really matter.
Anyway, this revision of the patchset looked good to me, so I pushed it
now. Thanks!
// Martin
Thread overview: 9+ messages
2022-09-08 9:25 Hubert Mazur
2022-09-08 9:25 ` [FFmpeg-devel] [PATCH 1/5] lavc/aarch64: Add neon implementation for vsad16 Hubert Mazur
2022-09-08 9:25 ` [FFmpeg-devel] [PATCH 2/5] lavc/aarch64: Add neon implementation of vsse16 Hubert Mazur
2022-09-08 9:25 ` [FFmpeg-devel] [PATCH 3/5] lavc/aarch64: Add neon implementation for vsad_intra16 Hubert Mazur
2022-09-08 9:25 ` [FFmpeg-devel] [PATCH 4/5] lavc/aarch64: Add neon implementation for vsse_intra16 Hubert Mazur
2022-09-08 9:25 ` [FFmpeg-devel] [PATCH 5/5] lavc/aarch64: Provide neon implementation of nsse16 Hubert Mazur
2022-09-09 7:32 ` Martin Storsjö [this message]
-- strict thread matches above, loose matches on Subject: below --
2022-09-06 10:27 [FFmpeg-devel] [PATCH 0/5] Provide optimized neon implementation Hubert Mazur
2022-09-07 8:55 ` Martin Storsjö