Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Lynne <dev@lynne.ee>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 4/7] checkasm: use pointers for start/stop functions
Date: Sun, 16 Jul 2023 22:32:21 +0200 (CEST)
Message-ID: <N_VOrgM--3-9@lynne.ee>
In-Reply-To: <2884373.PGHYOZE5d7@basile.remlab.net>

Jul 15, 2023, 22:13 by remi@remlab.net:

> On Saturday, 15 July 2023, 20.43.26 EEST, Lynne wrote:
>
>> Jul 15, 2023, 10:26 by remi@remlab.net:
>> > On Saturday, 15 July 2023, 11.05.51 EEST, Lynne wrote:
>> >> Jul 14, 2023, 20:29 by remi@remlab.net:
>> >> > This routes all calls to the bench start and stop functions through
>> >> > function pointers. While the primary goal is to support run-time
>> >> > selection of the performance measurement back-end in later commits,
>> >> > this has the side benefit of moving platform dependencies into
>> >> > checkasm.c and out of checkasm.h.
>> >> > ---
>> >> > 
>> >> >  tests/checkasm/checkasm.c | 33 ++++++++++++++++++++++++++++-----
>> >> >  tests/checkasm/checkasm.h | 31 ++++---------------------------
>> >> >  2 files changed, 32 insertions(+), 32 deletions(-)
>> >> 
>> >> Not sure I agree with this commit: the overhead can be detectable,
>> >> and we have a lot of small functions whose runtime is only a few
>> >> times that of a null function call.
>> > 
>> > I don't think the function call is ever null. The pointers are left NULL
>> > only if none of the backends initialise. But then, checkasm will bail out
>> > and exit before we try to benchmark anything anyway.
>> > 
>> > As for the real functions, they always do *something*. None of them "just
>> > return 0".
>>
>> I meant a no-op function call to measure the overhead of function
>> calls themselves, complete with all the ABI stuff.
>>
>
> I
>
>>
>> >> Can you store the function pointers out of the loop to reduce
>> >> the derefs needed?
>> > 
>> > Taking just the two loads out of the loop should be feasible, but it
>> > seems rather vain. You will still have the overhead of the indirect
>> > function call, the function, and most importantly in the case of Linux
>> > perf and macOS kperf, the system calls.
>> > 
>> > The only ways to avoid the indirect function calls are to use IFUNC
>> > (tricky and not portable), or to make horrible macros to spawn one bench
>> > loop for each backend.
>> > 
>> > In the end, I think we should rather aim for as constant time as possible,
>> > rather than as fast as possible, so that the nop loop can estimate the
>> > benchmarking overhead as well as possible. In this respect, I think it is
>> > actually marginally better *not* to cache the function pointers in local
>> > variables, which could end up spilled on the stack, or not, depending on
>> > local compiler optimisations for any given test case.
>>
>> I disagree, uninlining the timer fetches adds another source of
>> inconsistency.
>>
>
> Err, outlining the timer makes sure that it's always the exact same code 
> that's run, and not differently optimised inlinings, at least if LTO is absent. 
> (And even with LTO, it vastly reduces the compiler's ability to optimise and 
> vary the compilation.) Again, given how the calculations are made at the 
> moment, the stability of the overhead is important, so that we can *compare* 
> measurements. The absolute value of the overhead, not so much.
>

Introducing additional overhead in the form of a dereference is a point
where instability can creep in. Can you guarantee that a context will
always remain in L1D cache, as opposed to just reading the raw CPU timing
directly where that's supported?
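
To make the two options concrete, here is a minimal C sketch of what is being
compared. It is not the checkasm code: the names read_time_inline,
read_time_outlined, struct bench_ctx and bench_backend are hypothetical, and
clock_gettime() stands in for a raw CPU timer read only to keep the sketch
portable. Each function returns the average cost of an empty measurement, once
with the timer inlined and once through pointers loaded from a context.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a raw timer read; clock_gettime() is used
 * here only for portability, it is not what checkasm inlines. */
static inline uint64_t read_time_inline(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* The same timer behind an ordinary function, so it can be called
 * through a pointer. */
static uint64_t read_time_outlined(void)
{
    return read_time_inline();
}

/* Hypothetical context holding the selected back-end. */
struct bench_ctx {
    uint64_t (*start)(void);
    uint64_t (*stop)(void);
};

/* Non-static on purpose: the pointers are loaded from memory on every
 * iteration instead of being folded into direct calls. */
struct bench_ctx bench_backend = { read_time_outlined, read_time_outlined };

/* Average cost of an empty measurement with the timer inlined. */
uint64_t overhead_inline(int runs)
{
    uint64_t sum = 0;
    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t = read_time_inline();
        sum += read_time_inline() - t;
    }
    return sum / (uint64_t)runs;
}

/* Average cost of an empty measurement through the context's pointers:
 * two extra loads and two indirect calls per iteration. */
uint64_t overhead_indirect(int runs)
{
    uint64_t sum = 0;
    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t = bench_backend.start();
        sum += bench_backend.stop() - t;
    }
    return sum / (uint64_t)runs;
}
```

Calling overhead_inline(100000) and overhead_indirect(100000) several times in
a row and comparing both the averages and how much they move between runs is
one rough way to see whether the extra loads matter on a given machine. The
numbers depend entirely on the CPU, the compiler and the timer used.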


> But I still argue that that is, either way, completely negligible compared to 
> the *existing* overhead. Each loop is making 4 system calls, and each of those 
> system calls requires a direct call (to PLT) and an indirect branch (from GOT). 
> If you have a problem with the two additional function calls, then you can't 
> be using Linux perf in the first place.
>

You never want to use Linux perf in the first place; it's second class.
I don't think it's worth changing the direct inlining we had before. You're not
interested in whether the exact same code is run across platforms, just that
the code measuring the timing is as efficient and low-overhead as possible.
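
For reference, the disagreement above comes down to how the cost of an empty
measurement is removed from a result. The C sketch below assumes the simplest
scheme, where the mean of the empty (nop) samples is subtracted from the mean
of the raw samples. The helper names mean_u64, spread_u64 and corrected_cost
are hypothetical and are not taken from checkasm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of n samples (n must be non-zero). */
uint64_t mean_u64(const uint64_t *v, size_t n)
{
    uint64_t sum = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / n;
}

/* Largest minus smallest sample: a crude measure of how stable the
 * measurement overhead is. */
uint64_t spread_u64(const uint64_t *v, size_t n)
{
    uint64_t lo, hi;
    assert(n > 0);
    lo = hi = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] < lo)
            lo = v[i];
        if (v[i] > hi)
            hi = v[i];
    }
    return hi - lo;
}

/* Corrected cost: mean raw measurement minus mean empty measurement,
 * clamped at zero so that noise cannot produce a wrapped-around value. */
uint64_t corrected_cost(const uint64_t *raw, size_t n_raw,
                        const uint64_t *nop, size_t n_nop)
{
    uint64_t r = mean_u64(raw, n_raw);
    uint64_t o = mean_u64(nop, n_nop);
    return r > o ? r - o : 0;
}
```

Under this scheme a constant overhead cancels out exactly, whatever its size,
while any spread in the nop samples carries straight into the corrected value.
A smaller overhead only helps to the extent that it also brings a smaller
spread.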


Thread overview: 23+ messages
2023-07-14 18:26 [FFmpeg-devel] [PATCH 0/7] checkasm RISC-V Linux perf enablement Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 1/7] checkasm: fix Linux perf cleanup Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 2/7] checkasm: improve Linux perf error message Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 3/7] checkasm: make perf macros functional Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 4/7] checkasm: use pointers for start/stop functions Rémi Denis-Courmont
2023-07-15  8:05   ` Lynne
2023-07-15  8:25     ` Rémi Denis-Courmont
2023-07-15 17:43       ` Lynne
2023-07-15 20:13         ` Rémi Denis-Courmont
2023-07-16 20:32           ` Lynne [this message]
2023-07-17  5:18             ` Rémi Denis-Courmont
2023-07-17 17:48               ` Lynne
2023-07-17 18:09                 ` Rémi Denis-Courmont
2023-07-18 21:32                   ` Lynne
2023-07-19 15:58                     ` Rémi Denis-Courmont
2023-07-24 21:26                   ` [FFmpeg-devel] [TC] " Martin Storsjö
2023-07-24 21:33                     ` Nicolas George
2023-07-24 22:19                     ` Lynne
2023-07-24 22:57                       ` Kieran Kunhya
2023-07-25  6:44                       ` Martin Storsjö
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 5/7] checkasm: remove unused variable Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 6/7] checkasm: allow run-time fallback to AV_READ_TIME Rémi Denis-Courmont
2023-07-14 18:28 ` [FFmpeg-devel] [PATCH 7/7] configure: enable Linux perf on RISC-V by default Rémi Denis-Courmont
