On 21/05/2024 14:32, J. Dekker wrote:
> Some timers on certain device and test combinations can produce noisy
> results, affecting the reliability of performance measurements. One
> notable example of this is the Canaan K230 RISC-V development board.
>
> An option to adjust the number of samples (--samples) has been added,
> allowing developers to increase or adjust the sample count for more
> reliable results.
>
> Signed-off-by: J. Dekker
> ---
>
> Auto-detection can be added later when either a count is omitted or a specific
> value or term such as '0' or 'auto' is provided. This is a development tool,
> the users will be developers primarily working on master who follow checkasm
> changes and/ or add their own tests and functionality; there's no need to
> support a feature like this or deprecate it for years if a better solution
> is submitted.
>
>  tests/checkasm/checkasm.c | 12 +++++++++++-
>  tests/checkasm/checkasm.h |  5 +++--
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> index 31ca9f6e2b..b8e5cfb9dd 100644
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -72,6 +72,9 @@
>  void (*checkasm_checked_call)(void *func, int dummy, ...) = checkasm_checked_call_novfp;
>  #endif
>
> +/* Trade-off between speed and accuracy */
> +uint64_t bench_runs = 1000;
> +
>  /* List of tests to invoke */
>  static const struct {
>      const char *name;
> @@ -820,7 +823,7 @@ static void bench_uninit(void)
>  static int usage(const char *path)
>  {
>      fprintf(stderr,
> -            "Usage: %s [--bench] [--test=] [--verbose] [seed]\n",
> +            "Usage: %s [--bench] [--samples=] [--test=] [--verbose] [seed]\n",
>              path);
>      return 1;
>  }
> @@ -867,6 +870,13 @@ int main(int argc, char *argv[])
>              state.test_name = arg + 7;
>          } else if (!strcmp(arg, "--verbose") || !strcmp(arg, "-v")) {
>              state.verbose = 1;
> +        } else if (!strncmp(arg, "--samples=", 10)) {
> +            l = strtoul(arg + 10, &end, 10);
> +            if (*end == '\0') {
> +                bench_runs = l;
> +            } else {
> +                return usage(argv[0]);
> +            }
>          } else if ((l = strtoul(arg, &end, 10)) <= UINT_MAX &&
>                     *end == '\0') {
>              seed = l;
> diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
> index 07fcc751ff..d6921cc50c 100644
> --- a/tests/checkasm/checkasm.h
> +++ b/tests/checkasm/checkasm.h
> @@ -167,7 +167,7 @@ extern AVLFG checkasm_lfg;
>
>  static av_unused void *func_ref, *func_new;
>
> -#define BENCH_RUNS 1000 /* Trade-off between accuracy and speed */
> +extern uint64_t bench_runs;
>
>  /* Decide whether or not the specified function needs to be tested */
>  #define check_func(func, ...) (checkasm_save_context(), func_ref = checkasm_check_func((func_new = func), __VA_ARGS__))
> @@ -338,8 +338,9 @@ typedef struct CheckasmPerf {
>              uint64_t tsum = 0;\
>              int ti, tcount = 0;\
>              uint64_t t = 0; \
> +            const uint64_t truns = bench_runs;\
>              checkasm_set_signal_handler_state(1);\
> -            for (ti = 0; ti < BENCH_RUNS; ti++) {\
> +            for (ti = 0; ti < truns; ti++) {\
>                  PERF_START(t);\
>                  tfunc(__VA_ARGS__);\
>                  tfunc(__VA_ARGS__);\

While working on the FFT asm with https://github.com/cyanreg/lavu_fft_test,
which has a built-in benchmark, I found that specifying the run count as a
power of two works best, since adding more and more decimal digits at the
end is prone to under- or overshooting. For large functions, 1 << 16 is a
good starting point, while for very small functions 1 << 23 works better.

I suggest renaming --samples to --runs (or --bench-runs, but we're all too
lazy to type that), documenting it as "--runs=", and rejecting any value
large enough to overflow.
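
A rough sketch of what the exponent parsing and overflow rejection could look
like, not tested against the tree. parse_runs_exponent() is a made-up helper
name, and the cap of 30 is an assumption, picked only because the bench loop
in the quoted patch keeps "int ti" as its counter:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of checkasm: parse the text following
 * "--runs=" as a power-of-two exponent and store 1 << exponent in *runs.
 * Returns 0 on success, -1 if the value is malformed or too large. */
static int parse_runs_exponent(const char *s, uint64_t *runs)
{
    char *end;
    unsigned long e;

    /* strtoul() accepts leading whitespace and a sign, so require a digit */
    if (*s < '0' || *s > '9')
        return -1;

    errno = 0;
    e = strtoul(s, &end, 10);
    if (errno || *end != '\0')
        return -1;

    /* Assumed limit: with an int loop counter, 1 << 31 and above
     * could never be reached, so reject those exponents outright */
    if (e > 30)
        return -1;

    *runs = UINT64_C(1) << e;
    return 0;
}
```

In main() this would replace the plain strtoul() call, e.g.
"if (parse_runs_exponent(arg + 7, &bench_runs) < 0) return usage(argv[0]);".
With that, --runs=16 gives 65536 runs and --runs=23 gives 8388608, while
--runs=64 or --runs=1e6 end up in usage().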