Okay, I revert the volatile in ff_read_time

How about this version?

use vls instead vlseg, and use vfmacc

The benchmark is sometimes better, sometimes the same

fcmul_add_c: 3.5
fcmul_add_rvv_f32: 3.5
 - af_afir.fcmul_add [OK]
fcmul_add_c: 4.5
fcmul_add_rvv_f32: 4.2
 - af_afir.fcmul_add [OK]
fcmul_add_c: 4.2
fcmul_add_rvv_f32: 4.2
 - af_afir.fcmul_add [OK]
fcmul_add_c: 4.5
fcmul_add_rvv_f32: 4.2
 - af_afir.fcmul_add [OK]
fcmul_add_c: 4.7
fcmul_add_rvv_f32: 3.5


Rémi Denis-Courmont <remi@remlab.net> 于2023年9月28日周四 00:41写道：

> Le tiistaina 26. syyskuuta 2023, 12.24.58 EEST flow gg a écrit :
> > benchmark:
> > fcmul_add_c: 19.7
> > fcmul_add_rvv_f32: 6.7
>
> With optimisations enabled and the benchmarking fix, I get this (on the
> same
> hardware, I believe):
>
> fcmul_add_c: 3.5
> fcmul_add_rvv_f32: 6.7
>
> For sure unfortunate design limitations of T-Head C910 are to blame to no
> small extent. It is not the first occurrence of an RVV optimisation that
> turns
> out worse than scalar due to those, and I still have honest hopes that
> newer
> (and conformant) IP would give saner results, but... I also believe that
> the
> code could be improved regardless.
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/
>
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>