Sorry for the long delay in responding.

How does the modified patch look now?

It no longer uses register stride (I learned that from your code) and has
switched to shNadd instead of mul.
(It uses m4 and m2, as they are slightly faster than m8 and m4.)

benchmark:
fcmul_add_c: 2179
fcmul_add_rvv_f32: 1652

Rémi Denis-Courmont wrote on Thu, 28 Sep 2023 at 21:33:

>
>
> On 28 September 2023 08:45:44 GMT+03:00, flow gg wrote:
> >Okay, I revert the volatile in ff_read_time
> >
> >How about this version?
>
> It's still using register stride which is all but guaranteed to be slow on
> any hardware and should only be used as a last resort.
>
> The code is also missing scheduling for multi-issue and unrolling with the
> group multiplier.
>
> And lastly, while that probably won't change much, there are no reasons to
> use mul here. You can use shNadd like existing code does.
>
> >
> >use vls instead vlseg, and use vfmacc
> >
> >The benchmark is sometimes better, sometimes the same
> >
> >fcmul_add_c: 3.5
> >fcmul_add_rvv_f32: 3.5
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.5
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.2
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.5
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.7
> >fcmul_add_rvv_f32: 3.5
> >
> >
> >Rémi Denis-Courmont wrote on Thu, 28 Sep 2023 at 00:41:
> >
> >> On Tuesday, 26 September 2023, 12:24:58 EEST, flow gg wrote:
> >> > benchmark:
> >> > fcmul_add_c: 19.7
> >> > fcmul_add_rvv_f32: 6.7
> >>
> >> With optimisations enabled and the benchmarking fix, I get this (on the
> >> same hardware, I believe):
> >>
> >> fcmul_add_c: 3.5
> >> fcmul_add_rvv_f32: 6.7
> >>
> >> For sure unfortunate design limitations of T-Head C910 are to blame to no
> >> small extent. It is not the first occurrence of an RVV optimisation that
> >> turns out worse than scalar due to those, and I still have honest hopes
> >> that newer (and conformant) IP would give saner results, but... I also
> >> believe that the code could be improved regardless.
> >>
> >> --
> >> Rémi Denis-Courmont
> >> http://www.remlab.net/
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel@ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
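
For readers following the thread, here is a rough C sketch of the two points under discussion. It is not the patch and not FFmpeg's code: fcmul_add_ref, advance_mul and advance_sh3add are hypothetical names, and the loop is a simplified assumption of what a complex multiply-accumulate over interleaved (re, im) floats looks like. The second half shows why a mul is unnecessary for stepping the pointers: one complex float is 8 bytes, so advancing by vl elements is base + (vl << 3), which is what sh3add computes in a single instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified scalar reference (assumption, not FFmpeg's exact code):
 * sum[n] += t[n] * c[n] for len complex values stored as (re, im) pairs. */
static void fcmul_add_ref(float *sum, const float *t, const float *c,
                          ptrdiff_t len)
{
    for (ptrdiff_t n = 0; n < len; n++) {
        const float tre = t[2 * n], tim = t[2 * n + 1];
        const float cre = c[2 * n], cim = c[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim; /* real part */
        sum[2 * n + 1] += tre * cim + tim * cre; /* imaginary part */
    }
}

/* Pointer step written with a multiply: vl complex floats of 8 bytes each. */
static uintptr_t advance_mul(uintptr_t base, size_t vl)
{
    return base + vl * 8;
}

/* Same step as a shift-and-add; "sh3add rd, rs1, rs2" computes
 * rs2 + (rs1 << 3). */
static uintptr_t advance_sh3add(uintptr_t base, size_t vl)
{
    return base + (vl << 3);
}
```

Both helpers return the same address for any vl that does not overflow; the shift-and-add form simply avoids the separate multiply in the loop.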