Feedback
I wanted to share some feedback based on my experience comparing this model with other audio super-resolution models on 16 kHz speech audio files.
In terms of perceived quality, my ranking is FlowHigh > LavaSR > NovaSR > FlashSR, with FlowHigh and LavaSR extremely close.
Regarding artifacts, FlowHigh and LavaSR perform very similarly. FlowHigh might have a slight edge, but overall they trade blows.
NovaSR introduces more noticeable artifacts and sounds worse, while FlashSR produces significantly more artifacts.
One notable difference is inference speed. On an RTX 5070 Ti, FlowHigh runs at roughly 45x RTFx, whereas LavaSR reaches around 900x RTFx, which is a substantial advantage.
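For anyone unfamiliar with the metric: RTFx, as I understand it is commonly defined, is the inverse real-time factor, i.e. seconds of audio processed per second of wall-clock time. Below is a minimal sketch of how such a number could be measured. The `enhance` callable is a hypothetical placeholder for whatever model call you use; it is not the actual FlowHigh or LavaSR API.

```python
import time


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: audio duration divided by processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def measure_rtfx(enhance, samples, sample_rate: int) -> float:
    """Time one call to `enhance` (hypothetical model wrapper) and return RTFx.

    `samples` is any sequence of audio samples at `sample_rate` Hz.
    """
    audio_seconds = len(samples) / sample_rate
    start = time.perf_counter()
    enhance(samples)  # placeholder for the real super-resolution call
    elapsed = time.perf_counter() - start
    return rtfx(audio_seconds, elapsed)


# 90 s of audio processed in 2 s of wall-clock time
print(rtfx(90.0, 2.0))
```

On a GPU the timing only means something if the device has finished its work before the timer stops (for example by synchronizing first), and a warm-up run or two usually helps keep one-time setup cost out of the number.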
Overall, the quality of LavaSR is excellent, and the faster speed is a huge plus. I will switch from FlowHigh to LavaSR.
Hey @or965, thanks for the feedback!
Glad you found LavaSR useful! I'm definitely planning on releasing an improved version with training code this week. It should be even faster and higher quality.
Hope that fixes the artifacts you found with this one.
Best regards,
Yatharth Sharma