Quote:
Originally Posted by Dialhot
Meanwhile, can you please work on that point:
"Optimize SSE code with regard to pairing, stalls, and so on."
There's really not much more optimization that can be done on Flux. Sh0dan added memory alignment and changed one mov instruction to a different, faster kind. I'm sure an SSE expert could find another 10% or so, but I'm no expert (in fact it took me an hour or so to figure out how the SSE code -- that I wrote -- works).
Quote:
Fluxsmooth is currently one of the slowest filters I use and... in fact I don't use it just for that.
I don't find it that slow. On my Athlon XP 2100+ this script:
Code:
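# Load the debug build that provides the new FluxSmoothTest filter
LoadPlugin("..\debug25\fluxsmoothD-2.5.dll")
Mpeg2Source("test.d2v")
# Inverse telecine (field matching, then decimation)
Telecide(order=1)
Decimate()
# Keep frames 908 through 994 (87 frames)
Trim(908, 994)
Tweak(bright=10)
# Keep a 240x480 region from the top-left so three copies fit side by side
Crop(0, 0, 240, 480)
# Temporal smoothing only; spatial_threshold=-1 turns the spatial part off
old = FluxSmooth(temporal_threshold=3, spatial_threshold=-1)
new = FluxSmoothTest(temporal_threshold=3, spatial_threshold=-1)
StackHorizontal(last.Subtitle("Original"), old.Subtitle("1.0.1"), new.Subtitle("1.0.2"))
ConvertToYUY2()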
runs at full speed in Media Player Classic...
To me the speed of a filter is not as important as the results gained from it, and I think Flux does pretty well in that regard. If I could find a 2.5 version of Dust I'd happily use it. I don't care if an encode takes eight hours, because I'll be sleeping anyway.
I used that script to produce a Huffyuv sample of the new Flux (without the current pixel included in the average) against the old. I've uploaded it here, but it's 40 MB, so don't download it if you have a slow connection.
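For anyone wondering what "without the current pixel included in the average" changes, here is a rough single-pixel sketch in Python. It is not the actual Flux code (that is C and SSE). It assumes the temporal rule is: a pixel is only touched when it is above both or below both of its previous-frame and next-frame neighbours, and it is then averaged with whichever neighbours lie within temporal_threshold. The function name, the include_current flag and the rounding are made up for illustration.

```python
def flux_temporal(prev, curr, nxt, threshold=3, include_current=True):
    """Illustrative temporal smoothing of one pixel value (hypothetical helper).

    prev, curr, nxt are the same pixel in the previous, current and next frame.
    include_current=True mimics the old behaviour described above,
    include_current=False mimics the new test build.
    """
    # Only smooth a pixel that sticks out in the same direction from both neighbours.
    is_fluctuation = (prev < curr and nxt < curr) or (prev > curr and nxt > curr)
    if not is_fluctuation:
        return curr

    # Use only the neighbours that are close enough to the current value.
    neighbours = [p for p in (prev, nxt) if abs(p - curr) <= threshold]
    if not neighbours:
        return curr

    samples = neighbours + ([curr] if include_current else [])
    # Integer mean, rounded to nearest (assumed rounding, for illustration only).
    return (sum(samples) + len(samples) // 2) // len(samples)


old_result = flux_temporal(100, 103, 100, include_current=True)
new_result = flux_temporal(100, 103, 100, include_current=False)
print(old_result, new_result)
```

With the current pixel left out, a noisy value is pulled all the way to its neighbours rather than part of the way, which would fit both the stronger noise reduction on the wall and the slight softening on the face.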
Look in particular at the wall behind the cat. You'll see that the new version is slightly more effective at reducing the (considerable) noise in the original. I also included a closeup of a guy's face, which I find to be one of the better ways of spotting oversmoothing.
I'd appreciate input on whether people think the slight softening of the guy's face in the new version is worth the extra noise reduction seen in the wall. It certainly increases compression a bit:
Old, CCE Q1 = 52,514 bytes per frame
New, CCE Q1 = 52,278 bytes per frame
But bear in mind the clip is only 87 frames long.
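To put those two numbers in perspective, a quick calculation in Python. The figures are the ones quoted above, and the frame count comes from Trim(908, 994); the function name is just for illustration.

```python
# Per-frame sizes quoted above (CCE Q1), in bytes.
OLD_BYTES_PER_FRAME = 52_514
NEW_BYTES_PER_FRAME = 52_278
FRAME_COUNT = 87  # Trim(908, 994) keeps 994 - 908 + 1 frames


def compression_gain(old, new, frames):
    """Return (bytes saved per frame, percent saved, bytes saved over the clip)."""
    saved_per_frame = old - new
    percent = 100.0 * saved_per_frame / old
    return saved_per_frame, percent, saved_per_frame * frames


saved, percent, total = compression_gain(
    OLD_BYTES_PER_FRAME, NEW_BYTES_PER_FRAME, FRAME_COUNT
)
print(f"{saved} bytes/frame saved ({percent:.2f}%), {total} bytes over {FRAME_COUNT} frames")
```

That is well under one percent, so on a clip this short it shows the direction of the change more than its size.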
(Note: The clip is a little dark, so you might want to use the "superbright" feature, if your monitor has it.)