Suggestion - Statistics

GFR · #1 11-11-2003, 06:43 AM

What we do now is take some equally spaced samples from the movie and hope that we took a sufficient amount of relevant data to calculate the final movie size. For some movies it works, for some it doesn't.

That's because for some movies we have less variation in the compressibility of frames, and for others we have more variation (and so we need more samples). For some movies the equally spaced sampling is OK; for others it's not a lucky choice (although if the number of samples is "sufficiently large" that is not a critical issue).

Now if we could scan the movie first for some information on the compressibility, maybe a motion metric based on the MA script, or the Q levels from a fast 1st pass, or some of that info that Bitrate Viewer shows (and that I can't understand), and calculate some basic statistics like the average and the variance, we could use statistical decision theory to decide how many samples we need for that movie, to make sure we meet a given error margin. We could even use that info to choose which samples are to be selected (this is a little more complicated).

This "statistical" prediction would then have two phases: a first phase where we analyze the movie to decide how many samples we need, and a second phase that is the prediction itself.

If the first phase can be made fast enough, predictions will not only be more accurate but they can be faster too, because for "easy" movies it will use fewer samples, and for "hard" movies you won't need to do it again and again and again until it works...
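
The "how many samples" step can be sketched with the textbook sample-size formula for estimating a mean, n = (z * sigma / E)^2. This is only a minimal sketch, assuming the per-sample compressibility values are roughly independent; the function name `samples_needed` and the example numbers are hypothetical and not taken from any existing tool.

```python
import math

def samples_needed(std_dev, error_margin, z=1.96):
    """Number of samples so that the estimated mean lies within
    error_margin of the true mean (z = 1.96 is roughly 95% confidence).
    std_dev and error_margin must be in the same units."""
    if std_dev < 0 or error_margin <= 0:
        raise ValueError("std_dev must be >= 0 and error_margin > 0")
    return max(1, math.ceil((z * std_dev / error_margin) ** 2))

# Hypothetical numbers: same error margin, different spread.
easy = samples_needed(std_dev=5.0, error_margin=2.0)
hard = samples_needed(std_dev=20.0, error_margin=2.0)
print(easy, hard)
```

Four times the spread asks for roughly sixteen times the samples, which is why an "easy" movie could get away with a much shorter prediction.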

Krassi · #2 11-11-2003, 08:58 AM

If I remember well, we already tried this once.

Do you know mpeg_stat?

It's a nice little command-line tool that displays statistics of an MPEG-1 file.
MPEG-2 isn't supported.

You can find the Windows port here:
http://www.krassi.de.tf

Do you know any other tools that show bitrate values (peak etc.)?

EDIT: Here's a part of the result:

Code:

SUMMARY:

Total Bytes read: 19729619. Total number of frames: 3895.  Length is 155.80 sec

Width: 720      Height: 576
Avg. Frame Size: 5038 bytes + 3 bits  (average rate 1007697.48 bits/sec)

Total Compression Rate:  0.40 % of uncompressed 24 bit images
                        =  0.10 bits per pixel

Number of Macroblocks [width * height = sum]: 45 x 36 = 1620 per frame
Skipped Macroblocks = 2907978 (48.67%), Coded Macroblocks = 3066508 (51.33%)
        Coded blocks: 20.64%    [ 20.73  20.79  20.68  20.71  9.88  7.20 ]

MPEG-Viewer requirements:
        Pixel aspect ratio of 0.9375 (CCIR 601, 625 lines)
        Required display speed: 25 frames/sec
        Specified bit rate is variable
        Requested buffer size is 112K ints (16 bits/int).
        And the constrained parameter flag is off.
        The stream does not meet the constrained parameter requirements,
        due to the following factors:
                Bit rate is variable.
                VBV buffer is too large (2293760 bits).
                Pixels per second is too high.
                There are too many macroblocks per second.

Length of vectors in half pixels:
        Horizontal forward vectors, maximum :  65       average:   6
        Vertical forward vectors, maximum   :  65       average:   5

        Horizontal backward vectors, maximum:  65       average:   5
        Vertical backward vectors, maximum  :  65       average:   4


Frame specific information:

    262 I FRAMES, average is:
        Size: 15508 bytes + 7 bits (20.70%)
        Compression Rate:  1.25%
        Q Factor [scales quantization matrix]: 3.41

    1042 P FRAMES, average is:
        Size: 6350 bytes + 3 bits (33.72%)
        Compression Rate:  0.51%
        Q Factor [scales quantization matrix]: 3.74

    2591 B FRAMES, average is:
        Size: 3452 bytes + 0 bits (45.58%)
        Compression Rate:  0.28%
        Q Factor [scales quantization matrix]: 3.84
        46.44% interpolated Macro Blocks
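
As a quick way to read the report above, the per-frame-type percentages can be recomputed from the frame counts and average sizes it prints. This is just a sketch that copies the numbers by hand (it does not parse mpeg_stat output); the variable names are made up for illustration.

```python
# Per frame type: (frame count, average size in bytes, extra bits), from the report above.
frames = {"I": (262, 15508, 7), "P": (1042, 6350, 3), "B": (2591, 3452, 0)}

# Total bytes spent on each frame type (bits converted to fractional bytes).
totals = {kind: count * (size + bits / 8)
          for kind, (count, size, bits) in frames.items()}
video_bytes = sum(totals.values())
frame_count = sum(count for count, _, _ in frames.values())

# Share of the video bytes per frame type, and the overall average frame size.
shares = {kind: 100 * total / video_bytes for kind, total in totals.items()}
avg_frame_bytes = video_bytes / frame_count

for kind in frames:
    print(f"{kind}: {shares[kind]:.2f}% of video bytes")
print(f"average frame: {avg_frame_bytes:.1f} bytes")
```

The shares come out close to the 20.70% / 33.72% / 45.58% in the report: the B frames are about two thirds of all frames but take less than half of the bytes.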

vmesquita · #3 11-11-2003, 10:01 AM

That looks like a very interesting idea. If we could make it read VOBs, then we would be able to make a very precise qfactor/CQ estimation! The original stream already has compressibility information, which can be obtained by relating size vs. quantization.

[]'s
VMesquita

incredible · #4 11-11-2003, 10:39 AM

Quote:

Originally Posted by GFR

Now if we could scan the movie first for some information on the compressibility, ... and calculate some basic statistics like the average and the variance, we could use statistical decision theory to decide how many samples we need for that movie, to make sure we meet a given error margin. We could even use that info to choose which samples are to be selected ...

Well, logically that's how 2-pass works: analyzing the whole movie to see the details and which variance of bitrates, and therefore compressibility, is needed (not to lecture you, but to me your thinking in this part of your post sounds similar to the logic of 2-pass)

but maybe I understood you wrong.

Quote:

This "statistical" prediction would then have two phases: a first phase where we analyze the movie to decide how many samples we need, and a second phase that is the prediction itself.

Well, in this case we would have to make a "first" analysis of the whole movie to see which parts of the whole movie (for 100% accuracy) are needed for prediction? And that's what maybe "mpeg_stat" or something else could offer: a way to "shorten" the analysis, because if it works it could give almost exactly the parts of the movie we have to predict. Well ... in the case of VBR sources ... like VOBs

Quote:

If the first phase can be made fast enough, predictions will not only be more accurate but they can be faster too, because for "easy" movies it will use fewer samples, and for "hard" movies you won't need to do it again and again and again until it works...

And that's what we have to figure out ... the way of doing that first pass.

Interesting .... but maybe it also depends on complex program code like mpeg_stat or Bitrate Viewer offer, which would have to be used?
I do love these prediction threads ....

kwag · #5 11-11-2003, 12:19 PM

The ideas are very nice, but the problem is that source compressibility is not equal to destination compressibility.

We can analyze the source (VOB, M1V, M2V) compressibility, but we can't predict what the destination compression will be, because the behaviour is encoder dependent. Especially with TMPEG's crazy CQ linearity.
I tried this method, which I discussed in a thread: take a piece of a VOB and demux it to get a small piece, then scale it down by the correct compression ratio and encode to match the clip sample/ratio.
The final full encode was WaaaYyYyYy off compared to regular prediction methods, because there was no relationship between source compression and the re-encoded (per part) compression.

-kwag

GFR · #6 11-11-2003, 12:21 PM

Quote:

Well, logically that's how 2-pass works: analyzing the whole movie to see the details and which variance of bitrates, and therefore compressibility, is needed (not to lecture you, but to me your thinking in this part of your post sounds similar to the logic of 2-pass) but maybe I understood you wrong.

Yes, similar, but two-pass analyzes each frame and then decides how to allocate compression for each frame in the 2nd pass so that the overall compression we seek is reached with the least distortion.

I meant something much simpler, that is, a single number that just tells how much the compressibility varies, i.e., if most frames have the same average compressibility (low variance) then with few samples we can find the right CQ. If the variance is high (we have compressibility values "spread" all over and not concentrated around the average) then we need more samples.

The problem is to find a meaningful measure of compressibility that is also fast to determine.

GFR · #7 11-11-2003, 12:33 PM

Quote:

Originally Posted by kwag

The ideas are very nice, but the problem is that source compressibility is not equal to destination compressibility.

We can analyze the source (VOB, M1V, M2V) compressibility, but we can't predict what the destination compression will be, because the behaviour is encoder dependent. Especially with TMPEG's crazy CQ linearity.
I tried this method, which I discussed in a thread: take a piece of a VOB and demux it to get a small piece, then scale it down by the correct compression ratio and encode to match the clip sample/ratio.
The final full encode was WaaaYyYyYy off compared to regular prediction methods, because there was no relationship between source compression and the re-encoded (per part) compression.

-kwag

I agree, you cannot forecast the destination compressibility from the source alone. The idea is not to abandon the traditional prediction, but to help it.

The idea is not trying to find out the final size - CQMatic should do it. It should just suggest to CQMatic how many samples are needed.

See, you got CQMatic X1 and X3.

You first try X1 and then after hours encoding the movie you find it's too small.

You then try X3 and it's OK.

With the above idea, the source analysis would tell you the "X-factor" for CQMatic, so if it's a hard movie it will begin with X3 right away and you don't lose time; if it's a normal movie CQMatic X1 is used, and the only extra time is the pre-analysis; if it's a piece of cake we can have something like X0.5 and it's faster than a bullet.
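
The mapping from the pre-analysis to an "X-factor" could look roughly like this. It is a sketch only: the function `suggest_x_factor`, the use of the coefficient of variation (spread divided by mean) and the thresholds 0.25 and 0.75 are all assumptions for illustration, not anything CQMatic actually does.

```python
import statistics

def suggest_x_factor(values):
    """Pick a sampling level from how much the per-frame values vary.
    Thresholds are hypothetical and would need tuning on real movies."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.5  # nothing varies at all
    spread = statistics.pstdev(values) / mean  # coefficient of variation
    if spread < 0.25:
        return 0.5  # "piece of cake"
    if spread < 0.75:
        return 1.0  # normal movie
    return 3.0      # hard movie

piece_of_cake = suggest_x_factor([10, 11, 9, 10, 10, 11, 9, 10])
normal = suggest_x_factor([10, 10, 10, 10, 30, 30, 10, 10])
hard = suggest_x_factor([2, 2, 2, 2, 2, 2, 2, 50])
print(piece_of_cake, normal, hard)
```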

kwag · #8 11-11-2003, 12:50 PM

Quote:

Originally Posted by GFR

With the above idea, the source analysis would tell you the "X-factor" for CQMatic, so if it's a hard movie it will begin with X3 right away and you don't lose time; if it's a normal movie CQMatic X1 is used, and the only extra time is the pre-analysis; if it's a piece of cake we can have something like X0.5 and it's faster than a bullet.

You just turned on a light bulb!

What we need is not a compression analysis of the source. We need an activity distribution analysis of the source.

Why? Because if we are about to encode an AVI (captured Huffy), there's no compression, because it's not an MPEG file.
So what I can implement in CQMatic is "Levels of X".

For example: if the movie (source) has a well balanced distribution, where activity is linear throughout the movie, then a level of X1 is automatically chosen.
The more "unbalanced" (activity varies), the higher the X value applied by CQMatic.

Are we in sync up to now?

So a fast source analysis (quick scan, à la Bitrate Viewer) can give a footprint (roadmap), and the more complex the source bitrate distribution is, the longer the sampling (X factor) CQMatic will use.

Does this sound logical?

Maybe even the log that Bitrate Viewer produces can be used for calculating the correct X factor to use.

Edit: Bitrate Viewer can't be used, because it analyzes compressed formats. So whatever program is cooked up to analyze the source, it must detect frame density/activity. I wonder if an AviSynth script could be written to do the job,
similar to what SansGrip did with Gripfit, where he analyzes several frames of the source, then goes back and does the auto resize. So a script that would "measure" activity, using built-in functions like "YDifferenceToNext()", etc., could be used to build up the activity roadmap.

-kwag
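
To make the "activity roadmap" idea concrete, here is a small stand-in written outside AviSynth. It assumes frames are plain 2D lists of luma values and uses the mean absolute per-pixel difference to the next frame as the activity measure; that is an illustrative choice in the spirit of "YDifferenceToNext()", not a copy of the AviSynth implementation, and both function names are hypothetical.

```python
def mean_abs_luma_diff(frame_a, frame_b):
    """Mean absolute per-pixel luma difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count

def activity_roadmap(frames):
    """One activity value per frame transition (frame i compared to frame i + 1)."""
    return [mean_abs_luma_diff(f, g) for f, g in zip(frames, frames[1:])]

# Tiny 2x2 "frames": a dark still shot followed by a cut to a bright still shot.
dark = [[16, 16], [16, 16]]
bright = [[116, 116], [116, 116]]
roadmap = activity_roadmap([dark, dark, bright, bright])
print(roadmap)  # [0.0, 100.0, 0.0]
```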

incredible · #9 11-11-2003, 01:22 PM

Quote:

Originally Posted by KWAG

What we need is not a compression analysis of the source. We need an activity distribution analysis of the source.
Why? Because if we are about to encode an AVI (captured Huffy), there's no compression, because it's not an MPEG file.

That's what I meant by saying ... "in case of VBR sources .. like VOBs", because captured MJPEGs, for example, are not VBR-based the way MPEG, VOBs and so on are.

Quote:

Originally Posted by KWAG

So a fast source analysis (quick scan, à la Bitrate Viewer) can give a footprint (roadmap), and the more complex the source bitrate distribution is, the longer the sampling (X factor) CQMatic will use.

Well how about .....

Like a "YDifferenceToNext()" way as a first fast run over the whole movie, independent of the FPS, and from this we can estimate the complexity and activity ... and we get the desired X-factor????
(I just wrote "YDifferenceToNext()" for understanding ... sure, it's not the right, simple and only command to do this!)

EDIT:::

HELLL KWAG same Idea while you're editing! *lol*

kwag · #10 11-11-2003, 01:31 PM

Quote:

Originally Posted by incredible

EDIT:::

HELLL KWAG same Idea while you're editing! *lol*

Minds in sync

-kwag

GFR · #11 11-11-2003, 01:39 PM

Quote:

Originally Posted by kwag

Quote:

Originally Posted by GFR

With the above idea, the source analysis would tell you the "X-factor" for CQMatic, so if it's a hard movie it will begin with X3 right away and you don't lose time; if it's a normal movie CQMatic X1 is used, and the only extra time is the pre-analysis; if it's a piece of cake we can have something like X0.5 and it's faster than a bullet.

You just turned on a light bulb!

What we need is not a compression analysis of the source. We need an activity distribution analysis of the source.

OK, search and replace "compressibility" with "activity" in my previous posts.

Of course "activity" means not only motion but levels of detail through the movie, noise etc.

Quote:

Why? Because if we are about to encode an AVI (captured Huffy), there's no compression, because it's not an MPEG file.
So what I can implement in CQMatic is "Levels of X".

For example: if the movie (source) has a well balanced distribution, where activity is linear throughout the movie, then a level of X1 is automatically chosen.
The more "unbalanced" (activity varies), the higher the X value applied by CQMatic.

Are we in sync up to now?

Yes!

Quote:

So a fast source analysis (quick scan, à la Bitrate Viewer) can give a footprint (roadmap), and the more complex the source bitrate distribution is, the longer the sampling (X factor) CQMatic will use.

Does this sound logical?

Maybe even the log that Bitrate Viewer produces can be used for calculating the correct X factor to use.

Edit: Bitrate Viewer can't be used, because it analyzes compressed formats. So whatever program is cooked up to analyze the source, it must detect frame density/activity. I wonder if an AviSynth script could be written to do the job,
similar to what SansGrip did with Gripfit, where he analyzes several frames of the source, then goes back and does the auto resize. So a script that would "measure" activity, using built-in functions like "YDifferenceToNext()", etc., could be used to build up the activity roadmap.

-kwag

From my first post:

Quote:

Now if we could scan the movie first for some information on the compressibility, maybe a motion metric based on the MA script, or the Q levels from a fast 1st pass, or some of that info that Bitrate Viewer shows (and that I can't understand), and calculate some basic statistics like the average and the variance, we could use statistical decision theory to decide how many samples we need for that movie, to make sure we meet a given error margin. We could even use that info to choose which samples are to be selected (this is a little more complicated).

That meant YDifferenceToNext()

incredible · #12 11-11-2003, 02:24 PM

Update:

Well nice thinking of ours ..... BUT!

What exactly happens when we use the X1 (or even X3) method of CQMatic and it doesn't match??????? That simply means there were some parts in a prediction "hole" which CQMatic's CQ-based prediction couldn't see and which are needed.

SO even if we know whether it's a complex or non-complex movie, say by adding a "YDifferenceToNext()"-based analysis --- this will only give an average! Therefore CQMatic still doesn't know whether in reality there are some peaks which it has to know about for an accurate prediction!

So complexity itself will not show CQMatic which X-method to choose!

Example (because of my bad English):
A very calm romantic movie gives us an average "nf" of maybe 4 (just assuming) ... therefore CQMatic chooses X1 .... but there are just a few little scenes containing a big bitrate peak (for example sunlight reflections on water surfaces) ... OK, the average will still be low enough for CQMatic to assume it's an "easy" movie ... and it starts the X1 prediction, including "big" movie holes because of the fewer samples taken during the whole prediction.
And therefore the risk is still very big that some of the peaks could be missed during prediction .....

The same rate of failure is possible.

So IMHO this won't work.

But we'll see ..... I'm just assuming ...

Quote:

Originally Posted by Kwag

Bitrate viewer can't be used, because it analyzes compressed formats

.... MJPEG is also compressed! But it doesn't use VBR .... (but that's what you meant) and on the other hand, if the AVERAGE approach above won't work (which it doesn't, in my theoretical opinion): even if we have to/could use a "Bitrate Viewer"-like routine to "catch" these valuable peak/low parts of the movie to get the prediction set right ..... MJPEG, Huffyuv and so on ... non-VBR-based streams will be OUTSIDE of future accurate prediction methods done by CQMatic

NOOOOOOOO!

Because that's where we are now ... verifying complexity vs. verifying compressibility/VBR

GFR · #13 11-12-2003, 06:03 AM

The average "is not important"... The average "activity" can only hint at the initial CQ for CQMatic to begin with.

We need the "variance" of the "activity". If most scenes in this calm romantic movie are "low-activity" and very few are "active", then most of the movie will have an "activity" close to the average "activity", which happens to be low. So you need few samples to predict within a given error margin. If you have an action movie, with active-only scenes, the average "activity" will be high, but as long as almost all scenes are "active" they will have "activity" levels close to the average and you'll need a small sample.

Now, if you have a "calm" movie with a significant "active" part, and we look at the distribution of "activity", we'll see that while most of the movie has "activity" close to the average there's a reasonable part of it that deviates from the average. So it's got a higher variance and we need more samples (to guarantee that the high and low "activity" scenes are sufficiently sampled and at the right proportion). The same holds for an action movie that has a significant "calm" part.

So the "X-factor" for CQMatic has to be determined by the variance of activity. The average activity can just help to guess the initial CQ (and that's a poor guess as we know).
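
A toy illustration of the point above, with made-up activity values: the calm and the action movie have very different averages but the same small variance, while the calm movie with a significant active part has a much larger variance. The three lists are invented for the example and stand for per-scene "activity" readings.

```python
import statistics

# Hypothetical per-scene activity values.
calm = [4, 5, 3, 4, 4, 5, 3, 4]            # low activity everywhere
action = [40, 41, 39, 40, 40, 41, 39, 40]  # high activity everywhere
mixed = [4, 4, 4, 4, 4, 4, 40, 40]         # calm with a significant active part

summary = {
    name: (statistics.fmean(values), statistics.pvariance(values))
    for name, values in (("calm", calm), ("action", action), ("mixed", mixed))
}
for name, (mean, variance) in summary.items():
    print(f"{name}: mean={mean:.1f} variance={variance:.1f}")
```

By this measure the calm and the action movie would be equally "easy" to predict, and only the mixed one calls for more samples.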

=======================

Another issue:

YDifferenceToNext() (if I understand it correctly) is a measure of "action" or "temporal activity".

Now imagine a movie that is mostly "blurred" and has some highly sharp and detailed parts (but has low motion through all the movie). -(This is not absurd; you can have a live action movie with some animation parts interleaved)- It will be difficult to predict (= it will need more samples) because the sharp, detailed scenes will need more bits than the blurred scenes.

So we need to measure the level of detail (high frequencies) or "spatial activity" too.
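
One crude way to put a number on "spatial activity" is the mean absolute difference between neighbouring pixels inside a single frame: flat or blurred areas score low, sharp detail scores high. This is an assumed stand-in metric for illustration (frames are plain 2D lists of luma values, `spatial_activity` is a hypothetical name), not an existing AviSynth function.

```python
def spatial_activity(frame):
    """Mean absolute luma difference between horizontally and
    vertically adjacent pixels of one frame."""
    height = len(frame)
    width = len(frame[0])
    total = 0
    count = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:   # right-hand neighbour
                total += abs(frame[y][x] - frame[y][x + 1])
                count += 1
            if y + 1 < height:  # neighbour below
                total += abs(frame[y][x] - frame[y + 1][x])
                count += 1
    return total / count if count else 0.0

flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
flat_score = spatial_activity(flat)
checker_score = spatial_activity(checker)
print(flat_score, checker_score)  # 0.0 255.0
```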

incredible · #14 11-12-2003, 06:50 AM

Quote:

YDifferenceToNext() (if I understand it correctly) is a measure of "action" or "temporal activity".

http://www.avisynth.org/index.php?pa...ditionalFilter
"YDifferenceToNext()" is a conditional filter (the MA routine is also based on this command) and, as the name says, it is "Y" based, which stands for luma; therefore the command gives you a value referring to the frame's whole luma "average" compared to the next frame (I hope my English won't give a wrong description, but you can read the AviSynth reference at www.avisynth.org). I've read about a filter which is able to check the pixels within a frame ... maybe this could give us the "needed" value of complexity of different parts, BUT the value will still be output as an average, and therefore you've still got the problem of just getting an average of the activity variance of the WHOLE movie.

(based on if I understood you well)

GFR · #15 11-12-2003, 07:55 AM

What we need is to do a 1st pass with no encoding, just "playing" the script (so it's fast), and log the value for each frame to a file. Then we open it in Excel (or use a simple command-line tool) and calculate the variance, or whatever statistics we find useful.
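
The "simple command line tool" part could be as small as this. The log format (one "frame value" pair per line, "#" for comments) is an assumption, since no such log exists yet, and `activity_stats` is a hypothetical name; the in-memory sample stands in for a real file.

```python
import io
import statistics

def activity_stats(log_lines):
    """Mean, variance and standard deviation of per-frame values
    read from lines of the form '<frame> <value>'."""
    values = []
    for line in log_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        _frame, value = line.split()
        values.append(float(value))
    return {
        "frames": len(values),
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }

# In-memory stand-in for a log file; with a real file use open(path) instead.
sample_log = io.StringIO("# frame activity\n0 2.0\n1 4.0\n2 4.0\n3 6.0\n")
stats = activity_stats(sample_log)
print(stats)
```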

incredible · #16 11-12-2003, 08:14 AM

I also thought about something like that (but a bit different) yesterday afternoon before I posted my updated opinion above .... here's the quote with Kwag's answer:

Quote:

Originally Posted by incredible

Maybe something like this????

#######################################
AssumeFPS(900) # for rapid playback to check the dff's
diff=YDifferenceToNext()
avg = # Here a formula which when playing every Frame the avg of all taken dff's until now will be updated
#######################################

Just thinking crazy. Cause I'm still at work

Quote:

Originally Posted by kwag

Hi incredible,

Can't do it in a script, because we can't do recursive loops in an AviSynth script.
It has to be done as an AviSynth filter, like Gripfit, deen, etc.
This way, you load your .avs in anything (Vdub preferably), and the filter does the "scan and write" to a log file, then exits.
Then this log file is analyzed for activity.

-Karl

So IF we could write a filter .dll which logs the activity to a file, you were right and that would give what you want!?

So does somebody know how to write filter .dll code???
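
For the "avg updated on every frame" part of the script idea, the usual trick is an online update (Welford's method), which keeps a running mean and variance without storing all the diffs. The sketch below is in Python only to show the arithmetic a filter would do per frame; the class name `RunningStats` and the sample diffs are hypothetical.

```python
class RunningStats:
    """Running mean and (population) variance, updated one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0

running = RunningStats()
for diff in [2.0, 4.0, 4.0, 6.0]:  # made-up per-frame differences
    running.update(diff)
print(running.count, running.mean, running.variance)
```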

Jellygoose · #17 11-12-2003, 10:12 AM

Let's all shout for SansGrip

S - A - N - S - - G - R - I - P !!