Avisynth: Predicting File Size for VBR ..... - Page 3 - digitalFAQ.com Forums [Archives]

  #41  
10-02-2002, 07:46 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by black prince
Hey Kwag,

What CQ did you use for "Green Mile" KVCD-LBR? You
have recommended 20 for 3+ hour movies in the past.
What did your tests tell you about increasing CQ?

-black prince
Sorry, I forgot to put that in the previous post.
The CQ_VBR value was 19. That's what gave me a predicted size of 643,377KB.
With CQ_VBR of 20, the predicted file size was close to 700MB ( 670,153KB to be exact ). So 19 was the closest to my target, which was 645MB ( 645MB video + 155MB audio = 800MB ). You can use decimals in CQ_VBR, like 19.5, etc., but it's just too much hassle. I'll stick to integers. Here's the script I used:

LoadPlugin("C:\encoding\MPEG2DEC.dll")
LoadPlugin("C:\encoding\Convolution3d.dll")
mpeg2source("C:\THE_GREEN_MILE\VIDEO_TS\mile.d2v")
BilinearResize(320,160,0,0,720,480)
Convolution3d (1,0,0,7,7,3,0)
#TemporalSmoother(2,2)
AddBorders(16,40,16,40)
#IL = Framecount / 100 # interval length in frames.
#SL = round(Framerate) # sample length in frames.
#SelectRangeEvery(IL,SL)
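As a side note, the disc-budget arithmetic kwag describes above can be sketched in Python. The 800MB disc capacity, 155MB audio size, and the two predicted sizes are the numbers quoted in the post; the variable names and the 1MB = 1024KB convention are my assumptions.

```python
# Disc-budget sketch using the numbers quoted in the post above.
DISC_MB = 800           # total capacity used in the post
AUDIO_MB = 155          # audio track size
video_budget_mb = DISC_MB - AUDIO_MB   # 645 MB available for video

# Predicted video sizes (KB) from the two test encodes at CQ_VBR 19 and 20
predicted_kb = {19: 643_377, 20: 670_153}

# Keep only CQ_VBR values whose prediction fits the video budget,
# then pick the highest (higher CQ_VBR = better quality here)
fits = {cq: kb for cq, kb in predicted_kb.items() if kb / 1024 <= video_budget_mb}
best_cq = max(fits)
print(best_cq)  # -> 19
```

This reproduces the choice in the post: 19 fits under 645MB, 20 overshoots it.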


-kwag
Reply With Quote
  #42  
10-03-2002, 06:42 AM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
G'day all,
It's great to read that everyone is getting such good results.
The encoded movie size being slightly smaller than predicted
is probably the safest way to go.
Having said that, please check out the tweaks below.


After exhaustive testing, I have confirmed my original notion that
there is a direct correlation between the GOP and sample length
when it comes to accurate file size prediction.

It is generally agreed that using a lot of small samples is better
than a few large ones.

The tests were done in a controlled environment, so of course there
will be exceptions to the findings in the real world.


KVCD Template and TMPGEnc, Optimal sample lengths in frames.
(Lower values recommended.)

KVCD-LBR-352x288-(PAL)-PLUS ---- 14, 29, 44, 59.
KVCD-CQ-352x288-(PAL) ---- 47, 95.
KVCD-CQ-352x288-(PAL)-PLUS ---- 47, 95.
SKVCD-352x576-(PAL) ---- 47, 95.
KVCD-CQ-352x576-(PAL) ---- 47, 95.
KVCD-CQ-352x576-(PAL)-PLUS ---- 47, 95.
KDVD-352x576-Half-D1-PAL ---- 12, 25, 38, 51.
KDVD-720x576-Full-D1-PAL ---- 12, 25, 38, 51.
KVCDx2-CQ-704x576-(PAL)-1CD ---- 47, 95.
KVCDx2-CQ-704x576-(PAL)-2CD ---- 47, 95.
KVCDx2-CQ-704x576-(PAL)-PLUS ---- 47, 95.
KVCDx3-MPEG-1-PAL ---- 47, 95.
KVCDx3-MPEG-2-PAL ---- 47, 95.

Standard-SVCD-480x576-(PAL) ---- 17, 35, 53.
XVCD-352x288-(PAL) ---- 17, 35, 53.
DVD-720x576-(PAL) ---- 12, 25, 38, 51.


If anyone is interested in confirming these results, or obtaining
optimal sample lengths for NTSC, let me know and I will give you
a run down on the methodology.

Next:
Determining optimal interval length.

Ozzie -
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #43  
10-03-2002, 08:05 AM
black prince black prince is offline
Free Member
 
Join Date: Jul 2002
Posts: 1,224
Thanks: 0
Thanked 0 Times in 0 Posts
Hey Ozzie,

It appears all your tests were performed using PAL templates.
Are there NTSCFilm or NTSC tests? If you are correct and there
is a correlation between GOP and interval sample size, can you
confirm this with samples showing actual vs. estimated file size?
For example, do real world tests using videos of
3+ hours (an extreme), 2 hours (an average) and
say 90 minutes (small) to show how accurate your findings are.

-black prince
Reply With Quote
  #44  
10-03-2002, 09:06 AM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
Hello black prince,

Yes it's true, all my tests were performed using PAL templates.
No, I have not done any NTSCFilm or NTSC tests, although the
results should be the same as PAL.

If you like, I can explain how you, or anyone else, can do them.
Once you have completed the tests you will see without any
doubt that my findings are true and correct. My tests were done
in a controlled environment, to keep all known variables in check
and to focus on just one aspect. I have also done real world tests
and the results support my theory.

I believe this and other forums are invaluable learning tools
and I also believe that people should give and take equally
when it comes to sharing information.
(Do not take this personally, black prince)

Ozzie. -
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #45  
10-03-2002, 09:28 AM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Hi black prince,

One of the reasons I used the "one second window snapshot" was to rule out any long GOP variations. You see, if you take a long snapshot, longer than say one second, some samples will have scene changes which introduce a new I frame. Other samples may only have a single I frame, if there was no scene change detected or if it's a long still scene. That's why the estimated file size always will be ( should be ) larger than the actual final size. So the tight one second snapshots really give you a "worst case" scenario, and probably 99.9% of the time, your actual file size will be very close to your estimated file size. I think -0.62% accuracy is so close that I'm not going to fool around anymore trying to find anything more optimal.
If that was the accuracy I got on a 3 hour movie with 100 samples, then any movie less than 3 hours will be the same or even more accurate, because the same 100 samples are being taken, but on a shorter time span. So the shorter the movie is, the larger the resolution window, and the higher the accuracy.

-kwag
Reply With Quote
  #46  
10-03-2002, 11:13 AM
black prince black prince is offline
Free Member
 
Join Date: Jul 2002
Posts: 1,224
Thanks: 0
Thanked 0 Times in 0 Posts
@Ozzie,

I hope my choice of words did not offend you. You have put in
such a great amount of effort, and I, along with others, have nothing
but praise for how you worked the project. I suppose what I was
thinking was that the script is working so well (-0.62% diff), why
tinker with it further? I am very pleased with how well it works
now. Again, I didn't take this personally and my questions
were not meant to offend.

@Kwag,

I am producing such great results with your refinements to
this script that I encoded 3 films and achieved quality that
would have been crude estimates in the past. I realized you were
using a worst case scenario for file prediction, intentionally
estimating a slightly oversize file. This made sense to me from
the beginning. It's a great strategy.

-black prince
Reply With Quote
  #47  
10-03-2002, 06:28 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
This is good GOod GOOD
Just finished "High Crimes", which is a 115 minute movie, on the KVCDx3.
Estimated file size was 671,851KB.
Final file size after encoding: 673,403KB.

The difference is only 1.55MB, or +0.23% accuracy on this one.
CQ_VBR used was 11.5 with the KVCDx3 NTSC. English subtitles on this one, to add to the complexity.
This is getting really GOOOOOOD ROTFLMAO

-kwag
Reply With Quote
  #48  
10-03-2002, 06:47 PM
muaddib muaddib is offline
Free Member
 
Join Date: Jun 2002
Location: São Paulo - Brasil
Posts: 879
Thanks: 0
Thanked 0 Times in 0 Posts
Hi kwag!

Did you predict "High Crimes" with or without the subtitles?
I mean, did you add the subtitles just in the final encoding, or also in the prediction sample?
Reply With Quote
  #49  
10-03-2002, 07:03 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by muaddib
Did you predict "High Crimes" with or without the subtitles?
I mean, did you add the subtitles just in the final encoding, or also in the prediction sample?
Hi muaddib,

The subtitles are encoded on the file prediction sample too.
Here's my script with the prediction portion that I now append to every .avs script for calculation:

LoadPlugin("C:\encoding\MPEG2DEC.dll")
LoadPlugin("C:\encoding\Convolution3d.dll")
LoadPlugin("C:\encoding\vobsub.dll")
mpeg2source("K:\HIGH_CRIMES\VIDEO_TS\crimes.d2v")
vobsub("K:\HIGH_CRIMES\VIDEO_TS\VTS_04_0")
BilinearResize(336,192,45,0,630,480)
Convolution3d (1,0,0,7,7,3,0)
#TemporalSmoother(2,2)
AddBorders(8,24,8,24)
###------------------- Start Of File Size Prediction -------------------###
#
IL = Framecount / 100 # interval length in frames.
SL = round(Framerate) # sample length in frames.
SelectRangeEvery(IL,SL)
### Final MPEG size = ( ( Total frames / Framerate) / 100 ) * (MPEG sample file size * .95) ###
#
###-----------------------End File Size Prediction----------------------###
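For anyone who prefers to do the final arithmetic outside a calculator, here is a small Python sketch of the "Final MPEG size" comment in the script above. Only the formula itself comes from the script; the function name and the example numbers are mine and purely illustrative.

```python
# Sketch of kwag's prediction formula:
# Final MPEG size = ((Total frames / Framerate) / 100) * (sample size * .95)
def predict_final_size_kb(total_frames, framerate, sample_size_kb,
                          samples=100, safety=0.95):
    """Scale the encoded 100-sample test file up to the full movie length."""
    movie_seconds = total_frames / framerate
    return (movie_seconds / samples) * (sample_size_kb * safety)

# Hypothetical example (not from the thread): a 2-hour 25fps movie
# whose prediction sample encoded to 12,000 KB.
print(round(predict_final_size_kb(180_000, 25, 12_000)))  # -> 820800
```

The division by 100 works because SelectRangeEvery() above takes 100 one-second snapshots, so the sample represents 1/100th of the movie per second of runtime.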

-kwag
Reply With Quote
  #50  
10-04-2002, 02:43 AM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
Hi black prince.

Quote: I suppose what I was thinking was that the script is working so
well (-0.62% diff), why tinker with it further?

My posts in this forum are intended to engender discussion and debate.
Some people take for gospel what they read, while others will question,
speculate, investigate and come to their own conclusions.
I would put Kwag into the latter category.
If Kwag believed what he was told about fitting 120 minutes of good
quality video on one CD, we would not have the original KVCD templates.
If Kwag assumed that the original templates were close enough to perfect,
he would not tinker with them, trying to find anything more optimal.
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #51  
10-04-2002, 02:49 AM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
Hello kwag.

Sorry if I put you off track with my script. The Framerate value was
an attempt by me to keep the script universal between PAL, NTSC, etc.
When it comes to best or worst case scenario and choosing sample length,
Framerate has very little to do with it.
Taking tight one second snapshots will rarely produce the worst case
scenario.
Depending on the template used, a one second sample could be the best
case. You may have confirmed this already with one of your latest encodes, The Green Mile.

NTSC - 29.97fps, one second sample = 30 frames (rounded), etc.
PAL - 25fps, one second sample = 25 frames, etc.

Using a one second sample will introduce an x% degree of error on
encode (a).
Using a KVCD template with a different intrinsic GOP structure on encode (a)
will not introduce the same degree of error.

Take for example: KVCD-LBR-352x288-(PAL)-PLUS.
Best sample lengths: 14, 29, 44, 59 frames, and so on.

Worst sample lengths: starting at 1, then getting progressively better the
closer you get to the first optimal of 14.
Then 15, getting progressively better the closer you get to the second
optimal of 29, and so on.

If the intrinsic GOP structure of the KVCD templates is the same between
PAL, NTSC, etc., the data I supplied will be identical.

Ozzie.-
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #52  
10-04-2002, 11:29 AM
black prince black prince is offline
Free Member
 
Join Date: Jul 2002
Posts: 1,224
Thanks: 0
Thanked 0 Times in 0 Posts
Ozzie,

Quote:
It is impossible to make anything foolproof because fools are so ingenious.
This pretty much describes you. Our flame war is over.

-black prince
Reply With Quote
  #53  
10-04-2002, 01:03 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by Ozzi
Hello kwag.

Sorry if I put you off the track with my script.
Not at all Ozzi

Quote:
The Framerate value was
an attempt by me to keep the script universal between pal, ntsc, etc.
When it comes to best or worst case scenario and choosing sample length,
Framerate has very little to do with it.
Hold on, yes it does. If your movie is PAL (25fps), and you encode three samples, one at 23.976 (24), one at 25, and one at 29.97 (30), you will get three different results. The most accurate will be the one with your movie's native frame rate.

Quote:
Taking tight one second snapshots will rarely produce the worst case
scenario.
The worst case scenario will always be the shortest sample, having the fewest B and P frames. That's because the shortest sample will always have the least compression: there are fewer prediction frames ( B and P ), so it will always give you a larger MPEG than if you use longer snapshots. I chose one second because it's short enough to always predict a file size larger than the actual sample made, and then I adjust the formula by .95, which seems to be the constant that targeted the +-1% accuracy. I say +-1% because I have now made 4 movies. Three of them came out to ~-0.6% and one came out to ~+0.24%. So I can say that the formula does indeed predict file size to within +-1%.
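As a concrete check, the accuracy figure can be computed directly from the "High Crimes" numbers posted earlier in the thread (671,851KB estimated, 673,403KB actual); the variable names are mine.

```python
# Prediction error for "High Crimes", using the sizes quoted in post #47 (in KB)
predicted_kb = 671_851
actual_kb = 673_403

# Signed error relative to the prediction, as a percentage
error_pct = (actual_kb - predicted_kb) / predicted_kb * 100
print(round(error_pct, 2))  # -> 0.23
```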

Quote:
Depending on the template used, a one second sample could be the best
case. You may have confirmed this already with one of your latest encodes, The Green Mile.

NTSC - 29.97fps, one second sample = 30 frames (rounded), etc.
PAL - 25fps, one second sample = 25 frames, etc.

Using a one second sample will introduce an x% degree of error on
encode (a).
Using a KVCD template with a different intrinsic GOP structure on encode (a)
will not introduce the same degree of error.

Take for example: KVCD-LBR-352x288-(PAL)-PLUS.
Best sample lengths: 14, 29, 44, 59 frames, and so on.

Worst sample lengths: starting at 1, then getting progressively better the
closer you get to the first optimal of 14.
Then 15, getting progressively better the closer you get to the second
optimal of 29, and so on.

If the intrinsic GOP structure of the KVCD templates is the same between
PAL, NTSC, etc., the data I supplied will be identical.

Ozzie.-
Yes, I understand that if the GOP is larger, the % of error will always be larger, if I keep the constant snapshot of one second. But that's to the advantage of users, because the longer the GOP, the smaller the file size will be. The problem with estimating with large GOPs is that you won't get the same accurate results with an action movie as with a low action drama type movie. The drama will throw off the calculations, because there will be very long GOP sequences without very many I frames. On the action film, where you have many scene changes, etc., you'll get a completely different final size compared to the estimated file size.
Thus, if I keep the sample snapshot at one second, a point where there's not much difference whether the movie is action or drama, the % accuracy is higher, and it will/should always give a slightly smaller final file size than the predicted one, or slightly larger than actual if the movie is an action type film.
I was able to double check three samples made from the same movie, only changing the frame rate to 24, 25, and 30, and the sample file sizes are: 11,425KB, 12,425KB, and 14,607KB respectively. So you see, there's a big difference if you change your sample frame rate to something other than the film's native frame rate.
Anyway, what we all do here is optimize stuff, and your calculations and methods are very welcome! It's the user's choice to take advantage of all the things we put here.
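One way to read kwag's three sample sizes (my observation, not something stated in the thread) is to normalize each sample by the number of frames it contains. The per-frame sizes come out roughly equal, which suggests the large differences between the three encodes come mostly from how many frames each one-second sample grabs:

```python
# kwag's three sample sizes (KB), keyed by the sample frame rate used
samples_kb = {24: 11_425, 25: 12_425, 30: 14_607}

# KB per sampled frame: roughly constant (~476-497 KB/frame) across all three
per_frame = {fps: round(kb / fps, 1) for fps, kb in samples_kb.items()}
print(per_frame)  # -> {24: 476.0, 25: 497.0, 30: 486.9}
```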

Regards,
-kwag
Reply With Quote
  #54  
10-04-2002, 08:48 PM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
G-Day -kwag -

To summarize my findings:
Taking a snapshot at anything other than the optimal size will introduce an unnatural scene change.
Taking a snapshot of an optimal size will not introduce an unnatural scene change.
Therefore taking more snapshots of an optimal size will give you a more accurate prediction.
Unless you take these snapshots with no interval, there will always be some degree of error.

It is my opinion that:
Taking these things into account and adding a predetermined buffer into the equation for safety,
will give the most reliable prediction.

The only question in my mind is, what is the most practicable interval length.

Regards,
Ozzie. -
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #55  
10-04-2002, 09:06 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by Ozzi
The only question in my mind is, what is the most practicable interval length.

Regards,
Ozzie. -
Hi Ozzi,

It all depends on how long you want to wait until the test sample is done. The longer ( larger ) the length is, the longer the encoding. That is, keeping the number of snapshots constant, for example 100 samples per movie. That would be "widening" the sampling window, and would theoretically increase the accuracy. Likewise, increasing the number of snapshots to 200 and halving the window to 1/2 second should also increase the resolution of the sampling and give a more precise result, because we are taking 200 points instead of 100. But there we have the problem of sampling a very small time frame ( 12 frames in a 24fps movie ), and there is less compression evaluated because there are fewer B and P frames than in a full one second shot at 24fps. So far, the time slot of 1 second has provided me with the +-1% accuracy, and an average time of 3 to 4 minutes to encode each test sample. It usually takes me 2 to 3 sample encodes to reach my goal of approximating my estimated file size. I have tested this now with extreme resolutions like 352x240 KVCD-LBR and 528x480 KVCDx3, and the results are consistent. So I'll stick to that for now, because it has not failed yet, and the last 3 movies I've done have been right on the +-1% target that I had estimated.

Regards,
-kwag
Reply With Quote
  #56  
10-05-2002, 10:44 AM
aderunn3r aderunn3r is offline
Free Member
 
Join Date: May 2002
Posts: 69
Thanks: 0
Thanked 0 Times in 0 Posts
hi kwag,
with this step:
e) Check and view the source range in TMPGEnc, hit the
default button, move to end frame, adjust the slider
to an even minute mark, then set end frame
when it mentions an even minute mark, do you mean like 2 minutes, or just a minute mark like 1 minute? Also, when you mention multiplying the MPEG sample file by .95, say if my sample size was 10.6MB ( 11,123,314 ), would I multiply 10.6 or the number in brackets?
Reply With Quote
  #57  
10-05-2002, 02:30 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by |3|aderunn3r
hi kwag,
with this step:
e) Check and view the source range in TMPGEnc, hit the
default button, move to end frame, adjust the slider
to an even minute mark, then set end frame
when it mentions an even minute mark, do you mean like 2 minutes, or just a minute mark like 1 minute? Also, when you mention multiplying the MPEG sample file by .95, say if my sample size was 10.6MB ( 11,123,314 ), would I multiply 10.6 or the number in brackets?
Hi |3|aderunn3r,

If you appended the following to your .avs script:


###------------------- Start Of File Size Prediction -------------------###
#
IL = Framecount / 100 # interval length in frames.
SL = round(Framerate) # sample length in frames.
SelectRangeEvery(IL,SL)
### Final MPEG size = ( ( Total frames / Framerate) / 100 ) * (MPEG sample file size * .95) ###
#
###-----------------------End File Size Prediction----------------------###


You don't select any source range in TMPGEnc. That's the function of the script. The range is already calculated with the SelectRangeEvery() function.

-kwag
Reply With Quote
  #58  
10-06-2002, 12:22 AM
Ozzi Ozzi is offline
Free Member
 
Join Date: Sep 2002
Posts: 12
Thanks: 0
Thanked 0 Times in 0 Posts
Hi kwag,
I'm glad we agree on at least one thing:

(1) What works best for the individual user is what they should do.

Do you acknowledge that my findings on optimal sample lengths
are correct?
If you do, why will you continue to introduce an as yet undefined
variable into your equation?
If not, are you interested in the methodology and how you can test this
for yourself?

Regards,
Ozzie.
__________________
It is impossible to make anything foolproof because fools are so ingenious.
Reply With Quote
  #59  
10-06-2002, 02:01 AM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by Ozzi
Do you acknowledge that my findings on optimal sample lengths
are correct?
If you do, why will you continue to introduce an as yet undefined
variable into your equation?
If not, are you interested in the methodology and how you can test this
for yourself?

Regards,
Ozzie.
Hi Ozzi,

Actually, no, I don't. Your calculations were always producing 5% to 6% smaller predicted files than the actual encoded file size. I was able to double check that against my formula on my last six encoded movies.
Right now, I'm getting +-1% with my formula, which I believe is optimal, and I won't touch it anymore because of the accuracy it's producing.
I do see you're trying to zero in on yours, because you started out with a PT = 5, then you changed it to 17, then you changed it to 20 or so in the last update at vcdhelp here http://www.vcdhelp.com/forum/userguides/114551.php
I'm a firm believer in "Don't touch what's not broken", and right now +-1% is fair enough. I would dare to say that it's as accurate or even more accurate than an X-pass VBR calculated CCE or TMPGEnc encode.
Don't you think that +-1% ( or even +-2% ) is an accurate prediction, taking into consideration all the variables that are present?
Mind you, if you do devise a formula that increases the accuracy to 0.5%, I won't hesitate to use your formula immediately.

Cheers,
-kwag
Reply With Quote
  #60  
10-06-2002, 06:23 AM
holgerschlegel holgerschlegel is offline
Free Member
 
Join Date: Aug 2002
Posts: 41
Thanks: 0
Thanked 0 Times in 0 Posts
hi,

Maybe I've made an error using the file size prediction script, but I got the following results. The source movie (Terminator, captured from TV) is about 95 min. long. I want to use the KVCD+ 352x288 template to encode it to one CD-R. The file size prediction encoded to a 13,678KB file. Using the formula, the estimated total file size is 740,663KB. After encoding, the file is 800,727KB, a difference of ~60MB ?!?!

Any idea what's going wrong?

thx,
Holger
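Holger's figures can be checked against the prediction formula from earlier in the thread. Assuming exactly 95 minutes of runtime and the standard 100 one-second samples (both assumptions on my part, since his exact frame count isn't posted), the arithmetic reproduces both his estimate and the gap he reports:

```python
# Reproducing Holger's numbers with kwag's formula:
# estimate = (movie seconds / 100 samples) * (sample KB * 0.95)
movie_seconds = 95 * 60        # stated runtime, assumed to be exactly 95 minutes
sample_kb = 13_678             # size of his encoded prediction sample

estimate_kb = (movie_seconds / 100) * (sample_kb * 0.95)
print(round(estimate_kb))      # -> 740664, matching his ~740,663KB figure

actual_kb = 800_727
print(round(actual_kb - estimate_kb))  # -> 60063 KB, the ~60MB gap he reports
```

So the formula itself was applied correctly; the discrepancy lies elsewhere (for example, the samples under-representing the movie's complexity).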
Reply With Quote
 
All times are GMT -5. The time now is 09:37 AM  —  vBulletin © Jelsoft Enterprises Ltd