File Size Prediction Formula - Page 2 - digitalFAQ.com Forums [Archives]
  #21  
01-08-2003, 12:07 PM
ARnet_tenRA ARnet_tenRA is offline
Free Member
 
Join Date: Jan 2003
Location: Illinois, USA
Posts: 73
Thanks: 0
Thanked 0 Times in 0 Posts
I was wondering how accurate the prediction is for people who use a sample length equal to the frame rate rather than the GOP length of 24. If the difference is negligible then we could switch to the simpler formula. Of course, this matters most for PAL (25 fps) or NTSC (29.97 fps).

Let me know.

-tenra
  #22  
01-08-2003, 01:22 PM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by ARnet_tenRA
I was wondering how accurate the prediction is for people that sample the framerate rather than the GOP rate of 24.
It's most accurate when the sample length matches the GOP length. We ran both formulas (along with about a dozen others) already.
  #23  
01-09-2003, 07:06 AM
girv girv is offline
Free Member
 
Join Date: Sep 2002
Posts: 108
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
If you sample 1 second for every minute of your video (ie. 25
samples for 25fps or 30 samples for 29.97fps) then the formula for all
framerates would be:
Predicted Size = 60 * sample size
That's what I said: Sm = (Tm/Ts) * Ss. If you adjust this equation
for taking one GOP-length sample for every 60 seconds of movie you get:
  • Sm = 60 * (FR/Lg) * Ss
    Sm = movie file size
    FR = frame rate in frames per second
    Lg = GOP Length in frames
    Ss = sample file size
If you plug in the numbers for each frame rate you get the
constants ARnet_tenRA posted (59.94, 62.5, 74.925).

More generally this becomes:
  • Sm = (Tm/Cs) * (FR/Ls) * Ss
    Sm = movie file size
    Tm = movie length in seconds
    Cs = number of samples taken
    FR = frame rate in frames per second
    Ls = length of each sample in frames
    Ss = sample file size
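The general formula above can be sketched in Python to check the constants. The function and parameter names are illustrative only and do not come from any tool mentioned in this thread; the GOP length of 24 is the value used throughout the discussion.

```python
def predicted_movie_size(sample_size, movie_seconds, sample_count,
                         frame_rate, sample_length):
    """Sm = (Tm/Cs) * (FR/Ls) * Ss, using the symbols defined above."""
    return (movie_seconds / sample_count) * (frame_rate / sample_length) * sample_size


def per_minute_constant(frame_rate, gop_length=24):
    """Multiplier for one GOP-length sample per 60 seconds: 60 * (FR/Lg)."""
    return 60 * frame_rate / gop_length


# Print the multiplier for each of the three frame rates discussed.
for fps in (23.976, 25, 29.97):
    print(fps, round(per_minute_constant(fps), 3))
```

With one sample per minute (Cs = Tm/60) the general function reduces to the per-minute constant times the sample size.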
  #24  
01-09-2003, 07:14 AM
girv girv is offline
Free Member
 
Join Date: Sep 2002
Posts: 108
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by kwag
Just (800 - audio size) / 60 will give you predicted sample size
Why 800? ISTR something about VCDs being written in "MODE2" (or
something) so you could fit more data on the disk, but does this mean
that I can fit an 800 MB .mpg file on to a 700 MB disk if it's used as a
VCD / SVCD?!

I've been using an upper limit of 700 MB until now, but an extra 100 MB
would be very nice.

/girv
  #25  
01-09-2003, 07:31 AM
girv girv is offline
Free Member
 
Join Date: Sep 2002
Posts: 108
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by SansGrip
It's most accurate when the sample length matches the GOP length. We ran both formulas (along with about a dozen others) already.
Just a thought, but I wonder if the prediction would be even more
accurate if sample strips were started on frame numbers that were
multiples of GOP length instead of multiples of one second?
e.g. sample 24 frames every 24*60 frames instead of every
framerate*60.

I'm thinking that this way you would be creating the sample with
GOPs that would actually be in the final encode. Mad?
  #26  
01-09-2003, 08:39 AM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by girv
I've been using an upper limit of 700 MB until now, but an extra 100 MB would be very nice
Then today's your lucky day, because you can indeed get a maximum of 800 MB on a VCD. To be exact, 813,019,155 bytes.
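As background to the 700 MB versus 800 MB question, here is a minimal sketch of the raw sector arithmetic. It assumes an 80-minute disc read at 75 sectors per second, 2048 user bytes per Mode 1 sector (ordinary data CDs) and 2324 user bytes per Mode 2 Form 2 sector (the sector type VCD/SVCD uses for the MPEG stream). These are raw capacities before any filesystem or stream overhead, so this is not a derivation of the exact figure quoted above.

```python
SECTORS_PER_SECOND = 75
MODE1_USER_BYTES = 2048        # payload of a data CD sector
MODE2_FORM2_USER_BYTES = 2324  # payload of a sector carrying VCD/SVCD video


def raw_capacity(minutes, bytes_per_sector):
    """Raw user bytes on a disc of the given length, ignoring all overhead."""
    sectors = minutes * 60 * SECTORS_PER_SECOND
    return sectors * bytes_per_sector


print(raw_capacity(80, MODE1_USER_BYTES))        # 737280000
print(raw_capacity(80, MODE2_FORM2_USER_BYTES))  # 836640000
```

The same disc holds roughly 13% more payload in Mode 2 Form 2 because less of each sector is spent on error correction.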
  #27  
01-09-2003, 08:41 AM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by girv
Just a thought, but I wonder if the prediction would be even more accurate if sample strips were started on frame numbers that were
multiples of GOP length instead of multiples of one second?
They should always be multiples of GOP length. If your frame rate is not 23.976, you should specify the sample length, i.e. Sampler(length=24). If your frame rate is 23.976 you can just use Sampler() since the rounded frame rate happens to match the GOP length.
  #28  
01-09-2003, 10:45 AM
girv girv is offline
Free Member
 
Join Date: Sep 2002
Posts: 108
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by SansGrip
They should always be multiples of GOP length. If your frame rate is not 23.976, you should specify the sample length, i.e. Sampler(length=24).
I was referring to the start frame of the sample strip not the number
of frames in it, which as you say should be equal to the GOP length.

e.g.: if your frame rate is 25fps then Sampler(length=24) will take
sample strips starting at frame 0, 1500 (25*60*1), 3000 (25*60*2),
4500 (25*60*3) ... correct? What I am suggesting is to instead align
the start of the sample strip to the multiple of GOP length closest to
these numbers i.e.: 0, 1488, 3000, 4488 ...

fps: 23.976
current sample start: 0,1437,2877,4316,...
proposed: 0,1440,2880,4320,...

fps: 25
current: 0,1500,3000,4500,...
proposed: 0,1488,3000,4488,...

fps: 29.97
current: 0,1798,3596,5395,...
proposed: 0,1800,3600,5400,...

The differences aren't much (±12 frames at most) but I just wondered
if it could give a little extra accuracy.
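The proposed alignment can be sketched as follows. This is only an illustration of snapping each nominal start frame to the nearest multiple of the GOP length; it does not reproduce Sampler's own frame selection, the function names are made up for this example, and sending an exact tie to the lower multiple is an assumption chosen because it matches the 25 fps figures above.

```python
def nominal_starts(fps, count, seconds_between=60):
    """Unrounded start frames when one sample is taken every 60 seconds."""
    return [fps * seconds_between * n for n in range(count)]


def align_to_gop(frame, gop_length=24):
    """Nearest multiple of gop_length; an exact tie goes to the lower one."""
    lower = int(frame // gop_length) * gop_length
    upper = lower + gop_length
    return lower if frame - lower <= upper - frame else upper


def aligned_starts(fps, count, gop_length=24):
    """Start frames for `count` samples, each snapped to a GOP boundary."""
    return [align_to_gop(f, gop_length) for f in nominal_starts(fps, count)]


for fps in (23.976, 25, 29.97):
    print(fps, aligned_starts(fps, 4))
```

The largest shift is half a GOP, i.e. 12 frames with a GOP length of 24.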
  #29  
01-09-2003, 10:52 AM
girv girv is offline
Free Member
 
Join Date: Sep 2002
Posts: 108
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by SansGrip
Then today's your lucky day, because you can indeed get a maximum of 800 MB on a VCD. To be exact, 813,019,155 bytes.
Sorry to be dumb, but if I have a .mpg file on my hard drive that
is 810,000,000 bytes then it can be burned on to a standard 700 MB
CD-R as a VCD? Happy day!

What about SVCD? Is that the same?

What is the overhead for VCD/SVCD, i.e. how big can a .mpg file
be on my hard drive and still (just) fit on to a 700 MB CD-R?
  #30  
01-09-2003, 11:32 AM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by girv
I was referring to the start frame of the sample strip not the number of frames in it, which as you say should be equal to the GOP length.
Ah, I see. It might make it more accurate, yes, but my gut says not a lot. I'll have to modify Sampler slightly and try it.
  #31  
01-09-2003, 11:36 AM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by girv
What about SVCD? Is that the same?
Roughly, though I believe the SVCD filesystem is slightly different, so will have slightly different overhead.

Quote:
What is the overhead for VCD/SVCD, i.e. how big can a .mpg file
be on my hard drive and still (just) fit on to a 700 MB CD-R?
The figure I gave (813,019,155 bytes) is compensated for filesystem overhead and the system stream. In other words, if you subtract from that the size of your audio (which I always encode first), you'll get the maximum number of bytes for your video stream.

If you have a .mpg file then it's already got the system stream in it, so the maximum byte count will be 825,105,664.

(By the way, this is how I always do my prediction:

813,019,155 - audio_bytes = max_video_bytes
max_video_bytes / frames_in_movie = bytes_per_frame
bytes_per_frame * frame_count_with_sampler = sample_bytes

It's almost always accurate within 0.5% or so. It's more involved than the regular formula, but I'm testing it for the next release of KVCDP.)
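The three steps above translate directly into a small helper. This is a sketch only: the function name is invented for illustration, and the audio size and frame counts in the usage line are hypothetical numbers, not measurements from this thread.

```python
MAX_STREAM_BYTES = 813_019_155  # budget quoted above for audio plus video


def target_sample_bytes(audio_bytes, frames_in_movie, frames_in_sample,
                        max_bytes=MAX_STREAM_BYTES):
    """Size the sample encode should reach for the full movie to fill the disc."""
    max_video_bytes = max_bytes - audio_bytes
    bytes_per_frame = max_video_bytes / frames_in_movie
    return bytes_per_frame * frames_in_sample


# Hypothetical movie: 100,000,000 bytes of audio, 144,000 frames,
# of which the sampler keeps 2,400.
print(round(target_sample_bytes(100_000_000, 144_000, 2_400)))
```

Because it works from the actual frame counts of the movie and of the sample, this version does not depend on the frame rate or on the sample spacing being exactly one minute.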
  #32  
01-09-2003, 12:39 PM
ARnet_tenRA ARnet_tenRA is offline
Free Member
 
Join Date: Jan 2003
Location: Illinois, USA
Posts: 73
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by girv
e.g.: if your frame rate is 25fps then Sampler(length=24) will take
sample strips starting at frame 0, 1500 (25*60*1), 3000 (25*60*2),
4500 (25*60*3) ... correct? What I am suggesting is to instead align
the start of the sample strip to the multiple of GOP length closest to
these numbers i.e.: 0, 1488, 3000, 4488 ...
How about this: always sample (24*60*n) as the starting frame for each sample. Don't worry about the frame rate at all, i.e. 0, 1440, 2880, 4320, ... When you do it this way all sample MPEGs produced will be exactly 1/60th of the final movie.

fps: 23.976
current sample start: 0,1437,2877,4316,...
proposed: 0,1440,2880,4320,...

fps: 25
current: 0,1500,3000,4500,...
proposed: 0,1440,2880,4320,...

fps: 29.97
current: 0,1798,3596,5395,...
proposed: 0,1440,2880,4320,...

This will have the benefit of aligning with the GOP like girv suggested and having the simplest formula no matter the length of movie or framerate:
Predicted Size = 60 * sample size
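A quick way to check the 1/60th claim is to count how many frames such a scheme keeps. The sketch below is a simplified model (keep the first 24 frames of every 1440-frame block); it is not the code of Sampler or of AviSynth's SelectRangeEvery, and the function name is made up for this example.

```python
def sampled_frame_count(total_frames, every=1440, length=24):
    """Frames kept when the first `length` frames of each `every`-frame block are taken."""
    full_blocks, remainder = divmod(total_frames, every)
    # A trailing partial block still contributes up to `length` frames.
    return full_blocks * length + min(remainder, length)


print(sampled_frame_count(144_000))  # 2400, exactly 1/60 of the movie
print(sampled_frame_count(144_010))  # 2410, the partial last block adds 10 frames
```

The ratio is exactly 60 only when the movie length is a multiple of 1440 frames; otherwise the final partial block makes the sample slightly larger than 1/60th.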

-ARnet_tenRA
  #33  
01-09-2003, 02:02 PM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by ARnet_tenRA
When you do it this way all sample MPEGs produced will be exactly 1/60th of the final movie.
The main requirement of the formula is to be accurate, not necessarily simple. A great deal of testing indicates that the most accurate formula for all kinds of sources and all resolutions is minutes-in-movie samples, each max-gop-size frames in length. That said, we should obviously test this method out against the current one.

Sampler uses a pretty simple algorithm to decide which frames to select. The curious can take a look at the source code here.

Quote:
This will have the benefit of aligning with the GOP like girv suggested
I'm not sure that aligning with the GOP will make things any more accurate, but I can certainly build a test version of Sampler with that modification and try it out.
  #34  
01-09-2003, 02:04 PM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by ARnet_tenRA
Don't worry about the frame rate at all.
The frame rate is only used by Sampler as a default sample length if none is specified -- it's not a part of the formula. I figured that movie-length-in-minutes samples of one second each would be a good generic default, and happened to correspond to the current file prediction formula for KVCD.
  #35  
01-10-2003, 09:03 AM
ARnet_tenRA ARnet_tenRA is offline
Free Member
 
Join Date: Jan 2003
Location: Illinois, USA
Posts: 73
Thanks: 0
Thanked 0 Times in 0 Posts
Hi all,

I just ran a test last night using my suggested formula, and I got some pretty exciting results: the prediction was off by only about 0.012% (0.0001 as a fraction)!!!

AVISynth script:
Code:
LoadPlugin("MPEG2DEC.dll")  # MPEG-2 decoder plugin

mpeg2source("sample.d2v")  # open the DVD2AVI project file

SelectRangeEvery(1440,24)  # keep 24 frames out of every 1440 (24*60)
  • Sample file size = 12,091,021 bytes
    Predicted movie size = 60 * 12,091,021 = 725,461,260 bytes
    Actual final Movie size = 725,551,421 bytes
    Error = 90,161 bytes
    % error = 0.012 (0.0001 as a fraction)
Based on these results I would like to try some more movies and see what I get.
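The arithmetic behind these figures can be re-run in a few lines of Python, using only the two file sizes reported above. Expressed as a percentage of the actual size, the error comes to roughly 0.012% (0.0001 as a fraction).

```python
sample_bytes = 12_091_021
actual_bytes = 725_551_421

predicted_bytes = 60 * sample_bytes
error_bytes = actual_bytes - predicted_bytes
error_percent = 100 * error_bytes / actual_bytes

print(predicted_bytes)           # 725461260
print(error_bytes)               # 90161
print(round(error_percent, 4))   # 0.0124
```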
  #36  
01-10-2003, 10:04 AM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by ARnet_tenRA
SelectRangeEvery(1440,24)
Note that SelectRangeEvery doesn't seem to be quite accurate. The old prediction method, which used SelectRangeEvery, produced sample strips that had a frame count fairly significantly different from what should have been produced.

That's one of the reasons I wrote Sampler.
  #37  
01-10-2003, 11:04 AM
ARnet_tenRA ARnet_tenRA is offline
Free Member
 
Join Date: Jan 2003
Location: Illinois, USA
Posts: 73
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by SansGrip
Note that SelectRangeEvery doesn't seem to be quite accurate. The old prediction method, which used SelectRangeEvery, produced sample strips that had a frame count fairly significantly different from what should have been produced.

That's one of the reasons I wrote Sampler.
I knew that you could get up to one sample too many, but I was unaware of any more offset than that. Let me know if this is not the case.

I used SelectRangeEvery because I knew exactly where the frames would be captured from in the video (every 1440 frames). Maybe this is not the case.

Anyways, let me know if sampling 24 frames every (24*60*n) frames gives you as accurate results as I got, whether you use Sampler or SelectRangeEvery.

-ARnet_tenRA
  #38  
01-10-2003, 05:18 PM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by ARnet_tenRA
I knew that you could get up to one sample too many, but I was unaware of any more offset than that. Let me know if this is not the case.
Using the old prediction method I would get mismatches of a dozen or two frames. Not normally a big deal, but it is for our purposes.

Quote:
Anyways, let me know if sampling 24 frames every (24*60*n) frames gives you as accurate results as I got.
I will try that out once GripFit is released. I don't want to get distracted by anything right now so I can get it out as quickly as possible.
  #39  
01-10-2003, 05:21 PM
kwag kwag is offline
Free Member
 
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
AHA! Will that be the official name, GripFit?

-kwag
  #40  
01-10-2003, 05:23 PM
SansGrip SansGrip is offline
Free Member
 
Join Date: Nov 2002
Location: Ontario, Canada
Posts: 1,135
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by kwag
AHA! Will that be the official name, GripFit?
I haven't got round to deciding yet.
All times are GMT -5.