two ideas for CCE prediction
Today I had a bit of free time to think and came up with two ideas for CCE prediction. Both could be implemented in DIKO. Here they are:

1) A better mathematical model for the CCE QFactor. Right now I am using a linear model: I try Q=1 (MIN) and Q=40 (MAX), then calculate the next QFactor to try as if the QFactor curve were linear. If the sample is within 3% of the target, prediction is over; otherwise this QFactor becomes the new MIN or MAX (depending on whether the sample was bigger or smaller) and the process starts again. This sometimes needs 4 or 5 samples, precisely because the linear approach is a poor fit. If we could find a function that is closer to the real curve (an exact one would be impossible), fewer cycles would be needed. Also, CCE doesn't release as many versions as TMPGEnc, so this would be more viable.

2) After finding the right QFactor, encode again using different scenes (as in the ping-pong method) and a larger sample range, like 5% ( sampler(length=75) instead of sampler(length=15) ):
a) If the result is X% bigger, increase QFactor by 1,
b) If the result is Y% bigger, increase QFactor by 2,
c) If the result is X% smaller, decrease QFactor by 1,
d) If the result is Y% smaller, decrease QFactor by 2.

These are just ideas; I still haven't found time to play with them. But it looks to me like, together, they would make prediction much more accurate while taking about the same time as before.

EDIT: If anyone wants data to play with, there's an old New_Bee post with an Excel spreadsheet of QFactor vs size for a sample he tried. The link is still working: http://fruchtiger.tripod.com/CCE-KDVD_Q1-40.xls http://www.kvcd.net/forum/viewtopic....r=asc&start=16 |
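The linear bracketing loop described in idea 1 can be sketched roughly as follows. This is only an illustration under assumptions, not DIKO's actual code: `encode_sample` is a hypothetical callback that encodes the sample at a given QFactor and returns its size, and the demo curve at the bottom is made up.

```python
from typing import Callable, Tuple


def predict_q_linear(encode_sample: Callable[[int], float],
                     target: float,
                     q_min: int = 1,
                     q_max: int = 40,
                     tolerance: float = 0.03,
                     max_cycles: int = 10) -> Tuple[int, int]:
    """Return (q, cycles) using repeated linear interpolation between brackets."""
    lo_q, hi_q = q_min, q_max
    lo_size = encode_sample(lo_q)  # best quality, largest sample
    hi_size = encode_sample(hi_q)  # worst quality, smallest sample
    cycles = 2
    if lo_size <= target:
        return q_min, cycles       # even the best quality fits
    if hi_size >= target:
        return q_max, cycles       # QMax is the best we can do
    while cycles < max_cycles and hi_q - lo_q > 1:
        # Pretend the curve is a straight line between the two brackets.
        frac = (lo_size - target) / (lo_size - hi_size)
        q = round(lo_q + frac * (hi_q - lo_q))
        q = min(max(q, lo_q + 1), hi_q - 1)  # stay strictly inside the bracket
        size = encode_sample(q)
        cycles += 1
        if abs(size - target) <= tolerance * target:
            return q, cycles
        if size > target:
            lo_q, lo_size = q, size  # still too big: this Q becomes the new MIN
        else:
            hi_q, hi_size = q, size  # too small: this Q becomes the new MAX
    return hi_q, cycles              # fall back to the bracket that fits


if __name__ == "__main__":
    # Made-up convex size curve, only to exercise the loop.
    def demo_curve(q: int) -> float:
        return 100.0 * 0.97 ** (q - 1)

    print(predict_q_linear(demo_curve, target=60.0))
```

On a curved size function the loop needs several cycles beyond the first two, which is the behaviour the post complains about.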
[Image not available: graph of sample size against QFactor]
Vertical is size, horizontal is QFactor. The pink line is the actual size and the blue line is the size calculated in the first iteration. That's why DIKO takes many prediction cycles before finding the right QFactor: it is found within 1 or 2 cycles (beyond the initial two at Q1 and Q40) only if it is near the borders (38, 39, 40 or 1, 2, 3). It would be better to try QFactor 20 first, then try 1 or 40 (depending on whether the file came out bigger or smaller), and use the method described in the previous post from that point on.

Everyone please note: what was said above is intended to make prediction faster, not more precise. To make it more precise, after the right QFactor is found, it could be fine-tuned using larger samples (3% or 5% instead of 1%). I am still thinking about how this can work. :D |
The first idea is already implemented and working for the next release of DIKO, and the results are amazing: the ideal QFactor is found with only 1 cycle after doing MAX and MIN. Only 4 cycles are needed to find the right QFactor if it's between 1 and QMax/2, and 3 cycles if it's between QMax/2 and QMax. Nobody has posted here (yet?), but if anyone is interested I can provide a simple how-to describing how to do this manually. :wink: I tested with QFactors between 1 and 40. I am not sure it would work for higher QFactors, but I don't recommend going above 40 anyway, to avoid bad quality.
Now I am still thinking about how idea 2 can work... :? |
the "how to" will be cool but....
"only" for D.I.K.O. :?: :( ps: you understand the "only"! (over the over) :lol: |
No, this can be used manually too. :D
You must be familiar with the prediction described in my guide. But of course this won't be a problem for you, Jorel. :D I'll be using QMin=1 (maximum quality possible) and QMax=40 (worst quality acceptable).

1) First calculate the ideal sample size using the method described in my guide or my calculator.
2) Add the line sampler(length=15) at the end of the script. Encode using CCE at QMax/2, i.e. Q=20, and write down the size, which will be called Q20_Size. Now do 3A or 3B according to the sample size you got.
3A) If the encoded sample comes out bigger than the ideal sample size you calculated before, encode at Q=40. The size of this sample will be called Q40_Size. The final Q is calculated using this formula:
Final_Q = 40 - ((Ideal_Sample_Size - Q40_Size) * (40 - 20)) / (Q20_Size - Q40_Size)
3B) If the encoded sample comes out smaller than the ideal sample size you calculated before, encode at Q=1. The size of this sample will be called Q1_Size. The final Q is calculated using this formula:
Final_Q = 20 - ((Ideal_Sample_Size - Q20_Size) * (20 - 1)) / (Q1_Size - Q20_Size)

If you want, you can run one more cycle to check that the calculated QFactor gives a sample within about 3% of the ideal size; in all 5 tests I have done so far, it did. This way you can find the ideal QFactor in only 2 cycles, if it's between 1 and 40! 8O 8) Of course, I just invented all this (the idea is less than 24 hours old), so it's highly experimental. :wink: Feel free to test. :D |
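For anyone who prefers code to formulas, here is a minimal sketch of steps 2 to 3B, reading both formulas as a straight-line interpolation between the two measured points. The function names and the `encode_sample` callback are hypothetical, and clamping the result to the QMin..QMax range is an added assumption.

```python
from typing import Callable


def interpolate_q(target: float, q_a: int, size_a: float,
                  q_b: int, size_b: float) -> float:
    """Q where the line through (q_a, size_a) and (q_b, size_b) hits target."""
    return q_b - (target - size_b) * (q_b - q_a) / (size_a - size_b)


def two_cycle_q(encode_sample: Callable[[int], float], target: float,
                q_min: int = 1, q_max: int = 40) -> int:
    """Pick a QFactor from exactly two sample encodes."""
    q_mid = q_max // 2
    mid_size = encode_sample(q_mid)           # cycle 1: Q = QMax/2
    if mid_size > target:
        # 3A: sample too big, so the answer lies between QMax/2 and QMax.
        max_size = encode_sample(q_max)       # cycle 2
        q = interpolate_q(target, q_mid, mid_size, q_max, max_size)
    else:
        # 3B: sample too small, so the answer lies between QMin and QMax/2.
        min_size = encode_sample(q_min)       # cycle 2
        q = interpolate_q(target, q_min, min_size, q_mid, mid_size)
    return int(min(max(round(q), q_min), q_max))
```

With `q_min=1` and `q_max=40`, `interpolate_q` reduces to the 3A and 3B expressions with 40 - 20 and 20 - 1 as the Q spans.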
vmesquita,
have you tried a Newton-Raphson search instead of a binary search? r6d2 from doom9 forum has an Excel worksheet to do this, if you want to give it a try. http://www.geocities.com/r6d2_stuff/ |
I got this spreadsheet a while ago but didn't understand how to use it. :oops: Anyway, the method I am describing here may look simple and silly, but it has worked in 5 tests. :D Of course it's a CCE-specific method, 5 tests are too few to say anything, and I also don't know if it's effective in every CCE version, since it's based on the graph shown above. But I am amazed by the possibility of finding the correct QFactor in a few steps and then being able to fine-tune using larger samples.
BTW, using larger samples is really effective against undersizing. Look at this test I just did:

Sampler(length=15) ------> Size 23.22 MB
Sampler(length=60) ------> Size 83.9 MB (if the proportion held, it should give 92.88 MB)

This movie actually gave me a greatly undersized file, which is why I am using it for the tests. The undersize was around 10% and the bigger sample somehow shows it. :D

EDIT: I am using CCE 2.67.00.09 for these tests. |
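The arithmetic behind that comparison, as a small sketch (the function name is made up for illustration): scale the small sample by the ratio of the sampler lengths and see how far the big sample falls short.

```python
def undersize_fraction(small_size: float, large_size: float,
                       small_length: int, large_length: int) -> float:
    """How much smaller the large sample is than a proportional scale-up."""
    expected = small_size * (large_length / small_length)
    return 1.0 - large_size / expected


if __name__ == "__main__":
    expected_mb = 23.22 * (60 / 15)
    shortfall = undersize_fraction(23.22, 83.9, 15, 60)
    print(f"expected {expected_mb:.2f} MB, big sample is {shortfall:.1%} smaller")
```

The shortfall comes out a little under 10%, in line with the undersize seen on the full encode.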
Using the experimental data New_Bee provided (which may not be that generic, but I don't have time to check this now), I came to the following conclusions:

Between 2 and 6, size decreases about 1.2% for each one-step increase in QFactor
Between 7 and 22, size decreases about 2.8% for each one-step increase in QFactor
Between 23 and 34, size decreases about 2.1% for each one-step increase in QFactor
Between 35 and 40, size decreases about 1.5% for each one-step increase in QFactor

The idea is: after doing the prediction process described in my previous post, create a big sample ( sampler(length=60) ) and use the data above to fine-tune the QFactor, avoiding oversized/undersized files. I just implemented this in DIKO (I am starting to find it easier to implement and test than to test manually :D ) and I am testing right now. I'll let you know the results. |
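A rough sketch of how such a table could drive the fine-tuning step. This is a guess at the mechanics, not DIKO's implementation: it assumes each percentage applies to the step that arrives at a QFactor inside the listed range, that size shrinks as QFactor goes up, and that the goal is the lowest QFactor whose estimated size still fits the target.

```python
from typing import Tuple

# (first Q, last Q, fractional shrink for the step that arrives at that Q)
STEP_SHRINK = [(2, 6, 0.012), (7, 22, 0.028), (23, 34, 0.021), (35, 40, 0.015)]


def shrink_for_step_into(q: int) -> float:
    """Estimated size reduction when going from q - 1 to q."""
    for first, last, pct in STEP_SHRINK:
        if first <= q <= last:
            return pct
    raise ValueError(f"no table entry for Q={q}")


def fine_tune_q(q: int, big_sample_size: float, target: float,
                q_min: int = 1, q_max: int = 40) -> Tuple[int, float]:
    """Nudge q using the table until the estimated big-sample size just fits."""
    size = big_sample_size
    # Oversized: raise Q step by step until the estimate fits.
    while size > target and q < q_max:
        q += 1
        size *= 1.0 - shrink_for_step_into(q)
    # Undersized: lower Q as long as the estimate would still fit.
    while q > q_min:
        bigger = size / (1.0 - shrink_for_step_into(q))
        if bigger > target:
            break
        q -= 1
        size = bigger
    return q, size
```

The returned size is only an estimate from the table, so a confirmation encode at the new QFactor would still be needed to know the real sample size.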
i have a new .mpv of 2.23 GB that i made using Q=5 to start the tests!
i did another full encode using Q=1 and the size is 2.53 GB! another full encode using Q=10 gave ~1.85 GB (it was trashed, i don't remember the exact size) (i was testing full encodes with different Q values before you posted this thread). is there some parameter that you want me to test? i have this new .mpv of 2.23 GB ( Q=5 ) for reference! :) |
Quote:
Code:
Encoder 1http://www.angelfire.com/droid/r6d2/ |
Actually, I was thinking of an improved version: since only two cycles are actually needed, these cycles can be done using larger samples, making the prediction as accurate as you want. I rewrote the mini how-to. Jorel, if you have the time, please test this. :D

1) First calculate the ideal sample size using the method described in my guide or my calculator. Multiply it by 4.
2) Add the line sampler(length=60) at the end of the script. Encode using CCE at QMax/2, i.e. Q=20, and write down the size, which will be called Q20_Size. Now do 3A or 3B according to the sample size you got.
3A) If the encoded sample comes out bigger than the ideal sample size you calculated before, encode at Q=40. The size of this sample will be called Q40_Size. The final Q is calculated using this formula:
Final_Q = 40 - ((Ideal_Sample_Size - Q40_Size) * (40 - 20)) / (Q20_Size - Q40_Size)
3B) If the encoded sample comes out smaller than the ideal sample size you calculated before, encode at Q=1. The size of this sample will be called Q1_Size. The final Q is calculated using this formula:
Final_Q = 20 - ((Ideal_Sample_Size - Q20_Size) * (20 - 1)) / (Q1_Size - Q20_Size)

If you want, you can run one more cycle to check that the calculated QFactor gives a sample within about 3% of the ideal size. This way you can find the ideal QFactor in only 2 cycles, if it's between 1 and 40! :D |
@ vmesquita
I haven't tested your method yet but decided to give you some (maybe useful) info. :D I'm using a manual prediction method with CCE 2.66 and the result is 8O 8O 8O Judge for yourself: with sampler(length=75), as Jorel found and reported a while ago in another thread, the predicted size was 4043 MB and the result was 4024 MB :!: :!: :!: Only 19 MB difference on a ~4 GB file!? 8O 8O 8O

I very much like your scientific approach, with all the math around the subject, guys (vmesquita and GFR too). Maybe my small experience can help here, maybe not; it's worth a try!

What I'm doing is much simpler (PAL & NTSC movies). The wanted size for DVD is ~3700 MB if I'm using the original AC3 sound and ~3900 MB if I'm using MP2 at 224-128 kbps. If you run CCE 2.66 on just 20-25% of the whole sample clip made with sampler(len=75), the result you get, scaled up, will be very close to the whole sample size. So if the sample takes ~5-6 minutes with len=75, then in just 1.5-2 minutes you have the first pass, and so on until you get close enough to the desired file size. Then you encode the whole sample (just to be sure).

It works quite well for me, but what I always wanted is not to do this manually (and not to be stuck at the PC). I'll be glad if anyone finds this info useful.
bman |
hey bman,
i was waiting the vmesquita results but .... edited> .....better is still waiting! :lol: |
Jorel, sorry, but it's easier if you divide the needed file size by 20 in the case of sampler(len=75) for PAL and sampler(len=72) for NTSC, that is GOP*3, as you yourself suggested some time ago (with sampler(len=25) I divide by 60). Anyway, the subject for me is how to automate prediction with CCE, not how to find the needed sample size, which we already learned to do some time ago. :wink: :D :D :D
bman |
edited>
:roll: |
vmesquita,
i'm still waiting for your results! i did another full encode using Q=20 and got 1.55 GB! :wink: |
OK, let's put some order in here :wink:
When I said "you yourself suggested some time ago" I meant sampler(len=75) for PAL and sampler(len=72) for NTSC, that is GOP*3, and that was in the TMPG prediction thread. A sampler length of 3 GOPs is very accurate for prediction and I'm happy with the results. You just make your AVS script with a sampler length of 75, load it into CCE and encode just 25% of it (~2 min to encode), then multiply the result by 4 and by 20, and that's your final file size. Try this and tell me your results. Now I'm leaving my office and going back home, so in an hour or so I'll be back to see your results if you run the test. :wink: :wink: :D
bman |
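The scaling bman describes is plain multiplication and division; a tiny sketch makes the factors explicit. The function names are made up, and the divisor of 20 for a length-75 sample (60 for length 25) is taken from his posts rather than derived here.

```python
def sample_target_mb(wanted_final_mb: float, divisor: float = 20.0) -> float:
    """Sample size to aim for: final size divided by 20 for a length-75 sample."""
    return wanted_final_mb / divisor


def predicted_final_mb(partial_sample_mb: float,
                       encoded_fraction: float = 0.25,
                       divisor: float = 20.0) -> float:
    """Scale a partial sample encode up to the full sample, then to the movie."""
    full_sample_mb = partial_sample_mb / encoded_fraction  # 25% encoded: times 4
    return full_sample_mb * divisor                        # times 20


if __name__ == "__main__":
    print(sample_target_mb(3700.0))    # sample size to aim for, in MB
    print(predicted_final_mb(46.25))   # predicted final size, in MB
```

So for a ~3700 MB target the full length-75 sample should land near 185 MB, and a quarter of it near 46 MB.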
The method in the spreadsheet is exactly the same, only the chosen points are different. If you look at the example in the spreadsheet, he makes two iterations, CQ1=30 and CQ2=16 (or CQ2=25 in the modified case), assumes it's a straight line from 16-30 (25-30) and calculates CQ=25 (CQ=24). If you run a 3rd iteration and the error is too big, you can "narrow" the range that is considered a line, and so on.

To run your method in the spreadsheet, type 20 instead of 30 in the Q Desired field. Then choose the table Newton's Iteration 1 and, instead of the "11" that it puts automatically in the second line, put 1 or 40, according to your rule. The 3rd line in the table gives you the CQ. :)

You can increase the accuracy with some "educated guesses". Say you're encoding a movie and you know from experience that, for the length of this movie and this level of action, there's no way you can go below CQ=10; then you can use CQ=10 instead of CQ=1 in your method. If you guessed right and the actual needed CQ is 10<CQ<20, the line from 10-20 should be closer to the curve than the line from 1-20, and your error should be smaller. The spreadsheet tries to make these guesses automatically; that's how the second line in each table is filled. |
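A small numeric illustration of the "educated guess" point, using a made-up convex size curve (not real CCE data): the chord over a narrower bracket lands closer to the true CQ than the chord over the wide one.

```python
from typing import Callable


def chord_q(size_of: Callable[[int], float], target: float,
            q_lo: int, q_hi: int) -> float:
    """CQ where the straight line between the two bracket points hits target."""
    s_lo, s_hi = size_of(q_lo), size_of(q_hi)
    return q_hi - (target - s_hi) * (q_hi - q_lo) / (s_lo - s_hi)


def demo_curve(q: int) -> float:
    """Synthetic size curve, purely illustrative."""
    return 100.0 * 0.97 ** (q - 1)


true_q = 15
target = demo_curve(true_q)
wide = chord_q(demo_curve, target, 1, 20)     # line from CQ=1 to CQ=20
narrow = chord_q(demo_curve, target, 10, 20)  # line from CQ=10 to CQ=20

print(f"wide bracket error:   {abs(wide - true_q):.2f}")
print(f"narrow bracket error: {abs(narrow - true_q):.2f}")
```

This only holds if the guess is right, i.e. the needed CQ really lies inside the narrower bracket.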
@GFR
Ok, now I understand how it works. :D And it's really very similar. :D But the confirmation iteration is dispensable for 3% accuracy, according to my tests. :wink:

Idea 2 is: after finding this Q using the method above, do a one-time big sample (about 5%) and fine-tune using a variation table. I implemented this in DIKO and tested it with an encode that previously gave me a 250 MB undersize, and got a 40 MB undersize after fine-tuning with idea 2 (much better). I can post the variation table later if you're interested in playing with it. But maybe it makes more sense to do the prediction using 2 cycles of big samples (4%) and skip the confirmation cycle. I am still thinking about how it can work, since there's one more limitation: DIKO always has to do a cycle at QMax to check whether the footage can fit on the media.

EDIT: I think 1% accuracy is overkill because of the fluctuations in the resulting file size... :wink: |
Site design, images and content © 2002-2026 The Digital FAQ, www.digitalFAQ.com