01-22-2006, 11:35 PM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
@Hank,
I don't know if this has ever been discussed, but I just thought about something that might be interesting for your encoder, and would be groundbreaking if it's ever implemented.
Since you already do a first pass for analysis before the second pass, could this first pass be extended with a "hook" to an external program that does a very accurate frequency-domain analysis of every frame in the source? The program could then generate a dynamic matrix, optimized for the input material, and you could apply that matrix pattern to every GOP while encoding on the second pass.
I'm not sure if standalone DVD players will be able to decode a variable matrix, but I believe that SAPs that conform to the DVD specifications shouldn't have any problem, because nothing says it can't be done. Unless the SAP only gets the information from the first header and "assumes" that the rest of the material uses the same matrix. That, of course, would be a disaster for playback.
But I'd like to ask whether you could do a quick test with your encoder: encode, say, a 2-minute video with a random matrix (from any of your internally supported matrices) for every GOP, and see if it plays back on a SAP?
If it does, and maybe several people can test it on different SAPs, then we can contemplate such an exciting feature for some future version.
This technique would be optimal for any material, because the program would generate an optimal matrix for every part of a movie, and I believe there's no current encoder that does that.
Thanks,
-kwag
|
01-23-2006, 07:12 AM
|
Free Member
|
|
Join Date: May 2003
Posts: 97
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
At this time it's impossible ...
There are HVS studies for the best intra prediction (with extremely slow metric computation), but none for inter prediction. For now only the eyes can judge that. Metrics are unable to choose the best HVS inter/intra matrix, and yet you absolutely must use a metric (SAD, SATD, PSNR or other) in a video codec's algorithms.
Metrics can measure mathematical efficiency when all codecs use the same matrix (the same matrix for all MPEG-2 codecs, for example), but not how visually appropriate a matrix choice is.
__________________
Le Sagittaire
--------------------
No point arguing with me ... I'm always right ... in any case I'm convinced of it, and that's what matters ...
|
01-23-2006, 07:48 AM
|
Free Member
|
|
Join Date: Jan 2005
Location: Yorkshire, England
Posts: 61
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Not quite sure what that meant even after reading it a few times... but I could test this on that pedantic Pioneer of mine if wanted :P
__________________
Mark
|
01-23-2006, 10:08 AM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by Sagittaire
At this time it's impossible ...
|
That's exactly what they said when I talked about fitting one movie on one CD.
Nothing is impossible, especially in software.
Quote:
There are HVS studies for the best intra prediction (with extremely slow metric computation), but none for inter prediction. For now only the eyes can judge that. Metrics are unable to choose the best HVS inter/intra matrix, and yet you absolutely must use a metric (SAD, SATD, PSNR or other) in a video codec's algorithms.
|
Wrong.
Every area of a picture has its own specific set of active frequencies.
That's why a matrix is applied per block, and works specifically in the frequency domain. So I believe a Fourier transform analysis, aligned to the macroblocks, would give a rough idea of the frequency activity per area, and after several tests that can easily be mapped to an optimal matrix pattern.
Quote:
Metrics can measure mathematical efficiency when all codecs use the same matrix (the same matrix for all MPEG-2 codecs, for example), but not how visually appropriate a matrix choice is.
|
Again, we're not talking about metrics here.
We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block) based probably on a pre defined matrix table that is weighted for the particular frequency that the area of analysis calls for.
Edit: I'm already visualizing exactly how I would do it in software, although it's too premature.
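For illustration only, here is a minimal sketch of the idea described above: transform each 8x8 block to the frequency domain, measure how the AC energy is spread, and look the result up in a predefined table. It uses a plain DCT instead of a Fourier transform because that is what MPEG-2 quantizes. The band limits, the 0.25 threshold and the table keys "flat", "smooth" and "detail" are all assumptions made up for this example, not anything HC or any other encoder is known to use.

```python
import math

N = 8  # MPEG-2 transforms 8x8 blocks


def dct2(block):
    """Naive 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def band_energy(coeffs):
    """Split the AC energy into low / mid / high bands by u + v (assumed limits)."""
    bands = {"low": 0.0, "mid": 0.0, "high": 0.0}
    for u in range(N):
        for v in range(N):
            if u == 0 and v == 0:
                continue  # the DC coefficient carries brightness, not activity
            r = u + v
            key = "low" if r <= 3 else "mid" if r <= 7 else "high"
            bands[key] += coeffs[u][v] ** 2
    return bands


def pick_matrix(blocks, high_share=0.25):
    """Return a hypothetical matrix-table key for a group of 8x8 blocks."""
    total = {"low": 0.0, "mid": 0.0, "high": 0.0}
    for block in blocks:
        for key, energy in band_energy(dct2(block)).items():
            total[key] += energy
    ac = sum(total.values())
    if ac < 1e-6:               # no measurable AC activity at all
        return "flat"
    return "detail" if total["high"] / ac >= high_share else "smooth"
```

Calling `pick_matrix` on all blocks of a GOP (or of one picture area) gives one table key per unit of analysis. Whether the chosen matrix actually looks better is exactly the open question in this thread; the sketch only measures frequency activity.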
-kwag
|
01-23-2006, 11:38 AM
|
Free Member
|
|
Join Date: May 2003
Posts: 97
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block) based probably on a pre defined matrix table that is weighted for the particular frequency that the area of analysis calls for.
|
... but that is a metric test. Metrics are not only PSNR but, more generally, any mathematical analysis (with or without an HVS model) used to choose the best way to encode (SAD for vectors, PSNR for RDO, etc.). There is currently no model for choosing the best matrix, simply because inter HVS studies are highly complex.
You want to choose the best matrix for a particular situation, but at this time there is no psychovisual mathematical model for doing that in a video codec algorithm (don't forget that video and audio codecs are only mathematical algorithms with some psychoacoustic or psychovisual mathematical model).
Your proposition is actually the "Holy Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ...
But if you want an example, DCTune can do that, for intra matrices only:
http://forum.doom9.org/showthread.ph...atrix+analysis
Finally, matrix tuning with these "magic numbers" is a good way to improve things, but IMO RDO (Rate Distortion Optimisation) for MPEG-2, as in libavcodec, would be a much better way to get a big quality improvement in all MPEG-2 encoders ...
__________________
Le Sagittaire
--------------------
No point arguing with me ... I'm always right ... in any case I'm convinced of it, and that's what matters ...
|
01-23-2006, 06:57 PM
|
Free Member
|
|
Join Date: Feb 2005
Posts: 38
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Nice discussion, and a nice idea also.
First of all, IMO all players should be able to play a stream where the matrix can change at every GOP.
CCE also uses it with their adaptive matrix stuff which also can change matrices at every GOP.
I know they have a warning in their manual about it, but it is compliant.
And now the difficult part...
Implementing such a thing in HC isn't very difficult because I also wanted to do it some day.
But it always stopped because of the decision when to use what matrix...
Seems there were always other things which were more important to look at.
If it should be implemented it must be done before the first pass otherwise the bitrate control could screw up pretty bad.
Quote:
I'm already visualizing exactly how I would do it in software, although it's too premature
|
This is really good news, I don't have to write it
|
01-23-2006, 07:12 PM
|
Free Member
|
|
Join Date: May 2003
Location: Germany
Posts: 3,189
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by a
At this time it's impossible ...
|
...
Quote:
Originally Posted by b
We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block) based probably on a pre defined matrix table that is weighted for the particular frequency that the area of analysis calls for.
|
Quote:
Originally Posted by a
Your proposition is actually the "Holy Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ...
|
Quote:
Originally Posted by the encoder building 'Grail'
....
CCE also uses it with their adaptive matrix stuff which also can change matrices at every GOP.
I know they have a warning in their manual about it, but it is compliant.
...
Implementing such a thing in HC isn't very difficult because I also wanted to do it some day.
But it always stopped because of the decision when to use what matrix...
Seems there were always other things which were more important to look at.
|
@Sagittaire
The point is (sorry for bringing this up again):
You say things like "XYZ ... is best in the world" and "ABC ... is impossible" .... ah, I do remember:
Quote:
No point arguing with me ... I'm always right ... in any case I'm convinced of it, and that's what matters ...
|
The chance of making such ideas come true is about the same as libavcodec outputting fully DVD SAP compliant MPEG-2 streams.
Meaning you only know once you put it into practice, not just from theory.
I thought encoding communities were meant to break the barriers of theory.
|
01-23-2006, 07:38 PM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by hank315
If it should be implemented it must be done before the first pass otherwise the bitrate control could screw up pretty bad.
|
Ok, so it should run "parallel" to the first pass, so that when the second pass applies bitrate distribution, it also applies the correct matrix tables at the same time and place.
Would this involve a sort of "prefetching" of several frames, before scene change detection, possibly to apply a matrix not necessarily on a GOP basis but on a sequence of frames?
That would probably be more realistic than applying a matrix change at every GOP.
Quote:
Quote:
I'm already visualizing exactly how I would do it in software, although it's too premature
|
This is really good news, I don't have to write it
|
There's a big difference between visualizing it and writing it.
But given an algorithm, I wouldn't mind taking a crack at it.
BTW, is there a time (frame) code in your first pass file?
Because if there is, then an external program can produce such a "matrix" stream, which can be kept in sync with your first pass file, so at the point of the second pass you could encode and apply the matrix changes on the correct spot.
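As a rough sketch of how such a "matrix" stream could stay in sync with the first pass file, assume the external program writes one line per change, `start_frame matrix_name`, and the second pass looks up the entry in effect for each frame number. The file layout, the matrix names and both function names are hypothetical; nothing here reflects HC's real first pass format.

```python
import bisect


def parse_matrix_stream(text):
    """Parse 'start_frame matrix_name' lines into two parallel, sorted lists."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        frame, name = line.split()
        entries.append((int(frame), name))
    entries.sort()
    starts = [frame for frame, _ in entries]
    names = [name for _, name in entries]
    return starts, names


def matrix_for_frame(starts, names, frame, default="standard"):
    """Return the matrix in effect at `frame`: the last entry with start <= frame."""
    i = bisect.bisect_right(starts, frame) - 1
    return names[i] if i >= 0 else default
```

Keying on frame numbers instead of GOP numbers means the changes do not have to line up with GOP boundaries in the sidecar file; the encoder can apply each change at the next point where a new matrix may be written.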
These are just crazy thoughts that pass through my mind (and no, I'm not on crack or smoke)
-kwag
|
01-24-2006, 03:49 AM
|
Free Member
|
|
Join Date: May 2003
Posts: 97
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by incredible
Quote:
Originally Posted by a
At this time it's impossible ...
|
...
Quote:
Originally Posted by b
We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block) based probably on a pre defined matrix table that is weighted for the particular frequency that the area of analysis calls for.
|
Quote:
Originally Posted by a
Your proposition is actually the "Holy Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ...
|
Quote:
Originally Posted by the encoder building 'Grail'
....
CCE also uses it with their adaptive matrix stuff which also can change matrices at every GOP.
I know they have a warning in their manual about it, but it is compliant.
...
Implementing such a thing in HC isn't very difficult because I also wanted to do it some day.
But it always stopped because of the decision when to use what matrix...
Seems there were always other things which were more important to look at.
|
@Sagittaire
The point is (sorry for bringing this up again):
You say things like "XYZ ... is best in the world" and "ABC ... is impossible" .... ah, I do remember:
Quote:
No point arguing with me ... I'm always right ... in any case I'm convinced of it, and that's what matters ...
|
The chance of making such ideas come true is about the same as libavcodec outputting fully DVD SAP compliant MPEG-2 streams.
Meaning you only know once you put it into practice, not just from theory.
I thought encoding communities were meant to break the barriers of theory.
|
1) Well, I confirm that there is no HVS study for the best inter DCT decision: talk to codec developers about it if you want. CCE most likely uses simple switches, like matrix Alpha for a "high motion GOP" and matrix Beta for a "low motion GOP" or something like that, but surely not a complete DCT analysis of the GOP (DCTune does that for intra DCT with really heavy calculation and at very slow speed ... surely not CCE). There is currently absolutely no HVS-based way to choose the best inter/intra matrix for video encoding ... and that is simply scientific fact ...
[Off-topic ON, but it was incredible who brought this up, not me ... lol]
2) Libavcodec and quality ...
Libavcodec is the best MPEG-2 encoder (by metrics, and by far ... more than 1 dB of OPSNR), simply because it uses very advanced features such as RDO in its ME search. Talk to hank315 about it if you want ... LAVC is most likely the most powerful MPEG-2 encoder for low bitrates at this time, and by far ... I can prove it in a visual quality challenge if you want.
3) Libavcodec and DVD compliance ...
Well ... I'm still waiting for your example so I can test it. I asked a friend who works for a SAP manufacturer, and my libavcodec sample streams are always perfectly compliant (he uses a special analysis log for hardware DVD chip emulation). hank315 checked my libavcodec stream in my MPEG-2 challenge and doesn't seem to have found a problem either ...
[Off-topic OFF, but it was incredible who brought this up, not me ... lol]
__________________
Le Sagittaire
--------------------
No point arguing with me ... I'm always right ... in any case I'm convinced of it, and that's what matters ...
|
01-24-2006, 03:00 PM
|
Free Member
|
|
Join Date: Feb 2005
Posts: 38
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
The general program flow is:
1) read and buffer frames
2) analyse frames: do scene change detection, detect high/low motion etc.
3) create GOP structure and encode GOP
4) dump encoded GOP in the video stream
5) goto 1
So it could be done between 2 and 3, a routine can be called (from a dll ?) which can generate a matrix based on the analysis results per GOP.
Matrices can be stored in the database to be used in the second pass.
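The flow above, with the extra routine called between steps 2 and 3, might look roughly like the sketch below. It is only an outline under simplifying assumptions: GOPs have a fixed size here (the real encoder builds the GOP structure from scene change detection), the "database" is a plain dict, and `analyse` and `choose_matrix` are hypothetical callbacks standing in for the encoder's analysis and for the external routine.

```python
def first_pass(frames, gop_size, analyse, choose_matrix):
    """Walk the source GOP by GOP and record one matrix choice per GOP."""
    matrix_db = {}  # GOP index -> matrix id; stands in for "the database"
    for gop_index, start in enumerate(range(0, len(frames), gop_size)):
        gop = frames[start:start + gop_size]          # 1) read and buffer frames
        stats = analyse(gop)                          # 2) analyse frames
        matrix_db[gop_index] = choose_matrix(stats)   # hook between 2 and 3
        # 3) encode GOP and 4) dump it to the stream are left out of this sketch
    return matrix_db
```

The second pass would then read `matrix_db[gop_index]` when it encodes the same GOP, so the rate control and the matrix choice are based on the same analysis.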
|
01-24-2006, 03:44 PM
|
Free Member
|
|
Join Date: Sep 2002
Location: Lahti, Finland
Posts: 1,652
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
You know, I would even take the CCE method any time. It's nice to be able to set a decent matrix which will remove somewhat more details from the high-action scenes and then count on the fact that the more static (and more noticeable to the eye) scenes will have more details due to less quantization.
|
01-24-2006, 04:37 PM
|
Free Member
|
|
Join Date: Feb 2005
Posts: 38
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
It could be a good start for implementing and testing.
Something like 3 matrices, for low, normal and high activity.
Or always a better matrix for low lit scenes etc.
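A starting point along those lines could be as simple as the sketch below: measure motion as the mean absolute luma difference between consecutive frames, measure brightness as the mean luma, and pick one of a handful of matrices. Frames are flat lists of luma values here, and every threshold and label ("dark", "low", "normal", "high") is a placeholder to be tuned by eye, not a tested value.

```python
def mean_abs_diff(prev, cur):
    """Mean absolute luma difference between two frames of equal size."""
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)


def gop_stats(frames):
    """Return (motion, mean_luma) for a GOP given as a list of flat luma lists."""
    diffs = [mean_abs_diff(a, b) for a, b in zip(frames, frames[1:])]
    motion = sum(diffs) / len(diffs) if diffs else 0.0
    mean_luma = sum(sum(f) for f in frames) / sum(len(f) for f in frames)
    return motion, mean_luma


def classify_gop(frames, low_motion=2.0, high_motion=8.0, dark_luma=40):
    """Pick a matrix label; low-lit GOPs always get the 'dark' matrix."""
    motion, mean_luma = gop_stats(frames)
    if mean_luma < dark_luma:
        return "dark"
    if motion < low_motion:
        return "low"
    if motion > high_motion:
        return "high"
    return "normal"
```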
|
01-24-2006, 04:38 PM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by Boulder
It's nice to be able to set a decent matrix which will remove somewhat more details from the high-action scenes and then count on the fact that the more static (and more noticeable to the eye) scenes will have more details due to less quantization.
|
I was thinking about something like that last night.
User-selectable options, where it would create the matrix tables with a "weighting" biased towards "Standard", "High Compression/Lower Quality", "Low Compression/Higher Quality", etc.
So the matrix can actually be used for filtering (just like we do with the Notch, to filter lower frequencies).
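One way such a user-selectable weighting might be expressed, purely as a sketch: take a base matrix and scale each entry by a factor that grows with the coefficient's frequency index. The base used here is the flat all-16 matrix, which is the MPEG-2 default for non-intra blocks. The linear ramp, the `strength` parameter and the preset values are assumptions for illustration. Note that this ramp raises the quantizer steps towards the high frequencies (a low-pass effect); targeting other bands would need a different curve.

```python
FLAT_16 = [[16] * 8 for _ in range(8)]  # MPEG-2 default non-intra matrix

# Hypothetical presets: positive strength quantizes high frequencies harder.
PRESETS = {
    "Standard": 0.0,
    "High Compression/Lower Quality": 1.0,
    "Low Compression/Higher Quality": -0.5,
}


def biased_matrix(base, strength):
    """Scale an 8x8 matrix by 1 + strength * (u + v) / 14, clamped to 1..255."""
    out = []
    for u in range(8):
        row = []
        for v in range(8):
            scale = 1.0 + strength * (u + v) / 14.0
            row.append(max(1, min(255, round(base[u][v] * scale))))
        out.append(row)
    return out
```

For example, `biased_matrix(FLAT_16, PRESETS["High Compression/Lower Quality"])` leaves the top-left entry at 16 and doubles the bottom-right one to 32.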
Quote:
Originally Posted by hank315
So it could be done between 2 and 3, a routine can be called (from a dll ?) which can generate a matrix based on the analysis results per GOP.
|
Exactly
So you can even call an external program to do that, which returns a new matrix.
-kwag
|
01-24-2006, 04:39 PM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Damn, Hank, you just beat me by a minute
I guess we're all in sync
-kwag
|
01-24-2006, 11:35 PM
|
Free Member
|
|
Join Date: Sep 2002
Location: Massachusetts
Posts: 119
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
I like your spirit on this one. I'll keep in touch to see what you come up with next.
My only real comment is how come we didn't think of this earlier....
That's why we love this place...
|
01-25-2006, 12:03 AM
|
Free Member
|
|
Join Date: Mar 2003
Location: Madrid-Spain
Posts: 515
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Hi,
Quote:
Originally Posted by Racer99
My only real comment is how come we didn't think of this earlier....
|
We did: http://www.kvcd.net/forum/viewtopic.php?t=15712
But one year ago, too many doubts, and no way to test that. Now, we have Hank!
CU,
Fabrice
|
01-25-2006, 12:10 AM
|
Free Member
|
|
Join Date: Apr 2002
Location: Puerto Rico, USA
Posts: 13,537
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Quote:
Originally Posted by fabrice
|
WOW, I had forgotten all about that discussion
Thanks for pulling that out, Fabrice
-kwag
|
01-25-2006, 05:19 PM
|
Free Member
|
|
Join Date: Sep 2002
Location: Massachusetts
Posts: 119
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Ditto on that Karl, I too forgot about that.
Kudos Fabrice!
|
01-26-2006, 04:30 AM
|
Free Member
|
|
Join Date: Mar 2003
Location: Palma de Mallorca - España
Posts: 2,925
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Does anybody here know what AutoQMatEnc does?
Take some time to test it.
At the moment it is optimized for x-pass VBR, but SAPSTAR is also working on OPV.
|
01-26-2006, 04:59 AM
|
Free Member
|
|
Join Date: May 2004
Location: Rio de Janeiro - Brasil
Posts: 538
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
@ALL
In addition to Prodater's statement, and considering that this is an HC forum, I have to say that SAPSTAR has already done something similar to what is proposed here:
QMatOp
It uses DCTune to scan and analyse GOPs, among other things, in order to generate an optimal quantization matrix constrained to a specified target average bitrate.
Greetings,
|
All times are GMT -5.
|