digitalFAQ.com Forums [Archives] - HCenc: Variable Quantization Matrix

Page 1 of 2

digitalFAQ.com Forums [Archives] (http://www.digitalfaq.com/archives/)

- Video Encoding and Conversion (http://www.digitalfaq.com/archives/encode/)

- - HCenc: Variable Quantization Matrix (http://www.digitalfaq.com/archives/encode/14081-hcenc-variable-quantization.html)

kwag	01-22-2006 11:35 PM

Variable Quantization Matrix

@Hank,

I don't know if this has ever been discussed, but I just thought of something that might be interesting for your encoder, and would be groundbreaking if it's ever implemented.
As you already do a first pass for analysis prior to the second pass, could this first pass be extended with a "hook" to an external program that does a very accurate frequency domain analysis of every frame in the source? The program could then generate a dynamic matrix, optimized for the input material, and you could apply this matrix pattern to every GOP while encoding on the second pass :?:
I'm not sure if standalone DVD players (SAPs) will be able to decode a variable matrix, but I believe that SAPs that conform to the DVD specifications shouldn't have any problem, because there's nothing that says it can't be done. Unless the SAP only gets the information from the first header and "assumes" that the rest of the material uses the same matrix. That, of course, would be a disaster for the reproduced material.
But I'd like to ask you if you could do a quick test with your encoder: encode, say, a 2 minute video with random matrices (from any of your internally supported matrices) for every GOP, and see if it plays back on a SAP :idea:
If it does, and maybe several people can test it on different SAPs, then we can contemplate such an exciting feature for some future version.
This technique would be optimal for any material, because the program would generate an optimal matrix for every part of a movie, and I believe there's no current encoder that does that.

Thanks,
-kwag

Sagittaire

01-23-2006 07:12 AM

At this time it's impossible ... :idea:

There are HVS studies for the best intra prediction (with ultra-slow metric computation) but not for inter prediction. Only the eyes can judge that at the moment. Metrics are unable to choose the best HVS inter/intra matrix, and you absolutely must use a metric (SAD, SATD, PSNR or other) in video codec algorithm computation.

Metrics can measure mathematical efficiency with the same matrix for all codecs (the same matrix for all MPEG2 codecs, for example), but not the visual pertinence of a matrix choice.
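
For readers unfamiliar with the metrics named in this post, here is a minimal sketch of two of them, SAD and PSNR, computed over flat lists of 8-bit pixel values. The function names and the sample values are illustrative assumptions and are not taken from any encoder discussed in this thread.

```python
import math


def sad(original, decoded):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(a - b) for a, b in zip(original, decoded))


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the lists are identical."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical pixel values, for illustration only.
original = [10, 20, 30, 40]
decoded = [12, 18, 30, 44]
print(sad(original, decoded))             # absolute error, lower is better
print(round(psnr(original, decoded), 2))  # dB, higher is better
```

Both numbers measure mathematical distance from the source only. Neither says anything about which errors the eye actually notices, which is the point being argued here.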

scrappy

01-23-2006 07:48 AM

Not quite sure what that meant even after reading it a few times... but I could test this on that pedantic Pioneer of mine if wanted :P

kwag	01-23-2006 10:08 AM

Quote:

Originally Posted by Sagittaire

At this time it's impossible ... :idea:

That's exactly what they said when I talked about fitting one movie on one CD :D
Nothing is impossible, especially in software ;)

Quote:

There are HVS studies for the best intra prediction (with ultra-slow metric computation) but not for inter prediction. Only the eyes can judge that at the moment. Metrics are unable to choose the best HVS inter/intra matrix, and you absolutely must use a metric (SAD, SATD, PSNR or other) in video codec algorithm computation.

Wrong.
Every area of a picture has a specific set of active frequencies.
That's why a matrix is divided in blocks that specifically work in the frequency domain. So I believe a Fourier transform analysis with macroblock-aligned weighting would give a rough idea of the frequency activity per area, and that can easily be mapped to an optimal matrix pattern, after several tests.
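
A rough sketch of what such a per-area frequency analysis could look like. It uses an 8x8 DCT rather than a Fourier transform, since the DCT is the transform MPEG-2 itself applies to blocks; that substitution, the `high_freq_ratio` measure and its `cutoff` parameter are assumptions made for illustration, not something proposed in the thread.

```python
import math

N = 8  # MPEG-2 transforms 8x8 blocks


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an NxN block (list of N rows)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def high_freq_ratio(block, cutoff=4):
    """Share of AC energy held by coefficients with u + v >= cutoff."""
    coeffs = dct_2d(block)
    total = 0.0
    high = 0.0
    for u in range(N):
        for v in range(N):
            if u == 0 and v == 0:
                continue  # the DC term carries brightness, not detail
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    if total < 1e-9:  # flat block: only rounding noise is left
        return 0.0
    return high / total


# Two synthetic blocks: a flat area and a pixel-level checkerboard.
flat = [[128] * N for _ in range(N)]
checker = [[255 if (x + y) % 2 else 0 for x in range(N)] for y in range(N)]
print(high_freq_ratio(flat))     # no detail at all
print(high_freq_ratio(checker))  # almost all energy in high frequencies
```

A ratio like this, collected per block and averaged per area or per GOP, is the kind of number that would then have to be mapped to a matrix. How to do that mapping well is the open question in this thread.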

Quote:

Metrics can measure mathematical efficiency with the same matrix for all codecs (the same matrix for all MPEG2 codecs, for example), but not the visual pertinence of a matrix choice.

Again, we're not talking about metrics here.
We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block), probably based on a predefined matrix table that is weighted for the particular frequency that the area of analysis calls for.

Edit: I'm already visualizing exactly how I would do it in software, although it's too premature.

-kwag

Sagittaire

01-23-2006 11:38 AM

Quote:

We're talking about applying a dynamic frequency "order" transformation for a particular area, after heavy mathematical analysis, and then applying the optimal matrix pattern (per block), probably based on a predefined matrix table that is weighted for the particular frequency that the area of analysis calls for.

... but that is a metric test. Metrics are not only PSNR but, more generally, any mathematical analysis (with an HVS specification or not) used to choose the best encoding path (SAD for vectors, PSNR for RDO, etc.). There is currently no model for choosing the best matrix, simply because inter HVS studies are highly complex.

You want to choose the best matrix for a particular situation, but at this time there is no psychovisual mathematical model for doing that in a video codec algorithm (don't forget that video and audio codecs are only mathematical algorithms with some psychoacoustic or psychovisual mathematical model).

Your proposal is actually "the Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ... :idea:

But if you want an example, DCTune can do that, though only for intra matrices:
http://forum.doom9.org/showthread.ph...atrix+analysis

Finally, matrix tuning with these "magic numbers" is a good way to improve quality, but IMO RDO (Rate-Distortion Optimisation) for MPEG2, as in libavcodec, would be a much better way to substantially improve quality in all MPEG2 encoders ...

hank315

01-23-2006 06:57 PM

Nice discussion, and a nice idea also.

First of all, IMO all players should be able to play such a stream, where the matrix can change at every GOP.
CCE also uses it with their adaptive matrix stuff, which can also change matrices at every GOP.
I know they have a warning in their manual about it, but it is compliant.

And now the difficult part...
Implementing such a thing in HC isn't very difficult, and I also wanted to do it some day.
But it always stopped at the decision of when to use which matrix...
It seems there were always other things that were more important to look at.

If it is to be implemented, it must be done before the first pass, otherwise the bitrate control could screw up pretty badly.

Quote:

I'm already visualizing exactly how I would do it in software, although it's too premature

This is really good news, I don't have to write it :D :D

incredible

01-23-2006 07:12 PM

Quote:

Originally Posted by a

At this time it's impossible ...

...

Quote:

Originally Posted by b

Quote:

Originally Posted by a

Your proposal is actually "the Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ...

;)

Quote:

Originally Posted by the encoder building 'Grail'

....
CCE also uses it with their adaptive matrix stuff, which can also change matrices at every GOP.
I know they have a warning in their manual about it, but it is compliant.
...
Implementing such a thing in HC isn't very difficult, and I also wanted to do it some day.
But it always stopped at the decision of when to use which matrix...
It seems there were always other things that were more important to look at.

@Sagittaire
The point is (sorry for bringing this up again):
You say things like "XYZ ... is best in the world" and "ABC ... is impossible" .... ah, I do remember:

Quote:

No point arguing with me ... I'm always right ... at any rate I'm convinced of it, and that's the main thing ...

The likelihood of such ideas coming true is the same as that of Libavcodec outputting fully DVD SAP MPEG2-compliant streams.

Meaning you only know it once you put it into practice, not just from theory.
I thought encoding communities were meant to break the barriers of theory :?: :wink:

kwag	01-23-2006 07:38 PM

Quote:

Originally Posted by hank315

If it is to be implemented, it must be done before the first pass, otherwise the bitrate control could screw up pretty badly.

Ok, so it should run "parallel" to the first pass, so when the second pass applies bitrate distribution, it also applies the correct matrix tables at the same time and place.
Would this involve sort of a "prefetching" of several frames, before scene change detection, possibly to apply a matrix not necessarily on a GOP basis, but on a sequence of frames :?:
That would probably be more realistic than applying a matrix change at every GOP :idea:

Quote:

I'm already visualizing exactly how I would do it in software, although it's too premature

This is really good news, I don't have to write it :D :D

There's a big difference between visualizing it and writing it :lol:
But given an algorithm, I wouldn't mind taking a crack at it ;)
BTW, is there a time (frame) code in your first pass file :?:
Because if there is, then an external program can produce such a "matrix" stream, which can be kept in sync with your first pass file, so during the second pass you could encode and apply the matrix changes at the correct spot.
These are just crazy thoughts that pass my mind :cool: (and no, I'm not on crack or smoke :lol: )
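
A minimal sketch of how an externally produced "matrix" stream could be kept in sync by frame number. The schedule format (start frame plus a matrix name) and the function names are hypothetical; nothing here reflects the actual layout of HC's first pass file.

```python
import bisect


def build_matrix_schedule(changes):
    """Sort (start_frame, matrix_name) pairs into two parallel lists."""
    ordered = sorted(changes)
    frames = [frame for frame, _ in ordered]
    names = [name for _, name in ordered]
    return frames, names


def matrix_for_frame(schedule, frame):
    """Return the matrix in effect at the given frame number."""
    frames, names = schedule
    index = bisect.bisect_right(frames, frame) - 1
    if index < 0:
        raise ValueError("no matrix scheduled at or before frame %d" % frame)
    return names[index]


# Hypothetical output of an external analysis program.
schedule = build_matrix_schedule([
    (0, "standard"),
    (300, "high_motion"),
    (450, "low_motion"),
])
print(matrix_for_frame(schedule, 299))  # still the first entry
print(matrix_for_frame(schedule, 300))  # the change takes effect here
```

Keying the schedule on frame numbers rather than GOP indexes means it stays valid even if the encoder picks different GOP boundaries on the second pass.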

-kwag

Sagittaire

01-24-2006 03:49 AM

Quote:

Originally Posted by incredible

Quote:

Originally Posted by a

At this time it's impossible ...

...

Quote:

Originally Posted by b

Quote:

Originally Posted by a

Your proposal is actually "the Grail" of the video coding world ... but at this time there is no mathematical way to do that ... only eyes ...

;)

Quote:

Originally Posted by the encoder building 'Grail'

@Sagittaire
The point is (sorry for bringing this up again):
You say things like "XYZ ... is best in the world" and "ABC ... is impossible" .... ah, I do remember:

Quote:

No point arguing with me ... I'm always right ... at any rate I'm convinced of it, and that's the main thing ...

1) Well, I confirm that there are no HVS studies for the best inter DCT decision: talk to codec developers about it if you want. CCE certainly uses simple switches, like matrix Alpha for a "high motion GOP" and matrix Beta for a "low motion GOP" or something like that, but certainly not a complete per-GOP DCT analysis (DCTune does that for intra DCT with really heavy calculation and at very slow speed ... certainly not CCE). There is currently absolutely no HVS-study-based way to choose the best inter/intra matrix for video encoding ... and that is simply scientific fact ... :roll:

[OFF-TOPIC ON, but incredible brought this up, not me ... lol]

2) Libavcodec and quality ...

Libavcodec is the best MPEG2 encoder (by metrics, and by far ... more than 1 dB in OPSNR) simply because it uses very advanced ME search functions like RDO. Talk to hank315 about it if you want ... LAVC is certainly the most powerful MPEG2 encoder for low bitrates at this time, and by far ... I can prove that in a visual quality challenge if you want.

3) Libavcodec and DVD compliance ...

Well ... I am still waiting for your example so I can test it. I asked a friend who works at a SAP manufacturer, and my Libavcodec sample streams are always perfectly compliant (he uses a special analysis log for hardware DVD chip emulation). hank315 checked my libavcodec stream in my MPEG2 Challenge and seems to have detected no problem either ...

[OFF-TOPIC OFF, but incredible brought this up, not me ... lol]

hank315

01-24-2006 03:00 PM

The general program flow is:
1) read and buffer frames
2) analyse frames: do scene change detection, detect high/low motion etc.
3) create GOP structure and encode GOP
4) dump encoded GOP in the video stream
5) goto 1

So it could be done between steps 2 and 3: a routine can be called (from a DLL?) which generates a matrix based on the analysis results per GOP.

Matrices can be stored in the database to be used in the second pass.
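
The flow above, with the matrix routine called between steps 2 and 3, might be sketched like this. It is a toy model under stated assumptions: the GOP size is fixed (the real encoder builds the GOP structure after analysis), frames are stand-in motion scores, and `analyse`, `choose_matrix` and `encode_gop` are hypothetical callbacks, not HC functions.

```python
def encode_stream(frames, gop_size, analyse, choose_matrix, encode_gop):
    """Toy version of the buffer / analyse / encode loop with a matrix hook."""
    stream = []
    matrix_db = []  # (first_frame, matrix) records kept for the second pass
    for start in range(0, len(frames), gop_size):
        buffered = frames[start:start + gop_size]    # 1) read and buffer frames
        stats = analyse(buffered)                    # 2) analyse frames
        matrix = choose_matrix(stats)                # hook between steps 2 and 3
        matrix_db.append((start, matrix))
        stream.append(encode_gop(buffered, matrix))  # 3) encode GOP, 4) dump it
    return stream, matrix_db                         # 5) loop until no frames left


# Stand-in callbacks: each "frame" is just a motion score between 0 and 1.
def analyse(gop):
    return {"motion": sum(gop) / len(gop)}


def choose_matrix(stats):
    return "high_motion" if stats["motion"] > 0.5 else "low_motion"


def encode_gop(gop, matrix):
    return (len(gop), matrix)


stream, matrix_db = encode_stream(
    [0.1, 0.2, 0.9, 0.8, 0.3], 2, analyse, choose_matrix, encode_gop)
print(stream)
print(matrix_db)
```

Because `choose_matrix` is passed in, it could just as well be a routine loaded from an external library, which is the DLL idea mentioned above.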

Boulder

01-24-2006 03:44 PM

You know, I would even take the CCE method any time. It's nice to be able to set a decent matrix which will remove somewhat more detail from the high-action scenes and then count on the fact that the more static scenes (where detail is more noticeable to the eye) will have more detail due to less quantization.

hank315

01-24-2006 04:37 PM

It could be a good start for implementing and testing.
Something like 3 matrices, for low, normal and high activity.
Or always a better matrix for low lit scenes etc.
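
As a starting point for testing, the three-matrix idea could be as simple as the sketch below. The thresholds, the matrix names and the `mean_luma` check for low lit scenes are all placeholder assumptions that would have to be tuned by eye.

```python
def pick_matrix(activity, mean_luma,
                low_threshold=0.3, high_threshold=0.7, dark_luma=40):
    """Pick a matrix name for one GOP from two simple statistics.

    activity  -- motion/detail score in the range 0..1 (hypothetical scale)
    mean_luma -- average luma of the GOP, 0..255
    """
    if mean_luma < dark_luma:
        return "low_light"      # always use the better matrix for dark scenes
    if activity < low_threshold:
        return "low_activity"
    if activity > high_threshold:
        return "high_activity"
    return "normal"


print(pick_matrix(0.1, 120))  # static, normally lit scene
print(pick_matrix(0.9, 120))  # action scene
print(pick_matrix(0.9, 20))   # dark scene wins over activity
```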

kwag	01-24-2006 04:38 PM

Quote:

Originally Posted by Boulder

It's nice to be able to set a decent matrix which will remove somewhat more detail from the high-action scenes and then count on the fact that the more static scenes (where detail is more noticeable to the eye) will have more detail due to less quantization.

I was thinking about something like that last night :)
User-selectable options, where it would create the matrix tables with a "weighting" biased towards "Standard", "High Compression/Lower Quality", "Low Compression/Higher Quality", etc.
So the matrix can actually be used for filtering (just like we do with the Notch, to filter lower frequencies).
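
One way such a user-selectable "weighting" could work is to scale a base matrix progressively harder towards the high-frequency corner. This is only a sketch: the linear scaling rule, the preset strengths and the flat base matrix are assumptions for illustration, and entries are clamped on the assumption that a matrix value is an 8-bit number from 1 to 255.

```python
def biased_matrix(base, strength):
    """Scale a square quantization matrix more strongly at high frequencies."""
    n = len(base)
    max_index = 2 * (n - 1)  # u + v at the highest-frequency corner
    result = []
    for u in range(n):
        row = []
        for v in range(n):
            scale = 1.0 + strength * (u + v) / max_index
            value = round(base[u][v] * scale)
            row.append(max(1, min(255, value)))  # keep entries in 1..255
        result.append(row)
    return result


# Hypothetical preset strengths; real values would come from viewing tests.
PRESETS = {
    "Standard": 1.0,
    "High Compression/Lower Quality": 2.0,
    "Low Compression/Higher Quality": 0.5,
}

flat16 = [[16] * 8 for _ in range(8)]  # flat base matrix used as a stand-in
tables = {name: biased_matrix(flat16, s) for name, s in PRESETS.items()}
for name, table in tables.items():
    print(name, table[0][0], table[7][7])
```

Larger values in the high-frequency corner quantize fine detail more coarsely, which is the filtering effect described above.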

Quote:

Originally Posted by hank315

So it could be done between steps 2 and 3: a routine can be called (from a DLL?) which generates a matrix based on the analysis results per GOP.

Exactly :)
So you can even call an external program to do that, which returns a new matrix.

-kwag

kwag	01-24-2006 04:39 PM

Damn, Hank, you just beat me by a minute :lol:
I guess we're all in sync :lol:

-kwag

Racer99

01-24-2006 11:35 PM

Interesting Idea!

I like your spirit on this one. I'll keep in touch to see what you come up with next.
My only real comment is how come we didn't think of this earlier.... :wink:

That's why we love this place...

fabrice

01-25-2006 12:03 AM

Re: Interesting Idea!

Hi,

Quote:

Originally Posted by Racer99

My only real comment is how come we didn't think of this earlier.... :wink:

We did: http://www.kvcd.net/forum/viewtopic.php?t=15712 ;-)

But one year ago, too many doubts, and no way to test that. Now, we have Hank! :-)

CU,
Fabrice

kwag	01-25-2006 12:10 AM

Re: Interesting Idea!

Quote:

Originally Posted by fabrice

We did: http://www.kvcd.net/forum/viewtopic.php?t=15712 ;-)

WOW, I had forgotten all about that discussion 8O
Thanks for pulling that out, Fabrice :D

-kwag

Racer99

01-25-2006 05:19 PM

Ditto on that Karl, I too forgot about that.

Kudos Fabrice!

Prodater64

01-26-2006 04:30 AM

Does anybody here know what AutoQMatEnc does?

Take some time to test it.
At the moment it is optimized for x-pass VBR, but SAPSTAR is working on OPV as well.

danpos

01-26-2006 04:59 AM

QMatOp

@ALL

In addition to Prodater's statement, and considering that this is an HC forum, I have to say that SAPSTAR has already done something similar to what is proposed:

QMatOp :!:

It uses DCTune to scan and analyse GOPs, among other things, in order to generate an optimal quantization matrix constrained to a specified target average bitrate.

Greetings,
