just a side thought..
I suppose we tend to think in (planar) storage format when [drawing] a line or a circle or sketching a scene.
It's not natural for us to think in terms of bit pumps and bit streams combining in a video encoder or decoder that goes on to paint fields and rasters.
So there has to be a way of converting from "frames" to "fields" or "lines" of video.
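Here is a minimal sketch of that frame-to-fields conversion, assuming a frame is just a list of scan lines. The names split_fields and weave_fields are my own, not from any library.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two fields."""
    top = frame[0::2]     # lines 0, 2, 4, ...
    bottom = frame[1::2]  # lines 1, 3, 5, ...
    return top, bottom


def weave_fields(top, bottom):
    """Interleave the lines of two fields back into one frame."""
    frame = []
    for i in range(len(top) + len(bottom)):
        source = top if i % 2 == 0 else bottom
        frame.append(source[i // 2])
    return frame


# A tiny 6-line, 4-pixel-wide test frame.
frame = [[row * 10 + col for col in range(4)] for row in range(6)]
top, bottom = split_fields(frame)
```

Weaving only gets the original frame back when both fields came from the same instant. With real interlaced video they did not.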
So (packed) or stringified bit streams of video and (planar) or arrayed frames of video each have their place.
Having code that easily converts from one storage format to the other is important.
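As a rough illustration, here is a sketch that unpacks YUY2 (a packed format whose bytes run Y0 U Y1 V for each pair of pixels) into three separate planes and back again. The function names and the choice of returning three bytes objects are my own assumptions, not any particular library's API.

```python
def yuy2_to_planar(packed, width, height):
    """Unpack YUY2 bytes (Y0 U Y1 V per pixel pair) into Y, U and V planes."""
    if width % 2 != 0:
        raise ValueError("YUY2 needs an even width")
    if len(packed) != width * height * 2:
        raise ValueError("buffer size does not match width and height")
    y_plane = bytes(packed[0::2])  # every other byte is luma
    u_plane = bytes(packed[1::4])  # one U per pair of pixels
    v_plane = bytes(packed[3::4])  # one V per pair of pixels
    return y_plane, u_plane, v_plane


def planar_to_yuy2(y_plane, u_plane, v_plane):
    """Pack separate Y, U and V planes back into YUY2 byte order."""
    packed = bytearray(len(y_plane) * 2)
    packed[0::2] = y_plane
    packed[1::4] = u_plane
    packed[3::4] = v_plane
    return bytes(packed)


# Two macropixels = four pixels on one line.
packed = bytes([16, 128, 17, 129, 18, 130, 19, 131])
y, u, v = yuy2_to_planar(packed, width=4, height=1)
```

The slicing does all the work here: the packed stream and the planes hold exactly the same bytes, only in a different order.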
Building up pictures by adding and subtracting parts of lines, like working with clay from one line to the next, sounds excruciatingly "difficult", if it is possible at all, for a human being. Animation artists typically work with a frame at a time, not a line at a time.
In reverse, converting from the "native" line-based video format, which is inherently (liney), into a "progressive" frame-based format like a computer image is just as difficult, but without the advantage of automatic code libraries. We have to "stitch" the frames back together. And since an interlaced frame is actually two (different) fields captured at separate moments in real time, there is always a difference between them.
This exercise has helped me understand that de-interlacing is a "kind" of upsampling: getting a higher resolution from fewer lines.
And there is [no] perfect upsampling method, only acceptable artifacting.
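To make the upsampling idea concrete, here is a sketch of about the simplest approach I can think of: take a single field and fill in each missing line by averaging the lines above and below it. The function name and the list-of-lines representation are my own assumptions, and real de-interlacers are a good deal smarter than this.

```python
def interpolate_field(field, top_field=True):
    """Rebuild a full-height frame from one field by averaging neighbouring lines."""
    height = len(field) * 2
    offset = 0 if top_field else 1
    frame = [None] * height

    # Put the real lines where they belong (even rows for a top field).
    for i, line in enumerate(field):
        frame[2 * i + offset] = list(line)

    # Fill each missing row from its real neighbours.
    for row in range(height):
        if frame[row] is not None:
            continue
        above = frame[row - 1] if row > 0 else None
        below = frame[row + 1] if row + 1 < height else None
        if above is None:
            frame[row] = list(below)   # first row: copy the line below
        elif below is None:
            frame[row] = list(above)   # last row: copy the line above
        else:
            frame[row] = [(a + b) // 2 for a, b in zip(above, below)]
    return frame


field = [[0, 0], [100, 200]]
full = interpolate_field(field, top_field=True)
```

The averaged lines are invented data, which is exactly the artifacting trade-off: the result has twice the lines but no more real detail than the field had.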
The best de-interlace is always (none) at all. Interlaced is how the video was created, so any change will by definition (be worse). Interlacing "works" due to persistence of vision, and in a way your brain (is) the best "de-interlacer", if that's really how vision in a human being works at all. We tend to filter out artifacting automatically, all by ourselves and individually. Leaving the video interlaced and presenting it that way lets each individual "de-interlace" in the manner that best suits them.
By the way, there is a really cool website that discusses the four CC formats in terms of (packed) or (planar) formats:
http://www.fourcc.org/yuv.php
It's an easy read.
The reason it is "four CC" is that it is a four-character code, and four characters take up 4 x 8 bits = 32 bits.
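A quick sketch of that arithmetic: packing the four ASCII characters into a single 32-bit integer and back. I am assuming little-endian byte order here (first character in the lowest byte), and the function names are my own.

```python
import struct


def fourcc_to_int(code):
    """Pack a four-character code into one 32-bit integer (little-endian)."""
    data = code.encode("ascii")
    if len(data) != 4:
        raise ValueError("a four CC must be exactly four characters")
    return struct.unpack("<I", data)[0]


def int_to_fourcc(value):
    """Unpack a 32-bit integer back into its four characters."""
    return struct.pack("<I", value).decode("ascii")


yuy2 = fourcc_to_int("YUY2")  # 'Y'=0x59, 'U'=0x55, 'Y'=0x59, '2'=0x32
```

With this byte order, "YUY2" comes out as 0x32595559, which reads backwards in hex because the first character lands in the lowest byte.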
A "macropixel" is a group of pixels stored together as one unit. In UYVY and YUY2 a macropixel is four bytes = 32 bits, holding the luminance of two neighboring pixels plus the color (U and V) they share, so it works out to 16 bits per pixel. In ARGB a single pixel takes the full 32 bits.
It's kind of elegant in that the "four CC" identifier for a format easily "fits" within the storage space of a single "macropixel".
If you don't know what the format of the video bits is, you can just snoop the file or stream header for one of the common four CC identifiers to tell you the video format.
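Something like the following is what I have in mind for snooping: a naive scan of a header buffer for a handful of well-known codes. The list of codes and the function name are my own picks for illustration. A plain byte search can hit false positives, so properly parsing the container header is the more reliable route.

```python
# A few common YUV four CCs; this list is only an illustrative sample.
KNOWN_FOURCCS = (b"YUY2", b"UYVY", b"YV12", b"I420", b"NV12")


def find_fourcc(header, known=KNOWN_FOURCCS):
    """Return (offset, code) for the earliest known four CC in header, or None."""
    best = None
    for code in known:
        pos = header.find(code)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, code.decode("ascii"))
    return best


# A made-up header fragment, not a real file layout.
header = b"\x00\x01RIFFxxxxUYVY\x00"
hit = find_fourcc(header)
```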