Software TBC that doesn't need the frame edges?
4 Attachment(s)
Reading some of the forum posts regarding jmac698's attempt at creating a software TBC has inspired me to start my own attempt at creating one. I've just written a conceptual software TBC that demonstrates my idea, and it's already showing promising results.
In plain English, it's an algorithm that continually shifts and scales the lines of an image in order to reduce the number of vertical edges present in the image. It can take advantage of the GPU and takes a few seconds per image.
These example images were captured using a JVC SR-MV45 VCR with the built-in TBC/NR turned off, a For.A FA-125 external TBC, and a Pinnacle 710-USB capture card.
Let's start with a basic example:
[Attachment not shown]
As you can see, the oscillation of the jitter has been considerably reduced. It isn't perfect, but it's definitely better. For this example, though, you could just use the edges of the frame as a reference to correct the jitter; a technique done that way would potentially look better than the technique I'm using.
But perhaps the biggest advantage of this technique is that it can still reduce horizontal jitter even when the edges of the frame are not visible! The original capture of this next image did have the edges of the frame visible, but for this demonstration I've clipped the heck out of the blacks before feeding it to the software TBC. Now there are absolutely no frame edges to adjust to!
[Attachment not shown]
Evidently, the main issue with the end result is that the title is noticeably distorted. This is mainly because this software TBC does not account for temporal motion yet. Once I add some code to account for temporal motion, the results may look even better. Still, not bad at all!
Of course, this algorithm isn't perfect by any means. A few seconds per frame is quite slow, and as you saw with the last example there can be distortion present in the final result. Also, I've only tested this algorithm on two examples, and I'd like to get some additional examples of content to try it out on.
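The idea of shifting lines until fewer line-to-line differences remain can be illustrated with a very small sketch. This is not the code used for the examples above (that runs in PyTorch, also scales lines, and has not been posted); it is a simplified stand-in that assumes whole-pixel shifts only and a greedy top-to-bottom search. The names `shift_line`, `line_cost` and `straighten` are hypothetical.

```python
def shift_line(line, shift, fill=0):
    """Move a line right (shift > 0) or left (shift < 0) by whole pixels."""
    n = len(line)
    if shift > 0:
        return [fill] * min(shift, n) + line[: max(n - shift, 0)]
    if shift < 0:
        return line[-shift:] + [fill] * min(-shift, n)
    return list(line)


def line_cost(ref, line, shift):
    """Mean absolute difference between ref and the shifted line, over their overlap."""
    total, count = 0, 0
    for x in range(len(ref)):
        src = x - shift  # pixel of `line` that lands on column x after shifting
        if 0 <= src < len(line):
            total += abs(ref[x] - line[src])
            count += 1
    return total / count if count else float("inf")


def straighten(field, max_shift=3):
    """Greedy pass: shift each line to best match the corrected line above it."""
    fixed, shifts = [list(field[0])], [0]
    for line in field[1:]:
        candidates = range(-max_shift, max_shift + 1)
        # Prefer the smallest shift when several candidates cost the same.
        best = min(candidates, key=lambda s: (line_cost(fixed[-1], line, s), abs(s)))
        shifts.append(best)
        fixed.append(shift_line(line, best))
    return fixed, shifts


# A vertical bar whose lines have been displaced by a known amount.
base = [0] * 6 + [100] * 4 + [0] * 6
jitter = [0, 1, -1, 2, 0]
jittered = [shift_line(base, j) for j in jitter]
fixed, shifts = straighten(jittered)
print(shifts)  # the recovered shifts undo the jitter
```

A greedy pass like this drifts on real footage because every line trusts the one above it; optimizing all line parameters jointly, as the post describes, would presumably avoid that.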
The other problem is that sometimes the software TBC widens the entire image (see the first example). A very simple solution is to "normalize" the line-shifting parameters so that there is no global scaling adjustment.
For these demonstrations I didn't separate the fields before applying the algorithm. It's an essential step for practical purposes, but the oscillation of the horizontal jitter was smooth enough that I didn't bother to.
For the moment, I've implemented this algorithm in Python, using the PyTorch library (so this software TBC is technically using deep learning technology, but it's not actually deep learning). Therefore, it would most likely be packaged as a VapourSynth filter if, in the future, I make this software TBC available publicly.
That's all, for now. I'll keep fine-tuning this algorithm and hopefully get my hands on some more examples to try it on.
|
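The two housekeeping steps mentioned in the opening post, separating the fields and "normalizing" the per-line parameters, could look roughly like the sketch below. What the normalization does exactly is not stated in the thread; here it is assumed to mean removing the average shift and dividing out the average scale so the picture as a whole neither moves nor widens. All function names are hypothetical.

```python
def separate_fields(frame):
    """Split an interlaced frame (a list of lines) into its two fields."""
    return frame[0::2], frame[1::2]


def weave_fields(top, bottom):
    """Interleave two fields back into one frame, top field first."""
    frame = []
    for i in range(len(top)):
        frame.append(top[i])
        if i < len(bottom):
            frame.append(bottom[i])
    return frame


def normalize_params(shifts, scales):
    """Remove the global offset and global zoom from per-line parameters.

    Assumption: "no global scaling adjustment" is read as "mean scale of 1".
    Subtracting the mean shift is an extra step not mentioned in the post.
    """
    mean_shift = sum(shifts) / len(shifts)
    mean_scale = sum(scales) / len(scales)
    return [s - mean_shift for s in shifts], [k / mean_scale for k in scales]


frame = ["a", "b", "c", "d", "e"]
top, bottom = separate_fields(frame)
rebuilt = weave_fields(top, bottom)

norm_shifts, norm_scales = normalize_params([1.0, 2.0, 3.0], [1.0, 1.5, 2.0])
```

Each field would be corrected on its own between `separate_fields` and `weave_fields`, since the two fields are scanned at different times and carry different jitter.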
Finally! I was waiting for someone to use jmac's ideas. If I can help you somehow (not coding), let me know.
|
I will always fully support the creation of a software TBC as much as I can. For example, submitting captures/recordings, of varying degrees, in need of correction, since I have a large body of test tapes and homemade DVDs (made by others, with not-great equipment).
|
This could be great in cases where the tapes are no longer accessible but there is a digital copy in reasonable quality. I have lots of MPEG-2 files, interlaced and encoded at crazy bitrates, hoping for something like that. Looking forward to this, and to the source code too!
|
I think software TBC is far more valuable than RF capture ever will be. While it would be interesting to see RF succeed, it will not be a game-changer for capturing. A software TBC, however, will be.
I have no time to put together clips right now. However, I do immediately know of a homemade DVD with issues, and a DVD released without issues. That may prove very useful for this research, and I can start uploading it very soon, sometime next week. (Feel free to nudge me via PM next week, if I've not uploaded by then.)
|
1 Attachment(s)
Quick update here: I've made a couple of changes to the algorithm. I was able to fix the scaling problem I mentioned in the original post, and I added code to separate the fields and process each one separately. I've also reorganized the code so that I can apply this algorithm to entire videos instead of a single image. I still haven't added code to account for temporal consistency, though.
While scouring forum threads of software TBC discussions I came across a comparison video of jmac's software TBC using a Robotech commercial for the material. I figured I'd download it, crop jmac's software TBC version out of the frame and pass only the original untouched part through my own software TBC. Processing ran at a little under 1fps. And here's how that turned out...
The top left is the original capture. The top right is jmac's software TBC. The bottom center is my own software TBC (currently dubbed "Virtual TBC 1" or "vTBC1").
I will say that there are only a few areas where jmac's TBC beats out mine, mostly where frame edges are available for tracking (such as the text at the end). Despite my result being a bit wobbly, I find it way more watchable than the original capture (and even most of jmac's). What's especially notable is that the original video I had wasn't even an interlaced encode; it was a pretty compressed MPEG-4 progressive file that contained both fields per frame (so there's blending and blocking being shared between the fields, not an ideal situation).
I've noticed in my internal testing that when the fields aren't separated and both fields contain the same image, the end result is definitely cleaner. I had to separate the fields, though, for this algorithm to be more practical. For improved results, I would imagine that for each frame, two versions of the image would be passed through the TBC: one with both fields as a progressive image and one where the fields were separated. Then there would be some criterion that would check which one produced "better" results (probably by checking how many vertical edges there are in the images, and picking the one with the fewest).
A thought that popped into my mind not long after I made the first post is that the best way to create an ideal software TBC is not to create just one, but to create several, each with a different approach. Or, you make one software TBC but it's highly adjustable. This is because, especially for a complex problem like software TBC, an attempt to create a "one-size-fits-all" solution is a path to failure and suffering. No matter how advanced your software is and how many cases it can cover, there will always be a specific case or purpose it's best at. So if you create multiple pieces of special-purpose software, and pick and choose one for each situation, you end up with an overall better result. Of course, one still has to be conservative with how many different versions of the same software they make, or else choosing which one to use gets more and more confusing.
The demo I posted above illustrates this point. You can see how sometimes jmac's TBC beats mine out, when it's able to accurately track the frame edges. So if there were some "hybrid" solution between his and mine, where it would use his solution where the frame edges can be tracked and fall back to mine where they can't, then results could potentially be even better.
|
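The selection criterion floated in the update above (run both variants, keep the one with fewer vertical edges) could be sketched as follows. It assumes that "vertical edges" means differences between vertically adjacent pixels, which is what horizontal line jitter inflates; the thread does not define the measure. `vertical_edge_energy` and `pick_cleaner` are hypothetical names.

```python
def vertical_edge_energy(image):
    """Sum of absolute differences between vertically adjacent pixels."""
    return sum(
        abs(a - b)
        for upper, lower in zip(image, image[1:])
        for a, b in zip(upper, lower)
    )


def pick_cleaner(candidates):
    """Return the (name, image) pair with the lowest vertical edge energy."""
    return min(candidates.items(), key=lambda item: vertical_edge_energy(item[1]))


# Two stand-in results for the same frame: one still wobbly, one straightened.
wobbly = [[0, 0, 9, 9], [0, 9, 9, 0], [0, 0, 9, 9], [9, 9, 0, 0]]
straight = [[0, 0, 9, 9] for _ in range(4)]

name, image = pick_cleaner({"progressive": wobbly, "fields": straight})
```

For a fair comparison both candidates would need to be measured in the same layout, for example after the separated fields have been woven back into a full frame. The same kind of score could also drive the "hybrid" fallback between an edge-tracking TBC and this one.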
1 Attachment(s)
I hate to double post but I was hoping for someone else to respond before I made this update post, lol. (And I can't edit my previous one either, but this update is big enough.)
Progress update: Yesterday and the day before, I did a crapton of tweaking and experimentation with the software TBC, trying all sorts of different things to see which helped and which didn't. Not only have I managed to speed up the algorithm by more than three times (now able to process at 3-4fps instead of 1fps), but I've also managed to significantly improve the visual quality of the results as well as the stability.
Provided here is an updated comparison video: top left is the original capture (same as previous post), top right is jmac698's software TBC, and bottom center is the most recent revision of my software TBC as of this writing. I still haven't written anything to account for motion tracking yet.
The quality of this TBC has reached a level where I will be taking a break from development for a bit. I will resume development once I get some more test clips from lordsmurf. I am mainly doing this because I don't want to focus on development for only one test video, only to realize it won't work as well for all the other test videos.
Nitpick thing here: is the question mark at the end of the title automatically added, or was that added by an admin? I personally find having the question mark in the title a bit weird (I didn't originally add it there) because to me it makes the title sound like I'm asking for a software TBC (which I'm not), rather than presenting one.
|
I wonder how much of that mess a line TBC inside a VCR could fix on the original tape. It would be an interesting comparison.
|
1 Attachment(s)
Hi imgkd. Your window in comp2.mp4 looks fantastic compared to the other two! Great job. Of course, without the original for reference, someone just looking at the completed clip would still immediately notice something is very wrong.
I have samples of other content, but which are you most interested in: super-screwed examples like this, or more mild? Do you want some captures with TBC off vs TBC on, where possible? |
That sounds like an interesting sample. I’ll check it out later today. I’m interested in both mild and super-screwed. TBC off vs TBC on might be useful as well for a “ground truth” comparison. |
I would like to announce that this project has officially been resumed.
First, just today I found yet another tweak that improved the quality of the results. Not big enough to post yet another comparison video, but definitely noticeable. Second, I've been re-re-revisiting the conversations jmac had on his project and will take some of those posts for possible inspiration. Third, I've decided that there will be three types of software TBCs: spatial TBCs, temporal TBCs, and registration TBCs. Each type will have its own use cases:
Lordsmurf once suggested an idea on Doom9 to apply heavy temporal NR to reduce the wobbliness, and use that as a reference. From some very, very limited experimentation it is evident that this idea works better for some denoisers than others. So far my best luck has been with hqdn3d; however, there's tons of ghosting, and it doesn't perform motion compensation. MCTemporalDenoise sort of works but doesn't reduce the detail enough to remove the wobbliness. I will continue to scour through more NR filters (I'm feeling like dfttestMC might be the right fit). Note that I haven't tested the full idea (with the line alignment and everything), just the part with using the NR to remove the wobbliness.
In addition to using temporal NR + registration TBC, I've also been exploring ideas on borrowing concepts from image denoisers to improve spatial and temporal TBCs. For example, for each patch in the field, a non-local search finds the closest patches and groups them together. Then the lines are shifted such that all the patches are as similar to each other as possible. Extending this to become a temporal TBC is quite simple: search for patches among multiple fields and not just the current one. To make this work in practice, the similarity metric for the patch search would have to be robust to line misalignments, and the similarity metric for the optimization part of the system would have to be stricter. Perhaps the patches are searched at multiple scales and deformations to further strengthen the optimization.
What's also evident from analyzing some of the samples is that there are a few different kinds of jitter: there's the smooth wobbly kind, the rough jagged kind (such as the Robotech sample), and the extremely severe but much less frequent jitter. Tackling the latter would involve processes similar to film dirt removal: have a process for masking out the "non-jitter" parts (using a combination of vertical delta thresholding and temporal detection), and make the optimization extremely severe. This would undoubtedly require a dedicated temporal TBC.
|
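As a rough illustration of the "vertical delta thresholding" half of that masking step, the sketch below flags lines that differ strongly from both of their vertical neighbours. This is an assumption about how such a mask might be built, not the method from the post: the temporal detection part is left out, and requiring a large difference on both sides is a choice made here so that an ordinary horizontal edge in the picture (which differs on one side only) is not flagged. `line_delta` and `severe_jitter_mask` are hypothetical names.

```python
def line_delta(a, b):
    """Mean absolute difference between two lines of equal length."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def severe_jitter_mask(field, threshold):
    """Flag lines that differ strongly from every vertical neighbour they have."""
    mask = []
    for y, line in enumerate(field):
        neighbours = [field[i] for i in (y - 1, y + 1) if 0 <= i < len(field)]
        flagged = bool(neighbours) and all(
            line_delta(line, other) > threshold for other in neighbours
        )
        mask.append(flagged)
    return mask


# A vertical bar with one line thrown four pixels to the right.
base = [0] * 6 + [100] * 4 + [0] * 6
thrown = [0] * 10 + [100] * 4 + [0] * 2
field = [base, base, thrown, base, base]

mask = severe_jitter_mask(field, threshold=20)
```

Only the flagged lines would then be handed to the much more aggressive correction, leaving the rest of the field alone.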
1 Attachment(s)
Shortly after writing the post above (can't edit the previous post since too much time has gone by), I had a semi-major lightbulb moment for creating a simple temporal TBC: Grab a few frames, transform the frames such that each new frame is now a time slice of all the frames for each line, run those new frames through the spatial TBC, then undo the transformation to get the stabilized frames.
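The time-slice trick described above amounts to swapping the time and line axes, so a sketch is short. It assumes each frame is a list of scanlines and that `spatial_tbc` is any function that takes one image and returns a corrected image of the same shape; all names here are hypothetical.

```python
def to_time_slices(frames):
    """frames[t][y] -> slices[y][t]: one image per scanline, one row per frame."""
    height = len(frames[0])
    return [[frame[y] for frame in frames] for y in range(height)]


def from_time_slices(slices):
    """Inverse of to_time_slices: slices[y][t] -> frames[t][y]."""
    count = len(slices[0])
    return [[time_slice[t] for time_slice in slices] for t in range(count)]


def temporal_tbc(frames, spatial_tbc):
    """Run a spatial TBC over each time slice, then restore the frame layout."""
    fixed = [spatial_tbc(time_slice) for time_slice in to_time_slices(frames)]
    return from_time_slices(fixed)


# Three tiny frames of two lines each; the strings stand in for pixel rows.
frames = [["f0y0", "f0y1"], ["f1y0", "f1y1"], ["f2y0", "f2y1"]]
slices = to_time_slices(frames)
round_trip = temporal_tbc(frames, spatial_tbc=lambda image: image)
```

In each slice, anything that stays still over time shows up as a straight vertical feature, which is exactly what a spatial TBC tries to produce.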
It didn't take me long to apply this idea, and it totally works. However, as a consequence of this idea, the temporal TBC (now named "Temporal TBC 1", the previous TBC now being "Spatial TBC 1") only looks at each line independently throughout each group of frames; it doesn't account for spatial smoothness at the same time. What this means is that, temporally, objects that stay still have zero wobble, but they appear slightly less smooth. To solve this, I used a 2-pass pipeline, which I detail in the demo explanation.
Currently, Temporal TBC 1 filters 180 frames (360 fields) at a time. So that's roughly 6 seconds' worth of data for each line to stabilize. There's no overlap, so theoretically there could be jumps in the line positions every 6 seconds, but I don't see any at all.
For this new demo I picked a different sample that Lordsmurf PMed me. The left is the original clip. The right is the same clip run through the following pipeline: Spatial TBC 1 > Temporal TBC 1 > Spatial TBC 1 (1/3rd strength) > Temporal TBC 1 (1/3rd strength).
Due to the very primitive "motion compensation", the second shot in the trailer is the least treated, since everything is moving at a faster rate. OTOH, check out the frames at around the 15-second mark. You'll notice that the lines are not quite perfectly straightened, but they're certainly straighter than in the original clip.
Temporal TBC 1, when used in conjunction with Spatial TBC 1, appears to be extremely effective for clips such as this one. However, more complex techniques involving proper motion compensation may produce even better results. Not to mention that this 2-pass method is quite slow (each TBC runs at approximately 3-4fps, but four passes are run in total, which brings it to slightly less than 1fps).
|
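The grouping and the speed figures in the post above can be checked with a small sketch. It assumes non-overlapping groups of 180 frames, as stated, and that the four passes simply run one after another; `chunks`, `process_in_groups` and `effective_fps` are hypothetical names, and the lambda stands in for the real temporal TBC.

```python
def chunks(frames, size=180):
    """Yield consecutive, non-overlapping groups of frames."""
    for start in range(0, len(frames), size):
        yield frames[start:start + size]


def process_in_groups(frames, temporal_tbc, size=180):
    """Apply a temporal TBC to each group independently and rejoin the output."""
    out = []
    for group in chunks(frames, size):
        out.extend(temporal_tbc(group))
    return out


def effective_fps(pass_fps):
    """Throughput of passes run back to back: 1 / (sum of per-frame times)."""
    return 1.0 / sum(1.0 / fps for fps in pass_fps)


frames = list(range(400))  # stand-ins for 400 frames
sizes = [len(group) for group in chunks(frames)]
result = process_in_groups(frames, temporal_tbc=lambda group: group)

slowest = effective_fps([3, 3, 3, 3])
fastest = effective_fps([4, 4, 4, 4])
```

Four passes at 3-4fps each land between 0.75 and 1fps overall, which matches the "slightly less than 1fps" figure. Overlapping the groups and blending the line positions across the overlap would be one way to rule out jumps at group boundaries, at the cost of extra processing.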
Oh wow ... wow wow wow :) ... I've wanted to see some clean-up on this for 20+ years now.
Can you upload that as either lossless AVI, or high bitrate (low CQ, like 10) 4:2:2 H.264?
That's really quite good. Not perfect, but way better. As of now, I'd be willing to say that this TBC, for this clip, has the power of ... hmm ... not an ES10/15, at least as good as a weak ES20/ES25, maybe a choking JVC or Panasonic, much better than the super-weak ADVC-300. That's a compliment!
I was just thinking of you a few days ago. I lost the link to this thread (honestly forgot we even had a thread, just remembered the PMs), and was crawling through my nightmare of a PM box trying to remember your username (take no offense, my memory of names and dates has always sucked; women always loved that! :laugh:).
Heavy NR can defeat some errors, but the side effects are always hard to overcome. As I recently wrote in a VH post about herringbone, sometimes the only cure is to "beat it to death with NR". That's a terrible solution, but I'm really glad it gave you an idea -- especially on this topic.
Sometimes Avisynth stuff is literally headache-inducing frustration for me, and jmac's stuff was hard to understand (for me). So I'm really glad his research is paying off, as I told him everything I could think of, gave him samples, and even gave him a TBC for shipping costs (yeah, I've wanted a software TBC for a long, long time! But also noting it was a flawed unit that I picked up cheap at the time, 10+ years ago).
Bah, not afraid of 1fps. I recently tried some insane AA methods, and it was 0.1fps. :laugh: I remember 2fps with QTGMC when it was still in its infancy, on old single-core P4 hardware. 3-4fps for something that looks back+forward is actually quite good. Also remember Avisynth+ x64 and MT modes, and for preview AvsPmod x64 and VirtualDub2 x64. Was that taken into account for the fps?
There is another technique you may want to look into, in terms of back+forward+NR. This is the method used for de-dropout, but it is extremely harsh. This is an area where I want to learn more Avisynth, to tweak it, which is something nobody has ever done (that I have seen) with this complex code. I have, and have gotten it to be less damaging, but it's still entirely NOT feasible for some clips (namely cartoons). That's a big script that I'll have to attach to a reply post; I'm not on my main video restore system at the moment. Perhaps you can learn how to tamp down the negative effects, and keep the positives?
This software TBC is important to me, and I'll always try to drop what I'm doing to help however needed. If you need or want more samples, I'll see what I can dig up. I know of a few uglies, and I can give you some actual ES10 performance with the same tape to shoot for. You can always email me if needed. If you don't already have it, PM me.
|
Also, the fact that it's good enough for you to start comparing it to actual hardware with semi-functioning "TBC" functionality is an achievement in itself.
EDIT: Missed a section from Lordsmurf's post: Quote:
|
1 Attachment(s)
I gave this clip a quick treatment, to enjoyably view. And enjoyable it now be! :D
Of course, it can always get better. But it's come a long, long way. See attached.
This is mostly me playing around. It's aggressive with sharpness and NR, some stabilization and chroma work, quick audio treatment.
BTW: The reason for the aggressive sharpness? To stress test the interlace and timing wiggle. It can really pull out flaws. The IVTC came before that, but still.
Do you have the "super ones" (ch11) commercial, same movie? That has even more meaning to me. All of the non-credit footage for this particular clip (in this thread here) can actually be rebuilt using the Megazone 23 BD or DVD. The spinning Robotech logo can be recreated in After Effects from scans of the logo. The Cannon credit page can also be recreated. But the ch11 footage is unique and does not exist anywhere else.
I need to digitize the whole tape again, post-movie bonuses, using AG-1980P with no NR. It has several samples of damage, from minimal to moderate to insane.
|
As far as Robotech samples go I only have this one, the one I used previously for comparisons with jmac (see earlier posts), and then two live-action footage pieces with what appear to be "interviews". That's all.
Good point on not needing to port to Avisynth, we may indeed need a dedicated interface especially for some of the heavier/more manual work. |
Site design, images and content © 2002-2024 The Digital FAQ, www.digitalFAQ.com
Forum Software by vBulletin · Copyright © 2024 Jelsoft Enterprises Ltd.