It's up to you, but you're kind of doing things "not the way" most people suggest.. if I may be so bold..
The TBC in the JVC-MVS is a line-TBC and corrects for the longitudinal stretching and recoil of the video tape as it travels across the playback drum. This occurs because the winding motor is a feedback servo which pulls and slacks depending on what it takes to get the tape moving and keep it moving from one reel to the next. It can't help but be in error from time to time.. and this produces longer and shorter horizontal lines across the screen.
The "tearing" or "flagging" common to horizontal problems at the top of each frame (if they are noticeable) is because the leading edge of each frame is picked up by the video heads at the same bottom edge of the tape, since the video is scribbled diagonally across the tape from bottom to top. This edge meets the whirling video head and is the part most likely to "warp" or "actually flail like a flag in the wind" as the video head sweeps up from nowhere onto the edge of the tape and then diagonally across it. It's this meeting place.. a real "edge" with nothing supporting it.. that has the chance to be the "most variable" and flip and flap back and forth. Feathering.. or permanently stretching the edge.. does sometimes occur in misaligned transports, and permanently "bakes" this problem into tapes.
A line-TBC grabs a snapshot of each horizontal line, digitizes it, and stores it in a line buffer memory that holds two or three lines, or however many whole lines it has room for. It "scales" the lines by automatically "stretching" them in digital memory so that every line takes the same amount of time to sweep across the video monitor screen.. so that there are no "outliers" or "variable line lengths". This prevents "zigzag patterns" from top to bottom in the picture.. sometimes called "squiggles". The more memory for more lines the better.. up to a full field, or a full frame.. but usually a full field is enough.
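To make the "stretching in digital memory" idea concrete, here is a rough sketch in Python. It is not how the JVC's TBC is actually built.. it just assumes each digitized line arrives as a list of sample values of slightly different length, and uses plain linear interpolation to bring every line to the same length. The names resample_line and line_tbc are made up for illustration.

```python
def resample_line(samples, target_len):
    """Stretch or squeeze one digitized line to exactly target_len samples.

    Uses linear interpolation between neighbouring samples (an assumption;
    real hardware may use a different method).
    """
    n = len(samples)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two samples in and out")
    out = []
    for i in range(target_len):
        # Position of output sample i, measured in input-sample units.
        pos = i * (n - 1) / (target_len - 1)
        left = int(pos)
        right = min(left + 1, n - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def line_tbc(lines, target_len):
    """Bring every line in a small buffer to the same length."""
    return [resample_line(line, target_len) for line in lines]


# Three lines that came off the tape a little short, on time, and a little long.
wobbly = [list(range(9)), list(range(10)), list(range(11))]
steady = line_tbc(wobbly, 10)
print([len(line) for line in steady])
```

After this every line is the same length, so the left and right edges of the picture line up from top to bottom instead of zigzagging.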
The ES10 is used in this pipeline as a "frame-TBC" or "synchronizer".. it can try to stabilize or correct line length problems.. but by the time the video signal travels through video cables.. it's "smeared" just a bit.. so that the information about the true edge of a line is lost and it becomes much more difficult to detect the edge.. so the "strength" of the line-TBC is effectively "lost" and cannot be recovered. Instead the "synchronizer" reconstructs "vertical sync". It basically digitizes each field, strips the poor quality sync signal and slaps on a new copy of a perfect sync signal. It also detects the start and end of each field and frame set and makes sure they are evenly spaced.. so they appear in the output "consistently". At one time this was important for "Genlock" or "splicing in frames from different sources" so that they appear perfectly on screen without jumping around, or "rolling" when switching from one scene source to another.. today it's just good for making sure there is no "jitter" or "disturbance in the [up-and-down] vertical frame sync".
The gyrating output of the VHS player will jump forwards and backwards, that's just the nature of a servo-controlled feedback system.. it's like a ball on a spring.. but it does "average" around a point.. this is called "hunting" and the frame synchronizer will help smooth this out by acting as a dampener.. sooner or later the signal will arrive tardy, or early, all squished or strung out.. and the frame synchronizer (the ES10 in this case) will even all that out.
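Here is a rough sketch in Python of that "evening out". It is only an illustration of the idea, not the ES10's actual design: fields arrive at jittery times, sit in a buffer for a short while, and are released on a steady clock. The function name resynchronize, the 20000 microsecond field period (50 fields per second) and the 2000 microsecond buffer delay are all assumptions picked for the example.

```python
def resynchronize(arrival_times, field_period, buffer_delay):
    """Return evenly spaced release times for fields that arrived with jitter.

    All times are in microseconds. A real synchronizer would drop or repeat
    a field when the input drifts too far; this sketch just reports it.
    """
    start = arrival_times[0] + buffer_delay
    release_times = []
    for k, arrived in enumerate(arrival_times):
        release = start + k * field_period
        if arrived > release:
            raise ValueError(f"field {k} arrived too late for the buffer")
        release_times.append(release)
    return release_times


# Fields hunting around a nominal 20000 us spacing.
arrivals = [0, 20400, 39700, 60900, 80100]
releases = resynchronize(arrivals, field_period=20000, buffer_delay=2000)
gaps_in = [b - a for a, b in zip(arrivals, arrivals[1:])]
gaps_out = [b - a for a, b in zip(releases, releases[1:])]
print(gaps_in)
print(gaps_out)
```

The input gaps wander above and below the nominal period, while the output gaps are all identical. The price is the small fixed delay spent in the buffer.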
The ES10 does its job by performing analog to digital conversion, and as part of that it has to filter its input. It also has chroma and luma "noise" filters.. and if you are using the composite input of the ES10 it also has a comb filter to minimize dot-crawl. After it has converted the signal to digital, it then converts it back to analog and sends the signal out on its merry way.. hopefully for the better.
The capture device IODATA GV USB2 isn't on the recommended list, so no one will likely have an opinion about it, except that it will be suspect if there are problems. It might be good, it might be bad.. but no one knows.
The lossless VirtualDub step is a good one and tells us you are planning to capture without compression, and that's good.. but most capture devices were designed with some type of compression in the output. They often came with custom compression codecs in their hardware or software.. which might compensate for all kinds of unpredictable things (a green "tint" for example, or a tendency to clip blacks or blow out highlights). When capturing losslessly, the "king is without clothes" and all flaws are revealed.. which you then have to be aware of in order to cover them back up.
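One way to become "aware" of clipped blacks or blown highlights is to look at how many luma samples of a captured frame are parked at the extremes. The Python sketch below is only a rough check under stated assumptions: it assumes you have already pulled the 8-bit luma values of one frame into a list by some other means, and it assumes the common 16 to 235 video range as the limits. The function name clipping_report is made up for illustration.

```python
def clipping_report(luma, low=16, high=235):
    """Return the share of samples sitting at or beyond each limit.

    luma is a list of 8-bit luma values from one captured frame.
    low and high default to the usual 16..235 video range (an assumption;
    adjust them if your capture uses full range).
    """
    total = len(luma)
    if total == 0:
        raise ValueError("no samples to inspect")
    crushed = sum(1 for v in luma if v <= low)
    blown = sum(1 for v in luma if v >= high)
    return {
        "crushed_blacks": crushed / total,
        "blown_highlights": blown / total,
    }


# A made-up frame: 10% of samples at black, 10% at white, the rest mid-grey.
frame = [16] * 10 + [128] * 80 + [235] * 10
print(clipping_report(frame))
```

A dark scene will legitimately have a lot of samples near black, so a big number here is only a hint to go look at the picture, not proof that the device is clipping.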
So capturing lossless with an unknown capture device has its own set of unknowns.. so most people will just not comment and wait to see the output.
It's best to nail down what you don't know first, before changing and adding things.
But in general most would recommend gaining some control over your "filters".. and that usually means having either an on/off switch in the VHS player, or some sort of actual filter level controls. After that point the "frame synchronizer" you use might have a proc-amp.. or some kind of "sharpener" or other type of filter controls.. a proc-amp is just a fancy name for a type of filter.. other kinds are noise filters and color correctors.
Most people would also recommend "do as little damage as possible" to the signal.. and some follow the idea that "least adjustment in the signal path is best", then clean it all up in software after the capture. So collecting lots of devices to insert in the signal path.. is probably a bad idea.. there are diminishing returns to trying too hard.
Mindful collecting of gadgets to insert into the signal path, once you've determined a "need", is however not a symptom of hoarding.. and it pretty much keeps you from cluttering your mind up with questions (now why did I buy that thing?) or buyer's remorse.