You have to capture the tape once, for the sake of the video.
Then again for the audio.
You will likely need two separate VCRs for it. The current VCR hates the audio on this tape, so use a known-different VCR for the audio pass.
You can easily edit this in Womble (MPEG) or
VirtualDub (AVI) to make the source lengths match.
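To work out how much to trim, one approach is to find the same landmark (a scene cut, for example) in both captures and compare frame numbers. A minimal sketch, assuming you read the frame numbers off the editor yourself; the function name and the sample numbers are hypothetical, not from any tool:

```python
def head_trim(landmark_a, landmark_b):
    """Return (frames to cut from the start of capture A,
    frames to cut from the start of capture B) so that the
    same landmark lands on the same frame in both."""
    offset = landmark_a - landmark_b
    if offset > 0:
        return (offset, 0)   # A started earlier, so trim A
    return (0, -offset)      # B started earlier (or they already match)

# Hypothetical example: the scene cut is at frame 412 in A and 377 in B.
print(head_trim(412, 377))  # (35, 0)
```

This only lines up the starts. If the ends still differ after that, trim the tail of the longer one.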
Then you split/demux audio from video.
One has good video, one has good audio.
You can remux in many ways, be it in an editor, encoder, or authoring software. It depends on your workflow; i.e., what you're making, streaming vs discs.
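As one illustration of the remux step, here is a rough sketch that builds an ffmpeg command line taking the video from one capture and the audio from the other. ffmpeg is my assumption here, not a tool named above, and the file names are placeholders. It only builds and prints the command; it does not run anything.

```python
import shlex

def build_remux_command(video_source, audio_source, output):
    """Build an ffmpeg command that pairs the picture from one
    capture with the sound from the other, without re-encoding."""
    return [
        "ffmpeg",
        "-i", video_source,  # input 0: the capture with good video
        "-i", audio_source,  # input 1: the capture with good audio
        "-map", "0:v:0",     # take the first video stream of input 0
        "-map", "1:a:0",     # take the first audio stream of input 1
        "-c", "copy",        # stream copy, no re-encode
        output,
    ]

# Placeholder file names, for illustration only.
cmd = build_remux_command("good_video.avi", "good_audio.avi", "combined.avi")
print(shlex.join(cmd))
```

This assumes both captures were already trimmed to the same length. A stream copy will not fix a length mismatch.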
The HiFi audio may be damaged, in which case only the linear track is viable.
This is tedious, yet relatively easy.
There is no software to do what you want. This is all hardware and re-captures. Software cannot help.
Note: If you're not using a TBC, and are dropping frames at capture, this will never work. Audio and video will not sync.
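To put a number on that, here is a back-of-envelope sketch of how far a capture runs short for a given count of dropped frames. It assumes NTSC at 30000/1001 fps (use 25 for PAL) and that each dropped frame simply goes missing from the video; the function name and the sample count are made up for illustration.

```python
def drift_seconds(dropped_frames, fps=30000 / 1001):
    """Seconds of video missing for a given number of dropped frames.
    Default fps is NTSC (about 29.97), which is an assumption."""
    return dropped_frames / fps

# Hypothetical example: 90 dropped frames over the length of the tape.
print(round(drift_seconds(90), 3))  # 3.003
```

Since the two captures will not drop the same frames in the same places, the gap between them changes throughout the tape, and trimming the ends cannot fix that.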