GOP = group of pictures. That's the basis for temporal compression of the MPEG format. You have three types of "pictures" in the GOP -- the I, P and B frames.
The I frame (intra-coded) is a "full" image that stands on its own. It's "uncompressed" only in the sense that it has no interframe (temporal) encoding; it still gets intraframe, JPEG-like compression.
The P frame (predictive) is compressed in one direction only, based on the content of the previous I or P frame. Let's pretend your video is of a clown sitting still, with a pie thrown in his face. On a P frame, anything that's not moving (everything but the pie) compresses well, because nothing has really changed. When the video is decoded/decompressed for playback, data from both the earlier reference frame and the P frame is required to rebuild the picture that was encoded as a P frame.
The B frame (bidirectionally predictive) pulls data from the nearest I or P frames on either side of it to compress. The pie reference doesn't really fit in here, and there's really no analogy that works to explain a B frame. It's simply a frame that's compressed further than a P frame, and in MPEG-2 it's rebuilt from the surrounding I and P frames when decoded/decompressed. The B frame is usually the lowest-quality frame.
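If it helps to see the dependencies spelled out, here is a rough Python sketch. It assumes MPEG-2-style rules (a P frame leans on the previous I or P frame, a B frame leans on the nearest I or P frame on each side), and the function name frame_references is made up for illustration. It is not part of any encoder.

```python
def frame_references(pattern):
    """Map each frame index (display order) to the frames it is predicted from.

    pattern is a string of frame types such as "IBBP".
    Assumes MPEG-2-style prediction: only I and P frames serve as references.
    """
    # Positions of the anchor (I or P) frames in display order.
    anchors = [i for i, t in enumerate(pattern) if t in "IP"]
    refs = {}
    for i, t in enumerate(pattern):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if t == "I":
            refs[i] = []  # stands alone, no temporal prediction
        elif t == "P":
            refs[i] = [prev_anchor] if prev_anchor is not None else []
        else:
            # B frame: nearest anchor on each side. A trailing B frame has no
            # later anchor inside this string; in a real stream it would
            # borrow from the next GOP's I frame.
            refs[i] = [a for a in (prev_anchor, next_anchor) if a is not None]
    return refs


print(frame_references("IBBP"))
```

For "IBBP", both B frames point at frame 0 (the I) and frame 3 (the P), which is why the decoder needs the P frame before it can rebuild the B frames that are displayed ahead of it.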
So, considering all that, for the best picture you want more I and P frames. But that reduces compression efficiency (including bitrate allocation), so you'll want a few B frames. Because live MPEG-2 encoding isn't the best at fast compression, go easy on the B frames when capturing. Software-encoded MPEG files (made from uncompressed AVI sources, for example) can include 4-5 B frames per GOP and still look great. The ATI AIW guides were created some years back with a lot of visual testing to find the best GOP structures. Generally speaking, the default was fine (1I 2P 3B), but reducing the B frames to 2 looked a wee bit better.
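To picture what fewer B frames does to the frame sequence, here is a small Python sketch. The parameters gop_length and b_between are hypothetical names, not the ATI settings themselves, and exactly how the ATI I/P/B counts map onto a sequence isn't covered here.

```python
def gop_pattern(gop_length, b_between):
    """Return the frame types of one GOP in display order.

    gop_length: total number of frames in the GOP (hypothetical parameter)
    b_between:  number of B frames between consecutive I/P frames
    """
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")  # every GOP starts with an I frame
        elif i % (b_between + 1) == 0:
            frames.append("P")  # anchor frame every (b_between + 1) frames
        else:
            frames.append("B")
    return "".join(frames)


print(gop_pattern(12, 2))  # IBBPBBPBBPBB
print(gop_pattern(12, 3))  # IBBBPBBBPBBB
```

With the GOP length held fixed, dropping from 3 to 2 B frames between anchors turns some of those B frames into P frames, which is the quality-versus-compression trade described above.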
Picture quality won't change at all between XP Home and XP Pro. That's simply the OS, and Pro is suggested more often than Home because it has proven to be more stable in quite a few situations. If you have Home and it works, leave well enough alone.
Hope that helps.