I have enough experience working with video and audio that I am confident I can offer some insight into your confusion. I lived near and worked in Hollywood for a time.
When you watch a professionally produced movie, the audio and the video were recorded separately. In post-production, the period when all the editing takes place, the audio track is matched up to the video track. When a movie is dubbed into another language, the original audio track is simply replaced with a new audio track in that language. Because the mouth moves differently in different languages, the new audio track cannot be aligned with the mouth movements in the video track. Instead, the scenes of the movie are broken down into smaller moments (known as beats), and the general pace of the audio is matched to those moments. The new audio tracks are recorded in studios by voice actors, who work from a script specially prepared in their language while watching the muted video footage. This lets the voice actors match their pacing and vocal dynamics to the video as closely as their own language allows.