A method and apparatus for inserting a variable audio delay during video conferencing to balance the conflicting goals of lip-sync and interactive conversation. The amount of audio delay varies with the state of the video conference: a long audio delay is inserted to achieve lip-sync between audio and video playback during monologue speech, while minimal or no audio delay is inserted during interactive discussion or argument. Variable audio playback speed may be used instead of inserting quantum delay to achieve the same result. Various methods and apparatuses for detecting the non-interactive and interactive modes are also disclosed.
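The mode-dependent delay described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the class name `AudioDelayController`, the function `is_interactive`, the 120 ms lip-sync delay, the 10 ms step, and the speaker-turn threshold are all hypothetical values chosen for illustration. The controller moves the audio delay toward the lip-sync value in non-interactive (monologue) mode and toward zero in interactive mode. It changes the delay in small steps per update, which loosely stands in for adjusting playback speed rather than inserting one large delay at once.

```python
from dataclasses import dataclass


@dataclass
class AudioDelayController:
    """Hypothetical controller that adapts audio delay to the conference mode."""

    lip_sync_delay_ms: float = 120.0  # assumed delay needed to match video latency
    step_ms: float = 10.0             # assumed maximum delay change per update
    current_delay_ms: float = 0.0     # delay currently applied to audio playback

    def target_delay_ms(self, interactive: bool) -> float:
        # Interactive discussion: no added delay. Monologue: full lip-sync delay.
        return 0.0 if interactive else self.lip_sync_delay_ms

    def update(self, interactive: bool) -> float:
        """Move the current delay one bounded step toward the target."""
        diff = self.target_delay_ms(interactive) - self.current_delay_ms
        # Clamp the change so the delay ramps gradually instead of jumping.
        change = max(-self.step_ms, min(self.step_ms, diff))
        self.current_delay_ms += change
        return self.current_delay_ms


def is_interactive(speaker_turns_in_window: int, threshold: int = 3) -> bool:
    """Assumed detector: many speaker changes in a recent window means interactive."""
    return speaker_turns_in_window >= threshold


if __name__ == "__main__":
    controller = AudioDelayController()
    # Monologue: the delay ramps up toward the lip-sync value.
    for _ in range(5):
        controller.update(interactive=is_interactive(0))
    print(controller.current_delay_ms)
    # Discussion starts: the delay ramps back down toward zero.
    controller.update(interactive=is_interactive(4))
    print(controller.current_delay_ms)
```

The turn-counting detector is only one possible way to distinguish the two modes; the abstract does not specify which detection methods are used.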

