Excerpt from Knowledge For Use Web site:
A puzzle to think about.

Many feel that it's important to make the distance between the two camera positions as close to the actual inter-eye distance as possible. They say, "If you take them farther apart, you will exaggerate the depth." But the depth almost never looks "exaggerated." (And in those rare cases where it does look exaggerated, the pictures might have been taken right at the interocular distance . . . or even less!) There's an "obvious yet unobserved" principle here, but let's leave it for now as a puzzle: What might make stereoscopic images have exaggerated depth?

The Stereoscopic Model
 

Relevance vs Irrelevance
THE PRINCIPLE

For a stereoscopic image not to appear "flattened" or "stretched" in the dimension of depth, the objects in the photo pair must subtend the same angles at the viewer's eyes as they did at the camera in the original scene. If those angles are greater when the photos are viewed, the stereoscopic image is flattened. If the angles are less, the stereoscopic image has "exaggerated depth." So taking stereophotos with a long telephoto lens will tend to flatten the depth, because viewing the pictures so as to make those angles what they were to the camera means viewing them held far from the face. We actually hold them closer, which makes the subtended angles greater.

Conversely, taking the photos with a wide angle lens tends to exaggerate the depth.  We hold the photos farther than is proper, and that stretches the stereoscopic model in the direction we are viewing--that is, in the "depth" direction.

The distance between the camera positions when making the stereoscopic pair of photos is irrelevant to "exaggeration" or "flattening" of the stereoscopic image.  The relevant parameter is deviation in the viewed photos from original-scene angles.
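
The angle comparison can be put into numbers. What follows is only a rough sketch, assuming a simple pinhole camera and a flat print: an image of width w seen from distance d subtends an angle of 2*atan(w / (2*d)), so a print enlarged M times matches the camera's angles when it is held at M times the focal length. The function names and all the numbers (a 10 mm image on the film, a 5x enlargement, a 300 mm viewing distance, 28 mm and 200 mm lenses) are made up for illustration.

```python
import math

def subtended_angle_deg(width, distance):
    """Full angle, in degrees, subtended by something `width` wide seen from `distance`."""
    return math.degrees(2 * math.atan(width / (2 * distance)))

def matching_viewing_distance(focal_length, enlargement):
    """Distance at which a print enlarged `enlargement` times subtends the camera's angles."""
    return focal_length * enlargement

def depth_effect(scene_angle, viewed_angle):
    """Apply the principle: greater viewed angle -> flattened, lesser -> exaggerated."""
    if math.isclose(scene_angle, viewed_angle):
        return "natural"
    return "flattened" if viewed_angle > scene_angle else "exaggerated"

# Hypothetical numbers, chosen only for illustration.
image_width_mm = 10.0        # width of an object's image on the film
enlargement = 5.0            # print is 5x the size of the film image
viewing_distance_mm = 300.0  # where we actually hold the print

for focal_length_mm in (28.0, 200.0):
    scene = subtended_angle_deg(image_width_mm, focal_length_mm)
    viewed = subtended_angle_deg(image_width_mm * enlargement, viewing_distance_mm)
    proper = matching_viewing_distance(focal_length_mm, enlargement)
    print(f"{focal_length_mm:.0f} mm lens: proper distance {proper:.0f} mm, "
          f"depth looks {depth_effect(scene, viewed)}")
```

With these numbers the wide-angle print would have to be held at 140 mm and the telephoto print at 1000 mm. Held at 300 mm, the first is farther than proper and the second is closer than proper. Notice that the distance between the camera positions appears nowhere in the calculation.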

We can think of the stereoscopic model in a simple way that helps us understand the math of the model. When we project stereoscopic photos with the projectors lined up properly, each point in the model is the intersection of two rays of light, one from each projector. ("Proper alignment" means that each pair does intersect.) The model is the collection of all of those intersection points. And it is a model of the original scene. It's an invisible abstraction, that collection of infinitely many abstractly defined points in space.

Unless, of course, we are viewing stereoscopically.  Then it's very visible, assuming we aren't stereopsis blind.
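
To make "intersection of two rays" concrete, here is a small sketch that finds one model point. It is a simplification, assuming a top-down, two-dimensional view in which each ray is given by a projector position and a direction. The function name intersect_rays and the projector coordinates are hypothetical, not taken from any real setup.

```python
def intersect_rays(p1, d1, p2, d2):
    """Intersect the lines p1 + t*d1 and p2 + s*d2 in the plane.

    Returns the intersection point as (x, y), or None if the lines are parallel.
    """
    # 2D cross product of the two directions; zero means parallel lines.
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < 1e-12:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve for t along the first line, then walk that far from p1.
    t = (dx * d2[1] - dy * d2[0]) / cross
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Hypothetical projectors 2 units apart, each aimed at the same scene point.
left_projector = (-1.0, 0.0)
right_projector = (1.0, 0.0)
model_point = intersect_rays(left_projector, (1.0, 4.0), right_projector, (-1.0, 4.0))
print(model_point)
```

Repeating this for every matching pair of rays gives the whole collection of points that makes up the model. In three dimensions two rays generally miss each other, which is why proper alignment of the projectors matters.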

We want to do our math on measurements of the model.  It's high-school math, easy math for those who understood it.