The Emperor’s New Stereo

I was contacted a few months back by Jose Luis Diaz about an article I wrote for Mix magazine in 1998. He asked whether I had a copy of the original in PDF form. No. I am not the best archivist. 😦

Well it turns out he had a Spanish translation of it and he RETRANSLATED it back to me. 🙂

It’s funny for me to see the old article and the extremely crude drawing quality of that era. As for the subject matter itself, it still holds up pretty well. Not too long ago I went to another concert with a 5-piece jazz band where the piano was on the left and the guitar on the right. We had really great seats on the left side. The piano and drums and bass were fresh and clear. The guitar I heard when it came back off the wall on the right side. Bet it sounded great at FOH.

So here it is ……once again.  And if you want the Spanish version go here


The Emperor’s New Mix

Unveiling the stereo myth on live sound

(Bob McCarthy Mix Magazine January 1998)

Once upon a time, there was an emperor living in a giant palace.

After mixing some tracks in his private studio, the emperor was so happy with the stereo image that he decided to throw a concert for his 5000 closest friends.

For the occasion, he bought a new luxuriously advanced stereo sound system.

Before the show started, the emperor told the audience what the sound system salesman had said to him:

“This system has such magic qualities that it’s capable of creating perfect stereo imaging in every seat. Every person who doesn’t experience stereo imaging is, obviously, vulgar and not suitable for his job.”

Everyone was sitting to the left and to the right all along the center walkway.

The sound system was set up in such a way that all the seats were inside the coverage area of both the left and right P.A. towers.

The concert began.

The emperor was sitting in the center of the room, and he marveled at his own sophistication. The stereo image was perfect!

Everyone else shuffled in their seats, realizing how vulgar they were and the danger they faced of losing their jobs if they were caught. To them, the sound appeared to come almost exclusively from the P.A. tower nearest to them.

When the concert finished, all the guests congratulated the emperor on the vivid stereo image they had experienced. Everything seemed to go well until a little boy, putting words to everyone’s thoughts, said:

“Why did all the music except the tom drum come from the right speaker?”

What the boy had said was true, and everyone knew it.

For some reason, the stereo image only worked in the very center of the room. How could this be? Was there something wrong with the sound system? With the mix? With the room acoustics? None of the above.


There is one simple and irrefutable problem: stereo effects don’t scale when moved from a studio to a bigger room. You could have all the stereo coverage needed for every seat, but that doesn’t mean you’ll experience stereo imaging when you leave the center.

Everyone agrees that stereo spatialization is better perceived from the center. But in a studio, or in a living room, one can move freely over a large part of the room and still experience reasonably effective stereo.

Try it yourself: Play a well-mixed track in your living room, sit directly in front of the left speaker and close your eyes. Although you are off-center, it’s still possible to place the instruments at different horizontal locations between the speakers. Now try it again in front of the left P.A. tower, from a distance of 30 meters in a concert hall. No more gradual horizontal movement between both sides. The image stays almost exclusively in the left speaker.

Keep your eyes closed, and slowly head to the center of the room (be careful!) until you reach a point where you find the same panoramic image you experienced in your living room. Be objective! This is all about real experience, not expected results. Surely, you will be standing just a few steps away from the center of the room, not much further than in your living room.

The distance you can travel in your living room while retaining acceptable stereo imaging is almost the same as you can travel in a 5000 seat concert hall before you lose spatialization.


Panoramic location between two sound sources depends on two interrelated factors: time differences and intensity differences. Let’s analyze intensity differences first. Gradually turn the pan pot on your console to the right. You have now created a level difference between the channels, favoring the right one, and so the stereo image (as expected) moves to the right.

This holds as long as you remain seated midway between the two speakers. If you happen to be sitting to either side, the image won’t move the same way the pan pot does. Why? Here comes the defining factor in sound localization: time difference.

We locate the image depending on which source arrives first at our ears, even if the time difference is minimal and the later source has more intensity. The psychoacoustic relation between these two factors is known as the “Precedence effect” and was analyzed in 1950, among others, by the now famous Dr. Helmut Haas.

The “sweet spot” for binaural localization (stereo imaging) is within the first millisecond of time difference. If the time difference exceeds 5 milliseconds, the sound image can only be moved by brute force. The channel that arrives last must be 10 dB louder than the first to achieve this.

Now this is where the scale concept really comes alive.

Time and intensity differences don’t translate equally when we scale from a small space to a large one.

The intensity difference is a proportion between the levels of the two sources (the two speakers, the two channels…). The intensity relationship between the left and right channels is the same in your living room as in a stadium. If you’re standing twice as far from one speaker as from the other, the intensity difference will be 6 dB. This remains the same whether the distances are 1.5 and 3 meters, or 15 and 30 meters.

The time difference, however, is not a proportion. It is simply the DIFFERENCE in the arrival time of both sources.

While the intensity difference stays constant in the previous example, the time difference is multiplied by 10 when the extra path length grows from 1.5 meters (4.4 ms approx.) to 15 meters (44 ms).

Given that the time difference is the predominating factor in sound localization, you can clearly see that the odds are low when you’re trying to achieve stereo on a large scale.
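
The arithmetic behind this comparison can be sketched in a few lines. This is only an illustration, assuming free-field inverse-square level loss and a speed of sound of 343 m/s; the function names are made up for this example and do not come from the original article.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def level_difference_db(near_m, far_m):
    """Level difference between two equal sources, from the distance ratio."""
    return 20.0 * math.log10(far_m / near_m)


def time_difference_ms(near_m, far_m):
    """Arrival-time difference, from the extra path length in meters."""
    return (far_m - near_m) / SPEED_OF_SOUND * 1000.0


# Same 2:1 distance ratio at living-room scale and at concert scale.
for near_m, far_m in [(1.5, 3.0), (15.0, 30.0)]:
    print(f"{near_m} m and {far_m} m: "
          f"{level_difference_db(near_m, far_m):.1f} dB, "
          f"{time_difference_ms(near_m, far_m):.1f} ms")
```

The level difference depends only on the ratio of the distances, so it comes out at about 6 dB in both cases. The time difference depends on the extra path length itself, so it grows tenfold.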

Because we only have a 5 ms window to control the image, the usable space to recreate stereo in a stadium is, in proportion, really small compared to your living room. In other words, the horizontal area needed to experience true stereo localization (the space where the images can be situated) is barely larger in a stadium than it is in your living room.

Nobody wants to admit that there is no stereo for the big crowds. From a mix engineer’s point of view, stereo represents an advantage. If he is mixing from the center of the room, it’s easier to listen individually to each instrument in the mix if they are panned all along the horizon. Plus, it’s more fun this way.

The diagram shows a concert room and a living room. The living room is drawn to the same scale as the concert room. The light-shaded area in the living room drawing shows the area where the time difference between sources is less than 5 ms. This is the area where true stereo is achieved.

The same shading in the concert room is where one would assume you could obtain stereo imaging. The dark-shaded area shows the real area where stereo works properly in a concert room.
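
The size of that dark-shaded zone can be estimated with a little geometry. The sketch below is a rough illustration, not the method used to draw the original diagram. It assumes two point sources, a flat floor plan, a speed of sound of 343 m/s, and hypothetical room dimensions (speakers 3 m apart heard from 3 m back, versus 20 m apart heard from 30 m back). The boundary of the 5 ms zone is a hyperbola with the two speakers as its foci.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def arrival_difference_ms(x_m, depth_m, spacing_m):
    """Left/right arrival-time difference at a seat x_m off the center line."""
    half = spacing_m / 2.0
    d_left = math.hypot(x_m + half, depth_m)
    d_right = math.hypot(x_m - half, depth_m)
    return abs(d_left - d_right) / SPEED_OF_SOUND * 1000.0


def stereo_half_width_m(depth_m, spacing_m, window_ms=5.0):
    """Distance off center at which the arrival difference reaches window_ms."""
    a = SPEED_OF_SOUND * window_ms / 1000.0 / 2.0  # half the path difference
    half = spacing_m / 2.0
    if a >= half:
        return math.inf  # speakers too close together to ever exceed the window
    # Hyperbola with foci at the two speakers, solved for x at this depth.
    return a * math.sqrt(1.0 + depth_m ** 2 / (half ** 2 - a ** 2))


# Hypothetical dimensions, chosen only for illustration.
living_room = stereo_half_width_m(depth_m=3.0, spacing_m=3.0)
concert_hall = stereo_half_width_m(depth_m=30.0, spacing_m=20.0)
print(f"living room:  {living_room:.1f} m either side of center")
print(f"concert hall: {concert_hall:.1f} m either side of center")
```

With these assumed dimensions the zone extends roughly 2.3 m either side of center in the living room and roughly 2.7 m in the concert hall, even though the hall is many times larger.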


The search for a stereo image can have a negative effect on the uniformity of the frequency response if the speakers are arranged so that their coverage areas overlap too much.

Signals panned to the center, almost always the important channels, will arrive at different times at the seats far from the center. This causes severe comb filtering and changes the frequency response for each listener.

Comb filtering, or combing, is one of the side effects caused by combining signals that aren’t in sync. The time differences change the phase relation between both speakers for all the frequencies. In any location, the frequency response obtained will depend on the phase relation between both signals. When the phase matches, there will be a total sum. When the phase is inverted, there will be a total cancelation.

At any point in between those two extremes, the combination is neither a total sum nor a total cancelation. Across the spectrum, the result is a series of audible peaks and dips in the frequency response. Each change in location brings a different time difference between the left and right channels and, with it, a new phase relation, resulting in a new series of peaks and dips in the frequency response.

The irregularities caused by combing are most severe when the two signals have the same intensity but different arrival times.

The more you try to spread the stereo by increasing the overlapping area of the speakers, the more audible the peaks and dips will be. This is not to be taken lightly. A sound system with a large overlapping area will have variations of up to 30 dB in the frequency response over a bandwidth that changes from seat to seat, turning EQ into something completely arbitrary. A short 1 ms delay will create a 1-octave hole at 500 Hz, and the effect scales from there. Longer delays degrade the intelligibility and the sound quality even further.
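
To put rough numbers on this, here is a minimal sketch of the textbook two-source comb filter. It assumes two identical signals that differ only by a fixed delay and a fixed level offset, which is a simplification of a real room; the function names are invented for this example.

```python
import math


def comb_null_frequencies_hz(delay_ms, count=3):
    """First few cancellation frequencies for two copies offset by delay_ms."""
    delay_s = delay_ms / 1000.0
    # Nulls fall where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]


def ripple_db(level_offset_db):
    """Peak-to-dip range when the later signal is level_offset_db quieter."""
    r = 10.0 ** (-abs(level_offset_db) / 20.0)  # amplitude ratio, 0 < r <= 1
    if r == 1.0:
        return math.inf  # equal levels: the dips are complete cancelations
    return 20.0 * math.log10((1.0 + r) / (1.0 - r))


print(comb_null_frequencies_hz(1.0))  # nulls for a 1 ms offset
for offset_db in (1.0, 6.0, 10.0):
    print(f"{offset_db:.0f} dB offset: {ripple_db(offset_db):.1f} dB of ripple")
```

For a 1 ms offset the first null lands at about 500 Hz, with further nulls at about 1500 Hz and 2500 Hz. The ripple shrinks as the level offset between the two arrivals grows, which is why keeping the overlap small, so that one speaker clearly dominates at each seat, reduces the combing.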

If the stereo image is the top priority, then you should fully pan the channels and make the overlapping coverage area of the speakers fill the room. The only way to beat time difference is to force it with intensity. Although this expands the stereophonic area, you will be left with terrible level differences between channels at both sides of the room. On top of that, channels panned to the center will have a variable response over the listening area, caused by the combing that comes with all the overlap.

This technique was used for many years by a nameless touring band, which hard-panned several of its musicians. In the center of the listening area, the stereo was fantastic.

However, fans that couldn’t arrive early to the shows, in order to get seats in the center, would have to choose between listening to the left drummer and the guitar player, or, the right drummer and the keyboard player.

If the priority is to make the entire band enjoyable for the whole audience (and I expect it to be this way), then, leave the stereo as a special effect. Design the sound system in a way that the overlapping of the left and right speakers roughly matches the 5 ms time delay window area. Reduce the level of infill speakers so the front and center coverage can be achieved without big overlapping spaces. Don’t waste your time, energy and money on stereo delays and fills.


All of this can sound radical, maybe even heretical, to many readers. After all, we have put so much time and effort into stereo reproduction in P.A. systems.

It would be awesome if we could achieve stereo in every seat of the room, or even half of them. If a large amount of the audience receives the benefits of stereo imaging, we could argue that combing and intelligibility loss are a reasonable price to pay for it. But it is futile and self-destructive to fight against the laws of physics and psychoacoustics and to pretend that we are experiencing stereo, when we are not. Remember our priorities.

It is unlikely that our customers will raise their voices because they don’t have enough stereo. They certainly will, of course, if everything sounds like a telephone or can’t be understood, two of the most common results when searching for stereo on big shows.

Mono sound reinforcement systems seem like something we should have already discarded for something better, but they have a big advantage over stereo: They work.

This is not a statement that will please the emperor, or the band manager, but it does hold some truth: “This system has such magic qualities that it’s capable of creating perfect mono imaging in every seat”.

So thank you Jose Luis.


19 Comments on “The Emperor’s New Stereo”

  1. Larry Elliott Says:

    Fantastic example Bob… Oh how I wish there were more “little boys” that would question the nonsense we are being told!

    • 6o6 Says:

      Thanks Larry. I had a fascination with Greek Mythology as a kid but did not expect to go into a field with so much mythology as a grownup.

  2. Sators Says:

    Great article – definitely passing around.

  3. Stephen Devine Says:

    I even understand this. Brilliant!

  4. Anonymous Says:

    Is there any way to design a system that provides separation of instruments but is NOT stereo? In other words, the discussion always seems to be about mono vs. stereo where what I’m really looking for is mono vs. some form of instrument imaging separation for clarity purposes. Would it ever be possible to design a system so that instead of creating two separate imaging sources (left/right), we could create it as something like center/outside imaging (using 4 point sources where 2 would normally be used) or some other solution, so the listener can still hear _some_ sort of instrument separation even if it is not the traditional left/right imaging? Again, I’m not looking for the traditional left/right for the entire room, just left/right/up/down/whatever for a listening position, even if that imaging gets flip/flopped as you move throughout a room.

    • 6o6 Says:

      Thanks for posting. The short answer is YES, and so is the long answer and the medium answer. The solution is in thinking and designing multi-channel rather than stereo. If we design multi-channel we can offset the placement of sound sources, which makes for the sonic separation that we enjoy.
      Here are some examples of systems I have had experience with in the multi-channel realm:
      1) Musical theatre systems: Many of these systems use a separate set of speakers for voice and music. The mono voice systems are very precise in their coverage and therefore highly intelligible. The music systems are very wide and highly overlapped and linked (level and position wise) to the sound coming from the pit. This can be highly effective as a panoramic spread for a sizable part of these halls (which are not so big and tend to be fairly narrow).
      2) Extreme multichannel: Two examples A) the mermaid theater in Tokyo Disney which is a domed circular listening space – like a planetarium. We have 14 very wide coverage speakers all around in a circle. Each of the 14 has a unique mix (derived from 7 stereo mixes). One pair has keys, another horns, strings, another drums etc. The vocal is mono in a separate system. The instrument mixes combine in the space (a small space of 700 people) and everybody gets to hear all the instruments – but all of the locations get a different mix. Everyone hears a spatial spread of the instruments all around them – just a different one. This is all playback and has no link to any visual – otherwise it would be very strange to the brain.
      B) Love theater in Vegas (Beatles LOVE show.) Highly overlapped multichannel system. Works fabulously but again (like the Mermaid) we have complete control of the tracks – no leakage issues etc. and no visual link to musicians spread along the stage. The sound can come from all around you – or all in front of you depending on the settings at the time.

      But the gradual and controlled placement of sound sources in a stereo mix (like we hear at home) is VERY limited in its translation to the concert hall. If you want a lot of people to image to exact spots – you need speakers in those spots. Just a pan pot won’t do it.
      3) the final note on this is cinema – Left-Dialog-Right. The cinema people know that the closeup on the actor will be at the center – and the only way to get the sound there is put the speaker there.

      Hope this helps


  5. Eduardo Brewer Says:

    Hey Bob, I’ve translated a couple of your articles for a friend of mine who really likes your work but doesn’t speak a word of english. Do you want to take a look at them???

    • 6o6 Says:

      Thanks Eduardo,

      I would be happy to. Send them to me. Maybe I can put together an area here with spanish language resource.


  6. Luis Herranz Says:

    Hi Bob, brilliant article.

    What do you think about a multichannel of L-R-L-R-L-R…L-R across the width of the room? That way almost everyone could hear something close to stereo (and, more important, benefit from more instrument clarity).
    Do you think it could be done with line arrays or is it likely to produce too much comb-filtering?

    • 6o6 Says:

      I’m sorry to be slow to reply – your post accidentally went to the spam catcher. L-R-L-R schemes depend on the shape of the room. If it is very wide but not deep then it can work. If the room is deep then (like you say) there will be too much combing – this would be the case for line arrays or other speakers because it is a horizontal plane issue.

  7. chris Says:

    Hey Bob, thanks for the insight into large-scale stereo, gotta love it when physics defines the necessary solution and not a sales pitch!

  8. […] going to refer back to a couple of things. For starters, Bob McCarthy recently republished an old article of his on his blog. For those new to the party, Bob McCarthy basically wrote the book on system […]

  9. […] The Emperor’s New Stereo […]

  10. Ken Says:

    Hey Bob,

    I was wondering if you could explain “The Seat-Dip Phenomenon.” Someone else was telling me that my measurements weren’t accurate because this phenomenon creates a loss of low frequency measured at the heads of audience members in seated venues. Apparently it is formed by the bottoms of seats and the floor.
    I was placing my measurement microphone at head height and the person I was arguing with wanted the mic to be at standing height. Is this a real problem or is it another sound myth. How does a sound designer overcome this when designing a loudspeaker system?

    • 6o6 Says:

      This is a subject that comes up pretty often. There are several considerations here, but nothing that I would term a “phenomenon.”
      1) If you place a mic very near a seat back you get a reflection. This creates a strong comb filter signature that would greatly color your data.
      2) Now ask yourself the question: “Is that reflection going to be present during show conditions?” The answer is Yes only if the seat is empty, in which case we don’t care how it sounds.
      3) If the seats are occupied then we want to get the most representative response for that area – not an over-representation of a single seat.
      4) This is best served by a standing head height mic because it is less prone to strong very local reflections. It is true that no single person is there, but would you really expect that if you stood up in your seat the sound would change dramatically? Hopefully not. So a standing height mic (in most applications) is a degree or two above the ears of a listener – but would be the same angle as a person one or two rows back. For the row or two behind the mic the angle is correct but the mic is a bit close. Maybe a few tenths of a dB louder than would be at their row. So the standing height mic is really a representative for the area – not just a seat. Unless we are using 1 degree speakers we will be OK. This is a small price to pay to get away from combing that WON’T be there at showtime.

      Hope this answers your q. Thanks for posting

      PS – The exception is a frontfill mic which needs to stay pretty low or the angle error will be too big. For the other 98% of the time you are good to go.

  11. […] topic that perhaps lies somewhere between the two is probably one that Bob McCarthy also writes about, namely which people in a typical live audience actually experience stereo “for real”, […]
