The Power of Numbers

Yesterday, I learned a valuable lesson in why you should never assume that a situation will ‘never happen’ when it comes to a server or network.  In this case, it revolves around a POP3 mailserver and email attachments.

One remote site accesses all of its email via POP3, unlike all the other sites, which are on Exchange.  This site is connected to the core office by a T1 link.  Normally, problems are few, as most traffic on the link is telnet.  Then, one user sent an email message.  A very large message.  Around 23MB large.  To all 89 employees at that site.

In going over the logs for that timeframe, the source message took about 10 minutes to send.  That didn’t cause any problems.  It’s what happened once it hit the server that brought everything to a crawl.  All the users are set up on a Linux VM.  When sendmail received that 23MB attachment, for all 89 users, it made 89 distinct copies and gave one to each user (this here is why I love Single Instance Storage in Exchange).  The copies immediately chewed up a little over 2GB of space.

Within minutes, the T1 link was brought to a standstill by the other 89 users’ Outlook clients automatically running a send/receive operation.  The phones began ringing, and the problem was quickly tracked down.  However, the network link for the server did have to be disconnected for a few minutes to prevent users from getting a lock on their mailboxes, so we could clean them up.

Had the message been allowed to sit in place, it would have taken slightly over 3 hours and 5 minutes, at full saturation of the T1, for everyone to get their mail (and do nothing else during that time).  The results of this little fiasco?  Attachments now have a file size limit to match our Exchange limits, and POP3 traffic is rate-limited on the link to 768kbps.
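
For anyone who wants to check the math, here is a rough sketch of the numbers above. It assumes 1 MB = 2**20 bytes, the nominal 1.544 Mbps T1 line rate, and no protocol or attachment-encoding overhead, so treat the output as a best case rather than a measurement. The function and constant names are my own for illustration.

```python
# Back-of-the-envelope figures only. Assumes 1 MB = 2**20 bytes, the nominal
# T1 line rate, and zero protocol or MIME encoding overhead.
MESSAGE_MB = 23
RECIPIENTS = 89
T1_BPS = 1_544_000        # nominal T1 line rate, bits per second
POP3_CAP_BPS = 768_000    # the new POP3 rate limit, bits per second


def transfer_seconds(total_bytes: int, link_bps: int) -> float:
    """Seconds needed to move total_bytes over a link running at link_bps."""
    return total_bytes * 8 / link_bps


def as_hms(seconds: float) -> str:
    """Format a duration as H:MM:SS, dropping fractional seconds."""
    whole = int(seconds)
    return f"{whole // 3600}:{whole % 3600 // 60:02d}:{whole % 60:02d}"


# One full copy of the message per recipient, as sendmail stored it.
total_bytes = MESSAGE_MB * 2**20 * RECIPIENTS

print(f"Spool usage: {total_bytes / 2**20:.0f} MB")
print(f"Full T1:     {as_hms(transfer_seconds(total_bytes, T1_BPS))}")
print(f"768 kbps:    {as_hms(transfer_seconds(total_bytes, POP3_CAP_BPS))}")
```

Under those assumptions the full-T1 case comes out to 3:05:21, which lines up with the figure above.  The 768kbps line assumes the cap applies to all POP3 traffic combined; the same pile of mail would then take roughly twice as long to drain, but everything else on the link keeps moving.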

I will be so glad once this location has been switched to Exchange.