Friday, September 26, 2008

Coldspring stopped me cold..

We just recently upgraded to Coldspring 1.2 and it wasn't long before I came to a grinding halt the next time that I needed to restart my CF instance. I got the following error:

Object Instantiation Exception
An exception occurred when instantiating a Java object.
The class must not be an interface or an abstract class. Error: ''.

The error occurred in C:\websites\coldspring\beans\AbstractBeanFactory.cfc: line 253
Our beta system wasn't having the same issues, so I decided to look at what might be different in the two environments. I found that I was still on 8.0.0 (doh!). So I went to http://kb.adobe.com/selfservice/viewContent.do?externalId=kb403277&sliceId=1 and upgraded.

That worked, right? Wrong! Though it was good for me to get up to speed with the proper version, it did nothing to get me back to productivity. I browsed some more blogs with pieces of the error message and came across one that seemed to think this error happened when nulls were being thrown around where actual values were expected. I looked at this line in the loadFrameworkProperties method in Coldspring/beans/AbstractBeanFactory.cfc:

<cfset local.fileStream = CreateObject('java', 'java.io.FileInputStream').init(arguments.propertiesFile) />
... which caused me to look at the initial code of the AbstractBeanFactory.cfc file here:
<!--- ColdSpring Framework Properties --->
<cfset variables.instanceData.frameworkPropertiesFile
= "/coldspring/frameworkProperties.properties" />
<cfset variables.instanceData.frameworkProperties
= loadFrameworkProperties(ExpandPath(variables.instanceData.frameworkPropertiesFile)) />
So I looked for the .properties file in the /coldspring directory and found nada. I dropped an empty file named frameworkProperties.properties into the coldspring directory and I was back in business.

Hope this helps.

Blessings...

Tuesday, September 23, 2008

I shot the session...

I was reading through a couple of old blogs recently about the need to kill a session immediately rather than waiting for it to time out. If you are using J2EE sessions, the session will be orphaned when the browser is closed, but it still lives on until it eventually times out according to the timeout set for your application. There may be a need to clear out some of those sessions when you're sure they are orphans. Maybe you have a user who is logging in from another workstation and you would like to clear out the other session so that you don't have the same user logged in twice. Whatever your reason, there is a way to immediately clear out the session and still have onSessionEnd execute in your Application.cfc. Try this little piece of code:
<!--- get the session id by working with the session tracker --->
<cfset _sessionid = '74306b9b10b4c354a8db101f73246434611b'/>
<cfset killSession(application.applicationname,_sessionid)/>

<cffunction name="killSession" output="false"
access="public" returntype="void">
<cfargument name="appName" required="true" type="string" />
<cfargument name="sessionid" required="true" type="string" />

<cfset var st =
createobject("java","coldfusion.runtime.SessionTracker") />
<cfset st.cleanUp(arguments.appName,arguments.sessionid) />

</cffunction>

If you do not have J2EE sessions enabled, you can call cleanUp(application.applicationName, _cfid, _cftoken).
Note: I haven't tested this one yet.


This will successfully remove that session. It will also execute any onSessionEnd code that you have to take care of any cleanup scripts that you have. This is an undocumented method of sessionTracker so use with caution knowing that things could change. I'm not sure if this works in versions before CF8. I would be interested in knowing, if someone would be so kind as to test it.

Blessings...

Monday, September 22, 2008

Undocumented Goodness

In working on a project the other day, I needed to allow users to interact with data in such a way that they were not tripping over each other. This led me down the path toward a proof of concept with monitoring sessions for a particular application. The problem is that when you "touch" the sessionScope methods it will reset the session timeout and keep dead sessions alive. This can be countered by calling the methods using reflection like so:
<cfset tracker=createObject("java","coldfusion.runtime.SessionTracker")>
<cfset sessions=tracker.getSessionCollection(application.applicationname)>
<cfoutput>
<cfloop item="loopSession" collection="#sessions#">
Idle Time:
#getSessionProxy(sessions[loopsession],'getTimeSinceLastAccess')#<br/>
</cfloop>
</cfoutput>

<cffunction name="getSessionProxy"
output="false" access="public" returntype="string">

<cfargument name="session" required="true" type="struct" />
<cfargument name="method" required="true" type="string" />

<cfset var _a = arrayNew(1)/>
<cfset var _sessionClass =
_a.getClass().forName("coldfusion.runtime.SessionScope") />

<cfset var _method = ''/>
<cfset var _value = ''/>
<cftry>
<cfset _method =
_sessionClass.getMethod(arguments.method, _a) />
<cfset _value = _method.invoke(arguments.session, _a)/>
<cfcatch><!--- Do Nothing ---></cfcatch>
</cftry>
<cfreturn _value />
</cffunction>

That works great and is very valuable data, but what if you need to find out what the session.idofuser is on one of those sessions, or another session variable that you need access to? There really is no reflected function that gives you access to those variables without touching the session timeout. You could do sessions[loopSession].idofuser but that would trip the session. Even a cfdump of the sessions collection trips the session.

I couldn't find this documented on blogs or anywhere else, but there is a new sessionScope method going by the name of "getValueWIthoutChange". This must be new in CF8. I'm guessing one reason it was added is for the server monitor, which gives you access to this info without touching the session timeout. If you google this method, you get absolutely 0 results. When I saw that in the dump of the sessionScope class, I knew there was hope. Important to note too is that Java is very case sensitive. Notice that the W and I are capped in "WIthout". Typo that made it through?

Anywho, my next challenge was finding the correct casting and such to be able to pass a var into the reflected getValueWIthoutChange method from CF. I flailed on this for about a day before I took it to a co-worker who was a Java guy in a past life. He had it nailed down for me in half an hour or so. Long story short, we ended up with a method that will allow you to get any session var without touching the sessions. This is a beautiful thing and opens up all kinds of possibilities.
<cfset tracker=createObject("java","coldfusion.runtime.SessionTracker")>
<cfset sessions=tracker.getSessionCollection(application.applicationname)>
<cfoutput>
<cfloop item="loopSession" collection="#sessions#">
Idle Time:
#getSessionProxy(sessions[loopsession],'getTimeSinceLastAccess')#<br/>
User ID:
#getSessionValue(sessions[loopsession],'idofuser')#<br/><br/>
</cfloop>
</cfoutput>
<cffunction name="getSessionValue"
output="false" access="public" returntype="any">
<cfargument name="session" required="true" type="struct" />
<cfargument name="key" required="true" type="string" />

<cfset var a = arrayNew(1)/>
<cfset var valueMethod = ''/>
<cfset var value = ''/>
<cfset var sessionClass =
a.getClass().forName("coldfusion.runtime.SessionScope") />

<cftry>
<cfset a[1] =
CreateObject("java","java.lang.String").GetClass()/>
<cfset valueMethod =
sessionClass.getMethod("getValueWIthoutChange",a) />
<cfset a[1] =
CreateObject("java","java.lang.String").Init(arguments.key)/>
<cfif findnocase(arguments.key,structkeylist(arguments.session))>
<cfset value = valueMethod.invoke(arguments.session, a)/>
<cfelse>
<cfset value = ''/>
</cfif>

<cfcatch><!--- Do Nothing ---></cfcatch>
</cftry>
<cfreturn value />
</cffunction>
<cffunction name="getSessionProxy"
output="false" access="public" returntype="string">

<cfargument name="session" required="true" type="struct" />
<cfargument name="method" required="true" type="string" />

<cfset var _a = arrayNew(1)/>
<cfset var _sessionClass =
_a.getClass().forName("coldfusion.runtime.SessionScope") />

<cfset var _method = ''/>
<cfset var _value = ''/>
<cftry>
<cfset _method =
_sessionClass.getMethod(arguments.method, _a) />
<cfset _value = _method.invoke(arguments.session, _a)/>
<cfcatch><!--- Do Nothing ---></cfcatch>
</cftry>
<cfreturn _value />
</cffunction>

Now, as an important note and as has been echoed on other blogs, this is an undocumented method and the farm should not be bet on it. There is no guarantee that it will live on in other versions of CF so use wisely.

With that said, praise God for technology and go change the world.

Blessings...

Friday, September 12, 2008

RegEx broke my phone

We had a bug turned in on a form that was not accepting valid phone numbers, or so it seemed. The number used was something like 233-122-2323. Looks like a valid phone number right? I dug into the form field and found that it was a cfinput tag utilizing the validate="telephone". We recently moved to CF8 and I was wondering if this was a CF8 issue that was somehow unique. I looked at the generated source in order to evaluate the resulting js that is used to validate the form on submit. That led me to this code:

//form element phone 'TELEPHONE' validation checks
if (!_CF_checkphone(_CF_this['phone'].value, true))
{
alert(_CF_this['phone'].value)
_CF_onError(_CF_this, "phone", _CF_this['phone'].value, "A valid phone number is required.");
_CF_error_exists = true;
}
We can find the code for _CF_checkphone buried in this file, depending on your instance:
\JRUN4\servers\[instance]\cfusion-ear\cfusion-war\CFIDE\scripts\cfform.js
Finding the function in this file revealed the regular expression that is being used to validate phone number.

/^(((1))?[,\-,\.]?([\\(]?([1-9][0-9]{2})[\\)]?))?[,\-,\.]?([^0-1]){1}([0-9]){2}[ ,\-,\.]?([0-9]){4}(()((x){0,1}([0-9]){1,5}){0,1})?$/

1-800-322-5544 or 220-122-2323 (the number used on the form)
Now I'm not a regex guru nor do I work with it every day, so color coding is a definite help for me. There are some nice tools out there that can help you test regex both pay and free. I've been playing with RegExBuilder (free) lately. It doesn't have all the bells and whistles like being able to switch the regex engine, but it works for what I need right now. Anywho, let's break this down:

((1))? - the ? makes the 1 optional

[1-9] - this has to be a digit between 1 and 9, never 0

[0-9]{2} - two digits that are between 0 and 9

([^0-1]){1} - any character that is not 0 or 1. Surprisingly, this allows non digits

([0-9]){2} - two digits that are between 0 and 9

([0-9]){4} - four digits that are between 0 and 9

So the test number that was entered violates the ([^0-1]){1} rule: the middle group of the number starts with a 1.
The funny thing is that we could have entered anything else besides 0 or 1 in that position. I tested 701-B86-5566 and it worked. But anywho… I think this form is performing as expected, unless of course there are valid phone numbers where the first digit of the middle group is 0 or 1. I wouldn't know where to find that info and a brief google session didn't turn anything up. I gave up quickly because I didn't feel like digging into that right now. I'll leave that up to a more ambitious person. But the question to be answered is whether or not this should be reported as a bug, with a request for Adobe to fix that regex in CF8 to be more accurate on the [^0-1] test. Why not [2-9]?
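For anyone who wants to poke at this outside of a form, here is a minimal sketch in plain JavaScript. It assumes the pattern is exactly as pasted above from cfform.js. The names cfPhone and strictPhone are made up for the test, and the [2-9] variant is only the tweak suggested above, not anything Adobe ships.

```javascript
// Pattern as pasted above from cfform.js (assumed to be copied faithfully).
const cfPhone = /^(((1))?[,\-,\.]?([\\(]?([1-9][0-9]{2})[\\)]?))?[,\-,\.]?([^0-1]){1}([0-9]){2}[ ,\-,\.]?([0-9]){4}(()((x){0,1}([0-9]){1,5}){0,1})?$/;

// Hypothetical stricter variant: swap the [^0-1] test for [2-9].
const strictPhone = new RegExp(cfPhone.source.replace("[^0-1]", "[2-9]"));

// Sample numbers taken from the post.
const samples = ["1-800-322-5544", "220-122-2323", "701-B86-5566"];
for (const s of samples) {
  console.log(s, "cfform:", cfPhone.test(s), "strict:", strictPhone.test(s));
}
```

With the pattern as pasted, the first sample should pass both versions, the second should fail both, and the third (the one with the B in it) should pass only the cfform version.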

Blessings...

Wednesday, August 27, 2008

Double execution, my bad

Yesterday was a bug fighting day for me. I might add, twas a frustrating one at that. Here's the scenario:

We have a simple subscribe box allowing users to receive email updates for certain categories if a new job appears in that category. If there is a new user, it should prompt them for their user info. If it is an existing user, it should prompt them for their pin. The problem that I was seeing is that a new user was being prompted for a pin, indicating that this was not a new user at all. The odd part was that it worked in my dev environment but not in our beta or prod environments, same exact code.

So... digging in, I set several cfdumps throughout the application followed by a cfabort.
(Side note: for a cool way to view a stack trace from several levels deep, see Ben Nadel).
I didn't really see anything from that info. All it told me is somehow the record for the user was being created before the code I was observing. That was a head scratcher because I was at the beginning of the code execution. I looked all over the relevant pages for a cflocation or a window.location thinking that somehow it was being recursive. Nothing. I looked at the fusebox parsed file and there was nothing in there that told me it was circling back. Now what? Since this was in beta and we didn't want to turn RDS on, I couldn't do the step through. That wouldn't have helped me anyway knowing now what the issue was. I did turn on debugging and that didn't give me much. I decided to set up a sql trace and found that there was indeed a double execution going on but it simply wasn't showing in my browser.

After lunch I came back and searched on coldfusion and some play on the words "double execution" and found a blog post from the CF4 days about how some code was causing double execution on an image tag that had a non-cfoutputted variable as the src. The explanation was that the browser was seeing the # and going back to the page again for the image, causing it to run twice. That got me thinking. Maybe I should be looking in the IIS logs. Sure enough, there it was:

2008-08-26 19:41:27 ******** GET ****** 80 - 69.41.14.80 libcurl-agent/1.0 200 0 0
2008-08-26 19:41:31 ******** GET ****** 80 - ***.***.***.*** Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1 200 0 0


One was my browser... but before that was a libcurl-agent. What was that? I was thinking it was some of our data gathering visit trackers but I came up empty googling libcurl in combo with their names. Finally I researched the ip. Using arin.net, I found that this "bot" belonged to

Michigan Online Group MOG-69-41-0-0 (NET-69-41-0-0-1)
69.41.0.0 - 69.41.15.255
Covenant Eyes, Inc. MOG-69-41-14-0 (NET-69-41-14-0-1)
69.41.14.0 - 69.41.14.255


Oh man... Covenant Eyes. That is the integrity software that I am running locally. The filter service gets wind of the site that I want to visit, rushes out and sees it before I do in order to check its content, then flags my system to say that it's OK to visit. That was essentially creating the user before my browser could get to it. By the time I got there, it was perceived as a return visit. Man... I just wasted 7 hrs looking into it (I'm obsessive, I know). I'm all in favor of running integrity software because we're only as strong as our weakest moments. I still like what Covenant Eyes does, but if you forget about how it works it can cause a few headaches and wasted hours. The fix for this is to set the site URL to permanent allow under the Filter History and Settings area. That will stop the filter from "pre-visiting" the sites that you are trying to debug.
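As a side note on the CF4-era explanation mentioned above (an image src that starts with #), here is a small sketch of why such a src points back at the page itself. It only shows standard URL resolution; the page address and the variable name imagePath are made up for illustration, and how a particular browser acts on the result is a separate question.

```javascript
// Hypothetical page address, for illustration only.
const page = "http://www.example.com/subscribe.cfm";

// Without cfoutput, the literal text "#imagePath#" ends up in the src.
// A value starting with # is treated as a fragment of the current page.
const resolved = new URL("#imagePath#", page);

console.log(resolved.pathname); // the path a second request would go to
console.log(resolved.href);
```

The resolved address keeps the path of the page, so any request made for that "image" is really another request for the template.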

Thursday, July 31, 2008

pickIE pickIE

I spent yesterday looking for a nice solution for drag selecting multiple items on a page. After trying several keywords to try and find what I was actually looking for (drag and drop seemed to dominate the results), I finally stumbled across this solution: http://drjavascript.com/drag-select/.

I downloaded the script keeping the copyright intact, plugged it into my own demo page and it works perfectly.... in FF. I happily pulled up an IE browser and got the ominous error message sound with an always helpful "Operation Aborted".



Thanks IE for the specific error message. I'll be sure to track that down.

I noticed that it worked fine on the originating site, but it was dying on my test site. That had me scratching my head. So I proceeded to cut and paste function by function back into the script tags to see exactly what syntax it was bombing on. It finally bombed on a certain function in the script. But when I left that function in and cut out a previous function, it wouldn't bomb. So again I was back to scratching my head.

Well, after flailing for a while, I decided to leave that rabbit trail and focus more on the ambiguous "operation aborted". After varied search results I came across a statement saying that IE doesn't like you changing the DOM before it is done writing it completely. I went back to my script and looked for things that were changing the DOM. There were two sections of code where it was doing a document.body.appendChild( obj ). When I commented those lines out, all was fine.

Ok... so that was it... now how to fix it.

Now that I could google with a more accurate sense of the problem, I found that you can add a defer attribute to the script tag like so:

<script type="text/javascript" language="javascript" defer="true">

This allowed the entire DOM to load before running the drag select init code. One more question remained, however. Why did the originating site work and my test site not? In our test framework, the test template that I am using doesn't have the final word as to what the DOM looks like. We run some post-execution code after the core content is generated. So since the drag-select code was initialized in the core content, the DOM was still being worked on when the init code ran. That would explain the difference between the two sites.
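A different way to get a similar effect, not the one used above, is to hold the init code until the window's load event fires. Below is a minimal sketch under that assumption. The helper name runAfterLoad and the init function name initDragSelect are hypothetical, and the window object is passed in as a parameter so the helper can be exercised outside a browser.

```javascript
// Run init only once the page has finished loading.
// "win" is the browser's window object (or a stand-in with the same members).
function runAfterLoad(win, init) {
  if (win.document && win.document.readyState === "complete") {
    init(); // page already finished loading, safe to touch the DOM now
  } else if (win.addEventListener) {
    win.addEventListener("load", init, false); // standards-based browsers
  } else if (win.attachEvent) {
    win.attachEvent("onload", init); // older versions of IE
  }
}

// In a page this would look something like:
//   runAfterLoad(window, initDragSelect);
// where initDragSelect (hypothetical) is whatever ends up calling
// document.body.appendChild( obj ).
```

Either approach comes down to the same idea: nothing appends to document.body until the browser is done building it.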

Hope this helps...

Monday, June 16, 2008

No Cookie for You!!



UPDATE:
A better explanation has surfaced. See this post for more info http://jochem.vandieten.net/2008/07/03/reserved-names-for-cookies/
I am leaving this entry as is as it still solved our issue. Call it a "work around".


So we've been noticing a lot of entries in one of our logs that look like this:
06/16 16:27:08 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:27:08 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:27:54 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:27:54 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:28:34 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:28:34 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:29:22 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:29:22 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:30:13 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:30:13 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:31:11 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT
06/16 16:31:11 error Cannot create cookie: expires = Wed 09-Jun-2038 16:17:42 GMT


I kept seeing postings from a couple of years ago that people would run into this issue as a result of a memory shortage in the JVM. This was not the case with us however as we have 400+ megs of memory free on average.

I decided to try and pair up the times of these errors with our IIS logs to see what I could find. I did find a matching entry that looked like this:
2008-06-16 00:00:01 W3SVC762671395 xxx.xxx.xxx.xxx GET /xxx/ xxx=5773E359497D4F1B 80 - 12.129.9.198 HTTP/1.1 Botster.LinkChecker/v.1.0 CFID=xxxxxxxx;+expires=Tue+08-Jun-2038+16:28:07+GMT;+CFTOKEN=xxxxxxx;+expires=Tue+08-Jun-2038+16:28:07+GMT;+JSESSIONID=xxxxxxx - www.xxxxxxxx.com 200 0 0 526 356 453

Not sure what to do about it yet. We could block Botster.LinkChecker at the IIS level as we have an ISAPI tool in place. But I'm wondering if there is something we can do at the code level too. The section of the log entry with +CFID, +expires, etc.. is part of the cs(Cookie) area. I'm assuming this is the area that reveals which cookies were requested to get set. Is it possible to check the user-agent for Botster and then redirect at the code level? I don't remember the order of operations there.

The rest of the story...
After looking at this for another day I finally realized what was going on. The key to this all lies within CF's client management. When the bot is hitting our site, cf wants to establish a session so it tries to write two permanent cookies, cfid and cftoken, to the visiting client. The problem comes in when bots don't allow the setting of permanent cookies. CF was noticing this and writing an error entry a couple times every minute, as often as the bot was hitting us.

CFID=7857398;
+expires=Thu+10-Jun-2038+13:51:18+GMT;
+CFTOKEN=1959af634c5b6ec3-96CB3B7F-17A4-A7F4-705CBC2B0F9052AA;
+expires=Thu+10-Jun-2038+13:51:18+GMT;
+JSESSIONID=2e3040ee771260d33446

The light turned on for me when I broke down the cs(Cookie) area of the IIS log entry like this. It looks like the first item in the ';' delimited list is the cookie that was requested. If it is followed by an expires, that is a modifier to the previous item in the list. So when I saw this I realized that it was CF that was trying to set the cookie, not some renegade bot code. Notice that JSESSIONID doesn't have the expires modifier because it is always set as a session cookie and goes away when the browser is shut down (hence, not an issue for bots). I think it's helpful to break the info down to simple lines, otherwise you get lost in the horizontal scrolling.
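The same breakdown can be done with a few lines of script instead of by hand. This is only a sketch of the reading described above (each expires entry belongs to the item before it), not a general cookie parser, and parseCookieField is a made-up name.

```javascript
// Split an IIS cs(Cookie) value into cookies, attaching each "expires"
// entry to the cookie that comes right before it in the list.
function parseCookieField(field) {
  const cookies = [];
  // IIS logs write spaces as "+", so turn them back into spaces first.
  for (const part of field.replace(/\+/g, " ").split(";")) {
    const item = part.trim();
    if (item === "") continue;
    const eq = item.indexOf("=");
    const name = eq === -1 ? item : item.slice(0, eq);
    const value = eq === -1 ? "" : item.slice(eq + 1);
    if (name.toLowerCase() === "expires" && cookies.length > 0) {
      cookies[cookies.length - 1].expires = value; // modifier for previous item
    } else {
      cookies.push({ name: name, value: value, expires: null });
    }
  }
  return cookies;
}

// The cs(Cookie) value from the log entry broken down above.
const field = "CFID=7857398;+expires=Thu+10-Jun-2038+13:51:18+GMT;" +
  "+CFTOKEN=1959af634c5b6ec3-96CB3B7F-17A4-A7F4-705CBC2B0F9052AA;" +
  "+expires=Thu+10-Jun-2038+13:51:18+GMT;+JSESSIONID=2e3040ee771260d33446";

for (const c of parseCookieField(field)) {
  console.log(c.name, c.expires === null ? "(session cookie)" : "expires " + c.expires);
}
```

Run against the entry above, this lists CFID and CFTOKEN with their 2038 expiry and JSESSIONID as a session cookie, which is the same picture as the hand-made breakdown.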

There are two solutions for this that I looked at. The first, which I mentioned before, was filtering the bot at the ISAPI level. The second was using J2EE session management. We already had this turned on, but we must have forgotten that setclientcookies in the cfapplication tag defaults to true. Jsessionid replaces the need for cfid/cftoken, though CF still seems to append cfid/cftoken to the URLs. Cflocation will append these variables if you have addtoken=true. We can solve our problem by simply adding setclientcookies=false to the cfapplication tag for the app.

FYI: if you are using cfloginuser, cfid/cftoken perm cookies will still be set if your cfapplication tag has loginstorage set to cookie. This seems to override the setclientcookies setting. I haven't tested it yet, but I bet if you set loginstorage="session", you would not see the perm cookies being set. This is not a concern for us at the moment since the bot is not trying to log in, it's simply hitting publicly accessible data.

Additional info on J2EE session management can be found here:
How to write CFID and CFTOKEN as per-session cookies
http://kb.adobe.com/selfservice/viewContent.do?externalId=tn_17915

Hiding / Encrypting ColdFusion CFID And CFTOKEN Values
http://www.bennadel.com/blog/785-Ask-Ben-Hiding-Encrypting-ColdFusion-CFID-And-CFTOKEN-Values.htm

Friday, June 6, 2008

16 thread pile-up...

We have been having issues with one of our servers going down. The offending action seemed to be happening in the middle of the night some time. One of the symptoms was a sql table that was locked and the site would just spin when trying to access it. It was either that or the cf service was plain dead in the morning. After further review in the logs, I found these scattered throughout:
06/05 23:49:53 Error [jrpp-57] - Error Executing Database Query.[Macromedia][SQLServer JDBC Driver]Connection reset by peer: socket write error The specific sequence of files included or processed is: X:\xxx\xxx\xxx.xxx, line: 348
removeOnExceptions is true for xxx. Closed the physical connection.

followed by a bunch of

java.lang.RuntimeException: Request timed out waiting for an available thread to run. You may want to consider increasing the number of active threads in the thread pool.
at jrunx.scheduler.ThreadPool$Throttle.enter(ThreadPool.java:116)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:425)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
Notice that the connection to sql was being reset when CF was attempting a sql execution. It didn’t look like the same line of code every time, but there is one line that is showing up more often. While researching the error in google, I came across a couple of possible causes. One was a possible flaky network that was dropping the attempted sql. While I was checking on the sql server box for a record of possible network issues in the sql log, I found that there were some sql errors that were occurring at the exact same time as our CF woes. Here they are:
Event Type: Error
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17066
Date: 6/5/2008
Time: 11:49:50 PM
User: N/A
Computer: xxxxxx
Description:
SQL Server Assertion: File: <lckmgr.cpp>, line=9421 Failed Assertion = 'NULL == m_lockList.Head ()'. This error may be timing-related. If the error persists after rerunning the statement, use DBCC CHECKDB to check the database for structural integrity, or restart the server to ensure in-memory data structures are not corrupted.

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 3624
Date: 6/5/2008
Time: 11:49:50 PM
User: N/A
Computer: xxxxxx
Description:
A system assertion check has failed. Check the SQL Server error log for details

Event Type: Error
Event Source: SQLSERVERAGENT
Event Category: Alert Engine
Event ID: 318
Date: 6/5/2008
Time: 11:49:55 PM
User: N/A
Computer: xxxxxx
Description:
Unable to read local eventlog (reason: The parameter is incorrect).
(Link: http://go.microsoft.com/fwlink/events.asp.
)
After looking at a couple of these errors on google, I saw one instance where it was solved by moving to SP1 for SQL2k5. We recently upgraded to SQL 2K5 and are still on SP0. There may be an issue with sp0 where it doesn’t handle locks correctly and it may be dumping there. Notice this line… SQL Server Assertion: File: <lckmgr.cpp>, line=9421 Failed Assertion = 'NULL == m_lockList.Head ()'. The lock isn’t being handled correctly for some reason.

We have opted to try moving up to SP1. Think we can trust it? It's only been out since 2006 ;). We'll see what happens.

UPDATE (6/10/08): our server has been up for 2 straight week days with no issues at all. In fact, the site has been reported to be responding much faster. A lot of SP1's updates were efficiency related so things should naturally respond better after upgrading.

Thursday, June 5, 2008

CF Mail Spooler Woes...

We have been having some issues with our mail server recently. Mail hasn't been going out and this is what we noticed in the logs:

"Error","scheduler-4","06/03/08","10:33:24",,"Could not connect to SMTP host: xxx.xxx.xxx.xxx, port: 25; nested exception is: java.net.ConnectException: Connection refused: connect"
javax.mail.MessagingException: Could not connect to SMTP host: xxx.xxx.xxx.xxx, port: 25;
nested exception is:
java.net.ConnectException: Connection refused: connect
and
"Error","scheduler-1","04/02/08","03:30:56",,"A problem occurred when attempting to deliver mail. This exception was caused by: coldfusion.mail.MailSpooler$SpoolLockTimeoutException: A timeout occurred while waiting for the lock on the mail spool directory.."
From what I found, this occurs for two reasons:
  1. Bombage from sending a huge email
  2. Insufficient disk space
Reference: http://www.talkingtree.com/blog/index.cfm?mode=entry&entry=67FD4A34-50DA-0559-A042BCA588B4C15B

We did notice a message the other day claiming that we were running low on disk space.
There is a recent hotfix from Adobe, released in April '08, that supposedly helps with this and supersedes a previous hotfix in this area. That hotfix can be found here: http://kb.adobe.com/selfservice/viewContent.do?externalId=kb402001&sliceId=1
We applied the hotfix today and cleared some disk space so we'll see what happens.

Update: after getting our hands dirty today, KP found that there was a backup of 10,000+ emails in the inetpub/mailroot/queue directory. Something is getting hosed at that level. He found this by telnet'ing to the server's port 25 and sending a message. It told him that the message had been queued for delivery. He never did get the message which told us that the bottleneck was at the queue. We've got a ticket opened up to check it out.