On a consulting gig, I was recently asked to investigate a strange problem with a Lucene server on Windows Server 2003.

The Lucene index was periodically refreshed by starting a new instance of the app, then killing the old one via "taskkill". This worked fine, except that the available memory displayed by Task Manager steadily decreased with every app refresh, and the server would run out of memory every 2-3 days. However, just killing _all_ java processes magically reclaimed the memory, and available memory would immediately jump back up to the correct amount.

Well, it turns out that the culprit was the Windows file system cache, combined with the default Windows Server 2003 settings, which give priority to the System Cache over application processes. So we ended up with a huge file system cache and not enough memory to start a new application process.

Here are some links that were helpful in troubleshooting this problem and reaching the eventual solution.

Initially I was trying to find where the missing memory went, since the "Available physical memory" value didn't match the process totals shown on the "Processes" tab.
The article "Everything you always wanted to know about Task Manager but were afraid to ask" helped me determine that "Commit Charge" was what I was really interested in, and that value did indeed match the process totals. So it wasn't a memory leak after all.

After taking another look at Task Manager, I realized the biggest culprit was System Cache.

http://smallvoid.com/article/winnt-system-cache.html gives an overview of the Windows system cache and, in particular, lists tools for controlling its size, among them CacheSet and SetSystemFileCacheSize.

Both CacheSet and SetSystemFileCacheSize succeeded in setting an upper limit on the file cache size, and that solved our problem of the missing memory.
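
For anyone who would rather script the limit than run a tool by hand, here is a rough sketch of calling the Win32 `SetSystemFileCacheSize` function from Python via `ctypes`. This is not what we actually ran on the server; the helper names `mib` and `limit_file_cache` are made up for illustration, and the sizes in the usage comment are placeholder values, not the ones we used. It also assumes the calling process already has `SeIncreaseQuotaPrivilege` enabled in its token, which the sketch does not do for you, so expect the call to fail with a privilege error otherwise.

```python
import ctypes
import sys

# Flag value as defined in the Windows SDK header winnt.h:
# makes the maximum file cache size a hard limit.
FILE_CACHE_MAX_HARD_ENABLE = 0x00000001


def mib(n):
    """Convert mebibytes to bytes (hypothetical helper)."""
    return n * 1024 * 1024


def limit_file_cache(min_bytes, max_bytes):
    """Ask Windows to cap the system file cache working set.

    Hypothetical wrapper. Assumes the caller already holds an enabled
    SeIncreaseQuotaPrivilege; this sketch does not adjust the token.
    """
    if min_bytes >= max_bytes:
        raise ValueError("minimum cache size must be below the maximum")
    if sys.platform != "win32":
        raise OSError("SetSystemFileCacheSize is only available on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    set_size = kernel32.SetSystemFileCacheSize
    # BOOL SetSystemFileCacheSize(SIZE_T min, SIZE_T max, DWORD flags)
    set_size.argtypes = [ctypes.c_size_t, ctypes.c_size_t, ctypes.c_uint32]
    set_size.restype = ctypes.c_int

    if not set_size(min_bytes, max_bytes, FILE_CACHE_MAX_HARD_ENABLE):
        # Surface the Win32 error (for example, privilege not held).
        raise ctypes.WinError(ctypes.get_last_error())


# Example call with placeholder sizes, to be run from an elevated
# process on the server:
#   limit_file_cache(mib(64), mib(512))
```

As far as I know, a limit set this way does not survive a reboot, so it would have to be reapplied at startup, for instance from a scheduled task.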