Today we noticed a weird error: our CMS software was unable to write files to the server. After investigating, it appears the /tmp directory had filled up the hard drive, which prevented any further file writes.
We rebooted the server, and the reboot took 20-30 minutes. During that time, the AWS boot logs were saying:
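For anyone who wants to check for the same problem before it bites, here is a minimal sketch of how one might measure it, assuming Python 3 is available on the server. The function names `summarize_dir` and `percent_used` are made up for illustration; this is not what we actually ran, just one way to count the files under /tmp and see how full the disk is.

```python
import os
import shutil


def summarize_dir(path="/tmp"):
    """Return (entry_count, total_bytes) for non-directory entries under path.

    Hypothetical helper for illustration. Entries that vanish or cannot be
    read while we walk the tree are skipped.
    """
    entry_count = 0
    total_bytes = 0
    # onerror swallows unreadable subdirectories instead of raising.
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                # lstat so a symlink is measured itself, not its target.
                info = os.lstat(full_path)
            except OSError:
                continue
            entry_count += 1
            total_bytes += info.st_size
    return entry_count, total_bytes


def percent_used(path="/"):
    """Return how full the filesystem containing path is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

Calling `summarize_dir("/tmp")` and `percent_used("/")` from a cron job or a monitoring script would give an early warning well before writes start failing. Note that the byte total is the sum of file sizes, which can differ somewhat from the space actually allocated on disk.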
"A start job is running for Create Volatile files and directories"
As it turns out, the /tmp directory on this server is only cleared on reboot. With so many temporary files on the disk, the server needed that long to delete them all before coming back online.
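If the cleanup only happens at boot, one option is to prune old files on a schedule so they never pile up. The following is a rough sketch under a few assumptions: Python 3 is installed, a file's modification time is a good enough signal that it is stale, and nothing still running depends on those files. The names `find_stale_files` and `remove_files` are hypothetical, and the age threshold is something you would have to pick for your own applications.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_stale_files(path, max_age_days, now=None):
    """Return paths under `path` whose mtime is older than max_age_days.

    Hypothetical helper for illustration. Nothing is deleted here, so the
    result can be reviewed first as a dry run.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                if os.lstat(full_path).st_mtime < cutoff:
                    stale.append(full_path)
            except OSError:
                # The file disappeared or is unreadable; skip it.
                continue
    return stale


def remove_files(paths):
    """Delete each path, ignoring failures, and return how many were removed."""
    removed = 0
    for file_path in paths:
        try:
            os.remove(file_path)
            removed += 1
        except OSError:
            pass
    return removed
```

For example, `find_stale_files("/tmp", 7)` would list everything untouched for a week, and passing that list to `remove_files` would delete it. I would look at the dry-run list before deleting anything, since some programs keep long-lived files in /tmp. This sketch also leaves empty directories behind.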
I learn something new every day.
The other week we upgraded an Ubuntu 12 server to 14. Before upgrading the distribution, we updated all packages and rebooted. That reboot took around 1.5 hours. I bet we ran into the same scenario, with the /tmp directory loaded with files. That would make sense, since that particular server hadn't been rebooted in a year or more.
Perhaps reboots need to happen on a more regular basis? Are there any best practices I'm not aware of?