Tuesday, 12 January 2010

You MUST back up Linux PCs

Lots of people know that when you delete a file on a computer, you do not remove the contents of the file, you simply remove its "index" entry. Certain programs can forensically scan a disk and recover this data. However, as I found out the other day, this process is VERY slow and rarely fully effective. The journalling system gives you a few extra chances, but that is hit and miss.
Why? Well, for a few reasons. Firstly and most obviously, the file data on the disk might by now have been overwritten by a new file, which is one reason to shut down as soon as you cock something up. You might even want to kill the power, since a clean shutdown can cause log writes etc. to overwrite parts of the disk. Onto more detail: small files are often stored in adjacent areas on the disk. In addition, certain file types have known markers (e.g. a JPEG starts with the bytes FF D8 and finishes with FF D9), so small images can often be recovered easily. Large files, however, are often broken into chunks and put anywhere on disk, although within a certain boundary. When you delete a file in ext3 or ext4, the information about where these chunks are is lost (unless you are fortunate enough to have updated the file recently and can find the info in the journal).
At best you know that the file exists somewhere in unallocated disk space, so you can dump all of that space to a single file. You can then hope that the blocks are contiguous, which to be honest is likely for small files and unlikely for large ones. You can then scrape this data to find beginning and end markers, which again only works for certain document types (although if you know some of the contents of your file, you can search for that instead). If the blocks are broken up, then the best case is that you get parts of other files inside your recovered file, which might or might not be fixable. The worst case is that you cannot put your file back together at all (e.g. you will not be able to manually scrape binary files with unreadable content).
Another issue is that the scraping can produce hundreds of thousands of recovered files: anything you have ever deleted can turn up, and if you have moved something you might even end up with several copies of it.
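The marker scraping described above can be sketched in a few lines of Python. This is only a minimal illustration, not what any real recovery tool does: it assumes the unallocated space has already been dumped and read into memory, that each JPEG's blocks are contiguous, and that the first end marker after a start marker really is the end of that image (an embedded thumbnail would break this). The function name carve_jpegs and the max_size limit are made up for the example.

```python
from typing import List

# JPEG files begin with FF D8 FF and end with FF D9.
JPEG_START = b"\xff\xd8\xff"
JPEG_END = b"\xff\xd9"


def carve_jpegs(dump: bytes, max_size: int = 10 * 1024 * 1024) -> List[bytes]:
    """Return every byte run that looks like a contiguous JPEG.

    Assumption: the blocks of each image are adjacent in the dump.
    Fragmented files will come out truncated or mixed with other data.
    """
    found = []
    pos = 0
    while True:
        start = dump.find(JPEG_START, pos)
        if start == -1:
            break  # no more start markers
        # Look for the end marker, but only within max_size bytes of the start.
        end = dump.find(JPEG_END, start + len(JPEG_START), start + max_size)
        if end == -1:
            pos = start + 1  # no plausible end; keep scanning after this start
            continue
        found.append(dump[start:end + len(JPEG_END)])
        pos = end + len(JPEG_END)
    return found
```

Each returned chunk could then be written out to its own file and opened in an image viewer to see whether it is actually intact.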
The police might find this useful because they only need to recover a few images or documents to incriminate someone, but when you are after specific files it is more miss than hit.
The moral: make sure you back up all your important files. I don't know how mine were deleted (not exactly anyway), so it can and does happen!

CacheItemRemovedCallback not firing

I was getting a bit annoyed using HttpRuntime.Cache in an ASP.NET app. I really needed the cache since I was generating reports, and doing these each time can be very slow (~30 secs). The problem was that the cache never seemed to expire.
When caching, I used the Add() function to add a placeholder of the string "caching" and then started a thread to do the work. This was to prevent the same item being generated more than once at a time. Inside the thread function, after generating the report, I used the indexer [] brackets to replace the "caching" placeholder with the generated report. This worked in as much as it was caching, but it never expired. I added a CacheItemRemovedCallback function just to check, and it was never called.
After trying loads of things, I found that assigning through the indexer [] calls Insert but does NOT re-use the expiry settings from the original Add call, and possibly not the dependency or removed callback either. It sets the item to no expiry.
All I had to do was call the Insert function directly and give it the expiry I wanted and all was well again. Another major omission from the MSDN docs but at least now you know!
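To make the behaviour concrete, here is a toy model of it in Python. It is NOT the ASP.NET API, just a small stand-in written for this post: the ToyCache class, its method names and the expires_at parameter are all made up, and time is passed in by hand so the example is repeatable. It only mimics the one thing that bit me, which is that assigning through the indexer replaces the whole entry with one that has no expiry.

```python
from typing import Any, Dict, Optional, Tuple


class ToyCache:
    """Hypothetical stand-in for a cache with per-item expiry."""

    def __init__(self) -> None:
        # key -> (value, expiry time or None for "never expires")
        self._items: Dict[str, Tuple[Any, Optional[float]]] = {}

    def insert(self, key: str, value: Any, expires_at: Optional[float] = None) -> None:
        # Always replaces the entry, including its expiry setting.
        self._items[key] = (value, expires_at)

    def add(self, key: str, value: Any, expires_at: Optional[float] = None) -> bool:
        # Only stores the item if the key is not already present.
        if key in self._items:
            return False
        self.insert(key, value, expires_at)
        return True

    def __setitem__(self, key: str, value: Any) -> None:
        # The trap: the indexer form is an insert with NO expiry.
        self.insert(key, value)

    def get(self, key: str, now: float) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._items[key]  # expired, so drop it
            return None
        return value


cache = ToyCache()
cache.add("report", "caching", expires_at=60)  # placeholder with an expiry
cache["report"] = "report data"                # silently loses the expiry
stuck = cache.get("report", now=1000)          # still there long after t=60

cache.insert("report", "report data", expires_at=60)  # the fix: explicit expiry
gone = cache.get("report", now=61)                    # expired as intended
```

In the real app the equivalent of the last two lines is calling Insert directly with the expiry you want, rather than assigning through [].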

Monday, 11 January 2010

A New IDE Would be Nice

Most things are like church. You carry on doing things the same way you always have until someone asks why and no-one really knows. You just have.
Take the trusty Integrated Development Environment (IDE). I use Visual Studio mostly, and to be honest it isn't bad from a usability point of view. However, I have been reminded recently how easy it is to introduce bugs into code because, at the end of the day, these IDEs are generally little more than glorified text editors. OK, we all know that in theory our design tools generate code for us so we can't make mistakes, but in reality we do. We change the name of something and the best we get is an auto-prompt asking which word we want. If we change a name then maybe the compiler will find any dependencies and show us errors. At best this is time consuming, but at worst, if you are using late-bound technologies, reflection etc., your program can fail at run-time.
Rather than a text editor, it would be nice to see something that I guess is more like a design tool, one that treats fields and functions as symbolic objects that are linked symbolically rather than by name. That way they can be renamed willy-nilly and can have properties (meta-data) attached, such as pictures, descriptions etc., forming something much richer and more useful than plain text. Hopefully, eventually, it would also be more robust and harder to break when we modify and refactor. Maybe it could even prevent us making errors in code, since it would not permit broken symbolic links or let us leave pointers uninitialised, and it would lead to high quality software that doesn't have the usual umpteen thousand bugs when it goes into system test for the first time.