Thursday, 31 January 2013

Clearing the Postback Response in ASP.Net

Most of us don't think too much about the bandwidth usage of our applications, since the first rule of optimization is: don't. If we want to speed things up we might just gzip the pages, but I am working on a system which is massively multi-user, so any saving I can make in the response is potentially a large saving in bandwidth - something that matters on cloud services which charge for it.
I realised, much to my surprise, that if you call Response.Redirect in the code-behind after a postback, all this does is add some html to the response like this:

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/blah/blah.aspx">here</a>.</h2>
</body></html>


...but the body of the page is still returned after this html even though the browser will shortly be redirecting. This effectively doubles the bandwidth requirement of the system, since every page is returned once on the GET and again on the POST. I decided to write a method that all pages could call for redirects, which returns the redirect html and nothing else. After playing around with the various similar and confusing methods available in ASP.Net, I ended up with this:


internal static void Redirect(HttpContext context, string location)
{
    // Throw away anything already buffered into the response.
    context.Response.Clear();
    // Write the redirect html; "false" avoids the ThreadAbortException
    // that Redirect would otherwise throw.
    context.Response.Redirect(location, false);
    // Skip remaining pipeline events/filters that could append data.
    context.ApplicationInstance.CompleteRequest();
    // Ensure the page's default Render method is never called.
    context.Response.End();
}

The first line clears anything that is already in the response. The second effectively generates the html that will cause the redirect, and "false" means it won't throw a thread abort exception in the process. The third line ensures that no other filters or pipelines will be called (which might add additional data to the response), and the last ensures that the default Render method, which would put the page into the output, is not called.
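
For reference, a page's code-behind would call it during a postback something like this (a minimal sketch - "PageHelper" is just an invented name for whatever class holds the method):

protected void SaveButton_Click(object sender, EventArgs e)
{
    // ... process the postback, save changes etc. ...
    // Context is the Page's HttpContext property.
    PageHelper.Redirect(Context, "/blah/blah.aspx");
}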

Monday, 28 January 2013

How are passwords cracked?

Introduction

Every day it seems there is a new article in the newspaper about how some system was hacked and passwords stolen. How is this achieved and how can you make your passwords stronger so it is less likely that you will be a victim?
Sadly, most cases of password hacking are not carried out by people with advanced skills and telephone repairman uniforms who connect crocodile clips to various electronic panels inside buildings. Almost always, an easily preventable flaw in the design of a web site allows an attacker to gain information which can then be used to obtain the database behind the site, which stores the details of the passwords. The attacker then copies this data and processes it on a system somewhere until the passwords are cracked. Sometimes the breach is made public, but I suspect it often is not, either because the victim did not know about it or because of the fear of lost reputation or legal proceedings.
Anyway, I want to describe in straight-forward terms how passwords are hacked. I will not describe how the original site might be hacked, simply what happens after the database is obtained.

Web Sites and Databases

Firstly, "database" is a technical term for something that might be represented as a spreadsheet or several sheets of squared paper. Each sheet represents what is called a table, and one of these will probably be called "User" or something like that (the name is not important). Inside this "User" table is a row for each user, and each column contains some information about that user such as name, age, email address, user name, password etc. The site might collect all kinds of different information and store it in the database, but we will assume for this example that each row contains only a user name, password and email address. Something like this:

 User name  password     email address
=======================================
 lb1        Password123  luke@gmail.com

If the site is poorly written, the passwords are simply stored as plain old text and can be read or used at will by the attacker. The most dangerous aspect of any password system is that your email address and password will very likely grant access to many other accounts on other sites, since most people reuse passwords. Obviously, in this case, it doesn't matter how good your password is, since the attacker has to do no work other than hack the web site and read the data.

Hashing

Most sites, in my experience, will not store the passwords in plain text but will do something called hashing, which turns the plain-text password into something that looks completely different. If we use a method called MD5 (don't worry about the name) then the password "Password123" always becomes "42f749ade7f9e195bf475f37a44cafcb". Why is that useful? Well, firstly, you cannot tell what the password is directly, since there is no obvious connection between the password and this "hash" code, but more importantly, a good hashing method makes it 'unfeasible' (very hard) to compute the original password just from knowing the hash. When a user logs into the web site, we don't need to check the actual password; we can compare the hash of the password they type in with the hash of the password they registered with. Since the hash always produces the same output for a given password, this works.
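
If you want to see this for yourself, here is a minimal C# sketch using the framework's built-in MD5 class (the class and method names are my own; the hash output is the one quoted above):

using System;
using System.Security.Cryptography;
using System.Text;

class HashDemo
{
    // Hash a string with MD5 and return the result as lower-case hex.
    static string Md5Hex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            StringBuilder sb = new StringBuilder();
            foreach (byte b in hash)
                sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }

    static void Main()
    {
        // Prints 42f749ade7f9e195bf475f37a44cafcb
        Console.WriteLine(Md5Hex("Password123"));
    }
}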

Hashing Weaknesses

Sadly, although hashing sounds wonderful, its biggest problem is that hashing methods are generally all publicly available, so I can do something called a reverse-lookup attack. What this means is that I compute thousands or millions of hash codes from obvious passwords like password, password123, letmein etc. and store all of these in a large computerised lookup table. If I then obtain a hashed password like "42f749ade7f9e195bf475f37a44cafcb", I can look it up in my pre-computed table and hopefully find a match.
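
As a toy sketch of the idea (reusing the Md5Hex helper above and assuming a using for System.Collections.Generic; real lookup tables hold millions or billions of entries):

// Pre-compute hashes of likely passwords, then index stolen hashes.
static void Demo()
{
    var candidates = new[] { "password", "password123", "letmein", "Password123" };
    var lookup = new Dictionary<string, string>();
    foreach (string pwd in candidates)
        lookup[Md5Hex(pwd)] = pwd;                // hash -> password

    string stolen = "42f749ade7f9e195bf475f37a44cafcb";
    string found;
    if (lookup.TryGetValue(stolen, out found))
        Console.WriteLine("Cracked: " + found);   // Cracked: Password123
}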

Defeating Reverse Lookups with Salt

Salt describes some additional text that we add to the password, both when it is first registered and also when the user logs in, after which we carry out the same matching process as before to verify the password. Why does this work differently than before? Let us take the example of Password123 which we already used, but this time I will add some random data to the end of it, like (&*^(, and hash it again. This time, the hash produced with MD5 is "8bafefe15f21e75dd0e084ecd25752b2", which is not related at all to the original hash we produced ("42f7..."). Note that the password is still the same; the user does not have to type the random data in, it is added automatically by the web site. This works pretty well and means that if somebody is attempting a reverse lookup on this hash, they are unlikely to have pre-computed an effective password of "Password123(&*^(". If the salt is long enough then the chances of cracking any password with a reverse lookup are very low, however simple that password is.
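
In code the change is tiny - the salt is simply appended before hashing (a sketch reusing Md5Hex; the salt value is the example from the text):

const string SiteSalt = "(&*^(";

static string StoredHash(string password)
{
    // "Password123" -> 8bafefe15f21e75dd0e084ecd25752b2
    return Md5Hex(password + SiteSalt);
}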

Defeating Salt with Human Behaviour

Although the salt system appears to add enough randomness to prevent what would appear to be the only type of attack someone can perform, it does have a weakness, and it relates to human beings not liking passwords and therefore many people using the same passwords as each other. In order for the salting above to work, the random data must always be the same so that you can match the hashes to determine whether the password is correct. In other words, if 10,000 people use the same password then, even with salt, the hash value for each will be the same. Why is this useful? If you have enough users in the data you have stolen, you can determine which hashes are the most common and compare these to, say, the 10 passwords that people most commonly use: password, 123456, 12345678, abc123, qwerty, monkey, letmein, dragon, 111111, baseball. As an attacker you then have a much simpler task: take the password "password", add candidate random data to the end of it and compute the hash of each combination until you get a match to one of the 10 most common hashes in your stolen data. So compute the hash of "password", "password0", "password1", ... "passwordabfgdhjk" etc. Naturally we would assume the salt wouldn't be something stupid like "salt", but you never know!
This attack is quite straight-forward and can be accomplished in anything from seconds to minutes. Importantly, once the salt that is added to the password has been determined from one password, it can be used in the calculation of all the other passwords, since the one piece of truly secret data has been recovered.
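
To make that concrete, here is a rough sketch of recovering a fixed, appended salt given one common password and a hash from the stolen data (assuming, for brevity, that the salt is short and drawn from a small alphabet - a real cracker would cover a far larger space, far faster):

static string FindSalt(string knownPassword, string targetHash, int maxLen)
{
    const string alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
    var queue = new Queue<string>();
    queue.Enqueue("");
    while (queue.Count > 0)
    {
        string salt = queue.Dequeue();
        if (Md5Hex(knownPassword + salt) == targetHash)
            return salt;                 // the site's "secret" salt
        if (salt.Length < maxLen)
            foreach (char c in alphabet)
                queue.Enqueue(salt + c); // try every one-char extension
    }
    return null;                         // not found within maxLen
}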

Defeating Human Behaviour with Variable Salt

There is hope, however, and it involves using a salt that is different for each user. There is not much point in making it different only for each password (for example repeating the password before hashing: "password" >> "passwordpassword" >> Hash) since this does not remove the patterns from the database. One simple example is to add the username to the password before hashing it: "password" >> "passwordlb1" >> hash. What this means is that a different user can have the same password as me but it won't produce the same hash, because the salt will be different. This removes the patterns from the stolen data, which makes the attacker's job harder, but cracking is still possible - it then comes down to brute force.
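
In code, the only change from the fixed-salt version is that the salt now comes from the user's own row (a sketch using the username-as-salt example from the text):

static string StoredHash(string password, string userName)
{
    // Two users with the same password now produce different hashes.
    return Md5Hex(password + userName);   // e.g. "password" + "lb1"
}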

Defeating Variable Salt with Brute Force

Brute force generally means running through all possible combinations of data to find the real value you are looking for. Since there are trillions of possible passwords, the chance of getting through them (or, statistically, half of them) is pretty small, so this all sounds good. There are two problems, however. Firstly, some password cracking systems are VERY fast and can crunch billions of passwords a second. Secondly, even though there are potentially very many passwords people can use, you can still assume that most of the user accounts will use one of the 100 most popular passwords. In this case, you know the input (or that it is one of 100 different values) and you know the hash value(s) from the stolen data, so all you have to do is work out the salt for just one password. That will probably reveal the mechanism by which the salt is added, which again opens the door to cracking the remainder.

Strong Passwords and Defeating Brute Force

Even if the attacker has gleaned, for example, that the user id is added to the end of the password before hashing, imagine they now want to determine the password for user xyz. Effectively they are attempting to solve Hash(password + "xyz") = stolen hash for the unknown password, and this is where strong passwords really win.
The attacker at this point has to compute hashes for as many combinations of passwords as they can think of and keep checking whether the result matches; if not, they try again. The attacker will build up dictionaries of common (and not so common) passwords like "password", "password123", "monkey", and also include dictionary words like "hotel" and "climate", along with all likely substitutions of capitals, numbers and punctuation such as "h0tel", "Hotel", "Hot3l" - you get the idea. This is why English dictionary words are bad, even if you think you are being clever with substitutions. Adding things like 123 to the end of your password is also common, so don't bother with that.
Once the attacker has exhausted their "dictionary", their only remaining option is to start with, say, "a" and then go through all characters before moving on to "aa" etc. Taking into account letters, capitals and punctuation, even an eight-character password has around 5e+14 combinations (5 with 14 zeros!), which take a long time to compute. Somebody might perhaps spend that time on a specific account they are trying to attack (like President Obama's) but is unlikely to bother for some random person's gmail account. In other words, long random passwords are good - so what are your options?
  1. Use a 'truly' random long password, say 16 characters, which you store in an electronic key chain like KeePass. This way you don't remember it, you just copy it from the key chain. Not so useful when out and about unless you have access to the key chain.
  2. Remember a sentence and use the first letters of each word to form the password. For instance, to do that with the first sentence in this bullet point would give you rasautfloewtftp which is nicely random (and long)!
  3. Use a long sentence if the site allows you to. Sadly, many sites restrict the password to something like 10 characters, but if not, you could use something like "ThisIsMyPasswordAndYouWillNeverRememberIt", although making it slightly more random or personal would help here.




Tuesday, 22 January 2013

How to deal with software suppliers

I have no sympathy for suppliers whose software is found to contain SQL injection vulnerabilities. Of course I feel sorry for the people who lose out because of software they buy/use and trust, but there is still a whole credibility issue with software and people not being able to tell the difference between good software and bad. I guess it is the same as car repair garages: they all look similar, but some are professional and good value while others are poor and expensive. What does this mean for software?
SQL injection attacks have been known about for ages and are so easy to avoid: setting relevant permissions on the database user (i.e. not using root/sa); using stored procedures where possible; using parameterised queries; even basic input sanitisation. Most of these will help massively by themselves, and together they make the system air-tight. That a company is still selling software with these basic and well-known security holes speaks volumes about its ability, credibility and motivation. It is also not acceptable to blame 'legacy' software: if you sell software it has to be fit for purpose, and if you have inherited legacy software you need to be able to extend or modify it to be suitable (or replace it if not). In many companies, it would seem, people are happy to milk existing software for as long as possible rather than investing in replacements/fixes/patches. So no sympathy there.
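
For example, the parameterised-query fix is a one-line habit with ADO.NET (a sketch - the table and column names are made up):

using System.Data.SqlClient;

static string GetUserId(string connectionString, string userNameInput)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "SELECT UserId FROM Users WHERE UserName = @name", conn))
    {
        // The input is sent as data, never spliced into the SQL text,
        // so it cannot change the structure of the query.
        cmd.Parameters.AddWithValue("@name", userNameInput);
        conn.Open();
        object result = cmd.ExecuteScalar();
        return result == null ? null : result.ToString();
    }
}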
What are we to do, however, as software buyers if we ourselves are not software savvy? Even if we are savvy, what is to stop the supplier promising lots of things that are either half-truths or even lies? What are we to do when we are talking about potentially thousands or even millions of pounds of software? As with most trades people we deal with, there are various industry standards, including generic ones like ISO 27001 (information security management) as well as software-specific resources like OWASP (the Open Web Application Security Project). If you are spending decent money, you must ask about these. If a supplier has none of them, ask them why you should trust them - be brutal. Personally, I wouldn't spend more than a few thousand pounds on software from someone with no accreditation. If they have an externally audited system like ISO 27001 then you have some recourse if things start to slip, but it will be limited and will probably not include any financial backup - the best you could expect is the auditor pulling the accreditation, which won't get your money back. Others like OWASP are voluntary, so even if the supplier applies it badly, or something happens that shouldn't theoretically be possible under OWASP guidance, there is still no direct comeback unless these issues are contracted against, which can get complicated and expensive (but worth it to some extent for the largest contracts).
Because software in many ways has lots of unknowns, most suppliers will not be happy to provide fixed prices except for simple systems or ones where the requirements can be locked down for two years, something which is often unworkable. Others will simply multiply the cost by five and use that as their fixed price!
Really, all you have is some kind of relationship that you need to foster. You need to be open to talking about money, and that includes things like "you didn't tell us you needed X and it will cost another 5,000", and be open about asking for cost breakdowns and justifications. For larger systems it might well pay to employ a dedicated Project Manager who understands software. Although some skills transfer from all project management, quite simply, unless the PM understands software they will not understand whether developing X should really cost Y. When the supplier says that a custom protocol for Ethernet programmed in C will cost 50,000 - would you know if that was ballpark?
Since security is often where you as a customer can be stung, it is not unreasonable to expect proof that software has been independently penetration tested. Most large bespoke systems should be tested as part of the delivery, but if you are buying off the shelf, why not visit the supplier's web site and find out whether the software has been tested?
Lastly, the other big sting can be ever-increasing costs: for the customer it can be deadly, while for the supplier it might be justified by ever-changing requirements (the curse of software development). There are three things here. Firstly, be open up-front about the likelihood of requirements changing and perhaps which parts might change. In other words, if your system produces a report whose format is likely to change, specify that, so the software can be designed to make the report easy to modify later rather than needing a rebuild. We assume most software is easy to modify but usually this is only true in certain areas. Secondly, be clear that you want a system that can deliver "the main thing" early on for user testing and feedback. Tell the supplier that you don't want a system that is undeliverable until zero-day when all the money is due, because that doesn't allow you to cut your losses if the system is taking too long and costing too much. Having the bulk of the functionality, e.g. half-way through the project, takes some pressure off the supplier, since it proves many live issues early on and allows additional functionality to be added later. Thirdly, be wary of massive customisation, which is often the part that takes all the time and money. Tweaking web pages is not as easy as most people think (depending on what you are changing), but if an off-the-shelf framework does most of what you want, try to adjust your business processes to suit it rather than spending thousands trying to modify something into something else - you can imagine this rarely ends up looking pretty and it costs loads, yet it is so common it is almost unbelievable. Suppliers need to educate customers and customers need to listen to their suppliers. If the supplier says X is hard (read: expensive), don't dig your heels in for the sake of it.
Personally, I would like to see proper software guilds that individuals or suppliers can join, something like an audit but more closely matched to the building trade guilds, where non-compliance results in insurance-backed compensation and mediation by independent persons. The result is you get a badge which says "I follow best practices" and "I get audited regularly". Many people are resistant to such things, but I think if joining is easy enough and not too expensive, it raises the credibility of companies and gives more peace of mind to customers.

Thursday, 17 January 2013

Azure Deployment Fails for SSL

I got the following error when trying to deploy an Azure project. This project had previously (successfully) been deployed but I was trying to move it to another subscription:

13:35:01 - Preparing deployment for AzureWeb2 - 17/01/2013 13:34:56 with Subscription ID 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' using Service Management URL 'https://management.core.windows.net/'...
13:35:01 - Connecting...
13:35:06 - Object reference not set to an instance of an object.
13:35:06 - Deployment failed with a fatal error

(The short answer is I hadn't uploaded ALL of the SSL certs to the service I was deploying to.)

The problem was not described in any more detail anywhere, and I couldn't find anything on Google about how to get more detail logged for this error, other than a suggestion that a manual deployment might help.

I worked out that I could upload to blob storage via the Server Explorer in Visual Studio and then "upload a deployment" from inside the management portal, which gave a more useful error:

"The certificate with thumbprint 01e3910d7f7cxxxxxxxxxxxxa1aa4b3de11c77f5 was not found."

This obviously explained the problem, but what was confusing was that although I had two certificates linked to the project, only one of them was specified on an endpoint (the second was a test certificate), so it seems strange that the deployment itself failed rather than failing at runtime. It would also have been nice if Visual Studio had displayed the same error as the management portal did!
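
If you hit something similar, a quick console sketch like this will list what is actually in the local certificate store, so the thumbprints can be compared against the role configuration and the portal (check LocalMachine\My instead if that is where you imported the certificate):

using System;
using System.Security.Cryptography.X509Certificates;

class ListThumbprints
{
    static void Main()
    {
        var store = new X509Store(StoreName.My, StoreLocation.CurrentUser);
        store.Open(OpenFlags.ReadOnly);
        foreach (X509Certificate2 cert in store.Certificates)
            Console.WriteLine(cert.Thumbprint + "  " + cert.FriendlyName);
        store.Close();
    }
}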

Anyway, as you might expect, once I also uploaded the second cert, it was all fine.

Thursday, 10 January 2013

Simple SSL Setup on Azure

Introduction

There are other, longer guides to this, but many of us already know the basics and don't need screen shots - just a simple list of steps to enable SSL on Azure. It is pretty easy. I will cover commercial SSL certificates first and then self-signed certificates - as you might imagine, there are lots of similarities.

Commercial Certificates


  1. Publish your site to Azure production so you know the IP address of the site. If you are using the Azure URL then write this down. If not, set up your own DNS to point your custom domain name at the relevant IP address of the Azure site. Note that this will put the site online, but you can either publish it with no endpoints or have some way of disabling access until SSL is set up (if required).
  2. Create an SSL certificate request in IIS7 by going into Server Certificates and selecting Create Certificate Request. Fill in the details; the Common Name is the URL you are securing, such as www.example.com. Although these fields are not all required by the standard, IIS will insist you fill them all in.
  3. On the next tab, choose Microsoft RSA and a bit length of 2048. On the next tab, specify a file name for the request to be written to such as c:\req.txt
  4. Go to your preferred SSL certificate site (I like gogetssl.com) and perform the certificate request by pasting in the contents of the text file you just wrote.
  5. Once you get the certificate back, install it in IIS by selecting Complete Certificate Request in the same place. The friendly name is useful when adding it to your Azure project.
  6. You now need to add the certificate's thumbprint to your web application/service (note this does NOT add the certificate itself to the project, just its thumbprint). Do this by double-clicking the relevant role in the Azure project to open the config and go into Certificates. Choose a friendly name, select "My" as the store name and click the ... next to the thumbprint field. You then need to choose the correct certificate from the list (this is where the cert's friendly name is helpful).
  7. Go into Endpoints and add the relevant endpoint for SSL (port 443 by default) and then select the certificate you just added. Optionally remove the non-SSL endpoint.
  8. Now you need to export the certificate from IIS with a password so it can be uploaded to Azure. In IIS, select the certificate you want to export and select Export on the right-hand side. Give it a file name (pfx extension by default) then type in a password to secure the certificate. You will tell Azure this password but anyone else who might be able to steal it cannot use it without this password.
  9. Once exported, go into Azure, select the web app, go to the Certificates tab and select Upload. Browse to the certificate and type in the password you chose when exporting. Once this is uploaded, it will display the thumbprint in the Azure window - this should be the same as the one you can see in your project configuration. What Azure will do automatically now (once you re-publish your app with the SSL config included) is grab the certificate you uploaded by matching its thumbprint to the one in the application config. The rest should just work. Note that the SSL cert is tied to the URL, not the IP address, so if you want to access SSL by multiple URLs, you will need to either forward all of them to the same URL or buy multiple SSL certificates and add all of them to the project as above.

Self-signed Certificates

When you are first testing applications, buying a commercial certificate is not always an option. If, however, you know what the ultimate URL will be and it isn't currently used on another site, it is much easier to buy a commercial certificate than to battle the errors you get through lack of trust.
  1. Note that if you are performing communication using SSL between multiple applications on Azure, I could not find a way to enable the calling application to trust the self-signed certificate of the application being called. In this case, you might want to consider a commercial certificate for the application/service being called (about $5 for one year using gogetssl)
  2. You can easily generate a self-signed certificate; I won't go into details here but there is a previous blog post: http://lukieb.blogspot.co.uk/2012/12/etc
  3. Once this has been generated, you go through the same process as per the commercial certificates, that is, link the certificate to the Azure project in the Role configuration and then upload it to the Azure portal.
  4. Note that, as expected, when you visit these sites, although the encryption will be secure, your browser (or other application) will warn you that the certificate is not trusted. This matters because anyone can self-sign a certificate and pretend to be your web site, so a man-in-the-middle attack is possible when using self-signed certificates. If you want to nail this down some more, you can set the issuer of your self-signed certificate to be trusted on your local PC, but this has potential implications: any virus on your local machine could then generate SSL certificates and spoof web sites with them. Note you can always delete the trusted certificate after you have tested; otherwise, as mentioned, just buy cheap commercial certificates and avoid the issue altogether.

Provider pattern

The provider pattern is a simple and useful way of enabling different implementations of the same interface and, particularly in asp.net, of using the built-in helper classes and web config to set or change these at run time.

There is a great (and pretty long) guide by Microsoft here: http://download.microsoft.com/etc If you are just interested in writing your own custom providers, scroll down most of the document, past the built-in providers (such as the role and membership providers).

The basics are straight-forward. You create a base class which inherits from System.Configuration.Provider.ProviderBase and defines abstract methods matching the interface you want to use. Specialise this base class into the various flavours (for instance, I have providers for Azure and FileSystem), as sketched below.

Each of the specialised classes must override Initialize and carry out sanity checks on the passed-in configuration as well as reading/initialising member variables from it (all the details of what to check are in the link). Note that when reading configuration values, it is best practice to remove each one after reading it; if you are left with any extras by the end of Initialize, it means there are invalid configuration values and you should throw an exception (this is also in the link).
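
As a sketch of the shape this takes (all names invented; an Azure flavour would follow the same pattern):

using System;
using System.Collections.Specialized;
using System.Configuration.Provider;

// Base class defining the contract every storage provider implements.
public abstract class StorageProvider : ProviderBase
{
    public abstract void Save(string key, string data);
    public abstract string Load(string key);
}

// One concrete flavour; an AzureStorageProvider would look similar.
public class FileSystemStorageProvider : StorageProvider
{
    private string _rootPath;

    public override void Initialize(string name, NameValueCollection config)
    {
        if (config == null) throw new ArgumentNullException("config");
        base.Initialize(name, config);

        // Read and then *remove* each expected value...
        _rootPath = config["rootPath"];
        if (String.IsNullOrEmpty(_rootPath))
            throw new ProviderException("rootPath is required");
        config.Remove("rootPath");

        // ...so anything left over is an unrecognised attribute.
        if (config.Count > 0)
            throw new ProviderException(
                "Unrecognised attribute: " + config.GetKey(0));
    }

    public override void Save(string key, string data)
    {
        System.IO.File.WriteAllText(
            System.IO.Path.Combine(_rootPath, key), data);
    }

    public override string Load(string key)
    {
        return System.IO.File.ReadAllText(
            System.IO.Path.Combine(_rootPath, key));
    }
}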

You then add some configuration to your web config, which will require a configuration section to be created for your providers (again, details in the link), and then a facade which implements a static method for each method on the interface, lazy-loads the provider and passes the calls onto whichever provider is specified in the config (see the sketch below). A simple example with only 2 concrete providers leads to 6 classes. It makes sense to create all of these in their own library rather than directly in the web application/web service.
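
The configuration section and facade might look roughly like this (a sketch; the section name "storageService" and all class names are my own):

using System.Configuration;
using System.Web.Configuration;

// Maps the <storageService defaultProvider="..."><providers>...</providers>
// section of web.config.
public class StorageSection : ConfigurationSection
{
    [ConfigurationProperty("providers")]
    public ProviderSettingsCollection Providers
    {
        get { return (ProviderSettingsCollection)base["providers"]; }
    }

    [ConfigurationProperty("defaultProvider", DefaultValue = "FileSystem")]
    public string DefaultProvider
    {
        get { return (string)base["defaultProvider"]; }
    }
}

// Static facade the application calls; lazy-loads the configured provider.
public static class Storage
{
    private static volatile StorageProvider _provider;
    private static readonly object _sync = new object();

    private static StorageProvider Provider
    {
        get
        {
            if (_provider == null)
            {
                lock (_sync)
                {
                    if (_provider == null)
                    {
                        var section = (StorageSection)
                            ConfigurationManager.GetSection("storageService");
                        var settings = section.Providers[section.DefaultProvider];
                        _provider = (StorageProvider)ProvidersHelper
                            .InstantiateProvider(settings, typeof(StorageProvider));
                    }
                }
            }
            return _provider;
        }
    }

    public static void Save(string key, string data) { Provider.Save(key, data); }
    public static string Load(string key) { Provider.Load(key); }
}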

You need to be very careful with threading. Any methods on your providers, except for Initialize, need to be thread-safe. Local variables are fine, read-only fields are fine, but any other static or instance variables must be locked before being accessed. This is because a single provider instance is loaded per application, so all HTTP requests share these methods across multiple threads. Use lock() on an object to do this, but don't lock too much since that will hurt performance.
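
For instance, if the file-system provider above cached reads in a shared dictionary, the dictionary access would need a lock, but only the dictionary access (a sketch; assumes a using for System.Collections.Generic):

private readonly object _cacheLock = new object();
private readonly Dictionary<string, string> _cache =
    new Dictionary<string, string>();

public override string Load(string key)
{
    // Shared mutable state: lock, but keep the critical section small.
    lock (_cacheLock)
    {
        string cached;
        if (_cache.TryGetValue(key, out cached))
            return cached;
    }

    // The slow file I/O happens outside the lock.
    string data = System.IO.File.ReadAllText(
        System.IO.Path.Combine(_rootPath, key));

    lock (_cacheLock)
    {
        _cache[key] = data;
    }
    return data;
}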