Wednesday, 31 December 2014

Why most of our projects are too expensive and executed poorly

I must admit, these thoughts were not inspired by software projects, although the more I think about it, the more this applies to many types of projects across countless industries, including software.

I was driving to Liverpool from home last week, a journey of 140 miles which, on a good run, takes about 2.5 hours. It is almost entirely a motorway journey and I usually plan not to stop en route to save time. It took me 4 hours to get there and 4.5 to get back. Not quite nightmare material but pretty poor nonetheless. The reason? I had chosen to travel on the Saturday and Sunday between Christmas and New Year, clearly a very popular time for people to travel to see family or friends, and the sheer volume of traffic overwhelmed the motorways.

The Problem

A first type of person would simply accept that too much traffic equals congestion - the defeatist. I would like to think that the kinds of people who manage these things accept that there is a solution. The big problem, and the basis of my argument, is that of the remaining people, most lack any real innovation. They basically believe that money = solutions. Their solutions are expensive to install, expensive to maintain and will probably reach the end of their life long before they were supposed to, by which point there is no come-back other than to spend even more money.

Then there are the innovators!

A Good Example

Let's rewind to the 1930s and the removal of the tram lines in Halifax, Yorkshire, and the resultant problem that car drivers could no longer see the road well at night. It turns out that the silver rails were useful for navigation in the days when street lighting was not universally installed. If most people at the time had been asked for a solution, some would have said there was nothing you could do; most would probably have said, "let's install a load of electric lighting" - a non-innovative (but possibly acceptable) solution based on what had already been invented. Percy Shaw, however, inspired by seeing lights reflect in the eyes of a cat, produced a relatively cheap and clever device - the "Cat's Eye" - a simple pair of reflective studs embedded in the road surface that uses the car's own lights to show where the centre of the road is. These clever devices are still installed largely unchanged from the original (although some idiots have tried to improve them with LEDs and solar power etc., not realising their error) and this goes to show that innovation improves things; most other times the solutions just make things more expensive.

A Bad Example

Let us return to the UK motorways. There are various motorways that at certain times are very heavily used, mostly around London and Birmingham. The M5/M6 junction, particularly, gets very busy as two sets of traffic attempt to merge (just before an exit, which probably makes it all a bit worse). What has the Department for Transport done here and elsewhere? They have produced "managed motorways", the new flavour of the month and a horrifically expensive way to try and improve the throughput of the motorways at these busy spots. I would be interested to see how much they help things but let us consider the additional expense.

Basic Motorway

A carriageway, some emergency phones, an occasional overhead LED sign, some occasional hazard warning lights and either street lighting or cat's eyes. Fairly easy to maintain.

Managed Motorway

The hard shoulder is converted into another lane that is used when required to increase throughput. Because of this, the lane markings are now confusing - you have a solid line separating lane 1 and the hard shoulder but you sometimes want people to drive in it, so you have loads of overhead signs telling people to use the hard shoulder (in English - so foreigners beware!). You then get a slight mess at exits, since traffic might be exiting from the hard shoulder or from lane 1 depending on whether the hard shoulder is in use or not. There are a load more overhead gantries with variable speed limit signs - about 5 of these weren't working when I came through (not great for a system that is less than a year old). Because the hard shoulder is now used as another lane, additional refuge areas for broken-down vehicles had to be built. Of course, you cannot assume that a broken-down vehicle will reach one of these, so if someone breaks down, they need to be spotted on the hundreds of additional cameras now fitted above the hard shoulder along its length so that it can be closed to traffic again. All of this requires speed sensors, cameras and, more importantly, a control room staffed at various times of day or night so that all this technology can be used as well as it can.

What's the problem? Non-innovative people have thrown money and technology at a problem and have probably made a limited difference. How can you measure success? Should you even be allowed to spend money if there are no objective success criteria? There are a couple of reasons why this particular scheme doesn't work as well as I'm sure the salesmen said it would. Firstly, we are conditioned not to use the hard shoulder - ever. People are fined for stopping there for any reason other than emergency (and understandably so), so why would putting a sign in English above it make people comfortable driving in it? I drove down a mile-or-two section where not one vehicle was in the hard shoulder despite the signs telling us to use it. Secondly, the complexity is worrying, since it requires a whole plethora of additional signs that drivers are supposed to read and respond to while driving. This is always dangerous but seems to have been missed in the design of these managed motorways. Since most people gravitate to the right of the motorway, there will be loads more undertaking, especially where the hard shoulder is for an exit and someone is shooting up the inside just as another vehicle decides to exit there too. People do not obey the variable speed limits and, like most road laws, they are so rarely enforced that it is almost a joke. If people ignore the speed limits, the throughput cannot be increased by slowing everyone down a bit. You can tell these limits don't work well because when it gets busy, the limit is almost always higher than the speed you can actually drive at.

The worst section was going southbound where the M5 and M6 diverge. Historically, it was a simple 1 + 1 + 1 scheme: left lane for the M5, middle lane for the M5 or M6 and right-hand lane for the M6. It's obvious, it's instinctive and it worked here and elsewhere. Now we have the wretched temporary hard shoulder, and the overhead sign (which is printed, so it can't change) says you have 1 lane only for the M5 and 2 for the M6, so everyone for the M5 is in lane 1. Guess what? When they switch on the hard shoulder, you now have two lanes for the M5, but as mentioned previously, people don't quite get the idea of driving in the hard shoulder, so they don't. Two lanes of traffic are now squashed into one, with a whole load of new signs that people are trying to read. What they have ended up with is worse than before. I have driven that junction many times and it's simply worse. They added money and no innovation and ended up marginally better in some places (at what cost?) but worse in others.

The Solution

OK, it's easy to criticise a specific example, but the fact is that most people are average and lack not only innovation but even any training or encouragement to be innovative in their solutions. What is the solution?

The answers are specific to each problem but I think the principles probably work across the range of different industries.

1) What is the problem and why? If we really understand this, the first question is how can we avoid the problem in the first place? We accept that roads get busy at certain times but why is this so? Regimented work hours - probably inherited from the Victorians, school runs all over the place, seasonal travel. There are actually ways to deal with these issues - good alternative transport, schemes to car-share, schemes to spread out the traffic either cooperatively or with some system that drivers could use to ensure that they can make their journey at a time they know will be clear (rather than guessing).
2) If you have to manage the problem - this is the worst scenario and to be avoided - how are you going to garner ideas? Do you assume that all the most innovative ideas will come from the government departments or local councils? Do you actually have the desire to source ideas from other countries or people outside the immediate industry? There are things they do in Japan that would work in this country but for reasons that are beyond me, they are not done. Do you encourage your people to think creatively and give them training in how they might do that? Do you have a philosophy that actively discourages ideas that are not in the mainstream of thinking and do you have good reason for this other than just the fear of risk - even at the immense amounts of money these things cost to get wrong?
3) So you have a solution. Are you going to be able to measure its success objectively? Are you going to hold onto the idea for pride reasons even if it isn't actually value for money? Can you set a timeframe when you will know if it works? Just because it worked in one scenario, do you know how to measure its effectiveness elsewhere rather than just rolling it out? What is the cost of reversing the work if it doesn't actually work out as you thought? Are you going to allow people to critique the solution or have you already decided what you are going to do?

Maybe there is no innovative solution - or at least not one that would probably work - so the last important question is, "do we have to do anything?". If you cannot find a solution that makes a provable benefit to the scenario, are you better off leaving it alone? Let people see that there is a problem so they avoid it themselves, rather than pretending you're helping and encouraging even more people to make the problem even worse.

Software

Since this is a software blog, I guess I should try and anchor some of these points into the software world, although I think most of the principles are the same. You most likely have a customer who effectively wants a solution. That customer might even be you if you are a service provider, but the whole design process and your solution is in danger of doing what the roads people have done. Their assumption is that the solution is more of the same, and yours might be likewise.

You give the customer a web site with a shopping cart but is that really what they want or need? You give them a framework hacked around to suit - is this scalable and maintainable? Do you understand their business need enough to think creatively about architecture and site layout or are you just going to work on the basis that the better the system, the more it will cost?

Are you going to ask the correct questions of your customer? "How often do you need to change things?"; "How often will the design or branding change?"; "How often do your business rules change?"; "How long is this site likely to be used for?"; "What is the worst case scenario for number of users total and simultaneous?"; "Are you looking for cool or traditional?"

Friday, 19 December 2014

Planning the deployment for your new scalable site

Lots of us don't get the chance to rewrite a site from scratch and a lot of the time, that is the correct decision. Most changes are relatively small and the risk of rewrites, especially of large sites, is large. But if you do get the chance, like I do (since we are majorly redesigning the UX), it is worth spending some time thinking about the structure and eventual deployment locations of your site and its resources.

Scalable?

If your site is never likely to scale beyond a few hundred users, you might not worry too much about performance or architecture, but some performance you can get for free and some decisions make the site easier to maintain. It is worth factoring these correctly into code so that when you create your next site, you can easily find them and copy them across. For instance, the overload of Response.Redirect in .Net that does not terminate the current request does not wipe the response buffer either - it sends the whole normal page but includes the redirect header as well. This is a waste of bandwidth and in most cases is undesirable, so I have a version of Redirect in a utility class that wipes the response and only sends the 301 redirect header.
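
A minimal sketch of that kind of helper in classic ASP.NET (the class and method names are just illustrative, not the actual utility described above):

using System.Web;

public static class ResponseHelper
{
    // Clears anything already buffered and sends only a permanent redirect.
    public static void PermanentRedirect(HttpResponse response, string url)
    {
        response.Clear();                 // wipe whatever has been written so far
        response.StatusCode = 301;        // permanent redirect
        response.RedirectLocation = url;  // sets the Location header
        response.End();                   // stop any further processing of the page
    }
}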

My site is potentially a multi-million user site (although we are an authentication provider so users won't stay logged into our site), so I want to make some good decisions out of the box. I want to minimise load on the main web servers where possible and, although I also want to minimise database work, I won't be changing those layers in this rewrite; that is already sorted.

The Trade-Offs

One of the problems you quickly learn is that there are trade-offs. For instance, using Less CSS makes for a much easier and more flexible way of updating styles and colours on your site, but this either introduces the complexity of deploying the compiled changes or, if done dynamically, adds potential disk, memory and/or CPU load to the web servers. If you have a web server farm, you might also have issues with caching mechanisms not being shared across instances.

Another example is offloading static resources to Content Delivery Networks. This is generally good, since CDNs tend to have multiple locations, often closer to the client, but it adds some deployment complexity, some cache issues and also the danger of having multiple content domains, which require more connections from the client - and any one of these domains can stop your page from loading.

Where to Start

I recommend starting from an empty web project. It is tempting to start with a template and that might be alright, but only if you know what the template includes. It is really not worth including anything you don't use unless you think you might need it in the future and it is easier to include it now.

So in my case, I have used Visual Studio and created an "Empty Web Project". This gives you no content, a single web.config and the only references are to system DLLs which are present on all servers that have the .Net framework installed.

Source Control

Time for a cup of tea! Just kidding. One thing that would be useful to set up right away, even with this basic project, is source control. I have just had to rebuild a laptop because some rubbish Azure tools mucked up my PowerShell setup (or I mucked it up while trying to fix it) and it seems Windows can't repair this and the web installer thinks everything is already installed. The same thing can happen on a site. You start making a change thinking it's all great and then run into trouble. Can you unpick your changes to go back? You can with source control: Git, Subversion, Perforce, TFS, whatever. Do it now and get into two habits. Firstly, regular check-ins and tags so you can easily work out what was changed and for what reason; secondly, isolated short-duration tasks, of which you only do one at a time before checking them in. I have made the mistake of working on perhaps 6 tasks at the same time; if one fails test, it can be a nightmare to undo just that one. All deployments should be from a tag - ideally via another machine - so that bugs can be reproduced locally with source that you KNOW matches what is deployed. Try and avoid file-copy deployments unless they came via source control.

HTML and Server-side Scripting

I don't think there is much that can be said specifically about the basic HTML since it is partly site-specific and partly language-specific. I already mentioned my Response.Redirect utility for .Net, but that is unlikely to be useful for other languages, which will potentially have their own quirks.

Making HTML readable is definitely useful for maintenance, so indenting and all the rest is important, but that is not really a setup issue. Also, if you go too crazy, you can introduce a large amount of whitespace which, although it should be reduced significantly by your web server with GZip, is an extreme to be avoided.

CSS (and Less)

Whether you are likely to need Less for your CSS will depend partly on whether you are using a framework (like Bootstrap) that is based on Less and partly on whether the UX of your site is really important and might need tweaking. If it's a largely functional site, you might not care too much and will just use all the default styles and colours on your user interface and live with it. In that case, you could manually compile the Less into CSS (or download it ready-compiled) and then just treat it as static CSS.

Static CSS

Static CSS, like other static resources, incurs a trade-off. On the one hand, you don't want too many separate resources, so you could bundle a whole load of CSS into one file, which only requires one download; the problem is that if you make one change to any of those files, the whole bundle becomes invalid and needs downloading again. Using a CDN (see later) does not remove this to-bundle-or-not-to-bundle question. Ideally, you should make a list of all the CSS you will be using (at least the stuff you have easy control over - some .Net stuff gets injected for you and is quite hard to modify!). Next to each entry, write the size of the file and whether it will be changed during the immediate life of your site. We aren't talking about upgrading jQuery from 9 to 10 but day-to-day changes. You can generally bundle all the stuff that doesn't change into one file and then, depending on the size of the files that do change, make one or more separate bundles. A large file should probably be its own download so that only its own changes require it to be downloaded again.
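
As an illustration, here is a minimal sketch of that split using the ASP.NET Web Optimization bundles (the bundle names and file paths are just examples, not a prescription):

using System.Web.Optimization;

public class BundleConfig
{
    public static void RegisterBundles(BundleCollection bundles)
    {
        // Stable CSS that rarely changes: one bundle, one download, cached for a long time.
        bundles.Add(new StyleBundle("~/bundles/css-stable").Include(
            "~/Content/bootstrap.css",
            "~/Content/font-awesome.css"));

        // CSS that changes day-to-day: kept separate so an edit only invalidates this bundle.
        bundles.Add(new StyleBundle("~/bundles/css-site").Include(
            "~/Content/site.css"));
    }
}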

Minification is useful but makes minimal difference with CSS, since it is largely whitespace that is removed. Removing unused rules is harder, since some rules might only be used on a few pages of the whole site and working that out can take a while. With frameworks like Bootstrap, again, there is so much secret magic that it is hard to know what can be removed or not (although using their selective download feature can help: if you don't use, for instance, modals, you don't have to download those sections).

Less CSS

As already mentioned, Less is a very useful way to create CSS based on rules and on common parameters (as well as other features). This means that, for instance, if you want to change a Bootstrap button colour, rather than finding every single reference to it in the CSS (including the related colours that aren't an exact match!), you can simply change one parameter, regenerate and it all works.

Once your Less becomes CSS, it is treated like other CSS except in one scenario, which is when you are generating it on the fly. This allows changes to be made in real time but can add load onto the web server, which might or might not be able to make effective use of caching to avoid compiling it on every request. This on-the-fly process usually requires a plugin which will handle things for you, but personally, I prefer to pre-compile the CSS and then use the normal CSS bundling mechanism to deploy it.

Scripts

Scripts are similar to CSS files in some ways: they are static and some are liable to change, but generally speaking they are not created on the fly, and they benefit much more from minification since not only can whitespace be removed but variable names can be changed from their human-readable values to single letters. A good bundling plugin will handle bundling and minification for scripts and CSS files.
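
Continuing the sketch above, scripts get their own bundles from the same RegisterBundles method, and turning on optimisations enables minification (again, the paths are only examples):

// Framework scripts that change rarely.
bundles.Add(new ScriptBundle("~/bundles/js-frameworks").Include(
    "~/Scripts/jquery-1.10.2.js",
    "~/Scripts/bootstrap.js"));

// Site scripts that change day-to-day.
bundles.Add(new ScriptBundle("~/bundles/js-site").Include(
    "~/Scripts/site.js"));

// Forces bundling and minification even in debug builds so the result can be tested.
BundleTable.EnableOptimizations = true;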

Caching and Cache Busting

Caching should be considered early on for all static resources (most of the basic pages are not cacheable since they are dynamic). In general, you should set static resources to have cache expirations of 1 year where possible and use cache busting to force re-downloads. An important tip here is to TEST the caching. There are a few gotchas and it doesn't always do what you think. For instance, as well as the cache headers, your server should support the 304 Not Modified response when a client asks "has this item been modified since...", otherwise the client might as well just re-request it and incur the entire download.
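
If you serve a static resource through your own code (for example a custom handler), a minimal sketch of those two pieces - the long expiry and the 304 Not Modified response - might look like this in ASP.NET; the handler name and file path are made up for illustration:

using System;
using System.IO;
using System.Web;

// Hypothetical handler for a single static file, showing the cache headers and the 304 path.
public class CachedFileHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string physicalPath = context.Server.MapPath("~/Content/site.css"); // example file
        DateTime lastWrite = File.GetLastWriteTimeUtc(physicalPath);

        // If the client already has this version, answer 304 Not Modified with no body.
        DateTime since;
        string header = context.Request.Headers["If-Modified-Since"];
        if (header != null && DateTime.TryParse(header, out since) &&
            lastWrite <= since.ToUniversalTime().AddSeconds(1))
        {
            context.Response.StatusCode = 304;
            context.Response.SuppressContent = true;
            return;
        }

        // Otherwise send the file with a long expiry; cache busting handles future changes.
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetMaxAge(TimeSpan.FromDays(365));
        context.Response.Cache.SetLastModified(lastWrite);
        context.Response.ContentType = "text/css";
        context.Response.WriteFile(physicalPath);
    }
}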

One of the issues in HTTP 1.x is that although caching saves download time for subsequent page visits, the client, by definition, would not normally re-request the item - so how can you update it when it has changed? The only way is to trick the browser into thinking it needs to get a different item. You could obviously change the name of the script/CSS/image resource but that is clunky. Most often, you add a querystring to the end with a value that changes when the content changes. Again, most bundling plugins will do this for you, but otherwise a simple way is to tag the MD5 hash of the file onto its querystring so that any file change is reflected in the URL. This allows long cache times (no greater than 1 year, apparently - things get weird in some browsers with longer durations).
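
A sketch of that hash-as-querystring idea; the helper name is made up, and a real bundling plugin does the equivalent for you:

using System;
using System.IO;
using System.Security.Cryptography;
using System.Web;

public static class CacheBuster
{
    // Returns e.g. "/Content/site.css?v=9e107d9d372bb6826bd81d3542a419d6"
    public static string VersionedUrl(string virtualPath)
    {
        string physicalPath = HttpContext.Current.Server.MapPath(virtualPath);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(File.ReadAllBytes(physicalPath));
            string version = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            return VirtualPathUtility.ToAbsolute(virtualPath) + "?v=" + version;
        }
    }
}

In a real site you would cache the computed hash rather than re-reading the file on every request.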

Using a CDN

I've already mentioned CDNs. They are effectively locations to serve your content from but there are a few advantages and a few potential disadvantages.

The main intention of a CDN is to replicate content across multiple, potentially worldwide, locations so that when a user requests a resource, the DNS works out the closest location and the resource is fetched from there. Naturally, this works best with content that is not changing, otherwise you will have different copies of resources until the locations all replicate the change.

If you have a single private CDN location, this works well because you are in control of what is happening with it and should know whether you have a service level agreement for its uptime. If you are using a public CDN for something like jQuery or Bootstrap, it saves you some money, but there is always a danger that one of those CDNs could go down and you might not have any control over getting it back up. In reality, are these public CDNs likely to go down any more than your own CDN? I don't know.
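
If you do lean on a public CDN, the bundling API sketched earlier can register a CDN URL with a local fallback, which at least softens that risk (the URLs here are just examples):

// Serve jQuery from a public CDN, but fall back to the local copy if it fails to load.
bundles.UseCdn = true;

var jquery = new ScriptBundle("~/bundles/jquery",
    "https://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.10.2.min.js")
{
    CdnFallbackExpression = "window.jQuery"  // if this evaluates to false, the local bundle is loaded
};
jquery.Include("~/Scripts/jquery-1.10.2.js");
bundles.Add(jquery);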

If you use CDNs, try not to use too many different domains. Imagine you bring in 10 third-party frameworks from 10 different CDN domains: your browser will need to make 10 connections, possibly using SSL, to these 10 domains, and this could well add noticeable and unacceptable slowness to your site. One or two separate ones are probably OK, but otherwise consider copying all the resources to your own CDN and having them all come from the same place. The costs are not usually massive and if you are scaling at this level, you hopefully have some income to pay for it!

Disadvantages include the additional cost of the CDN over the web server you are already paying for and some of the risks associated with third-party CDN domains. Other disadvantages only really occur when you are pulling in lots of resources from different CDN domains: the already mentioned SSL burden, the lack of control of the domains, and the fact that any one of them going down could break your site.

Clearly, there is also a process implication with using a CDN, since objects have to be uploaded to the CDN and replicated before they are used.

Conclusion

I will update this post (if I can find it!) once I have my basic shell ready. I already know what frameworks I need, so I should be able to plan what to serve from the web server, what from my CDN and what from third-party CDNs. I should also have a bundling/minification plugin ready to go and my Less folders set up to generate my CSS either on build or possibly manually.

Tuesday, 16 December 2014

Error 403 when calling Soap web service with client certificate from PHP

The SoapClient in PHP is theoretically useful and I have used it to call into a .Net web service for a year or so. I don't like chaining web services, but in this case .Net does something that PHP cannot, so I have to do it for now.

Anyway, my setup uses a client certificate to provide authentication to the soap web service and after moving my PHP web service from an Azure Cloud Service to a pair of Virtual Machines (after someone cocked up the PHP Azure tools!), I realised that the web services were not talking any more. All I could easily ascertain was that I was getting a 403 (forbidden) at some point.

SoapClient provides very poor tracing capability so it was virtually impossible to find out why it wasn't working any more. I could access the web service fine from the PHP VM in Internet Explorer so the IP address/route was available.

I tried Fiddler but it doesn't work too well on Server 2012/Windows 8, and the utility that was supposed to work around this didn't work either, so every time I visited my web service in IE to see what should happen, all I could trace were calls to microsoft.com! No help.

What can you do when you have very little data and something that used to work fine? You have to ask what has changed. In my case, the PHP version had been increased from 5.3 to 5.5, but nothing in the docs for SoapClient seemed to say this had changed anything. I re-exported my client certificate but no difference. It was only by chance that I saw something in an MS article which made me remember that I had also made some changes to the Soap web service - changes which, although everything seemed to be working fine from our web application and via the browser, might have broken something.

THE CERTIFICATE CHAIN!

What I had removed from the Soap web service in Azure was a reference to the intermediate certificate for the client certificate. It turns out that the web service checks the chain by default and, in most cases, a browser or the web application, which does have the intermediate certificate available, can send it on request. However, the PHP SoapClient does not use Windows to provide the certificates; it simply grabs the certificate from a PEM file passed to the SoapClient constructor. The intermediate certificate was not in the PEM file and, since it was no longer on the Soap web service either, the chain could not be built.

Solution tl;dr
I could have added the intermediate certificate to the PEM file that is attached to the SoapClient but it was more desirable to simply include it on the Soap web service so it could find the intermediate certificate locally rather than asking for it from the client.

I breathed a large sigh of relief once it was working!

Unable to execute dex: Multiple dex files define Lorg/spongycastle/LICENSE

You know those really annoying problems that you kind of understand but you don't know how to fix? This is one of them!

Android Studio project, one app and one library module, the app is dependent on the library module and they BOTH require the spongycastle encryption library. This issue is certainly not specific to spongycastle though. You add the jar file to both libs folders and each module builds (as expected) but when you build the project to debug, you get an exception and "unable to execute dex: Multiple dex files define....."

What is happening is that dex is trying to merge all the symbols from both projects into one executable but, obviously, there are multiple items with the same namespace and names coming from each of the jar files (which are the same version and everything).

Seems like a normal requirement, so how can you avoid the error? Theoretically, you can exclude certain dependencies in the dependencies section of build.gradle, but the syntax requires a module name, which is not obvious from the jar file (i.e. it isn't just the jar file name), so I couldn't get that to work.

Fortunately, Maven dependencies are de-duplicated automatically, so if there is a Maven repository for the project you are using (and spongycastle has one, fortunately), then you can compile from the repository instead of from a jar file. This is what I did and it sorted out the problem.

If you do not have a Maven repository for it, you can either set up a company or local Maven repo and move your dependent library into that so that your app can consume it and avoid the problem, or you can run a command line utility to give you more information about the dependencies and work out what to exclude (although I didn't test this so it might not work exactly as I expect).

Use the terminal window in Android Studio and the command .\gradlew -q :modulename:dependencies where modulename is the name of your module (like "app"). This will display, using ASCII art, the dependency tree of your module and, most importantly, it will show you the names of the dependencies, not just their namespace. For instance, you might find that the Android support library, referenced as com.android.support, is actually called "appcompat-v7", which is the name you need to use when excluding it from your main build.gradle like this:

compile (project ("uaf")) { exclude module "appcompat-v7" }

I think there are other exclude options (other than "module") so have a play around.

Friday, 12 December 2014

Random problem with UIView rotation and transforms

I was seeing the weirdest problem on my iOS app with rotation. The problem I have is that you cannot force rotate an iOS screen (although you can on Android). In my case, if the user is viewing a landscape image, I need to rotate it so it is correct, whatever way round the user is holding their phone.

As is the way in the world of iOS, I had to find a workaround and UIView's setTransform looked like the answer. It allows you to set a transformation matrix on a UIView so that all content drawn to it is rotated, skewed or scaled (I was only using the rotation part). This should have been fine and appeared to work when I accessed my app directly. However, when I accessed my app via a 3rd-party app (it is a single-sign-on utility), it didn't work - or rather, it appeared to functionally do what I expected (a view in the view inspector was rotated) but the result was different from the direct access, even though it was the same account, the same image, and the same code was definitely called to transform the window.

It is so hard debugging when you don't even know what might be causing it. The only thing I noticed was the way that the view controller was created when accessing the app directly vs the 3rd-party app. If accessed directly, the following code was being called:

UIStoryboard *storyboard = [self getStoryBoard]; // Relevant one for iOS type
UIViewController *initialViewController = [storyboard instantiateInitialViewController];
self.window = [[UIWindow alloc]initWithFrame:[[UIScreen mainScreen]bounds]];
self.window.rootViewController = initialViewController;
[self.window makeKeyAndVisible];

This worked. When I inspected the view stack, it showed more views stacked, including the navigation controller, layout controller etc., as well as the view controller's views themselves. When entering from the 3rd-party app, the code was this:

UIStoryboard *storyboard = [self getStoryBoard]; // Relevant one for iOS type
UIViewController *initialViewController = [storyboard instantiateViewControllerWithIdentifier:@"LoginPointsView"];
self.window = [[UIWindow alloc]initWithFrame:[[UIScreen mainScreen]bounds]];
self.window.rootViewController = initialViewController;
[self.window makeKeyAndVisible];

This didn't work. When I inspected the views, I found that there was only the UIWindow and then the views from the view controller itself - none of the intermediate views. Why this caused a problem was unclear; the windows in both cases seemed to be aligned correctly, and the only difference was that the view controller's UIView was rotated 90 degrees with respect to the UIWindow in the broken case. As I mentioned, this seemed more correct, since I was setting a 90 degree transform, but it made the app not work. I modified the broken code to be this instead and it all worked:

UIStoryboard *storyboard = [self getStoryBoard]; // Relevant one for iOS type
UIViewController *initialViewController = [storyboard instantiateInitialViewController];
self.window = [[UIWindow alloc]initWithFrame:[[UIScreen mainScreen]bounds]];
self.window.rootViewController = initialViewController;
[self.window makeKeyAndVisible];
UIViewController *ivc = [storyboard instantiateViewControllerWithIdentifier:@"LoginPointsView"];
[(UINavigationController*)self.window.rootViewController pushViewController:ivc animated:NO];

Just another mysterious reason why iOS is fine for basic apps but really becomes hard when you do anything out of the ordinary.

Monday, 8 December 2014

Understanding XCode, iOS, iPhone and Simulator Architectures

Architectures seem to cause so many problems in iOS/XCode development and it's not surprising. Searching for the problems I have had has uncovered so many related issues, and all of them seem to have different solutions although they are all really related to the same thing.

The types of problem you encounter consist of linker errors from shared libraries ("ignoring file X, missing required architecture Y in file Z"), not getting any output from your program, or even finding that it doesn't run or cannot be installed on certain devices.

The problems stem from:
  1. XCode sucks as an IDE
  2. There are several architectures flying around
  3. XCode does something weird with Active Architecture Only
  4. The fact that there are "valid" architectures as well as architectures is confusing
  5. There are lots of different target devices as well as the simulator, which doesn't actually simulate a device but runs on the host's architecture!
Fortunately, when the pieces are placed in order, it is not too difficult to understand, although you might find yourself hunting through unnecessary hassle to find the correct settings!

Warning: You might find the errors appear sometimes and not others when linking. I think this is related to the way the project is marked as invalid and "build required". What this means is that sometimes you won't have broken or fixed anything; it'll just pretend to be OK until the next time it does a full build and fails again!
You should probably keep backups of your projects in case you royally screw something up and have to go back again. This is good practice anyway using git or subversion, for instance.

Firstly, you presumably know that architectures relate to the hardware of the various iPhone and iPad devices. The reason this is important is that at the lowest level, the way numbers are represented in memory and/or the instruction set for programs is simply different on different hardware. The way the program is compiled needs to be different. Another variable is that some older devices (pre iPhone 5s) were only 32 bit, whereas all the newer models are 64 bit (although they still support 32 bit apps).

So, we start from there. We will look at probably the most complicated example: a shared library that needs to support a load of architectures including the simulator. If you open your library project and go into the project settings, under Build Settings, you will see Valid Architectures and Architectures. The output will include any architectures that appear in both lists. Why have both? I'm not sure but it is likely to do with code warnings - you statically set the full set of architectures you will use in Valid Architectures and then you might change the Architecture as you are testing.

Setup your Library Architectures

In my library, I use the shortcut $(ARCHS_STANDARD) for both architecture settings, which in XCode 6.1.1 includes armv7 and arm64 and should work on most newer devices. The problem I now have (is it still a problem?) is that my library will also be run under the iOS simulator, which itself is likely to run on either i386 or x86_64 on newer Macs. It also needs to specify that it is a simulator build, and all of this cannot be accomplished in XCode directly. To avoid building both separately, you can use another target - mine's called UniversalLib - which runs a script that performs the normal build that would happen directly, then runs another build into a separate directory specifying the simulator SDK and the i386 architecture. The script then uses lipo to glue these together into a single "fat" library. I can't find the original link I used but there are plenty of guides like this one. Once this is done, you can run "lipo -info libFile.a" from a terminal and it will list the architectures contained in the universal library. In my case, this is i386, armv7 and arm64.

Link the Library to your App

So, we now need to link to this library in the usual way, making sure we are linking to the universal one, not one of the other two that were used to create it.

Build Your App

This is the part that seems to cause the most confusion. If you just build what you have, it might or might not succeed. This is because XCode sucks with its configuration management. You have devices, targets, schemes and architectures all over the place and it is not clear how all of these things add together.

If you are not targeting older iPhones, you should probably have your app's Valid Architectures and Architectures set to $(ARCHS_STANDARD) to give you armv7 and arm64. This is a safe starting point. There is another setting, "Build Active Architecture Only", which can be very confusing because what it does is not made very clear to the developer! It is usually set to "Yes" in debug and would generally always be set to "No" for release (archive) builds. "No" is easiest: it means build all of the architectures in "Architectures" that appear in "Valid Architectures". This is important for the binaries you distribute because that is how they work on multiple devices. When you are debugging, however, there is no point building everything when you will only be testing on a single device; it is just going to slow you down. You set "Build Active Architecture Only" to Yes and it only builds one. Which one? That is slightly harder to work out, but if your current target is a simulator, the active architecture will be your host architecture - in my case x86_64 (I think it also allows i386). If you have a real device plugged in and that is selected, then that device's architecture will be the active one. If you try and build at this point (I had iPhone 6 selected but not plugged in) then XCode will look for the host architecture in the linked libraries (x86_64 in my case) and fail linking if it isn't there.

XCode Schemes

What can also make this more complicated is the XCode schemes. If you go into Product->Schemes->Manage Schemes, you will find that each scheme has a set of actions: Build, Run, Test, Profile, Analyse and Archive. For each of these, you set whether the build is for debug or release (and therefore whether it picks up the debug or release build settings). Archive is how you distribute applications and would almost always be Release, but the other settings depend on what you are doing and whether you need to debug.

Build for the Correct Scheme

Once you set these up, you then need to make sure you are building for the correct scheme and device. For some reason, this is not really obvious like it is in Visual Studio. You can choose the device easily enough but what scheme does Build build? I don't know but you can choose Product->Build for->Running or whatever, to make sure it is correct.

Once you have looked in all of these places, you should be able to more easily follow the ridiculously complicated flow from library to app and work out why it errors!

What I haven't talked about is what happens if the library is from someone else and doesn't seem to have the correct architectures. You can obviously request the correct architectures but if that is not possible and they only provide, for instance, i386, you might have to downgrade your own architectures to 32 bit if you want to use that library. Google it!

Good luck, you'll need it.

Lumino City - Great Game but could use some hints!

I've just finished Lumino City, a fun game on the Steam platform. It only took a few days to finish but it was only £15 so that's fine. The impressive thing was that the "set" of the game was built by hand and photographed to provide the game background. The characters were then superimposed on top.

The game basically follows Lumi as she travels through Lumino City looking for her granddad. In order to get to him, she has to solve a series of puzzles that work on various parts of the brain. Most of these are quite challenging, but I felt some could have used more hints to tell me what to do. I could probably have got them eventually, but it is frustrating when you think you've done everything you can and it still doesn't work. I have included below some hints for the game - not the answers, mind you. Read them carefully if you don't want to read anything that makes it too easy for you!

The Big Red Book

The only purpose of the Big Red book that your grandad gives you is that it contains the answers to most (all?) of the puzzles. You can either find them by flicking through or otherwise you have to work out the maths puzzles in the contents page, which usually refer to something in each of the puzzles (the number of cogs for instance). Not all of these are easy to work out though, one of the puzzles I only found by looking through the book!

Getting Salt

This was the first one I really struggled with. There is a woman in the house and the only useful information seems to be that she's really sad about losing their dog over the edge. Hint: She never comes out of the house and you never go in. You also see the man on the BBQ next to the salt. If you try and take some, he sees you and tells you that you can't have it. Hint: He never leaves the BBQ. Hint: You do not need any other object to get the salt. Hint: The wife is part of the solution so keep your eyes peeled.

Spinning House

This was another one that was confusing because you only get some basic information about fixing down the wardrobe. I also got to the point where I appeared to be stuck in a part of the house. Hint: You can slide down some of the blue poles. Hint: You need a hammer and nails, both of which are in different parts of the house. Once you have these, you need to fix the wardrobe. Hint: The wardrobe is fixed down from the floor below, not from inside the bedroom! Once that is done, you can go out to the yellow box to get the key, which you then need to use on the blue back door to get into the side of the cliff.

Cliff Houses - Cogs

When you arrive here, you hear some women talking about a key in an orange room, but that is all you get. Hint: First make your way to the top right and solve the puzzle in the box on the roof. Hint: Ignore all the boxes and numbers on the diagram; the task is just to link the two cogs. You won't need all the cogs and won't use all the spindles on the diagram. After this, the houses start moving and you need to make your way back across and down towards the window in the orange room. The game will hint where you can click to get across. Hint: You need to stand forwards inside the orange house to access the cupboard where the key is. This opens the white box next to the door you came through to the cliff.

Cliff Houses - Patch Bay

This was by far the hardest puzzle to solve. Even after looking at the answer, I didn't understand the method, and I made the mistake of unplugging the wires that were already plugged in, so I don't know whether that is your starting point. Effectively, each symbol has a numeric value that you need to deduce in order to work out which symbols are equivalent to which other symbols. I suggest leaving the default wires in and seeing if you can work out the values - for instance, if hexagon + half a hexagon = 6, then hexagon presumably equals 4 and half a hexagon equals 2. Once you have done it, press the button; if it is correct, the green light goes on and the houses stop moving, and you can then go over to the other side and enter the power station.

Power Station Password

All I will say for this is that it is a psychology question. The answer is in the same room, what passwords do people choose generally? It can't have more than 8 letters and I don't think it is case sensitive since lower case letters appear upper case on the screen. Hint: When you put the card into the slot, make sure that it displays what you think you punched in. You might have made a mistake without realising. If you punch the wrong hole, you cannot undo it, just post it and it will be wrong and you'll have to start again. Hint: Once you have done that it will ask you to enter a number. I don't know what happens if you enter 3 by mistake so if you accidentally punch the wrong hole in the card, make sure you punch the other holes to ensure you don't enter 3 instead of 2!

Power Station Tape Winding

This was another one that you might eventually work out by accident. The gist of the puzzle is to get the tape to run over the 4 readers (which look like small lights) by correctly placing the other spindles. Obviously the tape cannot cross itself. It doesn't matter that the spindles look different, it is only the size that is important for the puzzle. Hint: You do not use all of them! Hint: Go up to top-left from the start, across to top right and zig-zag downwards to get back to the finish.

Power Station Exit Lock

Hint: To get the code, you have to go into the projector room, fix the projector and watch the film. The code is at the end of the film. Fixing the projector should not be too hard although the links are not visible, they appear when you click between two adjacent circuit board points. You can click again to remove them.

Anything else?

If you want any more hints for this game, write in the comments and I'll add them although I think the rest of the puzzles are doable without any extra help (even if they take a while).

Sunday, 7 December 2014

Why the Royal Mail depresses me

The Royal Mail epitomises many large corporations, so although I am having a go at them below, I am really having a go at every company that lacks any true innovation, motivation or empowerment of employees and that will therefore always struggle to keep up.

I am not a postal expert or an employee of Royal Mail, nor have I ever been. When I visit the sorting office, however, to collect a parcel that could not be posted through my door, I can see problems screaming at me that seem either to have gone unnoticed by various levels of Royal Mail management or to have been noticed but not addressed because no-one is allowed to actually do anything about them.

The only reason I use Royal Mail, like most people, is because they virtually hold a monopoly on small postage items. They are also used by many companies by default, including Amazon, even though in some cases, there are much more desirable, reliable and reasonable alternatives (like Collect+ or any of various couriers).

Back to the story. The first thing that strikes me is that there are 20 people queuing in a room that is barely big enough. My first thought is that maybe the staffing level is poor - perhaps people are off sick or something (not that that is a good excuse) - but no, there are at least 3 people working behind the counter. Most of the people are just there to collect items that wouldn't fit through the letter box, so it shouldn't really take too much time, should it? If I was a manager, the most obvious question is: why are there so many people here in the first place? The answer is pretty obvious. We buy a lot online, most houses have two earners who are not in all day, and the postman is no longer able to deliver before about 8 or 9 o'clock, it seems. I remember as a kid we had two deliveries: first post was something like 7am, when everyone is in, and the second was later. Now, I am lucky to see the postman before midday. In other words, I am never in when they arrive.

This sounds like something that is outside Royal Mail's control, but that is what I would expect an old-fashioned, poorly managed organisation to say. I would expect a proactive, innovative, employee-enabled company to ask, "how can we deliver when people are at home so that loads of them don't all turn up at the sorting office to collect stuff?". Actually, this is not really that hard. They should have the flexibility to deliver at different times to suit - wait for it - the customer! If that means weekend deliveries, God forbid, then that's what it means. If it means you split the shifts into an early morning shift for letters and an evening shift for larger items, you could do that. The technology to track and organise mail is actually dirt cheap and is partially already in use. But if you want something recorded? That'll be another £3 and not worth it for the cheap power supply I bought from Amazon - and more importantly, recorded doesn't actually buy you flexibility, it only buys some basic tracking and paltry insurance. I remember once getting a Royal Mail evening delivery and wondered why it was not sold much more strongly by Royal Mail. I'm sure paying people to work some evenings instead of permanent mornings wouldn't be the end of the world, and it would also spread out the workforce, meaning you need less transport and fewer facilities.

So clearly Royal Mail haven't worked this out yet. Let us return to the sorting office.

3 people serving and 20 waiting - that shouldn't take too long, right? 5 minutes? 10 tops? Nope. 20 minutes later, I'm still waiting. Why is this? You learn so much in the queue as you watch the people waiting.


  1. Where on earth are the employees going to find these parcels? France? If I turn up with a card, they type something (slowly) into the computer to get a code and then go off for a walk somewhere. Perhaps they are drinking tea in between. Why are they typing anything? I have a card, why doesn't the card have the relevant number on it? Even if they have to walk half way up the building, it still takes ages. If it takes 2 minutes per item, that queue is never going down.
  2. Some woman was there with her card but they couldn't find the item. Apparently, when this happens, they call the postman and ask them where they put it. Really? They can't find it? Surely there is one place to put these parcels and it will be there right? Nope. "Sorry madam, we can't find it, we'll have to look, try and deliver it next week if we find it otherwise you will have to claim for it". I would be mortified if my sorting office lost one parcel, but it happens all the time (I've seen it about 3 times in the same number of months I've visited the sorting office). The poor lady doesn't even know what package it is so she could re-order something or whatever, all she knows is that "a package" was not delivered (and now lost).
  3. Another lady has turned up. She has two letters, one couldn't fit through the door (OK) but the other has postage to pay on it. What is this item of mail that doesn't have postage on it? Some crappy flyer from an online shop whose franking machine obviously wasn't working properly. Of course, this woman didn't want to pay postage on some piece of crappy marketing mail so the lady offered to "return it to the sender". Really? The system allowed a letter to enter with NO postage on it and expected the person at the end to pay all of it? Un-paid mail (as opposed to something that might have been underpaid) should just be rejected at source. Imagine if that lady didn't have the second piece of mail to collect, she's just waited 20 minutes for a piece of mail she didn't want.
  4. Another lady had a package that had arrived from America and had also gone missing. You could hear her telling the previous lady that the postmen must just steal things. Again, this is criminal and the management clearly can't fix it.
  5. Then after my long wait, I told the man that I had been left a card for a parcel that wouldn't fit through the letter box. I had misplaced the card, but it was alright because I had my driving licence, complete with photo, name and address. It was also not tracked, just too big to post. He initially said it wasn't a problem and then returned to say he couldn't give it to me. According to Royal Mail on Twitter, this is for security reasons. I don't really see how having photo ID with a matching name and address somehow undermines the high security of the generic red card that they put through my door, but again, it smells of "corporate policy" that is a terrible fit for each and every situation in the sorting office. Let the staff decide whether they think I am stealing a £4 power supply by somehow managing to fake someone else's photo ID but not having the skill to get a red card from their letterbox! They offered to re-deliver it so they could re-post a card that I could then use to get the item from the sorting office again (it takes me 15 minutes to cycle there - I don't have a car!).
The whole experience depresses me. The people feel distant and not like the kind of people who management can trust to make judgments. They don't seem like the kind of people who are encouraged to streamline the collection process so the queue can be kept small.

This has all happened over the past 2 weeks, in which I have also not received two other packages (one 1st class, the other 2nd). Where do these things go? Does anyone investigate? If people are damaging, discarding or stealing items, are they punished under the full weight of the law, or do they simply get fired, only to be replaced with more of the same culture?

Toyota are large, but they are successful. Why? Because they realise that the people on the shop floor know what works and what doesn't, and management (who spend time on the shop floor) are prepared to invest in the ideas that the workers have to make things better. Better workers means happier workers, not people dicking around with things that should work properly or arguing with customers who are fed up with the sloppy service. Better workers mean a more efficient company, which means more profits. It means customers choose your service because it is good rather than putting up with it because there are few alternatives. It means that when a new company does come along to rival yours, you might actually survive the competition rather than dying a public and shameful corporate death.

Tuesday, 2 December 2014

Creating a scalable PHP web application on Azure Virtual Machines

Background

I have been using PHP cloud services on Azure for a web service used by PixelPin. I like this model because it (theoretically) means I don't have to manage or worry about anything at the operating system level. I create an application and deploy it directly to the cloud service, and Azure takes care of the provisioning and scaling, including duplicating the installation across multiple machines. Updating is usually fairly painless and it allows me to concentrate on what I'm good at: writing web apps.

That's the theory. The problem is that the PHP support is very much the poor cousin of the .Net integration and that limits various things. It means you cannot deploy from Visual Studio, since VS doesn't natively support PHP (yet?), which in turn means you have to deploy with PowerShell scripts. That in itself is OK, except that there is a nightmare related to conflicting versions of libraries and bugs in the scripts.

I thought these had settled down, but the latest version has a weird bug (which is known but apparently not resolved) where the XMLSerializer that writes out the configuration files (for reasons that I don't understand - they are my config files to edit) writes out an XML declaration specifying UTF-16 even though the files are UTF-8, and this screws up the upload. The fix was supposed to be downgrading the tools to an earlier version, which I did, but that version totally stopped working and would no longer generate the upload package. No errors, no nothing. There appeared to be Fusion log errors, but why couldn't these scripts find their assemblies and, more importantly, why weren't these errors reported from the script rather than it pretending to work?

There are other problems too. The tools don't allow you to deploy to Windows Server 2012, despite PHP being compatible, and this was caused by the scripts that run on the instance to install PHP etc. somehow not working on 2012 - something I didn't want to debug and I definitely didn't want to start changing source code and trying to rebuild stuff.

The Solution

The solution I decided on was to use plain virtual machines again (infrastructure-as-a-service), something I was not very happy about but which seemed the only option.

I was not sure how this would all work because Cloud Services hides all the details from you. For instance, I could create a correctly configured server, but then how do I make it scale? How do I duplicate the VMs? Can this be automated or would every update have to be done manually, one machine at a time?

This was my journey (it will be long!).

Creating the VM and setting up IIS

Creating the VM was all pretty straightforward, and then I logged in with Remote Desktop to start setting things up. The VM doesn't come with any roles installed, so the first thing to do is add the Web Server (IIS) role using the Server Manager screen. This basically adds IIS. I didn't add any particular extensions, but that is up to you.

My experience with long configurations like this is that you should test regularly to make sure each step is working as expected. For instance, you could go straight ahead and install PHP, but if it didn't work, it might actually be IIS that isn't working. A simple visit to localhost in IE proved that IIS was basically working - so far so good!
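
I did this through the GUI, but if you prefer to script the step, something roughly like the following should work on Server 2012 and later (a hedged sketch, not what I actually ran):

    # Install the IIS role from an elevated PowerShell prompt (Server 2012 and later)
    Install-WindowsFeature -Name Web-Server -IncludeManagementTools

    # Quick local check that IIS answers (needs PowerShell 3.0 or later);
    # the default IIS welcome page should come back with a 200 status
    (Invoke-WebRequest -Uri 'http://localhost/' -UseBasicParsing).StatusCode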

Install and Test PHP

The Web Platform Installer is a fairly useful way of installing MS-related software. It is not installed by default on the Azure VM, so you can get it from here: Web Platform Installer. Be warned, though: the dreaded "IE Enhanced Security Configuration" is on by default and that just makes most navigation of web sites a real pain, so you might want to disable it for now (in the Server Manager).

Once you have run up the web platform installer, search for PHP and there will be various versions available. I installed PHP 5.5 and also the Windows Cache Extension for PHP 5.5 which helps with content caching (I don't really know what it does!).

I then created a new site in IIS. You could reuse the existing default web site, but I like keeping the default web site as the default web site (or removing it altogether) to make my sites a little harder to find for people who just hit IP addresses; instead I add a host header to my site so it is only found correctly by URL. I added the host header to this new site.

When creating the new site, I adjusted the application pool and changed the .Net CLR Version to "No Managed Code", since it will be a PHP site. I left the pipeline as "Integrated", although I don't think that means anything without .Net code. If you leave the application pool identity as "ApplicationPoolIdentity" then you will need to set up permissions on your PHP web root to allow that user to access the folders. Instructions are here: App Pool Identity Understood
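
If you would rather script the site and application pool setup than click through IIS Manager, a rough equivalent is below. The site name, host header and web root path are all placeholders for whatever you chose:

    Import-Module WebAdministration

    # Placeholders - substitute your own site name, host header and web root
    New-Item -ItemType Directory -Path 'C:\inetpub\phproot' -Force | Out-Null

    New-WebAppPool -Name 'PhpSitePool'
    # "No Managed Code" is an empty managedRuntimeVersion
    Set-ItemProperty 'IIS:\AppPools\PhpSitePool' -Name managedRuntimeVersion -Value ''

    New-Website -Name 'PhpSite' -Port 80 -HostHeader 'test.example.com' `
        -PhysicalPath 'C:\inetpub\phproot' -ApplicationPool 'PhpSitePool'

    # Give the ApplicationPoolIdentity user read/execute access to the web root
    icacls 'C:\inetpub\phproot' /grant 'IIS AppPool\PhpSitePool:(OI)(CI)RX'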

Once that was done, I went into my PHP web root (which I created alongside wwwroot in inetpub, just for consistency) and created a test PHP file that just echoed phpinfo(), saving it as index.php. Note that phpinfo() contains information useful to an attacker, so you should generally not dump that data to the web site once the public port has been opened, unless it is in an obscure filename that an attacker would not easily find. You can simply echo something like "Hello world" to prove that you are reaching the site during further testing.
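
The test file is a one-liner; here is one way to create it from PowerShell (the path is the hypothetical web root from above):

    # phpinfo() is for initial testing only - swap it for a plain
    # echo 'Hello world'; before the site is exposed to the public
    Set-Content -Path 'C:\inetpub\phproot\index.php' -Value '<?php phpinfo(); ?>'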

Since I had already set up a host header on my site (remember, that means the site must be visited via a URL, not an IP address or localhost), I had to edit the hosts file on the server to point that test URL to the localhost IP address 127.0.0.1. I then visited the site in IE using the full URL and it all worked? Nope, big crash! Opening up the Event Viewer, I found an error which wasn't formatted very clearly but which was caused by php-cgi.exe and mentioned msvcr110.dll.

It turns out that php-cgi.exe has a dependency (msvcr110.dll, part of the Visual C++ runtime) which isn't installed by the Web Platform Installer, possibly because it used to be present on older or more standard versions of Windows Server. Anyway, I visited and installed both the 32-bit and 64-bit versions of this DLL (apparently you need the 32-bit one even on Win64) from here: http://www.microsoft.com/... and then my page sprang to life!

Open the VM Endpoint

By default, the VM won't have opened up the web endpoint to the world, because when you create it, Azure doesn't know what you are going to use the server for. Fortunately, this is pretty easy to do. Note: you might have already set this up when you created the VM, in which case, just skip straight to testing that the open endpoint is working.

Open up the management portal and select the Virtual Machines icon on the left-hand side. Important: do not get to the VM via the "All" button, because otherwise you do not see the Endpoints menu (or at least I didn't - not sure if it's a bug or not). After clicking on Virtual Machines, click on the VM you are working on and you should see an Endpoints menu in the portal. Click this.

By default, there is an RDP and a Powershell endpoint (if you ticked the box when you created the VM). Obviously the RDP endpoint is needed for remoting (unless you tunnel via another machine on the virtual network). The powershell endpoint is for remote powershell operation, in other words, you can automate things by calling powershell as if you were on the remote machine - very useful but not something I need right now.

Hit "Add" and in the add dialog, use "Add a stand-alone endpoint" and then you can choose the name from a dropdown list of "known" endpoint types - e.g. http (if you want https, there will be more configuration to do on the box i.e. the installation of the cert and linking it to the web site). Don't tick any boxes about load balancing just yet - we only have one server - which will be deleted in a minute.

You can test the box, either by using real DNS lookup, or in my case, temporarily, by setting my LOCAL hosts file to direct the test URL to the public virtual IP address of the VM (as seen in the dashboard for the VM).
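
The hosts file tweak on the development machine is just one extra line; from an elevated prompt it is roughly this (the VIP and URL here are placeholders):

    Add-Content -Path "$env:SystemRoot\System32\drivers\etc\hosts" -Value '203.0.113.10  test.example.com'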

If you visit the site from your development machine, it should show. If you used phpinfo() in your test page, I suggest removing that now since the information contained within it is quite valuable to an attacker.

Clone the Virtual Machine into an Image

In order to have a web server farm, naturally, you want to clone the contents of a VM. This link shows you how to do that on the source VM and then in the Azure portal. Note, this process will delete the original VM, keeping its disk in a state that can be used to create new VMs.

This involves running sysprep.exe on the source VM, which generalises the system and shuts the virtual machine down; the capture step in the portal then creates an image and deletes the original virtual machine. You must manually delete the cloud service that would have been created for this virtual machine unless you plan to re-use it for your new load-balanced farm.
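
If you would rather capture the image from PowerShell than from the portal, the classic cmdlet looked roughly like this (names are placeholders; run it after sysprep has shut the VM down):

    # Generalized because sysprep was run; this also removes the source VM
    Save-AzureVMImage -ServiceName 'MyCloudService' -Name 'MyVm' `
        -ImageName 'php-web-image' -OSState Generalized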

Once this is done, you can create new virtual machines based on the image you have just taken. When you choose "Add" under Virtual Machines, choose the "From Gallery" option and then click "My Images" on the left. Select the image you have just created and choose Next.

Note: In my case, the site I created under c:\inetpub\ was copied as part of the image, but I don't know exactly what is or isn't copied (I know networking settings are not) - so please test things to make sure it cloned what you think it cloned.

At this point, you can create the availability set and load balancing endpoint or otherwise see the next section about adding it later (you might as well add it now).

Once this has finished, you will have a replacement VM based on the image you took - which is a good test of whether it is correctly set up. If not, you can modify your new virtual machine and run sysprep again to create a new image until new VMs are created correctly. Even before you have created additional VMs, you should be able to reset your DNS to point to the new IP address of this VM and your site should still work.
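
The portal is the easiest route, but for reference, creating a VM from the captured image could also be scripted with the classic cmdlets, roughly like this (instance size, names, credentials and location are all placeholders):

    $password = 'use-a-strong-password-here'   # placeholder
    New-AzureVMConfig -Name 'web1' -InstanceSize Small `
                      -ImageName 'php-web-image' -AvailabilitySetName 'web-av' |
        Add-AzureProvisioningConfig -Windows -AdminUsername 'azureadmin' -Password $password |
        New-AzureVM -ServiceName 'MyCloudService' -Location 'West Europe'
        # -Location is only needed if the cloud service does not exist yet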

Create Availability Set and LB

Creating multiple VMs for a service can give both performance and resilience benefits. Performance benefits because more than one VM can handle the incoming requests but also resilience because the multiple VMs can be created across "racks" in the data centre, each of which has redundant power supplies and network switches meaning a failure that affects multiple VMs is unlikely.

You might have already done this during VM creation but otherwise go to the configure tab in the portal and choose to create a new availability set and give it a name. The VM might have to restart when you save this change.

You can also modify the endpoint you created earlier for the web site and tick "Create a load-balanced set", then keep the defaults. The probe settings control how often the load balancer checks each endpoint for responsiveness; an unresponsive endpoint is taken out of rotation and re-checked after the probe interval (e.g. 15 seconds) until it comes back. I don't imagine my services will be unresponsive any time soon.
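
Again, this can be done with the classic cmdlets instead of the portal. Adding the endpoint as load-balanced from the start looks roughly like this (the set name and probe values are placeholders, and if a stand-alone HTTP endpoint already exists you would remove it first with Remove-AzureEndpoint):

    Get-AzureVM -ServiceName 'MyCloudService' -Name 'web1' |
        Add-AzureEndpoint -Name 'HTTP' -Protocol tcp -LocalPort 80 -PublicPort 80 `
            -LBSetName 'web-lb' -ProbeProtocol http -ProbePort 80 -ProbePath '/' |
        Update-AzureVM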

Create Additional VMs

By now you should have a single VM running in an availability set with a Load Balanced endpoint. You now need to create any additional VMs that you want in a similar way (From Gallery) with two differences.

Firstly, do not choose "Create new cloud service" but select the name of your first VM's cloud service. Also, do not add an HTTP endpoint; we will link the load-balanced one after creation.

Once the VM is started, click to select it and choose "Endpoints". Click "Add", but instead of choosing "Add a stand-alone endpoint", choose the option "Add an endpoint to an existing load-balanced set" and select the load-balancing endpoint you set up for the first new VM.
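
Scripted, the second VM goes into the same cloud service (so no -Location this time) and then gets an endpoint with the same -LBSetName - something like the sketch below, with the usual placeholder names; as far as I can tell the probe settings need to match the ones used when the set was created:

    $password = 'use-a-strong-password-here'   # placeholder, as before
    New-AzureVMConfig -Name 'web2' -InstanceSize Small `
                      -ImageName 'php-web-image' -AvailabilitySetName 'web-av' |
        Add-AzureProvisioningConfig -Windows -AdminUsername 'azureadmin' -Password $password |
        New-AzureVM -ServiceName 'MyCloudService'

    # Join the existing load-balanced set rather than adding a stand-alone endpoint
    Get-AzureVM -ServiceName 'MyCloudService' -Name 'web2' |
        Add-AzureEndpoint -Name 'HTTP' -Protocol tcp -LocalPort 80 -PublicPort 80 `
            -LBSetName 'web-lb' -ProbeProtocol http -ProbePort 80 -ProbePath '/' |
        Update-AzureVM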

Once this is done, the easiest way to test it is to alter the default page on each server in some way that lets you tell which server is supplying the page. Don't expose anything useful about the server; for instance, you should probably avoid returning the Azure cloud service name (which is not exactly private but is not necessarily public either) and instead return something innocuous like "Server 1" and "Server 2".

Now visit the site and keep hitting Ctrl-F5 to force a refresh, and make sure that you get pages served from both servers. They won't necessarily alternate exactly in turn, but as long as you see both, it is working OK.
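
A quick way to hammer the site from PowerShell and eyeball which server answered (assuming each server's test page identifies itself; the URL is a placeholder):

    # The no-cache header helps avoid cached responses skewing the result
    1..10 | ForEach-Object {
        (Invoke-WebRequest -Uri 'http://test.example.com/' -UseBasicParsing `
            -Headers @{ 'Cache-Control' = 'no-cache' }).Content
    }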

Deploying your Project

Deployment and change control is obviously a big subject and not something you should just jump into. My experience is that you want quick, easy deployments (especially when someone is waiting for an urgent fix!) but it should also be easy to rollback if you deploy something that doesn't work properly. In addition, you should have good visibility of changes, because trying to track down a bug if you are not properly tagging code changes can be really difficult. In a recent bug I had, because I could diff the code changes, I knew that the problem was a broken deployment and not a code error.

My code lives in Subversion and I tag releases before they go live. What I plan to do is write a PowerShell script that can use the remote PowerShell functionality to 1) copy the current live site to a backup directory, 2) pull a labelled tag from Subversion into the web root, and 3) provide the option to roll back the deployment if it all goes wrong. I haven't written this yet, but it should be fairly easy since it will mostly use svn command-line calls and a few directory copies. It might have to use some IIS functions to point the site to different directories, which might be easier than copying files over the top of other files (which always goes wrong when a process has them open/in-use).
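
A minimal sketch of the kind of script described above - not the finished article. All the server names, paths and the Subversion URL are placeholders; it assumes remote PowerShell is reachable on the VMs (credential/SSL options omitted) and that the svn command-line client is installed on them:

    param(
        [string]$Tag,       # e.g. 'release-1.2.3' - the Subversion tag to deploy
        [switch]$Rollback   # restore the previous backup instead of deploying
    )

    $servers = 'web1.example.com', 'web2.example.com'   # placeholders
    $webRoot = 'C:\inetpub\phproot'
    $backup  = 'C:\inetpub\phproot_backup'
    $svnBase = 'https://svn.example.com/repo/tags'      # placeholder

    foreach ($server in $servers) {
        Invoke-Command -ComputerName $server -ScriptBlock {
            param($Tag, $Rollback, $webRoot, $backup, $svnBase)

            if ($Rollback) {
                # 3) Roll back: put the previous release back in place
                Remove-Item -Recurse -Force $webRoot
                Copy-Item -Recurse $backup $webRoot
            }
            else {
                # 1) Back up the current live site
                Remove-Item -Recurse -Force $backup -ErrorAction SilentlyContinue
                Copy-Item -Recurse $webRoot $backup

                # 2) Export the tagged release into the web root
                & svn export --force "$svnBase/$Tag" $webRoot
            }
        } -ArgumentList $Tag, $Rollback, $webRoot, $backup, $svnBase
    }

Exporting over the top of the existing files is the simple version; pointing IIS at a fresh directory per release, as mentioned above, would avoid locked-file problems and make rollback a matter of switching the site's physical path back.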