On Friday last week our new Democracy webpages were put to the test with the Local Election results being broadcast and reported live.
Unfortunately, we experienced some downtime during the morning, which clearly wasn’t a good thing.
We want to share some of the lessons we learned based on a team conversation this morning.
Firstly, let’s focus on some of the positives, which actually contributed to the site going down but are worth celebrating all the same.
The stats for Friday:
- 42,668 visits
- 255,281 page views
- Elections Live Widget – 4,633 visits
- Average time spent on site – 5 minutes 45 seconds
- 6,066 visits from Mobile
For comparison our main corporate website stats for Friday were:
- 48,938 Page views
These stats are fantastic, but we didn’t anticipate this level of usage. On an average day the main council website gets around 13,000 visits, so we had planned for a “worst case” of twice this amount during the day, which would have been great. However, the critical factor for us, and the reason the site went down, was that the majority of the 42,000 visits (more than 3 times the daily average) to the Election pages came during a 2 hour period starting from 10am.
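To put that burst in perspective, here is a rough back-of-envelope sketch in Python. The visit totals, the daily average and the 2 hour window come from the stats above; the 60% share standing in for “the majority” is purely an assumption for illustration, as is spreading a normal day evenly over 24 hours.

```python
# Figures taken from the stats in this post
ELECTION_VISITS = 42_668
TYPICAL_DAILY_VISITS = 13_000
PEAK_WINDOW_HOURS = 2

# Assumption for illustration only: "the majority" is taken to mean 60%
PEAK_SHARE = 0.6


def visits_per_minute(visits, hours):
    """Average visits per minute if `visits` are spread evenly over `hours`."""
    return visits / (hours * 60)


# What we planned for: twice a normal day
planned_worst_case = 2 * TYPICAL_DAILY_VISITS

# How the day as a whole compared with a normal day
daily_ratio = ELECTION_VISITS / TYPICAL_DAILY_VISITS

# A normal day spread evenly over 24 hours (a simplification)
typical_rate = visits_per_minute(TYPICAL_DAILY_VISITS, 24)

# The assumed peak: 60% of the election visits squeezed into 2 hours
peak_rate = visits_per_minute(ELECTION_VISITS * PEAK_SHARE, PEAK_WINDOW_HOURS)

print(f"Planned worst case: {planned_worst_case:,} visits in a day")
print(f"Actual day vs normal day: {daily_ratio:.1f}x")
print(f"Typical rate: {typical_rate:.0f} visits per minute")
print(f"Assumed peak rate: {peak_rate:.0f} visits per minute")
print(f"Peak vs typical rate: {peak_rate / typical_rate:.0f}x")
```

Under those assumptions the daily total is only a few times higher than normal, but the per-minute rate during the peak is an order of magnitude above a typical day, which is the figure that matters for the server.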
Our current hosting arrangements clearly were not up to supporting this level of usage. That is a lesson we have learned and something we have decided to review immediately; more on that later in the post.
Some other successes were our use of social media (Twitter and Facebook) and the use of mobile devices such as iPads, which essentially created a back-up for announcing results, although this clearly only provided half a service while the site was down.
Site performance and optimized themes
One of the reasons the site fell over was the sheer quantity of hits over a very short period of time. The main culprit was our elections widget, which many districts and news sources had added to their sites. While it is great that they made use of the widget, we hadn’t anticipated the impact this would have on the load on our hosting server.
The next time we attempt something like this we will test our code more rigorously to ensure we understand the ramifications for the hosting server. The short-term solution put in place by our hosts was to move the site onto a dedicated Varnish caching server in front of their cloud hosting. Because of the way this server handles load, our hosts don’t believe that a similar situation to the one we experienced on Friday morning would result in the site falling over, though they did recommend a more streamlined approach to delivering the widget.
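As a rough illustration of why a caching layer helps with a widget embedded on many other sites, the Python sketch below simulates a simple time-based cache. It is not Varnish configuration and does not describe our hosts’ actual setup; the 30 second lifetime, the `render_widget` function and the request pattern are all hypothetical.

```python
class TTLCache:
    """Serve a stored response until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds):
        self.fetch = fetch          # function that builds a fresh response
        self.ttl = ttl_seconds
        self.value = None
        self.stored_at = None
        self.origin_hits = 0        # how often the slow path actually ran

    def get(self, now):
        expired = self.stored_at is None or now - self.stored_at >= self.ttl
        if expired:
            # Only here does the request reach the origin server
            self.value = self.fetch()
            self.stored_at = now
            self.origin_hits += 1
        return self.value


def render_widget():
    # Hypothetical stand-in for the expensive page generation on the origin
    return "<div>latest results</div>"


cache = TTLCache(render_widget, ttl_seconds=30)

# Simulate 10 widget requests per second for 2 minutes
total_requests = 0
for second in range(120):
    for _ in range(10):
        cache.get(now=second)
        total_requests += 1

print(f"Requests served: {total_requests}")
print(f"Requests that reached the origin: {cache.origin_hits}")
```

The trade-off in a setup like this is freshness: with a 30 second lifetime a newly declared result could take up to 30 seconds to appear in the widget, in exchange for the origin doing a tiny fraction of the work.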
We’ve had our current hosting arrangement for about 3 years now. It was originally used to support the team’s development of school websites; however, we always had spare capacity in terms of bandwidth etc., which inevitably led to additional websites being created and hosted in the same environment.
However, a couple of years ago we stopped providing school websites and started focusing entirely on building the new Devon website using WordPress.
We’ve been talking about reviewing our hosting since the start of the year, as our current arrangement comes up for renewal in the summer, and we’ve been capturing requirements (essential and desirable) along the way. What we have so far is as follows, in no particular order:
- A fully managed service
- Excellent and robust security
- WordPress compatible hosting
- Back-up and Disaster recovery systems
- 100% network availability
- 100% server uptime
- 24/7/365 support – varying levels depending on impact/issue
- Managed monitoring
- Mirrored server configuration for beta development and testing
- Rapid scalability (lesson learned from elections)
- Commitment to ISO27001 accreditation or actual accreditation
- Optimized WordPress hosting
- Set-up and migration of existing websites to new servers
So our challenge now
Our next step is to assess the alternative hosting provision out there and find something more robust that can stand up to the kind of server load we experienced during the live elections period. While we had adequate bandwidth to cope with the overall traffic, the site wasn’t on a server geared up to handle the sheer number of visits per second/minute that we experienced. In effect we had DDoS’d ourselves!
We will be looking at the various third-party hosting options available at the enterprise level, seeing what extra services these hosts provide for the types of site we are building, and whether we can procure these services and under what terms.
One thing is for sure… before we start channelling more mainstream traffic to our WordPress multisite we need to be absolutely sure that the server it sits on can handle even the most unexpected of loads.