Today we announce the second instalment in our series of GoSquared updates. This update has been in development since November and marks out a key milestone for LiveStats in terms of the service’s under-the-hood architecture. LiveStats 1.2 introduces a completely re-written and re-structured tracking backend, geared up for blazing performance, lower latency, more accurate statistics and greater app stability. In short, the LiveStats 1.2 update launches us into the freedom of unlimited scalability, enabling us to take full advantage of the expansive power of the Cloud. Although there are no notable front-end improvements to the UI, the update allows us to scale our hardware horizontally and provide you with an even better service.
When LiveStats was experimentally released back in late October, we didn’t have a clear idea of how the app would perform under high load. However, thanks to a surge of sites implementing the service, we weren’t in the dark for very long; we quickly discovered that the system in its initial incarnation was unable to cope with the load it was being subjected to. LiveStats 1.0 was hideously unoptimised, and was completely dependent on a sluggish database which was unable to scale to a high level of writes.
Something needed to be done quickly to tune the performance of the system so that it could hold for long enough while we worked on a different approach to the tracking architecture. James detailed these modifications in his post LiveStats – One Week On and that was the system we had been operating since… until now. So, what’s changed?
Server layout
Previously, the system was bound to a single server running a single, slow and unoptimised MySQL database. Now it has been liberated by a horizontally scalable design in which any number of new servers running our custom AMI can be manually or automatically booted up as LiveStats “nodes” when other nodes approach their load limits. We developed our own internal API which allows all of our servers to work together as a single stack. Using this API, sites are allocated to specific nodes depending on their size and on the resource utilization of each node, so resource constraints can be easily avoided, leaving LiveStats functioning quickly and reliably. In addition, we have moved our DNS hosting to UltraDNS for low-latency, high-availability DNS query resolution across all of our services. Their API also gives our systems the ability to dynamically assign subdomains to our server nodes.
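To make the allocation idea concrete, here is a minimal sketch in Python of placing sites on nodes by size and utilization. It is an illustration under assumptions, not our internal API: the `Node` class, the `allocate` function, the hits-per-second figures and the 80% `limit` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: float  # hypothetical: hits per second this node can sustain
    sites: dict = field(default_factory=dict)  # site id -> estimated hits per second

    @property
    def load(self) -> float:
        # Total estimated traffic of all sites currently on this node.
        return sum(self.sites.values())

    def utilization_with(self, extra: float) -> float:
        # Fraction of capacity used if a site of size `extra` were added.
        return (self.load + extra) / self.capacity


def allocate(site_id, site_load, nodes, limit=0.8):
    """Place a site on the node that would be least utilized after taking it.

    Returns the chosen node, or None when every node would go past `limit`,
    which is the point at which a new node would need to be booted.
    """
    best = min(nodes, key=lambda n: n.utilization_with(site_load), default=None)
    if best is None or best.utilization_with(site_load) > limit:
        return None
    best.sites[site_id] = site_load
    return best
```

With two nodes of capacity 100 and 200, a site of size 50 lands on the larger node, and a site too big for either returns `None` rather than overloading a node.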
Software
A major objective for the 1.2 rewrite was to completely eliminate reliance on the MySQL database. Data collected by LiveStats is currently not persistent anyway (apart from custom naming, more on that later), so there was no need for a MySQL database when there are far more suitable tools available, like memcached. LiveStats 1.1 introduced partial memcached support which, along with closing the service as invite-only, relieved a considerable amount of load, allowing the system to just about hold out for this long despite still being database-bound. With LiveStats 1.2, each node operates its own native memcached instance with access to a large amount of memory, in which the real-time statistical data is temporarily stored. This way data is not delayed by the disk I/O that MySQL incurs, and instead resides in memory, which is extremely fast to access.
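The pattern of keeping short-lived real-time data in memory with an expiry time can be sketched as follows. This is an in-process stand-in written so that it runs without a server; it is not memcached itself or a memcached client, and the `ExpiringStore` class, the key format and the 30-second lifetime are assumptions made for illustration.

```python
import time


class ExpiringStore:
    """A tiny in-memory key/value store whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}     # key -> (value, expires_at)

    def set(self, key, value, ttl):
        # Store the value together with the moment it stops being valid.
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._items[key]
            return None
        return value
```

A visitor record written with `store.set("site42:visitor7", {"page": "/pricing"}, ttl=30)` is readable straight from memory for the next 30 seconds and then disappears without any disk access.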
Persistent Data
For certain features like custom naming of visitors, persistent data is required. Memcached, however, is not a database, and a database of some sort is usually the best way to go for persistent data. While we haven’t yet committed to a particular solution, we’ve been researching a relatively immature technology: key/value databases such as Redis, Project Voldemort (used at LinkedIn) and Cassandra, which is in production use at Digg and Rackspace. Currently we’re using a simple implementation of Redis for custom names, although this is subject to change in the future once we take a closer look at persistent data.
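As a rough sketch of why a key/value store suits custom names, the example below maps a site and visitor to a single key and stores the chosen name as the value. A plain Python dict stands in for the store so the example runs on its own; the `CustomNames` class, the `name:<site>:<visitor>` key layout and the "Anonymous" default are hypothetical and do not describe our actual Redis setup.

```python
class CustomNames:
    """Stores visitor display names under one key per (site, visitor) pair."""

    def __init__(self, store):
        # `store` is any dict-like object; a dict stands in for a key/value database.
        self._store = store

    @staticmethod
    def _key(site_id, visitor_id):
        # Hypothetical key layout: one flat string key per visitor.
        return f"name:{site_id}:{visitor_id}"

    def rename(self, site_id, visitor_id, name):
        self._store[self._key(site_id, visitor_id)] = name

    def lookup(self, site_id, visitor_id, default="Anonymous"):
        # Visitors who were never renamed fall back to the default label.
        return self._store.get(self._key(site_id, visitor_id), default)
```

Because each lookup is a single key read with no joins or queries, the access pattern needs nothing more than a persistent key/value store.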
Code
All of our backend code has been redesigned from the ground up to take full advantage of these changes in our system architecture. The way we handle our data has completely changed, and has been fully optimised with performance in mind. As such, we hope you’ll find that requests in the LiveStats app load at consistent intervals with fewer delays. Furthermore, the accuracy of the data has increased. In LiveStats 1.0 and 1.1, due to the database architecture and data propagation across databases, some hits would occasionally be dropped, and sometimes null or “ghost” hits would show up where incomplete information had been received. These problems have been eliminated and LiveStats now displays a very accurate reflection of who has visited and where.
Now that our capacity has increased, we will soon be distributing more invites more regularly. If you want one, ask us on support or Twitter; it’s likely you’ll get one! As always, new software isn’t perfect and there are likely to be bugs, so if you encounter problems, get in contact and we’ll look into them.
Finally, just a quick mention of LiveStats 2; it is under development and introduces some really intuitive new features that we know you’re going to love! More on this soon.
From all of us at GoSquared, we wish you all the best this year and we hope you enjoy LiveStats!
Geoff Wagstaff, James Gill, James Taylor
A.K.A. The GoSquared Team.