Scaling and Sessions

For a developer, scalability is one of the most hotly debated topics in high-traffic web development. There are plenty of best practices, a long list of technologies that can be leveraged for performance, and an endless supply of opinions about how things should be configured. The use of Sessions in HTTP is one of those topics that has no single right answer but sure does provoke a lot of opinion. With the right approach, you can keep an application scalable while using Sessions just by keeping a few basic concepts in mind.

Sessions vs. State

Most importantly, the difference between the two must be clearly understood. I was recently told that because an application uses sessions, it is therefore not stateless. This is simply not true (unless we are talking about ASP.NET’s SessionState class). By definition, an application is stateful if the server itself has to persist the user’s connection in order to keep providing the correct data to a single connected client. A user state is created on that server, and if a user loses their connection, they either have to reconnect to the same server to maintain a consistent user experience or create a new state on a new server.

After installation or an initial PHP compile, the session handler is set to use the file system on that web server. In this case, the configuration of PHP’s session management IS stateful, and it will not scale because it will not work behind a load balancer: a request routed to a different server will not find the session file written by the first one. Note that the reason this is stateful is not that the application uses sessions, but the server configuration.

Now for Sessions: A Session is simply a “conversation” between a user and an application that is used to persist user data. This can be done in many different ways, including through custom handlers with PHP. The session could exist entirely client-side in cookies, entirely server-side in the file system, or on a shared resource somewhere in between. By definition, putting that information on a “shared resource” (or client-side) means that your web servers remain stateless and scalable.
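To make the distinction concrete, here is a minimal sketch in Python (chosen only for illustration; the same idea applies to PHP). The `WebServer` class and `run_requests` helper are hypothetical names, and plain dictionaries stand in for the file system or a shared resource. It compares two servers that each keep their own session store against two servers that share one, with requests alternating between them the way a round-robin load balancer might send them.

```python
import itertools
import uuid


class WebServer:
    """A toy web server that keeps session data in the store it is given."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # a dict standing in for files, a cache, or a database

    def login(self, username):
        session_id = uuid.uuid4().hex
        self.store[session_id] = {"user": username}
        return session_id

    def whoami(self, session_id):
        session = self.store.get(session_id)
        return session["user"] if session else None


def run_requests(servers, username="alice", requests=4):
    """Log in once, then send follow-up requests round-robin across servers."""
    balancer = itertools.cycle(servers)
    session_id = next(balancer).login(username)
    return [next(balancer).whoami(session_id) for _ in range(requests)]


# Stateful: every server has its own private store (like file-based sessions).
local_results = run_requests([WebServer("web1", {}), WebServer("web2", {})])

# Stateless: both servers point at the same shared store.
shared_store = {}
shared_results = run_requests(
    [WebServer("web1", shared_store), WebServer("web2", shared_store)]
)

print(local_results)   # the session is only visible on the server that created it
print(shared_results)  # the session is visible from every server
```

The application code is identical in both runs; only where the session data lives changes, which is the point of the distinction above.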

How Sessions Can Be Scaled

That being said, statelessness is a fundamental aspect of making sure your application is scalable. Creating an application with a dependency on user states means that you forfeit the ability to use a load balancer properly and that you immediately increase your exposure to DoS attacks. There are a number of ways to achieve scalability with sessions.

1. Memcached.

Personally, I consider this the easiest and most effective method of managing user sessions. It is especially useful if a caching pool is already available to your application; however, installing memcached on your web servers is trivial if you do not want the overhead of extra resources. I won’t elaborate on setting up and configuring memcached as there are plenty of resources available for that, but providing a pooled, redundant cache for PHP’s session management immediately moves your dependency to a shared data source and creates a stateless and scalable environment. It also requires minimal effort on the application side if you are using a framework or platform that depends on server-side sessions (such as Symfony2).
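As a rough illustration of what a cache-backed session store does, here is a sketch in Python rather than PHP. It is not how PHP’s memcached session handler is implemented. `InMemoryCache` is a hypothetical stand-in for a memcached client (it only mimics `get`, `set` with an expiry, and `delete`), and `CacheSessionStore`, the `sess:` key prefix, and the method names are assumptions made for the example. The default lifetime of 1440 seconds mirrors PHP’s default `session.gc_maxlifetime`.

```python
import json
import time


class InMemoryCache:
    """Stand-in for a memcached client: get/set/delete with expiry in seconds."""

    def __init__(self, clock=time.time):
        self._data = {}
        self._clock = clock

    def set(self, key, value, expire=0):
        deadline = self._clock() + expire if expire else None
        self._data[key] = (value, deadline)

    def get(self, key):
        value, deadline = self._data.get(key, (None, None))
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)  # expired entries behave like misses
            return None
        return value

    def delete(self, key):
        self._data.pop(key, None)


class CacheSessionStore:
    """Stores each session as JSON under a prefixed key with a sliding lifetime."""

    def __init__(self, cache, lifetime=1440, prefix="sess:"):
        self.cache = cache
        self.lifetime = lifetime
        self.prefix = prefix

    def read(self, session_id):
        raw = self.cache.get(self.prefix + session_id)
        return json.loads(raw) if raw is not None else {}

    def write(self, session_id, data):
        # Every write resets the expiry, so active sessions stay alive.
        self.cache.set(self.prefix + session_id, json.dumps(data), expire=self.lifetime)

    def destroy(self, session_id):
        self.cache.delete(self.prefix + session_id)


# Usage with a fake clock so expiry can be shown without waiting.
now = [1000.0]
store = CacheSessionStore(InMemoryCache(clock=lambda: now[0]), lifetime=60)
store.write("abc123", {"user": "alice"})
fresh = store.read("abc123")    # read back immediately
now[0] += 61
expired = store.read("abc123")  # read after the lifetime has passed
```

Because the web server holds nothing between requests, any server in the pool can answer any request as long as it can reach the cache.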

2. MySQL (or any database for that matter).

I have found this method to be fairly common in my research for this post; however, I have not used it myself, because I dislike creating additional, unnecessary dependencies on the database (it is a delicate flower!). That being said, MySQL is another way to accomplish the same result if you prefer doing it this way or simply cannot use memcached. It still gets you a stateless application, but it does require a bit of extra work, as PHP does not ship with a native session handler for MySQL session storage and you have to register a custom one. These handlers do seem fairly straightforward to create, and there are classes available all over the internet.
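To give a feel for how little such a handler has to do, here is a sketch in Python. It uses the standard library’s `sqlite3` module purely as a stand-in for MySQL so the example runs anywhere. The `DatabaseSessionStore` class, the `sessions` table layout, and the column names are assumptions for illustration. The methods loosely follow the read, write, destroy, and garbage-collect callbacks that a custom PHP session handler provides.

```python
import json
import sqlite3
import time


class DatabaseSessionStore:
    """Keeps sessions in a single table so any web server can read them."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            " id TEXT PRIMARY KEY,"
            " data TEXT NOT NULL,"
            " updated_at INTEGER NOT NULL)"
        )

    def read(self, session_id):
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}

    def write(self, session_id, data, now=None):
        now = int(time.time()) if now is None else now
        # REPLACE inserts a new row or overwrites the existing one for this id.
        self.db.execute(
            "REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)",
            (session_id, json.dumps(data), now),
        )
        self.db.commit()

    def destroy(self, session_id):
        self.db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
        self.db.commit()

    def gc(self, max_lifetime, now=None):
        """Delete sessions not written within max_lifetime seconds."""
        now = int(time.time()) if now is None else now
        cursor = self.db.execute(
            "DELETE FROM sessions WHERE updated_at < ?", (now - max_lifetime,)
        )
        self.db.commit()
        return cursor.rowcount


# Usage: explicit timestamps make the clean-up step easy to follow.
store = DatabaseSessionStore(sqlite3.connect(":memory:"))
store.write("abc123", {"user": "alice"}, now=1000)
store.write("old456", {"user": "bob"}, now=100)
removed = store.gc(max_lifetime=600, now=1200)  # only old456 is stale
```

Unlike a cache with built-in expiry, the database version has to clean up stale rows itself, which is part of the extra overhead mentioned above.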

3. User Cookies.

The most obvious and common way to ensure statelessness is to put the dependency entirely on the user’s browser. This way you step around the issue completely by putting the burden of persistence client-side, so that regardless of the conditions of the application environment, the user’s data lives on their side. As long as you use secure, encrypted cookies, this is also one of the best methods for maintaining scalability. However, forcing everything client-side is not always an option (browsers limit each cookie to roughly 4 KB, and some data should never leave the server), so if you have to handle sessions server-side this point may be moot.
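Here is a minimal sketch of the tamper-proofing half of that idea, again in Python. Note that it signs the cookie with an HMAC rather than encrypting it, so the browser can still read the contents; it just cannot change them without the server noticing. Real encryption would need a proper cryptography library on top of this. `SECRET_KEY`, `encode_cookie`, and `decode_cookie` are hypothetical names for the example.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret; every web server in the pool would share the same value.
SECRET_KEY = b"replace-with-a-long-random-secret"


def encode_cookie(data, key=SECRET_KEY):
    """Serialize session data and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(data, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_cookie(cookie, key=SECRET_KEY):
    """Return the session data, or None if the signature does not match."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking where the first mismatch occurs.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# Usage: any server holding the key can verify the cookie without a lookup.
cookie = encode_cookie({"user": "alice"})
session = decode_cookie(cookie)
tampered = decode_cookie("x" + cookie)
wrong_key = decode_cookie(cookie, key=b"some-other-secret")
```

No server has to store anything here: the only shared piece is the secret key, which is why this approach scales so easily.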

All three of these options (and I’m sure there are plenty of others) are perfectly acceptable ways to manage sessions without sacrificing scalability. For more information (and yet another Wikipedia link), check out this high-level page about Session Management.

Twitter: @jsmckinney

Copyright 2013