One of the most common tasks we encounter when building web sites is setting up an authentication/authorization system so users can log in, access their own personal information, and not be able to access other people's information. Nowadays we use a framework that has this sort of thing built in, but in the old days we used to have to start from scratch.
No problem, right? We just make a Users table, with a primary key (user_id), a username, and a password (hashed with bcrypt, of course). Then, whenever we have any information that belongs to one person, we add a user_id column to that table, and make sure the current user can only see rows that have her user_id in the user_id column. Any searches we do will have WHERE user_id = ? in them, so we can pass in the user ID of the current user and get only that user's records. Easy.
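Here is a minimal sketch of that phase 1 design, using Python's built-in sqlite3 module. The notes table, its columns, and the notes_for helper are hypothetical names chosen for illustration, and the password column holds a placeholder string where a real system would store a bcrypt hash.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE,
        password TEXT NOT NULL          -- bcrypt hash in a real system
    );
    CREATE TABLE notes (                -- hypothetical user-owned data
        note_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        body    TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'alice', '<hash>'), (2, 'bob', '<hash>');
    INSERT INTO notes (user_id, body) VALUES
        (1, 'alice note'), (2, 'bob note'), (1, 'another alice note');
""")


def notes_for(current_user_id):
    """Return only the rows owned by the current user."""
    rows = conn.execute(
        "SELECT body FROM notes WHERE user_id = ? ORDER BY note_id",
        (current_user_id,),
    )
    return [body for (body,) in rows]
```

Notice that the access rule lives inside the query itself: the only way to ask for notes is to ask for one user's notes.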
That looks simple, but it isn't. We have "complected" two different things here: user identity and access control. In our phase 1 design, we've built an easy-to-implement access control system that's baked into our database schema, and that has become an underlying assumption in how we design every user-specific data object.
But we don't notice that right away. The design goes quickly, implementation goes quickly, push to production comes off without a hitch, and people start using it and really liking it. Everything seems great, we love our job, and the business side of the house loves us.
The next day they come to us with their projects for phase 2: they need to set up some admin users who can look at other people's data and make changes to it. OK, that doesn't sound too bad: we'll just add another column to the Users table and call it is_admin. That's how we know who is allowed to access the data for other users.
Now the pain starts to set in. We have to go through all the code and find every place that checks the user_id field, and change the test so it reads:
if (user_id == user.user_id || user.is_admin) {
// ...
}
We also have to find all the places where our database search is looking for records WHERE user_id = ?, and modify them not to restrict the results by user ID if the user's is_admin flag is set.
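As a rough sketch of what that change looks like for a single query, again using Python's sqlite3 module: the notes table and the visible_notes helper are hypothetical, and a real code base would have to repeat this branching in every query that filters on user_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notes (note_id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    INSERT INTO notes (user_id, body) VALUES (1, 'alice note'), (2, 'bob note');
""")


def visible_notes(user_id, is_admin):
    """Phase 2: admins skip the per-user restriction; everyone else keeps it."""
    if is_admin:
        sql, params = "SELECT body FROM notes ORDER BY note_id", ()
    else:
        sql = "SELECT body FROM notes WHERE user_id = ? ORDER BY note_id"
        params = (user_id,)
    return [body for (body,) in conn.execute(sql, params)]
```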
This is tedious and error-prone. We have to touch so much of our code, and even then we might overlook some sneaky way our code has been influenced by the assumption that every object with a user_id property will have access tied to the ID of the current user. We're less confident about this push, but we do tons of tests (which we also had to rewrite, since they were written under the same assumptions), and all our tests pass, so we cross our fingers and release. We find a few bugs that we missed, get them fixed over the next week or so, and then heave a sigh of relief, sit back, and ask business what's next.
What's next, business tells us, is that we've outsourced our customer support, so we now need three tiers of admin access: customer support reps, who can access customer accounts but no other accounts; Tier-2 Support, who can access customer accounts AND customer support rep accounts, but not Tier-3 accounts; and lastly the Tier-3 folks, who get something like sysadmin-level access, but for the business folks in-house. Oh, and some of our customers now want corporate accounts, where multiple individual users have access to shared data, but only for their own company.
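To see where this is heading, here is a sketch of the phase 3 check written in the same style as before. Everything in it is an assumption made for illustration: the role names, the User class, the company_id field, the reading of Tier-3 as "can access everything", and the reading of corporate accounts as "colleagues can see each other's data".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical role names; the tiers are only described informally.
CUSTOMER, SUPPORT_REP, TIER_2, TIER_3 = "customer", "support_rep", "tier_2", "tier_3"


@dataclass
class User:
    user_id: int
    role: str = CUSTOMER
    company_id: Optional[int] = None  # set only for corporate accounts


def can_access(viewer: User, owner: User) -> bool:
    """One growing conditional, still keyed on who owns the row."""
    if viewer.user_id == owner.user_id:
        return True  # the original phase 1 rule
    if viewer.role == TIER_3:
        return True  # assumed: sysadmin-like access to everything
    if viewer.role == TIER_2:
        return owner.role in (CUSTOMER, SUPPORT_REP)
    if viewer.role == SUPPORT_REP:
        return owner.role == CUSTOMER
    # Assumed: corporate users share data within their own company.
    return viewer.company_id is not None and viewer.company_id == owner.company_id
```

Every place that used to compare user_id values now needs this logic, and every query that used WHERE user_id = ? needs an equivalent rewrite.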
At this point, we begin to question our career choices, but let's be honest: we started digging our own grave back at the beginning, when we decided to entangle access control and user identity. That's complex, in the technical sense of complexity. We've intertwined two separate ideas, the idea of identity and the idea of access. Had we based our original design on simpler ideas, it would have been easier to update later on.
In Part 2, I want to look at separating the notion of user identity from the notion of access control. This separation of concepts will help us build a system where all the moving parts are simpler components, which can then be modified and upgraded with a minimal impact on other, unrelated components.