Getting Rid of Emails
My original website hosting the Learn Python the Hard Way book had no passwords. My course at the time was aimed at people with very little computer experience, so it was reasonable that they'd have problems with passwords. When I sat down to make the system more self-service I sort of stumbled on the idea of just removing passwords and using email. Seemed like a good idea at the time.
The email system today is called a "magic link", and the idea is you don't require a password but instead you send the user an email with a special authentication link. They click the link and then you authenticate that browser forever with a cookie. This is as secure as any system with an email based password reset, but they only need to remember their email address.
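A minimal sketch of how such a link token could work, assuming the server signs the email and an expiry time with a secret key. The names `SERVER_KEY`, `LINK_LIFETIME_SECONDS`, `make_login_token`, and `check_login_token` are hypothetical and not taken from the original site.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical server-side secret; a real deployment would load this from config.
SERVER_KEY = secrets.token_bytes(32)
LINK_LIFETIME_SECONDS = 15 * 60  # assumed expiry window, not from the post


def _sign(payload: str) -> str:
    """HMAC-SHA256 signature of the payload, as hex."""
    return hmac.new(SERVER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_login_token(email: str, now: Optional[float] = None) -> str:
    """Build a token of the form 'email|expiry|signature' to embed in the emailed link."""
    issued = time.time() if now is None else now
    payload = f"{email}|{int(issued + LINK_LIFETIME_SECONDS)}"
    return f"{payload}|{_sign(payload)}"


def check_login_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the email if the token is authentic and unexpired, otherwise None."""
    parts = token.rsplit("|", 2)
    if len(parts) != 3:
        return None
    email, expires, signature = parts
    expected = _sign(f"{email}|{expires}")
    # Constant-time comparison on bytes so odd input cannot raise or leak timing.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    current = time.time() if now is None else now
    if not expires.isdigit() or current > int(expires):
        return None
    return email
```

When the user clicks the link, the server would run `check_login_token` on the token and, if it returns an email, set the long-lived cookie for that browser.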
Easy right? Just get your email address right and you don't need a password. Hahahahahahaha oh was I naive.
The only thing people get more wrong than passwords is email. They get the domains wrong, transpose letters, forget them completely, can't receive emails, can't read them, have aggressive spam blockers, and can't seem to find the most recent emails to click. I actually can't blame any of the users because I can't even get most of this stuff right and I've been using email for decades.
People are also oddly afraid of giving out their email to a service. I say it's odd not because their concerns aren't valid. I say the paranoia over email is unjustified given they'll hand out their phone number like it's free candy. Up until recently companies could pay to actively track the location of any phone number and the mobile carriers were all fined for it. The idea that users think email is some vast tracking conspiracy but phone numbers are totally secure is bizarre.
Phone numbers are way worse. I'll have people who are willing to give me a phone number or log in with a Facebook login, walk around with 200 tracking apps on their phone, but then freak out if I asked for an email so I can save them some trouble using a password? Weird.
I think the security concerns regarding email are valid, but compared to many other privacy-invading vectors, email is much safer. But, I like a technical challenge. What happens if I just...
Don't Store Your Email
I want to get off emails as an authentication system entirely and not store anything I don't need to store. Can I remove emails completely from my database and use only username/password combinations?
First, we have to analyze what are some ways that I might abuse an email someone gives me. Keep in mind I'm not Facebook and seriously do not have the budget or desire to data mine someone to figure out what kind of porn they like. I am definitely not that powerful, but here's what storing your email in my database could do:
- I could spam you with things you don't want.
- I could sell your email to someone else who spams you.
- Someone could break into my computers and harvest your emails and passwords, then hack your other accounts.
- I could use your email with many of the "tracking helper services" out there to then figure out where you are on the internet.
I'm not Google or Facebook so the one thing I can't do is track you across the internet based on your email. That requires getting a large enough sample of other websites to put a tracking pixel or similar tracker that I then use to figure out where you go.
Given the above 4 big problems, how can I simply remove them as a problem entirely?
No Emails At All
Let's say I simply don't ask you for your email. Let's go old school and I just ask you for a password and username, and that's all I store. To be safe, I store the passwords with the industry standard bcryptjs or bcrypt to protect them.
bcrypt uses a one-way hash (not really encryption, since there's no key that reverses it) to convert your password into a very large number that's easy to confirm, but difficult to reverse engineer with only the number. When you give my server your password I run it through the same hash and if it comes out equal to what I have in the database (more or less), then I know you gave me the correct one.
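A rough sketch of that store-then-verify cycle. To stay inside Python's standard library it swaps in PBKDF2 (`hashlib.pbkdf2_hmac`) for bcrypt, so treat it as an illustration of the idea rather than the code the site runs. `ITERATIONS`, `hash_secret`, and `verify_secret` are made-up names. The random salt is part of the "more or less": two hashes of the same password never look alike, so the check has to re-hash with the stored salt.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; higher means slower guessing
SALT_BYTES = 16


def hash_secret(secret: str) -> bytes:
    """Return salt + digest; only this value would go in the database."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, ITERATIONS)
    return salt + digest


def verify_secret(secret: str, stored: bytes) -> bool:
    """Re-hash the submitted secret with the stored salt and compare digests."""
    salt, digest = stored[:SALT_BYTES], stored[SALT_BYTES:]
    candidate = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

With the bcrypt or bcryptjs packages the shape would be the same: one call to hash at signup, one call to compare at login.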
What makes bcrypt secure is someone with this large number has to spend a lot of time feeding bad passwords through bcrypt--requiring expensive CPU usage--until they find your password. In older systems the hacker would simply have access to the passwords in plain text, and since people share passwords between systems (but shouldn't) it's easy to then hack someone's accounts across multiple servers.
With bcrypt an attacker now has to solve a complex decryption problem that could take centuries to solve. I think it's centuries. Either way it's a really long time compared to before, so we consider bcrypt exploitation later unless someone uses a really terrible easy to guess password.
Note: Just to demonstrate how easy it is to crack most people's terrible passwords you should go get john the ripper and try it on your computer.
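To make the guessing attack concrete, here is a toy version of what a cracking tool does, assuming the attacker already has the salt and hash from a stolen database. It uses PBKDF2 from the standard library as a stand-in for bcrypt, and the word list and function names are invented for the example.

```python
import hashlib
import os
from typing import Iterable, Optional

ITERATIONS = 50_000  # assumed work factor; each guess costs this much CPU


def slow_hash(password: str, salt: bytes) -> bytes:
    """Deliberately slow salted hash (PBKDF2 standing in for bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def dictionary_attack(salt: bytes, target: bytes, guesses: Iterable[str]) -> Optional[str]:
    """Hash every guess with the stolen salt until one matches the stolen hash."""
    for guess in guesses:
        if slow_hash(guess, salt) == target:
            return guess
    return None


# Hypothetical list of common passwords; real lists have millions of entries.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]

salt = os.urandom(16)
weak = slow_hash("letmein", salt)
strong = slow_hash("t7#Qv9!mZp2&hL0x", salt)

cracked_weak = dictionary_attack(salt, weak, COMMON_PASSWORDS)
cracked_strong = dictionary_attack(salt, strong, COMMON_PASSWORDS)
```

The weak password falls as soon as the loop reaches it, while the random one survives the whole list. The slow hash only sets the price per guess, so a password that shows up in a common list is cheap to find no matter what.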
I'm telling you all of this about bcrypt because it plays into what I end up deciding later. Hang on. For now, I have a system with no information about you. You could give me two totally randomized phrases as your username and password and pay me with Monero and I'd probably never know who you are.
Except, we now have one problem.
Be honest with yourself. How many times a year do you forget or lose a password and need to reset it? I'm an avid random password user and use a password store system and I still have to reset a random password periodically. Sometimes I have to reset a password not because of my fault, but because the programmers on another website messed up their password hashing code.
Non-programmers are far worse off than a professional like me. They use passwords they remember, and we all know how terrible human memory is at remembering anything. This limitation in human memory is the main reason people recycle passwords between systems, and why they need to do password resets periodically.
My super private course system has no emails. How will you reset your password? You don't remember your password, or even your username sometimes. How will I confirm you actually own the account without some secondary identity proof?
Alternatives to Email are Worse
People need password resets through some secondary proof of identity. Email works as a secondary identity and that's why people are worried about it being used as a tracking mechanism. If we need password resets, then we need an alternative proof of identity. Maybe one of these:
- Phone Numbers -- Worse than email for privacy.
- OAuth Logins -- Still being tracked by the other site, and that other site can shut down my site for no reason.
- 2FA -- This might work, but now people need to set up 2FA which is difficult and then why have a password at all?
- WeChat, WhatsApp -- Just as much tracking as phone numbers or email, also difficult to code against.
Clearly Phone Numbers, WeChat, WhatsApp or any alternative communication system relying on a 3rd party is just as bad as email. I'd say these are actually worse than email because people believe they're better when they're not.
OAuth also doesn't solve the problem of not tracking, it just makes that problem someone else's problem. It also puts the control of my user's logins in the hands of a company I most likely do not trust.
That leaves 2FA (Two Factor Auth) as an alternative second auth. In this case you would install an app (like Google Authenticator) on your phone that uses the current time and some cryptography to create a synchronized rotating shared secret. If you lose your password, I challenge you to show me your 2FA code, and if it matches then I let you ... wait a minute.
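For the curious, the "current time and some cryptography" part can be sketched in a few lines. This follows the time-based one-time password scheme (TOTP, RFC 6238) that authenticator apps commonly implement; the function name and the default 30-second step and 6 digits are choices made for the example.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Derive the rotating code from a shared secret and the current time window."""
    moment = time.time() if at is None else at
    counter = int(moment // step)  # number of whole time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte picks 4 bytes
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)
```

The phone and the server both hold `secret`, so both can compute the same code for the same time window, and the server just compares what you type against its own result.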
How I got in this mess was trying to add password resets and realizing with password reset emails you don't need passwords. It's just security theater. If I'm letting you reset with a 2FA device, well then that 2FA is useless as an actual 2FA. It's just a login password and I'm simply making this more complicated.
I believe that if you want password resets on your website (and you could debate that), then you are stuck asking for either something difficult for most users to use, or something that invades their privacy worse than email.
Let's take a look at those four big risks people see with giving me an email:
- Spam you
- Sell your email
- Hacker gets your email
- Track you across the internet with my vast resources (LOL).
What essential element do I need to make all of these work? I need your email in my database in clear text. Meaning, no encryption or obfuscation. If all I want email for is password resets, then could I store these emails encrypted somehow so they're only available during a password reset request?
Duh, bcrypt The Email!
If I use something like AES to encrypt the emails then any hacker getting in--or myself if I suddenly feel like it--can simply use the key the server is using and read the emails. We need an "encryption" that not even I can decrypt from the database.
We already do this with passwords in our design. We use bcrypt to encrypt the password, and decrypting it is difficult for most attackers. Why not store the email with bcrypt as well? Here's how it might work:
- When you register I store your email just like I do passwords using bcrypt.
- You lose your password, but you know your username so you give me your username and email.
- I take your email and process it just like a password. I pull up your record via your username, grab your stored bcrypt(email) hash.
- I then hash the email you've submitted against what I have in the database, and if bcrypt says they match, then I can confirm you at least know the email.
- Not done yet. I take the email you gave me--which is temporarily available to my server--and send you the usual password reset email. This prevents someone who just knows your email (which is easy given all the leaks) from gaining access. I need to still confirm you control that email.
- You do the typical click a link in this email, reset your password, and we're done.
- Finally, after sending the password reset I toss out your email rather than store it. I also can't log it anywhere.
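The steps above can be sketched roughly like this. It is only an illustration under several assumptions: PBKDF2 from the standard library stands in for bcrypt, the `users` dict stands in for the database, emails are lowercased and trimmed before hashing (a choice the post does not mention), and `send_reset_email` is a hypothetical callback for whatever actually delivers mail.

```python
import hashlib
import hmac
import os
from typing import Callable, Dict

ITERATIONS = 100_000  # assumed work factor for the stand-in hash
SALT_BYTES = 16

# Hypothetical in-memory "database": username -> salt + hash of the email.
users: Dict[str, bytes] = {}


def normalize_email(email: str) -> str:
    """Assumed normalization so 'Zed@Example.com ' and 'zed@example.com' hash alike."""
    return email.strip().lower()


def _hash_email(email: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", normalize_email(email).encode("utf-8"), salt, ITERATIONS)


def register(username: str, email: str) -> None:
    """Store only the salted hash of the email, never the address itself."""
    salt = os.urandom(SALT_BYTES)
    users[username] = salt + _hash_email(email, salt)


def request_reset(username: str, email: str, send_reset_email: Callable[[str], None]) -> bool:
    """Check the submitted email against the stored hash, then mail the reset link."""
    stored = users.get(username)
    if stored is None:
        return False
    salt, digest = stored[:SALT_BYTES], stored[SALT_BYTES:]
    if not hmac.compare_digest(_hash_email(email, salt), digest):
        return False
    # The plain-text address exists only for this call; it is not stored or logged.
    send_reset_email(normalize_email(email))
    return True
```

Knowing the address is not enough on its own here: the reset still depends on the user opening the email that `send_reset_email` delivers.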
In this scenario, my database is full of the same kind of garbage hashes that bcrypt makes for passwords, and attempting to extract the email out of them would require far too much cost for me or for an attacker who gets the database.
The Risks Involved
Compared to storing nothing (which prevents password resets) this seems like a reasonable compromise. Users are already giving the servers their password to test the bcrypt(password) hash, so it's assumed that this has at least that level of security. If someone can get in and access the server to grab passwords then they can grab emails anyway so we don't really lose much.
The main risk is if I'm a bad actor and straight up lie about storing them in the clear. I'd have to risk getting caught, create a second system to grab emails as people request resets, and store them. Since the emails are only given to me at signup and when there's a password reset, this would take quite a while. Again, far too expensive and time consuming.
This last risk could be mitigated by allowing people to opt-in to give an email for password resets. I think this would need to make it very clear they will lose their purchase if they lose their username/password, but if someone is super worried about it, then I see no reason to say they can't agree to this and not give one.
Then we come to the following policy to at least reach a position that has reduced risk while also allowing password resets for people who want them:
- Email addresses are stored just like passwords using bcrypt.
- Password resets are the only place I ask for an email.
- I use the email given at the time of a reset to confirm the bcrypt(email), then send the reset.
- Once I send this I don't store the email and don't log it anywhere.
- At signup I tell people that they can give an email for password resets, or accept that they will lose their purchase if they don't.
- They can change this decision at any time in the future, just in case their risk model changes.
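One way the opt-in part of this policy might look in a user record, as a sketch only. The `Account` class and its field names are hypothetical, and PBKDF2 again stands in for bcrypt so the example runs with just the standard library.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional

ITERATIONS = 100_000  # assumed work factor for the stand-in hash


def slow_hash(value: str) -> bytes:
    """Salted PBKDF2 hash standing in for bcrypt; returns salt + digest."""
    salt = os.urandom(16)
    return salt + hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)


@dataclass
class Account:
    username: str
    password_hash: bytes
    email_hash: Optional[bytes] = None  # None means the user opted out of resets

    @property
    def can_reset_password(self) -> bool:
        """Resets are only possible if a hashed email is on file."""
        return self.email_hash is not None

    def opt_in(self, email: str) -> None:
        """Store the hash of the email; the address itself is discarded."""
        self.email_hash = slow_hash(email.strip().lower())

    def opt_out(self) -> None:
        """Drop the email hash; a lost password now means a lost account."""
        self.email_hash = None
```

Because the decision is just one nullable field, changing it later is a single call to `opt_in` or `opt_out`.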
Time To Try It
I'm still working on the site for Learn JS The Hard Way, but I think I'll go with this plan and see how the code shapes out. The above is mostly just writing about the idea, my thought process, and what I think might work. We'll see how this works in practice, and after I'm done I'll do a follow-up post explaining how it works.