We're using Django to build a JSON web service front end for MySQL. We have Apache and Django running on an EC2 instance and MySQL running on an RDS instance. We've started benchmarking with ApacheBench (ab) and got some really poor performance numbers. We also noticed that while the tests run, our Apache/Django instance goes to 100% CPU usage at very low load, while the MySQL instance never gets above 2% CPU usage.
We're trying to make sense of this and isolate the problem, so we did several ab tests:
Why is authenticate so slow? Is it writing data to the db, finding a billion digits of pi, what?
We would like to keep the call to authenticate in these functions, because we don't want to leave them open to anyone that can guess the url, etc. Has anyone here noticed that authenticate is slow, and can anyone suggest a way to remedy it?
Thank you very much!
The Django authentication system handles both authentication and authorization. Briefly, authentication verifies a user is who they claim to be, and authorization determines what an authenticated user is allowed to do. Here the term authentication is used to refer to both tasks.
If you have an authenticated user you want to attach to the current session, use the login() function from a view. It takes an HttpRequest object and a User object, and it saves the user's ID in the session using Django's session framework.
Authenticating users: authenticate() checks the credentials against each authentication backend and returns a User object if they are valid. If they are not valid for any backend, or a backend raises PermissionDenied, it returns None.
    from django.contrib.auth import authenticate, login

    def my_view(request):
        username = request.POST['username']
        password = request.POST['password']
        # Returns a User on success, None if the credentials are rejected.
        user = authenticate(username=username, password=password)
        if user is not None:
            if user.is_active:
                login(request, user)
                # Redirect to a success page.
I am no expert in authentication and security, but here are some ideas as to why this might be happening and how you might improve the performance somewhat.
Since passwords are stored in the db, plaintext passwords are not stored; their hashes are stored instead to keep them secure. You can still validate a user logging in by comparing the hash computed from the typed password to the one stored in the db. This increases security because if a malicious party gets a copy of the db, the only way to recover the plaintext passwords is by either using rainbow tables or doing a brute-force attack.
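To make the store-a-hash-and-compare idea concrete, here is a minimal sketch using only the Python standard library. It is not Django's implementation: the function names make_record and check_password are made up for illustration, the per-user random salt is an addition not mentioned above, and a single fast SHA-256 is used only to keep the example short.

```python
import hashlib
import hmac
import os


def make_record(password):
    """Return (salt, digest) to store in place of the plaintext password."""
    salt = os.urandom(16)  # assumption: a random per-user salt
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def check_password(password, salt, stored_digest):
    """Recompute the hash from the typed password and compare it to the stored one."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # compare_digest compares in constant time to avoid leaking timing information
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = make_record("s3cret")
print(check_password("s3cret", salt, stored))
print(check_password("guess", salt, stored))
```

Only the salt and digest would be stored; the plaintext never is.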
This is where things get interesting. According to Moore's Law, computers keep getting exponentially faster, so computing hash functions becomes much cheaper in terms of time, especially for quick hash functions like MD5 or SHA-1. This poses a problem: with the computing power available today and a fast hash function, attackers can brute-force hashed passwords relatively easily. To combat this, two things can be done. One is to loop the hash function multiple times (the output of the hash is fed back into the hash). This, however, is not very effective, because it only increases the cost of the hash function by a constant factor. That's why the second approach is preferred, which is to make the hash function itself more complex and computationally expensive, so that each hash takes longer to compute. Even if it takes a second to compute, that is not a big deal for an end user logging in, but it is a big deal for a brute-force attack, because millions of hashes have to be computed. That's why Django, starting with version 1.4, uses a fairly computationally expensive function called PBKDF2.
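The first approach, feeding the output of the hash back into the hash, can be sketched in a few lines. This is a toy illustration only: stretched_hash and its parameters are hypothetical names, SHA-256 is an arbitrary choice, and real systems should rely on a vetted function rather than a hand-rolled loop.

```python
import hashlib


def stretched_hash(password, salt, rounds):
    """Hash salt + password once, then re-hash the digest (rounds - 1) more times."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    digest = hashlib.sha256(salt + password).digest()
    for _ in range(rounds - 1):
        # the previous output becomes the next input
        digest = hashlib.sha256(digest).digest()
    return digest


once = stretched_hash(b"s3cret", b"salt", 1)
many = stretched_hash(b"s3cret", b"salt", 10000)
print(once.hex())
print(many.hex())
```

Each extra round multiplies the work per guess by the same factor for the attacker and for the server checking a login.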
To get back to your question: it is because of this function that, when you enable authentication, your benchmark numbers drop drastically and your CPU usage goes up.
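You can get a rough feel for the cost outside Django with the standard library's hashlib.pbkdf2_hmac. The sketch below times one plain SHA-1 against one PBKDF2 computation. The iteration count of 10,000, the SHA-256 digest, and the helper name time_call are assumptions for illustration; the actual count depends on your Django version and settings, and absolute timings depend on your hardware.

```python
import hashlib
import os
import time


def time_call(fn, repeat=20):
    """Return the average wall-clock seconds per call of fn."""
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) / repeat


password = b"correct horse battery staple"
salt = os.urandom(16)
ITERATIONS = 10000  # assumed value, for illustration only

fast = time_call(lambda: hashlib.sha1(salt + password).digest())
slow = time_call(lambda: hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS))

print("single SHA-1:  %.2f microseconds per call" % (fast * 1e6))
print("PBKDF2-SHA256: %.2f milliseconds per call" % (slow * 1e3))
```

If every benchmarked request calls authenticate(), each one pays the slower cost on the web server's CPU, which is consistent with the Apache/Django instance being saturated while the database stays idle.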
Here are some ways you can increase the performance.