Reimplementing Microsoft SQL Server `PWDENCRYPT` / `PWDCOMPARE` in Python
This was originally posted as a self-answered question on Stack Overflow.
The Background Story
We're currently rebuilding our platform in Python and Django. The existing platform is based on .NET and related MS technologies. We're soon preparing to launch, and part of that is also migrating user data from the old to the new platform.
In the old system, user passwords are hashed and verified using the built-in
PWDCOMPARE functions of Microsoft SQL Server.
As we want the transition to be as frictionless as possible for our users, we of course also want them to not have to reset their password after we launch. So I got the seemingly simple task to create a custom Django password hasher that is able to verify user entered passwords againts the hashes from
First I set out to find some documentation on what those MS SQL functions actually do. It seems to officially not be very well documented what happens behind the scenes (if at all), and according to the docs you're not really supposed to use it:
PWDENCRYPT is an older function and might not be supported in a future release of SQL Server. Use HASHBYTES instead. HASHBYTES provides more hashing algorithms.
Fortunately, a couple of Stack Overflow questions were able to guide me in the right direction, although not all the way, due to errors in their explanations and examples, which sent me on several detours.
After spending way too much time on it, I gave up. But then a colleague gave it a try and found the missing key: The passwords must be encoded using
UTF-16LE. A detail I missed, but might be obvious to someone who is used to work in a Windows environment.
So, let's go through it step by step.
Let's try generating a password hash in SQL Server and look at the result:
SELECT PWDENCRYPT("very-secret-passwørd") GO 0x02008D6818FFB31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893843
The result consists of 4 parts:
0x an indication that this is a hexadecimal value.
0200 tells us the version of the hashing algorithm used. In this case, it tells us that it's hashed using
SHA2_512 in the vocabulary of SQL Server). If the header is
0100, it is instead hashed using
8D6818FF is the salt for the hash.
Is the actual salted and hashed password.
This function is used to compare a password against a salted hash. Let's try it out:
SELECT PWDCOMPARE("very-secret-passwørd", 0x02008D6818FFB31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893843) GO ----------- 1
1 is a boolean value, indicating that the password matched the hashed password. Using another hash, the result will be
SELECT PWDCOMPARE("very-secret-passwørd", 0x02008D6818FFB31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893800) GO ----------- 0
I changed the last 2 digits of the hash to be
00 and got a
0 back: The password did not match the hashed password. So far so good.
Reimplementing in SQL Using
Now, in order to understand what's going on, I sort of reimplemented this with
HASHBYTES based on the information found in the previously mentioned Stack Overflow questions:
SELECT HASHBYTES("SHA2_512", "very-secret-passwørd" + CONVERT(NVARCHAR(MAX), 0x8D6818FF)) 0xB31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893843
Yay! I get the same hash. All good.
Reimplementing in Python
And then the grand finale: Doing the same in Python. This is where I ended up spending days in agony, but I will save you from it and show you what to do instead.
>>> import hashlib >>> hashlib.sha512("very-secret-passwørd".encode("utf-16le") + bytes.fromhex("8D6818FF")).hexdigest() 'b31baaa6f849583a0c1319e1e73a260525cfec22d4bc9fe2fbcc535c682342634d5b87b2ed1a1da2c63b7fed3ccdfc82de31dd5670ccac9ba7ba5866eb893843'
There you go! Adding
.upper() to the result allows us to compare the hashes from SQL Server and Python:
>>> hashlib.sha512("very-secret-passwørd".encode("utf-16le") + bytes.fromhex("8D6818FF")).hexdigest().upper() == "B31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893843" True