So I took to Google.

To “brute force” a password means to try all possible combinations of characters until you finally guess the correct password.
In my case, I was confident that the password to my daughter’s tax form consisted of 12 digits.
In this age of computing, that can’t be too hard to crack, can it?

The first step was to extract the password hash from the PDF document.
If “hash” makes you think of “hash browns,” you’re not too far from the truth.
Password hashes are even “salted” to make them more difficult to crack.
To make hash browned potatoes, you need to grate the potatoes into little shreds.
Let’s call this grating tool a “hashing algorithm.”
To continue the analogy, in order to crack our password, we need to feed potato after potato through the hashing algorithm until we find one that comes out exactly like the original hash.
As you can imagine, that takes a lot of potatoes.
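
To make the potato analogy concrete, here is a minimal Python sketch of the same idea: hash candidate after candidate until one matches the target. It is only an illustration. SHA-256, the salt value, and the 4-digit demo password are all assumptions chosen to keep the example small; a real PDF uses its own key-derivation scheme, which is what JtR implements.

```python
import hashlib
from itertools import product


def hash_candidate(candidate: str, salt: bytes) -> str:
    """Grate one 'potato': hash the salt followed by a candidate password."""
    return hashlib.sha256(salt + candidate.encode("utf-8")).hexdigest()


def brute_force_digits(target_hash: str, salt: bytes, length: int):
    """Try every digit string of the given length; return the match or None."""
    for combo in product("0123456789", repeat=length):
        candidate = "".join(combo)
        if hash_candidate(candidate, salt) == target_hash:
            return candidate
    return None


# Hypothetical demo values: a made-up salt and a made-up 4-digit password.
salt = b"demo-salt"
target = hash_candidate("0427", salt)

found = brute_force_digits(target, salt, length=4)
print(found)
```

Four digits means at most 10,000 potatoes, which finishes instantly. Every extra digit multiplies the pile by ten.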
Are you hungry yet? Me too.
Here’s my recipe:
First, download and install Perl.
Perl is a super-geeky programming language that I wish I knew.
I’m on Microsoft Windows (hey, be nice) so I tried the ActiveState and Strawberry flavors of Perl.
They both work; the critical part is making sure that Windows associates the .pl file extension with Perl.
Next, I downloaded the GitHub repo for John the Ripper.
John the Ripper (henceforth “JtR”) is another geek tool with a really long history.
Its main purpose is to grate our potatoes into hashes as fast as possible until we get a match.
I downloaded the Windows build and unzipped it.
It requires no installation.
It’s not very apparent what to click on to download the build, so here’s a screenshot.
I located the “run” folder inside of the JtR directory, and copied the PDF file (“TaxForm.pdf”) I was trying to crack into it.
I was preparing my workspace.
Then I opened Windows Command Prompt and navigated to the JtR “run” directory, where there is a myriad (no, a plethora) of Perl and Python scripts we can use to extract password hashes from all different types of files, 7z2john among them.
The one I needed was called pdf2john.pl.
Running pdf2john.pl TaxForm.pdf output the hash from the PDF file on the screen.
But JtR needs that hash in a text file.
So I redirected the output to a text file by adding >hashfile.txt to the command.
(This hash isn’t really the hash of my daughter’s birth date and SSN.)

I was almost ready to start cracking.
I just needed to figure out how to tell JtR to only try combinations of numbers 0–9 that were 12 characters in length.
The option to tell JtR to only use numbers is --incremental=digits, but specifying a length of 12 characters required editing the john.conf file (conveniently located in the “run” directory).
Normal Windows Notepad won’t detect the line breaks in the john.conf file, so I opened it with Notepad++.
It’s a large file, but I found the parameters I was looking for around line 1210.
I set MinLen = 12 and MaxLen = 12.
I fired up JtR with: john --incremental=digits hashfile.txt

On my machine, JtR starts running over 50,000 combinations per second through its hashing algorithm, trying to find a hash that matches the one we extracted from TaxForm.pdf.
JtR doesn’t display much output on the screen, but it will keep running in the background until Ctrl-C is pressed.
For more details about what it’s doing, I needed to repeatedly check the john.log file.

I let it run all night long and into the next day.
Then it occurred to me: there are a trillion (10^12) different combinations of 12 digits.
At 50,000 attempts per second, it would take over seven months to try them all.
Of course, I might get lucky and find the right hash after only a few weeks, but that’s still a long time.
Not really worth it if my daughter’s employer can just snail mail us the form in a few days.
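
The back-of-the-envelope arithmetic can be checked with a few lines of Python. This is a rough sketch that assumes the 50,000 attempts per second stays constant. It shows two ways of counting: every 12-digit string (10^12, which is the right count when the password can start with 0, as a birth month from January to September does), and only the strings without a leading zero (9 × 10^11).

```python
ATTEMPTS_PER_SECOND = 50_000  # rate reported for the Core i5 in the text
SECONDS_PER_DAY = 86_400


def days_to_exhaust(keyspace: int, rate: int = ATTEMPTS_PER_SECOND) -> float:
    """Days needed to try every candidate in the keyspace at a constant rate."""
    return keyspace / rate / SECONDS_PER_DAY


keyspaces = {
    "all 12-digit strings": 10 ** 12,
    "no leading zero": 9 * 10 ** 11,
}

for label, keyspace in keyspaces.items():
    days = days_to_exhaust(keyspace)
    print(f"{label}: {keyspace:,} candidates, about {days:.0f} days")
```

Either way the worst case is around seven months, and on average the match would turn up about halfway through.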
[I should note that my machine has a Core i5 processor.
I’ve heard of people who are able to daisy-chain a bunch of PlayStation processors together.
They can probably run this process over 100x faster than my little CPU can.]

I started to wonder if there was a better way.
Another way JtR can crack passwords is by dictionary attack.
Instead of JtR trying all possible random combinations of characters, we supply it with a list of pre-generated passwords that it can try.
In the pattern of [date of birth in MMDDYYYY + last four digits of SSN], I figured there were only about six unique digits used, and I could cut it down to five if I assumed that the first three (the MMD part) were correct.
So I pieced together some PowerShell code to generate a list of all possible permutations of the final nine digits, prefixed by the first three that I hoped were correct.
This script took about 15 minutes to complete, but when it was finished I had a list of over 600,000 permutations I could use to try to crack the hash of TaxForm.pdf.
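
The list-building step can be sketched in Python (this is not the original PowerShell script, which isn't reproduced here). The function name, the prefix, and the digit set below are all hypothetical placeholders. The sketch simply writes every suffix of a given length drawn from a small set of digits, so its line count will not necessarily match the 600,000 figure above, which depended on rules not spelled out in the text.

```python
import os
import tempfile
from itertools import product


def generate_wordlist(prefix: str, digits: str, suffix_len: int, path: str) -> int:
    """Write prefix + every suffix of suffix_len built from digits, one per line.

    Returns the number of candidates written. The file is saved as UTF-8.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for combo in product(digits, repeat=suffix_len):
            fh.write(prefix + "".join(combo) + "\n")
            count += 1
    return count


# Small demo with made-up values: 4 possible digits, 3-character suffix.
demo_path = os.path.join(tempfile.gettempdir(), "demo_list.txt")
written = generate_wordlist(prefix="042", digits="1379", suffix_len=3, path=demo_path)
print(written, "candidates written to", demo_path)
```

For a real run with five candidate digits and a nine-character suffix, this approach would produce 5^9 = 1,953,125 lines, still a tiny fraction of the trillion candidates a full brute-force search has to cover.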
I copied my list.txt file to JtR’s run directory, and tried: john --wordlist=list.txt

JtR easily found the password in just a few seconds.
It was displayed on my screen as shown above, but that may have been because I told it to in some further tweaking I did to the john.conf file.
In any case, JtR stores passwords it cracks in a file called john.pot.
I discovered that my daughter’s employer had her social security number wrong.
It was even incorrect on the tax form that I now was finally able to open.
I was fortunate that they had it wrong in such a way that it was caught by one of the limited permutations I had in my wordlist.
A final note: for dictionary attacks, JtR was quite particular about the encoding of the wordlist file I gave it.
It works best when the file is encoded as UTF-8.
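
One possible cause, offered as an assumption about the setup rather than something confirmed here: Windows PowerShell 5.x writes UTF-16 by default when output is redirected to a file with >. If a wordlist ends up in that encoding, a short Python sketch like this one can rewrite it as UTF-8. The function name and file names are hypothetical.

```python
import os
import tempfile


def convert_to_utf8(src_path: str, dst_path: str, src_encoding: str = "utf-16") -> None:
    """Copy a text file line by line, re-encoding it as UTF-8 without a BOM."""
    with open(src_path, "r", encoding=src_encoding) as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line)


# Demo: create a small UTF-16 wordlist, then convert it.
tmp_dir = tempfile.mkdtemp()
utf16_path = os.path.join(tmp_dir, "list_utf16.txt")
utf8_path = os.path.join(tmp_dir, "list.txt")

with open(utf16_path, "w", encoding="utf-16") as fh:
    fh.write("042111\n042113\n")

convert_to_utf8(utf16_path, utf8_path)

with open(utf8_path, "rb") as fh:
    raw = fh.read()
print(raw)
```

The converted file holds the same lines, one byte per digit, with no byte-order mark at the start.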