Database Solution for "Rainbow Tables"
|
|
|
|
Addicted to MacNN
Join Date: Jan 2000
Location: Stoneham, MA, USA
|
|
So I have a hash generator tool on WhatsMyIP.org, and I'm thinking about making a "Hash lookup" tool to go along with it. Of course, you can't reverse an MD5 or SHA1 hash, so a lookup tool has to have a database of pre-hashed strings to search.
My server setup is this: I have an Intel Xserve that lives in a data center. That is my main server. I could compile a few wordlists, expand the lists with number prefixes and suffixes, and come up with a MySQL table of a few hundred million rows, which should only take up 20 or 30 GB of space. I certainly have the space. And I've done some testing with much bigger MySQL hash tables, and they're very fast. That's part 1.
Part 2 is that I have a 'home server' Mac mini. I would buy a 3 TB hard drive (or more) to use just for this, and I would make a brute force hash database, where I would just completely fill the entire hard drive with hashes.
So when you did a search on the website, it would search its own local database in about 1/10th of a second. If it found no results, it would then send a request to my home server to search its mega database. The details of getting all that working are very easy: a little jQuery, a few PHP wrappers, and I'm good to go.
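As a rough illustration of the wordlist step described above, here is a minimal Python sketch. The function names `variants` and `build_rows`, the sample words, and the 0 to 9 number range are hypothetical choices for the example, not part of the actual tool. It only shows how each string could be paired with its raw 16-byte MD5 and 20-byte SHA1 digests before being loaded into a table.

```python
import hashlib


def variants(word, max_number=99):
    """Yield the word itself, then number-prefixed and number-suffixed forms."""
    yield word
    for n in range(max_number + 1):
        yield f"{n}{word}"   # e.g. "7password"
        yield f"{word}{n}"   # e.g. "password7"


def build_rows(words, max_number=99):
    """Return (string, md5_bytes, sha1_bytes) tuples for every variant."""
    rows = []
    for word in words:
        for s in variants(word, max_number):
            data = s.encode("utf-8")
            rows.append((s,
                         hashlib.md5(data).digest(),    # 16 raw bytes
                         hashlib.sha1(data).digest()))  # 20 raw bytes
    return rows


# Two sample words, numbers 0-9 only, to keep the demo small.
rows = build_rows(["password", "letmein"], max_number=9)
```

Each word produces 1 + 2 × (max_number + 1) strings, so the table grows linearly with both the wordlist size and the number range.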
HERE is what I'm wondering though. Is there a better solution for these tables than MySQL? With MySQL, I can store the hashes as raw binary data, so it's 16 bytes for each MD5 hash and 20 bytes for each SHA1. But then you need to index the tables, otherwise each search will take a few days. And the indices are huge. A full index on each hash column doubles the size of the database. I am doing a little experimenting right now with partial indices, but until I actually have a 1 trillion row table, I really won't be able to know for sure just how few bytes of each column I can get away with indexing.
This is going to be a straight lookup table. Each row will consist of three columns: the string, the MD5 of that string, and the SHA1 of that string. No joins, no inserts (other than during creation, of course), just simple single-row lookups.
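The partial-index question can be estimated on paper before the big table exists. The sketch below assumes that MD5 and SHA1 outputs are spread uniformly, so a k-byte prefix splits the rows into 256^k equally likely buckets. The function names and the 0.01 threshold are made up for the illustration, and this says nothing about how MySQL lays out the index on disk.

```python
def expected_rows_per_prefix(total_rows, prefix_bytes):
    """Average number of rows sharing any given prefix of this length."""
    return total_rows / 256 ** prefix_bytes


def smallest_prefix(total_rows, max_rows_per_prefix=0.01, hash_len=16):
    """Shortest prefix (in bytes) whose average bucket is below the threshold."""
    for k in range(1, hash_len + 1):
        if expected_rows_per_prefix(total_rows, k) <= max_rows_per_prefix:
            return k
    return hash_len


if __name__ == "__main__":
    # Average bucket size for a 1 trillion row table at several prefix lengths.
    for k in range(4, 9):
        print(k, expected_rows_per_prefix(10**12, k))
```

Under that assumption, a 4-byte prefix on a trillion rows leaves a couple of hundred candidate rows per lookup, a 5-byte prefix leaves about one, and each additional byte cuts the figure by a factor of 256. Whether a few extra row reads per lookup are acceptable depends on the disk, which is the part only real testing can answer.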
AND THIS is what has me wondering. Is there a solution available that might be just as fast (no more than, say, 2 seconds per search) but smaller in data size? A simple database solution that could, in theory, handle up to a 6 TB database? Can MySQL even handle a 6 TB MyISAM table? The search is going to be done through a PHP script, through Apache. But if there is a better database to use, as long as I can run queries through a simple command line app with easily parseable output, that should be totally fine.
Sooooo what do you think?
(Last edited by l008com; May 2, 2014 at 07:27 PM. Reason: Typos)
|
|
|
|
|
|
|
|
|
Administrator
Join Date: Jun 2000
Location: California
|
|
Since you want to look up via MD5 or SHA1, you need either two indexes or two DBs. If you don't want to index twice, the only other option I can think of is to have two DBs: MD5 + string, and SHA1 + string. You'd do your own inserts and lookups, so you create each DB already sorted by hash value. That might double your DB size anyway, and it adds the overhead of writing and debugging your custom insert / lookup code.
I'd eat the extra space and fully index each hash field. Doing it manually won't save a dramatic amount of space; it just creates a lot more work for you.
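For anyone curious what "doing it manually" could look like, here is a minimal sketch of one such DB: a flat file of fixed-width records (MD5 digest, one length byte, zero-padded string) written in sorted order and searched by binary search. The record layout, the 32-byte string limit, and the names `write_sorted` and `lookup` are all assumptions made for the example. A real build of this size would have to sort on disk rather than in memory.

```python
import hashlib
import os
import tempfile

HASH_LEN = 16                        # MD5 digest size in bytes
STR_LEN = 32                         # assumed maximum plaintext length
RECORD_LEN = HASH_LEN + 1 + STR_LEN  # digest + length byte + padded string


def write_sorted(path, strings):
    """Write one fixed-width record per string, sorted by digest."""
    records = []
    for s in strings:
        data = s.encode("utf-8")
        if len(data) > STR_LEN:
            raise ValueError(f"string too long for record: {s!r}")
        digest = hashlib.md5(data).digest()
        records.append(digest + bytes([len(data)]) + data.ljust(STR_LEN, b"\0"))
    records.sort()  # the digest is the record prefix, so this sorts by hash
    with open(path, "wb") as f:
        f.writelines(records)


def lookup(path, digest):
    """Binary-search the file for a digest; return the string or None."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // RECORD_LEN
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD_LEN)
            record = f.read(RECORD_LEN)
            key = record[:HASH_LEN]
            if key == digest:
                n = record[HASH_LEN]
                return record[HASH_LEN + 1:HASH_LEN + 1 + n].decode("utf-8")
            if key < digest:
                lo = mid + 1
            else:
                hi = mid
    return None


# Small demo in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "md5.tbl")
write_sorted(path, ["password", "letmein", "hunter2"])
found = lookup(path, hashlib.md5(b"letmein").digest())
missing = lookup(path, hashlib.md5(b"not in table").digest())
```

There is no separate index here, but every record is padded to the longest allowed string, and a second file is needed for SHA1. Binary search needs roughly log2(N) reads, so about 40 seeks for a trillion records.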
|
|
|
|
|
|
|
|
|
Clinically Insane
Join Date: Mar 2001
Location: yes
|
|
It seems like the chances of being able to look up a hash and get its original value are pretty low. How do you intend to do that, and what would the useful application for this be, aside from cracking passwords?
|
|
|
|
|
|
|
|
|
Clinically Insane
Join Date: Mar 2001
Location: yes
|
|
Originally Posted by besson3c
It seems like the chances of being able to look up a hash and get its original value are pretty low. How do you intend to do that
Never mind this, this is explained in the Wikipedia page on rainbow tables.
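For readers who have not followed that reference, here is a toy sketch of the chain idea behind a rainbow table, which is different from the plain hash-to-string lookup table proposed in the first post. Only the start and end of each chain are stored, and the middle is recomputed at lookup time, trading CPU for disk space. The 3-letter lowercase plaintext space, the chain length of 50, and every function name here are assumptions chosen to keep the example small.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase
LENGTH = 3                        # toy plaintext space: 26**3 strings
SPACE = len(ALPHABET) ** LENGTH
CHAIN_LEN = 50                    # hash/reduce steps per chain (assumed)


def md5_digest(s):
    return hashlib.md5(s.encode("utf-8")).digest()


def reduce_to_plain(digest, step):
    """Map a digest back into the plaintext space; differs per chain step."""
    n = (int.from_bytes(digest[:8], "big") + step) % SPACE
    chars = []
    for _ in range(LENGTH):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def chain_end(plain, start_step=0):
    """Apply hash-then-reduce from start_step up to the end of the chain."""
    for step in range(start_step, CHAIN_LEN):
        plain = reduce_to_plain(md5_digest(plain), step)
    return plain


def build(starts):
    """Store only endpoint -> list of start strings."""
    table = {}
    for s in starts:
        table.setdefault(chain_end(s), []).append(s)
    return table


def lookup(table, digest):
    """Guess each chain position for the digest, then replay matching chains."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        end = chain_end(reduce_to_plain(digest, pos), pos + 1)
        for start in table.get(end, []):
            plain = start
            for step in range(CHAIN_LEN):
                if md5_digest(plain) == digest:
                    return plain
                plain = reduce_to_plain(md5_digest(plain), step)
    return None


table = build(["aaa", "mac", "xyz"])

# Pick a plaintext from the middle of the "mac" chain; it is not stored anywhere.
target = "mac"
for step in range(10):
    target = reduce_to_plain(md5_digest(target), step)
recovered = lookup(table, md5_digest(target))
```

Each stored pair covers up to CHAIN_LEN plaintexts, at the cost of many hash computations per lookup and no guarantee that any particular string is covered.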
|
|
|
|
|
|
|
|
|
Administrator
Join Date: Apr 2001
Location: San Antonio TX USA
|
|
I have used your site in the past, but never used any of the other tools...there are a LOT of hash functions I wasn't aware of. I'm curious though about what users would use your hash function for, and how the hash lookup would be used - both whether it would be linked to the hash tool and whether it could be used to crack passwords by unethical users.
Aside from that, I'm WAY over my head in the database structure and techniques field. I can manage using SQL (slowly) but getting past implementing a search algorithm is something I haven't even looked at for (holy cow!) over 20 years...
|
Glenn -----OTR/L, MOT, Tx