Extract e-mails from a BIG DB dump file

Bablo

CREDITS TO AUTHOR



Let's say you've got a 2 GB .sql DB dump file and you're only interested in getting the users' e-mails from it. What's the best way to do it?


1. Find out DB's structure
Since you don't need the whole DB, you'll save time and server load if you work only with the user table from this point on.

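First, you need the DB's structure. The dump puts a "Table structure for table" comment line before each table, so grep for those lines and cut out the name between the backticks:
Code:
grep 'Table structure' somesiteDB.sql | cut -d'`' -f2 > dbstruct.txt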

Now open dbstruct.txt and search for the user table (_user, users, _members, etc.). Your file will look similar to this (this is a vBulletin DB structure):
Code:
.
.
.
prefix_user
prefix_useractivation
.
.
.
So you've found your user table (prefix_user), but write down the one that follows it too (prefix_useractivation), because you'll need it in the next command.
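If you also want to know where in the dump each table starts, a variation along these lines should work. It is only a sketch: the sample dump below is made up for illustration, and the sed expression assumes the header comment has the form shown above (table name wrapped in backticks).

Code:
```shell
# Build a tiny stand-in dump (hypothetical content) to try the command on.
dump=$(mktemp)
cat > "$dump" <<'EOF'
--
-- Table structure for table `prefix_user`
--
CREATE TABLE `prefix_user` (`userid` int, `email` varchar(100));
INSERT INTO `prefix_user` VALUES (1,'alice@example.com');
--
-- Table structure for table `prefix_useractivation`
--
CREATE TABLE `prefix_useractivation` (`userid` int);
EOF

# grep -n prefixes every hit with its line number; sed then keeps only
# that number and the name between the backticks.
tables=$(grep -n 'Table structure' "$dump" \
  | sed 's/^\([0-9]*\):[^`]*`\([^`]*\)`.*/\1 \2/')

printf '%s\n' "$tables"
```

The line numbers give you a quick idea of how far into the file the user table sits and how big it is compared to its neighbours.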


2. Extract user table
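We'll use sed to do it (be careful where you put prefix_user and prefix_useractivation, and don't change anything else!):
Code:
sed -ne '/- Table structure for table `prefix_user`/,/- Table structure for table `prefix_useractivation`/p' somesiteDB.sql > usertable.sql
Basically, that command copies everything between those two header lines (the one for prefix_user and the one for prefix_useractivation). The pattern is in single quotes with literal backticks so that it matches prefix_user exactly; with a wildcard after the name, any later table whose name merely starts with prefix_user would start a second copy.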
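On a multi-GB dump, sed normally keeps reading to the end of the file even after the range is finished. One possible tweak is to quit as soon as the closing header has been printed. This is a sketch on a made-up sample dump; the variable names start and stop are just for illustration.

Code:
```shell
# Hypothetical mini dump with a third table that also starts with prefix_user.
dump=$(mktemp)
cat > "$dump" <<'EOF'
-- Table structure for table `prefix_user`
INSERT INTO `prefix_user` VALUES (1,'alice@example.com');
-- Table structure for table `prefix_useractivation`
INSERT INTO `prefix_useractivation` VALUES (1,'abc');
-- Table structure for table `prefix_userban`
INSERT INTO `prefix_userban` VALUES (2,'bob@example.com');
EOF

# Header lines that open and close the range (backticks are literal here).
start='Table structure for table `prefix_user`'
stop='Table structure for table `prefix_useractivation`'

# Print the range, then quit (q) on the closing header so sed does not
# scan the rest of the file.
usertable=$(sed -n "/$start/,/$stop/{p;/$stop/q;}" "$dump")

printf '%s\n' "$usertable"
```

Because sed exits on the closing header, nothing after prefix_useractivation is read at all, which also rules out picking up a later table by accident.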


3. Extract e-mails
OK, this is the last and easiest step. A Perl one-liner pulls out everything that looks like an e-mail address, and sort -u removes the duplicates:
Code:
perl -wne 'while(/[\w\.\-]+@[\w\.\-]+\w+/g){print "$&\n"}' usertable.sql | sort -u > emails.txt
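If Perl isn't available, grep can probably do the same job. The sketch below assumes a grep that supports -o (GNU and BSD grep both do) and uses a made-up two-line sample in place of usertable.sql; the character classes are an ASCII approximation of Perl's \w.

Code:
```shell
# Hypothetical sample standing in for usertable.sql.
usertable=$(mktemp)
cat > "$usertable" <<'EOF'
INSERT INTO `prefix_user` VALUES (1,'alice','alice@example.com'),(2,'bob','bob.smith@mail.example.org');
INSERT INTO `prefix_user` VALUES (3,'alice2','alice@example.com');
EOF

# -o prints only the matching part, one match per line;
# -E enables extended regex; sort -u drops duplicates.
emails=$(grep -oE '[[:alnum:]_.-]+@[[:alnum:]_.-]+[[:alnum:]_]' "$usertable" | sort -u)

printf '%s\n' "$emails"
```

Like the Perl version, this is a loose pattern: it grabs anything shaped like name@host, so expect the odd false positive in a real dump.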

And you're done! Proof of concept:
emails.png
 