Hi Freehck,

  Here is the final revision of the project. It can be written in Bash, Python, another scripting language, C, etc.
  Here is a description of the input files:
  file 1. email.txt contains 4 million lines, one email address per line; the file size is 80 MB.
  file 2. cidr.txt contains 7,778 lines of CIDR blocks.

  Let's call the final script/program filter.sh. It needs the following parameters:
  1. -f </path/to/file>		File path of the email list.
  2. -c </path/to/file>		File path of the CIDR-formatted IP block list.
  3. -p <number>		Number of concurrent threads/processes used to do the job; 500 ~ 10000 will be used in production, default is 100.
  4. -o1 </path/to/file>	File path of output type 1.
  5. -t <number>		Timeout for DNS lookups etc., in seconds; default is 1.
  6. -d <number>		Concurrent connections per domain, described at the end.
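
  To illustrate the options above, here is a minimal sketch of how they might be parsed in Python with argparse. The destination names (email_file, cidr_file, workers, error_file, timeout, per_domain) are only illustrative, and treating -f, -c and -o1 as required is an assumption, not part of the spec.

```python
import argparse


def build_parser():
    """Build a parser for the filter.sh options (names after dest= are illustrative)."""
    parser = argparse.ArgumentParser(prog="filter.sh")
    # Assumption: the three file paths are mandatory.
    parser.add_argument("-f", dest="email_file", required=True,
                        help="file path of the email list")
    parser.add_argument("-c", dest="cidr_file", required=True,
                        help="file path of the CIDR-formatted IP block list")
    parser.add_argument("-o1", dest="error_file", required=True,
                        help="output file for addresses whose lookup failed")
    # Defaults taken from the description: 100 workers, 1 second, 10 per domain.
    parser.add_argument("-p", dest="workers", type=int, default=100,
                        help="number of concurrent workers")
    parser.add_argument("-t", dest="timeout", type=float, default=1.0,
                        help="timeout in seconds for DNS/SMTP operations")
    parser.add_argument("-d", dest="per_domain", type=int, default=10,
                        help="concurrent connections per domain")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```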

  The script/program reads lines from the email list. Each line contains an email address, but the only part we use is the domain, which is the "corporation.co.uk" part of "richard@corporation.co.uk". The script resolves the MX record of the domain name, then resolves the A record of that MX host; if everything goes right, we get an IP address. If more than one record is found, just use the first one. If an error occurs at this stage, say the domain name doesn't exist or the MX record doesn't exist, then write the email address to the file specified by the option "-o1 </path/to/file>"; in other words, this file collects the email addresses that failed.
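
  The flow above could be sketched roughly as follows. This is only an illustration: domain_of and resolve_mail_ip are hypothetical names, and lookup_mx / lookup_a are assumed callables supplied by the caller (for example, thin wrappers around whatever DNS library is chosen) that return a list of results or raise an exception on failure.

```python
def domain_of(email):
    """Return the lowercased part after the last '@', or None if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def resolve_mail_ip(email, lookup_mx, lookup_a):
    """Return the first IP of the first MX host, or None on any failure.

    lookup_mx(domain) and lookup_a(host) are assumed to return lists and
    to raise an exception when the name or record does not exist.
    """
    domain = domain_of(email)
    if domain is None:
        return None
    try:
        mx_hosts = lookup_mx(domain)
        if not mx_hosts:
            return None
        # "If more than one record is found, just use the first one."
        addresses = lookup_a(mx_hosts[0])
    except Exception:
        return None
    return addresses[0] if addresses else None
```

  A return value of None would mean the address goes to the "-o1" file.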

  If everything goes right, we now have an IP address, and we need to determine whether it falls within a list of CIDR-formatted IP blocks. The path of the CIDR file is specified using -c </path/to/file>. The file contains 7,778 lines, so if you can use an efficient algorithm it will be awesome :)
  If the IP address is within the range of the CIDR list, append the email address to the file "email.txt.match"; otherwise append it to the file "email.txt.unmatch". The names follow the input file, so if the parameter "-f e.txt" is used, the two files will be "e.txt.match" and "e.txt.unmatch". I think keeping it simple like this is all right :)
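
  One possible way to make the CIDR check fast is to convert every block to an integer range, merge overlapping ranges once at startup, and then binary-search each IP. The sketch below uses Python's standard ipaddress and bisect modules; build_index and ip_in_index are hypothetical names, and it assumes the list contains IPv4 blocks only.

```python
import bisect
import ipaddress


def build_index(cidr_lines):
    """Merge IPv4 CIDR blocks into sorted, non-overlapping integer ranges."""
    ranges = []
    for line in cidr_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        net = ipaddress.ip_network(line, strict=False)
        ranges.append((int(net.network_address), int(net.broadcast_address)))
    ranges.sort()
    starts, ends = [], []
    for start, end in ranges:
        if ends and start <= ends[-1] + 1:
            # Overlapping or adjacent to the previous range: extend it.
            ends[-1] = max(ends[-1], end)
        else:
            starts.append(start)
            ends.append(end)
    return starts, ends


def ip_in_index(ip, index):
    """Return True if the IPv4 address lies inside any merged range."""
    starts, ends = index
    value = int(ipaddress.ip_address(ip))
    # Find the last range whose start is <= value, then check its end.
    pos = bisect.bisect_right(starts, value) - 1
    return pos >= 0 and value <= ends[pos]
```

  Each lookup is then a binary search over at most 7,778 ranges, and the result decides whether the address is appended to the ".match" or the ".unmatch" file.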

  About the concurrent/parallel threads/processes: the server has 24 cores, and I think a single resolve-and-lookup task will not fully utilize a CPU core, so the number of parallel workers will be 500~10000 in production.

  Well, I think that is everything for the script/program. My server configuration is a 24-core CPU with 140 GB of memory and a 100 Mbps broadband connection. The OS is CentOS 7 64-bit.
  Here are the full version of the CIDR list file, which contains 7,778 lines, and a sample version of the email address file, which contains 1,000 lines. If there is anything wrong, please let me know.

  I'm glad you accepted the job. I'll write a good review for you after the project is done.

  By the way, can you write a script/program that tests whether an email address actually exists? I mean sending the HELO message and the RCPT TO message to the SMTP server to test whether a username/email actually exists; if the email address does not exist, the SMTP server will return a message like "user not found". Once we have the result, the SMTP connection can be closed. This description isn't really accurate; you can find many examples on GitHub. I found some, but those scripts don't do the job concurrently/in parallel.
  If the SMTP server returns a good result, the email address will be appended to the file "email.match.yes" or "email.unmatch.yes"; otherwise it will be appended to the file "email.match.no" or "email.unmatch.no". The default SMTP connection timeout is 1 second and can be configured using the option "-t <number>". The "-t <number>" option applies to all connection timeouts, such as DNS and SMTP.
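
  A rough sketch of such a check using Python's standard smtplib is shown below. The function names probe and classify_reply are hypothetical, and several details are assumptions rather than requirements: port 25, a null sender in MAIL FROM, "localhost" as the HELO name, reply codes 250/251 counting as a good result, and every non-timeout failure counting as "no".

```python
import smtplib
import socket


def classify_reply(code):
    """Map an SMTP RCPT TO reply code to an output suffix (assumed mapping)."""
    if code in (250, 251):
        return "yes"
    return "no"


def probe(mx_host, email, timeout=1.0, helo_name="localhost"):
    """Ask mx_host whether it would accept mail for email.

    Returns "yes", "no" or "timeout", matching the output file suffixes.
    """
    try:
        # The context manager sends QUIT when the block exits.
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.helo(helo_name)
            smtp.mail("")  # assumption: probe with a null sender, MAIL FROM:<>
            code, _ = smtp.rcpt(email)
    except (socket.timeout, TimeoutError):
        return "timeout"
    except (smtplib.SMTPException, OSError):
        # Assumption: refused connections and protocol errors count as "no".
        return "no"
    return classify_reply(code)
```

  The returned suffix would be appended to "email.match" or "email.unmatch" to pick the output file, for example "email.match.yes".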
  The domains in the email list are very sparse, and addresses at giant providers like Gmail, Hotmail and Yahoo Mail are not included, so there may be no need to worry about SMTP servers refusing connections from my IP address. Anyway, the number of concurrent connections per domain is 10 by default and can be set using the option "-d <number>". If the connection to the SMTP server times out, the email address will be appended to "email.match.timeout" or "email.unmatch.timeout"; just add the ".timeout" suffix to the filename. These files are stored for future use.
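
  The two limits (total workers from "-p" and connections per domain from "-d") could be combined with semaphores, as in the asyncio sketch below. It is only one possible approach: run_limited is a hypothetical name, and check is an assumed coroutine function supplied by the caller that performs the actual lookup or SMTP probe for one address.

```python
import asyncio
from collections import defaultdict


async def run_limited(jobs, check, total_limit=100, per_domain_limit=10):
    """Run check(domain, email) for every (domain, email) job.

    At most total_limit checks run at once overall, and at most
    per_domain_limit run at once for any single domain.
    Results are returned as (email, result) pairs in input order.
    """
    total = asyncio.Semaphore(total_limit)
    # One semaphore per domain, created lazily on first use.
    per_domain = defaultdict(lambda: asyncio.Semaphore(per_domain_limit))

    async def one(domain, email):
        # Wait for the domain slot first so a busy domain does not
        # hold global slots while it is queued.
        async with per_domain[domain]:
            async with total:
                return email, await check(domain, email)

    return await asyncio.gather(*(one(d, e) for d, e in jobs))
```

  This version creates one task per address up front, so for the full 4 million lines the input would probably need to be fed in batches.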
  Please quote for this script/program if you like; my target price is around 10 USD. It can be integrated into the same script/program or delivered as a separate one.

  Have a nice day!
Regards,
Steven