Information Gathering (Footprinting)


Information gathering, or footprinting, is a crucial part of the penetration testing process. The more information you gather about the target (an application or a user), the more likely you are to get useful results. Many tools are available for footprinting, and I will share some of them in this post.

Information gathering is not just one phase of security testing! It's an art, and each of us should aim to be a master shifu at gathering relevant information, because it makes the rest of the penetration test go much better.

This post gives you a glimpse of footprinting: what it is, what its purpose is, what an attacker gains from it, what kinds of information can be gathered, and how that information can be used to attack or secure an application or website.

OK, let me give you a generic example. If a thief wants to rob a bank, how does he plan his moves? Will he walk straight into the bank without a plan, or will he first collect all the information he needs to carry out the robbery successfully?

The thief would plan by collecting information from different sources, such as the bank's security loopholes. Once he has planned his moves, he is ready to execute the plan and rob the bank.

Without this information it would be very difficult for a robber to successfully rob a bank.

Cautionary note: This is only an example. Please use the information wisely and for the purpose of learning only.

Reconnaissance is one such way of collecting information about an organization or application. In a security context, the basic reason for gathering information is to learn about the architecture, infrastructure, design, and security loopholes of an organization or application.

There are two ways of collecting information about an organization, that is, two types of footprinting:

  1. Active Footprinting

  2. Passive Footprinting

I will explain different ways of gathering information, covering both approaches, using a variety of tools and techniques.


1.    Whois is a widely used Internet record listing that identifies who owns or registered a domain and how to contact them. The Internet Corporation for Assigned Names and Numbers (ICANN) regulates domain name registration and ownership. Whois records have proven extremely useful and have become an essential resource for maintaining the integrity of the domain name registration and website ownership process.

A Whois record contains the contact information associated with the person, group, or company that registered a particular domain name. It also shows when the domain was registered, when it expires, and when it was last updated, and sometimes the record also provides administrative and technical contact information.
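
To make the record structure described above concrete, here is a minimal Python sketch. It assumes the plain-text Whois protocol on TCP port 43 and a reply made of "Key: value" lines. The function names and the default server (whois.iana.org) are my own choices for illustration; real registries format their records differently, so treat the parser as a starting point rather than a complete solution.

```python
import socket


def query_whois(domain, server="whois.iana.org", timeout=10.0):
    """Send a raw Whois query and return the server's text reply.

    Assumption: `server` speaks plain Whois on TCP port 43. The default
    server is only an illustrative choice.
    """
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: value' lines into a dict; repeated keys keep every value."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and lines without a key
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            record.setdefault(key.strip().lower(), []).append(value)
    return record
```

Calling parse_whois(query_whois("example.com")) should give you a dictionary you can search for fields such as the registration or expiry date, although the exact key names depend on the server that answers.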

2.    Metagoofil is an information gathering (footprinting) tool used for extracting metadata from documents that a company has made publicly available on the internet. The documents can be in many formats, such as pdf, xls, ppt, and doc. Basically, Metagoofil runs a Google search to find and download the files, then uses libraries such as PDFMiner to parse them and pull out their metadata. This yields very useful information such as usernames, which can in turn help with a brute force attack, and the versions of the software and servers in use.
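
The search step can be pictured with a few lines of Python. This is not Metagoofil's own code, only a sketch of the kind of query it relies on, assuming the standard site: and filetype: search operators. The function name and the default list of extensions are hypothetical.

```python
def build_file_queries(domain, extensions=("pdf", "doc", "xls", "ppt")):
    """Return one search query per file type, restricted to `domain`.

    Hypothetical helper for illustration: it only builds the query
    strings and does not contact any search engine.
    """
    return ["site:{} filetype:{}".format(domain, ext) for ext in extensions]


for query in build_file_queries("example.com"):
    print(query)
```

Each query asks the search engine for documents of one type hosted on the target domain; downloading the results and reading their metadata is the part the tool automates.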

3.    The Harvester is also used for information gathering; it helps you extract the email addresses and subdomains of a particular target. The Harvester is a simple Python script that searches for this information in major search engines such as Google, Yahoo, and Bing, among others.
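
The extraction idea can be sketched in Python as follows. This is not The Harvester's actual code: it assumes you already have some text (for example, saved search result pages) and simply pulls out email addresses and host names that belong to the target domain. The function name and the regular expressions are my own.

```python
import re


def harvest(text, domain):
    """Return (emails, subdomains) for `domain` found in a blob of text.

    Illustrative sketch only: host names that appear solely inside an
    email address are not reported as subdomains.
    """
    d = re.escape(domain.lower())
    # local-part @ optional sub-labels . target domain
    email_re = re.compile(r"[a-z0-9._%+-]+@(?:[a-z0-9-]+\.)*" + d + r"\b")
    # at least one label in front of the domain, not preceded by '@'
    # or by another host name character
    host_re = re.compile(r"(?<![a-z0-9.@-])((?:[a-z0-9-]+\.)+" + d + r")\b")
    lowered = text.lower()
    emails = sorted(set(email_re.findall(lowered)))
    subdomains = sorted(set(host_re.findall(lowered)))
    return emails, subdomains
```

Feeding it the text of a few result pages for the target gives two de-duplicated, sorted lists that you can carry into the next phase of testing.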

4.    Nmap can be used extensively for information gathering with the help of the Nmap Scripting Engine. It provides basic information about the target, such as its IP address, open ports, and running services. It can also run the whois lookups discussed earlier from the Nmap console, harvest email addresses (as The Harvester does), and discover additional host names or subdomains that exist on the same server. You can learn more about Nmap and other tools in my previous post (Security testing for beginners).
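
As a rough sketch of how such a scan might be driven from Python, the helper below builds an nmap command line. It assumes nmap is installed and uses the -sV (service detection), -p (ports), and --script options with the whois-domain and dns-brute scripts. The function name and the default script selection are my own choices for illustration.

```python
def nmap_recon_command(target, scripts=("whois-domain", "dns-brute"), ports=None):
    """Build an nmap argument list for basic information gathering.

    Hypothetical helper: it only assembles the arguments and does not
    run the scan.
    """
    argv = ["nmap", "-sV"]  # -sV: detect service versions on open ports
    if ports:
        argv += ["-p", ports]  # e.g. "22,80,443"
    if scripts:
        argv += ["--script", ",".join(scripts)]  # Nmap Scripting Engine scripts
    argv.append(target)
    return argv
```

You could pass the returned list to subprocess.run(argv, capture_output=True, text=True) to run the scan against a host you are authorized to test and then read the report from the captured output.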

5.    Bing, the search engine by Microsoft, has a unique feature that can help a hacker enumerate all the hostnames Bing has indexed for a specific server or IP address. Simply use the ip: operator followed by the IP address of the server, and Bing lists the websites hosted on that server. An alternative is a reverse IP lookup.
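
Here is a small Python sketch that turns an IP address into such a search URL. It assumes the ip: operator behaves as described above and that the search page lives at https://www.bing.com/search; the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlencode


def bing_ip_search_url(ip):
    """Return a Bing search URL that uses the ip: operator for `ip`.

    Assumption: the query goes to https://www.bing.com/search in the
    `q` parameter.
    """
    ipaddress.ip_address(ip)  # raises ValueError if `ip` is malformed
    return "https://www.bing.com/search?" + urlencode({"q": "ip:" + ip})


print(bing_ip_search_url("192.0.2.10"))
```

Opening the printed URL in a browser shows whatever hostnames the search engine has indexed for that address.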

6.    Blackwidow and HTTrack Website Copier are used to better understand the flow of a website: they can clone the entire domain, which helps with offline debugging and running tests locally. This is useful in cases where the server responds only on a particular network.

7.    One of the easiest and craziest ways is social engineering. It is the art of manipulating people into revealing confidential information that they are not supposed to share. It involves gaining the trust of an individual in order to obtain that information. Social engineering is a non-technical attack, but it relies on tactics for trapping a victim. It can reveal important information about an organization and its employees: department, extension, email, role, phone number, and so on.

For more information, have a look at my previous post, What is Social Engineering, which gives interesting insight into how anyone can be victimised.

To conclude: information gathering, also called footprinting or reconnaissance, is the initial step of penetration testing. The more information you gather, the more successful your penetration tests will be. If you are interested in learning further, I suggest you start using Kali Linux, the successor to BackTrack!