Information Gathering

Information gathering is the first major technical phase of a penetration test. The goal is to build a clear picture of the in-scope target environment before attempting exploitation.

Good enumeration helps answer questions like:

  • What systems are exposed?
  • What services are running?
  • What technologies are in use?
  • What names, domains, certificates, and public records exist?
  • What attack paths may be worth investigating?
  • What should be avoided because it is out of scope?

A strong penetration test is rarely won by guessing. It is usually won by careful observation, note-taking, validation, and patience.

Why Information Gathering Matters

Every target has an attack surface. This may include public websites, VPN portals, mail servers, cloud services, employee portals, DNS records, exposed storage, forgotten development systems, and internal services discovered later during the assessment.

That attack surface changes over time. New systems are deployed, old systems are forgotten, DNS records become stale, certificates reveal hostnames, and services may be exposed unintentionally during migrations or troubleshooting.

Information gathering helps turn a vague target into a structured map.

Instead of thinking:

“I need to hack this target.”

Think:

“I need to understand what exists, what it does, how it is exposed, and where weaknesses may appear.”

Passive vs Active Information Gathering

Information gathering is usually split into two categories: passive and active.

Passive Information Gathering

Passive information gathering relies on public and third-party sources, so the tester does not interact directly with the target’s systems. The goal is to collect useful information while minimizing contact with the target infrastructure.

Examples include:

  • WHOIS records
  • DNS records from public sources
  • Search engine results
  • Public code repositories
  • Certificate transparency logs
  • Job postings
  • Technology fingerprinting databases
  • Public breach data, where legally permitted
  • Shodan, Censys, and similar internet search engines

Passive reconnaissance is useful because it can reveal domains, subdomains, technologies, usernames, email formats, cloud providers, and exposed services before a single packet is sent directly to the target.
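
As a rough illustration of working with passive data, the sketch below pulls unique hostnames out of certificate transparency results that have already been exported. It assumes the records are a list of dictionaries with a `name_value` field holding one or more newline-separated names. That field name and the `hostnames_from_ct_records` helper are assumptions made for this example, so adjust them to whatever your source actually returns.

```python
def hostnames_from_ct_records(records, base_domain):
    """Collect unique hostnames under base_domain from exported CT records."""
    base = base_domain.lower().rstrip(".")
    found = set()
    for record in records:
        # Assumption: one record may list several names separated by newlines.
        for name in str(record.get("name_value", "")).splitlines():
            name = name.strip().lower().rstrip(".")
            # A wildcard entry such as *.example.com still reveals the parent name.
            if name.startswith("*."):
                name = name[2:]
            # Keep only names that belong to the domain being assessed.
            if name == base or name.endswith("." + base):
                found.add(name)
    return sorted(found)


# Hypothetical sample data, not real log output.
sample = [
    {"name_value": "www.example.com\ndev.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "mail.example.org"},
]
print(hostnames_from_ct_records(sample, "example.com"))
# ['dev.example.com', 'example.com', 'www.example.com']
```

Every name found this way is only a lead. It still has to be checked against the approved scope before any active testing.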

Active Information Gathering

Active information gathering involves direct interaction with the target environment. This may include sending packets, making web requests, querying the target’s DNS servers, scanning ports, or connecting to services.

Examples include:

  • DNS enumeration
  • TCP and UDP port scanning
  • Service detection
  • Web directory discovery
  • SMB enumeration
  • SMTP enumeration
  • SNMP enumeration
  • TLS certificate inspection
  • Manual interaction with discovered services

Active enumeration is often where the most useful technical detail appears, but it must be performed carefully and within the approved rules of engagement.
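
To make "direct interaction" concrete, here is a minimal sketch of the simplest active check: attempting a full TCP connection to one port. The `tcp_port_open` helper is a hypothetical name for this example, and the demonstration only connects to a listener it creates itself on the loopback interface. A dedicated scanner is the usual choice for real work; this only shows what a connect check does underneath.

```python
import socket


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a full TCP connection to host:port succeeds."""
    try:
        # create_connection performs the TCP handshake or raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Demonstrate against a listener we own, never against a third party.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        print("open:", tcp_port_open("127.0.0.1", port))
```

Because a connect check completes the handshake, the target can log it, which is one reason active steps must stay within the rules of engagement.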

Core Mindset

Information gathering is not just about running tools. Tools collect data, but the tester must interpret it.

For every result, ask:

  • What did I learn?
  • Is this result confirmed or only guessed?
  • Does this system belong to the approved scope?
  • What service or technology is exposed?
  • Is the version known?
  • Does the result suggest another place to enumerate?
  • Could this become part of an attack path later?

A single DNS record may lead to a forgotten web application. A certificate may reveal internal naming patterns. A web server header may identify a framework. An SMB share may expose usernames. Enumeration is the process of connecting those small pieces into a useful picture.
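
The scope question in the list above is easy to get wrong when new addresses keep appearing, so it can help to automate it. The following is a minimal sketch using Python's standard `ipaddress` module. The `in_scope` helper and both address ranges are hypothetical and stand in for whatever the rules of engagement actually list.

```python
import ipaddress


def in_scope(address, allowed, excluded=()):
    """Return True if address is in an allowed network and in no excluded one."""
    ip = ipaddress.ip_address(address)
    is_allowed = any(ip in ipaddress.ip_network(net) for net in allowed)
    is_excluded = any(ip in ipaddress.ip_network(net) for net in excluded)
    return is_allowed and not is_excluded


ALLOWED = ["192.0.2.0/24"]      # hypothetical approved range
EXCLUDED = ["192.0.2.128/25"]   # hypothetical carve-out inside that range

for candidate in ("192.0.2.25", "192.0.2.200", "198.51.100.7"):
    print(candidate, in_scope(candidate, ALLOWED, EXCLUDED))
# 192.0.2.25 True
# 192.0.2.200 False
# 198.51.100.7 False
```

Exclusions are checked separately from the allow list so that a carve-out inside an approved range is never tested by accident.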

Information Gathering Workflow

A practical workflow looks like this:

  1. Confirm the scope.
  2. Identify domains, IP ranges, and hostnames.
  3. Collect passive intelligence.
  4. Enumerate DNS records and subdomains.
  5. Discover live hosts.
  6. Scan TCP and UDP ports.
  7. Fingerprint services.
  8. Enumerate each service manually.
  9. Record evidence and commands.
  10. Build a target map.
  11. Prioritize likely attack paths.

Do not rush from discovery to exploitation. Missing one service during enumeration can mean missing the easiest path into the environment.
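
One way to avoid missing a service is to keep the target map as data and ask it what has not been looked at yet. The sketch below groups observations by host and lists open ports with no identified service. The tuple layout, the function names, and the sample values are all assumptions for illustration, not the output format of any particular tool.

```python
from collections import defaultdict


def build_target_map(observations):
    """Group (host, port, protocol, service) observations by host."""
    target_map = defaultdict(dict)
    for host, port, protocol, service in observations:
        target_map[host][(port, protocol)] = service
    return dict(target_map)


def unidentified_services(target_map):
    """List (host, port, protocol) entries where no service is recorded yet."""
    return sorted(
        (host, port, protocol)
        for host, ports in target_map.items()
        for (port, protocol), service in ports.items()
        if not service
    )


# Hypothetical observations; an empty string means "open but not fingerprinted".
observations = [
    ("192.0.2.25", 22, "tcp", "ssh"),
    ("192.0.2.25", 445, "tcp", "smb"),
    ("192.0.2.25", 161, "udp", ""),
    ("192.0.2.30", 8080, "tcp", ""),
]
target_map = build_target_map(observations)
print(unidentified_services(target_map))
# [('192.0.2.25', 161, 'udp'), ('192.0.2.30', 8080, 'tcp')]
```

Anything this list returns is a candidate for steps 7 and 8 before moving on to exploitation.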

What to Document

Your notes should be detailed enough that you can return later and understand exactly what happened.

At minimum, record:

  • Target hostname or IP address
  • Commands used
  • Tool output
  • Open ports
  • Service versions
  • Interesting web paths
  • Usernames or naming patterns
  • Authentication portals
  • Error messages
  • Screenshots where useful
  • Possible vulnerabilities
  • Follow-up tasks

Good notes make exploitation easier, privilege escalation faster, and report writing much less painful.
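
If you prefer structured notes over free text, a small record type can keep the same fields for every target. The sketch below is one possible shape, assuming a plain-text rendering is wanted. The `TargetNotes` class and its field names are hypothetical and cover only part of the list above.

```python
from dataclasses import dataclass, field


@dataclass
class TargetNotes:
    """Per-target notes; extend with commands, evidence, and screenshots as needed."""
    target: str
    hostname: str = ""
    open_ports: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    follow_up: list = field(default_factory=list)

    def render(self):
        """Return the notes as plain text, one section per category."""
        lines = [f"Target: {self.target}"]
        if self.hostname:
            lines.append(f"Hostname: {self.hostname}")
        sections = (
            ("Open ports", self.open_ports),
            ("Interesting findings", self.findings),
            ("Follow-up", self.follow_up),
        )
        for title, items in sections:
            lines.append("")
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


notes = TargetNotes(target="192.0.2.25", hostname="web01.example.local")
notes.open_ports.append("22/tcp SSH")
notes.follow_up.append("Enumerate SMB manually")
print(notes.render())
```

Empty sections are still printed, so a missing category is visible as a gap rather than silently dropped.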

Example Target Notes Format

```text
Target: 192.0.2.25
Hostname: web01.example.local
Status: Live

Open ports:
- 22/tcp SSH OpenSSH 8.x
- 80/tcp HTTP Apache
- 443/tcp HTTPS Apache
- 445/tcp SMB

Interesting findings:
- HTTPS certificate contains additional hostname: dev.example.local
- Web server redirects to /login
- robots.txt references /backup
- SMB allows anonymous connection but no readable shares yet

Follow-up:
- Add dev.example.local to /etc/hosts
- Run web content discovery against both hostnames
- Enumerate SMB manually
- Check Apache version and modules
```