Why another open source penetration testing framework?

IBM’s 2024 Cost of a Data Breach report puts the average breach at $4.88 million. That number keeps climbing. And yet most engineering teams still choose between two bad options: pay $20k+ for a manual pentest engagement that takes three weeks, or run an automated web security scanner that buries you in false positives.

We spent two years watching this play out across CloudDrove’s client base. Teams would run ZAP or Nuclei on a Friday, get back 400+ findings on Monday, spend the next sprint triaging—and still miss the SSRF that actually mattered. Something was broken in the workflow, not the tools themselves.

ReconX started as an internal project to fix that workflow. It’s now open source.

What ReconX actually does (and how it differs from Burp, ZAP, and Nuclei)

ReconX is an open source penetration testing framework that runs 26 scanner modules and feeds every result through an AI analysis layer before anything reaches your report. The scanners cover the full OWASP Top 10—and then some.

But listing 26 scanners tells you nothing useful. Here’s what matters about a few of them:

SQL Injection scanner — Most automated tools test for SQLi by injecting a single-quote and watching for a database error. That catches maybe 30% of real-world injection points. Our module runs union-based, blind boolean, time-based, and error-based detection across PostgreSQL, MySQL, MSSQL, and SQLite query patterns. It also tests parameterized endpoints that other scanners skip because the responses look “normal” at first glance.
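The time-based detection mentioned above boils down to a timing heuristic: inject a delay payload (e.g. ' AND SLEEP(5)-- ) and check whether the response took roughly that much longer than baseline. Here's a rough sketch of that decision logic — the function name and thresholds are illustrative, not ReconX's actual code:

```python
import statistics

SLEEP_SECONDS = 5.0  # delay requested by the injected payload

def looks_time_based_injectable(baseline_times, injected_time,
                                sleep_seconds=SLEEP_SECONDS):
    """Return True if the injected request took roughly `sleep_seconds`
    longer than the typical baseline response."""
    typical = statistics.median(baseline_times)
    spread = statistics.pstdev(baseline_times) or 0.05
    # Require the delay to stand well clear of normal timing jitter.
    return injected_time - typical >= sleep_seconds - 3 * spread

# Baseline ~0.2s; the injected request took 5.3s.
print(looks_time_based_injectable([0.19, 0.21, 0.20, 0.22], 5.3))   # True
print(looks_time_based_injectable([0.19, 0.21, 0.20, 0.22], 0.25))  # False
```

In practice you'd collect several baseline samples per endpoint, since a single slow response on a loaded server is indistinguishable from a successful delay injection.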

SSRF scanner — Server-side request forgery is the vulnerability class that consistently slips through automated scanners. OWASP's Top 10 data found that 94% of applications were tested for some form of broken access control, and SSRF often hides behind those same access-control gaps. Our module tests internal IP ranges, cloud metadata endpoints (169.254.169.254 and its IPv6 equivalent), DNS rebinding scenarios, and protocol smuggling via redirect chains.
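To make the probe targets concrete, here's a sketch of how an SSRF module might enumerate them for a suspect parameter. The function name and target list are illustrative assumptions, not ReconX's real interface:

```python
from urllib.parse import quote

# Common SSRF probe destinations: cloud metadata endpoints and
# internal/loopback addresses. (fd00:ec2::254 is AWS's IMDS IPv6 address.)
INTERNAL_TARGETS = [
    "http://169.254.169.254/latest/meta-data/",  # link-local cloud metadata
    "http://[fd00:ec2::254]/latest/meta-data/",  # AWS IMDS over IPv6
    "http://127.0.0.1:8080/",                    # loopback service
    "http://10.0.0.1/",                          # common internal range
]

def build_ssrf_probes(base_url, param):
    """Return one probe URL per internal target, URL-encoding the payload."""
    return [f"{base_url}?{param}={quote(t, safe='')}" for t in INTERNAL_TARGETS]

probes = build_ssrf_probes("https://app.example.com/fetch", "url")
print(probes[0])
```

A real module would also rotate encodings (decimal IPs, mixed-case schemes, redirect chains) to get past naive denylists.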

WAF detection and adaptation — Before any scanner module fires, ReconX fingerprints the target’s web application firewall. This isn’t just a nice-to-have. Without it, half your payloads get blocked silently and you end up with a report full of “not vulnerable” findings that are actually “not tested.” The WAF module identifies Cloudflare, AWS WAF, Akamai, ModSecurity, and others, then adjusts payload encoding across all subsequent modules.
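WAF fingerprinting of this kind usually comes down to matching response headers and cookies against known signatures. A minimal sketch of the idea — the signature table and function are assumptions for illustration, not ReconX's implementation:

```python
# Well-known response markers for a few WAF/CDN products.
WAF_SIGNATURES = {
    "Cloudflare":  lambda h: "cf-ray" in h or h.get("server", "").lower() == "cloudflare",
    "AWS WAF":     lambda h: "x-amzn-requestid" in h or "awsalb" in h.get("set-cookie", "").lower(),
    "Akamai":      lambda h: "akamai" in h.get("server", "").lower(),
    "ModSecurity": lambda h: "mod_security" in h.get("server", "").lower(),
}

def fingerprint_waf(headers):
    """Return the first matching WAF name, or None. Header keys lowercased."""
    h = {k.lower(): v for k, v in headers.items()}
    for name, match in WAF_SIGNATURES.items():
        if match(h):
            return name
    return None

print(fingerprint_waf({"CF-RAY": "8c1", "Server": "cloudflare"}))  # Cloudflare
```

The payoff isn't the identification itself — it's that subsequent modules can switch payload encodings once they know which product is filtering them.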

JWT analysis — We built this after seeing the same JWT misconfiguration pattern at five different clients in a single quarter: algorithm confusion attacks where the server accepts "none" as a valid algorithm. The module also tests for weak signing keys, expired token acceptance, and "kid" header injection.
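The alg=none test is simple enough to show in full: forge an unsigned token and see whether the server treats it as authentic. This standalone sketch builds such a token (function names are ours, not ReconX's):

```python
import base64
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def forge_none_token(claims: dict) -> str:
    """Build an unsigned JWT with alg=none. A server that accepts this
    token as valid is vulnerable to algorithm confusion."""
    header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    return f"{header}.{payload}."  # empty third segment: no signature

token = forge_none_token({"sub": "admin", "role": "admin"})
print(token)
```

Sending this token to an endpoint that requires authentication — and getting a 200 back — is about as unambiguous as findings get.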

Tools like Burp Suite Professional go deeper on manual testing workflows—if you have a skilled operator and a license budget. ZAP is excellent for CI integration but doesn’t reduce the triage burden. Nuclei has a massive template library for known CVEs but won’t find application-specific logic flaws. ReconX occupies different ground: automated scanning with an AI layer that validates findings before they reach a human.

Why multi-LLM support is an engineering decision, not a marketing checkbox

ReconX supports Anthropic Claude, OpenAI GPT-4, Google Gemini, and local models through Ollama. The natural question is: why four providers?

The honest answer is that no single model is best at every part of the analysis pipeline. In our testing, Claude performed strongest at reasoning through multi-step attack paths. GPT-4 was more reliable at structured output for report generation. Gemini handled large scan datasets well within its context window. And Ollama matters for the teams—especially in defense and finance—where scan data cannot leave the network. Period.

The AI engine handles five jobs: validating whether a finding is a true positive, mapping how findings chain together into attack paths, generating context-aware payloads for verification, scoring severity with business context, and writing the final report. Swap providers freely. The pipeline stays the same.
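Provider-swapping works because each stage only ever needs a text-completion function. A rough sketch of that separation — class and method names here are illustrative, not ReconX's real interfaces:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AnalysisPipeline:
    # Any provider reduces to a prompt-in, text-out callable:
    # Claude, GPT-4, Gemini, or a local Ollama model.
    complete: Callable[[str], str]

    def validate(self, finding: str) -> str:
        return self.complete(f"Is this a true positive? Evidence: {finding}")

    def score(self, finding: str, context: str) -> str:
        return self.complete(f"Severity of {finding} given {context}")

# A stub stands in for a real LLM client in this sketch.
pipeline = AnalysisPipeline(complete=lambda prompt: f"analyzed: {prompt[:20]}")
print(pipeline.validate("reflected XSS on /search"))
```

Because the five jobs share one interface, swapping providers is a constructor argument, not a rewrite.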

Scan profiles built around real engagements

Not every scan needs maximum depth. We shaped five profiles around actual use cases our team runs:

  • Quick Scan: 8 modules, finishes in under 5 minutes. Good for pre-merge checks or verifying a specific fix.
  • Standard: 18 modules. The default for weekly security reviews.
  • Deep Scan: All 26 modules. Full OWASP scanner tool coverage for quarterly assessments or new application onboarding.
  • Stealth: 12 modules with throttled request rates and randomized timing. Built for production systems where a burst of 200 requests/second will page the on-call engineer.
  • API Only: 10 modules targeting REST and GraphQL endpoints, including authentication flow testing and rate limit bypass checks.
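Internally, a profile is just a named bundle of module selections and rate settings. A hypothetical representation of the five profiles above (field names are ours; the module counts come from the list):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    name: str
    modules: int          # how many scanner modules the profile enables
    throttled: bool = False  # randomized timing / reduced request rate

PROFILES = {
    "quick":    Profile("Quick Scan", 8),
    "standard": Profile("Standard", 18),
    "deep":     Profile("Deep Scan", 26),
    "stealth":  Profile("Stealth", 12, throttled=True),
    "api":      Profile("API Only", 10),
}

print(PROFILES["deep"].modules)  # 26
```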

Reports come out in HTML (interactive, with filterable tables and charts), PDF (formatted for executives who won’t click an HTML file), and JSON (for feeding into your SIEM or CI/CD pipeline).
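The JSON output is what makes pipeline gating possible — for example, failing a build when verified high-severity findings exist. The schema below is an assumption for illustration, not ReconX's documented format:

```python
import json

report = json.loads("""
{"findings": [
  {"title": "SQL injection in /login", "severity": "high", "verified": true},
  {"title": "Missing CSP header", "severity": "low", "verified": true}
]}
""")

# Gate on verified high/critical findings only, so unverified AI
# suggestions never block a deploy.
blocking = [f for f in report["findings"]
            if f["severity"] in ("high", "critical") and f["verified"]]

print(f"{len(blocking)} blocking finding(s)")  # 1 blocking finding(s)
```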

Quick start: two commands to your first scan

pip install reconx
reconx scan --target example.com --profile standard

To enable AI-powered analysis and get validated findings instead of raw scanner output:

export ANTHROPIC_API_KEY="your-key-here"
reconx scan --target example.com --profile deep --ai claude

That’s it. No YAML configuration files, no Docker compose stacks, no 40-page setup guide.

What we learned building this

A few engineering decisions worth sharing, since other teams building security tooling might find them useful.

Scanner ordering matters more than scanner count. Early versions ran all modules in parallel. The results were noisy because later scanners didn’t have context from earlier ones. Now, reconnaissance modules (DNS, subdomain enumeration, technology fingerprinting) run first and feed target metadata into the vulnerability scanners. The false positive rate dropped roughly 35% from that change alone.
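The ordering lesson can be sketched in a few lines: recon output becomes shared context, and vulnerability modules select payloads from it instead of firing blind. Function names and the tech list are illustrative:

```python
def run_recon(target):
    # Stands in for DNS, subdomain enumeration, and tech fingerprinting.
    return {"target": target, "tech": ["nginx", "postgresql"]}

def select_vuln_modules(context):
    # With fingerprint data available, a scanner can skip irrelevant
    # payload families: no MSSQL payloads against a PostgreSQL stack.
    if "postgresql" in context["tech"]:
        return ["sqli-postgres", "ssrf"]
    return ["sqli-generic", "ssrf"]

context = run_recon("example.com")
print(select_vuln_modules(context))  # ['sqli-postgres', 'ssrf']
```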

WAF interference is the silent accuracy killer. We burned weeks debugging “missed” vulnerabilities that turned out to be WAF-blocked payloads returning clean 200 responses. The WAF detection module exists because of real pain, not because it looked good on a feature list.

AI hallucination in security reports is dangerous, not just annoying. A hallucinated vulnerability in a pentest report can send a team chasing a ghost for days. We added a verification step where the AI must cite the specific scanner output and raw HTTP response that supports each finding. If it can’t cite evidence, the finding gets flagged as unverified rather than dropped—because sometimes the AI identifies a real pattern that the scanner didn’t explicitly flag.
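The evidence-citation rule described above amounts to a simple invariant: no cited scanner output and raw response, no "verified" status. A sketch of that check (field names are ours, not ReconX's schema):

```python
def verify_finding(finding):
    """Mark a finding 'verified' only if it cites both the scanner output
    and the raw HTTP response that support it; otherwise flag it
    'unverified' rather than dropping it."""
    has_evidence = bool(finding.get("scanner_output")) and \
                   bool(finding.get("raw_response"))
    return {**finding, "status": "verified" if has_evidence else "unverified"}

f1 = verify_finding({"title": "SSRF via redirect",
                     "scanner_output": "ssrf module hit on /fetch",
                     "raw_response": "HTTP/1.1 200 OK"})
f2 = verify_finding({"title": "Hallucinated RCE"})
print(f1["status"], f2["status"])  # verified unverified
```

Keeping unverified findings visible, rather than silently discarding them, is what lets a human catch the occasional real pattern the scanners missed.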

What ReconX cannot do yet

We’re being direct about the current limitations. ReconX does not yet support authenticated scanning—it cannot log into your application and test pages behind authentication flows. This is the single biggest gap and it’s at the top of the roadmap. If your critical attack surface is behind a login page, you still need a manual tester or a tool like Burp Suite for that portion.

The roadmap also includes CI/CD plugins for GitHub Actions and GitLab CI, a web dashboard for managing scan targets across teams, and expanded modules for GraphQL-specific and serverless function vulnerabilities.

Get involved

ReconX is MIT-licensed and we actively review every pull request. Whether you want to improve a detection module, add a new scanner, or just report a bug—the project is open.

Read more about how ReconX covers the OWASP Top 10, explore how AI enhances penetration testing beyond simple summaries, or see how ReconX compares to Burp Suite, Nuclei, and ZAP.

Visit the GitHub repository to get started, file issues, or submit pull requests.