AI Pentest Lab
This is a hands-on experiment, not a polished guide.
The idea is simple: set up a self-contained pentest environment on a Mac using OrbStack — a Kali Linux VM as the attack machine and OWASP JuiceShop as the target — then hand the keyboard to Goose, an AI agent with access to real security tools, and see what happens.
The goal is not to prove that AI can replace a pentester. The goal is to find out exactly where it helps, where it struggles, and where it falls flat — and document all of it honestly.
What this experiment covers
- Setting up a contained pentest lab on macOS using OrbStack
- Running Kali Linux as a VM with standard pentesting tools installed
- Deploying OWASP JuiceShop, an intentionally vulnerable web app, as the target
- Configuring Goose with shell access inside Kali
- Letting Goose run recon, scanning, and exploitation attempts
- Documenting honestly what the AI managed on its own vs. what required human intervention
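Before handing the keyboard to an agent, it is worth confirming that the target is actually reachable from the attack machine. The sketch below is one minimal way to do that in Python. The function name `is_reachable` is made up for illustration, and the host and port you pass in depend entirely on how your lab is wired up.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `is_reachable("juiceshop.local", 3000)` would check a target at the hypothetical host name `juiceshop.local`. Port 3000 is assumed here because it is the default for the official Juice Shop Docker image. A `False` result only says that the TCP connection failed, not why, so treat it as a prompt to check the lab's networking rather than as a diagnosis.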
Why JuiceShop
OWASP JuiceShop is purpose-built for security training and CTF-style challenges. It contains over 100 documented vulnerabilities across the OWASP Top 10 — SQL injection, XSS, broken authentication, IDOR, and more. It’s a fair target: well-documented enough that you can verify findings, complex enough that a dumb scanner won’t get far on its own.
Why Goose
Goose is an open-source AI agent from Block that runs locally and can execute shell commands, read output, and chain tool calls. With a shell extension enabled, it can run nmap, sqlmap, nikto, and anything else available on the system. It’s one of the more capable local agents for this kind of task — and it’s already covered in the Goose – Getting Started guide.
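To make "execute shell commands, read output" concrete, here is a minimal Python sketch of a single step in that loop: run one tool, capture what it prints, and hand the result back. This is not how Goose is implemented. The function name `run_tool` and the allowlist idea are assumptions added for illustration, as one simple way to keep an agent's commands limited to the tools you expect.

```python
import os
import subprocess
import sys


def run_tool(argv: list[str], allowed: set[str], timeout: float = 300.0) -> subprocess.CompletedProcess:
    """Run one command if its executable name is on the allowlist, capturing its output."""
    if not argv:
        raise ValueError("empty command")
    tool = os.path.basename(argv[0])
    if tool not in allowed:
        raise PermissionError(f"{tool!r} is not on the allowlist")
    # Arguments are passed as a list (no shell=True), so no shell re-parses them.
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout, check=False)


if __name__ == "__main__":
    # Harmless demonstration: run the current Python interpreter instead of a scanner.
    python = sys.executable
    result = run_tool(
        [python, "-c", "print('hello from the tool')"],
        allowed={os.path.basename(python)},
    )
    print(result.returncode, result.stdout.strip())
```

In the lab, the allowlist would hold tool names such as `nmap`, `sqlmap`, and `nikto`, and the captured `stdout` is what an agent would read before deciding on its next command.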
What you will need
- A Mac (Apple Silicon or Intel)
- OrbStack installed — see the OrbStack guide
- Basic familiarity with the terminal
- No prior pentesting experience required — but it helps to know what the tools are doing