AI Pentest Lab

This is a hands-on experiment, not a polished guide.

The idea is simple: set up a self-contained pentest environment on a Mac using OrbStack — a Kali Linux VM as the attack machine and OWASP JuiceShop as the target — then hand the keyboard to Goose, an AI agent with access to real security tools, and see what happens.

The goal is not to prove that AI can replace a pentester. The goal is to find out exactly where it helps, where it struggles, and where it falls flat — and document all of it honestly.

What this experiment covers

  • Setting up a contained pentest lab on macOS using OrbStack
  • Running Kali Linux as a VM with standard pentesting tools installed
  • Deploying OWASP JuiceShop as the target (intentionally vulnerable web app)
  • Configuring Goose with shell access inside Kali
  • Letting Goose run recon, scanning, and exploitation attempts
  • An honest breakdown of what the AI actually managed vs. what required human intervention

Why JuiceShop

OWASP JuiceShop is purpose-built for security training and CTF-style challenges. It contains more than 100 documented challenges covering the OWASP Top 10 and other common flaws: SQL injection, XSS, broken authentication, IDOR, and more. It’s a fair target: well-documented enough that you can verify findings, complex enough that a dumb scanner won’t get far on its own.

Why Goose

Goose is an open-source AI agent from Block that runs locally and can execute shell commands, read output, and chain tool calls. With a shell extension enabled, it can run nmap, sqlmap, nikto, and anything else available on the system. It’s one of the more capable local agents for this kind of task — and it’s already covered in the Goose – Getting Started guide.
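
To make "execute shell commands, read output, and chain tool calls" concrete, here is a minimal sketch of one such step in Python. This is not Goose's actual implementation. The function name `run_tool`, the returned dictionary layout, and the exit codes used for the error cases are all assumptions made for illustration.

```python
import shlex
import subprocess


def run_tool(command: str, timeout: int = 60) -> dict:
    """Run one command and return what an agent would read back.

    Hypothetical helper for illustration only; it does not reflect
    how Goose is implemented internally.
    """
    try:
        result = subprocess.run(
            shlex.split(command),   # split into argv, no shell interpolation
            capture_output=True,    # collect stdout and stderr
            text=True,              # decode bytes to str
            timeout=timeout,        # stop waiting on a hung tool
        )
    except FileNotFoundError:
        # The tool is not installed; 127 mirrors the usual shell convention.
        return {"command": command, "exit_code": 127,
                "stdout": "", "stderr": "command not found"}
    except subprocess.TimeoutExpired:
        # 124 mirrors the exit code of the coreutils `timeout` command.
        return {"command": command, "exit_code": 124,
                "stdout": "", "stderr": "timed out"}
    return {
        "command": command,
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
```

An agent would call something like `run_tool("nmap -sV <target>")`, feed the captured output back to the model, and let the model pick the next command. That loop is what "chaining tool calls" amounts to, and the later steps in this experiment depend on how well the model reads that output.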

What you will need

  • A Mac (Apple Silicon or Intel)
  • OrbStack installed — see the OrbStack guide
  • Basic familiarity with the terminal
  • No prior pentesting experience required — but it helps to know what the tools are doing