AI Pentest Lab

This is a hands-on experiment, not a polished guide.

The idea is simple: set up a self-contained pentest environment on a Mac using OrbStack — a Kali Linux VM as the attack machine and OWASP JuiceShop as the target — then hand the keyboard to Goose, an AI agent with access to real security tools, and see what happens.

The goal is not to prove that AI can replace a pentester. The goal is to find out exactly where it helps, where it struggles, and where it falls flat — and document all of it honestly.

What this experiment covers

Setting up a contained pentest lab on macOS using OrbStack
Running Kali Linux as a VM with standard pentesting tools installed
Deploying OWASP JuiceShop as the target (intentionally vulnerable web app)
Configuring Goose with shell access inside Kali
Letting Goose run recon, scanning, and exploitation attempts
An honest breakdown of what the AI actually managed vs. what required human intervention

Why JuiceShop

OWASP JuiceShop is purpose-built for security training and CTF-style challenges. It contains over 100 documented vulnerabilities spanning the OWASP Top 10 categories: SQL injection, XSS, broken authentication, IDOR, and more. That makes it a fair target: documented well enough that you can verify every finding, complex enough that a dumb scanner won’t get far on its own.

Why Goose

Goose is an open-source AI agent from Block that runs locally and can execute shell commands, read output, and chain tool calls. With a shell extension enabled, it can run nmap, sqlmap, nikto, and anything else available on the system. It’s one of the more capable local agents for this kind of task — and it’s already covered in the Goose – Getting Started guide.
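To make "execute shell commands, read output, and chain tool calls" concrete, here is a minimal Python sketch of that loop. It is not Goose's actual implementation; the names run_tool and run_chain are hypothetical, made up for this illustration, and a real agent would let the model pick the next command instead of following a fixed list.

```python
import subprocess
import sys


def run_tool(argv, timeout=60):
    """Run one command and return what an agent would get to read back."""
    try:
        proc = subprocess.run(
            argv,
            capture_output=True,  # collect stdout and stderr instead of printing them
            text=True,            # decode bytes to str
            timeout=timeout,      # scanners can hang, so always set a limit
        )
    except FileNotFoundError:
        # The tool is not installed on this machine.
        return {"exit_code": None, "stdout": "", "stderr": f"{argv[0]}: not installed"}
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": f"{argv[0]}: timed out"}
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


def run_chain(steps):
    """Run commands in order and stop at the first one that fails.

    Hypothetical simplification: the step list is fixed up front, whereas
    an agent would read each result and decide what to run next.
    """
    results = []
    for argv in steps:
        result = run_tool(argv)
        results.append(result)
        if result["exit_code"] != 0:
            break
    return results


if __name__ == "__main__":
    # Harmless demo command; swap in a real tool inside the lab only.
    print(run_tool([sys.executable, "-c", "print('hello from the shell')"]))
```

Inside the Kali machine, a call such as run_tool(["nmap", "-sV", "-p", "3000", "localhost"]) would follow the same pattern. Port 3000 is an assumption here (it is JuiceShop's usual default); adjust it to wherever your target actually listens.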

What you will need

A Mac (Apple Silicon or Intel)
OrbStack installed — see the OrbStack guide
Basic familiarity with the terminal
No prior pentesting experience required — but it helps to know what the tools are doing

Part 1 – Setup & Experiment