@devopslinks ・ Dec 01,2025

A security research project led by Luke Marshall scanned 5.6 million GitLab repositories, uncovering over 17,000 live secrets and earning $9,000 in bounties, highlighting GitLab's larger scale and higher exposure risk compared to Bitbucket.
The research uncovered over 17,000 verified live secrets in public GitLab repositories, indicating a significant risk of credential exposure.
The use of AWS Lambda and SQS allowed for the efficient scanning of 5.6 million repositories in just 24 hours.
The study found that secrets are more likely to be exposed on the same platform they are associated with.
The presence of old, valid credentials dating back over a decade underscores the need for regular secret rotation and management.
The process of disclosing exposed secrets to over 2,800 organizations required significant automation and effort.
Luke Marshall's latest security research project digs into public GitLab repositories. Armed with TruffleHog, he scanned 5.6 million repositories and unearthed over 17,000 verified live secrets, a haul that netted him more than $9,000 in bounties. The engineering behind it was a combination of AWS Lambda and SQS, which let him complete the entire scan in just 24 hours. Compared to Bitbucket, GitLab stood out on both scale and exposure: it had nearly twice as many public repositories and a denser concentration of leaked secrets per repository.
Now for the part that made this whole project actually work. The setup started with a simple Python script that pushed every repository name into an AWS SQS queue, which acted as a giant, reliable to-do list. An AWS Lambda function picked up each task, ran TruffleHog on the repo, and saved whatever it found. The pipeline was fast and resilient: repositories were not scanned twice, and if anything crashed, the system picked up where it left off.
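The queue-and-worker pattern described above can be sketched roughly as follows. This is a minimal illustration, not Marshall's actual code: the message format (`repo_url`), the function names, and the `save` callback are assumptions made for the example, and the TruffleHog flags (`--results=verified`, `--json`) assume a recent TruffleHog v3 release. The SQS client is passed in (for example `boto3.client("sqs")`) so the logic can be exercised without AWS.

```python
import json
import subprocess

BATCH_SIZE = 10  # SQS send_message_batch accepts at most 10 entries per call


def enqueue_repos(sqs_client, queue_url, repo_urls):
    """Producer: push one message per repository onto the queue, 10 at a time."""
    sent = 0
    for start in range(0, len(repo_urls), BATCH_SIZE):
        batch = repo_urls[start:start + BATCH_SIZE]
        entries = [
            # Id only has to be unique within a single batch request.
            {"Id": str(start + i), "MessageBody": json.dumps({"repo_url": url})}
            for i, url in enumerate(batch)
        ]
        response = sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        sent += len(response.get("Successful", []))
    return sent


def scan_repo(repo_url, run=subprocess.run):
    """Run TruffleHog on one repo and return its findings as parsed JSON objects."""
    proc = run(
        ["trufflehog", "git", repo_url, "--results=verified", "--json"],
        capture_output=True, text=True, check=False,
    )
    findings = []
    for line in proc.stdout.splitlines():
        line = line.strip()
        if line.startswith("{"):  # skip anything that is not a JSON finding
            findings.append(json.loads(line))
    return findings


def handler(event, context, scan=scan_repo, save=print):
    """Lambda entry point for an SQS trigger: one record per queued repository."""
    for record in event["Records"]:
        repo_url = json.loads(record["body"])["repo_url"]
        for finding in scan(repo_url):
            # `save` is a stand-in for whatever storage the real pipeline used.
            save(json.dumps({"repo_url": repo_url, "finding": finding}))
```

One design note: a standard SQS queue delivers each message at least once, so a worker like this would also need an idempotent `save` step (or a FIFO queue) to truly avoid recording the same repository twice.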
The results were eye-opening. GitLab didn't just have more public repositories than Bitbucket - it also leaked secrets more often. Measured per repository, GitLab had about 35% more exposed credentials than Bitbucket, including a surprisingly high number of GitLab-specific tokens. It painted a clear picture: GitLab's ecosystem is bigger, more active, and unfortunately, more exposed.
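To make the density comparison concrete, here is a small sketch of the arithmetic involved. The GitLab inputs are the rounded figures quoted in this article (5.6 million repositories, 17,430 verified secrets); the function names are made up for illustration, and no Bitbucket baseline is plugged in because this article does not state the raw Bitbucket counts.

```python
def secrets_per_thousand(verified_secrets, repos_scanned):
    """Leaked-secret density: verified secrets per 1,000 repositories."""
    return verified_secrets / repos_scanned * 1000


def relative_increase(density, baseline_density):
    """Percent by which `density` exceeds `baseline_density`."""
    return (density / baseline_density - 1) * 100


# GitLab figures as quoted in the article (rounded, so the result is approximate).
gitlab_density = secrets_per_thousand(17_430, 5_600_000)
print(f"GitLab: {gitlab_density:.2f} verified secrets per 1,000 repos")
```

Given a Bitbucket density computed the same way, `relative_increase(gitlab_density, bitbucket_density)` would return the kind of percentage the article reports.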
One of the more interesting insights from Marshall's research was the idea of 'platform-locality' - secrets tend to leak on the same platform they're associated with. For instance, 406 valid GitLab keys were found on GitLab repositories, compared to just 16 on Bitbucket. This highlights the importance of disciplined, large-scale scanning to identify and mitigate high-impact exposures on major Git platforms. But the project didn't just stop at discovery; it automated the triage process to disclose the leaked secrets to over 2,800 organizations, leading to the revocation of thousands of live keys. Despite the higher volume of exposed credentials on GitLab, the total bounty payout was similar to that of Bitbucket, suggesting that more leaks don't always equate to a higher critical impact.
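The platform-locality comparison boils down to counting verified findings of one credential type per hosting platform. A minimal sketch, assuming each finding has already been reduced to a record with hypothetical `platform` and `detector` fields (these names are illustrative, not TruffleHog's output schema), and using made-up sample records rather than the study's data:

```python
from collections import Counter


def locality_counts(findings, detector):
    """Count verified findings of one detector type per hosting platform."""
    return Counter(f["platform"] for f in findings if f["detector"] == detector)


# Synthetic sample records, for illustration only.
sample = [
    {"platform": "gitlab", "detector": "GitLab"},
    {"platform": "gitlab", "detector": "GitLab"},
    {"platform": "bitbucket", "detector": "GitLab"},
    {"platform": "gitlab", "detector": "AWS"},
]

counts = locality_counts(sample, "GitLab")
print(counts["gitlab"], counts["bitbucket"])
```

Run against the real result sets, a tally like this is what would surface a gap such as 406 versus 16.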
5.6 million: The number of public GitLab repositories included in the full-platform secret exposure scan.
Over 17,000: The number of verified live secrets detected across all scanned GitLab repositories.
More than $9,000: The total bounty rewards earned during the disclosure of exposed secrets.
24 hours: The total runtime required to scan all public GitLab repositories using AWS Lambda and SQS.
The cost incurred for cloud infrastructure to complete the scanning pipeline.
Nearly 2x: The growth factor representing how many more GitLab repositories were scanned compared to Bitbucket.
The growth factor representing how many more verified secrets were found on GitLab compared to Bitbucket.
About 35%: The percentage increase in leaked-secret density on GitLab repositories relative to Bitbucket repositories.
406: The number of valid GitLab API tokens discovered leaking within GitLab-hosted repositories.
16: The number of valid GitLab API tokens discovered leaking in Bitbucket-hosted repositories.
2009: The earliest year found among commits containing valid leaked secrets during the scan.
About 100,000: The approximate number of public GitLab repositories added between the initial scan and publication.
The total number of unique domains associated with the exposed secrets discovered.
Over 2,800: The number of organizations that received responsible disclosure notifications about leaked secrets.
The number of SaaS providers contacted directly to resolve exposed customer credentials.
The concurrency level used by each Lambda invocation when running TruffleHog scans.
Luke Marshall: Conducted a project scanning public GitLab repositories to identify exposed secrets.
GitLab: The platform where the security research identified exposed secrets, highlighting potential security issues.
TruffleHog: Used by Luke Marshall to scan GitLab repositories for exposed secrets.
The earliest valid secret uncovered during the GitLab scan was committed in 2009, nearly two years before GitLab itself launched.
GitLab was released as a Git-based code hosting platform, later becoming one of the largest public repository ecosystems.
Bitbucket’s leaked-secret frequency stabilized in the mid-hundreds annually, contrasting with GitLab’s rising exposure trend.
Beginning of the research effort during which over 5.6 million public GitLab Cloud repositories were identified via GitLab’s API.
All 5.6 million repositories were scanned using AWS Lambda and SQS, uncovering over 17,430 verified live secrets.
Roughly 100,000 new public GitLab repositories were created shortly after the initial study, highlighting the platform’s rapid growth.