Researcher Scans 5.6M GitLab Repositories, Uncovers 17,000 Live Secrets and a Decade of Exposed Credentials

TL;DR

A security research project led by Luke Marshall scanned 5.6 million public GitLab repositories, uncovering over 17,000 live secrets and earning more than $9,000 in bounties. Compared to Bitbucket, GitLab showed both a larger scale and a higher rate of exposed secrets.

Key Points

The research uncovered over 17,000 verified live secrets in public GitLab repositories, indicating a significant risk of credential exposure.

The use of AWS Lambda and SQS allowed for the efficient scanning of 5.6 million repositories in just 24 hours.

The study found that secrets are more likely to be exposed on the same platform they are associated with.

The presence of old, valid credentials dating back over a decade underscores the need for regular secret rotation and management.

Disclosing exposed secrets tied to more than 2,800 unique domains required significant automation and effort.

Luke Marshall's latest security research project digs into public GitLab repositories. Using TruffleHog, he scanned 5.6 million repositories and found 17,430 verified live secrets, a haul that earned him more than $9,000 in bounties. An AWS Lambda and SQS pipeline let him complete the entire scan in about 24 hours. GitLab also stood out against Bitbucket: it hosts roughly twice as many public repositories and shows a denser concentration of leaked secrets per repository.

Now for the part that made the project work. The setup started with a simple Python script that pushed every repository name into an AWS SQS queue - basically a giant, reliable to-do list. An AWS Lambda function picked up each task, ran TruffleHog on the repo, and saved whatever it found. The design was fast and resilient: the queue kept repositories from being scanned twice, and if anything crashed, the system picked up where it left off.
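The queue-and-worker pattern described above can be sketched roughly as follows. This is a minimal illustration, not the researcher's actual code: the function names, the injected `sqs_client` (for example, one created with `boto3.client("sqs")`), and the TruffleHog flags are assumptions, and the partial-failure response only takes effect if batch item failure reporting is enabled on the Lambda's SQS trigger.

```python
import json
import subprocess

SQS_BATCH_LIMIT = 10  # SQS accepts at most 10 messages per send_message_batch call


def enqueue_repos(sqs_client, queue_url, repo_urls):
    """Push repository URLs onto the queue in batches; return how many were accepted."""
    sent = 0
    for start in range(0, len(repo_urls), SQS_BATCH_LIMIT):
        chunk = repo_urls[start:start + SQS_BATCH_LIMIT]
        entries = [
            {"Id": str(start + offset), "MessageBody": url}
            for offset, url in enumerate(chunk)
        ]
        response = sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        sent += len(response.get("Successful", []))
    return sent


def scan_repo(repo_url):
    """Run TruffleHog against one repository and return its verified findings."""
    # Assumed invocation: the exact flags used in the project are not stated.
    result = subprocess.run(
        ["trufflehog", "git", repo_url, "--results=verified", "--json"],
        capture_output=True, text=True, check=True,
    )
    # With --json, TruffleHog writes one JSON object per line to stdout.
    return [json.loads(line) for line in result.stdout.splitlines() if line.strip()]


def handler(event, context, scan=scan_repo, save=print):
    """Lambda entry point for an SQS trigger: scan each queued repo, save findings."""
    failures = []
    for record in event["Records"]:
        try:
            for finding in scan(record["body"]):
                save(json.dumps(finding))
        except Exception:
            # Report only this message as failed so SQS redelivers it later.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Passing `scan` and `save` as parameters keeps the handler testable without AWS or TruffleHog installed. In a real deployment `save` would write to durable storage rather than print.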

The results were eye-opening. GitLab didn’t just have more public repositories than Bitbucket - it also leaked secrets more often. Per repository, GitLab had about 35% more verified secrets than Bitbucket, including a surprisingly high number of GitLab-specific tokens. It painted a clear picture: GitLab’s ecosystem is bigger, more active, and unfortunately, more exposed.
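As a rough sanity check, the 35% density figure can be compared against the rounded ratios listed under Key Numbers (2.8 times the verified secrets across 2.1 times the repositories). The sketch below assumes those two ratios are rounded to one decimal place. It shows the implied density ratio and the range that rounding allows.

```python
def density_ratio(secret_ratio, repo_ratio):
    """Relative secrets-per-repo density implied by two platform-to-platform ratios."""
    return secret_ratio / repo_ratio


# Point estimate from the rounded figures in the article.
point = density_ratio(2.8, 2.1)

# Bounds if each ratio was rounded to one decimal place (an assumption).
low = density_ratio(2.75, 2.15)
high = density_ratio(2.85, 2.05)

# Absolute density on GitLab, using the article's totals.
gitlab_per_thousand = 17_430 / 5_600_000 * 1000

print(f"implied density ratio: {point:.2f} (rounding range {low:.2f} to {high:.2f})")
print(f"GitLab verified secrets per 1,000 repos: {gitlab_per_thousand:.1f}")
```

The point estimate is about 1.33, and the reported 35% (a ratio of 1.35) sits inside the rounding range, so the published figures are consistent with each other.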

One of the more interesting insights from Marshall's research is 'platform-locality': secrets tend to leak on the same platform they belong to. For instance, 406 valid GitLab tokens were found in GitLab repositories, compared to just 16 in Bitbucket repositories. This highlights the importance of disciplined, large-scale scanning to identify and mitigate high-impact exposures on major Git platforms.

The project didn't stop at discovery. Marshall automated triage and disclosure for leaked secrets tied to more than 2,800 unique domains, leading to the revocation of thousands of live keys. Despite the higher volume of exposed credentials on GitLab, the total bounty payout was similar to that from the Bitbucket scan, suggesting that more leaks don't always mean higher critical impact.
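The article does not describe how the triage automation worked. One plausible first step is grouping findings by owning domain so that each organization receives a single notice. The sketch below is purely illustrative: the `domain`, `detector`, and `repo` fields and both function names are hypothetical, not taken from the research.

```python
from collections import defaultdict


def group_by_domain(findings):
    """Group findings by owning domain so each organization gets one notice."""
    grouped = defaultdict(list)
    for finding in findings:
        # "domain" is a hypothetical field; real triage would have to derive it.
        grouped[finding["domain"].strip().lower()].append(finding)
    return dict(grouped)


def draft_notice(domain, findings):
    """Build a short plain-text summary that never includes the secret itself."""
    lines = [f"Exposed credentials associated with {domain}:"]
    for finding in findings:
        lines.append(f"- {finding['detector']} secret in {finding['repo']}")
    return "\n".join(lines)


# Placeholder data for illustration only.
sample = [
    {"domain": "Example.com", "detector": "GitLab", "repo": "gitlab.com/example/app"},
    {"domain": "example.com", "detector": "AWS", "repo": "gitlab.com/example/infra"},
    {"domain": "example.org", "detector": "Slack", "repo": "gitlab.com/other/bot"},
]

for domain, items in group_by_domain(sample).items():
    print(draft_notice(domain, items))
```

Normalizing the domain before grouping avoids sending duplicate notices to the same organization when casing or whitespace differs.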

Key Numbers

5.6 million

The number of public GitLab repositories included in the full-platform secret exposure scan.

17,430

The number of verified live secrets detected across all scanned GitLab repositories.

9,000 USD

The total bounty rewards earned during the disclosure of exposed secrets.

24 hours

The total runtime required to scan all public GitLab repositories using AWS Lambda and SQS.

770 USD

The cost incurred for cloud infrastructure to complete the scanning pipeline.

2.1 times

The ratio of public GitLab repositories scanned to Bitbucket repositories scanned.

2.8 times

The ratio of verified secrets found on GitLab to verified secrets found on Bitbucket.

35%

The percentage by which leaked-secret density (secrets per repository) on GitLab exceeded that on Bitbucket.

406

The number of valid GitLab API tokens discovered leaking within GitLab-hosted repositories.

16

The number of valid GitLab API tokens discovered leaking in Bitbucket-hosted repositories.

2009

The earliest year found among commits containing valid leaked secrets during the scan.

100,000

The approximate number of public GitLab repositories added between the initial scan and publication.

2,804

The total number of unique domains associated with the exposed secrets discovered.

120

The number of organizations that received responsible disclosure notifications about leaked secrets.

30

The number of SaaS providers contacted directly to resolve exposed customer credentials.

1,000

The concurrency level used by each Lambda invocation when running TruffleHog scans.
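Taken together, the figures above imply a rough throughput and unit cost for the pipeline. The back-of-the-envelope sketch below assumes the 1,000 figure means roughly 1,000 scans running in parallel for the full 24 hours, which the article does not state explicitly.

```python
REPOS = 5_600_000      # public repositories scanned
RUNTIME_HOURS = 24     # total runtime reported
COST_USD = 770         # cloud infrastructure cost reported
CONCURRENCY = 1000     # assumed number of scans running in parallel

# Overall scan rate across the whole pipeline.
repos_per_second = REPOS / (RUNTIME_HOURS * 3600)

# Average wall-clock time each scan could take while sustaining that rate.
seconds_per_repo = CONCURRENCY / repos_per_second

# Infrastructure cost normalized per 1,000 repositories.
cost_per_thousand = COST_USD / REPOS * 1000

print(f"throughput: {repos_per_second:.1f} repos/second")
print(f"average time per scan: {seconds_per_repo:.1f} seconds")
print(f"cost per 1,000 repos: ${cost_per_thousand:.4f}")
```

Under these assumptions the pipeline handled about 65 repositories per second, with each scan averaging around 15 seconds, at a cost of roughly 14 cents per 1,000 repositories.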

People

Luke Marshall (Security Researcher)

Conducted a project scanning public GitLab repositories to identify exposed secrets.

Organizations

GitLab (Platform)

The platform where the security research identified exposed secrets, highlighting potential security issues.

Tools

TruffleHog (Security Tool)

Used by Luke Marshall to scan GitLab repositories for exposed secrets.

Timeline of Events

December 16, 2009: Oldest exposed secret found

The earliest valid secret uncovered during the GitLab scan, committed in 2009—nearly two years before GitLab itself launched.

2011: GitLab launch

GitLab was released as a Git-based code hosting platform, later becoming one of the largest public repository ecosystems.

2018: Bitbucket exposure plateau begins

Bitbucket’s leaked-secret frequency stabilized in the mid-hundreds annually, contrasting with GitLab’s rising exposure trend.

October 9, 2025: Start of GitLab repository enumeration

Beginning of the research effort during which over 5.6 million public GitLab Cloud repositories were identified via GitLab’s API.

October 2025: Full-scale secret scanning executed

All 5.6 million repositories were scanned using AWS Lambda and SQS, uncovering 17,430 verified live secrets.

November 2025: Post-scan updates

Roughly 100,000 new public GitLab repositories were created shortly after the initial study, highlighting the platform’s rapid growth.
