GitHub has created a new code search experience, using its own search engine called Blackbird, specifically designed for the domain of code search.
- GitHub built its own solution because existing ones offered a poor user experience, indexed slowly, and were expensive to host.
- Blackbird uses a special type of inverted index called an n-gram index, which is useful for looking up substrings of content.
- With Blackbird, GitHub is able to index 45 million repositories representing 115 TB of code and 15.5 billion documents.
- The search engine also keeps query results consistent at the commit level, a guarantee that other search engines do not offer.
- The GitHub team optimized the order in which repositories are ingested and built Blackbird to perform well at GitHub's scale.
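
To make the n-gram index idea concrete, here is a minimal sketch using trigrams (n = 3). It is a toy illustration of the general technique, not Blackbird's actual implementation; the names `trigrams` and `TrigramIndex` are hypothetical. Each document is broken into overlapping 3-character substrings, and the index maps every trigram to the set of documents containing it. A substring query is answered by intersecting the posting sets of the query's trigrams, then verifying the surviving candidates against the stored content, because sharing all trigrams does not guarantee the full substring is present.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index: trigram -> IDs of documents that contain it."""

    def __init__(self) -> None:
        self.postings: dict[str, set[int]] = defaultdict(set)
        self.docs: dict[int, str] = {}

    def add(self, doc_id: int, content: str) -> None:
        """Store the document and record each of its trigrams."""
        self.docs[doc_id] = content
        for gram in trigrams(content):
            self.postings[gram].add(doc_id)

    def search(self, query: str) -> list[int]:
        """Return sorted IDs of documents containing query as a substring."""
        grams = trigrams(query)
        if not grams:
            # Queries shorter than 3 characters have no trigrams,
            # so this sketch falls back to scanning every document.
            candidates = set(self.docs)
        else:
            # A matching document must contain every trigram of the query.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Verify candidates: the trigram filter can yield false positives.
        return sorted(d for d in candidates if query in self.docs[d])


# Hypothetical sample documents, for illustration only.
index = TrigramIndex()
index.add(1, "def limits(self): return self.max")
index.add(2, "rate_limit = 10")
index.add(3, "print('hello')")

print(index.search("limit"))  # [1, 2]
```

The intersection step is what makes substring lookup cheap: only documents that share every trigram with the query are ever scanned in full.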
