GitHub Built a New Search Engine for Code 'From Scratch' in Rust

by hubie, from SoylentNews (#68RJZ)

upstart writes:

GitHub built a new code-focused search engine in Rust because popular text search engines couldn't scale enough:

The Rust programming language continues to grow in popularity and now developer platform GitHub has used it to build its new code-focused search engine, Blackbird.

Instead of perusing forums for answers, GitHub wants users to use its search engine, which is currently in beta.

[...] "At first glance, building a search engine from scratch seems like a questionable decision. Why would you do that? Aren't there plenty of existing, open source solutions out there already? Why build something new?" writes GitHub's Timothy Clem.

His short answer is that GitHub hasn't found success using general text search products to power code search.

"The user experience is poor, indexing is slow, and it's expensive to host. There are some newer, code-specific open source projects out there, but they definitely don't work at GitHub's scale," he writes.

[...] The Rust-written custom search engine, Blackbird, is more efficient and gives GitHub "substantial storage savings via deduplication and guarantees a uniform load distribution across shards", according to Pavel Avgustinov, VP of software engineering at GitHub.
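The excerpt does not say how Blackbird achieves deduplication or an even shard load. One common way to get both properties is to key each document by a hash of its content, so a rough sketch under that assumption may help. Everything here is hypothetical and for illustration only: the names `content_key`, `shard_for`, and `ShardedIndex`, the shard count, and the use of the standard library's `DefaultHasher` (a real system would more plausibly use a cryptographic content hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical content key: a 64-bit hash of the file's bytes.
/// Identical files always produce the same key.
fn content_key(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Pick a shard from the key alone, so placement depends only on content,
/// not on which repository or user the file belongs to.
fn shard_for(key: u64, num_shards: u64) -> u64 {
    key % num_shards
}

/// Toy index that records only which content keys each shard holds.
struct ShardedIndex {
    shards: Vec<HashSet<u64>>,
}

impl ShardedIndex {
    fn new(num_shards: usize) -> Self {
        assert!(num_shards > 0, "need at least one shard");
        ShardedIndex {
            shards: vec![HashSet::new(); num_shards],
        }
    }

    /// Returns true if the content was newly stored,
    /// false if an identical copy was already present (deduplicated).
    fn insert(&mut self, content: &[u8]) -> bool {
        let key = content_key(content);
        let shard = shard_for(key, self.shards.len() as u64) as usize;
        self.shards[shard].insert(key)
    }

    /// Total number of distinct documents stored across all shards.
    fn stored(&self) -> usize {
        self.shards.iter().map(|s| s.len()).sum()
    }
}

fn main() {
    let mut index = ShardedIndex::new(4);
    // The same file appearing in two repositories is stored only once.
    let first = index.insert(b"fn main() {}");
    let second = index.insert(b"fn main() {}");
    println!("first copy stored: {first}, second copy stored: {second}");
    println!("distinct documents: {}", index.stored());
}
```

Because the shard is derived from the content hash rather than from a repository name, a single very large or very popular repository cannot pile all of its files onto one shard, and files copied across many repositories are indexed once. Whether Blackbird works this way is not stated in the excerpt.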

He argues that GitHub's scale rules out using Unix 'grep' (global regular expression print) for search. Even with the data held in memory, scanning hundreds of terabytes of code for every query would take too long.
