GitHub Built a New Search Engine for Code 'From Scratch' in Rust
upstart writes:
The Rust programming language continues to grow in popularity and now developer platform GitHub has used it to build its new code-focused search engine, Blackbird.
Instead of perusing forums for answers, GitHub wants users to use its search engine, which is currently in beta.
[...] "At first glance, building a search engine from scratch seems like a questionable decision. Why would you do that? Aren't there plenty of existing, open source solutions out there already? Why build something new?" writes GitHub's Timothy Clem.
His short answer is that GitHub hasn't found success using general text search products to power code search.
"The user experience is poor, indexing is slow, and it's expensive to host. There are some newer, code-specific open source projects out there, but they definitely don't work at GitHub's scale," he writes.
[...] The Rust-written custom search engine, Blackbird, is more efficient and gives GitHub "substantial storage savings via deduplication and guarantees a uniform load distribution across shards", according to Pavel Avgustinov, VP of software engineering at GitHub.
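As a rough illustration of how deduplication and even shard load can come from the same mechanism, here is a minimal Rust sketch that assigns content to shards by a hash of the content itself. The summary above does not say how Blackbird does this, so everything here is an assumption for illustration: the names `ShardedStore`, `content_key`, and `shard_for` are hypothetical, and the standard library's `DefaultHasher` stands in for the collision-resistant hash a real system would need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical content key: a 64-bit hash of the file's bytes.
/// Assumption: a production system would use a collision-resistant hash instead.
fn content_key(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Map a content key to a shard. Because hash values are spread roughly
/// uniformly, each shard receives a similar share of the content.
fn shard_for(key: u64, num_shards: usize) -> usize {
    (key % num_shards as u64) as usize
}

/// Hypothetical in-memory store: one map per shard, keyed by content hash.
struct ShardedStore {
    shards: Vec<HashMap<u64, Vec<u8>>>,
}

impl ShardedStore {
    fn new(num_shards: usize) -> Self {
        ShardedStore {
            shards: (0..num_shards).map(|_| HashMap::new()).collect(),
        }
    }

    /// Store the content once. Returns true if it was newly stored,
    /// false if identical content was already present (deduplicated).
    fn insert(&mut self, content: &[u8]) -> bool {
        let key = content_key(content);
        let shard = shard_for(key, self.shards.len());
        if self.shards[shard].contains_key(&key) {
            return false;
        }
        self.shards[shard].insert(key, content.to_vec());
        true
    }

    /// Total number of distinct documents held across all shards.
    fn stored_documents(&self) -> usize {
        self.shards.iter().map(|s| s.len()).sum()
    }
}

fn main() {
    let mut store = ShardedStore::new(4);
    store.insert(b"fn main() {}");
    store.insert(b"fn main() {}"); // identical file, e.g. from a fork
    store.insert(b"print('hello')");
    println!("distinct documents stored: {}", store.stored_documents());
}
```

In this sketch, identical files always hash to the same key and therefore land on the same shard, so a copy in a fork costs no extra storage, while keying on content rather than on repository keeps one very large repository from overloading a single shard.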
He argues that GitHub's scale rules out using Unix 'grep' (global regular expression print) for search: processing hundreds of terabytes of code in memory would be too slow, and queries would take too long.