Jul. 27, 2021
GitHub Wiki SEE: Search Engine Enablement
I built this in March, but I figure I’ll write about this project better late than never. I supercharged it in June out of curiosity to see how it would perform. It’s just a ramble. I am not a good writer but I just wanted to write stuff down.
TLDR: I mirrored all of GitHub’s wiki with my own service to get the content on Google and other search engines.
If you’ve used a search engine to search up something that could appear on GitHub, you are using my service.
Did you know that GitHub wikis aren’t search engine optimized? In fact, GitHub excluded their own wiki pages from the search results with their robots.txt. “robots.txt” is the mechanism that sites can use to indicate to search engines whether or not to index certain sections of the site. If you search on Google or Bing, you won’t find any results unless your search query terms are directly inside the URL.
The situation with search engines is currently this. For example, if you wanted to search for information relating to Nissan Leaf support of the openpilot self-driving car project in their wiki, you could search for “nissan leaf openpilot wiki”. Unfortunately, the search results would be empty and would contain no results. If you searched for “nissan openpilot wiki”, “https://github.com/commaai/openpilot/wiki/Nissan" would show up correctly because it has all the terms in it. The content of the GitHub wiki page is not used for returning good results.