Seo

GitHub Wiki SEE: Search Engine Enablement: Year One

It’s been near a year since I supercharged GitHub Wiki SEE with dynamically generating sitemaps.

Since then a few things have changed:

  • GitHub has started to permit a select set of wikis to be indexed.
  • They have not budged on letting un-publically editable wikis be indexed.
  • There is now a star requirement of about 500, and it appears to be steadily decreasing.
  • Many projects have reacted by moving or configuring their wikis to indexable platforms.

As a result, traffic to GitHub Wiki SEE has dropped off dramatically.

stats

This is a good thing, as it means that GitHub is moving in the right direction.

I’m still going to keep the service up, as it’s still useful for wikis that are not yet indexed and there are still about 400,000 wikis that aren’t indexed.

Hopefully GitHub will continue to move in the right direction and allow all wikis to be indexed.

9:04 pm / github , wiki , seo , flyio

GitHub Wiki SEE: Search Engine Enablement

I built this in March, but I figure I’ll write about this project better late than never. I supercharged it in June out of curiosity to see how it would perform. It’s just a ramble. I am not a good writer but I just wanted to write stuff down.

TLDR: I mirrored all of GitHub’s wiki with my own service to get the content on Google and other search engines.

If you’ve used a search engine to search up something that could appear on GitHub, you are using my service.

https://github-wiki-see.page/

Did you know that GitHub wikis aren’t search engine optimized? In fact, GitHub excluded their own wiki pages from the search results with their robots.txt. “robots.txt” is the mechanism that sites can use to indicate to search engines whether or not to index certain sections of the site. If you search on Google or Bing, you won’t find any results unless your search query terms are directly inside the URL.

The situation with search engines is currently this. For example, if you wanted to search for information relating to Nissan Leaf support of the openpilot self-driving car project in their wiki, you could search for “nissan leaf openpilot wiki”. Unfortunately, the search results would be empty and would contain no results. If you searched for “nissan openpilot wiki”, “https://github.com/commaai/openpilot/wiki/Nissan" would show up correctly because it has all the terms in it. The content of the GitHub wiki page is not used for returning good results.

[... 1,550 words]

6:12 pm / github , wiki , seo , rust , cloud_run , bigquery , gcp

GitHub Wiki SEE

https://github-wiki-see.page/

A project to get GitHub Wikis indexed by Google Search.

Explanation is right on the front page.

If a search engine can’t see it, it may as well be invisible.

Long-term project involving Rust, BigQuery, Cloudflare Workers, and lots of random hosting.

Probably one more the more visited sites on the internet and a continous effort to keep costs low.

Has caused GitHub to revisit their policy on a limited subset of Wikis meeting some criteria.

Will continue until all Wikis are indexed even if they are publically editable or don’t meet some star count criteria.

Personally, I just got really dismayed all this documentation I was doing on a Wiki wasn’t visible to Google. Then I realized it was a good idea to make it visible. Then I realized others don’t realize it was invisible at all. This was a problem.

12:00 am / github , wiki , seo , rust , cloud_run , bigquery , gcp , flyio , cloudflare