Warning: GitHub hosts a huge number of repositories. You'll have to take this into account when designing your index.
I can think of a few options:
- The legacy GitHub search API. You'll have to cope with its rate limit, though.
- This StackOverflow answer could be a good starting point for a rough estimate of the number of repos per language.
- The GitHub Archive project, which records the public GitHub timeline. (Note: the project only exposes events going back to February 12, 2011, so you won't get any data about repositories that have shown no activity since that date.)
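
For the rate limit mentioned in the first option, here is a minimal sketch of how a crawler might decide how long to pause between requests. It assumes the response carries the `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers that the GitHub API returns; the function name `seconds_until_reset` and the `now` parameter are my own, purely for illustration, and the headers are assumed to arrive as a plain dict with exactly that casing.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before sending the next request.

    `headers` is assumed to be a dict of response headers. If requests are
    still available in the current window, no wait is needed.
    """
    now = time.time() if now is None else now

    # Missing header: assume we are not throttled (an assumption, not API behavior).
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0

    # X-RateLimit-Reset is a UTC epoch timestamp, in seconds.
    reset_at = int(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, float(reset_at - now))
```

You would call this after each response and pass the result to `time.sleep()` before issuing the next request. With a budget this tight, crawling the whole site this way will still be slow, so it's probably best combined with one of the other options.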