IEEE Top Programming Languages: Design, Methods, and Data Sources 2020
The IEEE Spectrum Top Programming Languages app synthesizes 11 metrics from eight sources to arrive at an overall ranking of language popularity. The sources cover contexts that include social chatter, open-source code production, and job postings. Below, you'll find information about how we choose which languages to track and the data sources we use to do it.
What We Track
Starting from a list of over 300 programming languages gathered from GitHub, we looked at the volume of results found on Google when we searched for each one using the template "X programming" where "X" is the name of the language. We filtered out languages that had a very low number of search results and then went through the remaining entries by hand to narrow them down to the most interesting. We labeled each language according to whether or not it finds significant use in one or more of the following categories: Web, mobile, enterprise/desktop, or embedded environments.
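As a rough illustration of the first filtering pass, the sketch below builds the quoted search phrase and drops languages whose hit count falls under a cutoff. The function names, the sample counts, and the cutoff value are hypothetical; the article does not state the threshold that was actually used, and the manual curation step is not modeled here.

```python
from __future__ import annotations


def search_phrase(language: str) -> str:
    """Build the quoted phrase for a language, e.g. "Rust programming"."""
    return f'"{language} programming"'


def filter_by_hits(hit_counts: dict[str, int], min_hits: int) -> list[str]:
    """Keep languages with at least min_hits results, most hits first."""
    kept = [name for name, hits in hit_counts.items() if hits >= min_hits]
    return sorted(kept, key=lambda name: hit_counts[name], reverse=True)


if __name__ == "__main__":
    # Made-up counts for illustration only; real values would come from a search API.
    sample = {"Python": 1_200_000, "Cobol": 90_000, "Obscurelang": 40}
    print(search_phrase("Python"))
    print(filter_by_hits(sample, min_hits=1_000))  # hypothetical cutoff
```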
Our final set of 52 languages includes names familiar to most computer users, such as Java, stalwarts like Cobol and Fortran, and languages that thrive in niches, like Haskell. We gauged the popularity of each using 11 metrics across eight sources in the following ways:
Google Search
We measured the number of hits for each language by using Google's API to search for the template "X programming." This number indicates the volume of online information resources about each programming language. We took the measurement in April 2020, so it represents a snapshot of the Web at that particular moment in time. This measurement technique is also used by the oft-cited TIOBE rankings.
Google Trends
We measured the index of each language as reported by Google Trends using the template "X programming" in April 2020. This number indicates the demand for information about the particular language, because Google Trends measures how often people search for the given term. As it measures searching activity rather than information availability, Google Trends can be an early cue to up-and-coming languages. Our methodology here is similar to that of the Popularity of Programming Language (PYPL) ranking.
Twitter
We measured the number of hits on Twitter for the template "X programming" for the 12 months ending April 2020 using the Twitter Search API. This number indicates the amount of chatter on social media for the language and reflects the sharing of online resources like news articles or books, as well as physical social activities such as hackathons.
GitHub
GitHub is a site where programmers can collaboratively store repositories of code. Using the GitHub API and GitHub tags, we measured two things for the 12 months ending April 2020: (1) the number of new repositories created for each language, and (2) the number of active repositories for each language, where "active" means that someone has edited the code in a particular repository. The number of new repositories measures fresh activity around the language, whereas the number of active repositories measures the ongoing interest in developing each language.
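One way to approximate these two counts is with GitHub's repository search endpoint, using the `created:` qualifier for new repositories and the `pushed:` qualifier as a stand-in for "active." The sketch below only builds the request URLs. The choice of endpoint, the mapping of "active" to `pushed:`, the exact date boundaries, and the function name are assumptions, since the article does not say which calls were made.

```python
from datetime import date
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/repositories"


def repo_count_url(language: str, qualifier: str, start: date, end: date) -> str:
    """Build a search URL; qualifier is "created" (new) or "pushed" (active)."""
    if qualifier not in ("created", "pushed"):
        raise ValueError("qualifier must be 'created' or 'pushed'")
    query = f"language:{language} {qualifier}:{start.isoformat()}..{end.isoformat()}"
    # per_page=1 because only the total_count field of the response is of interest.
    return f"{SEARCH_URL}?{urlencode({'q': query, 'per_page': 1})}"


if __name__ == "__main__":
    # Assumed window for "the 12 months ending April 2020".
    start, end = date(2019, 5, 1), date(2020, 4, 30)
    print(repo_count_url("python", "created", start, end))
    print(repo_count_url("python", "pushed", start, end))
```

Sending either URL and reading `total_count` from the JSON response would give the figure for one language. Language names containing spaces or symbols may need quoting inside the query.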
Stack Overflow
Stack Overflow is a popular site where programmers can ask questions about coding. We measured the number of questions posted that mention each language for the 12 months ending April 2020. Each question is tagged with the languages under discussion, and these tags are used to tabulate our measurements using the Stack Exchange API.
Reddit
Reddit is a news and information site where users post links and comments. On Reddit we measured the number of posts mentioning each of the languages, using the template "X programming" from June 2019 to June 2020 across any subreddit on the site. We collected data using the Reddit API.
Hacker News
Hacker News is a news and information site where users post comments and links to news about technology. We measured the number of posts that mentioned each of the languages using the template "X programming" for the 12 months ending April 2020. Like the Twitter, Stack Overflow, and Reddit metrics, this one captures social activity and information sharing around the various languages. We used the Algolia Search API.
CareerBuilder
We measured the demand for different programming languages on the CareerBuilder job site by counting the fresh job openings (those less than 30 days old) on the U.S. site that mention each language. Because some of the languages we track could be ambiguous in plain text (such as D, Go, J, Processing, and R), we used strict matching of the form "X programming" for these languages. For other languages we used a search string composed of "X AND programming," which allows us to capture a broader range of relevant postings. We collected data in July 2020 using the CareerBuilder API, courtesy of CareerBuilder, which gave us access now that the API no longer publicly provides this information.
IEEE Job Site
We measured the demand for different programming languages in job postings on the IEEE Job Site. Because some of the languages we track could be ambiguous in plain text (such as D, Go, J, Processing, and R), we used strict matching of the form "X programming" for these languages. For other languages we used a search string composed of "X AND programming," which allows us to capture a broader range of relevant postings. Because no externally exposed API exists for the IEEE Job Site, we extracted data using an internal custom-query tool in May 2020.
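The matching rule used for both job sites can be sketched as a small query builder. The set below contains only the five ambiguous names given as examples in the text, and the full list used in practice may be longer. The function name and constant are hypothetical.

```python
# Only the examples named in the text; the complete list is not published here.
AMBIGUOUS_NAMES = frozenset({"D", "Go", "J", "Processing", "R"})


def job_search_query(language: str) -> str:
    """Return a strict phrase query for ambiguous names, a broad AND query otherwise."""
    if language in AMBIGUOUS_NAMES:
        # Strict: the exact phrase must appear, e.g. "Go programming".
        return f'"{language} programming"'
    # Broad: both terms must appear somewhere in the posting.
    return f"{language} AND programming"


if __name__ == "__main__":
    for name in ("Go", "Python", "R", "Haskell"):
        print(name, "->", job_search_query(name))
```

The strict form trades recall for precision: a posting that says "experience in Go" without the word "programming" next to it is missed, but postings that merely contain the verb "go" are not counted.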
IEEE Xplore Digital Library
IEEE maintains a digital library with over 3.6 million conference and journal articles covering a range of scientific and engineering disciplines. We measured the number of articles that mention each of the languages in the template "X programming" for the years 2019 and 2020. This metric captures the prevalence of the different programming languages as used and referenced in scholarship. We collected data using the IEEE Xplore API.
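This page does not describe how the 11 metrics are combined into a single ranking. Purely as an illustration, the sketch below scales each metric by its largest value and takes a weighted average. The normalization, the weights, the 0 to 100 scale, and all names are assumptions and should not be read as the app's actual formula.

```python
from __future__ import annotations


def combined_scores(
    metrics: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> dict[str, float]:
    """Weighted average of max-normalized metrics, scaled to 0..100.

    metrics maps metric name -> {language: raw value};
    weights maps metric name -> non-negative weight.
    """
    weight_sum = sum(weights[name] for name in metrics)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    totals: dict[str, float] = {}
    for name, values in metrics.items():
        peak = max(values.values()) or 1  # avoid dividing by zero
        for language, raw in values.items():
            totals[language] = totals.get(language, 0.0) + weights[name] * raw / peak
    return {language: 100 * total / weight_sum for language, total in totals.items()}


if __name__ == "__main__":
    # Made-up numbers for two metrics and two languages.
    sample = {
        "search_hits": {"X": 10, "Y": 5},
        "job_posts": {"X": 2, "Y": 4},
    }
    scores = combined_scores(sample, {"search_hits": 3, "job_posts": 1})
    for language in sorted(scores, key=scores.get, reverse=True):
        print(language, scores[language])
```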