GitHub Report Charts Rise of Open Source AI

Artificial intelligence projects were front and center in GitHub's new Octoverse report examining the past year's activity on the open source development platform and code repository.

The State of the Octoverse 2018 report, published last week (Oct. 16), reveals the scale of that activity: 1.1 billion contributions from more than 31 million developers and 2.1 million organizations. The number of GitHub repositories as of Sept. 30, 2018, climbed more than 40 percent from last year to reach more than 96 million.

Among those repositories and projects, AI was featured prominently, with machine learning a major focus.

For example, the second-fastest-growing project was PyTorch, a Python package that includes two main features:

  • Tensor computation (like NumPy) with strong GPU acceleration
  • Deep neural networks built on a tape-based autograd system
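To give a rough sense of what "tape-based autograd" means, here is a minimal pure-Python sketch of the general idea: each operation is recorded on a list (the "tape"), and gradients are computed by replaying that tape in reverse. The names `Var`, `tape` and `backward` are hypothetical and chosen for illustration only; this is not PyTorch's implementation or API, and it handles only scalar addition and multiplication.

```python
class Var:
    """A scalar value that records each operation on a shared tape (illustrative only)."""

    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)
        # Local derivatives of a sum: 1 with respect to each input.
        self.tape.append((out, [(self, 1.0), (other, 1.0)]))
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)
        # Local derivatives of a product: the other operand's value.
        self.tape.append((out, [(self, other.value), (other, self.value)]))
        return out


def backward(output):
    """Replay the tape in reverse, accumulating gradients via the chain rule."""
    output.grad = 1.0
    for out, parents in reversed(output.tape):
        for parent, local_grad in parents:
            parent.grad += local_grad * out.grad


tape = []
x = Var(3.0, tape)
y = Var(4.0, tape)
z = x * y + x  # z = x*y + x, so dz/dx = y + 1 and dz/dy = x
backward(z)
print(z.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

Recording operations as they run, rather than declaring a graph up front, is what lets this style of framework work with ordinary Python control flow.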

PyTorch, which registered 2.8x growth over the previous year, trailed only the Microsoft Azure documentation project, which registered 4.7x.

"Overall, we're seeing trends in growth of projects related to machine learning, gaming, 3D printing, home automation, scientific programming, data analysis, and full stack JavaScript development," GitHub said of the fastest-growing rankings.

Another AI-related project making the top 10 was TensorFlow Models, a collection of different kinds of models implemented in the TensorFlow open source machine learning framework created by Google. Those models include official models and those created by researchers, which aren't officially supported. The project also includes samples and models described in the TensorFlow tutorials.

Another of the fastest-growing projects, at No. 7, is Spyder, an IDE for scientific work in Python, a primary programming language for AI and ML development.

"Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts," the project states.

Python itself fared well in the new Octoverse report, taking the No. 3 spot in the top programming languages ranking, a position it has held for four years, behind only JavaScript and Java.

Figure: Top Programming Languages Used over Time (source: GitHub).

In the fastest-growing language ranking, Python claimed the No. 8 spot, experiencing 1.5x growth from last year. Python also ranked among the leaders in top emoji reactions.

Machine learning was also the eighth-most-tagged topic on GitHub, a ranking that GitHub uses to indicate the number of contributions made to specific areas. In the machine learning space, the top five projects were: TensorFlow; Keras (deep learning for humans); scikit-learn (a Python module for machine learning built on top of SciPy); Caffe (a fast framework for deep learning); and TensorFlow Examples.

Speaking of topics, PyTorch and machine learning were the No. 2 and No. 3 fastest-growing topics, behind Hacktoberfest. "Machine learning and React are trending topics among the GitHub community: PyTorch, a machine learning library, and React-based Web development tools like Gatsby are both among the fastest growing topics this year," GitHub said. "Topics across different areas of blockchain development are also trending. And of course, Hacktoberfest is topping the list."

Finally, another indicator of the growth of open source AI was GitHub's own internal ranking of "cool open source projects." Topping that list was Google's Dopamine, described as "A research framework for quickly prototyping reinforcement learning algorithms."

"You open sourced a lot of exciting work this year, from machine learning frameworks to games," GitHub said of its open source coolness shout-out. "These projects aren't the fastest growing or highest grossing but we thought they were star-worthy -- and so did the community!"

About the Author

David Ramel is an editor and writer for Converge360.