Study: Deep Learning Shifting from TensorFlow to PyTorch (Well, Kind Of)
According to The Gradient's 2019 study of machine learning framework trends in deep learning projects, released Thursday, the two major frameworks continue to be TensorFlow and PyTorch, and TensorFlow is losing ground -- at least with academics.
According to Horace He, author of the study and the article presenting the findings, PyTorch is taking over in academia, with a majority of papers presented at every "major conference" in the last year using PyTorch over TensorFlow. Depending on the conference, it was used in 50 to 75 percent of the papers, whereas in 2018, PyTorch was often in the minority.
"While some believe that PyTorch is still an upstart framework trying to carve out a niche in a TensorFlow-dominated world, the data tells a different story," He writes. "At no conference except ICML has the growth of TensorFlow even kept up with the overall paper growth. At NAACL, ICLR, and ACL, TensorFlow actually has less papers this year than last year." (See the study for charts and other detailed data.)
He speculates that researchers are switching to PyTorch in part because of its simplicity and available APIs. He notes that this bump in popularity could prove temporary, though he considers that unlikely, and argues that even then any reversal would be slow:
Even if TensorFlow has reached parity with PyTorch functionality-wise, PyTorch has already reached a majority of the community. That means that PyTorch implementations will be easier to find, that authors will be more incentivized to publish code in PyTorch (so people will use it), and that your collaborators will be most likely prefer PyTorch. Thus, any migration back to TensorFlow 2.0 is likely to be slow, if it occurs at all.
TensorFlow will always have a captive audience within Google/DeepMind, but I wonder whether Google will eventually relax this. Even now, many of the researchers that Google wants to recruit will already prefer PyTorch at varying levels, and I've heard grumblings that many researchers inside Google would like to use a framework other than TensorFlow.
In addition, PyTorch's dominance might start to cut off Google researchers from the rest of the research community. Not only will they have a harder time building on top of outside research, outside researchers will also be less likely to build on top of code published by Google.
It remains to be seen whether TensorFlow 2.0 will allow TensorFlow to recover some of its research audience. Although eager mode will certainly be appealing, the same can't be said about the Keras API.
Of course, academia is one thing, but what about machine learning frameworks in production environments? Here, He compared data from job listings, Medium articles and GitHub stars over the last year to see which framework was more popular.
His findings? TensorFlow still rules in the enterprise and among working deep learning professionals. "TensorFlow had 1541 new job listings vs. 1437 job listings for PyTorch on public job boards, 3230 new TensorFlow Medium articles vs. 1200 PyTorch, 13.7k new GitHub stars for TensorFlow vs. 7.2k for PyTorch," He wrote.
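To put those figures side by side, the short Python sketch below computes how far TensorFlow leads on each measure. It is illustrative only: the numbers are the ones quoted above, while the `metrics` dictionary and the `tf_to_pytorch_ratio` helper are names made up for this example and do not come from the study.

```python
# Figures as quoted from the study, in the order (TensorFlow, PyTorch).
# The dictionary layout and helper name are assumptions for illustration.
metrics = {
    "new job listings": (1541, 1437),
    "new Medium articles": (3230, 1200),
    "new GitHub stars (thousands)": (13.7, 7.2),
}


def tf_to_pytorch_ratio(tensorflow, pytorch):
    """Return how many times larger the TensorFlow figure is."""
    return tensorflow / pytorch


# Ratio of TensorFlow to PyTorch for each measure.
ratios = {
    name: tf_to_pytorch_ratio(tf_count, pt_count)
    for name, (tf_count, pt_count) in metrics.items()
}

for name, ratio in ratios.items():
    print(f"{name}: TensorFlow leads by {ratio:.2f}x")
```

On these numbers the gap is widest for Medium articles (roughly 2.7 times as many), while job listings are close to even (roughly 1.07 times as many).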
According to He, TensorFlow is often preferred in production environments in part due to pure speed advantages: "[I]ndustry considers performance to be of the utmost priority. While 10 percent faster runtime means nothing to a researcher, that could directly translate to millions of savings for a company."
"Another difference is deployment," He explained. "Researchers will run experiments on their own machines or on a server cluster somewhere that's dedicated for running research jobs. On the other hand, industry has a litany of restrictions/requirements."
"Historically, PyTorch has fallen short in catering to these considerations, and as a result most companies are currently using TensorFlow in production."
He goes on to examine recent changes in each framework and how they might change their popularity in the coming year. Read his full research and analysis here -- the report is available for free and without registration.
About the Author
Becky Nagel is the vice president of Web & Digital Strategy for 1105's Converge360 Group, where she oversees the front-end Web team and deals with all aspects of digital projects at the company, including launching and running the group's popular virtual summit and Coffee talk series. She is an experienced tech journalist (20 years), and before her current position, was the editorial director of the group's sites. A few years ago she gave a talk at a leading technical publishers conference about how changes in Web browser technology would impact online advertising for publishers. Follow her on Twitter @beckynagel.