Distributing TensorFlow

While TensorFlow is at its core a distributed computation framework, there is, apart from the official HowTo, little detailed documentation on how TensorFlow deals with distributed learning. This post is an attempt to learn about TensorFlow’s distribution capabilities by example. Therefore the existing MNIST tutorial is taken and adapted into a distributed execution graph that can be run on one or multiple nodes.

The framework offers two basic ways to train a model in a distributed fashion. In the simplest form, the same computation graph is executed on multiple nodes in parallel, each node working on its own batches of the replicated data. This is known as Between-Graph Replication. Each worker updates the parameters of the same model, which means that all worker nodes share one set of model parameters. In synchronous training, updates to the shared model are averaged before being applied; in asynchronous training, the workers update the shared parameters independently of each other. While asynchronous training is known to be faster, synchronous training tends to provide better accuracy.
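As a hedged illustration of what such a between-graph setup can look like with the classic tf.train API (TensorFlow 1.x era), here is a minimal sketch; the host names, task index and the toy softmax model are placeholders, not the adapted MNIST graph from the post itself.

```python
# Minimal between-graph replication sketch (TensorFlow 1.x tf.train API).
# Host names, ports and the toy softmax model are illustrative only.
import tensorflow as tf

# Every process is started with the same cluster description ...
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# ... but with its own job name and task index (e.g. via command-line flags).
job_name, task_index = "worker", 0
server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

if job_name == "ps":
    server.join()  # parameter servers only host the shared variables
else:
    # replica_device_setter places variables on the ps job and ops on this
    # worker, so every worker builds an identical graph over shared parameters.
    with tf.device(tf.train.replica_device_setter(
            worker_device="/job:worker/task:%d" % task_index,
            cluster=cluster)):
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])
        W = tf.Variable(tf.zeros([784, 10]))
        b = tf.Variable(tf.zeros([10]))
        y = tf.nn.softmax(tf.matmul(x, W) + b)
        loss = -tf.reduce_sum(y_ * tf.log(y))
        global_step = tf.Variable(0, trainable=False, name="global_step")
        # Asynchronous updates: each worker applies its gradients on its own.
        # For synchronous training the optimizer could be wrapped in
        # tf.train.SyncReplicasOptimizer to aggregate the updates first.
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
            loss, global_step=global_step)

    with tf.train.MonitoredTrainingSession(
            master=server.target, is_chief=(task_index == 0)) as sess:
        # feed MNIST batches here, e.g. sess.run(train_op, feed_dict=...)
        pass
```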
Continue reading “Distributing TensorFlow”


TensorFlow: Further Reading

A collection of papers and work around distributed deep learning to deepen one’s understanding of the topic:

Large Scale Distributed Deep Networks (link) (December 2012)
Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng

This paper, published by Jeffrey Dean and Andrew Ng among other contributors, probably marks the cornerstone of TensorFlow as it is today.
[PDF]

Efficient Estimation of Word Representations in Vector Space (link) (January 2013)
Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean

The fundamental work behind projects like Word2Vec is presented in this paper, which describes how a neural network is trained to produce vector representations of words that capture their similarity (a small sketch follows below).
[PDF]
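As a rough, hedged illustration of that idea (not code from the paper), word vectors can be trained and queried for similarity with the gensim library, assuming a toy corpus:

```python
# Toy sketch of learning word vectors and querying them for similarity,
# assuming the gensim library is installed; the corpus is far too small
# to produce meaningful embeddings and serves only to show the idea.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sits", "on", "the", "mat"],
    ["the", "dog", "sits", "on", "the", "rug"],
    ["dogs", "and", "cats", "are", "animals"],
]

# vector_size is the dimensionality of the learned vector space.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

# Words appearing in similar contexts end up close together in that space.
print(model.wv.most_similar("cat", topn=3))
```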

Sequence to Sequence Learning with Neural Networks (link) (September 2014)
Ilya Sutskever, Oriol Vinyals, Quoc V. Le

The work around sequence-to-sequence learning is actually quite old. What seems like a fairly abstract problem to solve has recently proved to significantly improve, for example, speech-to-text recognition among other disciplines.
[PDF]

Show and Tell: A Neural Image Caption Generator (link) (November 2014)
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

Another area where the above-described concept of sequence-to-sequence learning is applied is the exploration of images. In this case the input is the bitmap of an image, which is transformed into a text sequence describing the image. This marks a fundamental breakthrough in AI.

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (link) (November 2015)
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

The TensorFlow Whitepaper [PDF]

Webinar: TensorFlow: A Framework for Scalable Machine Learning (link) (October 19, 2016)
Martin Wicke, Software Engineer at Google
Rajat Monga, Engineering Director at Google

Martin and Rajat, both working on TensorFlow at Google, walk through the architecture and design of TensorFlow in this webinar.


Install SVM-light for R

SVMlight is an implementation of Support Vector Machines that provides efficient estimation methods for both error rate and precision/recall. SVMlight exploits the fact that the results of most leave-one-outs (often more than 99%) are predetermined and need not be computed. Furthermore, it can also train SVMs with cost models. Many tasks have the property of sparse instance vectors; this implementation makes use of that property, which leads to a very compact and efficient representation. Continue reading “Install SVM-light for R”
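As a hedged side note (not part of the R installation itself), the sparse “label index:value” layout that SVMlight consumes can be written from Python with scikit-learn; the toy data and file name below are made up for illustration.

```python
# Small sketch of the sparse SVMlight data format: only non-zero features
# are written, which is what makes the representation compact.
import numpy as np
from sklearn.datasets import dump_svmlight_file

# Two instances with mostly zero features.
X = np.array([[0.0, 1.5, 0.0, 0.0, 3.0],
              [2.0, 0.0, 0.0, 0.5, 0.0]])
y = np.array([1, -1])

dump_svmlight_file(X, y, "train.dat", zero_based=False)
# train.dat now contains lines like:
#   1 2:1.5 5:3.0
#   -1 1:2.0 4:0.5
```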

JPMML Example Random Forest

The Predictive Model Markup Language (PMML), developed by the Data Mining Group, is a standardized XML-based representation of mining models that can be used and shared across languages and tools. The standardized definition allows, for example, a classification model trained with R to be used with Storm. Many projects related to Big Data have some support for PMML, which is often implemented via JPMML. Continue reading “JPMML Example Random Forest”
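The post’s example trains the random forest in R; purely as a hedged sketch of the same export idea from Python (an assumption, not the post’s workflow), the sklearn2pmml package can write a random forest to a PMML document that JPMML-based scorers can evaluate:

```python
# Rough sketch of exporting a random forest to PMML from Python, assuming
# the sklearn2pmml package and a Java runtime are installed; data and
# file name are illustrative only (the post itself uses R).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

X, y = load_iris(return_X_y=True)

pipeline = PMMLPipeline([
    ("classifier", RandomForestClassifier(n_estimators=10)),
])
pipeline.fit(X, y)

# Writes a standalone PMML file that other tools can load for scoring.
sklearn2pmml(pipeline, "random_forest.pmml")
```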

Training Multiple SVM Classifiers with Apache Pig

Inspired by Twitter’s publication about “Large Scale Machine Learning” I turned to Pig when it came to implementing an SVM classifier for Record Linkage. Searching for different solutions I also came across a presentation by the Huffington Post using a similar approach to training multiple SVM models. The overall idea is to use Hadoop to train multiple models with different parameters at the same time and to select the best model for the actual classification. There are some limitations to this approach, which I’ll try to address at the end of this post, but first let me describe my approach to training multiple SVM classifiers with Pig.

Disclaimer: This post does not describe the process of training one model in parallel but rather training multiple models at the same time on multiple machines.
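As a minimal single-machine sketch of that “train many models, keep the best one” idea (a hedged illustration with scikit-learn, not the Pig implementation), the loop below is exactly what the Hadoop job spreads across nodes:

```python
# Train several SVMs with different parameters and keep the best one;
# in the distributed setting each candidate would run as its own task.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

candidates = [
    {"C": c, "kernel": k}
    for c in (0.1, 1.0, 10.0)
    for k in ("linear", "rbf")
]

# One model per parameter set, scored by cross-validation.
scored = [
    (cross_val_score(make_pipeline(StandardScaler(), SVC(**params)),
                     X, y, cv=5).mean(), params)
    for params in candidates
]

best_score, best_params = max(scored, key=lambda pair: pair[0])
print("best parameters:", best_params, "cv accuracy: %.3f" % best_score)
```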
Continue reading “Training Multiple SVM Classifiers with Apache Pig”