# deepdist

**Repository Path**: daqingba/deepdist

## Basic Information

- **Project Name**: deepdist
- **Description**: Lightning-Fast Deep Learning on Spark
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2021-06-06
- **Last Updated**: 2021-09-03

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

Training deep belief networks requires extensive data and computation. [DeepDist](http://deepdist.com) accelerates training by distributing stochastic gradient descent over data stored on HDFS / Spark, via a simple Python interface.

Overview: [deepdist.com](http://deepdist.com)

Quick start:
----

Training a [word2vec](https://code.google.com/p/word2vec/) model on [wikipedia](http://dumps.wikimedia.org/enwiki/) in 15 lines of code:

```python
from deepdist import DeepDist
from gensim.models.word2vec import Word2Vec
from pyspark import SparkContext

sc = SparkContext()
corpus = sc.textFile('enwiki').map(lambda s: s.split())

def gradient(model, sentences):  # executes on workers
    syn0, syn1 = model.syn0.copy(), model.syn1.copy()  # snapshot weights
    model.train(sentences)
    return {'syn0': model.syn0 - syn0, 'syn1': model.syn1 - syn1}

def descent(model, update):      # executes on master
    model.syn0 += update['syn0']
    model.syn1 += update['syn1']

with DeepDist(Word2Vec(corpus.collect())) as dd:
    dd.train(corpus, gradient, descent)
    print(dd.model.most_similar(positive=['woman', 'king'], negative=['man']))
```

How does it work?
----

DeepDist implements a [Downpour](http://research.google.com/archive/large_deep_networks_nips2012.html)-like stochastic gradient descent. It starts a master model server (on port 5000). On each data node, DeepDist fetches the model from the server and calls __gradient()__. After the gradient has been computed for an RDD partition, the update is sent back to the server, where the master model is updated by __descent()__.

![DeepDist design](http://deepdist.com/images/deepdistdesign.png)

Python module
----

[DeepDist](http://deepdist.com) provides a simple Python interface. The `with` statement starts the model server. Distributed gradient updates are computed on the partitions of a resilient distributed dataset (RDD), and the updates are incorporated into the master model via a custom descent method.

```python
from deepdist import DeepDist

with DeepDist(model) as dd:  # initializes the server with any model
    dd.train(data, gradient, descent)
    # train with the RDD "data" by computing distributed gradients and
    # descending the model parameter space according to the gradient updates

def gradient(model, data):
    # model is a copy of the master model
    # data is an iterator over the current partition of the data RDD
    # returns the gradient update
    pass

def descent(model, update):
    # model is a reference to the server's model
    # update is a copy of a worker's update
    pass
```
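To make this contract concrete, here is a minimal sketch that trains a least-squares model with distributed SGD. The `LinearModel` class, the toy feature/label pairs, and the 0.01 learning rate are assumptions made for this example; only `DeepDist`, `dd.train()`, and the `gradient()`/`descent()` signatures come from the interface above.

```python
import numpy as np
from deepdist import DeepDist
from pyspark import SparkContext

class LinearModel(object):
    """Hypothetical model holding a numpy weight vector."""
    def __init__(self, n_features):
        self.w = np.zeros(n_features)

def gradient(model, data):            # executes on workers
    w_before = model.w.copy()
    for x, y in data:                 # one SGD pass over this partition
        err = np.dot(model.w, x) - y
        model.w -= 0.01 * err * x     # fixed learning rate (assumption)
    return {'w': model.w - w_before}  # ship only the parameter delta

def descent(model, update):           # executes on the master
    model.w += update['w']

sc = SparkContext()
data = sc.parallelize([(np.array([1.0, 2.0]), 3.0),
                       (np.array([2.0, 1.0]), 3.0)])

with DeepDist(LinearModel(2)) as dd:
    dd.train(data, gradient, descent)
    print(dd.model.w)
```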
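Under the hood, `dd.train()` drives the exchange described in "How does it work?" above: each worker fetches the current model from the master's server, runs `gradient()` on its partition, and posts the update back, where `descent()` folds it in. The sketch below illustrates that exchange in outline only; Flask, pickle-over-HTTP, and the `/model` and `/update` endpoint names are assumptions chosen for clarity, not DeepDist's actual implementation (only the port, 5000, is documented above).

```python
# Illustrative master-side sketch of a Downpour-style parameter server.
# NOT DeepDist's actual code: endpoints and serialization are assumptions.
import pickle

import numpy as np
from flask import Flask, request

app = Flask(__name__)
state = {'w': np.zeros(10)}  # master model parameters (toy example)

@app.route('/model')
def serve_model():
    # workers fetch a copy of the current master model
    return pickle.dumps(state)

@app.route('/update', methods=['POST'])
def apply_update():
    # the descent() step: fold a worker's delta into the master model
    update = pickle.loads(request.data)
    state['w'] += update['w']
    return 'ok'

if __name__ == '__main__':
    app.run(port=5000)  # DeepDist's model server also listens on port 5000

# A worker would then do roughly (with the `requests` library):
#   m = pickle.loads(requests.get('http://master:5000/model').content)
#   delta = ...  # run gradient() on the local partition
#   requests.post('http://master:5000/update', data=pickle.dumps(delta))
```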
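Because `descent()` is user-supplied, an adaptive learning rate such as AdaGrad (see the Training Speed section below) can be plugged in at that point. Here is a minimal sketch for the word2vec updates from the quick start; the accumulator dict `hist`, base rate `eta`, and stability term `eps` are assumptions for the example, not part of DeepDist.

```python
import numpy as np

hist = {}             # AdaGrad state: per-parameter sum of squared gradients
eta, eps = 1.0, 1e-8  # base learning rate and stability term (assumptions)

def descent(model, update):  # executes on the master
    for name in ('syn0', 'syn1'):
        grad = update[name]
        hist[name] = hist.get(name, 0.0) + grad ** 2  # accumulate history
        param = getattr(model, name)                  # e.g. model.syn0
        # scale each coordinate by the inverse root of its gradient history
        param += eta * grad / (np.sqrt(hist[name]) + eps)
```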
Training Speed
----

Training speed can be greatly enhanced by adaptively adjusting the learning rate with [AdaGrad](http://www.cs.berkeley.edu/~jduchi/projects/DuchiHaSi10.pdf). A complete word2vec model with 900 dimensions can be trained this way on the 19 GB Wikipedia corpus (restricted to the words that appear in the validation questions).

![Training](http://deepdist.com/images/training.png)

References
----

J Dean, GS Corrado, R Monga, K Chen, M Devin, QV Le, MZ Mao, M'A Ranzato, A Senior, P Tucker, K Yang, and AY Ng. [Large Scale Distributed Deep Networks](http://research.google.com/archive/large_deep_networks_nips2012.html). NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada, 2012.

T Mikolov, I Sutskever, K Chen, G Corrado, and J Dean. [Distributed Representations of Words and Phrases and their Compositionality](http://arxiv.org/pdf/1310.4546.pdf). In Proceedings of NIPS, 2013.

T Mikolov, K Chen, G Corrado, and J Dean. [Efficient Estimation of Word Representations in Vector Space](http://arxiv.org/pdf/1301.3781.pdf). In Proceedings of Workshop at ICLR, 2013.