Create a custom federated data set in TensorFlow Federated

I'd like to adapt the recurrent autoencoder from this blog post to work in a federated environment.

I've modified the model slightly to conform with the example shown in the TFF image classification tutorial.

import tensorflow as tf
import tensorflow_federated as tff

def create_compiled_keras_model():
  model = tf.keras.models.Sequential([
      tf.keras.layers.LSTM(2, input_shape=(10, 2), name='Encoder'),
      tf.keras.layers.RepeatVector(10, name='Latent'),
      tf.keras.layers.LSTM(2, return_sequences=True, name='Decoder')]
  )

  model.compile(loss='mse', optimizer='adam')
  return model

model = create_compiled_keras_model()

sample_batch = gen(1)
timesteps, input_dim = 10, 2

def model_fn():
  keras_model = create_compiled_keras_model()
  return tff.learning.from_compiled_keras_model(keras_model, sample_batch)

The gen function is defined as follows:

import random
import numpy as np

def gen(batch_size):
    seq_length = 10

    batch_x = []
    batch_y = []

    for _ in range(batch_size):
        rand = random.random() * 2 * np.pi

        sig1 = np.sin(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))
        sig2 = np.cos(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))

        x1 = sig1[:seq_length]
        y1 = sig1[seq_length:]
        x2 = sig2[:seq_length]
        y2 = sig2[seq_length:]

        x_ = np.array([x1, x2])
        y_ = np.array([y1, y2])
        x_, y_ = x_.T, y_.T

        batch_x.append(x_)
        batch_y.append(y_)

    batch_x = np.array(batch_x)
    batch_y = np.array(batch_y)

    return batch_x, batch_x  # autoencoder target: reconstruct the input (batch_y unused)

So far I've been unable to find any documentation which does not use sample data from the TFF repository.

How can I modify this to create a federated data set and begin training?

asked Mar 30 '19 by Adam Hodgson

People also ask

What is TensorFlow federated learning?

TensorFlow Federated (TFF) is an open-source framework for machine learning and other computations on decentralized data.

What is federated learning example?

Federated learning is a decentralized machine learning technique, also called collaborative learning. For example, mobile phones collaboratively train a shared prediction model while keeping each device's training data local, instead of uploading it to a central server.

What is FedSGD?

FedSGD is the federated analogue of stochastic gradient descent (SGD). It uses a large-batch, synchronous approach to multi-client learning, which performs better than naive asynchronous SGD training [3].


1 Answer

At a very high level, using an arbitrary dataset with TFF requires the following steps:

  1. Partition the dataset into per-client subsets (how best to do so is a much larger question).
  2. Create a tf.data.Dataset for each client subset.
  3. Pass a list of all (or a subset) of the Dataset objects to the federated optimization, as sketched below.
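
A minimal sketch of those steps, reusing the gen function from the question to simulate each client's local data. NUM_CLIENTS, the batch counts, and the make_client_dataset helper are illustrative, not part of the question's code:

import numpy as np
import tensorflow as tf

NUM_CLIENTS = 3

def make_client_dataset(num_batches=5, batch_size=8):
  # Simulate one client's partition by drawing several batches from gen
  # and flattening them into a single tf.data.Dataset whose batches
  # match the (x, y) tuple structure of sample_batch.
  xs, ys = zip(*(gen(batch_size) for _ in range(num_batches)))
  return tf.data.Dataset.from_tensor_slices(
      (np.concatenate(xs), np.concatenate(ys))).batch(batch_size)

federated_train_data = [make_client_dataset() for _ in range(NUM_CLIENTS)]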

What is happening in the tutorial

The Federated Learning for Image Classification tutorial uses tff.learning.build_federated_averaging_process to build a federated optimization based on the FedAvg algorithm.
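
With the TFF 0.x API that the question's from_compiled_keras_model call implies, the process is constructed from the model_fn defined in the question and then initialized; a minimal sketch:

iterative_process = tff.learning.build_federated_averaging_process(model_fn)
state = iterative_process.initialize()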

In that notebook, the following code executes one round of federated optimization, passing the client datasets to the process's .next method:

   state, metrics = iterative_process.next(state, federated_train_data)

Here federated_train_data is a Python list of tf.data.Dataset, one per client participating in the round.
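
Not every client has to participate in every round. A minimal sketch of sampling a different subset of the client datasets each round, where NUM_CLIENTS_PER_ROUND and the round count are illustrative values:

import random

NUM_CLIENTS_PER_ROUND = 2

for round_num in range(1, 6):
  # Draw a fresh random subset of client datasets for this round.
  round_data = random.sample(federated_train_data, NUM_CLIENTS_PER_ROUND)
  state, metrics = iterative_process.next(state, round_data)
  print('round {:2d}, metrics={}'.format(round_num, metrics))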

The ClientData object

The canned datasets provided by TFF (under tff.simulation.datasets) are implemented using the tff.simulation.ClientData interface, which manages the client → dataset mapping and tf.data.Dataset creation.

If you're planning to reuse a dataset, implementing it as a tff.simulation.ClientData may make future use easier; a sketch follows.
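
A minimal sketch of wrapping the generator in this interface, assuming the tff.simulation.ClientData.from_clients_and_fn constructor; the client IDs and the per-client dataset construction are illustrative:

def create_tf_dataset_for_client(client_id):
  # Build this client's dataset. In a real partition the data would be
  # selected based on client_id rather than freshly generated.
  x, y = gen(8)
  return tf.data.Dataset.from_tensor_slices((x, y)).batch(8)

client_data = tff.simulation.ClientData.from_clients_and_fn(
    client_ids=['client_0', 'client_1', 'client_2'],
    create_tf_dataset_for_client_fn=create_tf_dataset_for_client)

federated_train_data = [
    client_data.create_tf_dataset_for_client(client_id)
    for client_id in client_data.client_ids
]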

answered Nov 14 '22 by Zachary Garrett