We illustrate the use of such a sampler to implement an outlier rejection
estimator which can be easily used within a
:class:`imblearn.pipeline.Pipeline`:
:ref:`sphx_glr_auto_examples_plot_outlier_rejections.py`

.. _generators:

Custom generators
-----------------

Imbalanced-learn provides specific generators for TensorFlow and Keras which
will generate balanced mini-batches.

.. _tensorflow_generator:

TensorFlow generator
~~~~~~~~~~~~~~~~~~~~

The :func:`imblearn.tensorflow.balanced_batch_generator` allows generating
balanced mini-batches using an imbalanced-learn sampler which returns indices::

    >>> import numpy as np
    >>> X = X.astype(np.float32)
    >>> from imblearn.under_sampling import RandomUnderSampler
    >>> from imblearn.tensorflow import balanced_batch_generator
    >>> training_generator, steps_per_epoch = balanced_batch_generator(
    ...     X, y, sample_weight=None, sampler=RandomUnderSampler(),
    ...     batch_size=10, random_state=42)

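As a quick sanity check (a sketch, not part of the documented example), a
mini-batch drawn from the generator should contain roughly the same number of
samples from each class::

    >>> # peek at one balanced mini-batch; exact counts depend on your data
    >>> X_batch, y_batch = next(training_generator)  # doctest: +SKIP
    >>> np.unique(y_batch, return_counts=True)  # doctest: +SKIP
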
The returned ``training_generator`` and ``steps_per_epoch`` are used during
the training of the TensorFlow model. We will illustrate how to use this
generator. First, we can define a logistic regression model which will be
optimized by gradient descent::

    >>> learning_rate, epochs = 0.01, 10
    >>> input_size, output_size = X.shape[1], 3
    >>> import tensorflow as tf
    >>> def init_weights(shape):
    ...     return tf.Variable(tf.random_normal(shape, stddev=0.01))
    >>> def accuracy(y_true, y_pred):
    ...     return np.mean(np.argmax(y_pred, axis=1) == y_true)
    >>> # input and output
    >>> data = tf.placeholder("float32", shape=[None, input_size])
    >>> targets = tf.placeholder("int32", shape=[None])
    >>> # build the model and weights
    >>> W = init_weights([input_size, output_size])
    >>> b = init_weights([output_size])
    >>> logits = tf.matmul(data, W) + b
    >>> # build the loss, predict, and train operators; the raw logits are
    >>> # passed since the cross-entropy applies the softmax internally
    >>> cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    ...     logits=logits, labels=targets)
    >>> loss = tf.reduce_sum(cross_entropy)
    >>> optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    >>> train_op = optimizer.minimize(loss)
    >>> predict = tf.nn.softmax(logits)
    >>> # initialization of all the variables in the graph
    >>> init = tf.global_variables_initializer()

Once initialized, the model is trained by iterating over balanced mini-batches
of data and minimizing the loss previously defined::

    >>> with tf.Session() as sess:
    ...     print('Starting training')
    ...     sess.run(init)
    ...     for e in range(epochs):
    ...         for i in range(steps_per_epoch):
    ...             X_batch, y_batch = next(training_generator)
    ...             sess.run([train_op, loss],
    ...                      feed_dict={data: X_batch, targets: y_batch})
    ...         # at the end of each epoch, compute the accuracy on the
    ...         # training set
    ...         predicts_train = sess.run(predict, feed_dict={data: X})
    ...         print("epoch: {} train accuracy: {:.3f}"
    ...               .format(e, accuracy(y, predicts_train)))
    ... # doctest: +ELLIPSIS
    Starting training
    [...

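Note that this example relies on the TensorFlow 1.x graph API
(``tf.placeholder``, ``tf.Session``). As an aside (an assumption about your
environment, not part of the documented example), the same code can be run
under TensorFlow 2 through its compatibility module::

    >>> import tensorflow.compat.v1 as tf  # doctest: +SKIP
    >>> tf.disable_v2_behavior()  # doctest: +SKIP
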
.. _keras_generator:

Keras generator
~~~~~~~~~~~~~~~

Keras provides a higher-level API in which a model can be defined and trained
by calling the ``fit_generator`` method. To illustrate, we will define a
logistic regression model::

    >>> import keras
    >>> y = keras.utils.to_categorical(y, 3)
    >>> model = keras.Sequential()
    >>> model.add(keras.layers.Dense(y.shape[1], input_dim=X.shape[1],
    ...                              activation='softmax'))
    >>> model.compile(optimizer='sgd', loss='categorical_crossentropy',
    ...               metrics=['accuracy'])

:func:`imblearn.keras.balanced_batch_generator` creates a generator of
balanced mini-batches, together with the number of mini-batches generated per
epoch::

    >>> from imblearn.keras import balanced_batch_generator
    >>> training_generator, steps_per_epoch = balanced_batch_generator(
    ...     X, y, sampler=RandomUnderSampler(), batch_size=10, random_state=42)

Then, ``fit_generator`` can be called, passing the generator and the number of
steps per epoch::

    >>> callback_history = model.fit_generator(generator=training_generator,
    ...                                        steps_per_epoch=steps_per_epoch,
    ...                                        epochs=10, verbose=0)

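The returned Keras ``History`` object records the loss and the metrics at each
epoch; for instance (a sketch, values depend on your data), the training loss
can be inspected to check convergence::

    >>> callback_history.history['loss']  # doctest: +SKIP
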
The second possibility is to use
:class:`imblearn.keras.BalancedBatchGenerator`. In this case, only an instance
of this class needs to be passed to ``fit_generator``::

    >>> from imblearn.keras import BalancedBatchGenerator
    >>> training_generator = BalancedBatchGenerator(
    ...     X, y, sampler=RandomUnderSampler(), batch_size=10, random_state=42)
    >>> callback_history = model.fit_generator(generator=training_generator,
    ...                                        epochs=10, verbose=0)
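
:class:`imblearn.keras.BalancedBatchGenerator` is a Keras ``Sequence``, so
``fit_generator`` can infer the number of steps per epoch from its length,
which is why ``steps_per_epoch`` is not passed here. This can be checked
directly (a sketch)::

    >>> len(training_generator)  # mini-batches per epoch  # doctest: +SKIP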