
Commit 505f554

pkulzc authored and sguada committed
Internal changes to slim and object detection (tensorflow#4100)
* Adding option for one_box_for_all_classes to the box_predictor. PiperOrigin-RevId: 192813444
* Extend to accept different ratios of conv channels. PiperOrigin-RevId: 192837477
* Remove inaccurate caveat from proto file. PiperOrigin-RevId: 192850747
* Add option to set dropout for the classification net in the weight shared box predictor. PiperOrigin-RevId: 192922089
* Fix flakiness in testSSDRandomCropWithMultiClassScores due to randomness. PiperOrigin-RevId: 193067658
* Post-process now works again in train mode. PiperOrigin-RevId: 193087707
* Adding support for reading in logits as groundtruth labels and applying an optional temperature (scaling) before softmax in support of distillation. PiperOrigin-RevId: 193119411
* Add a util function to visualize a value histogram as a tf.summary.image. PiperOrigin-RevId: 193137342
* Do not add batch norm parameters to the final conv2d ops that predict box encodings and class scores in the weight shared conv box predictor. This allows us to set a proper bias and force initial predictions to be background when using focal loss. PiperOrigin-RevId: 193204364
* Make sure the final layers are also resized proportionally to conv_depth_ratio. PiperOrigin-RevId: 193228972
* Remove the deprecated batch_norm_trainable field from the ssd mobilenet v2 config. PiperOrigin-RevId: 193244778
* Updating coco evaluation metrics to allow for a batch of image info, rather than a single image. PiperOrigin-RevId: 193382651
* Update protobuf requirements to 3+ in installation docs. PiperOrigin-RevId: 193409179
* Add support for training keypoints. PiperOrigin-RevId: 193576336
* Fix data augmentation functions. PiperOrigin-RevId: 193737238
* Read the default batch size from the config file. PiperOrigin-RevId: 193959861
* Fixing a bug in the coco evaluator. PiperOrigin-RevId: 193974479
* The num_gt_boxes_per_image and num_det_boxes_per_image values were incorrect; they should not use the expanded dimension. PiperOrigin-RevId: 194122420
* Add option to evaluate any checkpoint (without requiring write access to that directory or overwriting any existing logs there). PiperOrigin-RevId: 194292198
* PiperOrigin-RevId: 190346687
* Expose the slim arg_scope function that computes keys, to enable testing. Add an is_training=None option to mobilenet arg_scopes, which allows users to set is_training from an outer scope. PiperOrigin-RevId: 190997959
* Add an option to not set the slim arg_scope for the batch_norm is_training parameter. This enables users to set the is_training parameter from an outer scope. PiperOrigin-RevId: 191611934
* PiperOrigin-RevId: 191955231
* PiperOrigin-RevId: 193254125
* PiperOrigin-RevId: 193371562
* PiperOrigin-RevId: 194085628
1 parent 5c78b9d · commit 505f554

40 files changed: +1429 −322 lines

research/object_detection/builders/box_predictor_builder.py

Lines changed: 9 additions & 5 deletions
@@ -80,12 +80,14 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
         num_classes=num_classes,
         conv_hyperparams_fn=conv_hyperparams_fn,
         depth=conv_box_predictor.depth,
-        num_layers_before_predictor=(conv_box_predictor.
-                                     num_layers_before_predictor),
+        num_layers_before_predictor=(
+            conv_box_predictor.num_layers_before_predictor),
         kernel_size=conv_box_predictor.kernel_size,
         box_code_size=conv_box_predictor.box_code_size,
-        class_prediction_bias_init=conv_box_predictor.class_prediction_bias_init
-    )
+        class_prediction_bias_init=conv_box_predictor.
+        class_prediction_bias_init,
+        use_dropout=conv_box_predictor.use_dropout,
+        dropout_keep_prob=conv_box_predictor.dropout_keep_probability)
     return box_predictor_object
 
   if box_predictor_oneof == 'mask_rcnn_box_predictor':
@@ -113,7 +115,9 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
            mask_rcnn_box_predictor.mask_prediction_conv_depth),
        masks_are_class_agnostic=(
            mask_rcnn_box_predictor.masks_are_class_agnostic),
-        predict_keypoints=mask_rcnn_box_predictor.predict_keypoints)
+        predict_keypoints=mask_rcnn_box_predictor.predict_keypoints,
+        share_box_across_classes=(
+            mask_rcnn_box_predictor.share_box_across_classes))
     return box_predictor_object
 
   if box_predictor_oneof == 'rfcn_box_predictor':
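
For reference, a hedged sketch of how the newly plumbed fields could appear in a BoxPredictor text proto. The field names mirror what the builder reads above, but the oneof name and the values are assumptions, not part of this diff.

# Hypothetical config fragment (assumption: the oneof is named
# 'weight_shared_convolutional_box_predictor'). Field names mirror the
# attributes read by the builder above; values are illustrative only.
from google.protobuf import text_format
from object_detection.protos import box_predictor_pb2

box_predictor_text_proto = """
  weight_shared_convolutional_box_predictor {
    depth: 256
    num_layers_before_predictor: 4
    kernel_size: 3
    box_code_size: 4
    class_prediction_bias_init: -4.6
    use_dropout: true
    dropout_keep_probability: 0.8
  }
"""
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(box_predictor_text_proto, box_predictor_proto)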

research/object_detection/builders/box_predictor_builder_test.py

Lines changed: 2 additions & 0 deletions
@@ -317,6 +317,7 @@ def test_non_default_mask_rcnn_box_predictor(self):
         use_dropout: true
         dropout_keep_probability: 0.8
         box_code_size: 3
+        share_box_across_classes: true
       }
     """
     hyperparams_proto = hyperparams_pb2.Hyperparams()
@@ -338,6 +339,7 @@ def mock_fc_argscope_builder(fc_hyperparams_arg, is_training):
     self.assertEqual(box_predictor.num_classes, 90)
     self.assertTrue(box_predictor._is_training)
     self.assertEqual(box_predictor._box_code_size, 3)
+    self.assertEqual(box_predictor._share_box_across_classes, True)
 
   def test_build_default_mask_rcnn_box_predictor(self):
     box_predictor_proto = box_predictor_pb2.BoxPredictor()
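
The new share_box_across_classes flag only changes the size of the box-regression head. A small illustrative sketch of the shape arithmetic, mirroring the number_of_boxes logic added in core/box_predictor.py further down (numbers are made up):

# Illustrative only: box-encoding shapes for MaskRCNNBoxPredictor with and
# without share_box_across_classes, following the reshape added below.
num_classes = 5      # example value
box_code_size = 4    # example value

for share_box_across_classes in (False, True):
  number_of_boxes = 1 if share_box_across_classes else num_classes
  fc_output_units = number_of_boxes * box_code_size
  box_encodings_shape = [-1, 1, number_of_boxes, box_code_size]
  print(share_box_across_classes, fc_output_units, box_encodings_shape)
  # False -> 20 units, [-1, 1, 5, 4]; True -> 4 units, [-1, 1, 1, 4]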

research/object_detection/builders/losses_builder.py

Lines changed: 9 additions & 0 deletions
@@ -121,6 +121,10 @@ def build_faster_rcnn_classification_loss(loss_config):
     config = loss_config.weighted_softmax
     return losses.WeightedSoftmaxClassificationLoss(
         logit_scale=config.logit_scale)
+  if loss_type == 'weighted_logits_softmax':
+    config = loss_config.weighted_logits_softmax
+    return losses.WeightedSoftmaxClassificationAgainstLogitsLoss(
+        logit_scale=config.logit_scale)
 
   # By default, Faster RCNN second stage classifier uses Softmax loss
   # with anchor-wise outputs.
@@ -193,6 +197,11 @@ def _build_classification_loss(loss_config):
     return losses.WeightedSoftmaxClassificationLoss(
         logit_scale=config.logit_scale)
 
+  if loss_type == 'weighted_logits_softmax':
+    config = loss_config.weighted_logits_softmax
+    return losses.WeightedSoftmaxClassificationAgainstLogitsLoss(
+        logit_scale=config.logit_scale)
+
   if loss_type == 'bootstrapped_sigmoid':
     config = loss_config.bootstrapped_sigmoid
     return losses.BootstrappedSigmoidClassificationLoss(
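
A hedged usage sketch (assumed, not part of this commit): selecting the new logits-softmax loss through the builder follows the same text-proto pattern as the existing losses, with logit_scale acting as the temperature the builder reads above.

# Assumed usage sketch; it mirrors the pattern used in losses_builder_test.py
# below. Import paths reflect the research/object_detection layout.
from google.protobuf import text_format
from object_detection.builders import losses_builder
from object_detection.protos import losses_pb2

losses_text_proto = """
  classification_loss {
    weighted_logits_softmax {
      logit_scale: 2.0
    }
  }
  localization_loss {
    weighted_l2 {
    }
  }
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _ = losses_builder.build(losses_proto)
# classification_loss is a WeightedSoftmaxClassificationAgainstLogitsLoss.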

research/object_detection/builders/losses_builder_test.py

Lines changed: 31 additions & 0 deletions
@@ -207,6 +207,24 @@ def test_build_weighted_softmax_classification_loss(self):
     self.assertTrue(isinstance(classification_loss,
                                losses.WeightedSoftmaxClassificationLoss))
 
+  def test_build_weighted_logits_softmax_classification_loss(self):
+    losses_text_proto = """
+      classification_loss {
+        weighted_logits_softmax {
+        }
+      }
+      localization_loss {
+        weighted_l2 {
+        }
+      }
+    """
+    losses_proto = losses_pb2.Loss()
+    text_format.Merge(losses_text_proto, losses_proto)
+    classification_loss, _, _, _, _ = losses_builder.build(losses_proto)
+    self.assertTrue(
+        isinstance(classification_loss,
+                   losses.WeightedSoftmaxClassificationAgainstLogitsLoss))
+
   def test_build_weighted_softmax_classification_loss_with_logit_scale(self):
     losses_text_proto = """
       classification_loss {
@@ -442,6 +460,19 @@ def test_build_softmax_loss(self):
     self.assertTrue(isinstance(classification_loss,
                                losses.WeightedSoftmaxClassificationLoss))
 
+  def test_build_logits_softmax_loss(self):
+    losses_text_proto = """
+      weighted_logits_softmax {
+      }
+    """
+    losses_proto = losses_pb2.ClassificationLoss()
+    text_format.Merge(losses_text_proto, losses_proto)
+    classification_loss = losses_builder.build_faster_rcnn_classification_loss(
+        losses_proto)
+    self.assertTrue(
+        isinstance(classification_loss,
+                   losses.WeightedSoftmaxClassificationAgainstLogitsLoss))
+
   def test_build_softmax_loss_by_default(self):
     losses_text_proto = """
     """

research/object_detection/core/box_predictor.py

Lines changed: 23 additions & 5 deletions
@@ -308,7 +308,8 @@ def __init__(self,
                mask_prediction_num_conv_layers=2,
                mask_prediction_conv_depth=256,
                masks_are_class_agnostic=False,
-               predict_keypoints=False):
+               predict_keypoints=False,
+               share_box_across_classes=False):
     """Constructor.
 
     Args:
@@ -341,7 +342,8 @@ def __init__(self,
       masks_are_class_agnostic: Boolean determining if the mask-head is
         class-agnostic or not.
       predict_keypoints: Whether to predict keypoints insde detection boxes.
-
+      share_box_across_classes: Whether to share boxes across classes rather
+        than use a different box for each class.
 
     Raises:
       ValueError: If predict_instance_masks is true but conv_hyperparams is not
@@ -362,6 +364,7 @@ def __init__(self,
     self._mask_prediction_conv_depth = mask_prediction_conv_depth
     self._masks_are_class_agnostic = masks_are_class_agnostic
     self._predict_keypoints = predict_keypoints
+    self._share_box_across_classes = share_box_across_classes
     if self._predict_keypoints:
       raise ValueError('Keypoint prediction is unimplemented.')
     if ((self._predict_instance_masks or self._predict_keypoints) and
@@ -403,10 +406,14 @@ def _predict_boxes_and_classes(self, image_features):
       flattened_image_features = slim.dropout(flattened_image_features,
                                               keep_prob=self._dropout_keep_prob,
                                               is_training=self._is_training)
+    number_of_boxes = 1
+    if not self._share_box_across_classes:
+      number_of_boxes = self._num_classes
+
     with slim.arg_scope(self._fc_hyperparams_fn()):
       box_encodings = slim.fully_connected(
           flattened_image_features,
-          self._num_classes * self._box_code_size,
+          number_of_boxes * self._box_code_size,
           activation_fn=None,
           scope='BoxEncodingPredictor')
       class_predictions_with_background = slim.fully_connected(
@@ -415,7 +422,7 @@ def _predict_boxes_and_classes(self, image_features):
           activation_fn=None,
           scope='ClassPredictor')
     box_encodings = tf.reshape(
-        box_encodings, [-1, 1, self._num_classes, self._box_code_size])
+        box_encodings, [-1, 1, number_of_boxes, self._box_code_size])
     class_predictions_with_background = tf.reshape(
         class_predictions_with_background, [-1, 1, self._num_classes + 1])
     return box_encodings, class_predictions_with_background
@@ -778,7 +785,9 @@ def __init__(self,
                num_layers_before_predictor,
                box_code_size,
                kernel_size=3,
-               class_prediction_bias_init=0.0):
+               class_prediction_bias_init=0.0,
+               use_dropout=False,
+               dropout_keep_prob=0.8):
     """Constructor.
 
     Args:
@@ -796,6 +805,8 @@ def __init__(self,
       kernel_size: Size of final convolution kernel.
      class_prediction_bias_init: constant value to initialize bias of the last
        conv2d layer before class prediction.
+      use_dropout: Whether to apply dropout to class prediction head.
+      dropout_keep_prob: Probability of keeping activiations.
     """
     super(WeightSharedConvolutionalBoxPredictor, self).__init__(is_training,
                                                                 num_classes)
@@ -805,6 +816,8 @@ def __init__(self,
     self._box_code_size = box_code_size
     self._kernel_size = kernel_size
     self._class_prediction_bias_init = class_prediction_bias_init
+    self._use_dropout = use_dropout
+    self._dropout_keep_prob = dropout_keep_prob
 
   def _predict(self, image_features, num_predictions_per_location_list):
     """Computes encoded object locations and corresponding confidences.
@@ -867,6 +880,7 @@ def _predict(self, image_features, num_predictions_per_location_list):
          num_predictions_per_location * self._box_code_size,
          [self._kernel_size, self._kernel_size],
          activation_fn=None, stride=1, padding='SAME',
+          normalizer_fn=None,
          scope='BoxEncodingPredictor')
 
      for i in range(self._num_layers_before_predictor):
@@ -877,11 +891,15 @@ def _predict(self, image_features, num_predictions_per_location_list):
            stride=1,
            padding='SAME',
            scope='ClassPredictionTower/conv2d_{}'.format(i))
+      if self._use_dropout:
+        class_predictions_net = slim.dropout(
+            class_predictions_net, keep_prob=self._dropout_keep_prob)
      class_predictions_with_background = slim.conv2d(
          class_predictions_net,
          num_predictions_per_location * num_class_slots,
          [self._kernel_size, self._kernel_size],
          activation_fn=None, stride=1, padding='SAME',
+          normalizer_fn=None,
          biases_initializer=tf.constant_initializer(
              self._class_prediction_bias_init),
          scope='ClassPredictor')
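
Because the final BoxEncodingPredictor and ClassPredictor convolutions now pass normalizer_fn=None, their biases are real variables again, so class_prediction_bias_init can push initial predictions toward background under a sigmoid/focal loss. A quick back-of-the-envelope check (illustrative, not from this commit):

# Illustrative check: a class-prediction bias of -4.6 puts the initial sigmoid
# score near 0.01, i.e. almost everything starts out as background. The value
# -4.6 is the one exercised in box_predictor_test.py below.
import math

class_prediction_bias_init = -4.6
initial_score = 1.0 / (1.0 + math.exp(-class_prediction_bias_init))
print(round(initial_score, 3))  # ~0.01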

research/object_detection/core/box_predictor_test.py

Lines changed: 58 additions & 5 deletions
@@ -70,6 +70,33 @@ def test_get_boxes_with_five_classes(self):
     self.assertAllEqual(box_encodings_shape, [2, 1, 5, 4])
     self.assertAllEqual(class_predictions_with_background_shape, [2, 1, 6])
 
+  def test_get_boxes_with_five_classes_share_box_across_classes(self):
+    image_features = tf.random_uniform([2, 7, 7, 3], dtype=tf.float32)
+    mask_box_predictor = box_predictor.MaskRCNNBoxPredictor(
+        is_training=False,
+        num_classes=5,
+        fc_hyperparams_fn=self._build_arg_scope_with_hyperparams(),
+        use_dropout=False,
+        dropout_keep_prob=0.5,
+        box_code_size=4,
+        share_box_across_classes=True
+    )
+    box_predictions = mask_box_predictor.predict(
+        [image_features], num_predictions_per_location=[1],
+        scope='BoxPredictor')
+    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
+    class_predictions_with_background = box_predictions[
+        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
+    init_op = tf.global_variables_initializer()
+    with self.test_session() as sess:
+      sess.run(init_op)
+      (box_encodings_shape,
+       class_predictions_with_background_shape) = sess.run(
+           [tf.shape(box_encodings),
+            tf.shape(class_predictions_with_background)])
+      self.assertAllEqual(box_encodings_shape, [2, 1, 1, 4])
+      self.assertAllEqual(class_predictions_with_background_shape, [2, 1, 6])
+
   def test_value_error_on_predict_instance_masks_with_no_conv_hyperparms(self):
     with self.assertRaises(ValueError):
       box_predictor.MaskRCNNBoxPredictor(
@@ -403,9 +430,14 @@ def _build_arg_scope_with_conv_hyperparams(self):
        }
      }
      initializer {
-        truncated_normal_initializer {
+        random_normal_initializer {
+          stddev: 0.01
+          mean: 0.0
        }
      }
+      batch_norm {
+        train: true,
+      }
    """
    text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
    return hyperparams_builder.build(conv_hyperparams, is_training=True)
@@ -434,6 +466,27 @@ def graph_fn(image_features):
     self.assertAllEqual(box_encodings.shape, [4, 320, 1, 4])
     self.assertAllEqual(objectness_predictions.shape, [4, 320, 1])
 
+  def test_bias_predictions_to_background_with_sigmoid_score_conversion(self):
+
+    def graph_fn(image_features):
+      conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
+          is_training=True,
+          num_classes=2,
+          conv_hyperparams_fn=self._build_arg_scope_with_conv_hyperparams(),
+          depth=32,
+          num_layers_before_predictor=1,
+          class_prediction_bias_init=-4.6,
+          box_code_size=4)
+      box_predictions = conv_box_predictor.predict(
+          [image_features], num_predictions_per_location=[5],
+          scope='BoxPredictor')
+      class_predictions = tf.concat(box_predictions[
+          box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND], axis=1)
+      return (tf.nn.sigmoid(class_predictions),)
+    image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
+    class_predictions = self.execute(graph_fn, [image_features])
+    self.assertAlmostEqual(np.mean(class_predictions), 0.01, places=3)
+
   def test_get_multi_class_predictions_for_five_aspect_ratios_per_location(
       self):
 
@@ -524,19 +577,19 @@ def graph_fn(image_features1, image_features2):
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
         'BoxEncodingPredictionTower/conv2d_0/weights'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
-         'BoxEncodingPredictionTower/conv2d_0/biases'),
+         'BoxEncodingPredictionTower/conv2d_0/BatchNorm/beta'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
         'BoxEncodingPredictionTower/conv2d_1/weights'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
-         'BoxEncodingPredictionTower/conv2d_1/biases'),
+         'BoxEncodingPredictionTower/conv2d_1/BatchNorm/beta'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
         'ClassPredictionTower/conv2d_0/weights'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
-         'ClassPredictionTower/conv2d_0/biases'),
+         'ClassPredictionTower/conv2d_0/BatchNorm/beta'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
         'ClassPredictionTower/conv2d_1/weights'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
-         'ClassPredictionTower/conv2d_1/biases'),
+         'ClassPredictionTower/conv2d_1/BatchNorm/beta'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
         'BoxEncodingPredictor/weights'),
        ('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
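
The renamed expected variables (conv2d_*/biases becoming conv2d_*/BatchNorm/beta) follow from standard slim behavior: when a normalizer is supplied through the arg_scope, slim.conv2d creates batch-norm parameters instead of a bias, and the final prediction layers opt back out with normalizer_fn=None. A minimal sketch of that behavior (assumes TF 1.x with tf.contrib.slim; scope names are illustrative):

# Minimal sketch of the slim behavior behind the renamed variables above.
# Assumes TensorFlow 1.x with tf.contrib.slim; names are illustrative.
import tensorflow as tf

slim = tf.contrib.slim

inputs = tf.placeholder(tf.float32, [1, 8, 8, 3])
with slim.arg_scope([slim.conv2d],
                    normalizer_fn=slim.batch_norm,
                    normalizer_params={'is_training': True}):
  # Tower conv: batch norm replaces the bias, so BatchNorm/beta is created
  # instead of conv2d_0/biases.
  net = slim.conv2d(inputs, 32, [3, 3], scope='conv2d_0')
  # Final predictor conv: normalizer_fn=None restores a plain bias so that
  # class_prediction_bias_init can take effect.
  net = slim.conv2d(net, 6, [3, 3], activation_fn=None, normalizer_fn=None,
                    scope='ClassPredictor')

print([v.op.name for v in tf.global_variables()])
# Expect e.g. conv2d_0/weights, conv2d_0/BatchNorm/beta, ...,
# ClassPredictor/weights, ClassPredictor/biases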

research/object_detection/core/losses.py

Lines changed: 49 additions & 0 deletions
@@ -23,6 +23,7 @@
 Classification losses:
  * WeightedSigmoidClassificationLoss
  * WeightedSoftmaxClassificationLoss
+ * WeightedSoftmaxClassificationAgainstLogitsLoss
  * BootstrappedSigmoidClassificationLoss
 """
 from abc import ABCMeta
@@ -317,6 +318,54 @@ def _compute_loss(self, prediction_tensor, target_tensor, weights):
     return tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights
 
 
+class WeightedSoftmaxClassificationAgainstLogitsLoss(Loss):
+  """Softmax loss function against logits.
+
+  Targets are expected to be provided in logits space instead of "one hot" or
+  "probability distribution" space.
+  """
+
+  def __init__(self, logit_scale=1.0):
+    """Constructor.
+
+    Args:
+      logit_scale: When this value is high, the target is "diffused" and
+        when this value is low, the target is made peakier.
+        (default 1.0)
+
+    """
+    self._logit_scale = logit_scale
+
+  def _scale_and_softmax_logits(self, logits):
+    """Scale logits then apply softmax."""
+    scaled_logits = tf.divide(logits, self._logit_scale, name='scale_logits')
+    return tf.nn.softmax(scaled_logits, name='convert_scores')
+
+  def _compute_loss(self, prediction_tensor, target_tensor, weights):
+    """Compute loss function.
+
+    Args:
+      prediction_tensor: A float tensor of shape [batch_size, num_anchors,
+        num_classes] representing the predicted logits for each class
+      target_tensor: A float tensor of shape [batch_size, num_anchors,
+        num_classes] representing logit classification targets
+      weights: a float tensor of shape [batch_size, num_anchors]
+
+    Returns:
+      loss: a float tensor of shape [batch_size, num_anchors]
+        representing the value of the loss function.
+    """
+    num_classes = prediction_tensor.get_shape().as_list()[-1]
+    target_tensor = self._scale_and_softmax_logits(target_tensor)
+    prediction_tensor = tf.divide(prediction_tensor, self._logit_scale,
+                                  name='scale_logits')
+
+    per_row_cross_ent = (tf.nn.softmax_cross_entropy_with_logits(
+        labels=tf.reshape(target_tensor, [-1, num_classes]),
+        logits=tf.reshape(prediction_tensor, [-1, num_classes])))
+    return tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights
+
+
 class BootstrappedSigmoidClassificationLoss(Loss):
   """Bootstrapped sigmoid cross entropy classification loss function.
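
The distillation math added here is small enough to trace by hand. A standalone sketch (not part of this commit) that mirrors _compute_loss with plain TensorFlow 1.x ops and made-up values:

# Mirrors WeightedSoftmaxClassificationAgainstLogitsLoss._compute_loss above
# using plain TF 1.x ops; shapes and values are illustrative only.
import tensorflow as tf

logit_scale = 2.0  # acts as a distillation "temperature"
prediction_logits = tf.constant([[[2.0, 0.5, -1.0]]])  # [batch, anchors, classes]
target_logits = tf.constant([[[1.5, 0.0, -0.5]]])      # teacher logits as targets
weights = tf.constant([[1.0]])                         # [batch, anchors]

# Targets are converted from logit space to a probability distribution after
# temperature scaling; predictions are scaled by the same factor.
soft_targets = tf.nn.softmax(tf.divide(target_logits, logit_scale))
scaled_predictions = tf.divide(prediction_logits, logit_scale)

num_classes = prediction_logits.get_shape().as_list()[-1]
per_row_cross_ent = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.reshape(soft_targets, [-1, num_classes]),
    logits=tf.reshape(scaled_predictions, [-1, num_classes]))
loss = tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights

with tf.Session() as sess:
  print(sess.run(loss))  # per-anchor distillation loss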
