Commit 8e73530

Author: Jonathan Huang
Merge pull request tensorflow#4161 from pkulzc/master

Internal changes to object detection

2 parents 18d05ad + 6305421

7 files changed: +424 additions, −29 deletions

research/object_detection/README.md

Lines changed: 9 additions & 0 deletions

@@ -90,6 +90,15 @@ reporting an issue.

 ## Release information

+### April 30, 2018
+
+We have released a Faster R-CNN detector with a ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
+Compared with other commonly used object detectors, it changes the action classification loss function to a per-class sigmoid loss to handle boxes with multiple labels.
+The model is trained on the training split of AVA v2.1 for 1.5M iterations; it achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
+For more details, please refer to this [paper](https://arxiv.org/abs/1705.08421).
+
+<b>Thanks to contributors</b>: Chen Sun, David Ross
+
 ### April 2, 2018

 Supercharge your mobile phones with the next generation mobile object detector!
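The per-class sigmoid loss mentioned in the release note can be sketched in plain Python. This is a hand-rolled illustration, not the Object Detection API's implementation: each action class is treated as an independent binary problem, so a single box may carry several positive labels at once.

```python
import math

def per_class_sigmoid_loss(logits, labels):
    """Sum of independent binary cross-entropies, one per action class.

    Unlike softmax cross-entropy, which forces exactly one label per box,
    each class here is scored on its own, so `labels` may contain several
    1s for a single box (e.g. a person who is both standing and talking).
    """
    loss = 0.0
    for logit, label in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
        loss += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return loss
```

A box labeled both "stand" and "talk to" simply has two 1s in its label vector; the softmax formulation used by single-label detectors cannot express that.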
Lines changed: 240 additions & 0 deletions

@@ -0,0 +1,240 @@
item {
  name: "bend/bow (at the waist)"
  id: 1
}
item {
  name: "crouch/kneel"
  id: 3
}
item {
  name: "dance"
  id: 4
}
item {
  name: "fall down"
  id: 5
}
item {
  name: "get up"
  id: 6
}
item {
  name: "jump/leap"
  id: 7
}
item {
  name: "lie/sleep"
  id: 8
}
item {
  name: "martial art"
  id: 9
}
item {
  name: "run/jog"
  id: 10
}
item {
  name: "sit"
  id: 11
}
item {
  name: "stand"
  id: 12
}
item {
  name: "swim"
  id: 13
}
item {
  name: "walk"
  id: 14
}
item {
  name: "answer phone"
  id: 15
}
item {
  name: "carry/hold (an object)"
  id: 17
}
item {
  name: "climb (e.g., a mountain)"
  id: 20
}
item {
  name: "close (e.g., a door, a box)"
  id: 22
}
item {
  name: "cut"
  id: 24
}
item {
  name: "dress/put on clothing"
  id: 26
}
item {
  name: "drink"
  id: 27
}
item {
  name: "drive (e.g., a car, a truck)"
  id: 28
}
item {
  name: "eat"
  id: 29
}
item {
  name: "enter"
  id: 30
}
item {
  name: "hit (an object)"
  id: 34
}
item {
  name: "lift/pick up"
  id: 36
}
item {
  name: "listen (e.g., to music)"
  id: 37
}
item {
  name: "open (e.g., a window, a car door)"
  id: 38
}
item {
  name: "play musical instrument"
  id: 41
}
item {
  name: "point to (an object)"
  id: 43
}
item {
  name: "pull (an object)"
  id: 45
}
item {
  name: "push (an object)"
  id: 46
}
item {
  name: "put down"
  id: 47
}
item {
  name: "read"
  id: 48
}
item {
  name: "ride (e.g., a bike, a car, a horse)"
  id: 49
}
item {
  name: "sail boat"
  id: 51
}
item {
  name: "shoot"
  id: 52
}
item {
  name: "smoke"
  id: 54
}
item {
  name: "take a photo"
  id: 56
}
item {
  name: "text on/look at a cellphone"
  id: 57
}
item {
  name: "throw"
  id: 58
}
item {
  name: "touch (an object)"
  id: 59
}
item {
  name: "turn (e.g., a screwdriver)"
  id: 60
}
item {
  name: "watch (e.g., TV)"
  id: 61
}
item {
  name: "work on a computer"
  id: 62
}
item {
  name: "write"
  id: 63
}
item {
  name: "fight/hit (a person)"
  id: 64
}
item {
  name: "give/serve (an object) to (a person)"
  id: 65
}
item {
  name: "grab (a person)"
  id: 66
}
item {
  name: "hand clap"
  id: 67
}
item {
  name: "hand shake"
  id: 68
}
item {
  name: "hand wave"
  id: 69
}
item {
  name: "hug (a person)"
  id: 70
}
item {
  name: "kiss (a person)"
  id: 72
}
item {
  name: "lift (a person)"
  id: 73
}
item {
  name: "listen to (a person)"
  id: 74
}
item {
  name: "push (another person)"
  id: 76
}
item {
  name: "sing to (e.g., self, a person, a group)"
  id: 77
}
item {
  name: "take (an object) from (a person)"
  id: 78
}
item {
  name: "talk to (e.g., self, a person, a group)"
  id: 79
}
item {
  name: "watch (a person)"
  id: 80
}
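A label map in this pbtxt form can be turned into an id-to-name lookup with a few lines of Python. The snippet below is a minimal regex-based sketch for illustration only; the Object Detection API itself parses label maps via its `label_map_util` module and the `StringIntLabelMap` protobuf.

```python
import re

def parse_label_map(pbtxt):
    """Parse simple `item { name: "..." id: N }` blocks into {id: name}."""
    label_map = {}
    for block in re.findall(r'item\s*\{([^}]*)\}', pbtxt):
        name = re.search(r'name:\s*"([^"]*)"', block).group(1)
        class_id = int(re.search(r'id:\s*(\d+)', block).group(1))
        label_map[class_id] = name
    return label_map

print(parse_label_map('item { name: "dance" id: 4 }'))  # {4: 'dance'}
```

Note that the ids above are deliberately non-contiguous (2, 16, 18, ... are skipped): the file lists only the 60 AVA classes the model is evaluated on, keeping AVA's original class numbering.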
research/object_detection/g3doc/detection_model_zoo.md

Lines changed: 8 additions & 1 deletion

@@ -91,7 +91,7 @@ Some remarks on frozen inference graphs:

 ## Kitti-trained models {#kitti-models}

-Model name | Speed (ms) | Pascal mAP@0.5 (ms) | Outputs
+Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
 [faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes

@@ -103,6 +103,13 @@ Model name
 [faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes

+## AVA v2.1 trained models {#ava-models}
+
+Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
+----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
+[faster_rcnn_resnet101_ava_v2.1](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_ava_v2.1_2018_04_30.tar.gz) | 93 | 11 | Boxes
+
 [^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
 [^2]: This is PASCAL mAP with a slightly different way of true positives computation: see [Open Images evaluation protocol](evaluation_protocols.md#open-images).

research/object_detection/model_lib.py

Lines changed: 18 additions & 19 deletions

@@ -325,16 +325,16 @@ def tpu_scaffold():
     }

     eval_metric_ops = None
-    if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
+    if mode == tf.estimator.ModeKeys.EVAL:
       class_agnostic = (fields.DetectionResultFields.detection_classes
                         not in detections)
       groundtruth = _get_groundtruth_data(detection_model, class_agnostic)
       use_original_images = fields.InputDataFields.original_image in features
-      original_images = (
+      eval_images = (
           features[fields.InputDataFields.original_image] if use_original_images
           else features[fields.InputDataFields.image])
       eval_dict = eval_util.result_dict_for_single_example(
-          original_images[0:1],
+          eval_images[0:1],
           features[inputs.HASH_KEY][0],
           detections,
           groundtruth,
@@ -355,22 +355,21 @@ def tpu_scaffold():
         img_summary = tf.summary.image('Detections_Left_Groundtruth_Right',
                                        detection_and_groundtruth)

-      if mode == tf.estimator.ModeKeys.EVAL:
-        # Eval metrics on a single example.
-        eval_metrics = eval_config.metrics_set
-        if not eval_metrics:
-          eval_metrics = ['coco_detection_metrics']
-        eval_metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
-            eval_metrics, category_index.values(), eval_dict,
-            include_metrics_per_category=False)
-        for loss_key, loss_tensor in iter(losses_dict.items()):
-          eval_metric_ops[loss_key] = tf.metrics.mean(loss_tensor)
-        for var in optimizer_summary_vars:
-          eval_metric_ops[var.op.name] = (var, tf.no_op())
-        if img_summary is not None:
-          eval_metric_ops['Detections_Left_Groundtruth_Right'] = (
-              img_summary, tf.no_op())
-        eval_metric_ops = {str(k): v for k, v in eval_metric_ops.iteritems()}
+      # Eval metrics on a single example.
+      eval_metrics = eval_config.metrics_set
+      if not eval_metrics:
+        eval_metrics = ['coco_detection_metrics']
+      eval_metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
+          eval_metrics, category_index.values(), eval_dict,
+          include_metrics_per_category=False)
+      for loss_key, loss_tensor in iter(losses_dict.items()):
+        eval_metric_ops[loss_key] = tf.metrics.mean(loss_tensor)
+      for var in optimizer_summary_vars:
+        eval_metric_ops[var.op.name] = (var, tf.no_op())
+      if img_summary is not None:
+        eval_metric_ops['Detections_Left_Groundtruth_Right'] = (
+            img_summary, tf.no_op())
+      eval_metric_ops = {str(k): v for k, v in eval_metric_ops.iteritems()}

     if use_tpu:
       return tf.contrib.tpu.TPUEstimatorSpec(
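The refactor above folds the metric construction into the single EVAL branch, and it hinges on the Estimator eval contract: each entry in `eval_metric_ops` is a `(value, update_op)` pair, where the update op accumulates over evaluation batches and the value tensor reads the running result (this is the pair that `tf.metrics.mean` returns for each loss). A pure-Python stand-in, for illustration only, not TensorFlow's implementation:

```python
class StreamingMean:
    """Toy analogue of tf.metrics.mean: accumulate per batch, read at the end."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_op(self, batch_value):
        # Called once per evaluation batch, like running the update op.
        self.total += batch_value
        self.count += 1

    def value(self):
        # Read the running mean, like fetching the value tensor.
        return self.total / self.count if self.count else 0.0

m = StreamingMean()
for loss in [2.0, 4.0, 6.0]:
    m.update_op(loss)
print(m.value())  # 4.0
```

Keeping the value read separate from the per-batch update is what lets the Estimator report one number per metric after looping over the whole evaluation set.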
