Commit e6ce8cd

arsalan-mousavian authored and Taylor Robie committed
* adding implementation of https://arxiv.org/abs/1805.06066
* fixing typos and sentences in README
1 parent 3c37361 commit e6ce8cd

26 files changed, +7642 -0 lines changed

research/cognitive_planning/BUILD

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
package(default_visibility = [":internal"])

licenses(["notice"])  # Apache 2.0

exports_files(["LICENSE"])

package_group(
    name = "internal",
    packages = [
        "//cognitive_planning/...",
    ],
)

py_binary(
    name = "train_supervised_active_vision",
    srcs = [
        "train_supervised_active_vision.py",
    ],
)

research/cognitive_planning/README.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
# cognitive_planning

**Visual Representations for Semantic Target Driven Navigation**

Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, James Davidson

This is the implementation of semantic target driven navigation training and evaluation on the Active Vision dataset.

ECCV Workshop on Visual Learning and Embodied Agents in Simulation Environments, 2018.

<div align="center">
  <table style="width:100%" border="0">
    <tr>
      <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_fridge_2.gif'></td>
      <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_tv_1.gif'></td>
    </tr>
    <tr>
      <td align="center">Target: Fridge</td>
      <td align="center">Target: Television</td>
    </tr>
    <tr>
      <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_microwave_1.gif'></td>
      <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_couch_1.gif'></td>
    </tr>
    <tr>
      <td align="center">Target: Microwave</td>
      <td align="center">Target: Couch</td>
    </tr>
  </table>
</div>

Paper: [https://arxiv.org/abs/1805.06066](https://arxiv.org/abs/1805.06066)

## 1. Installation

### Requirements

#### Python Packages

```shell
networkx
gin-config
```
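
These packages can be installed with `pip`; a minimal sketch (it assumes TensorFlow is already installed, since this code lives in the tensorflow/models repository, and the README does not pin versions):

```shell
# Install the Python dependencies listed above (versions unpinned, as in the README).
pip install networkx gin-config
```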

### Download cognitive_planning

```shell
git clone --depth 1 https://github.com/tensorflow/models.git
```
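
The cognitive_planning code lives under `research/cognitive_planning` in the cloned repository; the remaining commands assume they are run from that directory:

```shell
# Run subsequent commands from the cognitive_planning directory.
cd models/research/cognitive_planning
```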

## 2. Datasets

### Download ActiveVision Dataset
We used the Active Vision Dataset (AVD), which can be downloaded from [here](http://cs.unc.edu/~ammirato/active_vision_dataset_website/). To make our code faster and reduce its memory footprint, we created the AVD Minimal dataset. AVD Minimal consists of low-resolution images from the original AVD dataset. In addition, we added annotations for target views, predicted object detections from an object detector pre-trained on the MS-COCO dataset, and predicted semantic segmentations from a model pre-trained on the NYU-v2 dataset. AVD Minimal can be downloaded from [here](https://storage.googleapis.com/active-vision-dataset/AVD_Minimal.zip). Set `$AVD_DIR` to the path of the downloaded AVD Minimal.
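
For example, one way to fetch AVD Minimal and set `$AVD_DIR` (the extraction directory below is an arbitrary choice, and the unzipped folder name is an assumption; point `$AVD_DIR` at wherever the data actually lands):

```shell
wget https://storage.googleapis.com/active-vision-dataset/AVD_Minimal.zip
unzip AVD_Minimal.zip -d ~/datasets
export AVD_DIR=~/datasets/AVD_Minimal  # assumed folder name inside the zip
```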

### TODO: SUNCG Dataset
The current version of the code does not support the SUNCG dataset. Support can be added by
implementing the necessary functions of `envs/task_env.py` using the publicly
released code of a SUNCG environment such as
[House3D](https://github.com/facebookresearch/House3D) and
[MINOS](https://github.com/minosworld/minos).

### ActiveVisionDataset Demo

If you wish to navigate the environment interactively, to see what the AVD looks like, you can use the following command:
```shell
python viz_active_vision_dataset_main.py \
  --mode=human \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR'
```

## 3. Training
Right now, the released version only supports training and inference using the real data from the Active Vision Dataset.

When the RGB image modality is used, the ResNet embeddings are initialized from a pre-trained checkpoint. Before starting training, download the pre-trained ResNet-50 checkpoint into the working directory so that it is available at ./resnet_v2_50_checkpoint/resnet_v2_50.ckpt:

```shell
wget http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz
```
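
The tarball then needs to be extracted so that the checkpoint sits at the path above; a minimal sketch, assuming the archive contains `resnet_v2_50.ckpt` at its top level:

```shell
mkdir -p resnet_v2_50_checkpoint
tar -xvf resnet_v2_50_2017_04_14.tar.gz -C resnet_v2_50_checkpoint
# Expected result: ./resnet_v2_50_checkpoint/resnet_v2_50.ckpt
```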

### Run training
Use the following command for training:
```shell
# Train
python train_supervised_active_vision.py \
  --mode='train' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR' \
  --logtostderr
```

The training can be run with different modalities and modality combinations, including semantic segmentation, object detections, RGB images, and depth images. Low-resolution images, the outputs of detectors pre-trained on the COCO dataset, and semantic segmentations from a model pre-trained on the NYU-v2 dataset are provided as part of this distribution and can be found in the Meta directory of AVD_Minimal.
Additional details are described in the comments of the code and in the paper.
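
For example, a hypothetical run using the semantic segmentation modality instead of detections might look like the following; the exact set of accepted `--modality_types` values and the syntax for combining modalities are defined in `train_supervised_active_vision.py`, so the value below is an assumption, not documented syntax:

```shell
# Hypothetical: train on semantic segmentation instead of detections.
python train_supervised_active_vision.py \
  --mode='train' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='seg' \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR' \
  --logtostderr
```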

### Run Evaluation
Use the following command to unroll the policy on the eval environments. The inference code periodically checks the checkpoint folder for new checkpoints and uses each new checkpoint to unroll the policy on the eval environments. After each evaluation, it creates a folder $CHECKPOINT_DIR/evals/$ITER, where $ITER is the iteration number at which the checkpoint was stored.
```shell
# Eval
python train_supervised_active_vision.py \
  --mode='eval' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR' \
  --logtostderr
```
At any point, you can run the following command to compute statistics, such as success rate, over all the evaluations so far. It also generates GIF images of the rollouts of the best policy.
```shell
# Visualize and Compute Stats
python viz_active_vision_dataset_main.py \
  --mode=eval \
  --eval_folder=$CHECKPOINT_DIR/evals/ \
  --output_folder=$OUTPUT_GIFS_FOLDER \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR'
```
## Contact

To ask questions or report issues, please open an issue on the tensorflow/models
[issues tracker](https://github.com/tensorflow/models/issues).
Please assign issues to @arsalan-mousavian.

## Reference
The details of the training and experiments can be found in the following paper. If you find our work useful in your research, please consider citing our paper:

```
@inproceedings{MousavianECCVW18,
  author = {A. Mousavian and A. Toshev and M. Fiser and J. Kosecka and J. Davidson},
  title = {Visual Representations for Semantic Target Driven Navigation},
  booktitle = {ECCV Workshop on Visual Learning and Embodied Agents in Simulation Environments},
  year = {2018},
}
```

research/cognitive_planning/__init__.py

Whitespace-only changes.

research/cognitive_planning/command

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
python train_supervised_active_vision \
  --mode='train' \
  --logdir=/usr/local/google/home/kosecka/checkin_log_det/ \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=robotics/cognitive_planning/envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root="/usr/local/google/home/kosecka/AVD_minimal/"' \
  --logtostderr
