LangSuit⋅E is a systematic, simulation-free testbed for evaluating the embodied capabilities of large language models (LLMs) across different tasks in embodied textual worlds. Its highlighted features include:
- Embodied Textual Environments: The testbed provides a general, simulation-free textual world that supports most embodied tasks, including navigation, manipulation, and communication. The environment is built on Gymnasium and inherits its design patterns (see the interaction sketch after this feature list).
- Embodied Observations and Actions: All agents' observations are designed to be embodied, with customizable `max_view_distance`, `max_manipulate_distance`, `focal_length`, etc.
- Customizable Embodied Agents: The agents in LangSuit⋅E are fully customizable w.r.t. their action spaces and communicative capabilities, i.e., one can easily adapt the communication and acting strategies from one task to another.
- Multi-agent Cooperation: The testbed supports planning, acting, and communication among multiple agents, where each agent can be customized with a different configuration.
- Human-agent Communication: Besides communication between agents, the testbed supports communication and cooperation between humans and agents.
- Full support for the LangChain library: The LangSuit⋅E testbed supports API-based language models, open-source language models, tool usage, Chain-of-Thought (CoT) strategies, etc.
- Expert Trajectory Generation: We provide expert trajectory generation algorithms for most tasks.
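Because the environments follow the Gymnasium interface, interaction is the familiar reset/step loop. The sketch below uses a stock Gymnasium environment purely to illustrate that pattern; the concrete LangSuit⋅E environment classes, textual observations, and action formats are defined by the task configurations shown later in this README.

```python
# Gymnasium-style interaction loop of the kind LangSuitE environments follow.
# "CartPole-v1" is only a stand-in: in LangSuitE the observations are text and
# an LLM agent, not random sampling, would choose each action.
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=0)

for _ in range(10):
    action = env.action_space.sample()  # an LLM agent would decide the action here
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```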
We form a benchmark by adapting existing annotations from simulated embodied engines, a by-product benefit of pursuing a general textual embodied world. The table below showcases 6 representative embodied tasks, which vary in the number of rooms, the number of agents, and the agents' action spaces (whether they can communicate with each other or ask humans).
| Task | Simulator | # of Scenes | # of Tasks | # of Actions | Multi-Room | Multi-Agent | Communicative |
|---|---|---|---|---|---|---|---|
| BabyAI | MiniGrid | 105 | 500 | 6 | ✓ | ✗ | ✗ |
| Rearrange | AI2Thor | 120 | 500 | 8 | ✗ | ✗ | ✗ |
| IQA | AI2Thor | 30 | 3,000 | 5 | ✗ | ✗ | ✓ |
| ALFRED | AI2Thor | 120 | 506 | 12 | ✗ | ✗ | ✗ |
| TEACh | AI2Thor | 120 | 200 | 13 | ✗ | ✓ | ✓ |
| CWAH | VirtualHome | 2 | 50 | 6 | ✓ | ✓ | ✓ |
- Clone this repository:
  ```bash
  git clone https://github.com/langsuite/langsuite.git
  cd langsuite
  ```
- Create a conda environment with `Python 3.8+` and install the Python requirements:
  ```bash
  conda create -n langsuite python=3.8
  conda activate langsuite
  pip install -e .
  ```
- Export your `OPENAI_API_KEY`:
  ```bash
  export OPENAI_API_KEY="your_api_key_here"
  ```
  Alternatively, customize your APIs by
  ```bash
  cp api.config.yml.example api.config.yml
  ```
  and then add or update your API configurations in that file. For the full list of supported API agents, please refer to LangChain Chat Models (a minimal LangChain usage sketch follows this list).
- Download the task dataset:
  ```bash
  bash ./data/download.sh <data name>
  ```
  Currently supported datasets include: `alfred`, `babyai`, `cwah`, `iqa`, `rearrange`.
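For reference, the sketch below shows a direct LangChain chat-model call of the kind the API configuration points to. It is plain LangChain, not LangSuit⋅E-specific code, and import paths differ across LangChain versions.

```python
# Plain LangChain chat-model call (not LangSuitE-specific); the ChatOpenAI
# referenced in the task configs below is a LangChain chat model.
# Import paths vary across LangChain releases; this follows the classic layout.
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
reply = llm([HumanMessage(content="You see a mug on the counter. What is your next action?")])
print(reply.content)
```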
- Run a task:
  ```bash
  langsuite task <config-file.yml>
  ```
- Start the langsuite server:
  ```bash
  langsuite serve <config-file.yml>
  ```
- Start the web UI:
  ```bash
  langsuite webui
  ```
  The user interface will run on http://localhost:8501/
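Putting the steps together, a typical first run might look like the following. The dataset name comes from the supported list above; `my_babyai_task.yml` is only a placeholder for whichever task configuration file you use (such as the example configuration below).

```bash
# Illustrative end-to-end run; "my_babyai_task.yml" is a placeholder config
# name, not a file shipped with the repository.
export OPENAI_API_KEY="your_api_key_here"
bash ./data/download.sh babyai
langsuite task my_babyai_task.yml
```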
An example task configuration (the `<config-file.yml>` passed to the commands above) looks like:

```yaml
task: ExampleTask:Procthor2DEnv
template: ./langsuite/envs/ai2thor/templates/procthor_rearrange.json
env:
  type: Procthor2DEnv
  world:
    type: ProcTHORWorld
    id: test_world
    grid_size: 0.25
    asset_path: ./data/asset-database.json
    metadata_path: ./data/ai2thor-object-metadata.json
    receptacles_path: ./data/receptacles.json
agents:
  - type: ChatGPTAgent
    position: 'random'
    inventory_capacity: 1
    focal_length: 10
    max_manipulate_distance: 1
    max_view_distance: 2
    step_size: 0.25
    llm:
      llm_type: ChatOpenAI
```

The prompt template referenced by `template:` is a JSON file that maps prompting stages and feedback messages to template strings, for example:

```json
{
"intro": {
"default": [
"You are an autonomous intelligent agent tasked with navigating a vitual home. You will be given a household task. These tasks will be accomplished through the use of specific actions you can issue. [...]"
]
},
"example": {
"default": [
"Task: go to the red box. \nObs:You can see a blue key in front of you; You can see a red box on your right. \nManipulable object: A blue key.\n>Act: turn_right."
]
},
"InvalidAction": {
"failure.invalidObjectName": [
"Feedback: Action failed. There is no the object \"{object}\" in your view space. Please operate the object in sight.\nObs: {observation}"
],
...
},
...
}
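As an illustration only (LangSuit⋅E ships its own template handling), such a JSON template could be loaded and filled with Python's standard `str.format`, since the messages use `{object}`-style placeholders. The snippet assumes a template file following the structure shown above.

```python
# Illustrative sketch, not LangSuitE's actual template engine: load a JSON
# prompt template and fill its {object}/{observation} placeholders.
import json

with open("./langsuite/envs/ai2thor/templates/procthor_rearrange.json") as f:
    templates = json.load(f)

message = templates["InvalidAction"]["failure.invalidObjectName"][0].format(
    object="red box",
    observation="You can see a blue key in front of you.",
)
print(message)
```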
If you find our work useful, please cite:

```bibtex
@misc{langsuite2023,
  author    = {Zilong Zheng and Zixia Jia and Mengmeng Wang and Wentao Ding and Baichen Tong and Song-Chun Zhu},
  title     = {LangSuit⋅E: Controlling, Planning, and Interacting with Large Language Models in Embodied Text Environments},
  year      = {2023},
  publisher = {GitHub},
  url       = {https://github.com/bigai-nlco/langsuite}
}
```

For any questions and issues, please contact [email protected].
Some of the tasks in LangSuit⋅E are based on datasets and source code from prior work, including BabyAI, AI2Thor, ALFRED, TEACh, and CWAH.