lab CLI tool FAQs
This page serves as a comprehensive FAQ for lab commands, providing general information about the CLI tool, common issues you might encounter, and how to resolve them.
About the lab CLI tool
The lab command-line interface (CLI) tool allows users to interact with Merlinite-7b, an open source, pre-trained Large Language Model (LLM) available through [Hugging Face](https://huggingface.co/ibm/merlinite-7b). Generally speaking, the lab CLI works as follows (see the command sketch after this list):
1. Users download the lab CLI tool, which allows them to download the Merlinite-7b LLM via the lab download command.
2. Users chat with the LLM. If they find that a specific knowledge domain of the LLM is lacking, they can add a new skill or knowledge (or build upon existing skills and knowledge) to the model and test it locally.
3. After adding a new skill or knowledge, they use the lab generate command, which generates new synthetic training data based on the changes in their local taxonomy repository.
4. Users can then chat with the LLM to test their changes locally and see the results.
5. Lastly, they can submit a pull request to the InstructLab taxonomy repository, thereby contributing directly to the Merlinite-7b LLM project.
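For orientation, the workflow above maps onto a handful of commands. The following is a minimal sketch; exact flags and defaults may vary between CLI versions:

```shell
# 1. Download the Merlinite-7b model
lab download

# 2. Serve the model locally, then chat with it from a second terminal
lab serve
lab chat

# 3. After adding a skill or knowledge to your local taxonomy,
#    generate synthetic training data from those changes
lab generate

# 4. Chat again to test the changes, then open a pull request
#    against the InstructLab taxonomy repository
```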
The following section details common FAQs regarding the lab CLI tool. For more information about the InstructLab project, see <include_link_to_general_faq_here>.
lab command FAQs
The following FAQs are common issues encountered when running lab-related commands.
lab chat FAQ
Q: When sending messages to the chat using lab chat, what is the best method for adding context?
A: The best way to add context when sending messages via lab chat is to use the /c <context> command. This allows you to include relevant context alongside your message.
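As a purely hypothetical illustration (the exact prompt and formatting of a lab chat session may differ from what is shown here):

```shell
# Inside an interactive lab chat session:
/c The following questions are about POSIX shell scripting.
How do I loop over all files in a directory?
```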
lab convert FAQ
Q: Attempting to use lab convert returns the following error: Error: No such command 'convert'. Why?
A: lab convert is unavailable in the stable tag. It must be installed from the [main branch](https://github.com/instruct-lab/cli/blob/main/CONTRIBUTING/FIRST_TIME_CONTRIBUTORS.md#installing-lab-from-source) rather than from a stable tag.
lab download FAQ
Q: I encountered an error while trying to download models using lab download. The error message suggests using the gh CLI and mentions gh auth login. What should I do to resolve this issue?
A: To resolve this, ensure that you have the GitHub CLI (gh) installed on your local environment. You can download and install it from [GitHub](https://cli.github.com/).
lab generate FAQs
Q: I'm encountering an error when running the lab generate command. Why might this be?
A: In some cases, a general error might be caused by issues with your skills or knowledge YAML configuration file, such as an incorrect space or misused colon. Additionally, for the lab generate command to work, your skills or knowledge YAML file must include three or more samples.
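Since most of these failures are plain YAML syntax errors, validating the file before generating can save a run. For example, assuming a YAML linter such as yamllint is installed (the tool choice and the file path below are illustrative):

```shell
# Catch indentation and colon mistakes before running lab generate
pip install yamllint
yamllint path/to/taxonomy/compositional_skills/my_skill/qna.yaml
```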
Q: I'm encountering an issue with the lab generate command, and it appears to be related to authentication. Any insights for troubleshooting this?
A: Ensure that you have a local server running via lab serve.
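In practice, this means keeping the server and the generation run in separate terminals, for example:

```shell
# Terminal 1: start the local inference server
lab serve

# Terminal 2: run data generation against that server
lab generate
```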
Q: Why is my lab generate process running slow?
A: For some Mac users, adjusting the GPU memory limit might help expedite the lab generate process. By default, macOS allocates around 60-70% of the total available RAM for GPU tasks. On some Macs, for example an M1 with 16GB of RAM, this allocation might be suboptimal. Adjusting the limit to a value closer to 12GB might provide improvements for some users.
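One way some users raise this limit on Apple Silicon is the iogpu.wired_limit_mb sysctl. This specific knob is an assumption on our part (it requires a recent macOS release, and the value resets on reboot), so verify that it applies to your system first:

```shell
# Raise the GPU wired memory limit to roughly 12 GB (value is in MB);
# the setting persists only until the next reboot
sudo sysctl iogpu.wired_limit_mb=12288
```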
Q: Why am I encountering an openai.APITimeoutError: Request timed out error when attempting to run lab generate with a new skill?
A: Ensure that you have a local server running via lab serve.
Q: I'm experiencing a string indices must be integers, not 'str' error when running lab generate. What could be the issue?
A: If you are experiencing a string indices must be integers, not 'str' error, there might be an issue with your directory and configuration setup. You might have accidentally run the lab init command somewhere in the taxonomy directory, which creates a configuration file (config.yaml) inside the directory where the taxonomy is located. The program then interprets the configuration file as part of the taxonomy, leading to confusion and errors when running lab generate. To resolve this, ensure that your configuration file is not located within the taxonomy directory.
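A quick way to check for a stray config file is to search the taxonomy tree (the path below is a placeholder for wherever your taxonomy clone lives):

```shell
# Look for a config.yaml that lab init may have created inside the taxonomy
find ~/instructlab/taxonomy -name config.yaml
```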
Q: My machine crashed when running lab generate and lab chat simultaneously. Why?
A: Running lab generate and lab chat concurrently might crash your machine due to a llama-cpp-python bug: llama.cpp does not yet support batching requests. For more information, see [Is the server can be run with multiple states? [sic]](https://github.com/abetlen/llama-cpp-python/issues/257).
Q: I created a knowledge document and ran lab generate, but some answers produced during generation are incorrect. Can I correct these answers before re-training the model?
A: No. You cannot directly correct answers generated during the lab generate process. The synthetic data set produced by the model doesn't allow for manual correction of individual answers. If you notice inaccuracies in the generated content, it's crucial not to submit the YAML file containing these incorrect answers.
To address this issue, adjust your YAML file to improve the quality of the generated content. This might include refining the questions, adding more diverse samples, and covering a broader range of topics to enhance the overall output quality. After modifying the YAML file, rerun lab generate and review the results again to assess whether the corrections have improved the accuracy of the generated answers. If accuracy has improved, you can submit a PR to be reviewed.
lab init FAQ
Q: Is it typical for the seed_tasks.json file created by lab init to contain non-technical content, such as US politics policy analysis and social policy stereotypes? Where does this content originate from?
A: It is not uncommon for the seed_tasks.json file generated by lab init to include non-technical content, seemingly unrelated to the intended task. This might seem odd, but it stems from the nature of the language model being used and how it generates synthetic data to expand on the examples that you gave it.
lab list FAQ
Q: Why doesn't lab list return a list of my modules?
A: lab list shows taxonomy files that have changed relative to a reference commit (origin/main by default), so it returns a diff rather than a full listing of modules. In other words, lab list only checks whether changes are pending in the taxonomy repository; if nothing has changed, it returns nothing.
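Conceptually (this is an analogy, not the actual implementation), the behavior resembles asking git which files differ from the reference commit:

```shell
# Roughly what lab list reports: taxonomy files changed since origin/main
git diff --name-only origin/main
```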
Technical lab FAQs
Q: Even though the model size is only 4.1 GB, it takes a long time to download, even with a fast connection. Is there a way to speed up the download process?
A: Ensure that you have the fastest mirror enabled.
Q: When using pip install, I encountered the following error: Permission denied (publickey). How can I resolve this issue?
A: The error message indicates that authentication is required to access the GitHub repository via SSH, but no valid SSH keypair is available. To resolve this issue, run gh auth login and follow the prompts provided by the GitHub CLI.
Q: I am interested in using Langchain with InstructLab. How can I do this?
A: You can execute code against a locally running lab serve session, as in the following example:
```python
from langchain_openai.llms import OpenAI

# Initialize Langchain's OpenAI object, pointed at the local lab serve session
llm = OpenAI(
    openai_api_key="EMPTY",  # The local server does not validate the key
    openai_api_base="http://localhost:8000/v1",  # Base URL of the local lab serve session
)

# Define the user prompt
usr_prompt = "What is the capital of France?"

# Construct the full prompt with user input
prompt = "\n" + usr_prompt + "\n\n"

# Invoke Langchain's OpenAI object with the prompt
response = llm.invoke(prompt)

# Print the response
print(response)
```
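This assumes the langchain-openai package is installed and that a lab serve session is already listening on port 8000 (adjust the base URL if your server uses a different port):

```shell
pip install langchain-openai   # package layout may vary across Langchain versions
lab serve                      # run in a separate terminal before executing the script
```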
Q: I attempted to install llama-cpp-python using pip via pip install llama-cpp-python, but encountered the following error: Failed building wheel for llama-cpp-python. How can I resolve this issue?
A: To resolve this issue, make sure that you have development tools and cmake installed and configured on your machine. Additionally, ensure that you have gcc-c++ installed.
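On Fedora-family distributions, for example, those prerequisites can be installed as follows (package names differ on Debian/Ubuntu and macOS):

```shell
# Install a C++ toolchain and CMake so the llama-cpp-python wheel can build
sudo dnf install gcc-c++ cmake
```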
Q: How can I install llama-cpp-python with OpenCL enabled on Linux?
A: To install llama-cpp-python with OpenCL enabled on Linux, you must:
1. If you have already installed the InstructLab CLI (lab), first uninstall the existing build via python -m pip uninstall llama-cpp-python.
2. Install the OpenCL dependencies by running a command appropriate to your machine, for example: sudo dnf install intel-opencl clblast clblast-devel
3. Reinstall with OpenCL enabled: CMAKE_ARGS="-DLLAMA_CLBLAST=on" python -m pip install llama-cpp-python[server] --no-cache-dir