Large language models (LLMs) often perform poorly on tasks that require real-time information or precise calculation. Function calling addresses this by introducing external tools, which lets an LLM answer questions it otherwise could not.
How it works
Function calling enables an LLM to use information from external tools through a multi-step interaction between your application and the model.
Make the first model call
Your application sends a request to the LLM. The request contains the user's question and a list of tools the model can call.
Receive the model's tool call instruction
If the model decides to call an external tool, it returns a JSON instruction. This instruction tells your application which function to run and what input parameters to use.
If the model decides not to call a tool, it returns a response in natural language.
Run the tool in your application
After your application receives the tool instruction, run the tool to obtain its output.
Make the second model call
After you obtain the tool's output, add it to the model's context (messages) and make another model call.
Receive the final response from the model
The model combines the tool's output with the user's question to generate a final response in natural language.
The following diagram shows the workflow.
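The following minimal sketch summarizes this loop in Python. It assumes an OpenAI-compatible client, a tools list, and a local_functions dict that maps tool names to local Python functions, all defined as in the Getting started section below; it is an outline of the flow, not a definitive implementation.
# Minimal sketch of the function calling workflow described above.
# Assumes: an OpenAI-compatible `client`, a `tools` list, and a
# `local_functions` dict mapping tool names to local Python functions.
import json

def answer_with_tools(client, model, tools, local_functions, messages):
    # First model call: the model may answer directly or request a tool.
    reply = client.chat.completions.create(
        model=model, messages=messages, tools=tools
    ).choices[0].message
    messages.append(reply)
    while reply.tool_calls:  # the model requested one or more tools
        for call in reply.tool_calls:
            # Run the tool locally and feed its output back to the model.
            output = local_functions[call.function.name](json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id, "content": output})
        # Next model call, now with the tool output in context.
        reply = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        ).choices[0].message
        messages.append(reply)
    return reply.content  # final natural language response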
Getting started
First, you must obtain an API key and set it as an environment variable. If you call the model through the OpenAI SDK or DashScope SDK, you must also install the SDK.
This section uses a weather query scenario to demonstrate how to quickly use function calling.
OpenAI compatible
Python
from openai import OpenAI
from datetime import datetime
import json
import os
import random
client = OpenAI(
# If you use a model in the China (Singapore) region, you must use an API key from that region. Get it here: https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If you have not configured an environment variable, replace the next line with: api_key="sk-xxx",
api_key=os.getenv("DASHSCOPE_API_KEY"),
# If you use a model in the China (Singapore) region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
# Simulate a user question.
USER_QUESTION = "What's the weather like in Singapore?"
# Define the tool list.
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for when you want to query the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "A city or district, such as Singapore, London, or Shanghai.",
}
},
"required": ["location"],
},
},
},
]
# Simulate a weather query tool.
def get_current_weather(arguments):
weather_conditions = ["Sunny", "Cloudy", "Rainy"]
random_weather = random.choice(weather_conditions)
location = arguments["location"]
return f"Today {location} is {random_weather}."
# Encapsulate the model response function.
def get_response(messages):
completion = client.chat.completions.create(
model="qwen-plus",
messages=messages,
tools=tools,
)
return completion
messages = [{"role": "user", "content": USER_QUESTION}]
response = get_response(messages)
assistant_output = response.choices[0].message
if assistant_output.content is None:
assistant_output.content = ""
messages.append(assistant_output)
# If no tool call is needed, output the content directly.
if assistant_output.tool_calls is None:
print(f"No tool call needed for weather query. Replying directly: {assistant_output.content}")
else:
# Enter the tool calling loop.
while assistant_output.tool_calls is not None:
tool_call = assistant_output.tool_calls[0]
tool_call_id = tool_call.id
func_name = tool_call.function.name
arguments = json.loads(tool_call.function.arguments)
print(f"Calling tool [{func_name}] with arguments: {arguments}")
# Run the tool.
tool_result = get_current_weather(arguments)
# Construct the tool return message.
tool_message = {
"role": "tool",
"tool_call_id": tool_call_id,
"content": tool_result, # Keep the original tool output.
}
print(f"Tool returned: {tool_message['content']}")
messages.append(tool_message)
# Call the model again to get a summarized natural language response.
response = get_response(messages)
assistant_output = response.choices[0].message
if assistant_output.content is None:
assistant_output.content = ""
messages.append(assistant_output)
print(f"Final assistant response: {assistant_output.content}")
Node.js
import OpenAI from 'openai';
// Initialize the client.
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
// If you use a model in the China (Singapore) region, replace the baseURL with: https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
});
// Define the tool list.
const tools = [
{
type: "function",
function: {
name: "get_current_weather",
description: "Useful for when you want to query the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "A city or district, such as Singapore, London, or Shanghai.",
},
},
required: ["location"],
},
},
},
];
// Simulate a weather query tool.
const getCurrentWeather = (args) => {
const weatherConditions = ["Sunny", "Cloudy", "Rainy"];
const randomWeather = weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
const location = args.location;
return `Today ${location} is ${randomWeather}.`;
};
// Encapsulate the model response function.
const getResponse = async (messages) => {
const response = await openai.chat.completions.create({
model: "qwen-plus",
messages: messages,
tools: tools,
});
return response;
};
const main = async () => {
const input = "What's the weather like in Singapore?";
let messages = [
{
role: "user",
content: input,
}
];
let response = await getResponse(messages);
let assistantOutput = response.choices[0].message;
// Ensure content is not null.
if (!assistantOutput.content) assistantOutput.content = "";
messages.push(assistantOutput);
// Determine if a tool call is needed.
if (!assistantOutput.tool_calls) {
console.log(`No tool call needed for weather query. Replying directly: ${assistantOutput.content}`);
} else {
// Enter the tool calling loop.
while (assistantOutput.tool_calls) {
const toolCall = assistantOutput.tool_calls[0];
const toolCallId = toolCall.id;
const funcName = toolCall.function.name;
const funcArgs = JSON.parse(toolCall.function.arguments);
console.log(`Calling tool [${funcName}] with arguments:`, funcArgs);
// Run the tool.
const toolResult = getCurrentWeather(funcArgs);
// Construct the tool return message.
const toolMessage = {
role: "tool",
tool_call_id: toolCallId,
content: toolResult,
};
console.log(`Tool returned: ${toolMessage.content}`);
messages.push(toolMessage);
// Call the model again to get a natural language summary.
response = await getResponse(messages);
assistantOutput = response.choices[0].message;
if (!assistantOutput.content) assistantOutput.content = "";
messages.push(assistantOutput);
}
console.log(`Final assistant response: ${assistantOutput.content}`);
}
};
// Start the program.
main().catch(console.error);
DashScope
Python
import os
from dashscope import Generation
import dashscope
import json
import random
# If you use a model in the China (Singapore) region, replace the base_http_api_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
# 1. Define the tool list.
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for when you want to query the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "A city or district, such as Singapore, London, or Shanghai.",
}
},
"required": ["location"],
},
},
}
]
# 2. Simulate a weather query tool.
def get_current_weather(arguments):
weather_conditions = ["Sunny", "Cloudy", "Rainy"]
random_weather = random.choice(weather_conditions)
location = arguments["location"]
return f"Today {location} is {random_weather}."
# 3. Encapsulate the model response function.
def get_response(messages):
response = Generation.call(
# If you have not configured an environment variable, replace the next line with: api_key="sk-xxx"
api_key=os.getenv("DASHSCOPE_API_KEY"),
model="qwen-plus",
messages=messages,
tools=tools,
result_format="message",
)
return response
# 4. Initialize the conversation history.
messages = [
{
"role": "user",
"content": "What's the weather like in Singapore?"
}
]
# 5. Make the first model call.
response = get_response(messages)
assistant_output = response.output.choices[0].message
messages.append(assistant_output)
# 6. Determine if a tool call is needed.
if "tool_calls" not in assistant_output or not assistant_output["tool_calls"]:
print(f"No tool call needed. Replying directly: {assistant_output['content']}")
else:
# 7. Enter the tool calling loop.
# Loop condition: as long as the latest model response contains a tool call request.
while "tool_calls" in assistant_output and assistant_output["tool_calls"]:
tool_call = assistant_output["tool_calls"][0]
# Parse the tool call information.
func_name = tool_call["function"]["name"]
arguments = json.loads(tool_call["function"]["arguments"])
tool_call_id = tool_call.get("id") # Get the tool_call_id.
print(f"Calling tool [{func_name}] with arguments: {arguments}")
# Run the corresponding tool function.
tool_result = get_current_weather(arguments)
# Construct the tool return message.
tool_message = {
"role": "tool",
"content": tool_result,
"tool_call_id": tool_call_id
}
print(f"Tool returned: {tool_message['content']}")
messages.append(tool_message)
# Call the model again to get a response based on the tool result.
response = get_response(messages)
assistant_output = response.output.choices[0].message
messages.append(assistant_output)
# 8. Output the final natural language response.
print(f"Final assistant response: {assistant_output['content']}")
Java
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.tools.FunctionDefinition;
import com.alibaba.dashscope.tools.ToolCallBase;
import com.alibaba.dashscope.tools.ToolCallFunction;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.utils.JsonUtils;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
public class Main {
/**
* Define the local implementation of the tool.
* @param arguments A JSON string from the model that contains the required parameters for the tool.
* @return A string with the result of the tool execution.
*/
public static String getCurrentWeather(String arguments) {
try {
// The parameters provided by the model are in JSON format and need to be parsed manually.
ObjectMapper objectMapper = new ObjectMapper();
JsonNode argsNode = objectMapper.readTree(arguments);
String location = argsNode.get("location").asText();
// Use a random result to simulate a real API call or business logic.
List<String> weatherConditions = Arrays.asList("Sunny", "Cloudy", "Rainy");
String randomWeather = weatherConditions.get(new Random().nextInt(weatherConditions.size()));
return "Today " + location + " is " + randomWeather + ".";
} catch (Exception e) {
// Handle exceptions to ensure program robustness.
return "Could not parse location parameter.";
}
}
public static void main(String[] args) {
try {
// Describe (register) our tool with the model.
String weatherParamsSchema =
"{\"type\":\"object\",\"properties\":{\"location\":{\"type\":\"string\",\"description\":\"A city or district, such as Singapore, London, or Shanghai.\"}},\"required\":[\"location\"]}";
FunctionDefinition weatherFunction = FunctionDefinition.builder()
.name("get_current_weather") // A unique identifier for the tool, which must correspond to the local implementation.
.description("Useful for when you want to query the weather in a specific city.") // A clear description helps the model decide when to use the tool.
.parameters(JsonUtils.parseString(weatherParamsSchema).getAsJsonObject())
.build();
// If you use a model in the China (Singapore) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
String userInput = "What's the weather like in Singapore?";
List<Message> messages = new ArrayList<>();
messages.add(Message.builder().role(Role.USER.getValue()).content(userInput).build());
// Make the first model call. Send the user's request and our defined tool list to the model.
GenerationParam param = GenerationParam.builder()
.model("qwen-plus") // Specify the model to call.
.apiKey(System.getenv("DASHSCOPE_API_KEY")) // Get the API key from an environment variable.
.messages(messages) // Pass the current conversation history.
.tools(Arrays.asList(ToolFunction.builder().function(weatherFunction).build())) // Pass the list of available tools.
.resultFormat(GenerationParam.ResultFormat.MESSAGE)
.build();
GenerationResult result = gen.call(param);
Message assistantOutput = result.getOutput().getChoices().get(0).getMessage();
messages.add(assistantOutput); // Add the model's first response to the conversation history.
// Check the model's response to see if it requested a tool call.
if (assistantOutput.getToolCalls() == null || assistantOutput.getToolCalls().isEmpty()) {
// Scenario A: The model did not call a tool and answered directly.
System.out.println("No tool call needed for weather query. Replying directly: " + assistantOutput.getContent());
} else {
// Scenario B: The model decided to call a tool.
// Use a while loop to handle scenarios where the model calls tools multiple times in a row.
while (assistantOutput.getToolCalls() != null && !assistantOutput.getToolCalls().isEmpty()) {
ToolCallBase toolCall = assistantOutput.getToolCalls().get(0);
// Parse the specific tool call information (function name, parameters) from the model's response.
ToolCallFunction functionCall = (ToolCallFunction) toolCall;
String funcName = functionCall.getFunction().getName();
String arguments = functionCall.getFunction().getArguments();
System.out.println("Calling tool [" + funcName + "] with arguments: " + arguments);
// Run the corresponding local Java method based on the tool name.
String toolResult = getCurrentWeather(arguments);
// Construct a message with the role "tool" that contains the tool's execution result.
Message toolMessage = Message.builder()
.role("tool")
.toolCallId(toolCall.getId())
.content(toolResult)
.build();
System.out.println("Tool returned: " + toolMessage.getContent());
messages.add(toolMessage); // Add the tool's return result to the conversation history.
// Call the model again.
param.setMessages(messages);
result = gen.call(param);
assistantOutput = result.getOutput().getChoices().get(0).getMessage();
messages.add(assistantOutput);
}
// Print the final, summarized response generated by the model.
System.out.println("Final assistant response: " + assistantOutput.getContent());
}
} catch (NoApiKeyException | InputRequiredException e) {
System.err.println("Error: " + e.getMessage());
} catch (Exception e) {
e.printStackTrace();
}
}
}
After you run the code, you obtain output similar to the following:
Calling tool [get_current_weather] with arguments: {'location': 'Singapore'}
Tool returned: Today Singapore is Cloudy.
Final assistant response: In Singapore, it is cloudy today.
Usage notes
Function calling supports two methods for passing tool information:
Method 1: Pass information using the tools parameter (recommended)
With this method, you make the call by defining tools, creating the messages array, initiating function calling, running the tool function, and finally letting the model summarize the tool function output.
Method 2: Pass information using a system message
When you use the tools parameter, the server automatically adapts and assembles a suitable prompt template based on the model. Therefore, we recommend that you prioritize using the tools parameter. If you do not want to use the tools parameter when you use the Qwen model, see Pass tool information using system message.
The following sections use the OpenAI-compatible calling method as an example. They demonstrate how to pass tool information using the tools parameter and provide a step-by-step guide to the details of function calling.
Assume that your business scenario involves questions about the weather and the current time.
1. Define tools
Tools are the bridge between the LLM and the outside world. You must first define your tools.
1.1. Create tool functions
Create two tool functions: a weather query tool and a time query tool.
Weather query tool
Accepts an arguments parameter in the format {"location": "the location to query"}. The tool's output is a string in the format "Today {Location} is {Weather}.".
For demonstration purposes, the weather query tool defined here does not actually query the weather. It randomly selects from "Sunny", "Cloudy", or "Rainy". In a real business scenario, you can replace it with a service such as the Amap Weather Query API.
Time query tool
The time query tool does not require any input parameters. Its output is a string in the format "Current time: {queried_time}.".
If you use Node.js, run npm install date-fns to install the date-fns package for obtaining the time.
## Step 1: Define tool functions
# Add import for the random module
import random
from datetime import datetime
# Simulate a weather query tool. Example return: "Today Shanghai is Rainy."
def get_current_weather(arguments):
# Define a list of alternative weather conditions
weather_conditions = ["Sunny", "Cloudy", "Rainy"]
# Randomly select a weather condition
random_weather = random.choice(weather_conditions)
# Extract location information from the JSON
location = arguments["location"]
# Return the formatted weather information
return f"Today {location} is {random_weather}."
# A tool to query the current time. Example return: "Current time: 2024-04-15 17:15:18."
def get_current_time():
# Get the current date and time
current_datetime = datetime.now()
# Format the current date and time
formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
# Return the formatted current time
return f"Current time: {formatted_time}."
# Test the tool functions and output the results. You can remove the following four lines of test code when running subsequent steps.
print("Testing tool output:")
print(get_current_weather({"location": "Shanghai"}))
print(get_current_time())
print("\n")
// Step 1: Define tool functions
// Import date-fns for the time query tool
import { format } from 'date-fns';
function getCurrentWeather(args) {
// Define a list of alternative weather conditions
const weatherConditions = ["Sunny", "Cloudy", "Rainy"];
// Randomly select a weather condition
const randomWeather = weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
// Extract location information from the JSON
const location = args.location;
// Return the formatted weather information
return `Today ${location} is ${randomWeather}.`;
}
function getCurrentTime() {
// Get the current date and time
const currentDatetime = new Date();
// Format the current date and time
const formattedTime = format(currentDatetime, 'yyyy-MM-dd HH:mm:ss');
// Return the formatted current time
return `Current time: ${formattedTime}.`;
}
// Test the tool functions and output the results. You can remove the following four lines of test code when running subsequent steps.
console.log("Testing tool output:")
console.log(getCurrentWeather({location:"Shanghai"}));
console.log(getCurrentTime());
console.log("\n")
After you run the code, you obtain output similar to the following:
Testing tool output:
Today Shanghai is Cloudy.
Current time: 2025-01-08 20:21:45.
1.2. Create the tools array
Just as a person needs to understand a tool's purpose, usage conditions, and input parameters before using it, an LLM requires the same information to select tools accurately. Provide this information in the JSON format shown in the code below.
Before you initiate function calling, define a tool information array (tools) in your code. The array contains the function name, description, and parameter definition for each tool, and it is passed as a parameter when you initiate the function calling request.
# Paste the following code after the code from Step 1.
## Step 2: Create the tools array
tools = [
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful for when you want to know the current time.",
"parameters": {}
}
},
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for when you want to query the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "A city or district, such as Singapore, London, or Shanghai.",
}
},
"required": ["location"]
}
}
}
]
tool_names = [tool["function"]["name"] for tool in tools]
print(f"Created {len(tools)} tools: {tool_names}\n")
// Paste the following code after the code from Step 1.
// Step 2: Create the tools array
const tools = [
{
type: "function",
function: {
name: "get_current_time",
description: "Useful for when you want to know the current time.",
parameters: {}
}
},
{
type: "function",
function: {
name: "get_current_weather",
description: "Useful for when you want to query the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "A city or district, such as Singapore, London, or Shanghai.",
}
},
required: ["location"]
}
}
}
];
const toolNames = tools.map(tool => tool.function.name);
console.log(`Created ${tools.length} tools: ${toolNames.join(', ')}\n`);
2. Create the messages array
Function calling uses the messages array to pass instructions and context to the LLM. Before you initiate function calling, the messages array must contain a system message and a user message.
System message
Although you described the purpose of the tools and when to use them when you created the tools array, emphasizing when to call the tools in the system message can improve the accuracy of tool calls. In this scenario, you can set the system prompt to:
You are a helpful assistant. If the user asks about the weather, call the 'get_current_weather' function.
If the user asks about the time, call the 'get_current_time' function.
Please answer in a friendly tone.
User message
The user message passes the user's question. For example, if the user asks "Shanghai weather", the messages array is:
# Step 3: Create the messages array
# Paste the following code after the code from Step 2.
messages = [
{
"role": "system",
"content": """You are a helpful assistant. If the user asks about the weather, call the 'get_current_weather' function.
If the user asks about the time, call the 'get_current_time' function.
Please answer in a friendly tone.""",
},
{
"role": "user",
"content": "Shanghai weather"
}
]
print("messages array created.\n")
// Step 3: Create the messages array
// Paste the following code after the code from Step 2.
const messages = [
{
role: "system",
content: "You are a helpful assistant. If the user asks about the weather, call the 'get_current_weather' function. If the user asks about the time, call the 'get_current_time' function. Please answer in a friendly tone.",
},
{
role: "user",
content: "Shanghai weather"
}
];
console.log("messages array created.\n");
Because the available tools include weather and time queries, you can also ask about the current time.
3. Initiate function calling
Pass the created tools and messages arrays to the LLM to initiate a function call. The LLM decides whether to call a tool. If it does, it returns the tool's function name and input parameters.
For supported models, see Supported Models. Multimodal models are not currently supported.
# Step 4: Initiate function calling
# Paste the following code after the code from Step 3.
from openai import OpenAI
import os
client = OpenAI(
# If you use a model in the China (Singapore) region, you must use an API key from that region. Get it here: https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If you have not configured an environment variable, replace the next line with: api_key="sk-xxx",
api_key=os.getenv("DASHSCOPE_API_KEY"),
# If you use a model in the China (Singapore) region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
def function_calling():
completion = client.chat.completions.create(
# Using qwen-plus as an example. You can change the model name as needed. Model list: https://www.alibabacloud.com/help/en/model-studio/getting-started/models
model="qwen-plus",
messages=messages,
tools=tools
)
print("Returned object:")
print(completion.choices[0].message.model_dump_json())
print("\n")
return completion
print("Initiating function calling...")
completion = function_calling()
// Step 4: Initiate function calling
// Paste the following code after the code from Step 3.
import OpenAI from "openai";
const openai = new OpenAI(
{
// If you use a model in the China (Singapore) region, you must use an API key from that region. Get it here: https://bailian.console.alibabacloud.com/?tab=model#/api-key
// If you have not configured an environment variable, replace the next line with: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
// If you use a model in the China (Singapore) region, replace the baseURL with: https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
}
);
async function functionCalling() {
const completion = await openai.chat.completions.create({
model: "qwen-plus", // Using qwen-plus as an example. You can change the model name as needed. Model list: https://www.alibabacloud.com/help/en/model-studio/getting-started/models
messages: messages,
tools: tools
});
console.log("Returned object:");
console.log(JSON.stringify(completion.choices[0].message));
console.log("\n");
return completion;
}
const completion = await functionCalling();
Because the user asked about the weather in Shanghai, the LLM specifies that the tool function to call is get_current_weather and that the input parameter is {"location": "Shanghai"}.
{
"content": "",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": [
{
"id": "call_6596dafa2a6a46f7a217da",
"function": {
"arguments": "{\"location\": \"Shanghai\"}",
"name": "get_current_weather"
},
"type": "function",
"index": 0
}
]
}
Note that if the LLM determines that no tool is needed, it replies directly through the content parameter. For example, if you input "Hello", the tool_calls parameter is empty and the returned object is in the following format:
{
"content": "Hello! How can I help you? I'm particularly good at answering questions about the weather or the time.",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
}
If the tool_calls parameter is empty, your program can return the content directly without running the following steps.
If you want the model to select a specific tool every time you initiate function calling, see Forced tool calling.
4. Run the tool function
Running the tool function is the key step that turns the LLM's decision into a real action.
Your computing environment, not the LLM, runs the tool function.
Because the LLM can only output strings, you must parse the returned tool function name and input parameters before running the tool function.
Tool function
Create a function_mapper that maps the tool function name string returned by the model to the actual function object.
Input parameters
The input parameters returned by function calling are a JSON string. Parse the string into a JSON object to extract the parameter information.
After parsing, pass the parameters to the tool function and run it to obtain the output.
# Step 5: Run the tool function
# Paste the following code after the code from Step 4.
import json
print("Running the tool function...")
# Get the function name and input parameters from the returned result
function_name = completion.choices[0].message.tool_calls[0].function.name
arguments_string = completion.choices[0].message.tool_calls[0].function.arguments
# Use the json module to parse the parameter string
arguments = json.loads(arguments_string)
# Create a function mapping table
function_mapper = {
"get_current_weather": get_current_weather,
"get_current_time": get_current_time
}
# Get the function object
function = function_mapper[function_name]
# If the input parameters are empty, call the function directly
if arguments == {}:
function_output = function()
# Otherwise, call the function with the parameters
else:
function_output = function(arguments)
# Print the tool's output
print(f"Tool function output: {function_output}\n")
// Step 5: Run the tool function
// Paste the following code after the code from Step 4.
console.log("Running the tool function...");
const function_name = completion.choices[0].message.tool_calls[0].function.name;
const arguments_string = completion.choices[0].message.tool_calls[0].function.arguments;
// Use the JSON module to parse the parameter string
const args = JSON.parse(arguments_string);
// Create a function mapping table
const functionMapper = {
"get_current_weather": getCurrentWeather,
"get_current_time": getCurrentTime
};
// Get the function object
const func = functionMapper[function_name];
// If the input parameters are empty, call the function directly
let functionOutput;
if (Object.keys(args).length === 0) {
functionOutput = func();
} else {
// Otherwise, call the function with the parameters
functionOutput = func(args);
}
// Print the tool's output
console.log(`Tool function output: ${functionOutput}\n`);
After you run the code, you obtain output similar to the following:
Tool function output: Today Shanghai is Cloudy.
In real business scenarios, the core function of many tools is to perform an action (such as sending an email or uploading a file) rather than to query data, so they do not naturally produce a string output. To help the LLM understand the tool's running status, we recommend that such tools return status messages such as "Email sent successfully" or "Operation failed".
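For example, a hypothetical email tool might report its status as follows (send_email and its parameters are illustrative, and the actual delivery logic is a placeholder):
# Hypothetical action tool: performs an operation instead of querying data.
def send_email(arguments):
    try:
        recipient = arguments["recipient"]
        subject = arguments["subject"]
        # Placeholder: call your real delivery logic (for example, smtplib) here.
        print(f"Sending '{subject}' to {recipient} ...")
        return "Email sent successfully."  # status string for the LLM
    except KeyError as e:
        return f"Operation failed: missing parameter {e}."  # status string for the LLM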
5. Let the LLM summarize the tool function output
The output format of tool functions is relatively fixed. If you return the output directly to the user, the tone may be stiff and inflexible. If you want the LLM to generate a natural-language response that combines the user's input and the tool's output, you can submit the tool output to the model's context and send another request to the model.
Add the assistant message
After you initiate function calling, obtain the assistant message from completion.choices[0].message and add it to the messages array.
Add the tool message
Add the tool's output to the messages array in the format {"role": "tool", "content": "tool_output", "tool_call_id": completion.choices[0].message.tool_calls[0].id}.
Note: Make sure the tool's output is a string.
The tool_call_id is a unique identifier that the system generates for each tool call request. The model may request several tool calls at once. When you return multiple tool results to the model, the tool_call_id ensures that each tool's output is matched to the call that requested it.
# Step 6: Submit the tool output to the LLM
# Paste the following code after the code from Step 5.
messages.append(completion.choices[0].message)
print("Assistant message added.")
messages.append({"role": "tool", "content": function_output, "tool_call_id": completion.choices[0].message.tool_calls[0].id})
print("Tool message added.\n")
// Step 6: Submit the tool output to the LLM
// Paste the following code after the code from Step 5.
messages.push(completion.choices[0].message);
console.log("Assistant message added.")
messages.push({
"role": "tool",
"content": functionOutput,
"tool_call_id": completion.choices[0].message.tool_calls[0].id
});
console.log("Tool message added.\n");
The messages array is now:
[
System Message -- Guides the model's tool calling strategy
User Message -- The user's question
Assistant Message -- The tool call information returned by the model
Tool Message -- The tool's output information (there may be multiple Tool Messages if you use parallel tool calling, as described below)
]
After updating the messages array, run the following code.
# Step 7: Let the LLM summarize the tool output
# Paste the following code after the code from Step 6.
print("Summarizing tool output...")
completion = function_calling()
// Step 7: Let the LLM summarize the tool output
// Paste the following code after the code from Step 6.
console.log("Summarizing tool output...");
const completion_1 = await functionCalling();
You can obtain the response from content: "Today, it is cloudy in Shanghai. If you have any other questions, feel free to ask."
{
"content": "Today, it is cloudy in Shanghai. If you have any other questions, feel free to ask.",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
}
You have now completed a full function calling process.
Advanced usage
Multi-turn conversation
A user might ask "What's the weather in Singapore?" in the first turn, and then "What about Shanghai?" in the second. If the model's context does not include the information from the first turn, it cannot determine which tool to call. In multi-turn conversation scenarios, we recommend that you maintain the messages array after each turn. Add the new user message to it and then initiate function calling and the subsequent steps. The messages structure is as follows:
[
System Message -- Guides the model's tool calling strategy
User Message -- The user's question
Assistant Message -- The tool call information returned by the model
Tool Message -- The tool's output information
Assistant Message -- The model's summary of the tool call information
User Message -- The user's second-turn question
]
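A minimal sketch of this pattern in Python, reusing the function_calling helper from the steps above; handle_tool_calls is a hypothetical helper that stands in for Steps 5 to 7 (run the tool, append the assistant and tool messages, and call the model again):
# Sketch: keep one messages list alive across turns.
messages = [
    {"role": "system", "content": "You are a helpful assistant..."},
]

def chat_turn(user_input):
    messages.append({"role": "user", "content": user_input})
    completion = function_calling()          # first model call for this turn
    handle_tool_calls(completion, messages)  # hypothetical: Steps 5-7 above
    return messages[-1]["content"]           # final assistant summary

chat_turn("What's the weather in Singapore?")
chat_turn("What about Shanghai?")  # turn 1 context disambiguates this question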
Streaming
To improve user experience and reduce waiting time, you can use streaming output to obtain the tool function name and input parameters in real time. In this case:
Tool function name: appears only in the first streamed response object (delta).
Input parameters: are returned incrementally across the subsequent chunks.
from openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
# If you use a model in the China (Singapore) region, replace this with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for when you want to query the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "A city or district, such as Singapore, London, or Shanghai.",
}
},
"required": ["location"],
},
},
},
]
stream = client.chat.completions.create(
model="qwen-plus",
messages=[{"role": "user", "content": "London weather?"}],
tools=tools,
stream=True
)
for chunk in stream:
delta = chunk.choices[0].delta
print(delta.tool_calls)
import { OpenAI } from "openai";
const openai = new OpenAI(
{
// If you use a model in the China (Singapore) region, you must use an API key from that region. Get it here: https://bailian.console.alibabacloud.com/?tab=model#/api-key
// If you have not configured an environment variable, replace the next line with: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
// If you use a model in the China (Singapore) region, replace the baseURL with: https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
}
);
const tools = [
{
"type": "function",
"function": {
"name": "getCurrentWeather",
"description": "Useful for when you want to query the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "A city or district, such as Singapore, London, or Shanghai."
}
},
"required": ["location"]
}
}
}
];
const stream = await openai.chat.completions.create({
model: "qwen-plus",
messages: [{ role: "user", content: "Singapore weather" }],
tools: tools,
stream: true,
});
for await (const chunk of stream) {
const delta = chunk.choices[0].delta;
console.log(delta.tool_calls);
}
After you run the code, you obtain output similar to the following:
[ChoiceDeltaToolCall(index=0, id='call_8f08d2b0fc0c4d8fab7123', function=ChoiceDeltaToolCallFunction(arguments='{"location":', name='get_current_weather'), type='function')]
[ChoiceDeltaToolCall(index=0, id='', function=ChoiceDeltaToolCallFunction(arguments=' "London"}', name=None), type='function')]
None
Run the following code to assemble the input parameters (arguments):
tool_calls = {}
for response_chunk in stream:
delta_tool_calls = response_chunk.choices[0].delta.tool_calls
if delta_tool_calls:
for tool_call_chunk in delta_tool_calls:
call_index = tool_call_chunk.index
if call_index not in tool_calls:
tool_calls[call_index] = tool_call_chunk
else:
tool_calls[call_index].function.arguments += tool_call_chunk.function.arguments
print(tool_calls[0].model_dump_json())
const toolCalls = {};
for await (const responseChunk of stream) {
const deltaToolCalls = responseChunk.choices[0]?.delta?.tool_calls;
if (deltaToolCalls) {
for (const toolCallChunk of deltaToolCalls) {
const index = toolCallChunk.index;
if (!toolCalls[index]) {
toolCalls[index] = { ...toolCallChunk };
if (!toolCalls[index].function) {
toolCalls[index].function = { name: '', arguments: '' };
}
}
else if (toolCallChunk.function?.arguments) {
toolCalls[index].function.arguments += toolCallChunk.function.arguments;
}
}
}
}
console.log(JSON.stringify(toolCalls[0]));
You obtain the following output:
{"index":0,"id":"call_16c72bef988a4c6c8cc662","function":{"arguments":"{\"location\": \"London\"}","name":"get_current_weather"},"type":"function"}
When you use the LLM to summarize the tool function output, the assistant message that you append must be in the following format. Replace the elements in tool_calls with the content assembled above.
{
"content": "",
"refusal": None,
"role": "assistant",
"audio": None,
"function_call": None,
"tool_calls": [
{
"id": "call_xxx",
"function": {
"arguments": '{"location": "xx"}',
"name": "get_current_weather",
},
"type": "function",
"index": 0,
}
],
}
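For example, a sketch that builds this assistant message from the tool_calls dict assembled in the previous snippet and appends it to the conversation (Python, following the names used above):
# Sketch: convert the accumulated stream chunks into an assistant message.
assembled = tool_calls[0]  # from the assembly loop above
assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "id": assembled.id,
            "type": "function",
            "index": assembled.index,
            "function": {
                "name": assembled.function.name,
                "arguments": assembled.function.arguments,
            },
        }
    ],
}
messages.append(assistant_message)
# Then append the tool message and call the model again as in Step 6.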
Specify the tool calling method
Parallel tool calling
A weather query for a single city requires only one tool call, but some questions require several, such as "What's the weather in Singapore and Shanghai?" or "What's the weather in London, and what time is it now?". By default, only one piece of tool call information is returned after you initiate function calling. For example, for the question "What's the weather in Singapore and Shanghai?":
{
"content": "",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": [
{
"id": "call_61a2bbd82a8042289f1ff2",
"function": {
"arguments": "{\"location\": \"Singapore\"}",
"name": "get_current_weather"
},
"type": "function",
"index": 0
}
]
}
The result contains only the input parameters for Singapore. To solve this, set the request parameter parallel_tool_calls to true when you initiate function calling. The returned object will then contain all the tool functions and input parameters that need to be called.
Parallel tool calling is suitable for tasks that have no dependencies. If dependencies exist between tasks (for example, the input for Tool A depends on the output of Tool B), see Getting started, and use a while loop to implement serial tool calling (calling one tool at a time).
def function_calling():
completion = client.chat.completions.create(
model="qwen-plus", # Using qwen-plus as an example. You can change the model name as needed.
messages=messages,
tools=tools,
# New parameter
parallel_tool_calls=True
)
print("Returned object:")
print(completion.choices[0].message.model_dump_json())
print("\n")
return completion
print("Initiating function calling...")
completion = function_calling()
async function functionCalling() {
const completion = await openai.chat.completions.create({
model: "qwen-plus", // Using qwen-plus as an example. You can change the model name as needed.
messages: messages,
tools: tools,
parallel_tool_calls: true
});
console.log("Returned object:");
console.log(JSON.stringify(completion.choices[0].message));
console.log("\n");
return completion;
}
const completion = await functionCalling();
The tool_calls array in the returned object now contains the input parameters for both Singapore and Shanghai:
{
"content": "",
"role": "assistant",
"tool_calls": [
{
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Singapore\"}"
},
"index": 0,
"id": "call_c2d8a3a24c4d4929b26ae2",
"type": "function"
},
{
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Shanghai\"}"
},
"index": 1,
"id": "call_dc7f2f678f1944da9194cd",
"type": "function"
}
]
}
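After a parallel response, run every tool call and append one tool message per call before the summarizing model call. A sketch, assuming the function_mapper and function_calling helpers from the step-by-step guide:
# Sketch: execute all tool calls from a parallel response.
import json

assistant_message = completion.choices[0].message
messages.append(assistant_message)
for tool_call in assistant_message.tool_calls:
    func = function_mapper[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)
    output = func(args) if args else func()
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,  # ties each result to its request
        "content": output,
    })
# Call the model again to summarize all tool outputs in one response.
completion = function_calling()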
Forced tool calling
LLM-generated content has a degree of uncertainty, and the model may sometimes choose the wrong tool. If you want the LLM to follow a fixed strategy for a certain type of question (such as forcing the use of a specific tool or forcing no tool use), modify the tool_choice parameter. Its default value is "auto", which means the LLM decides on its own how to call tools.
When the LLM summarizes the tool function output, remove the tool_choice parameter. Otherwise, the API will continue to return tool call information.
Force a specific tool
If you want function calling to always use a specific tool for a certain type of question, set the tool_choice parameter to {"type": "function", "function": {"name": "the_function_to_call"}}. The LLM then skips tool selection and only outputs the input parameters.
Assume that the current scenario involves only weather queries. You can modify the function_calling code as follows:
def function_calling():
    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
        tools=tools,
        tool_choice={"type": "function", "function": {"name": "get_current_weather"}}
    )
    print(completion.model_dump_json())
function_calling()
async function functionCalling() {
    const response = await openai.chat.completions.create({
        model: "qwen-plus",
        messages: messages,
        tools: tools,
        tool_choice: {"type": "function", "function": {"name": "get_current_weather"}}
    });
    console.log("Returned object:");
    console.log(JSON.stringify(response.choices[0].message));
    console.log("\n");
    return response;
}
const response = await functionCalling();
No matter what question is asked, the tool function in the returned object is always get_current_weather.
Before you use this strategy, make sure the question is relevant to the selected tool. Otherwise, you may obtain unexpected results.
Force no tools
If you never want a tool call to be made, regardless of the input question (the returned object carries the response in content and the tool_calls parameter is empty), set the tool_choice parameter to "none", or simply do not pass the tools parameter. The tool_calls parameter returned by function calling will then always be empty.
Assume that no question in the current scenario requires a tool call. You can modify the function_calling code as follows:
def function_calling():
    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
        tools=tools,
        tool_choice="none"
    )
    print(completion.model_dump_json())
function_calling()
async function functionCalling() {
    const completion = await openai.chat.completions.create({
        model: "qwen-plus",
        messages: messages,
        tools: tools,
        tool_choice: "none"
    });
    console.log("Returned object:");
    console.log(JSON.stringify(completion.choices[0].message));
    console.log("\n");
    return completion;
}
const completion = await functionCalling();
Going live
Test tool calling accuracy
Establish an evaluation system:
Build a test dataset that is close to your real business scenarios and define clear evaluation metrics, such as tool selection accuracy, parameter extraction accuracy, and end-to-end success rate (see the sketch after this list).
Optimize prompts
The core tuning method is to optimize system prompts, tool descriptions, and parameter descriptions based on the specific problems exposed during testing (such as wrong tool selection or incorrect parameters).
Upgrade the model
When prompt tuning no longer improves performance, upgrading to a more capable model version (such as qwen3-max-preview) is the most direct and effective way to improve metrics.
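A minimal sketch of measuring tool selection accuracy, assuming a hand-labeled test set and the client and tools objects from the earlier examples:
# Sketch: measure tool selection accuracy over a labeled test set.
test_cases = [
    {"question": "Shanghai weather", "expected_tool": "get_current_weather"},
    {"question": "What time is it?", "expected_tool": "get_current_time"},
]

correct = 0
for case in test_cases:
    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=[{"role": "user", "content": case["question"]}],
        tools=tools,
    )
    calls = completion.choices[0].message.tool_calls
    picked = calls[0].function.name if calls else None
    correct += picked == case["expected_tool"]
print(f"Tool selection accuracy: {correct / len(test_cases):.0%}")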
Dynamically control the number of tools
When an application integrates dozens or even hundreds of tools, providing the entire tool library to the model can cause the following problems:
Performance drop: selecting the correct tool from an enormous toolset is dramatically harder for the model.
Cost and latency: the many tool descriptions consume a large number of input tokens, which increases costs and slows responses.
The solution is to add a tool routing or retrieval layer before calling the model. This layer quickly and accurately filters a small, relevant subset of tools from the complete tool library based on the user's current query, and then provides it to the model.
Common methods for implementing tool routing:
Semantic search
In advance, convert the description of each tool into an embedding using an embedding model and store the embeddings in a vector store. When a user submits a query, use vector similarity search to recall the top-K most relevant tools (see the sketch after this list).
Hybrid search
Combine the fuzzy matching capability of semantic search with the exact-match capability of traditional keywords or metadata tags. You can add tags or keywords fields to the tools. Running vector search and keyword filtering at the same time can significantly improve recall precision for high-frequency or specific scenarios.
Lightweight LLM router
For more complex routing logic, use a smaller, faster, and cheaper model (such as Qwen-Flash) as a front-end routing model. Its task is to output a list of relevant tool names based on the user's question.
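A minimal sketch of semantic tool routing in Python; embed is a placeholder for whichever embedding model you use, and a real system would use a vector store instead of the in-memory dict:
import math

def embed(text):
    # Placeholder: call your embedding model here and return a float vector.
    raise NotImplementedError

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Offline: embed every tool description once and cache the vectors.
tool_vectors = {
    t["function"]["name"]: embed(t["function"]["description"]) for t in tools
}

def route_tools(query, top_k=5):
    # Online: recall the top-K tools most similar to the user's query.
    query_vec = embed(query)
    ranked = sorted(
        tool_vectors,
        key=lambda name: cosine(query_vec, tool_vectors[name]),
        reverse=True,
    )
    return ranked[:top_k]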
Best practices
Keep the candidate set concise: regardless of the method used, the number of tools provided to the main model should generally not exceed 20. This keeps a good balance between the model's cognitive load, cost, latency, and accuracy.
Layered filtering policy: build a funnel-style routing policy. For example, run a very low-cost keyword or rule match first to filter out obviously irrelevant tools, and then perform a semantic search on the remainder to improve efficiency and quality.
Tool security principles
When you open up tool execution capabilities to an LLM, you must prioritize security. The core principles are "least privilege" and "human confirmation".
Principle of least privilege: The toolset provided to the model should strictly adhere to the principle of least privilege. By default, tools should be read-only (such as querying weather or searching documents). Avoid directly providing any "write" permissions that involve state changes or resource operations.
Isolate dangerous tools: do not directly expose dangerous tools to the LLM, such as tools that execute arbitrary code (a code interpreter), operate on the file system (fs.delete), perform database delete or update operations (db.drop_table), or involve financial transactions (payment.transfer).
Human-in-the-loop: for all high-privilege or irreversible operations, introduce a manual review and confirmation step. The model can generate the operation request, but a human user must click the final execution "button". For example, the model can draft an email, but the user must confirm it before it is sent.
Improve user experience
The function calling chain is long, and a problem in any link can lead to a poor user experience.
Handle tool failures
Tool execution failures are common. You can adopt the following strategies:
Maximum retries: Set a reasonable retry limit, for example, 3 times, to avoid long user waits or wasted system resources due to continuous failures.
Provide a fallback message: When retries are exhausted or an unresolvable error occurs, you should return a clear and friendly prompt to the user, such as: "Sorry, I can't find the relevant information at the moment. The service might be busy. Please try again later."
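A minimal sketch of this strategy, with the retry limit and fallback message as configurable assumptions:
import time

FALLBACK = ("Sorry, I can't find the relevant information at the moment. "
            "The service might be busy. Please try again later.")

def run_tool_with_retries(func, args, max_retries=3):
    # Retry a flaky tool a bounded number of times, then fall back to a
    # friendly message instead of surfacing a raw error to the user.
    for attempt in range(1, max_retries + 1):
        try:
            return func(args)
        except Exception:
            if attempt == max_retries:
                return FALLBACK
            time.sleep(1)  # brief pause before the next attempt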
Manage processing latency
High latency reduces user satisfaction and can be improved through frontend interaction and backend optimization.
Set a timeout: Set an independent and reasonable timeout for each step of function calling. If a timeout occurs, immediately interrupt the operation and provide feedback.
Provide instant feedback: When you start executing a function call, we recommend displaying a prompt on the interface, such as "Querying the weather for you ..." or "Searching for relevant information ...", to provide real-time feedback on the processing progress.
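A minimal sketch of a per-step timeout using Python's standard library; the 10-second limit is an assumption to adjust per step:
import concurrent.futures

def run_with_timeout(func, args, timeout_seconds=10):
    # Run one step of the function calling chain with an independent timeout.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(func, args)
        try:
            return future.result(timeout=timeout_seconds)
        except concurrent.futures.TimeoutError:
            # Stop waiting and give the user immediate feedback;
            # the abandoned worker thread finishes in the background.
            return "The request timed out. Please try again later."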
Supported models
DeepSeek (China (Beijing) region only, including deepseek-r1, deepseek-r1-0528, and deepseek-v3)
The QwQ models and the Qwen3 models in thinking mode support function calling, but their usage differs slightly from that of the models above. For more information, see Function calling.
Qwen multimodal models are not currently supported.
We recommend that you choose Qwen-Plus because it offers a good balance of performance, speed, and cost.
If you require high response speed and cost control, we recommend the Qwen-Flash series for commercial models and the small-parameter models of the Qwen3 series for open-source models.
If you require high response accuracy, for commercial models, use the Qwen-Max series. For open-source models, use large-parameter models from the Qwen3 series.
Billing
In addition to the tokens in the messages array, the tool descriptions are also added to the prompt as input tokens and are billed accordingly.
Pass tool information using system message
Error codes
If a call fails, see Error messages for troubleshooting.