Fix prompt format for image description in prompt array #1516

Merged
sebastianbenz merged 4 commits into GoogleChrome:main from the fix/update-multimodal-prompt branch on Jul 29, 2025

Conversation

vinlim
Contributor

@vinlim vinlim commented Jul 28, 2025

Fix the prompt format from

const prompt = [
    `Please provide a functional, objective description of the provided image in no more than around 30 words so that someone who could not see it would be able to imagine it. If possible, follow an “object-action-context” framework. The object is the main focus. The action describes what’s happening, usually what the object is doing. The context describes the surrounding environment. If there is text found in the image, do your best to transcribe the important bits, even if it extends the word count beyond 30 words. It should not contain quotation marks, as those tend to cause issues when rendered on the web. If there is no text found in the image, then there is no need to mention it. You should not begin the description with any variation of “The image”.`,
    { type: 'image', content: imageBitmap }
  ];

to the format now required:

const prompt = [{
    role: 'user',
    content: [
      {
        type: 'text',
        value: `Please provide a functional, objective description of the provided image in no more than around 30 words so that someone who could not see it would be able to imagine it. If possible, follow an “object-action-context” framework. The object is the main focus. The action describes what’s happening, usually what the object is doing. The context describes the surrounding environment. If there is text found in the image, do your best to transcribe the important bits, even if it extends the word count beyond 30 words. It should not contain quotation marks, as those tend to cause issues when rendered on the web. If there is no text found in the image, then there is no need to mention it. You should not begin the description with any variation of “The image”.`
      },
      { type: 'image', value: imageBitmap }
    ]
  }];
  return await session.prompt(prompt);
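
The change above can be summarized as a pair of small helpers. This is only a sketch: `buildImagePrompt` and `migrateLegacyPrompt` are hypothetical names that are not part of this PR, and the only shapes assumed are the two prompt formats shown in the description (a flat array with `content` keys before, a role-based message with `value` keys after).

```javascript
// Hypothetical helper (not part of this PR): wraps instruction text and an
// image into the role-based message shape shown above.
function buildImagePrompt(text, image) {
  return [{
    role: 'user',
    content: [
      { type: 'text', value: text },
      { type: 'image', value: image },
    ],
  }];
}

// Hypothetical converter from the old flat array (bare strings plus
// `{ type, content }` parts) to the new shape. Assumes every non-string
// entry carries its payload under `content`, as in the old snippet.
function migrateLegacyPrompt(legacyPrompt) {
  const content = legacyPrompt.map((part) =>
    typeof part === 'string'
      ? { type: 'text', value: part }
      : { type: part.type, value: part.content }
  );
  return [{ role: 'user', content }];
}
```

With a `session` and `imageBitmap` as in the snippet above, the call would then read `await session.prompt(buildImagePrompt(instructions, imageBitmap))`, where `instructions` stands for the description text.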

google-cla bot commented Jul 28, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@sebastianbenz
Collaborator

Thanks for the fix! Can you please sign the CLA?

@vinlim vinlim force-pushed the fix/update-multimodal-prompt branch from b61b2ae to 85d6b33 Compare July 28, 2025 08:34
@vinlim vinlim force-pushed the fix/update-multimodal-prompt branch from 85d6b33 to bec58f5 Compare July 28, 2025 08:42
@vinlim
Contributor Author

vinlim commented Jul 28, 2025

No problem. I fixed the commit author for CLA purposes. Thanks.

vinlim added 2 commits July 28, 2025 22:11
* Fix syntax error

* Fix prompt format for audio transcription in session.promptStreaming
@vinlim
Contributor Author

vinlim commented Jul 29, 2025

The CI/lint failure is caused by #1515. It can be fixed by merging #1518 first.

@sebastianbenz
Collaborator

Thanks a lot!

@sebastianbenz sebastianbenz merged commit 5f6f02b into GoogleChrome:main Jul 29, 2025
2 checks passed
2 participants