qwen2-vl
Here are 55 public repositories matching this topic...
A sample for trying out QwenLM/Qwen2-VL on Colaboratory
Updated Sep 4, 2024 - Jupyter Notebook
A LLaMA-Factory fine-tuning case study for Qwen2-VL in the culture and tourism domain (historical literature and museums)
Updated Sep 17, 2024
Practical projects using LLMs, VLMs, and diffusion models
Updated Sep 29, 2024 - Jupyter Notebook
This project demonstrates how to use the Qwen2-VL model from Hugging Face for Optical Character Recognition (OCR) and Visual Question Answering (VQA). The model combines vision and language capabilities, enabling users to analyze images and generate context-based responses.
Updated Oct 18, 2024 - Jupyter Notebook
An intelligent search assistant based on multimodal large models, enabling smart information retrieval and knowledge integration on the Xiaohongshu platform.
Updated Nov 6, 2024 - Python
An open-source server implementation for running inference with Qwen2-VL series models using FastAPI.
Updated Nov 20, 2024 - Python
This repo contains the winning code for Amazon ML Challenge 2024. The challenge was to develop a Machine Learning model to extract product entity details directly from the product images.
Updated Nov 30, 2024 - Python
"Smart Vision Technology for Quality Control" uses computer vision to automate product inspections, extracting details like product name, quantity, expiry date, and freshness from images. Built for Flipkart Grid 6.0, it enhances accuracy and efficiency in quality control, minimizing manual checks.
Updated Dec 4, 2024 - Jupyter Notebook
Updated Dec 27, 2024 - Python
We fine-tune an Unsloth Llama model to extract mathematical formulas from images with optical character recognition (OCR).
Updated Jan 8, 2025 - Jupyter Notebook
This project performs multimodal document analysis and query retrieval by downloading PDFs, converting pages to images, indexing them for semantic search, and analyzing retrieved images using visual-language models like Qwen2VL and Blip2.
Updated Jan 11, 2025 - Jupyter Notebook
A workshop collecting multimodal LLM examples, samples, reference architectures, and demos on Amazon SageMaker.
Updated Mar 16, 2025 - Jupyter Notebook
A multimodal RAG application using Qwen 2.5 VL, ColPali, and QdrantDB for text and image-based retrieval.
Updated Mar 20, 2025 - Jupyter Notebook
Messy Handwriting OCR Comparison Between Aya-Vision-8B and Qwen2VL-OCR-2B
Updated Mar 22, 2025 - Python
A Challenging Multi-Modal Mathematical Reasoning Benchmark
Updated Apr 13, 2025 - JavaScript
An AI-powered document organizer tool. It displays a small, cute robot on the screen. Give it any file and an optional short description, and it will analyse the contents and description and save the file to the cloud. When needed, just double-click the robot and enter the description or keywords for the file you are looking for; it will open the best matching file/
Updated Apr 15, 2025 - C