PRITHIVSAKTHIUR
🎯 Focusing

Organizations

@Stranger-Zone @Stranger-Guard


Demonstration for the Lightricks LTX-2 Distilled model, enhanced with specialized LoRA adapters for cinematic camera movements (dolly left/right/in/out, jib up/down, static). Generates animated vid…

Python 1 Updated Jan 11, 2026

Demonstration for the Qwen/Qwen-Image-Edit-2511 model, specialized in object manipulation via lazy-loaded LoRA adapters. Supports adding or removing specific elements (e.g., logos, accessories, clo…

Python 1 Updated Jan 5, 2026

Demonstration for the Qwen-Image-Edit-2511 model with lazy-loaded LoRA adapters for advanced single- and multi-image editing. Supports 7 specialized LoRAs including photo-to-anime, multi-angle came…

Python 2 Updated Jan 16, 2026

A Gradio app with Rerun visualization for Microsoft's TRELLIS.2-4B model that generates textured 3D assets (GLB) from text or images using a two-stage pipeline: text-to-image (Z-Image-Turbo) then i…

Python 4 Updated Dec 28, 2025

Experimental demonstration for the Qwen/Qwen-Image-Edit-2511 model with lazy-loaded LoRA adapters supporting multi-image input editing. Users can upload one or more images (gallery format) and appl…

Python 7 2 Updated Dec 27, 2025

Experimental demonstration for the Qwen/Qwen-Image-Edit-2511 model with lazy-loaded LoRA adapters for single-image editing tasks. Features specialized edits like photo-to-anime conversion and multi…

Python 1 Updated Dec 27, 2025

Demonstration for NVIDIA's Nemotron-Parse-v1.1 model, designed for advanced document parsing and OCR. Upload images of documents (e.g., papers, forms) to extract structured content: text, tables (L…

Python 2 Updated Dec 24, 2025

Demonstration for the Qwen/Qwen-Image-Edit-2509 model, featuring lazy-loaded LoRA adapters for fast, specialized image edits like photo-to-anime conversion, angle changes, lighting restoration, ski…

Python 1 Updated Dec 23, 2025

Demonstration for the Qwen/Qwen-Image-Edit-2509 model, enhanced with lazy-loaded LoRA adapters for specialized image editing tasks like texture application, object fusion, material transfer, and li…

Python 1 Updated Dec 22, 2025

A Gradio-based demonstration for the AllenAI SAGE-MM-Qwen3-VL-4B-SFT_RL multimodal model, specialized in video reasoning tasks. Users upload MP4 videos, provide natural language prompts (e.g., "Des…

Python 5 Updated Dec 21, 2025

TRELLIS.2-Text-to-3D is an end-to-end Text-to-3D and Image-to-3D generation app that enables users to create high-quality 3D GLB assets either by generating an image from a text prompt or by upload…

Python 1 Updated Dec 22, 2025

A Gradio-based demonstration for the AllenAI Molmo2-8B multimodal model, enabling image QA, multi-image pointing, video QA, and temporal tracking. Users upload images or videos, provide natural lan…

Python 3 Updated Dec 24, 2025

A Gradio-based demonstration for the Tongyi-MAI/Z-Image-Turbo diffusion pipeline, enhanced with a curated collection of LoRAs (Low-Rank Adaptations) for style transfer and creative image generation…

Python 2 1 Updated Dec 24, 2025

A Gradio-based demonstration for the prithivMLmods/Gliese-CUA-Tool-Call-8B model, specialized in GUI element localization. Users upload UI screenshots, provide task instructions (e.g., "Click on t…

Python 1 Updated Dec 15, 2025

A Gradio-based demonstration for the prithivMLmods/Gliese-CUA-Tool-Call-8B model, a Computer Use Agent (CUA) specialized in GUI understanding and tool-calling actions.

Python 1 Updated Dec 15, 2025

Demo: Herculis-CUA-GUI-Actioner-4B is a Computer Use Agent (CUA) multimodal model designed for GUI understanding, UI localization, and action execution across web, desktop, and mobile environments.

Python 1 Updated Dec 14, 2025

Mergekit supports various architectures, such as Llama, Mistral, and others, enabling merges on CPU or GPU with low memory needs through lazy tensor loading and out-of-core processing. It handles …

Python 1 Updated Dec 14, 2025

A Gradio-based demonstration for Computer Use Agent (CUA) tasks, supporting multiple vision-language models: Microsoft Fara-7B, ByteDance UI-TARS-1.5-7B, Hcompany Holo2-4B, and Uniphore ActIO-UI-7B…

Python 3 Updated Dec 24, 2025

A Gradio-based demonstration for the Microsoft Fara-7B model, designed as a computer use agent. Users upload UI screenshots (e.g., desktop or app interfaces), provide task instructions (e.g., "Clic…

Python 5 Updated Dec 8, 2025

A Gradio-based demo for end-to-end vision-to-speech inference: Extract text or descriptions from images using Qwen2.5-VL-7B-Instruct, then convert to natural speech audio via Microsoft VibeVoice-Re…

Python 3 1 Updated Dec 8, 2025

A Gradio-based demonstration application for the Tencent HunyuanOCR model, focused on optical character recognition (OCR) tasks such as text detection, extraction, and coordinate formatting from im…

Python 3 Updated Dec 4, 2025

A Gradio-based demo application for comparing state-of-the-art OCR models: DeepSeek-OCR, Dots.OCR, HunyuanOCR, and Nanonets-OCR2-3B.

Python 9 2 Updated Nov 28, 2025

A production-ready tool for libraries to retrieve digital copies from Google Books.

Python 10 1 Updated Jan 5, 2026

A web-based sketching application that allows users to draw or sketch ideas on a canvas and transform them into generated images using Google's Gemini AI models.

TypeScript 5 Updated Nov 23, 2025

Qwen-Image-Edit-2509-LoRAs-Fast-Fusion is a fast, interactive web application built with Gradio that enables advanced image editing using the Qwen/Qwen-Image-Edit-2509 model from Alibaba's Qwen tea…

Python 5 3 Updated Dec 12, 2025

SAM3 Image Segmentation is a user-friendly web application built with Gradio that leverages the Segment Anything Model 3 (SAM3) from Meta AI to perform zero-shot instance segmentation on images usi…

Python 6 1 Updated Nov 22, 2025

A demo of the Qwen3-VL-4B-Instruct model from Alibaba's Qwen series for multimodal tasks involving images and text. It enables users to upload an image and perform various vision-language tasks, such as querying…

Python 5 Updated Nov 18, 2025

Demonstrates fine-tuning a large-scale pretrained model, MetaCLIP 2, for a specific downstream task: image classification.

Jupyter Notebook 2 Updated Nov 15, 2025

Qwen-Image-Edit-2509-LoRAs-Fast is a high-performance, user-friendly web application built with Gradio that leverages the advanced Qwen/Qwen-Image-Edit-2509 model from Hugging Face for seamless ima…

Python 14 2 Updated Dec 23, 2025