- TU Wien
- Vienna, Austria
- https://linty5.github.io/
Stars
TransNet: A deep network for fast detection of common shot transitions
TransNet V2: Shot Boundary Detection Neural Network
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection - CVPR NAS 2023
Large-scale, Fast and Accurate Shot Boundary Detection through Spatio-temporal Convolutional Neural Networks
ClipShots is the first large-scale dataset for shot boundary detection, collected from YouTube and Weibo and covering more than 20 categories, including sports, TV shows, and animals.
Official implementation of "Implicit Neural Representations with Periodic Activation Functions"
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures
A collection of loss functions for medical image segmentation
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word, line and paragraph level annotations.
This is the official repository for our ECCV 2022 paper titled, "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing"
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
(Pattern Recognition) PyTorch implementation of "HTR-VT: Handwritten Text Recognition with Vision Transformer"
[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective
Papers, datasets, algorithms, and SOTA for STR. Maintained long-term.
A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".
A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model".
PyTorch code and models for the DINOv2 self-supervised learning method.