OCR scan of screenshot #603
Comments
I've been thinking about this feature for quite some time but haven't found any simple solution. We need to figure out what external library can be used for this and how to interact with it. But it would definitely be a nice addition to ksnip. |
https://github.com/tesseract-ocr/ might be something that could work. |
Exactly! ShareX uses the tesseract OCR engine. And it works for them. |
@fnkabit what do you think about this feature? Would be nice to have something like this in the application but I personally haven't worked with OCR libraries yet. Do you have any experience? |
@DamirPorobic . Sure, would like to work on this. |
@DamirPorobic To answer your question, I don't have any experience working with OCR. |
Maybe @hum4nizer can describe how it works in ShareX, but for a start I had in mind a button in the file menu that triggers OCR and displays a dialog with all the text it has found. Later on we can get more fancy. Regarding the code, I think it would be nice to have a clean separation: hide the OCR stuff behind an adapter and an interface. Also, we should consider that Tesseract might not always be available when building ksnip, so cmake should check for it at build time, and when it is not found the option in the file menu should be grayed out, something like that. This is probably a larger feature, so maybe start small and see how it works. |
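A minimal sketch of the separation described above, in plain C++ without Qt or Tesseract so it stays self-contained. Every name here (`IOcrEngine`, `NullOcrEngine`, `StubOcrEngine`, `MenuAction`, the `KSNIP_HAS_OCR` macro) is hypothetical and not taken from the ksnip code base. The stub stands in for a real Tesseract adapter, and the macro stands in for whatever a cmake check would define when the library is found.

```cpp
#include <memory>
#include <string>

// Hypothetical interface: the rest of the application only talks to this type.
class IOcrEngine {
public:
    virtual ~IOcrEngine() = default;
    virtual bool isAvailable() const = 0;
    virtual std::string recognize(const std::string &imagePath) const = 0;
};

// Used when no OCR library was found at build time.
class NullOcrEngine : public IOcrEngine {
public:
    bool isAvailable() const override { return false; }
    std::string recognize(const std::string &) const override { return std::string(); }
};

// Stand-in for an adapter that would wrap the real OCR library.
// It returns fixed text; a real adapter would call into the library here.
class StubOcrEngine : public IOcrEngine {
public:
    bool isAvailable() const override { return true; }
    std::string recognize(const std::string &) const override { return "recognized text"; }
};

// Factory: KSNIP_HAS_OCR is an assumed macro that a build-time check could define.
std::unique_ptr<IOcrEngine> createOcrEngine() {
#ifdef KSNIP_HAS_OCR
    return std::unique_ptr<IOcrEngine>(new StubOcrEngine());
#else
    return std::unique_ptr<IOcrEngine>(new NullOcrEngine());
#endif
}

// Simplified stand-in for a menu entry; in a Qt application this would be a QAction.
struct MenuAction {
    std::string text;
    bool enabled;
};

// The menu entry is grayed out (disabled) when the engine is not available.
MenuAction makeOcrAction(const IOcrEngine &engine) {
    return MenuAction{"OCR", engine.isAvailable()};
}
```

With this shape, the menu code never needs to know which OCR library is in use: it asks the engine whether it is available and enables or disables the entry accordingly.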
@DamirPorobic Sounds good. |
Not far from what I had in mind. Would be cool to be able to do something like this. |
@DamirPorobic Thanks ! I will implement this next week. |
Hi! I'm really excited about this new feature. Good luck with the implementation. |
@hum4nizer Thank you ! Sorry guys, didn't have time to start working on this; have been really busy the last two weeks. |
I'm hoping to look into this for the next minor release. One issue I'm still thinking about is which way to go: either implement it in ksnip or call an external tool from ksnip. What worries me about implementing it in ksnip directly is the size of such OCR software; it seems to be much larger than ksnip. |
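For the "call an external tool" option, a rough sketch of how the invocation could be assembled, assuming the `tesseract` command-line program is installed and on the PATH. The `OcrCommand` struct and `buildTesseractCommand` function are hypothetical names for illustration. The arguments rely on the Tesseract CLI convention of passing `stdout` as the output base so the recognized text is printed rather than written to a file, and `-l` to select the language.

```cpp
#include <string>
#include <vector>

// Program name plus argument list, kept separate so no shell quoting is needed.
struct OcrCommand {
    std::string program;
    std::vector<std::string> arguments;
};

// Builds the equivalent of: tesseract <imagePath> stdout -l <language>
// The default language "eng" is an assumption for this sketch.
OcrCommand buildTesseractCommand(const std::string &imagePath,
                                 const std::string &language = "eng") {
    return OcrCommand{"tesseract", {imagePath, "stdout", "-l", language}};
}
```

In a Qt application, the program and argument list could be handed to `QProcess` and the text read back from its standard output. This keeps the OCR software out of the ksnip binary, at the cost of requiring users to install it separately.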
Maybe a plugin approach could be doable, something like done here https://doc.qt.io/qt-5/qtwidgets-tools-echoplugin-example.html |
There's https://ocr.space/OCRAPI Each user could register her/his own API key |
I had more of a local version in mind, without an API. You trigger OCR and get a dialog window with all the text that you can then copy or whatever. |
I've contributed the API option because you said:
When installing the tesseract package under Ubuntu, it takes 16,3 MB (tesseract-ocr, tesseract-ocr-eng, tesseract-ocr-osd). If I add tesseract-ocr-fra (French language), it takes 1 145 KB. ksnip-1.9.1.deb is 710 KB. There's also https://github.com/PaddlePaddle/PaddleOCR, it says on the github page:
I haven't used https://ocr.space/OCRAPI directly. I know about it because I'm using the Copyfish browser extension (https://addons.mozilla.org/fr/firefox/addon/copyfish-ocr-software/) which works well for my needs (copy text from images or PDF files while I'm browsing). |
It looks way better in the command line test for sure! Good luck with the development. I'm really looking forward to this feature. Thanks! |
This is implemented now, I have to write some tests but in general can be tested now. Let me know what you think. |
This looks really amazing, how can I test it? When I click the OCR button on Options sm26@sm26-Latitude-3420:~$ ksnip -v OCR used:ksnip-plugin-ocr-0.1.0-continuous.deb |
That should be actually working. Do you see any message text there saying that the text is being processed? |
@DamirPorobic Nope, I don't see any message. Where should I look? The OCR window is a text box I can edit, but it doesn't have anything in there ATM. |
Before the text box where you can write appears, there should be a label in its place saying something like "Processing text..."; when processing is done, the label is hidden and the text box comes up. Strange, doesn't look right. One more thing, can you try some black text on a white background? Maybe with a few sentences so the process takes a few seconds more. |
Ok, thanks, must have a look into the code, it seems to have a bug |
Hey all, how do I build the ksnip-plugin-ocr? There's no makefile, and there are no instructions for building it on Windows. Thanks! |
There are prebuilt binaries, also for Windows: https://github.com/ksnip/ksnip-plugin-ocr/releases Building it locally is quite cumbersome due to the many dependencies of OCR. If you still want to build it locally, you can see how the pipeline builds it: https://github.com/ksnip/ksnip-plugin-ocr/blob/master/.github/workflows/windows.yml |
Okay, thanks! That's good enough. For some reason I hadn't seen it. I thought it was a tarball or something. |
Is your feature request related to a problem? Please describe.
No. It is not related to a problem.
Describe the solution you'd like
I would like the feature to OCR scan the screenshot when a screenshot is taken to extract text from the picture.
Additional context
Thanks a bunch for an awesome piece of software!