- Jul 2024
-
scholar.google.com
-
Reversible Object-Oriented Interpreters
-
-
-
How about some self-help?!
This passive "we are consumers" crap is exactly the problem...
I bought the print book for 22 euros, cut off the spine with a circular saw, and ran the 208 pages through my ADF scanner (Brother ADS-3000N, 150 EUR used). Without preparation that is maybe half an hour of work. Then rotate, crop, and level the scans, and run them through tesseract. For tesseract you need a fast CPU.
Right now I am proofreading the hOCR files from tesseract; later I will make a PDF out of them and upload it to annas-archive.org via libgen.rs - one problem less.
I have uploaded the hOCR files to https://github.com/milahu/enteignung - maybe someone wants to help with the proofreading, then it will go 1 or 2 days faster.
Man oh man... as an "IT insider" I am so bored by the normies who got stuck 20 years ago when it comes to IT and have no clue about linux, git, python, torproject, monero, ... but the main thing is talking crap on telegram >:(
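The proofreading step described above can be narrowed to the words Tesseract itself was unsure about. Here is a minimal sketch in Python, assuming the hOCR files carry the per-word `x_wconf` confidence value that Tesseract normally writes into the `title` attribute of each `ocrx_word` span. The function names, the threshold of 80, and the sample markup (including the misread word) are made up for illustration and are not taken from the milahu/enteignung repository.

```python
from html.parser import HTMLParser


def parse_wconf(title):
    """Return the x_wconf value from an hOCR title attribute, or None."""
    for prop in title.split(";"):
        fields = prop.split()
        if len(fields) == 2 and fields[0] == "x_wconf":
            return float(fields[1])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, confidence) pairs from ocrx_word spans."""

    def __init__(self):
        super().__init__()
        self.words = []        # finished (text, confidence) tuples
        self._in_word = False  # True while inside an ocrx_word span
        self._conf = None      # confidence of the current word, if any
        self._text = []        # text fragments of the current word
        self._depth = 0        # spans nested inside the current word

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._in_word:
            self._depth += 1
            return
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._in_word = True
            self._conf = parse_wconf(attrs.get("title") or "")
            self._text = []
            self._depth = 0

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or not self._in_word:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        self.words.append(("".join(self._text).strip(), self._conf))
        self._in_word = False


def low_confidence_words(hocr, threshold=80.0):
    """List words whose confidence is below the (assumed) threshold."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return [(t, c) for t, c in parser.words if c is not None and c < threshold]


# Hypothetical hOCR fragment; the second word imitates a typical misread.
SAMPLE = """<span class='ocr_line' title='bbox 36 92 580 122'>
<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 96'>Die</span>
<span class='ocrx_word' id='word_1_2' title='bbox 109 92 230 122; x_wconf 41'>Enteignvng</span>
</span>"""

if __name__ == "__main__":
    print(low_confidence_words(SAMPLE))
```

Running this over each hOCR file would give a short list of suspect words to check first. It does not replace a full read-through, since OCR engines can be confidently wrong.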
-
- Mar 2024
-
www.youtube.com
-
ChatGPT Vision: The Best Way to Transform Your Paper Notes Into Digital Text
Upload a photo into ChatGPT and ask it to transcribe the photo into text. Better than OCR? It infers meaning from the surrounding context, even though individual words may be wrong.
-
- Nov 2023
-
docdrop.org
-
Can be used to run optical character recognition (OCR) on PDF documents and return documents with selectable, machine-readable text.
-
- Sep 2023
-
docdrop.org
-
- Jan 2023
-
docdrop.org
-
- Oct 2022
-
-
Worried about paper cards being lost or destroyed
I am loving using paper index cards. I am, however, worried that something could happen to the cards and I could lose years of work. I did not have this worry when my notes were all online. Are there any apps that you are using to make a digital copy of the notes? Ideally, I would love to have a digital mirror, but I am not willing to do 2x the work.
u/LBHO https://www.reddit.com/r/antinet/comments/y77414/worried_about_paper_cards_being_lost_or_destroyed/
As a firm believer in the programming principle of DRY (Don't Repeat Yourself), I can appreciate the desire not to do the work twice.
Note card loss and destruction is definitely a thing folks have worried about. The easiest thing may be to spend a minute or two every day and make quick photo backups of your cards as you make them. Then if things are lost, you'll have a backup, and you can likely find OCR (optical character recognition) software to pull your notes from the photos and recreate them if necessary. I've outlined some details I've used in the past. Incidentally, opening a photo in Google Docs will automatically do a pretty reasonable OCR on it.
I know some have written about bringing old notes into their (new) zettelkasten practice, and the general advice here has been to only pull in new things as needed or as heavily interested to ease the cognitive load of thinking you need to do everything at once. If you did lose everything and had to restore from backup, I suspect this would probably be the best advice for proceeding as well.
Historically many have worried about loss, but the only actual example of loss I've run across is that of Hans Blumenberg, whose zettelkasten from the early 1940s was lost during the war. He continued apace in another one dating from 1947, accumulating over 30,000 cards at the rate of about 1.5 per day over 50-some-odd years.
-
- Sep 2022
-
paperwebsite.com
-
-
tesseract.projectnaptha.com
-
- Aug 2022
-
www.reddit.com
-
Digitizing and compressing notes - Question
reply to: https://www.reddit.com/r/antinet/comments/wv9hvq/digitizing_and_compressing_notes_question/
I've got a process I still use, though less frequently, that does both photos as well as optical character recognition (OCR) to digitize the words: https://boffosocko.com/2021/12/20/handwriting-my-website-with-a-digital-amanuensis/ The comments have some rich commentary with related ideas as well.
-
- Jun 2022
-
pdf.abbyy.com
-
I've used ABBYY FineReader (best on Windows) and it was much better at correcting OCR than Adobe Acrobat. - Dana Conard
-
-
vision.cornell.edu
-
COCO-Text: Dataset for Text Detection and Recognition
- 63K images
- 145K text instances
- Feature labels: machine printed / handwritten, legible / illegible, English / non-English script
See also the COCO-Text V2 site.
-
- Feb 2022
-
-
Free All-in-one PDF tools: A reliable, intuitive and productive PDF Software
-
- Dec 2021
- Nov 2021
-
www.myscript.com
MyScript
- Jul 2021
-
textsniper.app
-
A paid, Apple-based tool for text recognition and extraction
-
-
Local file
-
T. LUCRETI CARI
Not going to be the prettiest version, but at least somewhat OCR'd for annotating!
-
-
inst-fs-iad-prod.inscloudgate.net
-
Titi Lucreti Cari De Rerum Natura Libri Sex, With a Translation and Notes, Volume 1, Edited by H. A. J. Munro. Lucretius
Testing out the OCR functionality of docdrop.org.
I'm noticing that the PDF fingerprint of this text somehow matches that of other texts, as there are a lot of unrelated annotations on this page.
Is docdrop doing something squirrelly with the fingerprint @dwhly?
-
- Feb 2021
-
web.hypothes.is
- Jan 2021
-
dev.clariah.nl
-
Apart from a basic segmenter taken from OCRopus, a trainable line extractor is in the process of being implemented. Full trainability of layout analysis is of utmost importance to a truly universal OCR system, as text layout and its semantics vary widely across time and space, e.g. hand-crafted methods for printed Latin text are unlikely to work reliably on Arabic text or manuscripts with extensive interlinear annotation.
WIP implementation of line segmentation in kraken
-
-
www.morethantechnical.com
-
nice recipe for quickly turning a scanned PDF into a searchable one
-
- Oct 2020
-
myscript.com
-
MyScript MathPad
This looks like something I could integrate into my workflow.
-
- Jul 2020
- Apr 2020
-
web.hypothes.is
-
Adobe Acrobat Pro.
gImageReader is an excellent open source alternative. It runs on both Windows and Linux, and it provides a simple (yet powerful) frontend GUI to Google's robust open source OCR engine, Tesseract.
I think an open source tool such as this is a better fit for the open annotation ecosystem, based on libre software and standards, that Hypothesis promotes than a proprietary (and expensive) tool such as Adobe Acrobat Pro.
-
-
tesseract-ocr.github.io
-
tessdoc Tesseract documentation
-
- Apr 2019
-
www.archimag.com
- Sep 2015
-
groups.google.com
-
- Aug 2015
-
-
-