daiR: OCR with Google Document AI in R

daiR is an R package for Google Document AI, a powerful server-based OCR processor with support for over 60 languages. The package provides an interface for the Document AI API and comes with additional tools for output file parsing and text reconstruction. See the daiR website for more details.

Use

Quickly OCR short documents:

## NOT RUN
library(daiR)
response <- dai_sync("file.pdf")
text <- text_from_dai_response(response)
cat(text)

Batch process asynchronously via Google Cloud Storage:

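## NOT RUN
library(daiR)
library(googleCloudStorageR)
library(purrr)
# Upload the local files to the default bucket
my_files <- c("file1.pdf", "file2.pdf", "file3.pdf")
map(my_files, gcs_upload)
# Submit the uploaded files for asynchronous processing
dai_async(my_files)
# Processing is asynchronous; the JSON output appears in the bucket once it finishes
contents <- gcs_list_objects()
output_files <- grep("json$", contents$name, value = TRUE)
# Download the JSON output files to a temporary directory
map(output_files, ~ gcs_get_object(.x, saveToDisk = file.path(tempdir(), .x)))
# Extract and print the text from the first output file
sample_text <- text_from_dai_file(file.path(tempdir(), output_files[1]))
cat(sample_text)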
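
The filtering step above can be tried out without a Google Cloud account. The following is a minimal sketch in base R that uses a made-up bucket listing; the object names are hypothetical and only stand in for whatever gcs_list_objects() returns in your bucket. It also shows one possible way to build local download paths with basename(), which may help if object names carry folder prefixes.

```r
# Hypothetical bucket listing: a data frame with a `name` column,
# standing in for the result of gcs_list_objects().
contents <- data.frame(
  name = c("file1.pdf", "file1-0.json", "file2.pdf", "file2-0.json"),
  stringsAsFactors = FALSE
)

# Keep only the objects whose names end in "json".
output_files <- grep("json$", contents$name, value = TRUE)

# Build local destination paths from the file names alone, so that any
# folder prefix in an object name does not end up in the local path.
local_paths <- file.path(tempdir(), basename(output_files))

print(output_files)
print(local_paths)
```

The PDF inputs are dropped by the pattern, so only the JSON output files are passed on to the download step.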

Turn images of tables into R data frames:

## NOT RUN
response <- dai_sync_tab("tables.pdf")
dfs <- tables_from_dai_response(response)

Requirements

Google Document AI is a paid service that requires a Google Cloud account and a Google Cloud Storage bucket. I recommend using Mark Edmondson's googleCloudStorageR package in combination with daiR.
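
Before running any of the examples, it can help to confirm that your credentials and bucket are configured. The sketch below assumes the environment-variable names that googleCloudStorageR reads (GCS_AUTH_FILE for the service account key file and GCS_DEFAULT_BUCKET for the bucket name); the helper missing_env_vars() is a hypothetical name introduced here for illustration and is not part of daiR.

```r
# Assumed variable names, following googleCloudStorageR's conventions;
# adjust them if your setup differs.
required_vars <- c("GCS_AUTH_FILE", "GCS_DEFAULT_BUCKET")

# Hypothetical helper: return the names of variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

missing <- missing_env_vars(required_vars)
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
} else {
  message("Credentials file and default bucket are configured.")
}
```

Such variables are typically set in your .Renviron file so that they are available in every R session.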

Installation

Install the latest development version from GitHub:

devtools::install_github("hegghammer/daiR")
