Features

1. Extra Accuracy

The "Extra Accuracy" feature enhances data extraction and structuring by combining Optical Character Recognition (OCR) with a Large Language Model (LLM). This two-step process provides precise text detection followed by intelligent structuring and contextual understanding of the extracted data. It is particularly valuable for complex or handwritten documents.

Extra Accuracy for Text Detection and Initial Parsing

Extra accuracy processes single-page images/documents to detect and extract textual content using Optical Character Recognition (OCR).
It outputs a JSON file containing:

All detected text, along with its bounding boxes (position in the image).
Logical hierarchy, including words, lines, paragraphs, and blocks.
Detected tables, key-value pairs, and other structural elements (where applicable).
Confidence scores for text accuracy and structural layout.
Language detection for the extracted text.

The final output is a polished, structured dataset that aligns with your specific use case, such as forms, invoices, or reports.

What Can Extra Accuracy Do?

Text Detection

Extracts all text from the document, supporting both machine-printed and handwritten content.
Provides text hierarchy (e.g., words grouped into lines, paragraphs, or blocks).

Bounding Box Information

Each text element comes with positional data (bounding boxes) to map text locations in the image.

Language Detection

Identifies the language(s) present in the document to improve recognition accuracy.

JSON Output

Encapsulates all parsed data in a machine-readable JSON format.
Includes confidence scores, bounding boxes, and hierarchy.

Single-Page Image Processing

Optimized for individual pages.
Batch processing requires separate requests for each page.
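Because each page needs its own request, a client can prepare one set of form fields per page before uploading. The following is a minimal sketch, assuming the page is selected with the page_range field used in the cURL examples and that page numbers start at 1. The helper name per_page_fields is hypothetical, and the file upload itself is left out.

```python
def per_page_fields(parser_id, pages):
    """Return one dict of form fields per page, one request each."""
    batch = []
    for page in pages:
        # Assumption: pages are numbered from 1, as in the cURL examples.
        if page < 1:
            raise ValueError(f"page numbers start at 1, got {page}")
        batch.append({
            "parserApp": parser_id,
            "page_range": str(page),
        })
    return batch


batch = per_page_fields("PARSER_ID", [1, 2, 3])
print([fields["page_range"] for fields in batch])
```

Each dict in the returned list would then be sent as the form fields of its own upload request.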

Confidence Score

Words: Assigns confidence scores to each detected word for accuracy analysis.
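Word-level confidence scores can be used to flag text for manual review. The sketch below is only illustrative: the key names "words", "text", and "confidence", the 0 to 1 score scale, and the sample values are all assumptions, not the documented output schema.

```python
import json

# Hypothetical response fragment. The key names and the 0-1 score scale
# are assumptions made for this example, not the documented schema.
SAMPLE = json.loads("""
{"words": [
  {"text": "Invoice", "confidence": 0.99},
  {"text": "T0tal", "confidence": 0.41},
  {"text": "1,250.00", "confidence": 0.93}
]}
""")


def low_confidence_words(result, threshold=0.8):
    """Return the text of every word scored below the threshold."""
    return [word["text"] for word in result.get("words", [])
            if word["confidence"] < threshold]


flagged = low_confidence_words(SAMPLE)
print(flagged)
```

Adjust the key names and the threshold to match the JSON your parser actually returns.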

Example cURL Command

curl --location 'https://prod-ml.fracto.tech/upload-file-smart-ocr' \
--header 'x-api-key: API_KEY' \
--form 'file=@"/Users/gagankapoor/Downloads/e98078f86f705553_4014823993.pdf"' \
--form 'parserApp=PARSER_ID' \
--form 'extra_accuracy="true"' \
--form 'qr_range="2"' \
--form 'page_range="1"' \
--form 'model="v2"'

Changes when using extra accuracy -

Send the extra_accuracy key as a form parameter with the value true or false. For example:

curl --location 'https://dev-ml.fracto.tech/upload-multiple-files-smart-ocr' \
--header 'x-api-key: API_KEY' \
--form 'extra_accuracy="true"'

The identifier recorded against the parsing job is set to '1' when extra_accuracy was true in the API call.
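When the request is built in code, the extra_accuracy flag has to be sent as the string true or false rather than as a native boolean. A minimal sketch, assuming lowercase strings as in the cURL examples; the helper name extra_accuracy_field is hypothetical.

```python
def extra_accuracy_field(enabled):
    """Encode a Python bool as the lowercase string used in the form."""
    if not isinstance(enabled, bool):
        raise TypeError("extra_accuracy must be True or False")
    return {"extra_accuracy": "true" if enabled else "false"}


fields = {"parserApp": "PARSER_ID"}
fields.update(extra_accuracy_field(True))
print(fields)
```

Rejecting non-boolean input avoids accidentally sending values such as "yes" or 1, which the examples in this document do not cover.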

2. Model

The Model Parameter feature allows users to choose between two distinct processing models tailored to their needs. This provides flexibility in balancing processing speed and capability, depending on the complexity of the task or the desired level of detail in the output.

Available Models

`v2` – Minimum (Mini Model)

Description:
A lightweight model designed for faster processing and minimal resource usage. Ideal for tasks requiring quick, basic text extraction and structuring without advanced contextual refinement.

Use Cases:

Basic OCR tasks (e.g., extracting text from simple invoices, receipts, or forms).
Low-priority or high-volume processing where minimal detail suffices.

`v1` – Regular (Enhanced Model)

Description:
A robust model powered by an LLM, capable of advanced contextual understanding and structuring. Designed for complex documents or scenarios requiring high accuracy and detailed outputs.

Use Cases:

Advanced document processing (e.g., legal contracts, detailed invoices, or multi-field forms).
Scenarios requiring enhanced structuring, like extracting tables or contextual relationships.

How It Works

Model Selection

Users specify the desired model as a parameter when initiating the API call.

v2 – Minimum
v1 – Regular

Note: There is no change in the output structure of the identifier regardless of the model selected.
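Since only two model identifiers are listed, a client can validate the value before sending the request. This is a minimal sketch: the MODELS table mirrors the two models described above, and the helper name model_field is hypothetical.

```python
# The two model identifiers described in this section.
MODELS = {
    "v2": "Minimum (Mini Model)",
    "v1": "Regular (Enhanced Model)",
}


def model_field(model):
    """Return the form field for a model, rejecting unknown identifiers."""
    if model not in MODELS:
        raise ValueError(
            f"unknown model {model!r}; expected one of {sorted(MODELS)}"
        )
    return {"model": model}


print(model_field("v1"), MODELS["v1"])
```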

Example cURL Command

curl --location 'https://prod-ml.fracto.tech/upload-file-smart-ocr' \
--header 'x-api-key: API_KEY' \
--form 'file=@"/Users/gagankapoor/Downloads/CHENNAI-10.pdf"' \
--form 'parserApp=PARSER_ID' \
--form 'model="v1"'

3. Page Range

The Page Range feature introduces flexibility in processing multi-page documents by allowing users to specify exactly which pages to process.

Page Range Input

Users define the pages to process using the following formats:

Single Page: Specify a single page number (e.g., 3).
Page Range: Define a range of consecutive pages (e.g., 1-5).
Comma-Separated List: List non-consecutive pages to process (e.g., 1,3,5).
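The three formats above can be checked and expanded on the client side before a request is sent. The following is a sketch under stated assumptions: the function name expand_page_range is hypothetical, page numbers are assumed to start at 1, and mixed forms such as 1-3,5 are rejected because this document does not confirm that they are supported.

```python
import re


def expand_page_range(value):
    """Expand a page_range string into a list of page numbers."""
    value = value.strip()
    if re.fullmatch(r"\d+", value):
        # Single page, e.g. "3"
        pages = [int(value)]
    elif re.fullmatch(r"\d+-\d+", value):
        # Consecutive range, e.g. "1-5"
        start, end = (int(part) for part in value.split("-"))
        if start > end:
            raise ValueError(f"range start exceeds end: {value!r}")
        pages = list(range(start, end + 1))
    elif re.fullmatch(r"\d+(,\d+)+", value):
        # Comma-separated list, e.g. "1,3,5"
        pages = [int(part) for part in value.split(",")]
    else:
        raise ValueError(f"unsupported page_range format: {value!r}")
    # Assumption: pages are numbered from 1.
    if min(pages) < 1:
        raise ValueError(f"page numbers start at 1: {value!r}")
    return pages


for example in ("3", "1-5", "1,3,5"):
    print(example, "->", expand_page_range(example))
```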

Output

The processed data for the specified pages is compiled into a single JSON file.

Selective Processing

Focus on specific pages instead of processing the entire document, saving time and resources.

Flexible Input

Supports single pages, consecutive ranges, and non-sequential lists, giving users full control over the pages they want to process.

Example cURL Command

curl --location 'https://prod-ml.fracto.tech/upload-file-smart-ocr' \
--header 'x-api-key: API_KEY' \
--form 'file=@"/Users/gagankapoor/Downloads/CHENNAI-10.pdf"' \
--form 'parserApp=PARSER_ID' \
--form 'page_range="1-5"'

4. Notes

The Notes feature introduces the ability to include custom user-defined strings in the API request, which are then stored in a dedicated notes column.

Example cURL Command

curl --location 'https://prod-ml.fracto.tech/upload-file-smart-ocr' \
--header 'x-api-key: API_KEY' \
--form 'file=@"/Users/gagankapoor/Downloads/CHENNAI-10.pdf"' \
--form 'parserApp=PARSER_ID' \
--form 'notes="Testing"'
