Read Text
Description
The Read Text activity extracts text content from a Microsoft Word document.
The extracted content can be returned as plain text, HTML, or RTF.
This activity does not modify the document.
Common Capabilities
Process Data Support
This activity supports dynamic configuration using variables from the Process Data drawer.
You can bind values from Model Data, Form Data, System Data, Enterprise Variables, and Activity Outputs.
Learn more → Using Process Data
Execution Control
Controls how this activity behaves when multiple incoming execution paths converge.
Learn more → Execution Control
On Error
Defines how the workflow behaves if this activity encounters a runtime error.
Design-Time Configuration
Display Text
The label displayed for this activity on the workflow canvas.
Word File
The Word document to read.
Password
Optional password used to open a password-protected document.
Read Options
Text Format
Specifies the format of the extracted content.
Available options:
- PlainText – Returns text without formatting.
- Html – Returns content as HTML markup.
- Rtf – Returns content in Rich Text Format (RTF).
See TextFormat.
Page Selection
Page Lookup Mode
Specifies whether content is extracted from all pages, a single page, or a range of pages.
Available options:
- All
- Single
- Range
Default: All
See PageLookupMode.
Lookup Page By
Available when Page Lookup Mode is Single.
Specifies how the page is identified.
Available options:
- Page Index
- Page Number
See LookupPageBy.
Page Number
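The printed page number of the page to extract, as it appears in the document.
Examples include Arabic numerals (125), letters (A, a), and Roman numerals (III, iv).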
Available when:
- Page Lookup Mode = Single
- Lookup Page By = Page Number
Page Index
The zero-based index of the page to extract. Index 0 is the first page of the document.
Available when:
- Page Lookup Mode = Single
- Lookup Page By = Page Index
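The difference between Page Number and Page Index can be illustrated with a small sketch. The page labels below are a made-up six-page document, and both helper functions are hypothetical. They are not part of the activity, and how the activity itself matches labels (for example, case sensitivity or duplicate labels) is not specified here.

```python
# Hypothetical printed page labels for a six-page document:
# three front-matter pages in lowercase Roman numerals, then the body.
page_labels = ["i", "ii", "iii", "1", "2", "3"]


def index_for_page_number(labels, page_number):
    """Return the zero-based index of the first page with this printed label."""
    return labels.index(page_number)


def page_number_for_index(labels, page_index):
    """Return the printed label of the page at this zero-based index."""
    return labels[page_index]


# Printed page "1" is the fourth physical page, so its index is 3.
print(index_for_page_number(page_labels, "1"))
# Index 0 is the first physical page, printed as "i".
print(page_number_for_index(page_labels, 0))
```

In this sketch, a Page Number of 1 and a Page Index of 1 point at different pages whenever the document has front matter numbered separately.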
From Page Index
The starting zero-based page index.
Available when Page Lookup Mode is Range.
To Page Index
The ending zero-based page index.
Available when Page Lookup Mode is Range.
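The availability rules for the page selection fields can be summarized as a small lookup. This is only a sketch that restates the "Available when" conditions above. The function name and the use of plain strings for option values are assumptions made for illustration, not the activity's API.

```python
def applicable_fields(page_lookup_mode, lookup_page_by=None):
    """Return the page-selection fields that apply to a configuration.

    Hypothetical helper: it only restates the documented availability rules.
    """
    if page_lookup_mode == "All":
        # The whole document is read, so no page fields apply.
        return []
    if page_lookup_mode == "Single":
        if lookup_page_by not in ("Page Index", "Page Number"):
            raise ValueError(
                "Single mode needs Lookup Page By set to "
                "'Page Index' or 'Page Number'"
            )
        # Only the field matching the chosen lookup method applies.
        return ["Lookup Page By", lookup_page_by]
    if page_lookup_mode == "Range":
        return ["From Page Index", "To Page Index"]
    raise ValueError(f"Unknown Page Lookup Mode: {page_lookup_mode!r}")


print(applicable_fields("Single", "Page Number"))
print(applicable_fields("Range"))
```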
Cross-Platform Considerations
Page-based text extraction depends on document pagination, which can differ between Windows and Linux because page layout calculations are platform-dependent.
For the most consistent results across platforms, extract content from the entire document whenever possible.
Outputs
Text
The extracted text content in the selected format.
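If a later step needs plain text but the activity was configured with the Html format, the output could be reduced to text downstream. The sketch below uses Python's standard html.parser module. The function name and the sample markup are illustrative assumptions, and the exact HTML the activity emits is not specified here.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the character data between tags and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_plain(html_text):
    """Strip tags from an HTML string and return the remaining text."""
    collector = _TextCollector()
    collector.feed(html_text)
    collector.close()  # flush any buffered trailing text
    return "".join(collector.parts)


# Hand-written sample markup, not actual activity output.
print(html_to_plain("<p>Hello <b>world</b></p>"))
```

When only plain text is needed, selecting PlainText in the activity avoids this extra step.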