Word Class Documentation
class Word
Namespace: datalogics_interface
Detailed Description
Each word contains a sequence of characters in one or more styles (see Style).
Member Function Documentation
get_attributes
WordAttributeFlags get_attributes()
Returns:
A set of WordAttributeFlags containing information on the types of characters in the word.
Gets a set of summary flags containing information on the types of characters in a word.
get_char_quads
std::vector< Quad > get_char_quads()
Returns:
A list containing the Quads found in the word, one for each individual character.
Gets a list of Quads occupied by the individual characters in a word.
get_is_last_word_in_region
bool get_is_last_word_in_region()
Returns:
true if the word is the last word in a region, false if it is not.
This can be useful for determining visual line breaks in tagged PDFs. In tagged PDF documents, WordAttributeFlags.LastWordOnLine is set according to the tags in the document, so that flag cannot be used to determine visual line breaks.
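As a rough illustration of the use described above, the sketch below groups word text by closing a group after each word flagged as the last in its region. WordStub and group_by_region are hypothetical stand-ins written for this example and are not part of the library; how regions correspond to visual lines in a given document is an assumption the caller must check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in holding the results of two Word accessors.
// It is not the library's Word type.
struct WordStub {
    std::string text;             // value get_text() would return
    bool is_last_word_in_region;  // value get_is_last_word_in_region() would return
};

// Concatenates word text in order and closes the current group after
// every word that is flagged as the last word in its region.
std::vector<std::string> group_by_region(const std::vector<WordStub>& words) {
    std::vector<std::string> groups;
    std::string current;
    for (const WordStub& w : words) {
        current += w.text;
        if (w.is_last_word_in_region) {
            groups.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        groups.push_back(current);  // trailing words that no flag closed
    }
    return groups;
}
```

In real code the two fields would be filled from get_text() and get_is_last_word_in_region() on each Word.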
get_quads
std::vector< Quad > get_quads()
Returns:
A list containing the Quads found in the word.
The quad's height is the height of the font's bounding box, not the height of the tallest character used in the word. The font's bounding box is determined by the glyphs in the font that extend farthest above and below the baseline; it often extends somewhat above the top of 'A' and below the bottom of 'y'.
The quad's width is determined from the characters actually present in the word.
For example, the quads for the words "AWAY" and "away" have the same height, but generally do not have the same width unless the font is a mono-spaced font (a font in which all characters have the same width).
Despite the names of the fields in a Quad (TopLeft for top left, BottomLeft for bottom left, and so forth), the corners of a Quad do not necessarily have these positions.
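Because the corner names cannot be relied on for position, code that needs an axis-aligned rectangle can take the minimum and maximum over all four corners instead. The following is a minimal sketch under stated assumptions: PointStub, QuadStub, Box, and their field names are hypothetical stand-ins for this example (the library's own Quad and point types may be spelled differently), and y is assumed to increase upward.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the library's point and Quad types.
struct PointStub { double x; double y; };
struct QuadStub {
    PointStub top_left, top_right, bottom_left, bottom_right;
};

// Axis-aligned rectangle; assumes y increases upward.
struct Box { double left, bottom, right, top; };

// Computes the bounding box from all four corners without trusting
// which corner is actually at the top, bottom, left, or right.
Box bounding_box(const QuadStub& q) {
    auto xs = std::minmax({q.top_left.x, q.top_right.x,
                           q.bottom_left.x, q.bottom_right.x});
    auto ys = std::minmax({q.top_left.y, q.top_right.y,
                           q.bottom_left.y, q.bottom_right.y});
    return Box{xs.first, ys.first, xs.second, ys.second};
}
```

For rotated text the bounding box is larger than the quad itself, so this is suited to hit-testing or rough layout rather than exact outlines.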
get_style_transitions
std::vector< StyleTransition > get_style_transitions()
Returns:
The list of StyleTransition objects.
Every word has at least one style transition, at character position zero in the word.
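One way to use the transitions is to split a word's text into runs that share a style. The sketch below assumes each transition exposes the character position where its style begins; StyleTransitionStub, its char_index and style_name fields, and split_into_runs are hypothetical names for this example, not the library's API. It also assumes the transitions are sorted by position and that a character position equals a byte offset in the string, which holds only for single-byte text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a StyleTransition: the character position at
// which a style starts, plus a label for that style.
struct StyleTransitionStub {
    std::size_t char_index;
    std::string style_name;
};

struct StyleRun {
    std::string style_name;
    std::string text;
};

// Splits text into consecutive runs, one per transition. Assumes the
// transitions are sorted by char_index in ascending order.
std::vector<StyleRun> split_into_runs(
        const std::string& text,
        const std::vector<StyleTransitionStub>& transitions) {
    // Every word has at least one transition, at character position zero.
    assert(!transitions.empty() && transitions.front().char_index == 0);
    std::vector<StyleRun> runs;
    for (std::size_t i = 0; i < transitions.size(); ++i) {
        std::size_t begin = std::min(transitions[i].char_index, text.size());
        std::size_t end = (i + 1 < transitions.size())
            ? std::min(transitions[i + 1].char_index, text.size())
            : text.size();
        runs.push_back({transitions[i].style_name,
                        text.substr(begin, end - begin)});
    }
    return runs;
}
```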
get_text
std::string get_text()
Returns:
The text of the Word.
The returned string includes any word break characters (such as space characters) that follow the word, but not any that precede the word.
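When only the word itself is wanted, the trailing break characters can be stripped from the returned string. The helper below is a hypothetical sketch, not a library function, and it assumes the break characters are ASCII whitespace; other break characters would need to be added to the set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Removes trailing ASCII whitespace from a word's text. Leading
// characters are left untouched, since break characters that precede
// the word are not part of the text in the first place.
std::string strip_trailing_breaks(const std::string& word_text) {
    std::size_t last = word_text.find_last_not_of(" \t\r\n");
    if (last == std::string::npos) {
        return std::string();  // empty or whitespace only
    }
    return word_text.substr(0, last + 1);
}
```

Leaving the text unstripped is useful in the opposite case: concatenating the strings of consecutive words keeps the breaks between them.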
operator=
Parameters
: Word &&