Acro Support Layer: ASText

ASText Enumerations

ASScripts

Header: ASExpT.h:3903

Description

An enumeration of writing scripts. Not all of these scripts are supported on all platforms.

Value options for ASScript.

Enum Constants

`kASRomanScript`	Roman.
`kASJapaneseScript`	Japanese.
`kASTraditionalChineseScript`	Traditional Chinese.
`kASKoreanScript`	Korean.
`kASArabicScript`	Arabic.
`kASHebrewScript`	Hebrew.
`kASGreekScript`	Greek.
`kASCyrillicScript`	Cyrillic.
`kASRightLeftScript`	RightLeft.
`kASDevanagariScript`	Devanagari.
`kASGurmukhiScript`	Gurmukhi.
`kASGujaratiScript`	Gujarati.
`kASOriyaScript`	Oriya.
`kASBengaliScript`	Bengali.
`kASTamilScript`	Tamil.
`kASTeluguScript`	Telugu.
`kASKannadaScript`	Kannada.
`kASMalayalamScript`	Malayalam.
`kASSinhaleseScript`	Sinhalese.
`kASBurmeseScript`	Burmese.
`kASKhmerScript`	Khmer.
`kASThaiScript`	Thai.
`kASLaotianScript`	Laotian.
`kASGeorgianScript`	Georgian.
`kASArmenianScript`	Armenian.
`kASSimplifiedChineseScript`	Simplified Chinese.
`kASTibetanScript`	Tibetan.
`kASMongolianScript`	Mongolian.
`kASGeezScript`	Ge'ez.
`kASEastEuropeanRomanScript`	East European Roman.
`kASVietnameseScript`	Vietnamese.
`kASExtendedArabicScript`	Extended Arabic.
`kASEUnicodeScript`	Unicode.
`kASDontKnowScript=-1`	Unknown.

ASTextOptions

Header: ASExtraExpT.h:58

Enum Constants

`kASTextFilterIdentity`	Does nothing.
`kASTextFilterLineEndings`	Normalizes line endings (equivalent to ASTextNormalizeEndOfLine()).
`kASTextFilterUpperCase`	Makes all text upper case. DEPRECATED: Case is not a reliably localizable concept. Do not use this.
`kASTextFilterLowerCase`	Makes all text lower case. DEPRECATED: Case is not a reliably localizable concept. Do not use this.
`kASTextFilterXXXDebug`	Changes any ASText to "XXX" (for debugging).
`kASTextFilterUpperCaseDebug`	Makes all text except `scanf` format strings upper case.
`kASTextFilterLowerCaseDebug`	Makes all text except `scanf` format strings lower case.
`kASTextFilterRemoveAmpersands`	Removes stand-alone ampersands, and turns `&&` into `&`.
`kASTextFilterNormalizeFullWidthASCIIVariants`	Changes any full width ASCII variants to their lower-ASCII version. For example, `0xFF21` (full width `'A'`) becomes `0x0041` (ASCII `'A'`).
`kASTextRemoveLineEndings`	Removes line endings and replaces them with spaces.
`kASTextFilterRsvd1=1000`	Reserved. Do not use.
`kASTextFilterUnknown=-1`	An invalid filter type.
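
The full-width normalization described for `kASTextFilterNormalizeFullWidthASCIIVariants` can be illustrated with a small standalone sketch. It is not the SDK's implementation: the function name `sketch_fold_fullwidth` is hypothetical, and the sketch assumes the filter covers only the range U+FF01 through U+FF5E, each of which sits exactly `0xFEE0` above its ASCII counterpart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Acrobat SDK.
 * Maps full-width ASCII variants (U+FF01..U+FF5E) to ASCII
 * (U+0021..U+007E) in place. Other values are left untouched. */
void sketch_fold_fullwidth(uint16_t *text, size_t count)
{
    assert(text != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (text[i] >= 0xFF01 && text[i] <= 0xFF5E)
            text[i] = (uint16_t)(text[i] - 0xFEE0);
    }
}
```

With real ASText objects, the equivalent step would be a call to ASTextFilter() with this filter constant.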

ASText Typedefs

ASConstText: - An opaque object holding constant encoded text.
ASCountryCode: -
ASHostEncoding: - An integer specifying the host encoding for text.
ASLanguageCode: -
ASScript: - An enumeration of writing scripts.
ASText: - An opaque object holding encoded text.
ASTextFilterType: - Constants that specify filter types used to modify text objects.
ASUTF16Val: - Holds a single 16-bit value from a UTF-16 encoded Unicode string.
ASUTF32Val: -
ASUTF8Val: - An ASUTF8Val holds a single 8-bit value from a UTF-8 encoded Unicode string.
ASUniChar: -
ASUnicodeChar: - An ASUnicodeChar is large enough to hold any Unicode character (at least 21 bits wide).
ASUnicodeFormat: - Describes the various Unicode formats you can place into and read out of an ASText object.

ASConstText

Header: ASExpT.h:1482

Description

An opaque object holding constant encoded text.

Syntax

typedef const struct _t_ASTextRec *ASConstText;

ASCountryCode

Header: ASExtraExpT.h:52

Syntax

typedef ASUns16 ASCountryCode;

ASHostEncoding

Header: ASExpT.h:3874

Description

An integer specifying the host encoding for text. On Mac OS, it is a script code. On Windows, it is a CHARSET id. On UNIX, Acrobat currently supports only English, so the only valid ASHostEncoding is 0 (Roman). See ASScript.

Syntax

typedef ASInt32 ASHostEncoding;

ASLanguageCode

Header: ASExtraExpT.h:54

Syntax

typedef ASUns16 ASLanguageCode;

ASScript

Header: ASExpT.h:3974

Description

An enumeration of writing scripts. Not all of these scripts are supported on all platforms.

For value options see ASScripts.

Syntax

typedef ASInt32 ASScript;

ASText

Header: ASExpT.h:1476

Description

An opaque object holding encoded text.

An ASText object represents a Unicode string. ASText objects can also be used to convert between Unicode and various platform-specific text encodings, as well as conversions between various Unicode formats such as UTF-16 or UTF-8. Since it is common for a Unicode string to be repeatedly converted to or from the same platform-specific text encoding, ASText objects are optimized for this operation. For example, they can cache both the Unicode and platform-specific text strings.

There are several ways of creating an ASText object depending on the type and format of the original text data. The following terminology is used throughout this API to describe the various text formats:

Text Format	Description
Encoded	A multi-byte string terminated with a single `0` character and coupled with a specific host encoding indicator. On Mac OS, the text encoding is specified using a script code. On Windows, the text encoding is specified using a `CHARSET` code. On UNIX the only valid host encoding indicator is `0`, which specifies text in the platform's default Roman encoding. On all platforms, Asian text is typically specified using multi-byte strings.
ScriptText	A multi-byte string terminated with a single `0` character and coupled with an ASScript code. This is merely another way of specifying the Encoded case; the ASScript code is converted to a host encoding using ASScriptToHostEncoding().
Unicode	Text specified using UTF-16 or UTF-8. In the UTF-16 case, the bytes can be in either big-endian format or the endian-ness that matches the platform, and are always terminated with a single ASUns16 `0` value. In the UTF-8 case, the text is always terminated with a trailing `0` byte. Unicode usage in this case is straight Unicode without the `0xFE` `0xFF` prefix or language and country codes that can be encoded inside a PDF document.
PDText	A string of text pulled out of a PDF document. This will either be a big-endian Unicode string prefixed with the bytes `0xFE` `0xFF`, or a string in PDFDocEncoding. In the Unicode case, the string may have embedded language and country identifiers. ASText objects strip language and country information out of the PDText string and track them separately. See below for more details.

ASText objects can also be used to accomplish encoding and format conversions; you can request a string in any of the formats specified above. In all cases the ASText code attempts to preserve all characters. For example, if you attempt to concatenate two strings in separate host encodings, the implementation may convert both to Unicode and perform the concatenation in Unicode space.

When creating a new ASText object or putting new data into an existing object, the implementation will always copy the supplied data into the ASText object. The original data is yours to do with as you wish (and release if necessary).

The size of ASText data is always specified in bytes. For example, the len argument to ASTextFromSizedUnicode() specifies the number of bytes in the string, not the number of Unicode characters.
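
To make the bytes-versus-characters distinction concrete, here is a minimal standalone sketch that computes the byte length of a zero-terminated UTF-16 string. The helper name `sketch_utf16_byte_len` is hypothetical and `uint16_t` stands in for ASUTF16Val; the result is the kind of value a byte-length argument such as `len` expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Acrobat SDK.
 * Returns the size in bytes of a zero-terminated UTF-16 string,
 * excluding the 16-bit terminator. */
size_t sketch_utf16_byte_len(const uint16_t *s)
{
    size_t units = 0;

    assert(s != NULL);
    while (s[units] != 0)
        units++;
    return units * sizeof(uint16_t);
}
```

Note that a character outside the Basic Multilingual Plane occupies two 16-bit units, so a string holding one such character has a byte length of 4.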

Host encoding and Unicode strings are always terminated with a NULL character (which consists of one NULL byte for host encoded strings and two NULL bytes for Unicode strings). You cannot create a string with an embedded NULL character, even using the calls which take an explicit length parameter.

The Getxxx calls return pointers to data held by the ASText object. You cannot free or manipulate this data directly. The GetxxxCopy calls return data you can manipulate and that you are responsible for freeing.

An ASText object can have language and country codes associated with it. A language code is a 2-character ISO 639 language code. A country code is a 2-character ISO 3166 country code. In both cases the 2-character codes are packed into an ASUns16 value: the first character is packed in bits 8-15, and the second character is packed in bits 0-7. These language and country codes can be encoded into a UTF-16 variant of PDText encoding using an escape sequence. See the description of "Common Data Structures" in ISO 32000-1:2008, Document management - Portable document format - Part 1: PDF 1.7, section 7.9, page 84.

You can find this document on the web store of the International Organization for Standardization (ISO).

The ASText calls will automatically parse the language and country codes embedded inside a UTF-16 PDText object, and will also author appropriate escape sequences to embed the language and country codes (if present) when generating a UTF-16 PDText object.

Syntax

typedef struct _t_ASTextRec *ASText;

ASTextFilterType

Header: ASExtraExpT.h:102

Description

Constants that specify filter types used to modify text objects.

Syntax

typedef ASEnum16 ASTextFilterType;

ASUTF16Val

Header: ASExpT.h:3892

Description

Holds a single 16-bit value from a UTF-16 encoded Unicode string. It is typically used to point to the beginning of a UTF-16 string. For example: ASUTF16Val *utf16String =...

This data type is not large enough to hold any arbitrary Unicode character. Use ASUnicodeChar to pass individual Unicode characters.

Syntax

typedef ASUns16 ASUTF16Val;

ASUTF32Val

Header: ASExpT.h:3880

Syntax

typedef ASUns32 ASUTF32Val;

ASUTF8Val

Header: ASExpT.h:3897

Description

An ASUTF8Val holds a single 8-bit value from a UTF-8 encoded Unicode string.

Syntax

typedef ASUns8 ASUTF8Val;

ASUniChar

Header: ASExtraExpT.h:50

Syntax

typedef ASUTF16Val ASUniChar;

ASUnicodeChar

Header: ASExpT.h:3879

Description

An ASUnicodeChar is large enough to hold any Unicode character (at least 21 bits wide).

Syntax

typedef ASUns32 ASUnicodeChar;

ASUnicodeFormat

Header: ASExpT.h:3865

Description

Describes the various Unicode formats you can place into and read out of an ASText object.

For value options see UTFOptions.

Syntax

typedef ASEnum16 ASUnicodeFormat;

ASText Callback Signatures

ASTextEvalProc

Header: ASExtraExpT.h:405

Syntax

ASText ASTextEvalProc(ASCab params);

ASText Functions

ASHostMBLen: - Determines whether the given byte is a lead byte of a multi-byte character, and how many tail bytes follow.
ASIsValidUTF8: - Tests whether the bytes in the string conform to the Unicode UTF-8 encoding form.
ASScriptFromHostEncoding: - Converts from a host encoding type to an ASScript value.
ASScriptToHostEncoding: - Converts from an ASScript code to a host encoding type.
ASTextCaseSensitiveCmp: - Compares two ASConstText objects, ignoring language and country information.
ASTextCat: - Concatenates the from text to the end of the to text, altering to but not from.
ASTextCatMany: - Concatenates a series of ASText objects to the end of the to object.
ASTextCmp: - Compares two ASText objects.
ASTextCopy: - Copies the text in from to to, along with the country and language.
ASTextDestroy: - Frees all memory associated with the text object.
ASTextDup: - Creates a new ASText object that contains the same text/country/language as the one passed in.
ASTextEval: - Replaces percent-quoted expressions in the text object with the result of their evaluation, using key/value pairs in the ASCab.
ASTextFilter: - Runs the specified filter on a text object, modifying the text as specified.
ASTextFromEncoded: - Creates a new text object from a NULL-terminated multi-byte string in the specified host encoding.
ASTextFromInt32: - Creates a new string from an ASInt32 by converting the number to its decimal representation without punctuation or leading zeros.
ASTextFromPDText: - Creates a new string from some PDF text taken out of a PDF file.
ASTextFromScriptText: - Creates a new string from a NULL-terminated multi-byte string of the specified script.
ASTextFromSizedEncoded: - Creates a new text object from a multi-byte string of the specified length in the specified host encoding.
ASTextFromSizedPDText: - Creates a new string from some PDF text taken out of a PDF file.
ASTextFromSizedScriptText: - Creates a new text object from the specified multi-byte string of the specified script.
ASTextFromSizedUnicode: - Creates a new text object from the specified Unicode string.
ASTextFromUnicode: - Creates a new string from a NULL-terminated Unicode string.
ASTextFromUns32: - Creates a new string from an ASUns32 by converting it to a decimal representation without punctuation or leading zeros.
ASTextGetBestEncoding: - Returns the best host encoding for representing the text.
ASTextGetBestScript: - Returns the best host script for representing the text.
ASTextGetCountry: - Retrieves the country associated with an ASText object.
ASTextGetEncoded: - Returns a NULL-terminated string in the given encoding.
ASTextGetEncodedCopy: - Returns a copy of a string in a specified encoding.
ASTextGetLanguage: - Retrieves the language code associated with an ASText object.
ASTextGetPDTextCopy: - Returns the text in a form suitable for storage in a PDF file.
ASTextGetScriptText: - Converts the Unicode string in the ASText object to the appropriate script, and returns a pointer to the converted text.
ASTextGetScriptTextCopy: - Converts the Unicode string in the ASText object to the appropriate script and returns a pointer to the converted text.
ASTextGetUnicode: - Returns a pointer to a string in kUTF16HostEndian format (see ASUnicodeFormat).
ASTextGetUnicodeCopy: - Returns a pointer to a NULL-terminated string in the specified Unicode format.
ASTextIsEmpty: - Used to determine whether the ASText object contains no text.
ASTextMakeEmpty: - Removes the contents of an ASText (turns it into an empty string).
ASTextMakeEmptyClear: - Removes the contents of an ASText object (converts it into an empty string).
ASTextNew: - Creates a new text object containing no text.
ASTextNormalizeEndOfLine: - Replaces all end-of-line characters within the ASText object with the correct end-of-line character for the current platform.
ASTextReplace: - Replaces all occurrences of toReplace in src with the text specified in replacement.
ASTextReplaceASCII: - Replaces all occurrences of toReplace in src with the text specified in replacement.
ASTextReplaceBadChars: - Replaces all occurrences of characters contained in the list pszBadCharList in the text with the specified replacement character.
ASTextSetCountry: - Sets the country code associated with a piece of text.
ASTextSetEncoded: - Replaces the contents of an existing ASText object with a NULL-terminated multi-byte string in the specified host encoding.
ASTextSetLanguage: - Sets the language code associated with a piece of text.
ASTextSetPDText: - Alters an existing string from some PDF text taken out of a PDF file.
ASTextSetScriptText: - Alters an existing string from a NULL-terminated multi-byte string of the specified script.
ASTextSetSizedEncoded: - Alters an existing string from a multi-byte string in the specified host encoding and of the specified length.
ASTextSetSizedPDText: - Replaces the contents of an existing ASText object with PDF text taken out of a PDF file.
ASTextSetSizedScriptText: - Replaces the contents of an existing ASText object with the specified multi-byte string of the specified script.
ASTextSetSizedUnicode: - Replaces the contents of an existing ASText object with the specified Unicode string.
ASTextSetUnicode: - Alters an existing string from a NULL-terminated Unicode string.
ASUCS_GetPasswordFromUnicode: - Converts user input of a password to a form that can be used by Acrobat to open a file.

ASHostMBLen

Header: ASProcs.h:2045

Description

Determines whether the given byte is a lead byte of a multi-byte character, and how many tail bytes follow.

When parsing a string in a host encoding, you must keep in mind that the string could be in a variable length multi-byte encoding. In such an encoding (for example, Shift-JIS) the number of bytes required to represent a character varies on a character-by-character basis. To parse such a string you must start at the beginning and, for each byte, determine whether that byte represents a character or is the first byte of a multi-byte character. If the byte is a lead byte for a multi-byte character, you must also compute how many bytes will follow the lead byte to make up the entire character. Currently the API provides a call (PDHostMBLen()) that performs these computations, but only if the encoding in question is the operating system encoding (as returned by PDGetHostEncoding()). ASHostMBLen() allows you to determine this for any byte in any host encoding.

Note: ASHostMBLen() cannot confirm whether the required number of trailing bytes actually follow the first byte. If you are parsing a multi-byte string, make sure your code will stop at the first NULL (zero) byte even if it appears immediately after the lead byte of a multi-byte character.

Related Methods

PDGetHostEncoding PDHostMBLen

Syntax

ASInt32 ASHostMBLen(ASHostEncoding encoding, ASUns8 byte);

Parameters

`encoding`	The host encoding type.
`byte`	The first byte of a multi-byte character.

Returns

The number of additional bytes required to form the character. For example, if the encoding is a double-byte encoding, the return value will be 1 for a two-byte character and 0 for a one-byte character. For Roman encodings, the return value will always be 0.
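
The parsing procedure described above, including the rule about stopping at the first zero byte, can be sketched without the SDK as follows. The function `sketch_sjis_tail_bytes` is a hypothetical stand-in for ASHostMBLen() that handles only Shift-JIS, whose lead bytes fall in the ranges 0x81-0x9F and 0xE0-0xFC. In plug-in code the call would instead be ASHostMBLen(encoding, byte).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ASHostMBLen(), limited to Shift-JIS.
 * Returns the number of tail bytes that follow the given byte. */
int sketch_sjis_tail_bytes(unsigned char byte)
{
    if ((byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC))
        return 1;
    return 0;
}

/* Counts the complete characters in a NUL-terminated multi-byte string.
 * Stops at the first zero byte, even if it cuts a character short. */
size_t sketch_count_chars(const char *str)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t count = 0;

    assert(str != NULL);
    while (*p != 0) {
        int tail = sketch_sjis_tail_bytes(*p);
        p++;                          /* consume the lead (or only) byte */
        while (tail > 0 && *p != 0) { /* consume tail bytes, never past NUL */
            p++;
            tail--;
        }
        if (tail > 0)
            break;                    /* truncated character at end of string */
        count++;
    }
    return count;
}
```

The inner loop checks for the terminator before each tail byte, so a lead byte followed directly by the zero byte ends the scan instead of reading past the end of the buffer.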

ASIsValidUTF8

Header: ASExtraProcs.h:2301

Description

Tests whether the bytes in the string conform to the Unicode UTF-8 encoding form. The method does not test whether the string is NULL-terminated.

Syntax

ASBool ASIsValidUTF8(const ASUns8 *cIn, ASCount cInLen);

Parameters

`cIn`	The string.
`cInLen`	The length of the string in bytes, not including the `NULL` byte at the end.

Returns

true if the bytes in the string conform to the Unicode UTF-8 encoding form, false otherwise.
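
For readers who want to see what conformance to the UTF-8 encoding form involves, here is a standalone sketch of such a check. It is not the SDK's implementation of ASIsValidUTF8(), and the SDK function may differ in edge cases. The name `sketch_is_valid_utf8` is hypothetical. The sketch follows the Unicode definition of well-formed UTF-8: it rejects truncated sequences, stray continuation bytes, overlong forms, surrogate code points, and values above U+10FFFF. Like the method above, it takes an explicit byte length and does not look for a terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Acrobat SDK. */
bool sketch_is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        unsigned char b = s[i];
        size_t tail;        /* continuation bytes expected after b */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point valid for this length */

        if (b < 0x80) { i++; continue; }
        else if ((b & 0xE0) == 0xC0) { tail = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { tail = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { tail = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        if (len - i <= tail)
            return false;   /* sequence is cut off by the end of the input */
        for (size_t k = 1; k <= tail; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min)
            return false;   /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;   /* surrogates are not valid in UTF-8 */
        if (cp > 0x10FFFF)
            return false;   /* beyond the Unicode range */
        i += tail + 1;
    }
    return true;
}
```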

ASScriptFromHostEncoding

Header: ASExtraProcs.h:56

Description

Converts from a host encoding type to an ASScript value. On Windows, the host encoding is a CHARSET id. On Mac OS, the host encoding is a script code.

Related Methods

ASScriptToHostEncoding

Syntax

ASScript ASScriptFromHostEncoding(ASHostEncoding osScript);

Parameters

`osScript`	The host encoding type.

Returns

The new ASScript value.

ASScriptToHostEncoding

Header: ASExtraProcs.h:45

Description

Converts from an ASScript code to a host encoding type. On Windows, the host encoding is a CHARSET id. On Mac OS, the host encoding is a script code.

Related Methods

ASScriptFromHostEncoding

Syntax

ASHostEncoding ASScriptToHostEncoding(ASScript asScript);

Parameters

`asScript`	The script value.

Returns

The new host encoding type.

ASTextCaseSensitiveCmp

Header: ASExtraProcs.h:2316

Description

Compares two ASConstText objects, ignoring language and country information. The comparison is case-sensitive.

Various exceptions may be raised.

Related Methods

ASTextCmp

Syntax

ASInt32 ASTextCaseSensitiveCmp(ASConstText str1, ASConstText str2);

Parameters

`str1`	First text object.
`str2`	Second text object.

Returns

Returns a negative number if str1 < str2, a positive number if str1 > str2, and 0 if they are equal.

ASTextCat

Header: ASExtraProcs.h:559

Description

Concatenates the from text to the end of the to text, altering to but not from. It does not change the language or country of to unless it has no language or country, in which case it acquires the language and country of from.

Syntax

void ASTextCat(ASText to, ASConstText from);

Parameters

`to`	IN/OUT The encoded text to which `from` is appended.
`from`	The encoded text to be appended to `to`. It is not modified.

ASTextCatMany

Header: ASExtraProcs.h:572

Description

Concatenates a series of ASText objects to the end of the to object. Be sure to provide NULL as the last argument to the call.

Various exceptions may be raised.

Syntax

void ASTextCatMany(ASText to, ...);

Parameters

`to`	IN/OUT The ASText object to which the subsequent ASText arguments are concatenated.

ASTextCmp

Header: ASExtraProcs.h:614

Description

Compares two ASText objects. This routine can be used to sort text objects using the default collating rules of the underlying operating system before presenting them to the user. The comparison is case-sensitive. The results are suitable for displaying a sorted list of strings to the user in their chosen language and according to the rules of the platform on which the application is running. The results can vary based on the platform and user locale. If you want to compare strings in a way that is consistent across locales and platforms (but not suitable for displaying sorted strings to a user), see ASTextCaseSensitiveCmp().

Various exceptions may be raised.

Related Methods

ASTextCaseSensitiveCmp

Syntax

ASInt32 ASTextCmp(ASConstText str1, ASConstText str2);

Parameters

`str1`	The first text object.
`str2`	The second text object.

Returns

A negative number if str1 < str2, a positive number if str1 > str2, and 0 if they are equal.

ASTextCopy

Header: ASExtraProcs.h:581

Description

Copies the text in from to to, along with the country and language.

Syntax

void ASTextCopy(ASText to, ASConstText from);

Parameters

`to`	IN/OUT The destination text object.
`from`	The source text object. It is not modified.

ASTextDestroy

Header: ASExtraProcs.h:214

Description

Frees all memory associated with the text object.

Syntax

void ASTextDestroy(ASText str);

Parameters

`str`	IN/OUT A text object.

ASTextDup

Header: ASExtraProcs.h:591

Description

Creates a new ASText object that contains the same text/country/language as the one passed in.

Syntax

ASText ASTextDup(ASConstText str);

Parameters

`str`	A text object.

Returns

An ASText object.

Exceptions

`genErrBadParm`	Raised if `str` is NULL.

ASTextEval

Header: ASExtraProcs.h:2129

Description

Replaces percent-quoted expressions in the text object with the result of their evaluation, using key/value pairs in the ASCab. For example, for a text value containing the string "%keyone%%keytwo%", the value is replaced with the concatenation of the values of the keys keyone and keytwo in the ASCab passed in.

Syntax

void ASTextEval(ASText theText, ASCab params);

Parameters

`theText`	A text object containing percent-quoted expressions to replace.
`params`	The ASCab containing the key/value pairs to use for text replacement.

Returns

None.

Exceptions

`genErrBadParm`	Raised if `theText` is NULL.

ASTextFilter

Header: ASExtraProcs.h:2353

Description

Runs the specified filter on a text object, modifying the text as specified.

Syntax

void ASTextFilter(ASText text, ASTextFilterType filter);

Parameters

`text`	A text object modified by the method.
`filter`	The filter to run on the text object.

Returns

None.

Exceptions

`genErrBadParm`	Raised if `text` is NULL or if an invalid filter is specified.

ASTextFromEncoded

Header: ASExtraProcs.h:134

Description

Creates a new text object from a NULL-terminated multi-byte string in the specified host encoding.

Related Methods

ASTextFromSizedEncoded

Syntax

ASText ASTextFromEncoded(const char *str, ASHostEncoding encoding);

Parameters

`str`	The input string.
`encoding`	The host encoding.

Returns

An ASText object.

ASTextFromInt32

Header: ASExtraProcs.h:1561

Description

Creates a new string from an ASInt32 by converting the number to its decimal representation without punctuation or leading zeros.

Related Methods

ASTextFromUns32

Syntax

ASText ASTextFromInt32(ASInt32 num);

Parameters

`num`	A number of type ASInt32.

Returns

An ASText object.

ASTextFromPDText

Header: ASExtraProcs.h:190

Description

Creates a new string from some PDF text taken out of a PDF file. This is either a UTF-16 string with the bytes 0xFE 0xFF prepended to the front or a PDFDocEncoding string. In either case the string is expected to have the appropriate NULL termination. If the PDText is in UTF-16, it may have embedded language and country information; this will cause the ASText object to have its language and country codes set to the values found in the string.

Related Methods

ASTextFromSizedPDText

Syntax

ASText ASTextFromPDText(const char *str);

Parameters

`str`	A string.

Returns

An ASText object.

ASTextFromScriptText

Header: ASExtraProcs.h:160

Description

Creates a new string from a NULL-terminated multi-byte string of the specified script. This is a wrapper around ASTextFromEncoded(); the script is converted to a host encoding using ASScriptToHostEncoding().

Related Methods

ASTextFromSizedScriptText

Syntax

ASText ASTextFromScriptText(const char *str, ASScript script);

Parameters

`str`	A string.
`script`	The specified script.

Returns

An ASText object.

ASTextFromSizedEncoded

Header: ASExtraProcs.h:147

Description

Creates a new text object from a multi-byte string of the specified length in the specified host encoding.

Related Methods

ASTextFromEncoded

Syntax

ASText ASTextFromSizedEncoded(const char *str, ASTArraySize len, ASHostEncoding encoding);

Parameters

`str`	A string.
`len`	The length in bytes.
`encoding`	The specified host encoding.

Returns

An ASText object.

Exceptions

`genErrBadParm`	Raised if `len` < 0.

ASTextFromSizedPDText

Header: ASExtraProcs.h:207

Description

Creates a new string from some PDF text taken out of a PDF file. This is either a UTF-16 string with the bytes 0xFE 0xFF prepended to the front or a PDFDocEncoding string. If the PDText is in UTF-16, it may have embedded language and country information; this will cause the ASText object to have its language and country codes set to the values found in the string. The length parameter specifies the size, in bytes, of the string. The string must not contain embedded NULL characters.

Related Methods

ASTextFromPDText

Syntax

ASText ASTextFromSizedPDText(const char *str, ASTArraySize length);

Parameters

`str`	A string.
`length`	The length in bytes.

Returns

An ASText object.

ASTextFromSizedScriptText

Header: ASExtraProcs.h:174

Description

Creates a new text object from the specified multi-byte string of the specified script. This is a wrapper around ASTextFromEncoded(); the script is converted to a host encoding using ASScriptToHostEncoding().

Related Methods

ASTextFromScriptText

Syntax

ASText ASTextFromSizedScriptText(const char *str, ASTArraySize len, ASScript script);

Parameters

`str`	A string.
`len`	The length in bytes.
`script`	The specified script.

Returns

An ASText object.

ASTextFromSizedUnicode

Header: ASExtraProcs.h:123

Description

Creates a new text object from the specified Unicode string. This string is not expected to have 0xFE 0xFF prepended, or country/language identifiers.

The string cannot contain an embedded NULL character.

Related Methods

ASTextFromUnicode

Syntax

ASText ASTextFromSizedUnicode(const ASUTF16Val *ucs, ASUnicodeFormat format, ASTArraySize len);

Parameters

`ucs`	The Unicode string.
`format`	The Unicode format of `ucs`.
`len`	The length of `ucs` in bytes.

Returns

An ASText object.

Exceptions

`genErrBadParm`	Raised if `len` < 0.

ASTextFromUnicode

Header: ASExtraProcs.h:106

Description

Creates a new string from a NULL-terminated Unicode string. This string is not expected to have 0xFE 0xFF prepended, or country/language identifiers.

Related Methods

ASTextFromSizedUnicode

Syntax

ASText ASTextFromUnicode(const ASUTF16Val *ucs, ASUnicodeFormat format);

Parameters

`ucs`	A Unicode string.
`format`	The Unicode format used by `ucs`.

Returns

An ASText object.

ASTextFromUns32

Header: ASExtraProcs.h:1572

Description

Creates a new string from an ASUns32 by converting it to a decimal representation without punctuation or leading zeros.

Related Methods

ASTextFromInt32

Syntax

ASText ASTextFromUns32(ASUns32 num);

Parameters

`num`	A value of type ASUns32.

Returns

An ASText object.

ASTextGetBestEncoding

Header: ASExtraProcs.h:492

Description

Returns the best host encoding for representing the text. The best host encoding is the one that is least likely to lose characters during the conversion from Unicode to host. If the string can be represented accurately in multiple encodings (for example, it is low-ASCII text that can be correctly represented in any host encoding), ASTextGetBestEncoding() returns the preferred encoding based on the preferredEncoding parameter.

Various exceptions may be raised.

Related Methods

ASTextGetBestScript

Syntax

ASHostEncoding ASTextGetBestEncoding(ASConstText str, ASHostEncoding preferredEncoding);

Parameters

`str`	An ASText string.
`preferredEncoding`	The preferred encoding. There is no default.

Returns

The text encoding.

Example

// Alternative 1: prefer the application's language encoding.
ASHostEncoding bestEncoding = ASTextGetBestEncoding(text, AVAppGetLanguageEncoding());

// Alternative 2: prefer the operating system encoding.
ASHostEncoding bestEncoding = ASTextGetBestEncoding(text, (ASHostEncoding)PDGetHostEncoding());

// Alternative 3: favor Roman encodings.
ASHostEncoding hostRoman = ASScriptToHostEncoding(kASRomanScript);
ASHostEncoding bestEncoding = ASTextGetBestEncoding(text, hostRoman);

ASTextGetBestScript

Header: ASExtraProcs.h:506

Description

Returns the best host script for representing the text. The functionality is similar to ASTextGetBestEncoding(), with the resulting host encoding converted to a script code using ASScriptFromHostEncoding().

Related Methods

ASTextGetBestEncoding

Syntax

ASScript ASTextGetBestScript(ASConstText str, ASScript preferredScript);

Parameters

`str`	An ASText string.
`preferredScript`	The preferred host script. There is no default.

Returns

The best host script.

ASTextGetCountry

Header: ASExtraProcs.h:516

Description

Retrieves the country associated with an ASText object.

Related Methods

ASTextSetCountry

Syntax

ASCountryCode ASTextGetCountry(ASConstText text);

Parameters

`text`	An ASText object.

Returns

The country code.

ASTextGetEncoded

Header: ASExtraProcs.h:388

Description

Returns a NULL-terminated string in the given encoding. The memory to which this string points is owned by the ASText object and may not be valid after additional operations are performed on the object.

Various exceptions may be raised.

Related Methods

ASTextGetEncodedCopy

Syntax

const char *ASTextGetEncoded(ASConstText str, ASHostEncoding encoding);

Parameters

`str`	An ASText object.
`encoding`	The specified host encoding.

Returns

A pointer to a NULL-terminated string corresponding to the text in str.

ASTextGetEncodedCopy

Header: ASExtraProcs.h:402

Description

Returns a copy of a string in a specified encoding.

Related Methods

ASTextGetEncoded

Syntax

char *ASTextGetEncodedCopy(ASConstText str, ASHostEncoding encoding);

Parameters

`str`	An ASText object.
`encoding`	The specified encoding.

Returns

A copy of the text in str. The client owns the resulting information and is responsible for freeing it using ASfree().

Exceptions

`genErrNoMemory`	Raised if memory could not be allocated for the copy.

ASTextGetLanguage

Header: ASExtraProcs.h:538

Description

Retrieves the language code associated with an ASText object.

Related Methods

ASTextSetLanguage

Syntax

ASLanguageCode ASTextGetLanguage(ASConstText text);

Parameters

`text`	An ASText object.

Returns

The language code.

ASTextGetPDTextCopy

Header: ASExtraProcs.h:461

Description

Returns the text in a form suitable for storage in a PDF file. If the text can be represented using PDFDocEncoding, it is; otherwise it is represented in big-endian UTF-16 format with 0xFE 0xFF prepended to the front and any country/language codes embedded in an escape sequence right after 0xFE 0xFF.

You can determine if the string is Unicode by inspecting the first two bytes. The Unicode case is used if the string has a language and country code set. The resulting string is NULL-terminated as appropriate. That is, one NULL byte is used for PDFDocEncoding, two are used for UTF-16.

Various exceptions may be raised.

Syntax

char *ASTextGetPDTextCopy(ASConstText str, ASTArraySize *len);

Parameters

`str`	A string.
`len`	The length in bytes of the resulting string, not counting the `NULL` bytes at the end.

Returns

A string copy. The client owns the resulting information and is responsible for freeing it with ASfree().
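
The first-two-bytes check described above can be sketched in standalone C. The helper name `pdtext_is_utf16` is hypothetical and is not part of the SDK; it only inspects a raw byte buffer such as the one this method returns, together with the byte count reported through len.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an SDK call: reports whether a PDText byte
 * buffer starts with the big-endian UTF-16 marker 0xFE 0xFF.
 * Any other buffer is treated as PDFDocEncoding. */
int pdtext_is_utf16(const char *bytes, size_t len)
{
    const unsigned char *p = (const unsigned char *)bytes;

    assert(bytes != NULL || len == 0);  /* a NULL buffer must be empty */
    return len >= 2 && p[0] == 0xFE && p[1] == 0xFF;
}
```

A caller would typically use the result to decide whether the buffer ends in one NULL byte (PDFDocEncoding) or two (UTF-16).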

ASTextGetScriptText

Header: ASExtraProcs.h:420

Description

Converts the Unicode string in the ASText object to the appropriate script, and returns a pointer to the converted text. The memory to which it points is owned by the ASText object and must not be altered or destroyed by the client. The memory may also become invalid after subsequent operations are applied to the ASText object.

Various exceptions may be raised.

Related Methods

ASTextGetScriptTextCopy

Syntax

const char *ASTextGetScriptText(ASConstText str, ASScript script);

Parameters

`str`	IN/OUT A string.
`script`	IN/OUT The writing script.

Returns

A pointer to the converted text. The memory is owned by the ASText object.

ASTextGetScriptTextCopy

Header: ASExtraProcs.h:437

Description

Converts the Unicode string in the ASText object to the appropriate script and returns a pointer to the converted text. The memory to which it points is owned by the client, which is responsible for freeing it using ASfree().

Syntax

char *ASTextGetScriptTextCopy(ASConstText str, ASScript script);

Parameters

`str`	A string.
`script`	A writing script.

Returns

A string copy. The client owns the resulting information.

Exceptions

genErrNoMemory	is raised if memory could not be allocated for the copy.

ASTextGetUnicode

Header: ASExtraProcs.h:351

Description

Returns a pointer to a string in kUTF16HostEndian format (see ASUnicodeFormat). The memory to which this string points is owned by the ASText object, and may not be valid after additional operations are performed on the object.

The Unicode text returned will not have 0xFE 0xFF prepended or any language or country codes.

Related Methods

ASTextGetUnicodeCopy

Syntax

const ASUTF16Val *ASTextGetUnicode(ASConstText str);

Parameters

`str`	A string.

Returns

A pointer to the text in kUTF16HostEndian format. The memory is owned by the ASText object.

ASTextGetUnicodeCopy

Header: ASExtraProcs.h:371

Description

Returns a pointer to a NULL-terminated string in the specified Unicode format. The memory to which this string points is owned by the client, which can modify it at will and is responsible for freeing it using ASfree().

The Unicode text returned will not have 0xFE 0xFF prepended or any language or country codes.

Related Methods

ASTextGetUnicode

Syntax

ASUTF16Val *ASTextGetUnicodeCopy(ASConstText str, ASUnicodeFormat format);

Parameters

`str`	A string.
`format`	The Unicode format.

Returns

A string copy. The client owns the resulting information.

Exceptions

genErrNoMemory	is raised if memory could not be allocated for the copy.

ASTextIsEmpty

Header: ASExtraProcs.h:1540

Description

Determines whether the ASText object contains no text, for example, whether retrieving its Unicode text would yield a 0-length string.

Syntax

ASBool ASTextIsEmpty(ASConstText str);

Parameters

`str`	A string.

Returns

true if the ASText object contains no text, false otherwise.

ASTextMakeEmpty

Header: ASExtraProcs.h:1579

Description

Removes the contents of an ASText (turns it into an empty string).

Syntax

void ASTextMakeEmpty(ASText str);

ASTextMakeEmptyClear

Header: ASExtraProcs.h:2429

Description

Removes the contents of an ASText object (converts it into an empty string) and clears the released storage. This is intended for security-sensitive strings.

Syntax

void ASTextMakeEmptyClear(ASText str);

ASTextNew

Header: ASExtraProcs.h:94

Description

Creates a new text object containing no text.

Syntax

ASText ASTextNew(void);

Returns

An ASText object.

Exceptions

genErrNoMemory

ASTextNormalizeEndOfLine

Header: ASExtraProcs.h:1550

Description

Replaces all end-of-line characters within the ASText object with the correct end-of-line character for the current platform. For example, on Windows, \r and \n are replaced with \r\n.

Syntax

void ASTextNormalizeEndOfLine(ASText text);

Parameters

`text`	An object of type ASText.
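
The Windows case mentioned above can be illustrated on a plain C string. This is a standalone sketch, not the SDK's implementation: `normalize_eol_crlf` is a hypothetical name, and it assumes that an existing \r\n pair counts as a single line ending rather than two.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an SDK call: returns a malloc'd copy of `in`
 * with every line ending rewritten as "\r\n" (the Windows convention).
 * Assumption: an existing "\r\n" pair counts as one line ending.
 * The caller frees the result with free(). Returns NULL on allocation failure. */
char *normalize_eol_crlf(const char *in)
{
    size_t n, i, o = 0;
    char *out;

    assert(in != NULL);
    n = strlen(in);
    out = malloc(2 * n + 1);   /* worst case: every byte is a lone line ending */
    if (out == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        if (in[i] == '\r' || in[i] == '\n') {
            if (in[i] == '\r' && in[i + 1] == '\n')
                i++;           /* consume the '\n' of an existing pair */
            out[o++] = '\r';
            out[o++] = '\n';
        } else {
            out[o++] = in[i];
        }
    }
    out[o] = '\0';
    return out;
}
```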

ASTextReplace

Header: ASExtraProcs.h:631

Description

Replaces all occurrences of toReplace in src with the text specified in replacement. This uses an ASText string to indicate the toReplace string; ASTextReplaceASCII() uses a low ASCII Roman string to indicate the text to replace.

Various exceptions may be raised.

Syntax

void ASTextReplace(ASText src, ASConstText toReplace, ASConstText replacement);

Parameters

`src`	Source text.
`toReplace`	Text in source text to replace.
`replacement`	Text used in replacement.

ASTextReplaceASCII

Header: ASExtraProcs.h:654

Description

Replaces all occurrences of toReplace in src with the text specified in replacement. ASTextReplace() uses an ASText string to indicate the toReplace string; this uses a low-ASCII Roman string to indicate the text to replace.

This call is intended for formatting strings for the user interface. For example, it can be used for replacing a known sequence such as '%1' with other text. Be sure to use only low ASCII characters, which are safe on all platforms. Avoid using backslash and currency symbols.

Various exceptions may be raised.

Related Methods

ASTextReplace

ASTextReplaceBadChars

Syntax

void ASTextReplaceASCII(ASText src, const char *toReplace, ASConstText replacement);

Parameters

`src`	The ASText object containing the text.
`toReplace`	The text to replace.
`replacement`	The replacement text.
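
The '%1' substitution pattern described above can be sketched on ordinary C strings. The function `replace_all` is a hypothetical stand-in that mirrors the replace-all-occurrences behavior; it does not call the SDK and makes no claim about how ASTextReplaceASCII() is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an SDK call: returns a malloc'd copy of `src`
 * in which every occurrence of `to_replace` is replaced by `replacement`.
 * The caller frees the result with free(). Returns NULL on allocation failure. */
char *replace_all(const char *src, const char *to_replace, const char *replacement)
{
    size_t find_len, repl_len, count = 0, out_len;
    const char *p, *hit;
    char *out, *w;

    assert(src != NULL && to_replace != NULL && replacement != NULL);
    find_len = strlen(to_replace);
    repl_len = strlen(replacement);
    assert(find_len > 0);   /* an empty search string would match everywhere */

    /* First pass: count matches so the output can be sized exactly. */
    for (p = src; (hit = strstr(p, to_replace)) != NULL; p = hit + find_len)
        count++;

    out_len = strlen(src) - count * find_len + count * repl_len;
    out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy unmatched runs and substitute each match. */
    w = out;
    for (p = src; (hit = strstr(p, to_replace)) != NULL; p = hit + find_len) {
        memcpy(w, p, (size_t)(hit - p));
        w += hit - p;
        memcpy(w, replacement, repl_len);
        w += repl_len;
    }
    strcpy(w, p);           /* tail after the last match */
    return out;
}
```

For instance, replacing "%1" with "12" in "Page %1 of %1" produces "Page 12 of 12".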

ASTextReplaceBadChars

Header: ASExtraProcs.h:1598

Description

Replaces all occurrences of characters contained in the list pszBadCharList in the text with the specified replacement character.

Various exceptions may be raised.

Related Methods

ASTextReplace

ASTextReplaceASCII

Syntax

void ASTextReplaceBadChars(ASText str, const char *pszBadCharList, char replaceChar);

Parameters

`str`	The text in which to replace characters.
`pszBadCharList`	A list of characters to replace, in sorted order with no duplicates.
`replaceChar`	The character with which to replace any character appearing in the list.
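
The requirement that pszBadCharList be sorted with no duplicates suggests a binary search over the list. The sketch below works on a plain C string under that assumption; `replace_bad_chars` and `cmp_char` are hypothetical names, and the SDK's actual lookup strategy is not documented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for bsearch() over an array of single bytes. */
static int cmp_char(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not an SDK call: replaces, in place, every byte of
 * `s` that appears in `sorted_bad` with `replace_char`.
 * `sorted_bad` must be sorted by byte value and contain no duplicates. */
void replace_bad_chars(char *s, const char *sorted_bad, char replace_char)
{
    size_t n;

    assert(s != NULL && sorted_bad != NULL);
    n = strlen(sorted_bad);
    for (; *s != '\0'; s++) {
        if (bsearch(s, sorted_bad, n, 1, cmp_char) != NULL)
            *s = replace_char;
    }
}
```

With the list "*/:" (already in ascending byte order) and '_' as the replacement, "a/b:c*d" becomes "a_b_c_d".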

ASTextSetCountry

Header: ASExtraProcs.h:529

Description

Sets the country code associated with a piece of text. ASText objects can have country and language codes associated with them. These can be explicitly set or parsed from the Unicode form of PDText strings.

Related Methods

ASTextGetCountry

Syntax

void ASTextSetCountry(ASText text, ASCountryCode country);

Parameters

`text`	IN/OUT An ASText object.
`country`	IN/OUT Country code.

ASTextSetEncoded

Header: ASExtraProcs.h:257

Description

Replaces the contents of an existing ASText object with a NULL-terminated multi-byte string in the specified host encoding.

Related Methods

ASTextSetSizedEncoded

Syntax

void ASTextSetEncoded(ASText str, const char *text, ASHostEncoding encoding);

Parameters

`str`	IN/OUT An ASText object to hold the string.
`text`	IN/OUT A pointer to the text string.
`encoding`	IN/OUT The type of encoding.

Exceptions

genErrBadParm	is raised if text is NULL.

ASTextSetLanguage

Header: ASExtraProcs.h:548

Description

Sets the language code associated with a piece of text.

Related Methods

ASTextGetLanguage

Syntax

void ASTextSetLanguage(ASText text, ASLanguageCode language);

Parameters

`text`	IN/OUT An ASText object.
`language`	IN/OUT The language code.

ASTextSetPDText

Header: ASExtraProcs.h:316

Description

Replaces the contents of an existing ASText object with PDF text taken out of a PDF file. The text is either a big-endian UTF-16 string with 0xFE 0xFF prepended to the front or a PDFDocEncoding string. In either case the string is expected to have the appropriate NULL termination. If the PDText is in UTF-16, it may have embedded language and country information; this will cause the ASText object to have its language and country codes set to the values found in the string.

Related Methods

ASTextSetSizedPDText

Syntax

void ASTextSetPDText(ASText str, const char *text);

Parameters

`str`	A string.
`text`	A text string.
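
To make the UTF-16 input layout concrete, the following standalone sketch builds such a buffer from 7-bit ASCII: 0xFE 0xFF in front, each character as two big-endian bytes, and two zero bytes as the terminator. The helper `ascii_to_pdtext_utf16` is hypothetical and is not an SDK call; it covers ASCII input only and embeds no language or country information.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an SDK call: encodes a 7-bit ASCII string as
 * big-endian UTF-16 with 0xFE 0xFF in front and two terminating zero bytes.
 * *out_len receives the byte count excluding the terminator.
 * The caller frees the result with free(). Returns NULL on allocation failure. */
char *ascii_to_pdtext_utf16(const char *ascii, size_t *out_len)
{
    size_t n, i;
    unsigned char *buf;

    assert(ascii != NULL && out_len != NULL);
    n = strlen(ascii);
    buf = malloc(2 + 2 * n + 2);   /* marker + text + terminator */
    if (buf == NULL)
        return NULL;

    buf[0] = 0xFE;
    buf[1] = 0xFF;
    for (i = 0; i < n; i++) {
        assert((unsigned char)ascii[i] < 0x80);  /* ASCII only in this sketch */
        buf[2 + 2 * i] = 0x00;                   /* high byte comes first */
        buf[3 + 2 * i] = (unsigned char)ascii[i];
    }
    buf[2 + 2 * n] = 0x00;
    buf[3 + 2 * n] = 0x00;
    *out_len = 2 + 2 * n;
    return (char *)buf;
}
```

A buffer built this way has the shape that the text parameter is described as accepting in the UTF-16 case.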

ASTextSetScriptText

Header: ASExtraProcs.h:285

Description

Replaces the contents of an existing ASText object with a NULL-terminated multi-byte string of the specified script. This is a wrapper around ASTextFromEncoded(); the script is converted to a host encoding using ASScriptToHostEncoding().

Related Methods

ASTextSetSizedScriptText

Syntax

void ASTextSetScriptText(ASText str, const char *text, ASScript script);

Parameters

`str`	IN/OUT A string.
`text`	IN/OUT A pointer to the text string.
`script`	IN/OUT The writing script.

ASTextSetSizedEncoded

Header: ASExtraProcs.h:272

Description

Replaces the contents of an existing ASText object with a multi-byte string in the specified host encoding and of the specified length. The text does not need to be NULL-terminated, and no NULL (zero) bytes should appear in the characters passed in.

Related Methods

ASTextSetEncoded

Syntax

void ASTextSetSizedEncoded(ASText str, const char *text, ASTArraySize len, ASHostEncoding encoding);

Parameters

`str`	IN/OUT A string.
`text`	IN/OUT A pointer to the text string.
`len`	IN/OUT The length of the text string.
`encoding`	IN/OUT The host encoding type.

Exceptions

genErrBadParm	is raised if text is NULL.

ASTextSetSizedPDText

Header: ASExtraProcs.h:335

Description

Replaces the contents of an existing ASText object with PDF text taken out of a PDF file. The text is either a big-endian UTF-16 string with 0xFE 0xFF prepended to the front or a PDFDocEncoding string. In either case the length parameter indicates the number of bytes in the string. The string should not be NULL-terminated and must not contain any NULL characters. If the PDText is in UTF-16, it may have embedded language and country information; this will cause the ASText object to have its language and country codes set to the values found in the string.

Related Methods

ASTextSetPDText

Syntax

void ASTextSetSizedPDText(ASText str, const char *text, ASTArraySize length);

Parameters

`str`	A string.
`text`	A pointer to a text string.
`length`	The length of the text string.

ASTextSetSizedScriptText

Header: ASExtraProcs.h:300

Description

Replaces the contents of an existing ASText object with the specified multi-byte string of the specified script. This is a wrapper around ASTextFromSizedEncoded(); the script is converted to a host encoding using ASScriptToHostEncoding().

Syntax

void ASTextSetSizedScriptText(ASText str, const char *text, ASTArraySize len, ASScript script);

Parameters

`str`	IN/OUT A string.
`text`	IN/OUT A pointer to the text string.
`len`	IN/OUT The length of the text string.
`script`	IN/OUT The writing script.

Exceptions

genErrBadParm	is raised if text is NULL.

ASTextSetSizedUnicode

Header: ASExtraProcs.h:243

Description

Replaces the contents of an existing ASText object with the specified Unicode string. This string is not expected to have 0xFE 0xFF prepended or embedded country/language identifiers.

The string cannot contain a NULL character.

Related Methods

ASTextSetUnicode

Syntax

void ASTextSetSizedUnicode(ASText str, const ASUTF16Val *ucsValue, ASUnicodeFormat format, ASTArraySize len);

Parameters

`str`	(Filled by the method) A string.
`ucsValue`	A Unicode string.
`format`	The Unicode format.
`len`	The length of the string in bytes.
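
Because len is a byte count rather than a character count, a caller holding a NULL-terminated UTF-16 buffer needs to multiply the number of 16-bit units by two. The sketch below shows that calculation in standalone C; `utf16_byte_length` is a hypothetical helper, and `uint16_t` stands in for ASUTF16Val on the assumption that the SDK type is a 16-bit unsigned integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an SDK call: returns the length in bytes of a
 * zero-terminated UTF-16 string, not counting the terminator.
 * uint16_t is used as a stand-in for the SDK's ASUTF16Val (assumption). */
size_t utf16_byte_length(const uint16_t *s)
{
    size_t units = 0;

    assert(s != NULL);
    while (s[units] != 0)
        units++;
    return units * sizeof(uint16_t);
}
```

Note that a character outside the Basic Multilingual Plane occupies two 16-bit units, so it contributes four bytes to the total.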

ASTextSetUnicode

Header: ASExtraProcs.h:226

Description

Replaces the contents of an existing ASText object with a NULL-terminated Unicode string. This string is not expected to have 0xFE 0xFF prepended or embedded country/language identifiers.

Related Methods

ASTextSetSizedUnicode

Syntax

void ASTextSetUnicode(ASText str, const ASUTF16Val *ucsValue, ASUnicodeFormat format);

Parameters

`str`	(Filled by the method) A string.
`ucsValue`	A Unicode string.
`format`	The Unicode format.

ASUCS_GetPasswordFromUnicode

Header: ASExtraProcs.h:2441

Description

Converts user input of a password to a form that can be used by Acrobat to open a file.

Syntax

void ASUCS_GetPasswordFromUnicode(ASUTF16Val *inPassword, void **outPassword, ASBool useUTF);

Parameters

`inPassword`	IN A host-endian, 16-bit `NULL`-terminated Unicode string.
`outPassword`	OUT A location to store a pointer to an allocated `char *` `NULL`-terminated string.
`useUTF`	IN A flag for controlling the conversion. Prior to Acrobat 9.0, passwords were converted from host code-page encoding (8-bit mode) to `PDFDocEncoding`. If `useUTF == false`, this routine does the same, starting from 16-bit Unicode. With encryption, Acrobat 9.0 and later allows Unicode passwords, normalized and converted to UTF-8 encoding. If `useUTF == true`, such a Unicode password is what is returned.
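
The UTF-8 step of the `useUTF == true` path can be illustrated with a standalone converter. This is only a sketch of a generic UTF-16 to UTF-8 conversion: `utf16_to_utf8` is a hypothetical name, `uint16_t` stands in for ASUTF16Val, and the normalization that the description mentions is deliberately left out, so the output is not guaranteed to match what this method returns.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not an SDK call: converts a zero-terminated,
 * host-endian UTF-16 string to a malloc'd zero-terminated UTF-8 string.
 * Unpaired surrogates become U+FFFD. No Unicode normalization is applied.
 * The caller frees the result with free(). Returns NULL on allocation failure. */
char *utf16_to_utf8(const uint16_t *in)
{
    size_t units = 0, i, o = 0;
    unsigned char *out;

    assert(in != NULL);
    while (in[units] != 0)
        units++;
    out = malloc(3 * units + 1);   /* each 16-bit unit needs at most 3 bytes */
    if (out == NULL)
        return NULL;

    for (i = 0; i < units; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < units &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            /* Combine a valid surrogate pair into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;           /* unpaired surrogate */
        }

        if (cp < 0x80) {
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return (char *)out;
}
```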