Convert between different charsets.
Defines | |
#define | CHUNK_ALLOC 4 |
Functions | |
static void | _iconv_close (iconv_t *cd) |
static const char * | collate2charset (int sql_collate, int lcid) |
static int | lookup_canonic (const CHARACTER_SET_ALIAS aliases[], const char *charset_name) |
static int | skip_one_input_sequence (iconv_t cd, const TDS_ENCODING *charset, const char **input, size_t *input_size) |
Move the input sequence pointer to the next valid position. | |
void | tds7_srv_charset_changed (TDSSOCKET *tds, int sql_collate, int lcid) |
static int | tds_canonical_charset (const char *charset_name) |
Determine canonical iconv character set. | |
const char * | tds_canonical_charset_name (const char *charset_name) |
Determine canonical iconv character set name. | |
size_t | tds_iconv (TDSSOCKET *tds, const TDSICONV *conv, TDS_ICONV_DIRECTION io, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft) |
Wrapper around iconv(3). | |
void | tds_iconv_close (TDSSOCKET *tds) |
size_t | tds_iconv_fread (iconv_t cd, FILE *stream, size_t field_len, size_t term_len, char *outbuf, size_t *outbytesleft) |
Read a data file, passing the data through iconv(). | |
void | tds_iconv_free (TDSSOCKET *tds) |
TDSICONV * | tds_iconv_from_collate (TDSSOCKET *tds, int sql_collate, int lcid) |
Get iconv information from an LCID (to support different column encoding under MSSQL2K). | |
static TDSICONV * | tds_iconv_get_info (TDSSOCKET *tds, const char *canonic_charset) |
Get an iconv info structure, allocating and initializing it if needed. | |
static void | tds_iconv_info_close (TDSICONV *char_conv) |
static int | tds_iconv_info_init (TDSICONV *char_conv, const char *client_name, const char *server_name) |
Open iconv descriptors to convert between character sets (both directions). | |
void | tds_iconv_open (TDSSOCKET *tds, const char *charset) |
void | tds_srv_charset_changed (TDSSOCKET *tds, const char *charset) |
const char * | tds_sybase_charset_name (const char *charset_name) |
Determine the name Sybase uses for a character set, given a canonical iconv name. | |
size_t | tds_sys_iconv (iconv_t cd, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft) |
int | tds_sys_iconv_close (iconv_t cd) |
iconv_t | tds_sys_iconv_open (const char *tocode, const char *fromcode) |
Inputs are FreeTDS canonical names, no other. |
Convert between different charsets.
Set up the initial iconv conversion descriptors.
When the socket is allocated, three TDSICONV structures are attached to iconv. They have fixed meanings:
Other designs that use less data are possible, but these three conversions are needed very often. By reserving them, we avoid searching the array for our most common purposes.
To work around differing iconv charset names and other portability problems, FreeTDS maintains a list of aliases for each charset.
First we discover the names of our minimum required charsets (UTF-8, ISO8859-1 and UCS2). Later, as and when it's needed, we try to discover others.
There is one list of canonic names (GNU iconv names) and two sets of aliases (one for other iconv implementations and another for Sybase). For every canonic charset name we cache the iconv name found during discovery.
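The alias scheme described above can be sketched as a table that maps every known spelling of a charset to an index into the canonical list. This is only an illustration: the type `charset_alias_sketch`, the table contents, and both functions are hypothetical and are not the FreeTDS definitions of `CHARACTER_SET_ALIAS`, `lookup_canonic` or `tds_canonical_charset_name`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: an alias and the index of its canonical name. */
typedef struct {
    const char *alias;
    int canonic;
} charset_alias_sketch;

/* Illustrative canonical names only; the real list is much longer. */
static const char *const canonic_names_sketch[] = {
    "ISO-8859-1", "UTF-8", "UCS-2LE"
};

/* Illustrative aliases, terminated by a NULL alias. */
static const charset_alias_sketch aliases_sketch[] = {
    { "ISO-8859-1", 0 }, { "ISO8859-1", 0 }, { "latin1", 0 }, { "iso_1", 0 },
    { "UTF-8", 1 },      { "UTF8", 1 },      { "utf8", 1 },
    { "UCS-2LE", 2 },    { "UCS2", 2 },
    { NULL, -1 }
};

/* Return the canonical index for charset_name, or -1 if it is unknown. */
static int lookup_canonic_sketch(const charset_alias_sketch aliases[],
                                 const char *charset_name)
{
    size_t i;

    for (i = 0; aliases[i].alias != NULL; ++i) {
        if (strcmp(charset_name, aliases[i].alias) == 0)
            return aliases[i].canonic;
    }
    return -1;
}

/* Return the canonical name for any known alias, or NULL if it is unknown. */
static const char *canonical_name_sketch(const char *charset_name)
{
    int idx = lookup_canonic_sketch(aliases_sketch, charset_name);

    return idx < 0 ? NULL : canonic_names_sketch[idx];
}
```

A cache of the names discovered for the local iconv implementation could be kept in a second array indexed by the same canonical index, so that discovery runs at most once per charset.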
static int skip_one_input_sequence | ( | iconv_t | cd, | |
const TDS_ENCODING * | charset, | |||
const char ** | input, | |||
size_t * | input_size | |||
) | [static] |
Move the input sequence pointer to the next valid position.
Used when an input character cannot be converted.
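As a rough illustration of what "moving to the next valid position" can mean, the sketch below handles two simple cases: a fixed-width charset, and UTF-8. It is not the FreeTDS implementation. The function name and the `width` parameter are assumptions made for this example, and it takes no `iconv_t` or `TDS_ENCODING` argument.

```c
#include <stddef.h>

/*
 * Hypothetical helper: advance *input past one character.
 * width > 0 : fixed-width charset, skip exactly `width` bytes.
 * width == 0: treat the input as UTF-8, skip the lead byte and any
 *             continuation bytes (bit pattern 10xxxxxx) that follow it.
 * Returns 1 if the pointer was advanced, 0 if there was not enough input.
 */
static int skip_one_sequence_sketch(unsigned width, const char **input,
                                    size_t *input_size)
{
    const unsigned char *p = (const unsigned char *) *input;
    size_t n;

    if (*input_size == 0)
        return 0;

    if (width > 0) {
        n = width;
    } else {
        n = 1;
        while (n < *input_size && (p[n] & 0xC0) == 0x80)
            ++n;
    }

    if (n > *input_size)
        return 0;

    *input += n;
    *input_size -= n;
    return 1;
}
```

Variable-width charsets other than UTF-8 need knowledge of the encoding itself and are deliberately left out of this sketch.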
static int tds_canonical_charset | ( | const char * | charset_name | ) | [static] |
Determine canonical iconv character set.
const char* tds_canonical_charset_name | ( | const char * | charset_name | ) |
Determine canonical iconv character set name.
size_t tds_iconv | ( | TDSSOCKET * | tds, | |
const TDSICONV * | conv, | |||
TDS_ICONV_DIRECTION | io, | |||
const char ** | inbuf, | |||
size_t * | inbytesleft, | |||
char ** | outbuf, | |||
size_t * | outbytesleft | |||
) |
Wrapper around iconv(3).
Same parameters, with slightly different behavior.
tds | state information for the socket and the TDS protocol | |
io | Enumerated value indicating whether the data are being sent to or received from the server. | |
conv | information about the encodings involved, including the iconv(3) conversion descriptors. | |
inbuf | address of pointer to the input buffer of data to be converted. | |
inbytesleft | address of count of bytes in inbuf. | |
outbuf | address of pointer to the output buffer. | |
outbytesleft | address of count of bytes in outbuf. |
Returns | number of irreversible conversions performed, or -1 on error; see the iconv(3) documentation for a description of the possible values of errno. |
If a character in inbuf cannot be converted because no such character exists in the outbuf character set, we emit messages similar to the ones Sybase emits when it fails such a conversion. The message varies depending on the direction of the data. On a read error, we emit Msg 2403, Severity 16 (EX_INFO): "WARNING! Some character(s) could not be converted into client's character set. Unconverted bytes were changed to question marks ('?')." On a write error, we emit Msg 2402, Severity 16 (EX_USER): "Error converting client characters into server's character set. Some character(s) could not be converted." and return an error code. Client libraries relying on this routine should reflect an error back to the application.
Todo:
Check for variable multibyte non-UTF-8 input character set.
Use more robust error message generation.
For reads, cope with outbuf encodings that don't have the equivalent of an ASCII '?'.
Support alternative to '?' for the replacement character.
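The read-side behavior described for tds_iconv (unconverted bytes become question marks) can be approximated with plain iconv(3) as follows. This is a minimal sketch under stated assumptions, not the FreeTDS code: the function name is hypothetical, it emits no Sybase-style messages, it skips exactly one input byte per failure (correct only for single-byte input charsets), and it assumes the output encoding represents '?' as a single ASCII byte.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: convert as much as possible, writing '?' for each
 * input byte that iconv rejects with EILSEQ.
 * Returns the number of '?' substitutions made here, or (size_t) -1 on any
 * other error (errno is left as set by iconv, or set to E2BIG when there is
 * no room for the replacement character).
 */
static size_t convert_with_question_marks_sketch(iconv_t cd,
                                                 char **inbuf, size_t *inleft,
                                                 char **outbuf, size_t *outleft)
{
    size_t replaced = 0;

    while (*inleft > 0) {
        if (iconv(cd, inbuf, inleft, outbuf, outleft) != (size_t) -1)
            break;                      /* all remaining input was converted */

        if (errno != EILSEQ)
            return (size_t) -1;         /* E2BIG or EINVAL: caller decides */

        if (*outleft == 0) {
            errno = E2BIG;
            return (size_t) -1;
        }

        /* Skip the offending byte and write the replacement character. */
        ++*inbuf;
        --*inleft;
        *(*outbuf)++ = '?';
        --*outleft;
        ++replaced;
    }
    return replaced;
}
```

A caller would open the descriptor with `iconv_open()`, pass the addresses of its buffer pointers and byte counts, and treat a non-zero return value as the cue to issue a warning. Some iconv implementations substitute a character of their own instead of failing with EILSEQ, in which case the loop above makes no substitutions.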
size_t tds_iconv_fread | ( | iconv_t | cd, | |
FILE * | stream, | |||
size_t | field_len, | |||
size_t | term_len, | |||
char * | outbuf, | |||
size_t * | outbytesleft | |||
) |
Read a data file, passing the data through iconv().
static int tds_iconv_info_init | ( | TDSICONV * | char_conv, | |
const char * | client_name, | |||
const char * | server_name | |||
) | [static] |
Open iconv descriptors to convert between character sets (both directions).
1. Look up the canonical names of the character sets.
2. Look up their widths.
3. Ask iconv to open a conversion descriptor.
4. Fail if any of the above offer any resistance.
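Steps 3 and 4 can be sketched with plain iconv(3) calls. The structure and function names below are hypothetical stand-ins rather than the FreeTDS `TDSICONV` layout, and the sketch assumes the canonical-name and width lookups of steps 1 and 2 have already been done by the caller.

```c
#include <iconv.h>

/* Hypothetical pair of descriptors, one per direction. */
typedef struct {
    iconv_t to_wire;    /* client charset -> server charset */
    iconv_t from_wire;  /* server charset -> client charset */
} conv_pair_sketch;

/* Close whichever descriptors are open and mark both as invalid. */
static void conv_pair_close_sketch(conv_pair_sketch *p)
{
    if (p->to_wire != (iconv_t) -1)
        iconv_close(p->to_wire);
    if (p->from_wire != (iconv_t) -1)
        iconv_close(p->from_wire);
    p->to_wire = (iconv_t) -1;
    p->from_wire = (iconv_t) -1;
}

/*
 * Open descriptors for both directions.
 * Returns 1 on success. On any failure, closes whatever was opened and
 * returns 0, so the caller never sees a half-initialized pair.
 */
static int conv_pair_open_sketch(conv_pair_sketch *p,
                                 const char *client_name,
                                 const char *server_name)
{
    /* iconv_open takes the destination charset first, then the source. */
    p->to_wire = iconv_open(server_name, client_name);
    p->from_wire = iconv_open(client_name, server_name);

    if (p->to_wire == (iconv_t) -1 || p->from_wire == (iconv_t) -1) {
        conv_pair_close_sketch(p);
        return 0;
    }
    return 1;
}
```

Failing as a unit keeps the error handling in callers simple: a pair is either usable in both directions or not usable at all.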
const char* tds_sybase_charset_name | ( | const char * | charset_name | ) |
Determine the name Sybase uses for a character set, given a canonical iconv name.
iconv_t tds_sys_iconv_open | ( | const char * | tocode, | |
const char * | fromcode | |||
) |
Inputs are FreeTDS canonical names, no other.
No alias list is consulted.