
fopen for everything: is this possible?

As I suggested in a comment, you should take a look at ICU (libicu), a cross-platform C library for Unicode handling created by IBM. It also provides support for C++ and Java, with a very powerful string class. It is used in many places, such as Android and iOS, so it is very stable and mature.

I used to program for Windows, but I want to try writing a cross-platform application. I have a few questions, if you don't mind:

Question 1

Is there a way to open a Unicode or ASCII file and detect its encoding automatically using bare ANSI C? MSDN says that fopen() can switch between several Unicode formats (UTF-8, UTF-16, Unicode BE/LE) if I use the "ccs=UNICODE" flag. I found experimentally that the switch from Unicode to ASCII does not happen, but while trying to solve this I discovered that Unicode text files have prefixes such as 0xFFFE, 0xFEFF or 0xFEBB.

    FILE *file;
    {
        __int16 isUni;
        file = _tfopen(filename, _T("rb"));
        fread(&(isUni), 1, 2, file);
        fclose(file);
        if (isUni == (__int16)0xFFFE || isUni == (__int16)0xFEFF || isUni == (__int16)0xFEBB)
            file = _tfopen(filename, _T("r,ccs=UNICODE"));
        else
            file = _tfopen(filename, _T("r"));
    }

So, can I do something like this in a cross-platform and less ugly way?

Question 2

I can do something like this on Windows, but will it work on Linux?

    file = fopen(filename, "r");
    fwscanf(file,"%lf",buffer);

If not, is there some ANSI C function to convert ASCII strings to Unicode? I want to work with Unicode strings in my program.

Question 3

I also need to print Unicode strings to the console. There is setlocale(*) on Windows, but what should I do on Linux? It seems the console already handles this there.

Question 4

Generally speaking, I want to work with Unicode in my program, but I ran into some strange problems:

    f = fopen("inc.txt","rt");
    fwprintf(f,L"Текст");   // converted successfully
    fclose(f);
    f = fopen("inc_u8.txt","rt, ccs = UNICODE");
    fprintf(f,"text");      // failed to convert
    fclose(f);

P.S. Is there a good book on cross-platform programming, something that compares Windows and Linux code? And some books on practical ways of using Unicode? I don't want to dive into the history of Unicode BE/LE; I'm interested in specific C/C++ libraries.

Question 1:

Yes, you can detect the byte order mark (BOM), which is the byte sequence you discovered, IF YOUR FILE HAS ONE.
A Google search will do the rest. As for "not so ugly": you can refactor and tidy your code, for example by writing a function that determines the BOM, calling it first, and then calling fopen or _tfopen as needed. You can then refactor that again and write your own fopen function. But it will still be ugly.
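
A minimal sketch of such a BOM-detecting helper in portable C might look like the following. The names `bom_encoding`, `bom_from_bytes` and `open_skipping_bom` are hypothetical, not part of any standard API. It assumes the usual BOM byte sequences (EF BB BF for UTF-8, FF FE for UTF-16 little-endian, FE FF for UTF-16 big-endian) and ignores UTF-32, whose little-endian BOM starts with the same two bytes as UTF-16 LE.

```c
#include <stdio.h>

/* Hypothetical names for illustration only. */
typedef enum { ENC_UNKNOWN, ENC_UTF8, ENC_UTF16_LE, ENC_UTF16_BE } bom_encoding;

/* Classify the first bytes of a buffer by their byte order mark.
   Bytes are compared one at a time, so host endianness does not matter. */
bom_encoding bom_from_bytes(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    return ENC_UNKNOWN;   /* no BOM: plain ASCII, BOM-less UTF-8, or something else */
}

/* Open a file in binary mode, report the detected BOM through *enc, and
   leave the stream positioned just past the BOM (or at the start if there
   is none). Returns NULL on failure. */
FILE *open_skipping_bom(const char *filename, bom_encoding *enc)
{
    unsigned char buf[3];
    size_t n;
    long skip;
    FILE *f = fopen(filename, "rb");

    if (f == NULL)
        return NULL;
    n = fread(buf, 1, sizeof buf, f);
    *enc = bom_from_bytes(buf, n);
    skip = (*enc == ENC_UTF8) ? 3 : (*enc == ENC_UNKNOWN) ? 0 : 2;
    if (fseek(f, skip, SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

The caller still has to decode the bytes that follow according to the reported encoding; this sketch only covers detection, and a file without a BOM stays ambiguous.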

Question 2:

Yes, but the Unicode functions are not always named the same on Linux as on Windows.
Use #define. Perhaps write your own TCHAR.H.
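
As an illustration of the #define approach, here is a small sketch that hides one naming difference behind a macro: case-insensitive wide-string comparison is `_wcsicmp` in the Microsoft C runtime and `wcscasecmp` on POSIX.1-2008 systems. The macro `x_wcsicmp` and the helper `wide_equal_nocase` are hypothetical names chosen for this example.

```c
/* Request POSIX.1-2008 declarations (wcscasecmp) on non-Windows systems. */
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif

#include <wchar.h>

/* Hypothetical portability macro: one name, two platform spellings. */
#ifdef _WIN32
#define x_wcsicmp _wcsicmp      /* Microsoft C runtime */
#else
#define x_wcsicmp wcscasecmp    /* POSIX.1-2008 */
#endif

/* Return 1 if the two wide strings are equal ignoring case, 0 otherwise. */
int wide_equal_nocase(const wchar_t *a, const wchar_t *b)
{
    return x_wcsicmp(a, b) == 0;
}
```

A home-grown TCHAR.H would collect many such macros in one header, so the rest of the program only ever uses the portable names.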

Question 3:

    #include <locale.h>
    setlocale(LC_ALL, "en_US.UTF-8");

See man 3 setlocale.
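
Rather than hard-coding a locale name, a program can also pass an empty string to setlocale, which selects the locale from the user's environment (for example the LANG variable on Linux). The sketch below takes that approach and falls back to the "C" locale if the environment names a locale that is not installed. The helper names `init_locale` and `print_wide_line` are hypothetical.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: adopt the locale from the environment, falling back
   to the always-available "C" locale. Returns the name of the locale now
   in effect. */
const char *init_locale(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, "C");
    return name;
}

/* Write a wide string followed by a newline to the given stream.
   Returns the number of wide characters written, or a negative value
   on failure (for example if a character cannot be encoded in the
   current locale). */
int print_wide_line(FILE *out, const wchar_t *text)
{
    return fwprintf(out, L"%ls\n", text);
}
```

Typical use would be to call `init_locale()` once at startup and then `print_wide_line(stdout, L"...")`. Note that a stream should not mix wide and narrow output functions, so a program that prints to stdout this way should stick to the wide functions for that stream.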

Question 4:
Just use fwprintf.
The other one is not standard.
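
One more detail in the question's snippet is that both files are opened in read mode ("rt") and then written to. A minimal sketch of writing wide text with fwprintf, using standard write mode and no Microsoft-specific "ccs" flag, could look like this. The helper name `write_wide_text` is hypothetical, and the sketch assumes the program has already selected a suitable locale with setlocale if the text contains non-ASCII characters.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: write a wide string to a file.
   Returns 0 on success, -1 on failure. */
int write_wide_text(const char *filename, const wchar_t *text)
{
    int ok;
    FILE *f = fopen(filename, "w");   /* "w": the stream must be writable */

    if (f == NULL)
        return -1;
    /* fwprintf converts each wide character to the multibyte encoding of
       the current locale; it fails if a character cannot be represented. */
    ok = fwprintf(f, L"%ls", text) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With a UTF-8 locale in effect, a call such as `write_wide_text("inc.txt", L"Текст")` would be expected to produce a UTF-8 file on Linux; in the default "C" locale only characters representable there can be written.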

You can use the wxWidgets toolkit.
It uses Unicode and provides classes with equivalent implementations on Windows, Linux, Unix and Mac.

The better question for you is how to convert ASCII to Unicode and vice versa. That goes like this:

    #include <cstdlib>
    #include <string>
    #include <vector>

    // Wide string -> multibyte string in the current locale's encoding.
    std::string Unicode2ASCII( const std::wstring& wstrStringToConvert )
    {
        size_t sze_StringLength = wstrStringToConvert.length() ;
        if(0 == sze_StringLength)
            return "" ;
        // One wide character may need up to MB_CUR_MAX bytes.
        std::vector<char> chrarry_Buffer( sze_StringLength * MB_CUR_MAX + 1 ) ;
        size_t sze_Written = wcstombs( &chrarry_Buffer[0], wstrStringToConvert.c_str(), chrarry_Buffer.size() ) ;
        if(static_cast<size_t>(-1) == sze_Written)
            return "" ; // a character cannot be represented in this locale
        return std::string( &chrarry_Buffer[0], sze_Written ) ;
    }

    // Multibyte string in the current locale's encoding -> wide string.
    std::wstring ASCII2Unicode( const std::string& strStringToConvert )
    {
        size_t sze_StringLength = strStringToConvert.length() ;
        if(0 == sze_StringLength)
            return L"" ;
        // The result never has more wide characters than the input has bytes.
        std::vector<wchar_t> wchrarry_Buffer( sze_StringLength + 1 ) ;
        size_t sze_Written = mbstowcs( &wchrarry_Buffer[0], strStringToConvert.c_str(), wchrarry_Buffer.size() ) ;
        if(static_cast<size_t>(-1) == sze_Written)
            return L"" ; // invalid multibyte sequence for this locale
        return std::wstring( &wchrarry_Buffer[0], sze_Written ) ;
    }

Edit: Here is an overview of the Unicode functions available on Linux (wchar.h):

    __BEGIN_NAMESPACE_STD
    /* Copy SRC to DEST. */
    extern wchar_t *wcscpy (wchar_t *__restrict __dest, __const wchar_t *__restrict __src) __THROW;
    /* Copy no more than N wide-characters of SRC to DEST. */
    extern wchar_t *wcsncpy (wchar_t *__restrict __dest, __const wchar_t *__restrict __src, size_t __n) __THROW;
    /* Append SRC onto DEST. */
    extern wchar_t *wcscat (wchar_t *__restrict __dest, __const wchar_t *__restrict __src) __THROW;
    /* Append no more than N wide-characters of SRC onto DEST. */
    extern wchar_t *wcsncat (wchar_t *__restrict __dest, __const wchar_t *__restrict __src, size_t __n) __THROW;
    /* Compare S1 and S2. */
    extern int wcscmp (__const wchar_t *__s1, __const wchar_t *__s2) __THROW __attribute_pure__;
    /* Compare N wide-characters of S1 and S2. */
    extern int wcsncmp (__const wchar_t *__s1, __const wchar_t *__s2, size_t __n) __THROW __attribute_pure__;
    __END_NAMESPACE_STD

    #ifdef __USE_XOPEN2K8
    /* Compare S1 and S2, ignoring case. */
    extern int wcscasecmp (__const wchar_t *__s1, __const wchar_t *__s2) __THROW;
    /* Compare no more than N chars of S1 and S2, ignoring case. */
    extern int wcsncasecmp (__const wchar_t *__s1, __const wchar_t *__s2, size_t __n) __THROW;
    /* Similar to the two functions above but take the information from
       the provided locale and not the global locale. */
    # include <xlocale.h>
    extern int wcscasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2, __locale_t __loc) __THROW;
    extern int wcsncasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2, size_t __n, __locale_t __loc) __THROW;
    #endif

    /* Special versions of the functions above which take the locale to
       use as an additional parameter. */
    extern long int wcstol_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, int __base, __locale_t __loc) __THROW;
    extern unsigned long int wcstoul_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, int __base, __locale_t __loc) __THROW;
    __extension__ extern long long int wcstoll_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, int __base, __locale_t __loc) __THROW;
    __extension__ extern unsigned long long int wcstoull_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, int __base, __locale_t __loc) __THROW;
    extern double wcstod_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, __locale_t __loc) __THROW;
    extern float wcstof_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, __locale_t __loc) __THROW;
    extern long double wcstold_l (__const wchar_t *__restrict __nptr, wchar_t **__restrict __endptr, __locale_t __loc) __THROW;
    /* Copy SRC to DEST, returning the address of the terminating L'\0' in DEST. */
    extern wchar_t *wcpcpy (wchar_t *__restrict __dest, __const wchar_t *__restrict __src) __THROW;
    /* Copy no more than N characters of SRC to DEST, returning the address
       of the last character written into DEST. */
    extern wchar_t *wcpncpy (wchar_t *__restrict __dest, __const wchar_t *__restrict __src, size_t __n) __THROW;
    #endif /* use GNU */

    /* Wide character I/O functions. */
    #ifdef __USE_XOPEN2K8
    /* Like OPEN_MEMSTREAM, but the stream is wide oriented and produces
       a wide character string. */
    extern __FILE *open_wmemstream (wchar_t **__bufloc, size_t *__sizeloc) __THROW;
    #endif

    #if defined __USE_ISOC95 || defined __USE_UNIX98
    __BEGIN_NAMESPACE_STD
    /* Select orientation for stream. */
    extern int fwide (__FILE *__fp, int __mode) __THROW;
    /* Write formatted output to STREAM.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int fwprintf (__FILE *__restrict __stream, __const wchar_t *__restrict __format, ...)
         /* __attribute__ ((__format__ (__wprintf__, 2, 3))) */;
    /* Write formatted output to stdout.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int wprintf (__const wchar_t *__restrict __format, ...)
         /* __attribute__ ((__format__ (__wprintf__, 1, 2))) */;
    /* Write formatted output of at most N characters to S. */
    extern int swprintf (wchar_t *__restrict __s, size_t __n, __const wchar_t *__restrict __format, ...)
         __THROW /* __attribute__ ((__format__ (__wprintf__, 3, 4))) */;
    /* Write formatted output to S from argument list ARG.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int vfwprintf (__FILE *__restrict __s, __const wchar_t *__restrict __format, __gnuc_va_list __arg)
         /* __attribute__ ((__format__ (__wprintf__, 2, 0))) */;
    /* Write formatted output to stdout from argument list ARG.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int vwprintf (__const wchar_t *__restrict __format, __gnuc_va_list __arg)
         /* __attribute__ ((__format__ (__wprintf__, 1, 0))) */;
    /* Write formatted output of at most N character to S from argument list ARG. */
    extern int vswprintf (wchar_t *__restrict __s, size_t __n, __const wchar_t *__restrict __format, __gnuc_va_list __arg)
         __THROW /* __attribute__ ((__format__ (__wprintf__, 3, 0))) */;
    /* Read formatted input from STREAM.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int fwscanf (__FILE *__restrict __stream, __const wchar_t *__restrict __format, ...)
         /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
    /* Read formatted input from stdin.
       This function is a possible cancellation point and therefore not marked with __THROW. */
    extern int wscanf (__const wchar_t *__restrict __format, ...)
         /* __attribute__ ((__format__ (__wscanf__, 1, 2))) */;
    /* Read formatted input from S. */
    extern int swscanf (__const wchar_t *__restrict __s, __const wchar_t *__restrict __format, ...)
         __THROW /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;

    # if defined __USE_ISOC99 && !defined __USE_GNU \
         && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
         && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
    # ifdef __REDIRECT
    /* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[
       GNU extension which conflicts with valid %a followed by letter s, S or [. */
    extern int __REDIRECT (fwscanf, (__FILE *__restrict __stream, __const wchar_t *__restrict __format, ...), __isoc99_fwscanf)
         /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
    extern int __REDIRECT (wscanf, (__const wchar_t *__restrict __format, ...), __isoc99_wscanf)
         /* __attribute__ ((__format__ (__wscanf__, 1, 2))) */;
    extern int __REDIRECT_NTH (swscanf, (__const wchar_t *__restrict __s, __const wchar_t *__restrict __format, ...), __isoc99_swscanf)
         /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
    # else
    extern int __isoc99_fwscanf (__FILE *__restrict __stream, __const wchar_t *__restrict __format, ...);
    extern int __isoc99_wscanf (__const wchar_t *__restrict __format, ...);
    extern int __isoc99_swscanf (__const wchar_t *__restrict __s, __const wchar_t *__restrict __format, ...)