How to parse a text file with C#

As already mentioned, I would recommend using regular expressions (in System.Text.RegularExpressions) for this kind of work.

Combined with a solid tool like RegexBuddy, you can handle complex text-record parsing and still get results quickly. The tool makes it really easy.

Hope that helps.

By text formatting, I meant something more complicated.

At first I started manually adding the 5000 lines of the text file I am asking about to my project.

The text file has 5000 lines of varying length. For example:

1 1 ITEM_ETC_GOLD_01 골드(소) xxx xxx xxx_TT_DESC 0 0 3 3 5 0 180000 3 0 1 0 0 255 1 1 0 0 0 0 0 0 0 0 0 0 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item/etc/drop_ch_money_small.bsr xxx xxx xxx 0 2 0 0 1 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1 표현할 골드의 양(param1이상) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0 1 4 ITEM_ETC_HP_POTION_01 HP 회복 약초 xxx SN_ITEM_ETC_HP_POTION_01 SN_ITEM_ETC_HP_POTION_01_TT_DESC 0 0 3 3 1 1 180000 3 0 1 1 1 255 3 1 0 0 1 0 60 0 0 0 1 21 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item/etc/drop_ch_bag.bsr item/etc/hp_potion_01.ddj xxx xxx 50 2 0 0 1 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 120 HP회복양 0 HP회복양(%) 0 MP회복양 0 MP회복양(%) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0 1 5 ITEM_ETC_HP_POTION_02 HP 회복약 (소) xxx SN_ITEM_ETC_HP_POTION_02 SN_ITEM_ETC_HP_POTION_02_TT_DESC 0 0 3 3 1 1 180000 3 0 1 1 1 255 3 1 0 0 1 0 110 0 0 0 2 39 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item/etc/drop_ch_bag.bsr item/etc/hp_potion_02.ddj xxx xxx 50 2 0 0 2 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 220 HP회복양 0 HP회복양(%) 0 MP회복양 0 MP회복양(%) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0

The gap between the first character (1) and the second character (1/4/5) is not a space, it is a tab. There are no spaces in that text file.

What I want:

I want to get the second integer (in the three lines I posted above, the second integers are 1, 4 and 5) and the string in the middle of each line that gives the path (it starts with "item/" and ends with the file extension ".ddj").

My problem:

When I search for "C# text formatting", all I get is how to open a text file and how to write a text file in C#. I don't know how to search for text inside a text file. I also can't search for the first integer, because if it is a small integer like in the three lines I posted above, I won't be able to find the correct location: "1", for example, could appear at several other positions in the line.

My question:

Would it be best to write a program that removes everything except what I need?

The other option I have in mind is to search directly inside that file, but as I mentioned above, I might get the wrong location for the second integer if it is too small.

Please suggest something; I can't format all of this by hand.

OK, here is what we do: open the file, read it line by line, and split each line on tabs. Then we take the second integer and loop through the rest of the fields to find the path.

    using System.IO;

    using (StreamReader reader = File.OpenText("filename.txt"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] items = line.Split('\t');
            int myInteger = int.Parse(items[1]); // Here's your integer.

            // Now let's find the path.
            string path = null;
            foreach (string item in items)
            {
                if (item.StartsWith("item/") && item.EndsWith(".ddj"))
                    path = item;
            }

            // At this point, `myInteger` and `path` contain the values we want
            // for the current line. We can then store those values or print them,
            // or anything else we like. `path` stays null if the line has no
            // .ddj field (as in the first sample line).
        }
    }

Another solution, this time using regular expressions:

    using System.IO;
    using System.Text.RegularExpressions;

    Regex parts = new Regex(@"^\d+\t(\d+)\t.+?\t(item/[^\t]+\.ddj)");

    using (StreamReader reader = File.OpenText("filename.txt"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Match match = parts.Match(line);
            if (match.Success)
            {
                int number = int.Parse(match.Groups[1].Value);
                string path = match.Groups[2].Value;

                // At this point, `number` and `path` contain the values we want
                // for the current line. We can then store those values or print them,
                // or anything else we like.
            }
        }
    }

That expression is a bit complex, so here it is broken down:

    ^                     Start of string.
    \d+                   "\d" means "digit" (0-9). The "+" means "one or more."
                          So this means "one or more digits."
    \t                    This matches a tab.
    (\d+)                 This also matches one or more digits. This time, though,
                          we capture it using brackets. This means we can access
                          it through the Groups property.
    \t                    Another tab.
    .+?                   "." means "anything," so "one or more of anything." In
                          addition, it's lazy. This is to stop it grabbing
                          everything in sight: it'll only grab as much as it
                          needs to for the regex to work.
    \t                    Another tab.
    (item/[^\t]+\.ddj)    Here's the meat. This matches:
                          "item/<one or more of anything but a tab>.ddj"
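To see the breakdown in action, here is a small sketch that runs the same pattern against one shortened line. The sample line is made up (only a handful of the real columns are kept), and `RegexDemo` and `TryExtract` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Same pattern as in the breakdown, written for the forward-slash
    // paths that appear in the sample data.
    private static readonly Regex Parts =
        new Regex(@"^\d+\t(\d+)\t.+?\t(item/[^\t]+\.ddj)");

    // Returns true and fills the outputs when the line has the expected shape.
    public static bool TryExtract(string line, out int number, out string path)
    {
        number = 0;
        path = null;

        Match match = Parts.Match(line);
        if (!match.Success)
            return false;

        number = int.Parse(match.Groups[1].Value); // second integer
        path = match.Groups[2].Value;              // item/....ddj field
        return true;
    }

    public static void Main()
    {
        // Shortened, made-up line: tabs separate the fields.
        string line = "1\t4\tITEM_ETC_HP_POTION_01\txxx\t" +
                      "item/etc/drop_ch_bag.bsr\titem/etc/hp_potion_01.ddj\txxx";

        if (TryExtract(line, out int number, out string path))
            Console.WriteLine(number + " " + path);
    }
}
```

Note that the `.bsr` field is skipped: `[^\t]+` cannot cross a tab, so the lazy `.+?` keeps extending until the field that really ends in `.ddj`. A line with no `.ddj` field at all simply does not match.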

You could do something like:

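    // OpenYourFile() is a placeholder for however you open the file,
    // for example File.OpenText("filename.txt").
    using (TextReader rdr = OpenYourFile())
    {
        string line;
        while ((line = rdr.ReadLine()) != null)
        {
            string[] fields = line.Split('\t'); // THIS LINE DOES THE MAGIC
            int theInt = Convert.ToInt32(fields[1]);
        }
    }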

The reason you found no relevant results when searching for 'formatting' is that the operation you are performing is called 'parsing'.

Try regular expressions. You can find a certain pattern in your text and replace it with whatever you want. I can't give you the exact code right now, but you can test your expressions using this tool:

http://www.radsoftware.com.au/regexdesigner/

You can open the file and use StreamReader.ReadLine to read it line by line. Then you can use String.Split to break each line into pieces (use \t as the delimiter) and extract the second number.

Since the number of items varies from line to line, you will need to search the fields for the pattern 'item/*.ddj'.
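String has no built-in wildcard match for something like 'item/*.ddj', so one way to express it is a prefix check plus a suffix check on each field. The sketch below also uses int.TryParse so that a malformed line is skipped instead of throwing. `ItemLine` and `TryParse` are hypothetical names, not part of any library.

```csharp
using System;

public static class ItemLine
{
    // Splits one tab-delimited line and pulls out the second integer and the
    // first field shaped like item/*.ddj. Returns false when the second
    // field is missing or is not an integer.
    public static bool TryParse(string line, out int id, out string path)
    {
        id = 0;
        path = null;

        string[] fields = line.Split('\t');
        if (fields.Length < 2 || !int.TryParse(fields[1], out id))
            return false;

        // Array.Find returns null when no field satisfies the predicate.
        path = Array.Find(fields, f =>
            f.StartsWith("item/", StringComparison.Ordinal) &&
            f.EndsWith(".ddj", StringComparison.Ordinal));

        return true; // path stays null when the line has no .ddj field
    }

    public static void Main()
    {
        // Shortened, made-up line for illustration.
        string line = "1\t5\tITEM_ETC_HP_POTION_02\titem/etc/hp_potion_02.ddj";
        if (TryParse(line, out int id, out string path))
            Console.WriteLine(id + " " + (path ?? "(no path)"));
    }
}
```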

To remove an item, you could (for example) keep the entire contents of the file in memory and write a new file when the user clicks 'Save'.
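A minimal sketch of that idea, assuming an item is identified by its second tab-separated field. The file names "items.txt" and "items_new.txt" and the names `ItemFile` and `RemoveItem` are placeholders chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ItemFile
{
    // Returns the lines whose second tab-separated field is NOT idToRemove.
    public static List<string> RemoveItem(IEnumerable<string> lines, string idToRemove)
    {
        var kept = new List<string>();
        foreach (string line in lines)
        {
            string[] fields = line.Split('\t');
            bool isTarget = fields.Length > 1 && fields[1] == idToRemove;
            if (!isTarget)
                kept.Add(line);
        }
        return kept;
    }

    public static void Main()
    {
        // Placeholder file names; nothing happens if the input is absent.
        if (!File.Exists("items.txt"))
            return;

        string[] all = File.ReadAllLines("items.txt");  // whole file in memory
        List<string> kept = RemoveItem(all, "4");        // drop the item with id 4
        File.WriteAllLines("items_new.txt", kept);       // the "Save" step
    }
}
```

Writing to a new file rather than overwriting the original keeps the source data intact if something goes wrong midway.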

One approach I have found really useful in situations like this is to go old school and use the Jet OLEDB provider, together with a schema.ini file, to read large tab-delimited files using ADO.NET. Obviously, this method is only useful if you know the format of the file to be imported.

    using System.Data;
    using System.Data.OleDb;
    using System.IO;

    public void ImportCsvFile(string filename)
    {
        FileInfo file = new FileInfo(filename);

        // The data source is the folder; the file name acts as the table name.
        using (OleDbConnection con = new OleDbConnection(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + file.DirectoryName +
            "\"; Extended Properties='text;HDR=Yes;FMT=TabDelimited';"))
        {
            using (OleDbCommand cmd = new OleDbCommand(
                string.Format("SELECT * FROM [{0}]", file.Name), con))
            {
                con.Open();

                // Using a DataReader to process the data
                using (OleDbDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // Process the current reader entry...
                    }
                }

                // Using a DataTable to process the data
                using (OleDbDataAdapter adp = new OleDbDataAdapter(cmd))
                {
                    DataTable tbl = new DataTable("MyTable");
                    adp.Fill(tbl);

                    foreach (DataRow row in tbl.Rows)
                    {
                        // Process the current row...
                    }
                }
            }
        }
    }
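For reference, a minimal schema.ini placed in the same folder as the data file might look like the fragment below. This is only a sketch: the section name `filename.txt` is a placeholder for the actual file name, and `ColNameHeader=False` is an assumption based on the sample data in the question, which has no header row.

```ini
[filename.txt]
Format=TabDelimited
ColNameHeader=False
```

A real file with Korean text would probably also need an explicit character set entry, which is left out here because the file's encoding is not stated.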

Once you have the data in a nice format like a DataTable, filtering out the data you need becomes pretty trivial.
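As a rough illustration of that filtering step, the sketch below builds a small DataTable by hand and selects the rows whose path ends in ".ddj". The column names "Id" and "Path" are hypothetical; a table filled from the text driver will have whatever column names the file header or schema.ini defines.

```csharp
using System;
using System.Data;

public static class TableFilter
{
    // Returns the rows whose (hypothetical) "Path" column ends with ".ddj".
    // DataTable filter expressions allow a wildcard at the start of a LIKE pattern.
    public static DataRow[] RowsWithDdjPath(DataTable table)
    {
        return table.Select("Path LIKE '%.ddj'");
    }

    public static void Main()
    {
        // Hand-built stand-in for a table filled by the OleDbDataAdapter.
        var table = new DataTable("MyTable");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Path", typeof(string));
        table.Rows.Add(1, "item/etc/drop_ch_money_small.bsr");
        table.Rows.Add(4, "item/etc/hp_potion_01.ddj");
        table.Rows.Add(5, "item/etc/hp_potion_02.ddj");

        foreach (DataRow row in RowsWithDdjPath(table))
            Console.WriteLine(row["Id"] + " " + row["Path"]);
    }
}
```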