Python: read a file from and to specific lines of text


This should get you started:

    started = False
    collected_lines = []
    with open(path, "r") as fp:  # path: the file to read, defined elsewhere
        for i, line in enumerate(fp):
            if line.rstrip() == "Start":
                started = True
                print("started at line", i)  # counts from zero!
                continue
            if started and line.rstrip() == "End":
                print("end at line", i)
                break
            # process line: keep only the lines inside the block
            if started:
                collected_lines.append(line.rstrip())

enumerate takes any iterable and yields (index, item) pairs as it goes through it. E.g.

    print(list(enumerate("a b c".split())))

prints

    [(0, 'a'), (1, 'b'), (2, 'c')]
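Since the question is about line numbers, it may help that enumerate also accepts a start argument, so the index can count from 1 the way most editors do. A minimal sketch, where the helper name find_marker_lines and the sample lines are made up for illustration:

```python
def find_marker_lines(lines, start_marker="Start", end_marker="End"):
    """Return 1-based line numbers of the first start marker and of the
    first end marker after it (None for whichever is missing)."""
    start_no = end_no = None
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if start_no is None:
            if text == start_marker:
                start_no = number
        elif text == end_marker:
            end_no = number
            break
    return start_no, end_no


# Hypothetical sample data standing in for the lines of a real file.
sample = ["header\n", "Start\n", "a b c\n", "End\n", "footer\n"]
print(find_marker_lines(sample))  # (2, 4)
```

An open file object can be passed in place of the list, since it is also an iterable of lines.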

UPDATE:

The poster asked to use a regular expression to match lines such as "===" and "======":

    import re

    print(re.match("^=+$", "===") is not None)     # True
    print(re.match("^=+$", "======") is not None)  # True
    print(re.match("^=+$", "=") is not None)       # True
    print(re.match("^=+$", "=abc") is not None)    # False
    print(re.match("^=+$", "abc=") is not None)    # False
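To tie the pattern back to line positions, here is one possible sketch that reports which lines consist only of "=" characters. The function name separator_line_numbers and the sample lines are assumptions for illustration; re.fullmatch is used instead of the ^...$ anchors, which should be equivalent once the trailing newline has been stripped.

```python
import re

# One or more "=" and nothing else; fullmatch requires the whole string to match.
SEPARATOR = re.compile(r"=+")


def separator_line_numbers(lines):
    """Return the zero-based indexes of lines made up only of '=' characters."""
    return [
        index
        for index, line in enumerate(lines)
        if SEPARATOR.fullmatch(line.rstrip())
    ]


# Hypothetical sample data standing in for the lines of a real file.
sample = ["title\n", "===\n", "body\n", "======\n", "=abc\n"]
print(separator_line_numbers(sample))  # [1, 3]
```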

I am not talking about specific line numbers, because I am reading several files that share the same format but vary in length.
Say I have this text file:

    Something here...
    ... ... ...
    Start                      #I want this block of text
    a b c d e f g
    h i j k l m n
    End                        #until this line of the file
    something here...
    ... ... ...

I hope you see what I mean. I was thinking of iterating through the file, searching with a regular expression to find the line numbers of "Start" and "End", and then using linecache to read from the Start line to the End line. But how do I get the line number? What function can I use?

You can use a regular expression quite easily. You can make it more robust as needed; below is a simple example.

    >>> import re
    >>> START = "some"
    >>> END = "Hello"
    >>> test = "this is some\nsample text\nthat has the\nwords Hello World\n"
    >>> m = re.compile(r'%s.*?%s' % (START, END), re.S)
    >>> m.search(test).group(0)
    'some\nsample text\nthat has the\nwords Hello'
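One way to make the regular expression more robust is to escape the markers and require each of them to sit on a line of its own. The sketch below assumes exactly that (markers alone on their line, optionally followed by spaces or tabs); the function name extract_block and the sample text are invented for the example.

```python
import re


def extract_block(text, start="Start", end="End"):
    """Return the text between a line equal to `start` and the next line
    equal to `end`, or None if no such pair exists."""
    pattern = re.compile(
        # re.M lets ^ and $ match at line boundaries; re.S lets . span newlines.
        r"^{}[ \t]*\n(.*?)^{}[ \t]*$".format(re.escape(start), re.escape(end)),
        re.S | re.M,
    )
    match = pattern.search(text)
    return match.group(1) if match else None


# Hypothetical sample text standing in for the contents of a real file.
text = "intro\nStart\na b c\nd e f\nEnd\noutro\n"
print(repr(extract_block(text)))  # 'a b c\nd e f\n'
```

re.escape keeps markers that contain characters such as "." or "*" from being read as regex syntax, and the line anchors stop words like "Restart" or "Ending" from counting as markers.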

Here is something that will work:

    block = ""
    found = False
    with open("test.txt") as data_file:
        for line in data_file:
            if found:
                block += line
                if line.strip() == "End":
                    break
            elif line.strip() == "Start":
                found = True
                # newline added so the next line does not run into the marker
                block = "Start\n"

If you simply want the block of text between Start and End, you can do something simple like:

    with open('test.txt') as input_data:
        # Skips text before the beginning of the interesting block:
        for line in input_data:
            if line.strip() == 'Start':  # Or whatever test is needed
                break
        # Reads text until the end of the block:
        for line in input_data:  # This keeps reading the file
            if line.strip() == 'End':
                break
            print(line, end='')  # Line is extracted (or block_of_lines.append(line), etc.)

In fact, you do not need to manipulate line numbers to read the data between the Start and End markers.

The logic ("read until ...") is repeated in both blocks, but it is quite clear and efficient (other methods generally involve checking some state [before block / inside block / end of block reached] on every line, which incurs a time penalty).
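If the same two-loop logic is needed for several files, it could be wrapped in a small generator. This is only a sketch: the name lines_between is hypothetical, and io.StringIO stands in for a real file here so the example runs on its own.

```python
import io


def lines_between(lines, start="Start", end="End"):
    """Yield the lines found after the `start` marker line and before the
    `end` marker line, using two consecutive loops over one iterator."""
    iterator = iter(lines)  # both loops must share the same iterator
    # First loop: skip everything up to and including the start marker.
    for line in iterator:
        if line.strip() == start:
            break
    # Second loop: continues where the first one stopped.
    for line in iterator:
        if line.strip() == end:
            return
        yield line


# Hypothetical in-memory file used instead of open('test.txt').
sample = io.StringIO("junk\nStart\na b c\nd e f\nEnd\nmore junk\n")
print(list(lines_between(sample)))  # ['a b c\n', 'd e f\n']
```

Calling iter() once up front is what lets a plain list be passed in as well: without it, the second loop would restart from the first element instead of continuing after the start marker.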