threading - python threads vs processes

Python multi-thread multi-interpreter C API

Sub interpreters in Python are not well documented or even well supported. The following is to the best of my understanding. It seems to work well in practice.

There are two important concepts to understand when dealing with threads and sub interpreters in Python. First, the Python interpreter is not really multi-threaded. It has a Global Interpreter Lock (GIL) that must be acquired to perform almost any Python operation (there are a few rare exceptions to this rule).

Second, every combination of thread and sub interpreter must have its own thread state. The interpreter creates a thread state for every thread managed by it, but if you want to use Python from a thread not created by that interpreter, you need to create a new thread state.

First you need to create the sub interpreters:

Initialize Python

Py_Initialize();

Initialize Python thread support

Required if you plan to call Python from multiple threads. This call also acquires the GIL. (Since Python 3.7, Py_Initialize() performs this step itself, and PyEval_InitThreads() has been deprecated since 3.9.)

PyEval_InitThreads();

Save the current thread state

I could have used PyEval_SaveThread(), but one of its side effects is releasing the GIL, which then has to be reacquired.

PyThreadState* _main = PyThreadState_Get();

Create the sub interpreters

PyThreadState* ts1 = Py_NewInterpreter();
PyThreadState* ts2 = Py_NewInterpreter();

Restore the main interpreter thread state

PyThreadState_Swap(_main);

Now we have two thread states for the sub interpreters. These thread states are only valid in the thread where they were created. Every thread that wants to use one of the sub interpreters needs to create a thread state for that combination of thread and interpreter.

Using a sub interpreter from a new thread

Here is some example code for using a sub interpreter in a new thread that was not created by the sub interpreter. The new thread must acquire the GIL, create a new thread state for the thread and interpreter combination, and make it the current thread state. At the end the reverse must be done to clean up.

void do_stuff_in_thread(PyInterpreterState* interp)
{
    // acquire the GIL
    PyEval_AcquireLock();

    // create a new thread state for the sub interpreter interp
    PyThreadState* ts = PyThreadState_New(interp);

    // make ts the current thread state
    PyThreadState_Swap(ts);

    // at this point:
    // 1. You have the GIL
    // 2. You have the right thread state - a new thread state (this thread was not created by python) in the context of interp

    // PYTHON WORK HERE

    // release ts
    PyThreadState_Swap(NULL);

    // clear and delete ts
    PyThreadState_Clear(ts);
    PyThreadState_Delete(ts);

    // release the GIL
    PyEval_ReleaseLock();
}

Now each thread can do the following:

Thread 1

do_stuff_in_thread(ts1->interp);

Thread 2

do_stuff_in_thread(ts1->interp);

Thread 3

do_stuff_in_thread(ts2->interp);

Calling Py_Finalize() destroys all the sub interpreters. Alternatively, they can be destroyed manually. This must be done in the main thread, using the thread states created when creating the sub interpreters. At the end, make the main interpreter thread state the current state.

// make ts1 the current thread state
PyThreadState_Swap(ts1);
// destroy the interpreter
Py_EndInterpreter(ts1);

// make ts2 the current thread state
PyThreadState_Swap(ts2);
// destroy the interpreter
Py_EndInterpreter(ts2);

// restore the main interpreter thread state
PyThreadState_Swap(_main);

I hope this makes things a bit clearer.

I have a small complete example written in C++ on github.

I'm playing around with the C API for Python, but it is quite difficult to understand some corner cases. I could test it, but that seems error-prone and time-consuming. So I came here to see if somebody has already done this.

The question is, what is the correct way to manage multi-threading with sub interpreters, with no direct relation between threads and sub interpreters?

Py_Initialize();
PyEval_InitThreads(); /* <-- needed? */
_main = PyEval_SaveThread(); /* <-- acquire lock? does it matter? */
/* maybe do I not need it? */
i1 = Py_NewInterpreter();
i2 = Py_NewInterpreter();

Do I use a mutex? Is it required to use locks? The threaded function should be something like the following (the threads are non-Python, probably POSIX threads):

Thread 1

_save = PyThreadState_Swap(i1);
// python work
PyThreadState_Restore(_save);

Thread 2 (almost identical)

_save = PyThreadState_Swap(i1);
// python work
PyThreadState_Restore(_save);

Thread 3 (almost identical, but with the sub interpreter i2)

_save = PyThreadState_Swap(i2);
// python work
PyThreadState_Restore(_save);

Is this correct? Is this the general case for what I want to achieve? Are there race conditions?

Thanks!