funciones - Datos de paso de PostgreSQL de CTE recursivo a la función

pl postgresql (4)

Al volver a leer la pregunta del OP, se me ocurrió esta solución:

fuente

with recursive -- this denotes that at least one CTE is using recursion inputs as ( select 1 as d_source, 4 as d_target ) ,traverse_path(filter, source, target, path, bingo) as ( select bp.node_x, bp.node_x, bp.node_y, bp.node_x || ''.'' || bp.node_y, max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool from best_path bp cross join inputs i where bp.node_x = i.d_source -- filter union select tp.filter, bp.node_x, bp.node_y, path || ''.'' || node_y, max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool from best_path bp cross join inputs i join traverse_path tp on bp.node_x = tp.target and not tp.bingo ) select tp.* from traverse_path tp cross join inputs i -- if Postgresql has Oracle KEEP windowing, -- we don''t need to use WHERE clause here where not tp.bingo or ( tp.bingo and tp.target = d_target )

La cláusula WHERE anterior podría acortarse a:

WHERE not bingo or target = 4

Salida:

FILTER SOURCE TARGET PATH BINGO 1 1 2 1.2 0 1 1 3 1.3 0 1 2 4 1.2.4 1 1 3 6 1.3.6 0 1 3 7 1.3.7 0

Prueba en vivo: http://www.sqlfiddle.com/#!1/cdde6/55

Aquí está el resultado para Source = 2, Target = 5:

FILTER SOURCE TARGET PATH BINGO 2 2 5 2.5 1

Datos:

CREATE TABLE best_path (node_x int, node_y int, strength int); INSERT INTO best_path (node_x, node_y, strength) VALUES (1, 2, 1), (1, 3, 1), (2, 4, 1), (2, 5, 1), (3, 6, 1), (3, 7, 1), (4, 8, 1), (4, 9, 1), (5, 10, 1), (5, 11, 1);

Tengo el siguiente problema: estoy tratando de descubrir todas las rutas posibles desde el nodo fuente ( node_s ) al nodo objetivo ( node_t ).

El formato de la tabla original con los bordes del gráfico es simple: | node_x | node_y | fuerza | , donde "node_x" -> "node_y" es un borde directo con la fuerza del borde como "peso".

La idea es si en cualquier momento durante la exploración de las rutas descubrimos que un nodo entre sus hijos tiene el objetivo node_t , registramos esta ruta y dejamos de explorar rutas desde este nodo; de lo contrario, continuamos la exploración.

La solución simple fue usar el CTE recursivo de PostgreSQL que construye el cierre transitivo del gráfico:

WITH RECURSIVE transitive_closure (source, target, weight, path_string) AS ( --non-recurive term SELECT b.node_x, b.node_y, b.strength, b.node_x || ''.'' || b.node_y || ''.'' AS path_string FROM best_path b WHERE b.node_x = node_s --source_node UNION --recurive term SELECT tc.source, b.node_y, least(tc.weight,b.strength), tc.path_string || b.node_y || ''.'' AS path_string FROM best_path AS b JOIN transitive_closure AS tc ON b.node_x = tc.target WHERE tc.path_string NOT LIKE ''%'' || b.node_y || ''.%'' ) SELECT * FROM transitive_closure tc WHERE tc.target = node_t --target_node

El código anterior descubrirá todas las rutas posibles desde el nodo de origen node_s . Solo después de la construcción del cierre transitivo, puedo seleccionar las filas de rutas necesarias desde el nodo de origen al nodo de destino (ver la última instrucción SELECT).

Ejemplo:

La tabla best_path tiene los siguientes datos:

node_x | node_y | strength --------+--------+---------- 1 | 2 | 1 1 | 3 | 1 2 | 4 | 1 2 | 5 | 1 3 | 6 | 1 3 | 7 | 1 4 | 8 | 1 4 | 9 | 1 5 | 10 | 1 5 | 11 | 1

consulta:

encuentre las rutas desde el nodo de origen = 1, al nodo de destino = 4

resultado:

source | target | strength | path_string --------+--------+----------+------------ 1 | 2 | 1 | 1.2. 1 | 3 | 1 | 1.3. 1 | 4 | 1 | 1.2.4. 1 | 5 | 1 | 1.2.5. 1 | 6 | 1 | 1.3.6. 1 | 7 | 1 | 1.3.7. 1 | 8 | 1 | 1.2.4.8. 1 | 9 | 1 | 1.2.4.9. 1 | 10 | 1 | 1.2.5.10. 1 | 11 | 1 | 1.2.5.11.

Esto no es lo que necesito. Dado que ya hay un borde directo desde el nodo 2 al nodo 4 (objetivo), no necesito las rutas 1.2.5., 1.2.4.8., 1.2.4.9., 1.2.5.10., 1.2.5.11., La exploración del camino para el nodo 2 debe detenerse en el punto en que se descubrió el camino de 2 a 4.

En resumen, no quiero descubrir las rutas del nodo si ya tiene un borde directo al nodo objetivo. Esto significa que en un término recursivo de CTE, me gustaría tener alguna condición que diga lo siguiente, el pseudocódigo sigue:

WITH RECURSIVE transitive_closure (source, target, weight, path_string) AS ( --non-recurive term (as before) SELECT b.node_x, b.node_y, b.strength, b.node_x || ''.'' || b.node_y || ''.'' AS path_string FROM best_path b WHERE b.node_x = node_s --source_node UNION --recurive term SELECT tc.source, b.node_y, least(tc.weight,b.strength), tc.path_string || b.node_y || ''.'' AS path_string FROM best_path AS b JOIN transitive_closure AS tc ON b.node_x = tc.target WHERE tc.path_string NOT LIKE ''%'' || b.node_y || ''.%'' AND b.node_y = node_t --will select only rows with direct edge to target UNION (second union is not allowed in CTE) SELECT those rows which do not have direct edge to target AND which parents did not contribute to constructing the query above. i.e. b.node_x = tc.target where there is no direct edge between b.node_x to node_t )

Como resultado de una consulta para encontrar las rutas del nodo fuente = 1, al nodo objetivo = 4, me gustaría tener lo siguiente:

source | target | strength | path_string --------+--------+----------+------------ 1 | 2 | 1 | 1.2. 1 | 3 | 1 | 1.3. 1 | 4 | 1 | 1.2.4. 1 | 6 | 1 | 1.3.6. 1 | 7 | 1 | 1.3.7.

¡Gracias de antemano por tu ayuda!

Ya lo he intentado de muchas maneras: por ejemplo, las condiciones en las cláusulas FROM / WHERE, tratando de pasar el CTE a la función, pero sin éxito.

Cualquier sugerencia será apreciada.

Tengo mi propia función recursiva que logra lo que quiero, sin embargo, es muy lenta en una enorme cantidad de datos; y el CTE de PostgreSQL aparentemente está bien optimizado, por lo que me gustaría profundizar en él un poco más.

Solución optimizada, no más cláusula WHERE sobre el resultado final; a pesar de la solución específica de Postgresql, es decir, usamos DISTINCT ON para seleccionar una fila específica:

Teniendo en cuenta esta información:

Consulta, primera etapa, muestra detrás de escena (fuente = 1, objetivo = 11):

with recursive -- this denotes that at least one CTE is using recursion inputs as ( select 1 as d_source, 11 as d_target ) ,traverse_path(filter, source, target, path, bingo, distincter) as ( select bp.node_x, bp.node_x, bp.node_y, bp.node_x || ''.'' || bp.node_y ,max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool ,coalesce( nullif( max((bp.node_y = i.d_target)::int) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , 0) ,bp.node_y) from best_path bp cross join inputs i where bp.node_x = i.d_source -- filter union select tp.filter, bp.node_x, bp.node_y, path || ''.'' || node_y ,max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool ,coalesce( nullif( max((bp.node_y = i.d_target)::int) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , 0) ,bp.node_y) from best_path bp cross join inputs i join traverse_path tp on bp.node_x = tp.target and not tp.bingo ) select tp.* from traverse_path tp

Salida para source = 1, target = 11: http://www.sqlfiddle.com/#!1/db290/56

FILTER SOURCE TARGET PATH BINGO DISTINCTER 1 1 2 1.2 0 2 1 1 3 1.3 0 3 1 2 4 1.2.4 0 4 1 2 5 1.2.5 0 5 1 3 6 1.3.6 0 6 1 3 7 1.3.7 0 7 1 4 8 1.2.4.8 0 8 1 4 9 1.2.4.9 0 9 1 5 11 1.2.5.11 1 1 1 5 10 1.2.5.10 1 1 1 5 12 1.2.5.12 1 1

Como podemos ver, el objetivo 11 es la primera fila entre las fuentes de 5. ¿Cómo sucedió esto? En nuestra columna DISTINCTER, usamos ORDER BY en MAX, esta es una de las raras ocasiones en que un ORDER en ventanas MAX tiene sentido. Simplemente lo usamos para ordenar nuestro resultado. Intenté colocar el ORDER BY al final de la consulta, pero la base de datos se queja de que no puede usar ORDER en CTE. CTE prohíbe colocar una cláusula ORDER BY. Pero como todos sabemos, podemos influir en la clasificación en la cláusula OVER() , así es como se clasifican nuestros resultados. En el resultado anterior, entre la fuente de 5, el número 11 ordena antes de 10 y 12, ya que 11 es nuestro objetivo. Entonces, cuando hacemos un DISTINCT ON (función específica de Postgresql), Postgres recogerá esa primera línea, por lo tanto, recuperará el objetivo 11.

Esta es nuestra consulta final, optimizada, aunque específica de Postgresql:

with recursive -- this denotes that at least one CTE is using recursion inputs as ( select 1 as d_source, 11 as d_target ) ,traverse_path(filter, source, target, path, bingo) as ( select distinct on ( bp.node_x, coalesce( nullif( max((bp.node_y = i.d_target)::int) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , 0) ,bp.node_y) ) bp.node_x, bp.node_x, bp.node_y, bp.node_x || ''.'' || bp.node_y ,max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool from best_path bp cross join inputs i where bp.node_x = i.d_source -- filter union select distinct on ( bp.node_x, coalesce( nullif( max((bp.node_y = i.d_target)::int) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , 0) ,bp.node_y) ) tp.filter, bp.node_x, bp.node_y, path || ''.'' || node_y ,max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool from best_path bp cross join inputs i join traverse_path tp on bp.node_x = tp.target and not tp.bingo ) select tp.* from traverse_path tp

Salida para source = 1, target = 11 http://www.sqlfiddle.com/#!1/db290/55

FILTER SOURCE TARGET PATH BINGO 1 1 2 1.2 0 1 1 3 1.3 0 1 2 4 1.2.4 0 1 2 5 1.2.5 0 1 3 6 1.3.6 0 1 3 7 1.3.7 0 1 4 8 1.2.4.8 0 1 4 9 1.2.4.9 0 1 5 11 1.2.5.11 1

Salida para source = 1, target = 4: http://www.sqlfiddle.com/#!1/db290/53

FILTER SOURCE TARGET PATH BINGO 1 1 2 1.2 0 1 1 3 1.3 0 1 2 4 1.2.4 1 1 3 6 1.3.6 0 1 3 7 1.3.7 0

Salida para source = 2, target = 5: http://www.sqlfiddle.com/#!1/db290/54

FILTER SOURCE TARGET PATH BINGO 2 2 5 2.5 1

Otro enfoque, en lugar del enfoque BINGO. Utilice continue_search como la lógica para continuar el travesal. Y use CADA función agregada para determinar si necesitamos continuar atravesando el gráfico.

Así es como funciona: http://www.sqlfiddle.com/#!1/db290/84

Consulta final: http://www.sqlfiddle.com/#!1/db290/85

Es interesante notar que TODOS son muy parecidos a los ingleses:

Contraste esto:

,max((bp.node_y = i.d_target)::int) over(partition by bp.node_x) ::bool

Con usar CADA:

,every(bp.node_y <> i.d_target) over(partition by bp.node_x)

¿Cuál es más fácil de leer?

Aquí está el resultado que ilustra el principio de usar TODOS para facilitar DISTINCT ON:

Fuente = 1, Objetivo = 5. Tenga en cuenta que 5 se ordena primero antes que otros números que pertenecen a la misma fuente, esto luego será recogido por DISTINCT ON.

FILTER SOURCE TARGET PATH CONTINUE_SEARCH DISTINCTER 1 1 2 1.2 1 2 1 1 3 1.3 1 3 1 2 5 1.2.5 0 0 1 2 4 1.2.4 0 0 1 3 6 1.3.6 1 6 1 3 7 1.3.7 1 7

Aquí está la consulta que hace ese principio:

with recursive -- this denotes that at least one CTE is using recursion inputs as ( select 1 as d_source, 5 as d_target ) ,traverse_path(filter, source, target, path, continue_search, distincter) as ( select bp.node_x, bp.node_x, bp.node_y, concat(bp.node_x , ''.'' , bp.node_y ) ,every(bp.node_y <> i.d_target) over(partition by bp.node_x) ,coalesce( cast(nullif( every(bp.node_y <> i.d_target) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , true) as int) ,bp.node_y) from best_path bp cross join inputs i where bp.node_x = i.d_source -- filter union select tp.filter, bp.node_x, bp.node_y, concat(path , ''.'' , node_y) ,every(bp.node_y <> i.d_target) over(partition by bp.node_x) ,coalesce( cast(nullif( every(bp.node_y <> i.d_target) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , true) as int) ,bp.node_y) from best_path bp cross join inputs i join traverse_path tp on bp.node_x = tp.target and tp.continue_search ) select tp.* from traverse_path tp

Consulta final:

with recursive -- this denotes that at least one CTE is using recursion inputs as ( select 1 as d_source, 5 as d_target ) ,traverse_path(filter, source, target, path, continue_search) as ( select distinct on ( bp.node_x ,coalesce( cast(nullif( every(bp.node_y <> i.d_target) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , true) as int) ,bp.node_y) ) bp.node_x, bp.node_x, bp.node_y, concat(bp.node_x , ''.'' , bp.node_y ) ,every(bp.node_y <> i.d_target) over(partition by bp.node_x) from best_path bp cross join inputs i where bp.node_x = i.d_source -- filter union select distinct on ( bp.node_x ,coalesce( cast(nullif( every(bp.node_y <> i.d_target) over(partition by bp.node_x order by bp.node_x, bp.node_y = d_target desc, bp.node_y) , true) as int) ,bp.node_y) ) tp.filter, bp.node_x, bp.node_y, concat(path , ''.'' , node_y) ,every(bp.node_y <> i.d_target) over(partition by bp.node_x) from best_path bp cross join inputs i join traverse_path tp on bp.node_x = tp.target and tp.continue_search ) select tp.* from traverse_path tp

Salida:

FILTER SOURCE TARGET PATH CONTINUE_SEARCH 1 1 2 1.2 1 1 1 3 1.3 1 1 2 5 1.2.5 0 1 3 6 1.3.6 1 1 3 7 1.3.7 1

Puede hacer que la búsqueda de la ruta sea más eficiente si comienza desde abajo. Comience por los niños. Si comienza en el padre, implica atravesar a todos los niños; mientras que si buscó desde el niño, solo tiene un padre, por lo tanto, no perderá el tiempo buscando la ruta entre el origen y el destino.

with recursive find_parent(source, target, recentness) as ( select source, target, 0 from tbl where target = 9 union all select i.source, i.target, fp.recentness + 1 from tbl i join find_parent fp on i.target = fp.source ), construct_path(source, target, recentness, path) as ( select source, target, recentness, source || ''.'' || target from find_parent where recentness = (select max(recentness) from find_parent) union select dd.source, dd.target, dd.recentness, cp.path || ''.'' || dd.target from find_parent dd join construct_path cp on dd.recentness = cp.recentness - 1 ) select source, target, path from construct_path order by recentness desc

Salida:

SOURCE TARGET PATH 1 2 1.2 2 4 1.2.4 4 9 1.2.4.9

Prueba en vivo: http://www.sqlfiddle.com/#!1/13e6b/1

Esto está optimizado, corte la recursión en el padre si ya encuentra el específico (fuente).

Fuente = 2

Objetivo = 9

with recursive find_parent(source, target, recentness) as ( select source, target, 0 from tbl where target = 9 union all select i.source, i.target, fp.recentness + 1 from tbl i join find_parent fp on i.target = fp.source -- despite the name, this target is another one''s source and i.target <> 2 ) ,construct_path(source, target, recentness, path) as ( select source, target, recentness, source || ''.'' || target from find_parent where recentness = (select max(recentness) from find_parent) union select dd.source, dd.target, dd.recentness, cp.path || ''.'' || dd.target from find_parent dd join construct_path cp on dd.recentness = cp.recentness - 1 ) select source, target, path from construct_path order by recentness desc

Salida:

SOURCE TARGET PATH 2 4 2.4 4 9 2.4.9

Prueba en vivo: http://www.sqlfiddle.com/#!1/13e6b/16

Tabla temporal para la prueba:

CREATE TEMP TABLE best_path (node_x int, node_y int, strength int); INSERT INTO best_path VALUES (1, 2, 1) ,(1, 3, 1) ,(2, 4, 1) ,(2, 5, 1) ,(3, 6, 1) ,(3, 7, 1) ,(4, 8, 1) ,(4, 9, 1) ,(5, 10, 1) ,(5, 11, 1);

Consulta:
_{modificado para acomodar el comentario sobre 2 - 5}

WITH RECURSIVE t AS ( -- for readability and convenience: SELECT 1 AS node_s -- source_node , 4 AS node_t -- target_node ) , x AS ( SELECT node_x FROM t, best_path WHERE node_y = node_t ) , a AS ( SELECT b.node_x , b.node_y , b.strength AS weight , b.node_x || ''.'' || b.node_y || ''.'' AS path FROM t, best_path b LEFT JOIN x ON x.node_x = b.node_x WHERE b.node_x = node_s AND (x.node_x IS NULL -- no point with edge to target OR b.node_y = node_t) -- except with target_node UNION ALL SELECT a.node_x , b.node_y , least(a.weight, b.strength) , a.path || b.node_y || ''.'' AS path FROM t, a JOIN best_path AS b ON b.node_x = a.node_y LEFT JOIN x ON x.node_x = b.node_x WHERE a.node_y <> node_t -- arrived at target AND a.path !~~ (''%'' || b.node_y || ''.%'') -- not visited yet AND (x.node_x IS NULL -- no point with edge to target OR b.node_y = node_t) -- except with target_node ) TABLE a;

Produce el resultado solicitado exactamente.

Agregué la condición de salto a la consulta inicial, también, porque podemos alcanzar el objetivo en solo un paso.

Esto pasa a ser muy parecido a mi respuesta a una pregunta anterior similar . Se aplican todas las explicaciones y enlaces. El truco más importante aquí es incluir un CTE x adicional para recolectar los nodos que ...

tener (a) ventaja directa al objetivo.

Para uso repetido en la condición de corte del CTE recursivo. No es ampliamente conocido que puede agregar CTE adicionales (no recursivos) sobre un CTE recursivo independientemente de la palabra clave RECURSIVE .