r - Splitting and manipulating nested lists


I am trying to split a nested list by a grouping variable. Please consider the following structure:

    > str(L1)
    List of 2
     $ names:List of 2
      ..$ first: chr [1:5] "john" "lisa" "anna" "mike" ...
      ..$ last : chr [1:5] "johnsson" "larsson" "johnsson" "catell" ...
     $ stats:List of 2
      ..$ physical:List of 2
      .. ..$ age   : num [1:5] 14 22 53 23 31
      .. ..$ height: num [1:5] 165 176 179 182 191
      ..$ mental  :List of 1
      .. ..$ iq: num [1:5] 102 104 99 87 121

Now I need to produce two lists, both split on L1$names$last, resulting in L2 and L3 as shown below:

L2: result grouped by L1$names$last

    > str(L2)
    List of 3
     $ johnsson:List of 2
      ..$ names:List of 1
      .. ..$ first: chr [1:2] "john" "anna"
      ..$ stats:List of 2
      .. ..$ physical:List of 2
      .. .. ..$ age   : num [1:2] 14 53
      .. .. ..$ height: num [1:2] 165 179
      .. ..$ mental  :List of 1
      .. .. ..$ iq: num [1:2] 102 99
     $ larsson :List of 2
      ..$ names:List of 1
      .. ..$ first: chr [1:2] "lisa" "steven"
      ..$ stats:List of 2
      .. ..$ physical:List of 2
      .. .. ..$ age   : num [1:2] 22 31
      .. .. ..$ height: num [1:2] 176 191
      .. ..$ mental  :List of 1
      .. .. ..$ iq: num [1:2] 104 121
     $ catell  :List of 2
      ..$ names:List of 1
      .. ..$ first: chr "mike"
      ..$ stats:List of 2
      .. ..$ physical:List of 2
      .. .. ..$ age   : num 23
      .. .. ..$ height: num 182
      .. ..$ mental  :List of 1
      .. .. ..$ iq: num 87

L3: each group contains at most one occurrence of each value of L1$names$last

    List of 2
     $ 1:List of 2
      ..$ names:List of 2
      .. ..$ first: chr [1:3] "john" "lisa" "mike"
      .. ..$ last : chr [1:3] "johnsson" "larsson" "catell"
      ..$ stats:List of 2
      .. ..$ physical:List of 2
      .. .. ..$ age   : num [1:3] 14 22 23
      .. .. ..$ height: num [1:3] 165 176 182
      .. ..$ mental  :List of 1
      .. .. ..$ iq: num [1:3] 102 104 87
     $ 2:List of 2
      ..$ names:List of 2
      .. ..$ first: chr [1:2] "anna" "steven"
      .. ..$ last : chr [1:2] "johnsson" "larsson"
      ..$ stats:List of 2
      .. ..$ physical:List of 2
      .. .. ..$ age   : num [1:2] 53 31
      .. .. ..$ height: num [1:2] 179 191
      .. ..$ mental  :List of 1
      .. .. ..$ iq: num [1:2] 99 121

I tried applying this solution, but it does not seem to work for nested lists.

Reproducible code:

EDIT: note that the real data set is quite large and more deeply nested than the example provided.

To modify lists you will usually want to use recursion. For example, consider this function:

    foo <- function(x, idx) {
      if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
      }
      return(x[idx])
    }

It takes a list x and a vector of indices idx. It checks whether x is a list and, if so, applies itself to every sub-element of that list. Once x is no longer a list, it takes the elements selected by idx. Throughout the process, the structure of the original list remains intact.
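To make the recursion concrete, here is a minimal sketch that runs the function on a toy list. The names `toy`, `keep`, and `res` are made up for this illustration and are not part of the question's data.

```r
# Recursive subsetting: descend into lists, subset the vectors at the leaves.
foo <- function(x, idx) {
  if (is.list(x)) {
    return(lapply(x, foo, idx = idx))
  }
  x[idx]
}

# Hypothetical toy data (not from the question).
toy <- list(
  a = c(10, 20, 30),
  b = list(c = c("x", "y", "z"))
)
keep <- c(TRUE, FALSE, TRUE)

# Every leaf vector is subset with the same index; the nesting is unchanged.
res <- foo(toy, keep)
str(res)
```

The result has the same shape as `toy`, with each leaf reduced to its first and third elements.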

Here is a complete example. Note that this code assumes that all vectors in the list have 5 elements, that is, the same length as L1$names$last.

    L1 <- list(
      "names" = list(
        "first" = c("john", "lisa", "anna", "mike", "steven"),
        "last"  = c("johnsson", "larsson", "johnsson", "catell", "larsson")
      ),
      "stats" = list(
        "physical" = list("age" = c(14, 22, 53, 23, 31), "height" = c(165, 176, 179, 182, 191)),
        "mental"   = list("iq" = c(102, 104, 99, 87, 121))
      )
    )

    L2 <- list(
      "johnsson" = list(
        "names" = list("first" = c("john", "anna")),
        "stats" = list(
          "physical" = list("age" = c(14, 53), "height" = c(165, 179)),
          "mental"   = list("iq" = c(102, 99))
        )
      ),
      "larsson" = list(
        "names" = list("first" = c("lisa", "steven")),
        "stats" = list(
          "physical" = list("age" = c(22, 31), "height" = c(176, 191)),
          "mental"   = list("iq" = c(104, 121))
        )
      ),
      "catell" = list(
        "names" = list("first" = "mike"),
        "stats" = list(
          "physical" = list("age" = 23, "height" = 182),
          "mental"   = list("iq" = 87)
        )
      )
    )

    L3 <- list(
      "1" = list(
        "names" = list(
          "first" = c("john", "lisa", "mike"),
          "last"  = c("johnsson", "larsson", "catell")
        ),
        "stats" = list(
          "physical" = list("age" = c(14, 22, 23), "height" = c(165, 176, 182)),
          "mental"   = list("iq" = c(102, 104, 87))
        )
      ),
      "2" = list(
        "names" = list(
          "first" = c("anna", "steven"),
          "last"  = c("johnsson", "larsson")
        ),
        "stats" = list(
          "physical" = list("age" = c(53, 31), "height" = c(179, 191)),
          "mental"   = list("iq" = c(99, 121))
        )
      )
    )

    # make L2
    foo <- function(x, idx) {
      if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
      }
      return(x[idx])
    }

    levels <- unique(L1$names$last)
    L2_2 <- vector("list", length(levels))
    names(L2_2) <- levels
    for (i in seq_along(L2_2)) {
      idx <- L1$names$last == names(L2_2[i])
      L2_2[[i]] <- list(names = foo(L1$names[-2], idx),
                        stats = foo(L1$stats, idx))
    }
    identical(L2, L2_2)
    str(L2)
    str(L2_2)

    # make L3
    dups <- duplicated(L1$names$last)
    L3_2 <- vector("list", 2)
    names(L3_2) <- 1:2
    for (i in 1:2) {
      if (i == 1) idx <- !dups else idx <- dups
      L3_2[[i]] <- foo(L1, idx)
    }
    identical(L3, L3_2)
    str(L3)
    str(L3_2)
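The two loops above can also be written without hard-coding the number of groups. The following is only a sketch under the same assumption (every leaf vector has the same length as `L1$names$last`). The names `grp`, `by_name`, `occ`, and `by_occurrence` are introduced here for illustration. Unlike L2 in the question, `by_name` keeps the `last` element inside each group.

```r
foo <- function(x, idx) {
  if (is.list(x)) return(lapply(x, foo, idx = idx))
  x[idx]
}

L1 <- list(
  names = list(first = c("john", "lisa", "anna", "mike", "steven"),
               last  = c("johnsson", "larsson", "johnsson", "catell", "larsson")),
  stats = list(physical = list(age    = c(14, 22, 53, 23, 31),
                               height = c(165, 176, 179, 182, 191)),
               mental   = list(iq = c(102, 104, 99, 87, 121)))
)

grp <- L1$names$last

# L2-like result: one entry per last name.
# factor(levels = unique(grp)) keeps the groups in order of first appearance.
by_name <- lapply(
  split(seq_along(grp), factor(grp, levels = unique(grp))),
  function(i) foo(L1, i)
)

# L3-like result: number the occurrences within each last name (1, 2, ...),
# then group by that occurrence number, so no group repeats a last name.
occ <- ave(seq_along(grp), grp, FUN = seq_along)
by_occurrence <- lapply(split(seq_along(grp), occ), function(i) foo(L1, i))

str(by_name)
str(by_occurrence)
```

Because the occurrence counter comes from `ave()`, a last name that appears three or more times would simply produce a third, fourth, and so on group, instead of being limited to the two groups given by `duplicated()`.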

This is not a complete answer, but I hope it helps.

See if this works for L3:

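    library(data.table)  # assumption: shift() below is data.table::shift

    x = data.frame(L1, stringsAsFactors = FALSE)
    y = x[order(x$names.last), ]

    # number the occurrences of each last name (1 for the first, 2 for a repeat)
    y$seq = 1
    y$seq = ifelse(y$names.last == shift(y$names.last), shift(y$seq) + 1, 1)
    y$seq[1] = 1

    # presumably missing step: split the rows by occurrence number,
    # so that z[[1]] and z[[2]] exist before they are used below
    z = split(y, y$seq)

    # turn each data frame back into a nested list
    z = list(
      list(names = list(first = z[[1]]$names.first, last = z[[1]]$names.last),
           stats = list(physical = list(age = z[[1]]$stats.physical.age,
                                        height = z[[1]]$stats.physical.height),
                        mental = list(iq = z[[1]]$stats.iq))),
      list(names = list(first = z[[2]]$names.first, last = z[[2]]$names.last),
           stats = list(physical = list(age = z[[2]]$stats.physical.age,
                                        height = z[[2]]$stats.physical.height),
                        mental = list(iq = z[[2]]$stats.iq)))
    )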

The last part (z), where the result is turned back into a list, can be done with a loop. Assuming the same name does not recur too many times, the loop would not be too slow.
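A minimal sketch of such a loop follows. To keep it self-contained it starts from a small, hand-written data frame `y` with a `seq` column. The column names and the reduced set of columns are assumptions made for this illustration, not the exact output of `data.frame(L1)`.

```r
# Hypothetical flat data frame, already sorted by last name and numbered by
# occurrence (seq). Column names are chosen for this sketch.
y <- data.frame(
  names.first        = c("mike", "john", "anna", "lisa", "steven"),
  names.last         = c("catell", "johnsson", "johnsson", "larsson", "larsson"),
  stats.physical.age = c(23, 14, 53, 22, 31),
  seq                = c(1, 1, 2, 1, 2),
  stringsAsFactors   = FALSE
)

# One data frame per occurrence number.
z <- split(y, y$seq)

# Rebuild one nested list per group, whatever the number of groups.
out <- vector("list", length(z))
names(out) <- names(z)
for (k in seq_along(z)) {
  d <- z[[k]]
  out[[k]] <- list(
    names = list(first = d$names.first, last = d$names.last),
    stats = list(physical = list(age = d$stats.physical.age))
  )
}

str(out)
```

The loop runs once per occurrence number, so its cost depends on how often the most frequent name repeats rather than on the number of rows.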

You say the real data is more deeply nested, in which case you will need to add is.null checks and tryCatch to handle errors.
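One way such guards might look when applied to a recursive subsetting function is sketched below. `foo_safe` is a hypothetical name, `idx` is assumed to be a logical vector with one entry per record, and leaving leaves of a different length untouched is a design choice made for this sketch, not something stated in the question.

```r
# Recursive subsetting with simple guards for irregular nested lists.
foo_safe <- function(x, idx, n = length(idx)) {
  if (is.null(x)) return(NULL)                 # empty slot: nothing to subset
  if (is.list(x)) return(lapply(x, foo_safe, idx = idx, n = n))
  if (length(x) != n) return(x)                # assumption: not per-record data, keep as-is
  x[idx]
}

# Hypothetical irregular input (not from the question).
nested <- list(
  a = 1:3,
  b = list(c = c("a", "b", "c"), note = "meta", empty = NULL)
)

res <- foo_safe(nested, c(TRUE, FALSE, TRUE))
str(res)
```

Whether mismatched leaves should be kept, dropped, or reported as errors (for example by calling `stop()` and wrapping the call in `tryCatch`) depends on the real data.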