Reshaping an array to a data.frame

Yes, use adply():

    library(plyr)
    adply(x, c(1, 2, 3))
    #    Subject Cond Item Measure1 Measure2 Measure3
    # 1       s1    A    1    -0.93   -0.360   -0.005
    # 2       s2    A    1     0.39    1.043    1.090
    # 3       s3    A    1     0.88    0.330    0.360
    # 4       s4    A    1     0.63   -0.120    0.040
    # 5       s5    A    1     0.86   -0.055    0.090
    # 6       s1    B    1    -0.69    0.070    0.170
    # 7       s2    B    1     1.02    0.670    0.680
    # 8       s3    B    1     0.29    0.480    0.510
    # 9       s4    B    1     0.94    0.002    0.090
    # 10      s5    B    1     0.93    0.008    0.120
    # 11      s1    A    2    -0.01   -0.190   -0.050
    # 12      s2    A    2     0.79   -1.390    0.110
    # 13      s3    A    2     0.32    0.980    0.990
    # 14      s4    A    2     0.14    0.430    0.620
    # 15      s5    A    2     0.13   -0.020    0.130
    # 16      s1    B    2    -0.07   -0.150    0.060
    # 17      s2    B    2    -0.63   -0.080    0.270
    # 18      s3    B    2     0.26    0.740    0.740
    # 19      s4    B    2     0.07    0.960    0.960
    # 20      s5    B    2     0.87    0.440    0.450

I have the following data structure (an "atomic vector"?) from the output of daply in plyr, where I had the function return three different measures for each subject, condition, and item.

    x = structure(
      c(-0.93, 0.39, 0.88, 0.63, 0.86, -0.69, 1.02, 0.29, 0.94, 0.93,
        -0.01, 0.79, 0.32, 0.14, 0.13, -0.07, -0.63, 0.26, 0.07, 0.87,
        -0.36, 1.043, 0.33, -0.12, -0.055, 0.07, 0.67, 0.48, 0.002, 0.008,
        -0.19, -1.39, 0.98, 0.43, -0.02, -0.15, -0.08, 0.74, 0.96, 0.44,
        -0.005, 1.09, 0.36, 0.04, 0.09, 0.17, 0.68, 0.51, 0.09, 0.12,
        -0.05, 0.11, 0.99, 0.62, 0.13, 0.06, 0.27, 0.74, 0.96, 0.45),
      .Dim = c(5L, 2L, 2L, 3L),
      .Dimnames = structure(
        list(Subject = c("s1", "s2", "s3", "s4", "s5"),
             Cond = c("A", "B"),
             Item = c("1", "2"),
             c("Measure1", "Measure2", "Measure3")),
        .Names = c("Subject", "Cond", "Item", "")))

I want to reshape it so that it looks like this:

    Subject Cond Item Measure1 Measure2 Measure3
    s1      A    1       -0.93   -0.360   -0.005
    s1      A    2       -0.01   -0.19    -0.05
    s1      B    1       -0.69    0.070    0.17
    s1      B    2       -0.07   -0.15     0.06
    s2      A    1        0.39    1.043    1.090
    s2      A    2        0.79   -1.39     0.11
    s2      B    1        1.02    0.670    0.68
    s2      B    2       -0.63   -0.08     0.27

etc.

Is there an easy way to do this?


df = melt(x) gets you something very similar to what you want. You could then derive the separate measure columns from the levels of the measure variable.

Using the "reshape2" package, try:

    library(reshape2)
    dcast(melt(x), Subject + Cond + Item ~ Var4)


ftable gets you where you need to be:

    y <- ftable(x)
    y
    #                    Measure1 Measure2 Measure3
    # Subject Cond Item
    # s1      A    1       -0.930   -0.360   -0.005
    #              2       -0.010   -0.190   -0.050
    #         B    1       -0.690    0.070    0.170
    #              2       -0.070   -0.150    0.060
    # s2      A    1        0.390    1.043    1.090
    #              2        0.790   -1.390    0.110
    #         B    1        1.020    0.670    0.680
    #              2       -0.630   -0.080    0.270
    # s3      A    1        0.880    0.330    0.360
    #              2        0.320    0.980    0.990
    #         B    1        0.290    0.480    0.510
    #              2        0.260    0.740    0.740
    # s4      A    1        0.630   -0.120    0.040
    #              2        0.140    0.430    0.620
    #         B    1        0.940    0.002    0.090
    #              2        0.070    0.960    0.960
    # s5      A    1        0.860   -0.055    0.090
    #              2        0.130   -0.020    0.130
    #         B    1        0.930    0.008    0.120
    #              2        0.870    0.440    0.450

But most people would probably prefer their data in a data.frame. Using as.data.frame.matrix extracts the values, but not the row and column names. ftable stores that information in the row.vars and col.vars attributes.

    attributes(y)$row.vars
    # $Subject
    # [1] "s1" "s2" "s3" "s4" "s5"
    #
    # $Cond
    # [1] "A" "B"
    #
    # $Item
    # [1] "1" "2"

    attributes(y)$col.vars
    # [[1]]
    # [1] "Measure1" "Measure2" "Measure3"

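We can use this information to write a function that converts an ftable to a data.frame:

    ftable2df <- function(mydata) {
      # Accept an ftable, or anything ftable() can convert (e.g. an array)
      if (!inherits(mydata, "ftable")) mydata <- ftable(mydata)
      # Row keys: reverse twice so the last row variable varies fastest,
      # matching the row order of the ftable
      dfrows <- rev(expand.grid(rev(attr(mydata, "row.vars"))))
      # Cell values as a plain data.frame (the names are lost at this step)
      dfcols <- as.data.frame.matrix(mydata)
      # Column names: combinations of the column variables, joined by "_"
      names(dfcols) <- do.call(
        paste, c(rev(expand.grid(rev(attr(mydata, "col.vars")))), sep = "_"))
      cbind(dfrows, dfcols)
    }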

Here it is used directly on your original "x":

    ftable2df(x)
    #    Subject Cond Item Measure1 Measure2 Measure3
    # 1       s1    A    1    -0.93   -0.360   -0.005
    # 2       s1    A    2    -0.01   -0.190   -0.050
    # 3       s1    B    1    -0.69    0.070    0.170
    # 4       s1    B    2    -0.07   -0.150    0.060
    # 5       s2    A    1     0.39    1.043    1.090
    # 6       s2    A    2     0.79   -1.390    0.110
    # 7       s2    B    1     1.02    0.670    0.680
    # 8       s2    B    2    -0.63   -0.080    0.270
    # 9       s3    A    1     0.88    0.330    0.360
    # 10      s3    A    2     0.32    0.980    0.990
    # 11      s3    B    1     0.29    0.480    0.510
    # 12      s3    B    2     0.26    0.740    0.740
    # 13      s4    A    1     0.63   -0.120    0.040
    # 14      s4    A    2     0.14    0.430    0.620
    # 15      s4    B    1     0.94    0.002    0.090
    # 16      s4    B    2     0.07    0.960    0.960
    # 17      s5    A    1     0.86   -0.055    0.090
    # 18      s5    A    2     0.13   -0.020    0.130
    # 19      s5    B    1     0.93    0.008    0.120
    # 20      s5    B    2     0.87    0.440    0.450
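The same idea (key columns from expand.grid plus a matrix of cell values) can also be applied to the array directly, without going through ftable. The following is only a sketch in base R, not part of the original answer: array_to_df is a hypothetical helper name, and arr is a small made-up stand-in with the same layout as x so that the block runs on its own. It assumes the last dimension holds the measures.

```r
# Toy stand-in for the question's array (hypothetical values 1..24),
# laid out as Subject x Cond x Item x measure, like x.
arr <- array(
  1:24,
  dim = c(2, 2, 2, 3),
  dimnames = list(
    Subject = c("s1", "s2"),
    Cond    = c("A", "B"),
    Item    = c("1", "2"),
    c("Measure1", "Measure2", "Measure3")
  )
)

# Hypothetical helper: treat the last dimension as columns and every
# combination of the leading dimensions as one row.
array_to_df <- function(a) {
  d <- dim(a)
  n <- length(d)
  # expand.grid varies its first factor fastest, which matches R's
  # column-major storage of the leading dimensions.
  keys <- expand.grid(dimnames(a)[-n], stringsAsFactors = FALSE)
  # Collapse the leading dimensions into rows; one column per measure.
  vals <- matrix(as.vector(a), ncol = d[n],
                 dimnames = list(NULL, dimnames(a)[[n]]))
  out <- cbind(keys, as.data.frame(vals))
  # Sort by the key columns from left to right (Subject, then Cond, then Item).
  out <- out[do.call(order, unname(as.list(out[names(keys)]))), ]
  rownames(out) <- NULL
  out
}

wide <- array_to_df(arr)
```

Calling array_to_df(x) on the array from the question should give the same layout as ftable2df(x), with the key columns as character vectors rather than factors.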


Use as.data.frame.table. That way you can avoid the cycle of loading plyr and then detaching it whenever you want to use dplyr.

    df0 <- as.data.frame.table(x)
    head(df0)
    #   Subject Cond Item     Var4  Freq
    # 1      s1    A    1 Measure1 -0.93
    # 2      s2    A    1 Measure1  0.39
    # 3      s3    A    1 Measure1  0.88
    # 4      s4    A    1 Measure1  0.63
    # 5      s5    A    1 Measure1  0.86
    # 6      s1    B    1 Measure1 -0.69

    library("tidyr")
    df1 <- df0 %>% spread(key = Var4, value = Freq)
    head(df1)
    #   Subject Cond Item Measure1 Measure2 Measure3
    # 1      s1    A    1    -0.93   -0.360   -0.005
    # 2      s1    A    2    -0.01   -0.190   -0.050
    # 3      s1    B    1    -0.69    0.070    0.170
    # 4      s1    B    2    -0.07   -0.150    0.060
    # 5      s2    A    1     0.39    1.043    1.090
    # 6      s2    A    2     0.79   -1.390    0.110
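In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the long-to-wide step above can also be written as follows. This is a sketch, not part of the original answer; it assumes tidyr >= 1.0.0 is installed, and arr is a small made-up stand-in for x so that the block runs on its own.

```r
library(tidyr)

# Toy stand-in for the question's array (hypothetical values 1..24),
# laid out as Subject x Cond x Item x measure, like x.
arr <- array(
  1:24,
  dim = c(2, 2, 2, 3),
  dimnames = list(
    Subject = c("s1", "s2"),
    Cond    = c("A", "B"),
    Item    = c("1", "2"),
    c("Measure1", "Measure2", "Measure3")
  )
)

# Long format: one row per cell. The unnamed fourth dimension is
# labelled "Var4", and responseName names the value column.
long <- as.data.frame.table(arr, responseName = "value")

# Wide format: one column per level of Var4.
wide2 <- pivot_wider(long, names_from = Var4, values_from = value)
```

With the data from the question, the equivalent call would be pivot_wider(df0, names_from = Var4, values_from = Freq).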