Reshaping magic in data.frame

If you want to use the original reshape package for some reason:

    library(reshape)

    Shop.Name <- c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3")
    Items <- c(2, 4, 3, 2, 1)
    Product <- c("Product1", "Product2", "Product1", "Product1", "Product4")

    # Build the long-format data and print it
    (df <- data.frame(Shop.Name, Items, Product))

    # One row per shop, one column per product, 0 where there were no sales
    cast(df, formula = Shop.Name ~ Product, value = "Items", fill = 0)

This question already has an answer here:

Reshape three column data frame to matrix ("long" to "wide" format) (2 answers)

I am currently learning to work with data.frames and I am quite confused about how to reshape them.

At the moment, I have a data.frame that shows:

column 1: the name of a shop
column 2: a product
column 3: the number of purchases of this product in this shop

or, visually, something like this:

+---+-----------+-------+----------+
|   | Shop.Name | Items | Product  |
+---+-----------+-------+----------+
| 1 | Shop1     | 2     | Product1 |
| 2 | Shop1     | 4     | Product2 |
| 3 | Shop2     | 3     | Product1 |
| 4 | Shop3     | 2     | Product1 |
| 5 | Shop3     | 1     | Product4 |
+---+-----------+-------+----------+

What I would like to achieve is the following "shop-centric" structure:

column 1: the name of a shop
column 2: items sold for Product1
column 3: items sold for Product2
column 4: items sold for Product3 ...

When there is no row for a specific shop/product combination (because there were no sales), I would like to fill in a 0.

+---+-------+-------+-------+-------+-------+-----+
|   | Shop  | Prod1 | Prod2 | Prod3 | Prod4 | ... |
+---+-------+-------+-------+-------+-------+-----+
| 1 | Shop1 | 2     | 4     | 0     | 0     | ... |
| 2 | Shop2 | 3     | 0     | 0     | 0     | ... |
| 3 | Shop3 | 2     | 0     | 0     | 1     | ... |
+---+-------+-------+-------+-------+-------+-----+

Use dcast from the reshape2 package (the Items values below come from rpois, so they will differ from run to run):

    > library(reshape2)
    > df <- data.frame(Shop.Name=rep(c("Shop1","Shop2","Shop3"),each=3),
    +                  Items=rpois(9,5),
    +                  Product=c(rep(c("Prod1","Prod2","Prod3","Prod4"),2),"Prod5")
    +                  )
    > df
      Shop.Name Items Product
    1     Shop1     6   Prod1
    2     Shop1     5   Prod2
    3     Shop1     6   Prod3
    4     Shop2     5   Prod4
    5     Shop2     6   Prod1
    6     Shop2     6   Prod2
    7     Shop3     4   Prod3
    8     Shop3     7   Prod4
    9     Shop3     5   Prod5
    > dcast(df,Shop.Name ~ Product,value.var="Items",fill=0)
      Shop.Name Prod1 Prod2 Prod3 Prod4 Prod5
    1     Shop1     6     5     6     0     0
    2     Shop2     6     6     0     5     0
    3     Shop3     0     0     4     7     5

The answers so far work to a certain extent, but they don't fully answer your question. In particular, they don't address the case where no shop sells a particular product. Going by your example input and desired output, no shop sold "Product3". In fact, "Product3" doesn't even appear in your source data.frame. They also don't address the possibility of having more than one row for a given Shop + Product combination.

Here is a modified version of your data and the two solutions so far. I have added another row for the combination of "Shop1" and "Product1". Note that I have converted your products to a factor variable that includes all the levels the variable can take, even if no case has that level.

    mydf <- data.frame(
      Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
      Items = c(2, 4, 3, 2, 1, 2),
      Product = factor(
        c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
        levels = c("Product1", "Product2", "Product3", "Product4")))
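Before reshaping, it can help to confirm both issues in the data. The following is a minimal base R sketch (it rebuilds mydf so that it runs on its own; the names product_counts and dup_rows are only illustrative) that lists the declared factor levels and flags any repeated Shop.Name + Product combination:

```r
mydf <- data.frame(
  Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
  Items = c(2, 4, 3, 2, 1, 2),
  Product = factor(
    c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
    levels = c("Product1", "Product2", "Product3", "Product4")))

# Declared levels include "Product3" even though no row uses it
levels(mydf$Product)

# Row counts per level; the unused level shows up with a count of 0
product_counts <- table(mydf$Product)
product_counts

# TRUE for rows whose Shop.Name + Product pair already occurred earlier
dup_rows <- duplicated(mydf[c("Shop.Name", "Product")])
mydf[dup_rows, ]
```

Any row flagged here means the reshape needs an aggregation function, because two or more values compete for the same cell.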

dcast from "reshape2"

    library(reshape2)
    dcast(mydf, formula = Shop.Name ~ Product, value="Items", fill=0)
    # Using Product as value column: use value.var to override.
    # Aggregation function missing: defaulting to length
    # Error in .fun(.value[i], ...) :
    #   2 arguments passed to 'length' which requires 1
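What? Suddenly it doesn't work. dcast takes value.var rather than value, so value = "Items" is passed through to the aggregation function (here the default, length), which causes the error. Do this instead:

    dcast(mydf, formula = Shop.Name ~ Product,
          fill = 0, value.var = "Items",
          fun.aggregate = sum, drop = FALSE)
    #   Shop.Name Product1 Product2 Product3 Product4
    # 1     Shop1        4        4        0        0
    # 2     Shop2        3        0        0        0
    # 3     Shop3        2        0        0        1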
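The fun.aggregate = sum argument is what collapses the two Shop1/Product1 rows into a single cell. As a rough illustration of that step only, using base R's aggregate() as a stand-in (an assumption for illustration, not a description of what dcast does internally; the name totals is hypothetical), the same per-combination totals can be computed while the data is still long:

```r
mydf <- data.frame(
  Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
  Items = c(2, 4, 3, 2, 1, 2),
  Product = factor(
    c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
    levels = c("Product1", "Product2", "Product3", "Product4")))

# Sum Items within each observed Shop.Name + Product combination
totals <- aggregate(Items ~ Shop.Name + Product, data = mydf, FUN = sum)
totals
```

This keeps the data long and omits combinations that have no rows, so it shows the aggregation step but not the effect of fill = 0 or drop = FALSE.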
Let's go old-school. cast from "reshape"

    library(reshape)
    cast(mydf, formula = Shop.Name ~ Product, value="Items", fill=0)
    # Aggregation requires fun.aggregate: length used as default
    #   Shop.Name Product1 Product2 Product4
    # 1     Shop1        2        1        0
    # 2     Shop2        1        0        0
    # 3     Shop3        1        0        1
Eh. Not what you wanted, again... Try this instead:

    cast(mydf, formula = Shop.Name ~ Product, value = "Items",
         fill = 0, add.missing = TRUE, fun.aggregate = sum)
    #   Shop.Name Product1 Product2 Product3 Product4
    # 1     Shop1        4        4        0        0
    # 2     Shop2        3        0        0        0
    # 3     Shop3        2        0        0        1
Let's get back to basics. xtabs from base R

    xtabs(Items ~ Shop.Name + Product, mydf)
    #          Product
    # Shop.Name Product1 Product2 Product3 Product4
    #     Shop1        4        4        0        0
    #     Shop2        3        0        0        0
    #     Shop3        2        0        0        1
Or, if you prefer a data.frame (note that your "Shop.Name" variable has been converted to the row.names of the data.frame):

    as.data.frame.matrix(xtabs(Items ~ Shop.Name + Product, mydf))
    #       Product1 Product2 Product3 Product4
    # Shop1        4        4        0        0
    # Shop2        3        0        0        0
    # Shop3        2        0        0        1
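If Shop.Name is needed as an ordinary column, as in the desired output, the row names can be moved back. The sketch below is one possible way to do that in base R, rebuilding mydf so it runs on its own; the names tab, long_again and wide are illustrative. It also shows why as.data.frame.matrix() is used here: plain as.data.frame() on a table returns the long layout again.

```r
mydf <- data.frame(
  Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
  Items = c(2, 4, 3, 2, 1, 2),
  Product = factor(
    c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
    levels = c("Product1", "Product2", "Product3", "Product4")))

tab <- xtabs(Items ~ Shop.Name + Product, mydf)

# as.data.frame() on a table returns long format with a Freq column
long_again <- as.data.frame(tab)

# as.data.frame.matrix() keeps the wide layout, with shops as row names
wide <- as.data.frame.matrix(tab)

# Move the row names back into an ordinary Shop.Name column
wide <- cbind(Shop.Name = rownames(wide), wide)
rownames(wide) <- NULL
wide
```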