Find the mean of multiple columns based on multiple grouping columns in an R data frame.
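To find the mean of multiple columns based on multiple grouping columns in an R data frame, we can use the summarise_at function from the dplyr package together with the mean function.
For example, if we have a data frame called df that contains two grouping columns, say G1 and G2, and two numerical columns, say Num1 and Num2, then we can find the mean of Num1 and Num2 for each combination of G1 and G2 with the following command: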
df %>% group_by(G1, G2) %>% summarise_at(vars("Num1", "Num2"), mean)
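As a minimal sketch of the command above, the following runs it on a small toy data frame. The values are hypothetical and chosen only so the group means are easy to check by hand. It also shows the across() form, since summarise_at is marked as superseded in dplyr 1.0.0 and later; both forms should give the same means.

```r
library(dplyr)

# Hypothetical toy data: two grouping columns and two numeric columns
df <- data.frame(
  G1   = c("a", "a", "b", "b"),
  G2   = c("x", "x", "y", "y"),
  Num1 = c(1, 3, 5, 7),
  Num2 = c(2, 4, 6, 8)
)

# Scoped-verb form, as used in this article
res_at <- df %>%
  group_by(G1, G2) %>%
  summarise_at(vars(Num1, Num2), mean)

# across() form (assumes dplyr >= 1.0.0 is installed)
res_across <- df %>%
  group_by(G1, G2) %>%
  summarise(across(c(Num1, Num2), mean), .groups = "drop_last")

res_at      # one row per (G1, G2) pair: means 2 and 3 for a/x, 6 and 7 for b/y
res_across  # same values
```

Either form returns one row per combination of G1 and G2; which one to prefer mostly depends on the dplyr version in use.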
Example 1
The following snippet creates a sample data frame:
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Score1 <- sample(1:10, 20, replace = TRUE)
Score2 <- sample(1:10, 20, replace = TRUE)
Score3 <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(Gender, Class, Score1, Score2, Score3)
df1
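This creates a data frame such as the following (the values vary between runs because sample() draws at random):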
   Gender  Class Score1 Score2 Score3
1  Female Second     10      9     10
2    Male  First      4      8      3
3    Male  First     10      6     10
4    Male  First      3      6      3
5    Male Second      1      2      8
6  Female Second      9      7      7
7  Female  First      5      3      3
8    Male  Third      4      4      5
9  Female  Third      8     10      2
10   Male  First      3      4     10
11 Female  Third      9      5     10
12   Male Second      1      8      4
13 Female  First      5      3      1
14   Male Second      2      9     10
15 Female  Third      8      8     10
16 Female Second     10      1      3
17 Female Second      8      5      4
18 Female  First      2      1      2
19   Male  Third      3      1      8
20 Female Second      6      5      7
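To load the dplyr package and find the mean of the Score columns for each combination of Gender and Class in the data frame created above, add the following code to the snippet above: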
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Score1 <- sample(1:10, 20, replace = TRUE)
Score2 <- sample(1:10, 20, replace = TRUE)
Score3 <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(Gender, Class, Score1, Score2, Score3)
library(dplyr)
df1 %>% group_by(Gender, Class) %>% summarise_at(vars("Score1", "Score2", "Score3"), mean)
Output
If you execute all the snippets above as a single program, it produces output like the following (shown here for the data frame printed above):
# A tibble: 6 x 5
# Groups:   Gender [2]
  Gender Class  Score1 Score2 Score3
  <chr>  <chr>   <dbl>  <dbl>  <dbl>
1 Female First    4      2.33   2
2 Female Second   8.6    5.4    6.2
3 Female Third    8.33   7.67   7.33
4 Male   First    5      6      6.5
5 Male   Second   1.33   6.33   7.33
6 Male   Third    3.5    2.5    6.5
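Because sample() is random, rerunning Example 1 gives different numbers each time. The sketch below fixes the seed so the result is repeatable, and cross-checks the dplyr result against base R's aggregate(). The seed value 123 is an arbitrary assumption, not something from the original example, and the names by_dplyr and by_base are illustrative.

```r
library(dplyr)

set.seed(123)  # assumption: any fixed seed works; 123 is arbitrary
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Class  <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Score1 <- sample(1:10, 20, replace = TRUE)
Score2 <- sample(1:10, 20, replace = TRUE)
Score3 <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(Gender, Class, Score1, Score2, Score3)

# Group means with dplyr, sorted so the rows can be compared directly
by_dplyr <- df1 %>%
  group_by(Gender, Class) %>%
  summarise_at(vars(Score1, Score2, Score3), mean) %>%
  ungroup() %>%
  arrange(Gender, Class)

# The same group means with base R, sorted the same way
by_base <- aggregate(cbind(Score1, Score2, Score3) ~ Gender + Class,
                     data = df1, FUN = mean)
by_base <- by_base[order(by_base$Gender, by_base$Class), ]

# Compare the two results column by column
all.equal(by_dplyr$Score1, by_base$Score1)
all.equal(by_dplyr$Score2, by_base$Score2)
all.equal(by_dplyr$Score3, by_base$Score3)
```

If the two approaches agree, each all.equal() call returns TRUE, which is a quick sanity check that the grouping behaves as expected.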
Example 2
The following snippet creates a sample data frame:
Group1 <- sample(LETTERS[1:4], 20, replace = TRUE)
Group2 <- sample(letters[1:4], 20, replace = TRUE)
x1 <- sample(1:100, 20)
x2 <- sample(1:100, 20)
x3 <- sample(1:100, 20)
df2 <- data.frame(Group1, Group2, x1, x2, x3)
df2
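This creates a data frame such as the following (the values vary between runs because sample() draws at random):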
   Group1 Group2 x1 x2 x3
1       B      c 90 19 95
2       D      b 98 90  9
3       D      b 14 67 96
4       B      d 91 52 98
5       A      b 27 83 30
6       A      a 29 95 27
7       D      d 28 69 80
8       C      b 58 72 42
9       B      c 41 99  1
10      A      a 62 20 49
11      B      c 47 87 67
12      C      c 71 58 43
13      A      d 23  6 89
14      B      a 39 13 15
15      D      c 22  7 23
16      D      c 72  1 61
17      D      c 21 55  6
18      B      d 48 63 41
19      B      a 69 12 18
20      A      b 88 86 20
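To find the mean of the x columns for each combination of Group1 and Group2 in the data frame created above, add the following code to the snippet above: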
Group1 <- sample(LETTERS[1:4], 20, replace = TRUE)
Group2 <- sample(letters[1:4], 20, replace = TRUE)
x1 <- sample(1:100, 20)
x2 <- sample(1:100, 20)
x3 <- sample(1:100, 20)
df2 <- data.frame(Group1, Group2, x1, x2, x3)
library(dplyr)
df2 %>% group_by(Group1, Group2) %>% summarise_at(vars("x1", "x2", "x3"), mean)
Output
If you execute all the snippets above as a single program, it produces output like the following (shown here for the data frame printed above):
# A tibble: 11 x 5
# Groups:   Group1 [4]
   Group1 Group2    x1    x2    x3
   <chr>  <chr>  <dbl> <dbl> <dbl>
 1 A      a       45.5  57.5  38
 2 A      b       57.5  84.5  25
 3 A      d       23     6    89
 4 B      a       54    12.5  16.5
 5 B      c       59.3  68.3  54.3
 6 B      d       69.5  57.5  69.5
 7 C      b       58    72    42
 8 C      c       71    58    43
 9 D      b       56    78.5  52.5
10 D      c       38.3  21    30
11 D      d       28    69    80
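When the numeric columns share a naming pattern, as x1, x2 and x3 do here, listing each one in vars() is optional. As a hedged sketch, the snippet below selects them with the tidyselect helper starts_with() instead. The data frame df2_head is an illustrative name holding only the first six rows of the df2 printout above, so the result is deterministic.

```r
library(dplyr)

# First six rows of the df2 printout above (df2_head is an illustrative name)
df2_head <- data.frame(
  Group1 = c("B", "D", "D", "B", "A", "A"),
  Group2 = c("c", "b", "b", "d", "b", "a"),
  x1     = c(90, 98, 14, 91, 27, 29),
  x2     = c(19, 90, 67, 52, 83, 95),
  x3     = c(95,  9, 96, 98, 30, 27)
)

# Columns named explicitly, as in the article
by_name <- df2_head %>%
  group_by(Group1, Group2) %>%
  summarise_at(vars(x1, x2, x3), mean)

# Columns selected by prefix; the grouping columns are not summarised
by_pattern <- df2_head %>%
  group_by(Group1, Group2) %>%
  summarise_at(vars(starts_with("x")), mean)

by_name
by_pattern
```

Both calls should return the same five groups. Selecting by pattern is convenient when there are many similarly named columns, but it will also pick up any unrelated column whose name happens to start with "x".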