How to find the frequency of a particular string in a column based on another column in an R data frame using the dplyr package?
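When an R data frame has two or more categorical columns, with strings as category levels or with numbers stored as strings or integers, we can find the frequency of one column's values based on another column. This helps identify cross-column frequencies and shows how one category is distributed across the levels of another column. To do this with the dplyr package, we can use the filter function to keep the rows that match a specific string, followed by the count function to tally the values of the other column.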
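As a minimal sketch of the filter-then-count pattern described above, the snippet below uses a small hand-written data frame so that the result is reproducible. The names `toy` and `freq_I` and the values in the data frame are assumptions made for illustration; they are not part of the original example.

```r
library(dplyr)

# Hypothetical toy data, typed by hand so the result does not depend on sample()
toy <- data.frame(
  Group    = c(1, 1, 2, 2, 2, 3),
  Standard = c("I", "II", "I", "I", "III", "I")
)

# Keep only the rows where Standard is "I", then count the rows per Group
freq_I <- toy %>%
  filter(Standard == "I") %>%
  count(Group)

print(freq_I)
```

The result has one row per Group value that still appears after filtering, with the frequency stored in a column named n.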
Example
Consider the following data frame:
# sample() draws at random, so your values will differ from the output shown
Group <- sample(1:5, 20, replace = TRUE)
Standard <- sample(c("I", "II", "III"), 20, replace = TRUE)
df1 <- data.frame(Group, Standard)
df1
Output
   Group Standard
1      3      III
2      5      III
3      5        I
4      3        I
5      2       II
6      4       II
7      3      III
8      2        I
9      1       II
10     4      III
11     3       II
12     4      III
13     4      III
14     4      III
15     4      III
16     4      III
17     5      III
18     3       II
19     5      III
20     1      III
Finding the frequency of each Group value for Standard "I":
library(dplyr)
df1 %>% filter(Standard == "I") %>% count(Group)
Output
  Group n
1     2 1
2     3 1
3     5 1
Example
df1 %>% filter(Standard == "II") %>% count(Group)
Output
  Group n
1     1 1
2     2 1
3     3 2
4     4 1
Example
df1 %>% filter(Standard == "III") %>% count(Group)
Output
  Group n
1     1 1
2     3 2
3     4 6
4     5 3
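Running one filter per level works, but it can be convenient to see all of the cross-column frequencies in a single result. The sketch below passes both columns to count, which tallies every combination that occurs in the data. It is an alternative to the filter approach shown above, not a replacement for it, and the `toy` data frame and the name `all_freq` are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical toy data, typed by hand so the result does not depend on sample()
toy <- data.frame(
  Group    = c(1, 1, 2, 2, 2, 3),
  Standard = c("I", "II", "I", "I", "III", "I")
)

# Count every Standard/Group combination that occurs in the data
all_freq <- toy %>% count(Standard, Group)

print(all_freq)
```

Filtering this result, for example with filter(Standard == "I"), gives the same rows as filtering first and counting afterwards. Combinations that never occur are simply absent rather than reported with a count of zero.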
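Let's look at another example:
# sample() draws at random, so your values will differ from the output shown
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df2 <- data.frame(Gender, Class)
df2
Output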
   Gender  Class
1  Female  Third
2  Female  First
3  Female Second
4    Male  Third
5    Male  Third
6  Female Second
7    Male  First
8  Female  Third
9  Female Second
10 Female Second
11 Female  First
12 Female Second
13   Male  First
14 Female  Third
15 Female  Third
16   Male  Third
17   Male  Third
18   Male Second
19 Female Second
20   Male Second
Example
df2 %>% filter(Class == "Third") %>% count(Gender)
Output
  Gender n
1 Female 4
2   Male 4
Example
df2 %>% filter(Class == "First") %>% count(Gender)
Output
  Gender n
1 Female 2
2   Male 2
Example
df2 %>% filter(Class == "Second") %>% count(Gender)
Output
  Gender n
1 Female 6
2   Male 2