How to find the total of an integer column based on two different character columns in R?

Updated on 2025/4/12 4:22:17

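Finding the total of an integer column based on two different character columns amounts to building a two-way table of the data, laid out like a contingency table but with each cell holding a sum instead of a count. We can do this with the with and tapply functions. For example, if we have a data frame df that contains two categorical columns named gender and ethnicity and an integer column named Package, the table can be created as follows: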

with(df,tapply(Package,list(gender,ethnicity),sum))
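The call above assumes a data frame df already exists. As a minimal runnable sketch, the block below builds a small hypothetical df (the column names follow the call above; the values and the ethnicity labels "A" and "B" are made up for illustration) and applies the same call.

```r
# Hypothetical data: column names match the call above, values are invented
df <- data.frame(
  gender    = c("Male", "Female", "Male", "Female", "Male", "Female"),
  ethnicity = c("A", "A", "B", "B", "A", "B"),
  Package   = c(10L, 20L, 30L, 40L, 50L, 60L)
)

# Rows are the gender levels, columns are the ethnicity levels,
# and each cell is the sum of Package for that combination
totals <- with(df, tapply(Package, list(gender, ethnicity), sum))
print(totals)
```

Here the Male/A cell is 10 + 50 = 60 and the Female/B cell is 40 + 60 = 100, because those are the rows that share each pair of labels.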

Example

Consider the following data frame:

# Fix the seed so the random samples are reproducible
set.seed(777)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Group <- sample(c("GP1", "GP2", "GP3", "GP4"), 20, replace = TRUE)
Rate <- sample(0:10, 20, replace = TRUE)
df1 <- data.frame(Class, Group, Rate)
df1

Output

    Class Group Rate
1   First   GP1    7
2  Second   GP2    1
3  Second   GP4    1
4  Second   GP4    0
5   Third   GP2   10
6  Second   GP2    8
7   First   GP1    7
8   First   GP4    4
9  Second   GP1    4
10  Third   GP3    8
11 Second   GP2    8
12  First   GP2    4
13  Third   GP2    6
14  Third   GP4    4
15  Third   GP4    5
16 Second   GP1    2
17 Second   GP1    9
18 Second   GP3    2
19 Second   GP3    1
20  Third   GP4   10

Example

str(df1)
'data.frame': 20 obs. of 3 variables:
$ Class: chr "First" "Second" "Second" "Second" ...
$ Group: chr "GP1" "GP2" "GP4" "GP4" ...
$ Rate : int 7 1 1 0 10 8 7 4 4 8 ...

Finding the total of Rate based on Class and Group:

with(df1, tapply(Rate, list(Class, Group), sum))
       GP1 GP2 GP3 GP4
First   14   4  NA   4
Second  15  17   3   1
Third   NA  16   8  19
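The NA cells mark Class and Group combinations that have no rows at all (for example, no First row belongs to GP3). If 0 is preferred for those cells, the sketch below shows two possible approaches. It assumes R 3.4.0 or later for the default argument of tapply, and it retypes df1 from the printed output instead of regenerating it from the seed, so that it runs on its own.

```r
# df1 retyped from the printed data frame so the example is self-contained
df1 <- data.frame(
  Class = c("First", "Second", "Second", "Second", "Third",
            "Second", "First", "First", "Second", "Third",
            "Second", "First", "Third", "Third", "Third",
            "Second", "Second", "Second", "Second", "Third"),
  Group = c("GP1", "GP2", "GP4", "GP4", "GP2",
            "GP2", "GP1", "GP4", "GP1", "GP3",
            "GP2", "GP2", "GP2", "GP4", "GP4",
            "GP1", "GP1", "GP3", "GP3", "GP4"),
  Rate  = c(7L, 1L, 1L, 0L, 10L, 8L, 7L, 4L, 4L, 8L,
            8L, 4L, 6L, 4L, 5L, 2L, 9L, 2L, 1L, 10L)
)

# Option 1: tell tapply which value to use for empty combinations
totals0 <- with(df1, tapply(Rate, list(Class, Group), sum, default = 0))
print(totals0)

# Option 2: xtabs sums the left-hand side of the formula
# and fills empty combinations with 0
totals_xt <- xtabs(Rate ~ Class + Group, data = df1)
print(totals_xt)
```

Both results have the same cell values. Whether an empty combination should read as 0 or as NA depends on the analysis: a 0 cannot be told apart from a group whose values genuinely add up to 0, such as a group containing only a Rate of 0.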

Let us look at another example:

Example

Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Centering <- sample(c("Yes", "No"), 20, replace = TRUE)
# Without replace = TRUE, the 20 percentages are drawn without repetition
Percentage <- sample(1:100, 20)
df2 <- data.frame(Gender, Centering, Percentage)
df2

Output

   Gender Centering Percentage
1    Male        No         28
2    Male        No         89
3  Female       Yes         38
4    Male        No         78
5    Male       Yes         19
6  Female        No         46
7  Female       Yes         94
8    Male        No          4
9    Male       Yes         92
10   Male        No         90
11   Male       Yes         66
12 Female        No         57
13 Female        No         74
14 Female        No         48
15 Female       Yes         20
16   Male       Yes         51
17   Male        No         82
18   Male        No          7
19   Male        No         53
20   Male        No         55

Finding the total of Percentage based on Gender and Centering:

with(df2, tapply(Percentage, list(Gender, Centering), sum))
        No Yes
Female 225 152
Male   486 228
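If the totals are needed as a data frame with one row per combination instead of a two-way table, aggregate is one possible alternative. The sketch below is not part of the original example; it retypes df2 from the printed output so that it runs on its own.

```r
# df2 retyped from the printed data frame so the example is self-contained
df2 <- data.frame(
  Gender = c("Male", "Male", "Female", "Male", "Male",
             "Female", "Female", "Male", "Male", "Male",
             "Male", "Female", "Female", "Female", "Female",
             "Male", "Male", "Male", "Male", "Male"),
  Centering = c("No", "No", "Yes", "No", "Yes",
                "No", "Yes", "No", "Yes", "No",
                "Yes", "No", "No", "No", "Yes",
                "Yes", "No", "No", "No", "No"),
  Percentage = c(28L, 89L, 38L, 78L, 19L, 46L, 94L, 4L, 92L, 90L,
                 66L, 57L, 74L, 48L, 20L, 51L, 82L, 7L, 53L, 55L)
)

# One row per Gender/Centering combination, with the summed Percentage
long <- aggregate(Percentage ~ Gender + Centering, data = df2, FUN = sum)
print(long)

# Sanity check: the group totals add up to the total of the whole column
stopifnot(sum(long$Percentage) == sum(df2$Percentage))
```

The long layout is convenient for merging or plotting, while the tapply table is easier to read at a glance.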
