How to find the cumulative sum using two factor columns in an R data frame?

Updated on 2025/4/11 6:52:17

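A cumulative sum is usually computed over a single variable, or sometimes within the groups of one categorical variable; computing it within the combinations of two categorical variables is less common. One way to do this is to convert the data frame to a data.table object and define a cumulative-sum column with the cumsum function, grouped by both factor columns.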

Example

Consider the following data frame:

> set.seed(1361)
> Factor1<-as.factor(sample(LETTERS[1:4],20,replace=TRUE))
> Factor2<-as.factor(sample(c("T1","T2","T3","T4"),20,replace=TRUE))
> Response<-rpois(20,5)
> df1<-data.frame(Factor1,Factor2,Response)
> df1

Output

   Factor1 Factor2 Response
1        A      T2        9
2        B      T1        8
3        B      T1        2
4        A      T2        3
5        B      T1        7
6        B      T2        7
7        D      T2        7
8        D      T4        7
9        C      T4        6
10       B      T1        6
11       A      T2        4
12       A      T2        4
13       C      T1        7
14       B      T3        1
15       A      T3        6
16       D      T1        3
17       B      T1        8
18       D      T4        5
19       D      T2        3
20       C      T1        4

Load the data.table package:

> library(data.table)

Convert the data frame df1 to a data.table object:

> dt1<-data.table(df1)
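
The conversion step can also be written with `as.data.table()` or `setDT()`, both of which are part of the data.table package. The following is a minimal sketch on a made-up data frame (the names `df`, `dt_copy`, `g`, and `x` are hypothetical and not part of the original example), showing the difference between converting a copy and converting in place:

```r
library(data.table)

# Small illustrative data frame (values are made up for this sketch)
df <- data.frame(g = c("a", "b", "a"), x = c(1, 2, 3))

# as.data.table() returns a new data.table and leaves df as a plain data frame
dt_copy <- as.data.table(df)

# setDT() converts df itself to a data.table by reference, without copying
setDT(df)

is.data.table(dt_copy)
is.data.table(df)
```

Converting by reference may be preferable for large data frames, since it avoids holding two copies of the data in memory.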

Create a CumulativeSums column holding the running total of Response within each combination of Factor1 and Factor2:

Example

> dt1[,CumulativeSums:=cumsum(Response),by=list(Factor1,Factor2)]
> dt1

Output

    Factor1 Factor2 Response CumulativeSums
 1:       A      T2        9              9
 2:       B      T1        8              8
 3:       B      T1        2             10
 4:       A      T2        3             12
 5:       B      T1        7             17
 6:       B      T2        7              7
 7:       D      T2        7              7
 8:       D      T4        7              7
 9:       C      T4        6              6
10:       B      T1        6             23
11:       A      T2        4             16
12:       A      T2        4             20
13:       C      T1        7              7
14:       B      T3        1              1
15:       A      T3        6              6
16:       D      T1        3              3
17:       B      T1        8             31
18:       D      T4        5             12
19:       D      T2        3             10
20:       C      T1        4             11
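
The same grouped running total can also be obtained without data.table. As a rough sketch of an alternative (not the method used in this article), base R's `ave()` applies a function within each combination of the grouping factors. The data frame `df_small` below is a hypothetical name; it holds the first six rows of df1, typed in by hand so that the block is self-contained:

```r
# First six rows of df1, entered by hand for this sketch
df_small <- data.frame(
  Factor1  = factor(c("A", "B", "B", "A", "B", "B")),
  Factor2  = factor(c("T2", "T1", "T1", "T2", "T1", "T2")),
  Response = c(9, 8, 2, 3, 7, 7)
)

# ave() runs FUN separately within each Factor1/Factor2 combination
# and returns the results in the original row order
df_small$CumulativeSums <- ave(
  df_small$Response,
  df_small$Factor1, df_small$Factor2,
  FUN = cumsum
)

df_small$CumulativeSums
# 9 8 10 12 17 7
```

These values match the first six entries of the CumulativeSums column in the data.table output above.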

Let's look at another example:

Example

> G1<-as.factor(sample(c("Hot","Cold"),20,replace=TRUE))
> G2<-as.factor(sample(c("Low","Medium","Large"),20,replace=TRUE))
> Y<-sample(1:100,20)
> df2<-data.frame(G1,G2,Y)
> df2

Output

     G1     G2  Y
1   Hot Medium 60
2  Cold    Low 94
3   Hot    Low 22
4  Cold Medium 90
5   Hot Medium 16
6   Hot  Large 32
7  Cold    Low 44
8   Hot    Low 73
9   Hot Medium 99
10  Hot Medium 68
11 Cold Medium 41
12 Cold  Large 77
13 Cold  Large 48
14 Cold Medium 20
15 Cold Medium 18
16 Cold    Low 12
17 Cold    Low 30
18  Hot    Low 23
19 Cold Medium 26
20 Cold Medium  4

Example

> dt2<-data.table(df2)
> dt2[,CumulativeSums:=cumsum(Y),by=list(G1,G2)]
> dt2

Output

      G1     G2  Y CumulativeSums
 1:  Hot Medium 60             60
 2: Cold    Low 94             94
 3:  Hot    Low 22             22
 4: Cold Medium 90             90
 5:  Hot Medium 16             76
 6:  Hot  Large 32             32
 7: Cold    Low 44            138
 8:  Hot    Low 73             95
 9:  Hot Medium 99            175
10:  Hot Medium 68            243
11: Cold Medium 41            131
12: Cold  Large 77             77
13: Cold  Large 48            125
14: Cold Medium 20            151
15: Cold Medium 18            169
16: Cold    Low 12            150
17: Cold    Low 30            180
18:  Hot    Low 23            118
19: Cold Medium 26            195
20: Cold Medium  4            199
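
One way to sanity-check a grouped cumulative sum is to compare the last running total in each group with the plain group sum. The sketch below does this on the first five rows of df2, typed in by hand; `dt_small`, `check`, `LastCumSum`, and `Total` are hypothetical names introduced for illustration. It also uses `.()`, which data.table accepts as shorthand for `list()`:

```r
library(data.table)

# First five rows of df2, entered by hand for this sketch
dt_small <- data.table(
  G1 = c("Hot", "Cold", "Hot", "Cold", "Hot"),
  G2 = c("Medium", "Low", "Low", "Medium", "Medium"),
  Y  = c(60, 94, 22, 90, 16)
)

# .() is data.table shorthand for list()
dt_small[, CumulativeSums := cumsum(Y), by = .(G1, G2)]

# Within each group, the last running total (.N is the group's row count)
# should equal the plain sum of Y for that group
check <- dt_small[, .(LastCumSum = CumulativeSums[.N], Total = sum(Y)),
                  by = .(G1, G2)]
check
```

If the grouping is set up as intended, the LastCumSum and Total columns agree in every row of `check`.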
