How to standardize a column of a data.table object by group in R?
Updated on 2025/6/24 6:07:17
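To standardize a column of a data.table object by group, we can apply the scale function to the column and pass the grouping column through the by argument of data.table.
For example, if we have a data.table object called DT with two columns, G and Num, where G is the grouping column and Num is a numeric column, we can standardize Num within each level of G with the following command:
DT[, "Num" := as.vector(scale(Num)), by = G]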
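To make clear what the command computes, here is a minimal sketch on a small hand-made table (the values and the column names Manual and Scaled are made up for illustration). Within each group, scale subtracts the group mean and divides by the group's sample standard deviation; it returns a one-column matrix with attributes, which is why as.vector is used to get a plain numeric vector back.

```r
library(data.table)

# Hypothetical toy data: two groups with easy-to-check values
DT <- data.table(G   = c("a", "a", "a", "b", "b", "b"),
                 Num = c(1, 2, 3, 10, 20, 30))

# Manual z-score per group: (value - group mean) / group sd
DT[, Manual := (Num - mean(Num)) / sd(Num), by = G]

# Same computation with scale(); as.vector() drops the matrix structure
DT[, Scaled := as.vector(scale(Num)), by = G]

print(DT)
```

Both new columns should hold the same values. Writing the result to a new column, as done here, keeps the original Num values; the command in the article overwrites Num in place.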
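Example 1
Consider the following data.table object:
library(data.table)
Grp <- sample(c("Male", "Female"), 20, replace = TRUE)
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)
DT1
This creates the following data.table (the values are drawn at random without a fixed seed, so yours will differ):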
       Grp Response
 1: Female     5.31
 2:   Male     5.20
 3: Female     6.38
 4:   Male     4.53
 5: Female     4.90
 6: Female     4.78
 7:   Male     3.73
 8: Female     6.19
 9:   Male     4.33
10:   Male     7.84
11:   Male     6.70
12: Female     5.11
13:   Male     6.80
14:   Male     3.76
15:   Male     3.56
16:   Male     5.51
17: Female     6.58
18: Female     7.59
19:   Male     4.62
20: Female     6.75
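To standardize the Response column of DT1 within each level of the Grp column, add the standardization line to the snippet above:
library(data.table)
Grp <- sample(c("Male", "Female"), 20, replace = TRUE)
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)
DT1[, "Response" := as.vector(scale(Response)), by = Grp]
DT1
Output
Running all of the snippets above as a single program produces the following output: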
       Grp    Response
 1: Female -0.66313371
 2:   Male  0.03955265
 3: Female  0.43789692
 4:   Male -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7:   Male -0.99200587
 8: Female  0.24238681
 9:   Male -0.57096158
10:   Male  1.89214752
11:   Male  1.09216337
12: Female -0.86893383
13:   Male  1.16233742
14:   Male -0.97095365
15:   Male -1.11130175
16:   Male  0.25709220
17: Female  0.64369704
18: Female  1.68298763
19:   Male -0.36745684
20: Female  0.81862714
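As a quick sanity check, the standardized column should have mean 0 and standard deviation 1 within every group. The sketch below rebuilds DT1 from the values printed in this example (typed in by hand rather than re-sampled, so the result is reproducible) and summarizes the standardized column by group; the name check is just an illustrative variable.

```r
library(data.table)

# Values copied from the Example 1 table above
DT1 <- data.table(
  Grp = c("Female", "Male", "Female", "Male", "Female", "Female", "Male",
          "Female", "Male", "Male", "Male", "Female", "Male", "Male",
          "Male", "Male", "Female", "Female", "Male", "Female"),
  Response = c(5.31, 5.20, 6.38, 4.53, 4.90, 4.78, 3.73, 6.19, 4.33, 7.84,
               6.70, 5.11, 6.80, 3.76, 3.56, 5.51, 6.58, 7.59, 4.62, 6.75)
)

# Standardize within each group, as in the article
DT1[, Response := as.vector(scale(Response)), by = Grp]

# Per-group mean and sd of the standardized column
check <- DT1[, .(Mean = mean(Response), SD = sd(Response)), by = Grp]
print(check)
```

The means may print as tiny non-zero numbers because of floating-point rounding; they are zero for practical purposes.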
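Example 2
The following snippet creates a second sample data.table:
library(data.table)
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
Rate <- round(rnorm(20, 10, 1.02), 0)
DT2 <- data.table(Class, Rate)
DT2
This creates the following data.table (again, the values are random and will differ between runs):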
    Class Rate
 1:    II   10
 2:   III    9
 3:    II   10
 4:    II   10
 5:   III   10
 6:   III    9
 7:   III    8
 8:    II   10
 9:    II   11
10:   III    9
11:     I    9
12:    II   11
13:   III   13
14:    II   10
15:   III   12
16:     I    8
17:    II    9
18:     I   10
19:   III    9
20:    II   10
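To standardize the Rate column of DT2 within each level of the Class column, add the standardization line to the snippet above:
library(data.table)
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
Rate <- round(rnorm(20, 10, 1.02), 0)
DT2 <- data.table(Class, Rate)
DT2[, "Rate" := as.vector(scale(Rate)), by = Class]
DT2
Output
Running all of the snippets above as a single program produces the following output: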
    Class        Rate
 1:    II -0.18490007
 2:   III -0.50669175
 3:    II -0.18490007
 4:    II -0.18490007
 5:   III  0.07238454
 6:   III -0.50669175
 7:   III -1.08576803
 8:    II -0.18490007
 9:    II  1.47920052
10:   III -0.50669175
11:     I  0.00000000
12:    II  1.47920052
13:   III  1.80961338
14:    II -0.18490007
15:   III  1.23053710
16:     I -1.00000000
17:    II -1.84900065
18:     I  1.00000000
19:   III -0.50669175
20:    II -0.18490007
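With random group assignment, some groups can end up very small (class I above has only three rows). The sketch below uses made-up data, with hypothetical classes "IV" and "V" and a hypothetical output column Z, to show what happens in the limiting cases: a group with a single row or with identical values has no spread, so the standardized values cannot be computed and come back as NaN.

```r
library(data.table)

# Hypothetical data: class "IV" has one row, class "V" has constant values
DT3 <- data.table(Class = c("I", "I", "I", "IV", "V", "V"),
                  Rate  = c(9, 8, 10, 7, 5, 5))

# Standardize within each class into a new column Z
DT3[, Z := as.vector(scale(Rate)), by = Class]

print(DT3)
```

If such groups can occur in your data, it may be worth checking group sizes first, for example with DT3[, .N, by = Class].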