Updated: 2023-11-18 20:46:22
Here is the approach I tried. Say I have a DataFrame like the one below:
>>> df.show()
+----+----+----+
|col1|col2|col3|
+----+----+----+
|   1|   2|null|
|null|   3|null|
|   5|null|null|
+----+----+----+
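>>> from pyspark.sql import functions as F
>>> # F.count(c) counts only the non-null values in column c
>>> df1 = df.agg(*[F.count(c).alias(c) for c in df.columns])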
>>> df1.show()
+----+----+----+
|col1|col2|col3|
+----+----+----+
|   2|   2|   0|
+----+----+----+
>>> counts = df1.first()  # fetch the single row of counts once, not once per column
>>> nonNull_cols = [c for c in df1.columns if counts[c] > 0]
>>> df = df.select(*nonNull_cols)
>>> df.show()
+----+----+
|col1|col2|
+----+----+
|   1|   2|
|null|   3|
|   5|null|
+----+----+
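The steps above can be wrapped in a reusable function. This is only a minimal sketch: the names `columns_with_values` and `drop_all_null_columns` are hypothetical and not part of PySpark, and it assumes the DataFrame has at least one column. The column-picking logic is kept in a plain Python helper so it can be checked without a Spark session.

```python
from typing import List, Mapping


def columns_with_values(counts: Mapping[str, int]) -> List[str]:
    """Return the column names whose non-null count is greater than zero."""
    return [name for name, n in counts.items() if n > 0]


def drop_all_null_columns(df):
    """Return df without the columns that contain only nulls (hypothetical helper)."""
    # Imported here so the pure helper above works even without Spark installed.
    from pyspark.sql import functions as F

    # F.count(c) counts only non-null values, so an all-null column yields 0.
    # A single agg() computes every count in one pass over the data.
    counts = df.agg(*[F.count(c).alias(c) for c in df.columns]).first().asDict()
    return df.select(*columns_with_values(counts))
```

With the example DataFrame, `df = drop_all_null_columns(df)` should keep `col1` and `col2` and drop `col3`. Note that a DataFrame with no rows has a count of 0 for every column, so all of its columns would be dropped.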