Updated: 2023-01-31 19:40:48
I'm a novice. I'm curious to know how the number of reducers is set for different Hive data sets. Is it based on the size of the data processed, or is there a default number of reducers for all of them?
For example, how many reducers does 5 GB of data require? Will the same number of reducers be used for a smaller data set?
Thanks in advance! Cheers!
In open source Hive (and likely on EMR):
# reducers = (# bytes of input to mappers)
/ (hive.exec.reducers.bytes.per.reducer)
The default hive.exec.reducers.bytes.per.reducer is 1 GB (1,000,000,000 bytes) in older releases; Hive 0.14.0 and later default to 256 MB.
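To make the formula concrete for the 5 GB case in the question, here is a minimal sketch in Python. It is not Hive's code, just the arithmetic above. The function name estimate_reducers and the max_reducers parameter are my own; max_reducers stands in for the cap that hive.exec.reducers.max applies, and 999 is an assumed value for it, so check your own configuration.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Estimate the reducer count as ceil(input_bytes / bytes_per_reducer).

    bytes_per_reducer mirrors hive.exec.reducers.bytes.per.reducer.
    max_reducers is an assumed cap standing in for hive.exec.reducers.max.
    """
    # Integer ceiling division, so large byte counts never go through floats.
    reducers = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # At least one reducer, and never more than the cap.
    return max(1, min(max_reducers, reducers))


five_gb = 5 * 10**9
print(estimate_reducers(five_gb))               # 1 GB per reducer
print(estimate_reducers(five_gb, 256_000_000))  # 256 MB per reducer
print(estimate_reducers(100 * 10**6))           # small input
```

With 1 GB per reducer, 5 GB of input works out to 5 reducers, and a 100 MB input still gets 1. So no, a smaller data set does not get the same number of reducers unless you force it.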
So the number of reducers also depends on the size of the input files. You can change it by setting the property hive.exec.reducers.bytes.per.reducer:
either by changing hive-site.xml:
<property>
  <name>hive.exec.reducers.bytes.per.reducer</name>
  <value>1000000</value>
</property>
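or by using set on the command line (the setting only applies to that session, so your query has to follow it in the same command string):
hive -e "set hive.exec.reducers.bytes.per.reducer=100000;"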