且构网

Is it possible to set up a bootstrap action to run on EMR after the core services (Spark, etc.) have been installed?

Updated: 2022-06-27 00:20:47

You can submit a script as a step instead of as a bootstrap action. For example, I wrote an SSL certificate update script and applied it to the EMR cluster as a step. The snippet below is part of my Lambda function, written in Python, but you can also add the step manually in the console or from another language.

Steps=[{
    'Name': 'PrestoCertificate',
    'ActionOnFailure': 'CONTINUE',
    'HadoopJarStep': {
        'Jar': 's3://ap-northeast-2.elasticmapreduce/libs/script-runner/script-runner.jar',
        'Args': ['s3://myS3/PrestoSteps_InstallCertificate.sh']
    }
}]
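
As a rough sketch of how a step list like this might be submitted from Python: it assumes a boto3 EMR client, whose add_job_flow_steps call takes JobFlowId and Steps. The helper name submit_steps is hypothetical and is not part of the original Lambda function.

```python
def submit_steps(emr_client, cluster_id, steps):
    """Submit steps to a running cluster and return the new step IDs.

    emr_client is assumed to be a boto3 EMR client (boto3.client("emr")).
    """
    response = emr_client.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return response["StepIds"]


# Same step definition as in the answer.
steps = [{
    "Name": "PrestoCertificate",
    "ActionOnFailure": "CONTINUE",
    "HadoopJarStep": {
        "Jar": "s3://ap-northeast-2.elasticmapreduce/libs/script-runner/script-runner.jar",
        "Args": ["s3://myS3/PrestoSteps_InstallCertificate.sh"],
    },
}]
```

With real credentials this would be called as submit_steps(boto3.client("emr", region_name="ap-northeast-2"), cluster_id, steps), where cluster_id is the ID of your own cluster.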

The key piece is script-runner.jar, which is pre-built by Amazon; you can use it in any region by changing the region prefix in the S3 path. It takes a .sh file and runs it.
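
To illustrate the region prefix, here is a minimal sketch that builds the jar path and the step definition from a region name. The function names script_runner_jar and script_step are hypothetical, and the path pattern is simply generalized from the ap-northeast-2 example, so check that the bucket exists for your region.

```python
def script_runner_jar(region):
    """Return the script-runner.jar S3 path for a region (pattern assumed from the example)."""
    return f"s3://{region}.elasticmapreduce/libs/script-runner/script-runner.jar"


def script_step(name, script_uri, region, on_failure="CONTINUE"):
    """Build a step definition that runs the shell script at script_uri."""
    return {
        "Name": name,
        "ActionOnFailure": on_failure,
        "HadoopJarStep": {
            "Jar": script_runner_jar(region),
            "Args": [script_uri],
        },
    }
```

For example, script_step("PrestoCertificate", "s3://myS3/PrestoSteps_InstallCertificate.sh", "ap-northeast-2") produces the same dictionary as the step shown earlier.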

One thing to be aware of: a script that runs on every node (as bootstrap actions do; a step itself normally executes on the master node) needs an if statement around anything that should happen only on the master instance.

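#!/bin/bash
# Read the isMaster flag that EMR writes on each instance.
IS_MASTER=$(jq -r .isMaster /emr/instance-controller/lib/info/instance.json)

# Quote the variable so the test still works if the value is empty.
if [ "$IS_MASTER" = "true" ]
then
    <your code>
fi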
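
If the script is written in Python rather than bash, the same check might look like the sketch below. It reads the instance.json path used above; the function name is_master is hypothetical, and treating a missing file as "not master" is an assumption made so the function can be called off-cluster.

```python
import json

# Path taken from the shell example above.
INSTANCE_INFO = "/emr/instance-controller/lib/info/instance.json"


def is_master(path=INSTANCE_INFO):
    """Return True only if the instance info file marks this node as master."""
    try:
        with open(path) as fh:
            info = json.load(fh)
    except FileNotFoundError:
        # Assumption: no info file means we are not on an EMR master node.
        return False
    return info.get("isMaster") is True
```

The master-only work then goes inside an `if is_master():` block, mirroring the shell version.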