Common YARN commands

Client subcommands

app|application      prints application(s) report/kill application/manage long running application
applicationattempt   prints applicationattempt(s) report
classpath            prints the class path needed to get the hadoop jar and the required libraries
cluster              prints cluster information
container            prints container(s) report
envvars              display computed Hadoop environment variables
jar <jar>            run a jar file
logs                 dump container logs
queue                prints queue information
schedulerconf        updates scheduler configuration
timelinereader       run the timeline reader server
top                  view cluster information
version              print the version
  • yarn app -list: list applications

  • yarn app -list -appStates ALL: filter applications by state

  • yarn app -kill <applicationId>: kill an application

  • yarn applicationattempt -list <applicationId>: list the attempts of an application; the AM container ID can be read from the output

    First obtain the applicationId from yarn app -list:

yarn applicationattempt -list application_1649993505147_0037

Total number of application attempts :1
ApplicationAttempt-Id State AM-Container-Id Tracking-URL
appattempt_1649993505147_0037_000001 FINISHED container_1649993505147_0037_01_000001 http://hadoop2:8088/proxy/application_1649993505147_0037/

This shows which container the attempt ran in, along with its tracking URL.

  • yarn logs -applicationId application_1649993505147_0001: view the aggregated logs of an application

  • yarn container -status container_1649993505147_0037_01_000001: check the status of a container

  • yarn queue -status default: check the status of the queue named default; the queue name can be read from the yarn app -list output

  • yarn schedulerconf -global yarn.scheduler.capacity.maximum-applications=10000: dynamically update scheduler parameters

  • yarn top: a task-manager-like view of YARN that refreshes cluster and application information in real time
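
Putting the client commands together, a typical troubleshooting pass over the example application above might look like this sketch (all commands appear earlier in this section):

yarn app -list
yarn applicationattempt -list application_1649993505147_0037
yarn container -status container_1649993505147_0037_01_000001
yarn logs -applicationId application_1649993505147_0037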

Administration commands

daemonlog            get/set the log level for each daemon 
node prints node report(s)
rmadmin admin tools
scmadmin SharedCacheManager admin tools
  • yarn node -list: list node reports

  • yarn rmadmin: ResourceManager administration commands

    • yarn rmadmin -refreshQueues: reload the queue configuration

    • yarn rmadmin -refreshNodesResources: refresh the resources of the nodes

Production configuration parameters

This section covers parameters that are typically tuned for a production environment.

The default values of all YARN parameters are recorded in yarn-default.xml, which ships inside %HADOOP_HOME%/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar.
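
To browse those defaults without unpacking the jar, the file can be read straight out of it, for example (a sketch; assumes the unzip tool is installed and HADOOP_HOME is set):

unzip -p $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar yarn-default.xml | less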

ResourceManager parameters

  • Scheduler type
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

The default is CapacityScheduler; it can be changed to the Fair Scheduler.

  • Maximum number of threads handling client scheduler requests

<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
</property>

This determines how many concurrent client requests the scheduler interface can serve.

NodeManager parameters

  • Auto-detect node capabilities from the hardware

<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>

Whether to auto-detect node resources (memory and CPU) from the hardware. This is usually left as false so that resources are configured explicitly.

  • Whether logical processors count as cores

<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>

When enabled, each logical processor (hyperthread) is counted as a CPU core.

  • Multiplier from physical cores to vcores

The number of vcores is computed as physical cores * this multiplier.

For example, a 2-core, 4-thread CPU should use a multiplier of 2 so that 4 vcores are reported.

<property>
<description>Multiplier to determine how to convert phyiscal cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The
number of vcores will be calculated as
number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
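
As a rough illustration (a sketch assuming a Linux node that lscpu reports as 2 physical cores / 4 logical processors; label names can vary slightly by distribution):

lscpu | grep -E '^(CPU\(s\)|Core\(s\) per socket|Socket\(s\))'
# e.g. CPU(s): 4, Core(s) per socket: 2, Socket(s): 1
# count-logical-processors-as-cores=false: 2 physical cores * 2.0 = 4 vcores
# count-logical-processors-as-cores=true:  4 logical CPUs   * 1.0 = 4 vcores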

  • Memory available to the NodeManager

How much physical memory can be allocated to containers on this node.

<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>-1</value>
</property>

  • vcores available to the NodeManager

How many vcores can be allocated to containers on this node.

<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>-1</value>
</property>


  • Enforce physical memory limits on containers

<property>
<description>Whether physical memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>true</value>
</property>

  • Enforce virtual memory limits on containers (it is usually recommended to disable this check)
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
</property>

  • Ratio of virtual memory to physical memory

virtual memory / physical memory


<property>
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage
is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
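
With the default ratio of 2.1, a container allocated 1024 MB of physical memory may use up to roughly 1024 * 2.1 = 2150 MB of virtual memory before the vmem check kills it. Once the NodeManager settings above are in place, the resources each node actually registered with the ResourceManager can be verified with the node commands from the first section (a sketch; the Node-Id shown is hypothetical and should be taken from the yarn node -list output):

yarn node -list
yarn node -status hadoop2:38029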


Container parameters

  • Minimum memory per container

<property>
<description>The minimum allocation for every container request at the RM
in MBs. Memory requests lower than this will be set to the value of this
property. Additionally, a node manager that is configured to have less memory
than this value will be shut down by the resource manager.</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>

  • Maximum memory per container
<property>
<description>The maximum allocation for every container request at the RM
in MBs. Memory requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>

  • Minimum vcores per container
<property>
<description>The minimum allocation for every container request at the RM
in terms of virtual CPU cores. Requests lower than this will be set to the
value of this property. Additionally, a node manager that is configured to
have fewer virtual cores than this value will be shut down by the resource
manager.</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>

  • Maximum vcores per container
<property>
<description>The maximum allocation for every container request at the RM
in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
</property>
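
Requests are normalized against these bounds: anything below the minimum is rounded up to it, and anything above the maximum is rejected with an InvalidResourceRequestException. As a hedged example (using standard MapReduce properties such as mapreduce.map.memory.mb, not anything configured above), a job can request explicit container sizes like this:

hadoop jar hadoop-mapreduce-examples-3.1.3.jar pi \
  -Dmapreduce.map.memory.mb=2048 -Dmapreduce.map.cpu.vcores=2 \
  -Dmapreduce.reduce.memory.mb=2048 \
  1 1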

YARN configuration

Configuring multiple queues

Configuring multiple queues provides a fallback strategy: when one queue is saturated, jobs can still run in another queue.

Requirement 1: the default queue gets 40% of total memory with a maximum capacity of 60%; the hive queue gets 60% of total memory with a maximum capacity of 80%.
Requirement 2: configure queue priorities.

Edit capacity-scheduler.xml (under $HADOOP_HOME/etc/hadoop/):

  1. Add the new queue under the root queue

<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>default,hive</value>
<description>
The queues at the this level (root is the root queue).
</description>
</property>

Multiple queues are listed as comma-separated values.

  2. Adjust the default queue's resources

<property>
<name>yarn.scheduler.capacity.root.default.capacity</name>
<value>40</value>
<description>Default queue target capacity.</description>
</property>


Set the default queue's capacity (capacities at one level add up to 100).

<property>
<name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
<value>60</value>
<description>
The maximum capacity of the default queue.
</description>
</property>


Set the default queue's maximum capacity (out of 100).

  3. Each queue carries its own set of properties, so the newly added queue must be configured as well
<property>
<name>yarn.scheduler.capacity.root.hive.capacity</name>
<value>60</value>
<description>hive queue target capacity.</description>
</property>


Set the hive queue's capacity.


<property>
<name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
<value>80</value>
<description>
The maximum capacity of the hive queue.
</description>
</property>

Set the hive queue's maximum capacity.

 
<property>
<name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
<value>1</value>
<description>
hive queue user limit a percentage from 0.0 to 1.0.
</description>
</property>

The factor that limits how much of the queue a single user may consume; for example, with 0.2 each user can use at most 20% of the queue's resources.


<property>
<name>yarn.scheduler.capacity.root.hive.state</name>
<value>RUNNING</value>
<description>
RUNNING or STOPPED.
</description>
</property>

Whether the queue is enabled (RUNNING) or disabled (STOPPED); change this value to temporarily close the queue.

 <property>
<name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
<value>*</value>
<description>
The ACL of who can submit jobs to the hive queue.
</description>
</property>

ACL controlling who may submit jobs to this queue.


<property>
<name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
<value>*</value>
<description>
The ACL of who can administer jobs on the hive queue.
</description>
</property>

ACL controlling who may administer this queue.

<property>
<name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
<value>*</value>
<description>
The ACL of who can submit applications with configured priority.
For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
</description>
</property>

ACL controlling who may submit applications with a configured priority.

<property>
<name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime
</name>
<value>-1</value>
<description>
Maximum lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
disabled.
This will be a hard time limit for all applications in this
queue. If positive value is configured then any application submitted
to this queue will be killed after exceeds the configured lifetime.
User can also specify lifetime per application basis in
application submission context. But user lifetime will be
overridden if it exceeds queue maximum lifetime. It is point-in-time
configuration.
Note : Configuring too low value will result in killing application
sooner. This feature is applicable only for leaf queue.
</description>
</property>


Maximum lifetime, in seconds, of an application submitted to this queue, i.e. a hard timeout after which the application is killed. If an application specifies its own lifetime, it cannot exceed this queue-level maximum. See https://blog.cloudera.com/enforcing-application-lifetime-slas-yarn/ for details.


<property>
<name>yarn.scheduler.capacity.root.hive.default-application-lifetime
</name>
<value>-1</value>
<description>
Default lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
disabled.
If the user has not submitted application with lifetime value then this
value will be taken. It is point-in-time configuration.
Note : Default lifetime can't exceed maximum lifetime. This feature is
applicable only for leaf queue.
</description>
</property>

Default lifetime, in seconds, applied to applications submitted to this queue when the user does not specify one; it cannot exceed the queue's maximum lifetime.

  4. After editing, distribute the configuration file to the other machines.

  5. Refresh the queue configuration:

yarn rmadmin -refreshQueues

You should now see the hive queue in addition to default.
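
A quick way to verify is the queue command shown in the first section:

yarn queue -status hive
yarn queue -status default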

  6. Submit a job to a specific queue

Via a command-line parameter:

hadoop jar hadoop-mapreduce-examples-3.1.3.jar  wordcount -D mapreduce.job.queuename=hive /input /dd

  7. Specify the queue in code

Add this configuration setting:

conf.set("mapreduce.job.queuename","hive");

Configuring job priorities

When applications carry priorities, resources are allocated to higher-priority applications first.

  1. Enable application priorities

Edit yarn-site.xml

Add the following property to set the maximum priority:

<property>
<name>yarn.cluster.max-application-priority</name>
<value>10</value>
</property>

  2. After editing, sync the configuration file to the other machines:

    xsync yarn-site.xml
  3. Restart YARN (on the ResourceManager node):

./stop-yarn.sh 
./start-yarn.sh

  4. Specify the priority when submitting a job
// via a command-line parameter
-D mapreduce.job.priority=5
// via code
conf.set("mapreduce.job.priority","5");

Configuring the Fair Scheduler

  1. Switch the scheduling policy to the Fair Scheduler

Edit yarn-site.xml:


<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
<description>Use the Fair Scheduler</description>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/usr/local/software/hadoop/hadoop-3.1.3/etc/hadoop/fair-scheduler.xml</value>
<description>Path to the Fair Scheduler allocation file</description>
</property>

<property>
<name>yarn.scheduler.fair.preemption</name>
<value>false</value>
<description>Disable preemption between queues</description>
</property>

The Capacity Scheduler is the default, so this configuration is needed to switch to the Fair Scheduler.

  2. The allocation file fair-scheduler.xml referenced above has to be created:

<?xml version="1.0"?>
<allocations>
<!-- Default maximum fraction (0-1) of a queue's resources that Application Masters may use; 0.1 is a common production value -->
<queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
<!-- Default maximum resources for a single queue -->
<queueMaxResourcesDefault>4096mb,4vcores</queueMaxResourcesDefault>

<!-- Add a queue named test_1 -->
<queue name="test_1">
<!-- Minimum resources of the queue -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum resources of the queue -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running concurrently in the queue (default 50); size it to the available resources -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum fraction of the queue's resources that Application Masters may use -->
<maxAMShare>0.5</maxAMShare>
<!-- Weight of the queue (default 1.0) -->
<weight>1.0</weight>
<!-- Scheduling policy inside the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Add a queue named test_2 (a parent queue) -->
<queue name="test_2" type="parent">
<!-- Minimum resources of the queue -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum resources of the queue -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running concurrently in the queue (default 50); size it to the available resources -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum fraction of the queue's resources that Application Masters may use -->
<maxAMShare>0.5</maxAMShare>
<!-- Weight of the queue (default 1.0) -->
<weight>1.0</weight>
<!-- Scheduling policy inside the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>

<!-- Queue placement policy: rules may be layered and are evaluated in order until one matches -->
<queuePlacementPolicy>
<!-- Use the queue specified at submission time; if none is specified, fall through to the next rule. create="false": do not auto-create the queue if it does not exist -->
<rule name="specified" create="false"/>
<!-- Place the job in root.<group>.<username>; the group-level queue is not auto-created, but the user-level queue under it is -->
<rule name="nestedUserQueue" create="true">
<rule name="primaryGroup" create="false"/>
</rule>
<!-- The last rule must be reject or default: reject fails the submission, default sends the job to the default queue -->
<rule name="reject" />
</queuePlacementPolicy>
</allocations>
  3. Restart YARN:
stop-yarn.sh
start-yarn.sh

  4. Submit a job to a specified queue:
hadoop jar hadoop-mapreduce-examples-3.1.3.jar  pi -Dmapreduce.job.queuename=root.test_1 1 1

  5. Submit a job without specifying a queue:
hadoop jar hadoop-mapreduce-examples-3.1.3.jar  pi  1 1

With no queue given, the job is placed by the queuePlacementPolicy configured above: the specified rule does not match, so the nestedUserQueue/primaryGroup rule (falling back to reject) decides where, or whether, it runs.
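
Which queue a submitted job actually ended up in can be checked from the application list (it includes a queue column):

yarn app -list -appStates ALL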

The YARN Tool interface

org.apache.hadoop.util.Tool is the standard interface for command-line programs. The args passed to its run method contain only the user's own arguments; the generic Hadoop options (such as -D key=value) are parsed and applied to the Configuration by ToolRunner before run is invoked.

  1. Define a job class that implements the Tool interface

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

public class MapWorkCount implements Tool {

    private Configuration configuration;

    @Override
    public int run(String[] args) throws Exception {
        // args holds only the user arguments left after ToolRunner has
        // stripped the generic options: here the input and output paths.
        Job job = Job.getInstance(configuration);
        job.setJarByClass(WorkCountDriver.class);

        // WorkCountMapper / WorkCountMapReduce are the Mapper and Reducer written in step 2
        job.setMapperClass(WorkCountMapper.class);
        job.setReducerClass(WorkCountMapReduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    @Override
    public void setConf(Configuration conf) {
        this.configuration = conf;
    }

    @Override
    public Configuration getConf() {
        return configuration;
    }
}


Implement the run method, whose arguments contain only the user-supplied values such as the input and output paths.
Implement setConf and getConf so that ToolRunner can inject and read the Configuration.

  2. Write the Mapper and Reducer classes.

  3. Write a single Driver that dispatches to the different Tool implementations


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WorkCountDriver {

    private static Tool tool;

    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // The first argument selects which Tool implementation to run.
        switch (args[0]) {
            case "wordCount":
                tool = new MapWorkCount();
                break;
            default:
                throw new IllegalArgumentException("Unknown tool: " + args[0]);
        }
        tool.setConf(configuration);
        // Forward only the last two arguments (the input and output paths) to the tool.
        int ret = ToolRunner.run(tool, new String[]{args[args.length - 2], args[args.length - 1]});
        System.exit(ret);
    }
}


The Driver strips the argument it consumes itself (the tool name), picks the matching Tool implementation, sets its Configuration, and then hands the remaining arguments to ToolRunner, which invokes the tool's run method.

  4. Run it with the hadoop jar command:

hadoop jar mapReduceDemo-1.0-SNAPSHOT.jar com.w.mapreduce.tools.WorkCountDriver wordCount /input /_88