Common YARN commands

Client subcommands

app|application      prints application(s) report/kill application/manage long running application
applicationattempt   prints applicationattempt(s) report
classpath            prints the class path needed to get the hadoop jar and the required libraries
cluster              prints cluster information
container            prints container(s) report
envvars              display computed Hadoop environment variables
jar <jar>            run a jar file
logs                 dump container logs
queue                prints queue information
schedulerconf        updates scheduler configuration
timelinereader       run the timeline reader server
top                  view cluster information
version              print the version
  • yarn app -list: list applications

  • yarn app -list -appStates ALL: filter applications by state

  • yarn app -kill <applicationId>: kill an application

  • yarn applicationattempt -list <applicationId>: list the attempts of an application; the AM container ID can be read from the output

    First obtain the applicationId from yarn app -list:

yarn applicationattempt -list application_1649993505147_0037

Total number of application attempts :1
ApplicationAttempt-Id State AM-Container-Id Tracking-URL
appattempt_1649993505147_0037_000001 FINISHED container_1649993505147_0037_01_000001 http://hadoop2:8088/proxy/application_1649993505147_0037/

This shows which container the attempt ran in, along with its tracking URL.

  • yarn logs -applicationId application_1649993505147_0001: view the aggregated logs of an application

  • yarn container -status container_1649993505147_0037_01_000001: check the status of a container

  • yarn queue -status default: check the status of the queue named default; the queue name can be read from the yarn app -list output

  • yarn schedulerconf -global yarn.scheduler.capacity.maximum-applications=10000: dynamically update scheduler parameters

  • yarn top: a task-manager-like view of YARN that refreshes cluster and application information in real time
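
Putting the client commands together, a typical troubleshooting pass over the example application above might look like this sketch (all commands appear earlier in this section):

yarn app -list
yarn applicationattempt -list application_1649993505147_0037
yarn container -status container_1649993505147_0037_01_000001
yarn logs -applicationId application_1649993505147_0037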

Administration commands

daemonlog            get/set the log level for each daemon 
node prints node report(s)
rmadmin admin tools
scmadmin SharedCacheManager admin tools
  • yarn node -list: list node reports

  • yarn rmadmin: ResourceManager administration commands

    • yarn rmadmin -refreshQueues: reload the queue configuration

    • yarn rmadmin -refreshNodesResources: refresh the resources of the nodes

Production configuration parameters

This section covers parameters that are typically tuned for a production environment.

The default values of all YARN parameters are recorded in yarn-default.xml, which ships inside %HADOOP_HOME%/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar.
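
To browse those defaults without unpacking the jar, the file can be read straight out of it, for example (a sketch; assumes the unzip tool is installed and HADOOP_HOME is set):

unzip -p $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar yarn-default.xml | less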

ResourceManager parameters

  • Scheduler type
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

The default is CapacityScheduler; it can be changed to the Fair Scheduler.

  • Maximum number of threads handling client scheduler requests

<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
</property>

This determines how many concurrent client requests the scheduler interface can serve.

NodeManager parameters

  • Auto-detect node capabilities from the hardware

<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>

Whether to auto-detect node resources (memory and CPU) from the hardware. This is usually left as false so that resources are configured explicitly.

  • Whether logical processors count as cores

<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>

When enabled, each logical processor (hyperthread) is counted as a CPU core.

  • Multiplier from physical cores to vcores

The number of vcores is computed as physical cores * this multiplier.

For example, a 2-core, 4-thread CPU should use a multiplier of 2 so that 4 vcores are reported.

<property>
<description>Multiplier to determine how to convert phyiscal cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The
number of vcores will be calculated as
number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
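
As a rough illustration (a sketch assuming a Linux node that lscpu reports as 2 physical cores / 4 logical processors; label names can vary slightly by distribution):

lscpu | grep -E '^(CPU\(s\)|Core\(s\) per socket|Socket\(s\))'
# e.g. CPU(s): 4, Core(s) per socket: 2, Socket(s): 1
# count-logical-processors-as-cores=false: 2 physical cores * 2.0 = 4 vcores
# count-logical-processors-as-cores=true:  4 logical CPUs   * 1.0 = 4 vcores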

  • Memory available to the NodeManager

How much physical memory can be allocated to containers on this node.

<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>-1</value>
</property>

  • vcores available to the NodeManager

How many vcores can be allocated to containers on this node.

<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>-1</value>
</property>


  • Enforce physical memory limits on containers

<property>
<description>Whether physical memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>true</value>
</property>

  • Enforce virtual memory limits on containers (it is usually recommended to disable this check)
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
</property>

  • Ratio of virtual memory to physical memory

virtual memory / physical memory


<property>
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage
is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
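
With the default ratio of 2.1, a container allocated 1024 MB of physical memory may use up to roughly 1024 * 2.1 = 2150 MB of virtual memory before the vmem check kills it. Once the NodeManager settings above are in place, the resources each node actually registered with the ResourceManager can be verified with the node commands from the first section (a sketch; the Node-Id shown is hypothetical and should be taken from the yarn node -list output):

yarn node -list
yarn node -status hadoop2:38029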


Container parameters

  • Minimum memory per container

<property>
<description>The minimum allocation for every container request at the RM
in MBs. Memory requests lower than this will be set to the value of this
property. Additionally, a node manager that is configured to have less memory
than this value will be shut down by the resource manager.</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>

  • Maximum memory per container
<property>
<description>The maximum allocation for every container request at the RM
in MBs. Memory requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>

  • Minimum vcores per container
<property>
<description>The minimum allocation for every container request at the RM
in terms of virtual CPU cores. Requests lower than this will be set to the
value of this property. Additionally, a node manager that is configured to
have fewer virtual cores than this value will be shut down by the resource
manager.</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>

  • Maximum vcores per container
<property>
<description>The maximum allocation for every container request at the RM
in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
</property>
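
Requests are normalized against these bounds: anything below the minimum is rounded up to it, and anything above the maximum is rejected with an InvalidResourceRequestException. As a hedged example (using standard MapReduce properties such as mapreduce.map.memory.mb, not anything configured above), a job can request explicit container sizes like this:

hadoop jar hadoop-mapreduce-examples-3.1.3.jar pi \
  -Dmapreduce.map.memory.mb=2048 -Dmapreduce.map.cpu.vcores=2 \
  -Dmapreduce.reduce.memory.mb=2048 \
  1 1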

YARN configuration

Configuring multiple queues

Configuring multiple queues provides a fallback strategy: when one queue is saturated, jobs can still run in another queue.

Requirement 1: the default queue gets 40% of total memory with a maximum capacity of 60%; the hive queue gets 60% of total memory with a maximum capacity of 80%.
Requirement 2: configure queue priorities.

Edit capacity-scheduler.xml (under $HADOOP_HOME/etc/hadoop/):

  1. Add the new queue under the root queue

<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>default,hive</value>
<description>
The queues at the this level (root is the root queue).
</description>
</property>

Multiple queues are listed as comma-separated values.

  2. Adjust the default queue's resources

<property>
<name>yarn.scheduler.capacity.root.default.capacity</name>
<value>40</value>
<description>Default queue target capacity.</description>
</property>


Set the default queue's capacity (capacities at one level add up to 100).

<property>
<name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
<value>60</value>
<description>
The maximum capacity of the default queue.
</description>
</property>


Set the default queue's maximum capacity (out of 100).

  3. Each queue carries its own set of properties, so the newly added queue must be configured as well
<property>
<name>yarn.scheduler.capacity.root.hive.capacity</name>
<value>60</value>
<description>hive queue target capacity.</description>
</property>


Set the hive queue's capacity.


<property>
<name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
<value>80</value>
<description>
The maximum capacity of the hive queue.
</description>
</property>

Set the hive queue's maximum capacity.

 
<property>
<name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
<value>1</value>
<description>
hive queue user limit a percentage from 0.0 to 1.0.
</description>
</property>

The factor that limits how much of the queue a single user may consume; for example, with 0.2 each user can use at most 20% of the queue's resources.


<property>
<name>yarn.scheduler.capacity.root.hive.state</name>
<value>RUNNING</value>
<description>
RUNNING or STOPPED.
</description>
</property>

Whether the queue is enabled (RUNNING) or disabled (STOPPED); change this value to temporarily close the queue.

 <property>
<name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
<value>*</value>
<description>
The ACL of who can submit jobs to the hive queue.
</description>
</property>

ACL controlling who may submit jobs to this queue.


<property>
<name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
<value>*</value>
<description>
The ACL of who can administer jobs on the hive queue.
</description>
</property>

ACL controlling who may administer this queue.

<property>
<name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
<value>*</value>
<description>
The ACL of who can submit applications with configured priority.
For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
</description>
</property>

ACL controlling who may submit applications with a configured priority.

<property>
<name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime
</name>
<value>-1</value>
<description>
Maximum lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
disabled.
This will be a hard time limit for all applications in this
queue. If positive value is configured then any application submitted
to this queue will be killed after exceeds the configured lifetime.
User can also specify lifetime per application basis in
application submission context. But user lifetime will be
overridden if it exceeds queue maximum lifetime. It is point-in-time
configuration.
Note : Configuring too low value will result in killing application
sooner. This feature is applicable only for leaf queue.
</description>
</property>


Maximum lifetime, in seconds, of an application submitted to this queue, i.e. a hard timeout after which the application is killed. If an application specifies its own lifetime, it cannot exceed this queue-level maximum. See https://blog.cloudera.com/enforcing-application-lifetime-slas-yarn/ for details.


<property>
<name>yarn.scheduler.capacity.root.hive.default-application-lifetime
</name>
<value>-1</value>
<description>
Default lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
disabled.
If the user has not submitted application with lifetime value then this
value will be taken. It is point-in-time configuration.
Note : Default lifetime can't exceed maximum lifetime. This feature is
applicable only for leaf queue.
</description>
</property>

Default lifetime, in seconds, applied to applications submitted to this queue when the user does not specify one; it cannot exceed the queue's maximum lifetime.

  4. After editing, distribute the configuration file to the other machines.

  5. Refresh the queue configuration:

yarn rmadmin -refreshQueues

You should now see the hive queue in addition to default.
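
A quick way to verify is the queue command shown in the first section:

yarn queue -status hive
yarn queue -status default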

  6. Submit a job to a specific queue

Via a command-line parameter:

hadoop jar hadoop-mapreduce-examples-3.1.3.jar  wordcount -D mapreduce.job.queuename=hive /input /dd

  7. Specify the queue in code

Add this configuration setting:

conf.set("mapreduce.job.queuename","hive");

Configuring job priorities

When applications carry priorities, resources are allocated to higher-priority applications first.

  1. Enable application priorities

Edit yarn-site.xml

Add the following property to set the maximum priority:

<property>
<name>yarn.cluster.max-application-priority</name>
<value>10</value>
</property>

  2. After editing, sync the configuration file to the other machines:

    xsync yarn-site.xml
  3. Restart YARN (on the ResourceManager node):

./stop-yarn.sh 
./start-yarn.sh

  4. Specify the priority when submitting a job
// via a command-line parameter
-D mapreduce.job.priority=5
// via code
conf.set("mapreduce.job.priority","5");

Configuring the Fair Scheduler

  1. Switch the scheduling policy to the Fair Scheduler

Edit yarn-site.xml:


<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
<description>Use the Fair Scheduler</description>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/usr/local/software/hadoop/hadoop-3.1.3/etc/hadoop/fair-scheduler.xml</value>
<description>Path to the Fair Scheduler allocation file</description>
</property>

<property>
<name>yarn.scheduler.fair.preemption</name>
<value>false</value>
<description>Disable preemption between queues</description>
</property>

The Capacity Scheduler is the default, so this configuration is needed to switch to the Fair Scheduler.

  2. The allocation file fair-scheduler.xml referenced above has to be created:

<?xml version="1.0"?>
<allocations>
<!-- Default maximum fraction (0-1) of a queue's resources that Application Masters may use; 0.1 is a common production value -->
<queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
<!-- Default maximum resources for a single queue -->
<queueMaxResourcesDefault>4096mb,4vcores</queueMaxResourcesDefault>

<!-- Add a queue named test_1 -->
<queue name="test_1">
<!-- Minimum resources of the queue -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum resources of the queue -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running concurrently in the queue (default 50); size it to the available resources -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum fraction of the queue's resources that Application Masters may use -->
<maxAMShare>0.5</maxAMShare>
<!-- Weight of the queue (default 1.0) -->
<weight>1.0</weight>
<!-- Scheduling policy inside the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Add a queue named test_2 (a parent queue) -->
<queue name="test_2" type="parent">
<!-- Minimum resources of the queue -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum resources of the queue -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running concurrently in the queue (default 50); size it to the available resources -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum fraction of the queue's resources that Application Masters may use -->
<maxAMShare>0.5</maxAMShare>
<!-- Weight of the queue (default 1.0) -->
<weight>1.0</weight>
<!-- Scheduling policy inside the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>

<!-- Queue placement policy: rules may be layered and are evaluated in order until one matches -->
<queuePlacementPolicy>
<!-- Use the queue specified at submission time; if none is specified, fall through to the next rule. create="false": do not auto-create the queue if it does not exist -->
<rule name="specified" create="false"/>
<!-- Place the job in root.<group>.<username>; the group-level queue is not auto-created, but the user-level queue under it is -->
<rule name="nestedUserQueue" create="true">
<rule name="primaryGroup" create="false"/>
</rule>
<!-- The last rule must be reject or default: reject fails the submission, default sends the job to the default queue -->
<rule name="reject" />
</queuePlacementPolicy>
</allocations>
  3. Restart YARN:
stop-yarn.sh
start-yarn.sh

  4. Submit a job to a specified queue:
hadoop jar hadoop-mapreduce-examples-3.1.3.jar  pi -Dmapreduce.job.queuename=root.test_1 1 1

  5. Submit a job without specifying a queue:
hadoop jar hadoop-mapreduce-examples-3.1.3.jar  pi  1 1

With no queue given, the job is placed by the queuePlacementPolicy configured above: the specified rule does not match, so the nestedUserQueue/primaryGroup rule (falling back to reject) decides where, or whether, it runs.
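
Which queue a submitted job actually ended up in can be checked from the application list (it includes a queue column):

yarn app -list -appStates ALL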

The YARN Tool interface

org.apache.hadoop.util.Tool is the standard interface for command-line programs. The args passed to its run method contain only the user's own arguments; the generic Hadoop options (such as -D key=value) are parsed and applied to the Configuration by ToolRunner before run is invoked.

  1. Define a job class that implements the Tool interface

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

public class MapWorkCount implements Tool {

    private Configuration configuration;

    @Override
    public int run(String[] args) throws Exception {
        // args holds only the user arguments left after ToolRunner has
        // stripped the generic options: here the input and output paths.
        Job job = Job.getInstance(configuration);
        job.setJarByClass(WorkCountDriver.class);

        // WorkCountMapper / WorkCountMapReduce are the Mapper and Reducer written in step 2
        job.setMapperClass(WorkCountMapper.class);
        job.setReducerClass(WorkCountMapReduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    @Override
    public void setConf(Configuration conf) {
        this.configuration = conf;
    }

    @Override
    public Configuration getConf() {
        return configuration;
    }
}


Implement the run method, whose arguments contain only the user-supplied values such as the input and output paths.
Implement setConf and getConf so that ToolRunner can inject and read the Configuration.

  2. Write the Mapper and Reducer classes.

  3. Write a single Driver that dispatches to the different Tool implementations


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WorkCountDriver {

    private static Tool tool;

    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // The first argument selects which Tool implementation to run.
        switch (args[0]) {
            case "wordCount":
                tool = new MapWorkCount();
                break;
            default:
                throw new IllegalArgumentException("Unknown tool: " + args[0]);
        }
        tool.setConf(configuration);
        // Forward only the last two arguments (the input and output paths) to the tool.
        int ret = ToolRunner.run(tool, new String[]{args[args.length - 2], args[args.length - 1]});
        System.exit(ret);
    }
}


The Driver strips the argument it consumes itself (the tool name), picks the matching Tool implementation, sets its Configuration, and then hands the remaining arguments to ToolRunner, which invokes the tool's run method.

  4. Run it with the hadoop jar command:

hadoop jar mapReduceDemo-1.0-SNAPSHOT.jar com.w.mapreduce.tools.WorkCountDriver wordCount /input /_88