How does running Hadoop on a single node versus a cluster affect a MapReduce (MR) job?

    yuntong · Jan 13, 2016 · 3374 views

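    I've been learning Hadoop recently.
    Question 1: Will the same MR job produce different output on a single node than it does on a cluster?

    Question 2: How are mapper results assigned to the different reducers? (For example, say I set 30 reducers.)

    There seems to be a concept called partitioning here.
    As I understand it, the first option, the default, hashes the keys to spread them evenly across the reducers.
    The second option lets you manually direct records to a specific reducer.
    Is that right?

    I'm still learning on a single node because of my machine's limited memory, among other constraints.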

    1 reply    2016-01-19 13:25:55 +08:00
    staticor  
       Jan 19, 2016
    2. When there are multiple reducers, the map tasks partition their output, each creating one partition for each reduce task. There can be many keys (and their associated values) in each partition, but the records for any given key are all in a single partition. The partitioning can be controlled by a user-defined partitioning function, but normally the default partitioner -- which buckets keys using a hash function -- works very well.
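
To make the quoted passage concrete, here is a minimal Java sketch of how a hash-based partitioner can map keys to 30 reducers. The arithmetic mirrors Hadoop's default `HashPartitioner` (the key's hash code with the sign bit cleared, modulo the number of reduce tasks). The class `HashPartitionSketch`, its method names, and the use of plain `String` keys in place of Hadoop `Writable` types are assumptions made so that it runs without a Hadoop installation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HashPartitionSketch {
    // Clear the sign bit so the value is non-negative, then take it
    // modulo the number of reducers. String keys are a stand-in for
    // Hadoop key types, whose hash codes may differ.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Group keys by the partition (reducer index) they are assigned to.
    static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numReduceTasks);
            byPartition.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // Hypothetical mapper output keys; repeated keys share a partition.
        List<String> keys = Arrays.asList("apple", "banana", "apple", "cherry", "banana");
        assign(keys, 30).forEach((partition, ks) ->
            System.out.println("reducer " + partition + " <- " + ks));
    }
}
```

Because the partition depends only on the key and the reducer count, every record with a given key lands on the same reducer, no matter which mapper emitted it.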


    I just read the passage quoted above, so to answer question 2: yes, that's right. For manual assignment, override the partitioner.
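
As a rough illustration of overriding the partitioner: in a real job you would extend `org.apache.hadoop.mapreduce.Partitioner`, override `getPartition(key, value, numPartitions)`, and register the class with `job.setPartitionerClass(...)`. The sketch below keeps only that shape in plain Java so that it runs without Hadoop. The `KeyPartitioner` interface, the `ErrorFirstPartitioner` class, and the routing rule (keys starting with `error` go to reducer 0) are all hypothetical.

```java
public class CustomPartitionSketch {
    // Hypothetical stand-in for the shape of Hadoop's Partitioner;
    // the value argument is omitted because this rule ignores it.
    interface KeyPartitioner {
        int getPartition(String key, int numPartitions);
    }

    // Hypothetical rule: send every "error..." key to reducer 0 and
    // hash all other keys across the remaining reducers.
    static class ErrorFirstPartitioner implements KeyPartitioner {
        @Override
        public int getPartition(String key, int numPartitions) {
            if (numPartitions == 1 || key.startsWith("error")) {
                return 0;
            }
            // Non-negative hash spread over reducers 1..numPartitions-1.
            return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
        }
    }

    public static void main(String[] args) {
        KeyPartitioner partitioner = new ErrorFirstPartitioner();
        String[] keys = {"error:disk", "info:start", "error:net", "warn:slow"};
        for (String key : keys) {
            System.out.println(key + " -> reducer " + partitioner.getPartition(key, 30));
        }
    }
}
```

Whatever rule you choose, the returned index must stay in the range 0 to `numPartitions - 1`, and the same key must always map to the same index so that all of its values reach one reducer.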