Reading HDFS data in chunks splits one record into several

  •   ylxw · Jun 1, 2018 · 1310 views
    When I read an HDFS file in chunks using chunk_size, why does a single record end up split into multiple records?
    with client.read(full_path, encoding='utf-8', chunk_size=10000) as reader:
        for piece in reader:
            piece = piece.split('\n')
            for line in piece:
                print(line)

    The original record is 2018-05-01|weorjerjsfj|worjwelfjs|
    but what gets printed is 2018-05-01|weo
    rjerjsfj|worjwelfjs|, which shows up as two separate records.
    1 reply · 2019-04-19 11:06:21 +08:00
    #1 · RmanzzZ · Apr 19, 2019
    Did you ever get this solved? I've run into the same problem and don't know how to handle it.
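
    A likely explanation for the behavior described above: with chunk_size set, each piece is cut at a fixed size rather than at a line boundary, so the last line of one piece and the first line of the next can be two halves of the same record. Below is a minimal sketch of one common workaround, which holds back the unfinished tail of each piece and prepends it to the next one. The helper name iter_lines and the sample chunks are hypothetical, made up for illustration; they are not part of the hdfs library or the original post.

```python
def iter_lines(chunks, sep="\n"):
    """Yield complete lines from an iterable of text chunks."""
    tail = ""  # unfinished line carried over from the previous chunk
    for chunk in chunks:
        parts = (tail + chunk).split(sep)
        tail = parts.pop()  # the last piece may be incomplete, so hold it back
        for line in parts:
            yield line
    if tail:
        yield tail  # final line when the data does not end with sep


# Simulated chunks that cut a record in the middle, as in the question.
chunks = ["2018-05-01|weo", "rjerjsfj|worjwelfjs|\n2018-05-02|a|b|\n"]
lines = list(iter_lines(chunks))
for line in lines:
    print(line)
```

    In the original snippet this would presumably be used as `for line in iter_lines(reader): print(line)` inside the `with` block. The hdfs library's `client.read` also appears to accept a `delimiter` argument as an alternative to `chunk_size`, which may make the helper unnecessary; that is worth checking against the documentation for the installed version.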