I downloaded a dataset from Kaggle for practice. One of its columns holds a list of dicts, in the following format:
[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]
How can I get the value of name from each dict?
data = load_data()  # read the CSV file; data is a DataFrame
production = data['production_companies']  # select the column; production is a Series
p_values = production.values  # get the underlying values; p_values is a numpy.ndarray
# Here is the problem: the elements of p_values are strings!
# Since the ndarray holds strings, how do I go on to extract what I need (the value of name in each dict)?
print(p_values)
The output looks like this:
['[{"name": "Ingenious Film Partners", "id": 289}, {"name": "Twentieth Century Fox Film Corporation", "id": 306}, {"name": "Dune Entertainment", "id": 444}, {"name": "Lightstorm Entertainment", "id": 574}]' '[{"name": "Walt Disney Pictures", "id": 2}, {"name": "Jerry Bruckheimer Films", "id": 130}, {"name": "Second Mate Productions", "id": 19936}]' '[{"name": "Columbia Pictures", "id": 5}, {"name": "Danjaq", "id": 10761}, {"name": "B24", "id": 69434}]' ... '[{"name": "Front Street Pictures", "id": 3958}, {"name": "Muse Entertainment Enterprises", "id": 6438}]' '[]' '[{"name": "rusty bear entertainment", "id": 87986}, {"name": "lucky crow films", "id": 87987}]']
I tried converting the ndarray to a list, but the elements are still strings.
1
viiii OP To add: the dataset download link (login required) is https://www.kaggle.com/tmdb/tmdb-movie-metadata#tmdb_5000_movies.csv
2
viiii OP Problem solved, feel free to let this thread sink.
I kept thinking in terms of pandas and forgot about the basic types, haha.
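The OP did not post the actual fix. One plausible reading of "basic types" is that each cell is a JSON-encoded string, so the standard json module can turn it back into a list of dicts. Below is a minimal sketch under that assumption; extract_names is a hypothetical helper name, and the sample cells are copied from the output in the question.

```python
import json


def extract_names(cell):
    """Parse one cell (a JSON-encoded list of dicts) and return its "name" values.

    Assumes the cell is valid JSON, as the sample output suggests.
    """
    return [item["name"] for item in json.loads(cell)]


# Two sample cells taken from the printed output in the question
p_values = [
    '[{"name": "Walt Disney Pictures", "id": 2}, '
    '{"name": "Jerry Bruckheimer Films", "id": 130}, '
    '{"name": "Second Mate Productions", "id": 19936}]',
    '[]',  # an empty list parses to an empty result
]

names = [extract_names(cell) for cell in p_values]
print(names)
```

On the original Series, the same helper could probably be applied directly with production.apply(extract_names), which would avoid going through .values at all.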