
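This article presents a general, extensible recursive approach for flattening a list of dictionaries with a deeply nested structure (for example, one that expands by geographic level) into a single-level list of dictionaries. It keeps the key fields (person, city, address, facebooklink) and extracts the business data at every level automatically.
When working with nested JSON data such as geographic hierarchies, organization charts, or tree-shaped taxonomies, you often meet a structure like this: the top level is a country, which stores a sub-list under a key named after itself (such as "united states"); each child item carries the same fields plus the key for the next level of nesting (such as "ohio" → "clevland" → "Street A"). The goal is not simply to expand arrays, but to extract the meaningful business objects level by level, discard the dynamic key names that serve only as containers, and keep only the dictionaries that hold semantic fields such as person, city, address, and facebooklink.
A compact, readable recursive implementation follows: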
def flatten_objects(data):
    """
    Recursively flatten a nested list of dictionaries.
    Assumes every valid node carries the person/city/address/facebooklink fields;
    dynamic keys (such as "united states" or "ohio") map to sub-lists that are
    processed recursively.
    """
    result = []
    # Accept either a single dict or a list of dicts
    if isinstance(data, dict):
        data = [data]
    for item in data:
        # Collect the business fields of the current level (non-nested values)
        base_fields = {}
        nested_lists = {}
        for key, value in item.items():
            # A non-empty list whose elements are all dicts is treated as a nested sub-structure
            if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
                nested_lists[key] = value
            else:
                base_fields[key] = value
        # The current level has fields of its own, so keep it
        if base_fields:
            result.append(base_fields)
        # Recurse into every nested list
        for sublist in nested_lists.values():
            result.extend(flatten_objects(sublist))
    return result

✅ Usage example:
nested_data = [
    {
        "person": "abc",
        "city": "united states",
        "facebooklink": "link",
        "address": "united states",
        "united states": [
            {
                "person": "cdf",
                "city": "ohio",
                "facebooklink": "link",
                "address": "united states/ohio",
                "ohio": [
                    {
                        "person": "efg",
                        "city": "clevland",
                        "facebooklink": "link",
                        "address": "united states/ohio/clevland",
                        "clevland": [
                            {
                                "person": "jkl",
                                "city": "Street A",
                                "facebooklink": "link",
                                "address": "united states/ohio/clevland/Street A",
                                "Street A": [
                                    {
                                        "person": "jkl",
                                        "city": "House 1",
                                        "facebooklink": "link",
                                        "address": "united states/ohio/clevland/Street A/House 1"
                                    }
                                ]
                            }
                        ]
                    },
                    {
                        "person": "ghi",
                        "city": "columbus",
                        "facebooklink": "link",
                        "address": "united states/ohio/columbus"
                    }
                ]
            },
            {
                "person": "abc",
                "city": "washington",
                "facebooklink": "link",
                "address": "united states/washington"
            }
        ]
    }
]
flattened = flatten_objects(nested_data)
for obj in flattened:
    print(obj)

⚠️ Notes:
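- The function does not depend on external libraries (such as flatten_json), which avoids path-parsing failures caused by dynamic key names;
- The nesting test is: the value is a non-empty list and every element is a dict. For data shaped like the example, this separates container keys from ordinary fields (such as "facebooklink": "link"); an empty list fails the test and is kept as an ordinary field;
- If the same field name has different types across levels (for example, "address" is a string at one level and an object at another), clean the data beforehand; the function copies any value that is not a non-empty list of dicts as-is, as an ordinary field, without recursing into it;
- Time complexity is roughly O(N), where N is the total number of nested dictionary nodes, assuming each node has a bounded number of fields; because result.extend copies child results at every level, the worst case for a very deep tree is closer to O(N·D). Besides the O(N) output list, the extra space is O(D) for the recursion stack, where D is the maximum nesting depth.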
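The last note mentions the recursion-stack cost. For very deep trees, Python's default recursion limit (typically 1000 frames) can become a practical concern. The following is a minimal sketch of an iterative variant that uses an explicit stack and produces records in the same order as the recursive version. The name flatten_objects_iterative is hypothetical and not part of the original implementation; the nesting test is copied from it unchanged.

```python
def flatten_objects_iterative(data):
    """Flatten nested dict lists with an explicit stack instead of recursion."""
    if isinstance(data, dict):
        data = [data]
    result = []
    # Push in reverse so that items are popped in their original order.
    stack = list(reversed(data))
    while stack:
        item = stack.pop()
        base_fields = {}
        children = []
        for key, value in item.items():
            # Same nesting test as the recursive version: non-empty list of dicts.
            if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
                children.extend(value)
            else:
                base_fields[key] = value
        if base_fields:
            result.append(base_fields)
        # Push children reversed so the first child is processed next (pre-order).
        stack.extend(reversed(children))
    return result


# Small illustrative input (not the article's full data set).
sample = [
    {
        "person": "a",
        "city": "x",
        "x": [
            {"person": "b", "city": "y", "y": [{"person": "c", "city": "z"}]},
            {"person": "d", "city": "w"},
        ],
    }
]
for record in flatten_objects_iterative(sample):
    print(record)
```

Each record is appended to the result exactly once, so this variant also avoids the repeated copying done by result.extend in the recursive version.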
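Advanced suggestions: to preserve hierarchy path information (for example, by adding fields such as "level": 2 and "parent": "ohio"), pass context parameters through the recursive calls; to support heterogeneous structures (a mix of list/dict/str), strengthen the type checks further. For the typical geographic tree shown in this article, the implementation above is already concise, efficient, and easy to maintain.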
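As one possible reading of the context-parameter suggestion, the sketch below threads the depth and the container key through the recursion. The function name flatten_with_context and the added field names "level" and "parent" are assumptions made for illustration; if the data already contains keys with those names, they would be overwritten.

```python
def flatten_with_context(data, level=0, parent=None):
    """Flatten nested dict lists, tagging each record with its depth and parent key."""
    if isinstance(data, dict):
        data = [data]
    result = []
    for item in data:
        base_fields = {}
        nested_lists = {}
        for key, value in item.items():
            # Same nesting test as flatten_objects: non-empty list of dicts.
            if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
                nested_lists[key] = value
            else:
                base_fields[key] = value
        if base_fields:
            # "level" and "parent" are illustrative names, not fields of the source data.
            result.append({**base_fields, "level": level, "parent": parent})
        for key, sublist in nested_lists.items():
            # The container key becomes the "parent" of every record in its sub-list.
            result.extend(flatten_with_context(sublist, level + 1, parent=key))
    return result


# Small illustrative input (a fragment shaped like the article's example).
sample = [
    {
        "person": "cdf",
        "city": "ohio",
        "ohio": [
            {"person": "efg", "city": "clevland"},
            {"person": "ghi", "city": "columbus"},
        ],
    }
]
for record in flatten_with_context(sample):
    print(record)
```

Top-level records get level 0 and parent None; how levels are numbered and what the parent marker contains are design choices that can be adapted to the data at hand.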










