更新时间:2023-01-26 08:08:17
我建议将 all_paths
更改为 leaf_paths
,这意味着它只会产生那些开始于一片叶子.
I would suggest changing all_paths
to leaf_paths
, meaning that it would only yield those paths that start at a leaf.
然后使用这些路径:
以下是您如何在标有注释的两个位置更改 all_paths
:
Here is how you would alter all_paths
at two places marked with a comment:
def leaf_paths(adj):
def recur(path):
node = path[-1]
neighbors = [neighbor for neighbor in adj[node] if not neighbor in path]
if not neighbors:
yield path
for neighbor in neighbors:
yield from recur(path + [neighbor])
# identify the internal nodes (not leaves)
internals = set(parent for parents in adj.values() for parent in parents)
for node in adj:
if not node in internals: # require that the starting node is a leaf
yield from recur([node])
然后添加这个函数:
def all_leaf_pairs(paths):
trees = {}
for path in paths:
if len(path) > 1:
root = path[-1]
if not root in trees:
trees[root] = []
it = iter(path)
leaf = next(it)
trees[root].extend((leaf, node) for node in it)
return trees
你的主程序会这样做:
data = [
["ANALYTICAL_BALANCE","BFG_DEPOSIT"],
["CUSTOMER_DETAIL","BALANCE"],
["BFG_2056", "FFD_15"],
["BALANCE","BFG_16"],
["BFG_16","STAT_HIST"],
["ANALYTICAL_BALANCE","BFG_2056"],
["CUSTOM_DATA","AND_11"],
["AND_11","DICT_DEAL"],
["DICT_DEAL","BFG_2056"]
]
adj = create_adj(data)
paths = leaf_paths(adj)
import pprint
pprint.pprint(all_leaf_pairs(paths))
这将输出:
{'BFG_DEPOSIT': [('ANALYTICAL_BALANCE', 'BFG_DEPOSIT')],
'FFD_15': [('ANALYTICAL_BALANCE', 'BFG_2056'),
('ANALYTICAL_BALANCE', 'FFD_15'),
('CUSTOM_DATA', 'AND_11'),
('CUSTOM_DATA', 'DICT_DEAL'),
('CUSTOM_DATA', 'BFG_2056'),
('CUSTOM_DATA', 'FFD_15')],
'STAT_HIST': [('CUSTOMER_DETAIL', 'BALANCE'),
('CUSTOMER_DETAIL', 'BFG_16'),
('CUSTOMER_DETAIL', 'STAT_HIST')]}
leaf_paths
的说明这个函数使用递归.它在其范围内定义了 recur
.
但主要代码从识别内部节点(即至少有一个子节点的节点)开始.由于 adj
提供给定节点的父节点,我们只需要收集所有这些父节点.
But the main code starts with identifying the internal nodes (i.e. the nodes that have at least one child). Since adj
provides the parent(s) for a given node, we just have to collect all those parents.
我们使用这组内部节点来确保我们只在叶节点上开始递归,因为在输出中我们希望路径总是以叶节点开始.
We use this set of internal nodes to make sure we start the recursion only on leaf nodes, as in the output we want to have paths that always start out with a leaf node.
recur
函数将从给定的叶子走向它可以找到的任何根.它使用它可以找到的下一个父级(neighbor
)扩展路径并执行递归,直到没有更多的父级(即,它是一个根).当这种情况发生时,累积的路径(以叶子开始并以根结束)yield.
The recur
function will walk from the given leaf towards any root it can find. It extends the path with the next parent it can find (neighbor
) and performs recursion until there is no more parent (i.e., it is a root). When that happens the accumulated path (that starts with a leaf and ends with a root) is yielded.
leaf_paths
本身产生任何 recur
产生的路径.
leaf_paths
itself yields any path that recur
yields.