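Problem: In a recent development task I ran into a grouped, weighted-sum calculation. Rows are grouped by the field 'CPN'. For each PO line, the Actual CT2R is multiplied by that line's share of the total quantity for its CPN, and the result is named 'Weighted(QTY)CT2R'. The per-row 'Weighted(QTY)CT2R' values are then summed within each 'CPN' to give the total 'Weighted(QTY)CT2R'. The cells filled in yellow in the figure below are the target values.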
The calculation logic is as follows:
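Written as a formula, the logic described above looks roughly like this. The symbols are notation introduced here for illustration, not taken from the original figure: $q_i$ is the quantity of PO line $i$, $c_i$ is its Actual CT2R, and $g$ is the set of lines sharing one CPN.

$$
\text{Weighted(QTY)CT2R}_g \;=\; \sum_{i \in g} \frac{q_i}{\sum_{j \in g} q_j}\, c_i
$$

With the sample data used in the code that follows, CPN 01-0989 has nine lines of quantity 10 at CT2R 90 and one line of quantity 200 at CT2R 50, so the total quantity is 290 and the result is (9 × 10 × 90 + 200 × 50) / 290 = 18100 / 290 ≈ 62.41. CPN 02-0437 has three lines of quantity 20 at CT2R 80, so the result is (3 × 20 × 80) / 60 = 80.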
The requirement can be implemented with pandas as follows:
import pandas as pd

df = pd.DataFrame(
    [
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 10, 90],
        ['01-0989', 200, 50],
        ['02-0437', 20, 80],
        ['02-0437', 20, 80],
        ['02-0437', 20, 80],
    ],
    columns=['cpn', 'po_line_qty', 'actual_ct2r'],
)

# Group by 'cpn' and sum 'po_line_qty'; store the result as total
total = df.groupby('cpn').agg({'po_line_qty': 'sum'}).reset_index()
# Rename the column 'po_line_qty' to 'total_po_line_qty'
total = total.rename(columns={'po_line_qty': 'total_po_line_qty'})
# Left-join df with total on 'cpn'; store the result as new_res
new_res = pd.merge(df, total, how='left', on='cpn')


def weighted_qty_ct2r(row):
    # This line's share of the total quantity for its CPN
    scale = row['po_line_qty'] / row['total_po_line_qty']
    # Weight the line's actual CT2R by that share
    return scale * row['actual_ct2r']


# Create the column 'weighted_qty_ct2r'
new_res['weighted_qty_ct2r'] = new_res.apply(weighted_qty_ct2r, axis=1)
# Group by 'cpn' and sum 'weighted_qty_ct2r'; store the result as df_result
df_result = new_res.groupby('cpn').agg({'weighted_qty_ct2r': 'sum'})
# Display the input, the intermediate tables, and the final result
print(df)
print(total)
print(new_res)
print(df_result)
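The same calculation can probably be done without the merge and the row-wise `apply`. The sketch below is an alternative, not the approach used above: it uses `groupby(...).transform('sum')` to broadcast each CPN's total quantity back onto its rows and then works on whole columns. The helper name `weighted_ct2r_by_cpn` is hypothetical, and the sample data is rebuilt in compact form so the block runs on its own.

```python
import pandas as pd

# Same sample data as above, written compactly (9 + 1 + 3 rows).
df = pd.DataFrame(
    [['01-0989', 10, 90]] * 9 + [['01-0989', 200, 50]] + [['02-0437', 20, 80]] * 3,
    columns=['cpn', 'po_line_qty', 'actual_ct2r'],
)


def weighted_ct2r_by_cpn(frame):
    """Hypothetical helper: quantity-weighted CT2R summed per CPN."""
    # Total quantity of each CPN, repeated on every row of that CPN.
    total_qty = frame.groupby('cpn')['po_line_qty'].transform('sum')
    # Per-row weighted value: quantity share times actual CT2R.
    weighted = frame['po_line_qty'] / total_qty * frame['actual_ct2r']
    # Sum the per-row values within each CPN.
    return weighted.groupby(frame['cpn']).sum().rename('weighted_qty_ct2r')


result = weighted_ct2r_by_cpn(df)
print(result)
```

On larger tables the column-wise form should generally be faster than `apply(..., axis=1)`, since it avoids calling a Python function once per row. It returns a Series indexed by 'cpn' rather than a one-column DataFrame.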