Set ET.SubElement.text to dict.value if dict.key equals another XML node within the same parent

I'm creating a new subelement with ElementTree. The text of the new node should be a dict value, chosen when that value's dict key equals the text of another XML node within the same parent node.

Sample XML:

<ns0:scaleType xmlns:ns0="http://someURL.com/">
  <scales>
    <scale>
        <names>
            <name id="0">abc</name>
            <name id="1" />
        </names>
        <alternativeExportValues>
        </alternativeExportValues>
    </scale>
    <scale>
        <names>
            <name id="0">def</name>
            <name id="1" />
        </names>
        <alternativeExportValues>
        </alternativeExportValues>
    </scale>
 </scales>
</ns0:scaleType>

Sample CSV:

name;value
abc;10012
def;20025

Current Python code:

import xml.etree.ElementTree as ET

import csv

csvData = []

with open('myCSV.csv', 'r', encoding="utf8") as f:
    reader = csv.reader(f, delimiter=";")
    for row in reader:
        csvData.append({'name': row[0], 'value': row[1]})

tree = ET.parse('myXml.xml')
root = tree.getroot()

def my_Function():
    for p in csvData:
        for name in root.findall(".//name[@id='0']"):
            text = name.text
            if p['name'] == text:
                value = p['value']
                return value
my_Function()


for elem in root.iter('alternativeExportValues'):
    newNode = ET.SubElement(elem, 'alternativeExportValue')
    newNode.text = 

tree.write("myNewXML.xml", encoding="utf-8")

Expected output:

<ns0:scaleType xmlns:ns0="http://someURL.com/">
  <scales>
    <scale>
        <names>
            <name id="0">abc</name>
            <name id="1" />
        </names>
        <alternativeExportValues>
           <alternativeExportValue>10012</alternativeExportValue>
        </alternativeExportValues>
    </scale>
    <scale>
        <names>
            <name id="0">def</name>
            <name id="1" />
        </names>
        <alternativeExportValues>
           <alternativeExportValue>20025</alternativeExportValue>
        </alternativeExportValues>
    </scale>
 </scales>
</ns0:scaleType>

I tried putting the for loop that creates the alternativeExportValue node inside my_Function, but I ended up either getting the same value in every newNode.text or getting stuck in an infinite loop.

As you can see in the expected output, I want dict.value to be the text of the newly created node if its key matches the inner text of <name id="0"> within the same parent <scale>.

python xml elementtree

Alecbalec 11.12.2019

Answers (1)

I'm not entirely sure what my_Function is supposed to do, but consider the following logic:

Read/process the CSV data (you are already doing this, but consider DictReader instead, which maps the values to a dict using the keys from the first row)
Process each scale element
Create a new alternativeExportValue element with the "value" value
Check whether the name element with the id attribute value "0" matches the current "name" entry
If so, append the new alternativeExportValue element

Example...

import xml.etree.ElementTree as ET
import csv

with open('myCSV.csv', 'r', encoding="utf8") as csvfile:
    tree = ET.parse('myXml.xml')

    # DictReader maps each row to a dict keyed by the header row ("name", "value").
    for row in csv.DictReader(csvfile, delimiter=";"):
        name = row.get("name")
        new_aev_elem = ET.Element("alternativeExportValue")
        new_aev_elem.text = row.get("value")
        for scale in tree.findall(".//scale"):
            name0 = scale.find("names/name[@id='0']")
            # Guard against a scale that has no <name id="0"> child.
            if name0 is not None and name0.text == name:
                aevs_elem = scale.find("alternativeExportValues")
                aevs_elem.append(new_aev_elem)
                break  # stop at the first matching scale

    tree.write("myNewXML.xml", encoding="utf-8")

This works, but it isn't very efficient, because for each CSV row you have to process every scale element that comes before the scale element you actually want to modify.

Worse, if you remove the break, it will process every scale element in the XML (for every row of the CSV!).
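One way to avoid the repeated scan while staying with ElementTree is to load the CSV into a dict first and then visit each scale only once. The following is a minimal sketch of that idea, not part of the original answer. It assumes the name values in the CSV are unique (a duplicate name would keep only its last value), and it uses inline strings (CSV_TEXT, XML_TEXT) as stand-ins for myCSV.csv and myXml.xml so that it runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline stand-ins for myCSV.csv and myXml.xml (assumption: same layout as in the question).
CSV_TEXT = "name;value\nabc;10012\ndef;20025\n"
XML_TEXT = """<ns0:scaleType xmlns:ns0="http://someURL.com/">
  <scales>
    <scale>
      <names><name id="0">abc</name><name id="1" /></names>
      <alternativeExportValues />
    </scale>
    <scale>
      <names><name id="0">def</name><name id="1" /></names>
      <alternativeExportValues />
    </scale>
  </scales>
</ns0:scaleType>"""

# Build the name -> value lookup once.
values_by_name = {
    row["name"]: row["value"]
    for row in csv.DictReader(io.StringIO(CSV_TEXT), delimiter=";")
}

root = ET.fromstring(XML_TEXT)

# Visit each scale exactly once and look its name up in the dict.
for scale in root.iter("scale"):
    name0 = scale.find("names/name[@id='0']")
    if name0 is None or name0.text not in values_by_name:
        continue  # no <name id="0">, or no CSV row for this name
    aevs_elem = scale.find("alternativeExportValues")
    if aevs_elem is None:
        continue  # nothing to attach the new element to
    new_aev_elem = ET.SubElement(aevs_elem, "alternativeExportValue")
    new_aev_elem.text = values_by_name[name0.text]

added = [elem.text for elem in root.iter("alternativeExportValue")]
print(added)  # ['10012', '20025']
```

With real files, the two inline strings would be replaced by open('myCSV.csv', ...) and ET.parse('myXml.xml'), as in the example above.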

If you can switch to lxml, you can use a slightly more complex XPath* that processes only the scale elements that need to be modified...

from lxml import etree
import csv

with open('myCSV.csv', 'r', encoding="utf8") as csvfile:
    tree = etree.parse('myXml.xml')

    # Alphabets for translate(), used to lower-case the XML text
    # because XPath 1.0 has no lower-case() function.
    uc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    lc = "abcdefghijklmnopqrstuvwxyz"

    for row in csv.DictReader(csvfile, delimiter=";"):
        name = row.get("name").lower()
        new_aev_elem = etree.Element("alternativeExportValue")
        new_aev_elem.text = row.get("value")
        # Select only the matching scale; [0] raises IndexError if no scale matches.
        aevs_elem = tree.xpath(f".//scale[translate(names/name[@id='0'],'{uc}','{lc}')='{name}']/alternativeExportValues")[0]
        aevs_elem.append(new_aev_elem)

    tree.write("myNewXML.xml", encoding="utf-8")

*XPath support in ElementTree is limited.
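Because ElementTree's XPath subset has no translate() function, a case-insensitive match with the standard library has to be done in Python rather than in the path expression. The sketch below shows one way to do that. The helper name find_scale_ci and the use of str.casefold() are illustrative choices, not something from the original answer, and the inline XML_TEXT is a trimmed stand-in for the question's file.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for myXml.xml; note the mixed-case "Abc".
XML_TEXT = """<scales>
  <scale><names><name id="0">Abc</name></names><alternativeExportValues /></scale>
  <scale><names><name id="0">def</name></names><alternativeExportValues /></scale>
</scales>"""


def find_scale_ci(root, wanted):
    """Return the first scale whose id="0" name equals `wanted`, ignoring case."""
    wanted = wanted.casefold()
    for scale in root.iter("scale"):
        name0 = scale.find("names/name[@id='0']")
        # An empty <name id="0" /> has text None, so fall back to "".
        if name0 is not None and (name0.text or "").casefold() == wanted:
            return scale
    return None  # no scale matched


root = ET.fromstring(XML_TEXT)
match = find_scale_ci(root, "ABC")    # matches the "Abc" scale
missing = find_scale_ci(root, "xyz")  # None, instead of an IndexError
```

Returning None for a missing name lets the caller decide how to handle CSV rows that have no counterpart in the XML.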

Daniel Haley 12.12.2019

comment

Thanks, @Daniel Haley. My idea with my_Function was to look up the value of the key that matches .//name[@id="0"]. I tried switching to lxml, but aevs_elem = tree.xpath(f".//scale[names/name[@id='0']='{name}']/alternativeExportValues")[0] raises IndexError: list index out of range. Does this mean the XPath doesn't exist? - Alecbalec; 12.12.2019

comment

@Alecbalec - Did you also switch to DictReader? If not, you may be processing the first line of the CSV, and the XPath fails because of name[@id='0']='name'. If you don't want to switch to DictReader, you could skip the first line of the CSV or use try/except. If you did switch to DictReader, something must be different in your CSV or XML, because I tested with what you have in your question and didn't get any errors. - Daniel Haley; 12.12.2019

comment

First of all, thank you very much! The first problem was the encoding used when reading the CSV file. In the for loop I ran print(row) to see the output, and it returned '\ufeffname': instead of name:. I changed the encoding to "utf-8-sig" and everything looked better. I uncommented the code and added print(row.get('value')) to see which line of the .csv file the code actually broke on. After a quick look, it was clear that XPath is case-sensitive, which is why it returned IndexError: list index out of range. - Alecbalec; 12.12.2019

comment

@Alecbalec - Ah yes, XPath is definitely case-sensitive. Forcing values to upper or lower case is a pain in XPath 1.0 (which is what lxml supports), but I'll update my lxml answer so that it is case-insensitive. - Daniel Haley; 12.12.2019

comment

Yes, luckily there were only a few lines where a character was lower case in the original XML file and upper case in the csv. Thanks again - Alecbalec; 12.12.2019
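The '\ufeffname' key reported in the comments is what a UTF-8 byte order mark (BOM) looks like when the file is decoded as plain "utf-8". The sketch below reproduces that effect in memory. The helper name header_names and the sample bytes are assumptions made for illustration only.

```python
import csv
import io

# Bytes of a small CSV file saved with a UTF-8 byte order mark (BOM).
raw = "name;value\nabc;10012\n".encode("utf-8-sig")


def header_names(data, encoding):
    """Decode `data` with `encoding` and return the DictReader field names."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return csv.DictReader(text, delimiter=";").fieldnames


with_bom = header_names(raw, "utf-8")         # BOM stays glued to the first key
without_bom = header_names(raw, "utf-8-sig")  # BOM is stripped while decoding

print(with_bom)     # ['\ufeffname', 'value']
print(without_bom)  # ['name', 'value']
```

In the first case row.get("name") returns None for every row, because the only matching key is '\ufeffname'. Opening the file with encoding="utf-8-sig" also works for files that have no BOM.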
