
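This tutorial explains how to use the Java DOM parser to process XML files that contain multi-level, cross-referenced data. It first corrects a common misconception about the document-wide search performed by getElementsByTagName and shows how to search precisely by restricting the lookup to a parent node. It then shows how to use Java objects and Map structures to aggregate data from different XML nodes and produce unified output keyed by the linking IDs, so that complex XML data can be managed and displayed effectively.
When XML data is spread across several levels and linked by references, Java's Document Object Model (DOM) parser is a common and effective way to process it. The DOM parser loads the entire XML document into memory and represents it as a tree; you access and manipulate the data by traversing that tree.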
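To make the tree model concrete, here is a minimal sketch that parses a small XML string and walks the resulting tree depth-first. The class name `DomTreeWalkSketch`, its helper methods, and the `<team>` sample are assumptions made for illustration; they are not part of the tutorial's employee example.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: shows that a parsed document is a tree of nodes.
public class DomTreeWalkSketch {

    // Parses an XML string into a DOM Document; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Adds the element's name to 'out', then recurses into its child elements
    // with two more spaces of indentation per level.
    public static void collect(Element element, String indent, List<String> out) {
        out.add(indent + element.getNodeName());
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) { // skip text and comment nodes
                collect((Element) child, indent + "  ", out);
            }
        }
    }

    // Returns one line per element, in document order.
    public static List<String> outline(String xml) {
        List<String> out = new ArrayList<>();
        collect(parse(xml).getDocumentElement(), "", out);
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample document
        String xml = "<team><member><name>Ana</name></member><member><name>Bo</name></member></team>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```

Because the whole tree lives in memory, any node can be reached from any other node. That is the property the rest of this tutorial relies on when it joins data from separate sections of one document.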
Consider the following employee XML structure. It contains three main sections (the employee list, position details, and employee contact information), which are linked through ref attributes:
<?xml version="1.0" encoding="UTF-8"?>
<employee>
    <employee_list>
        <employee ID="1">
            <firstname>Andrei</firstname>
            <lastname>Rus</lastname>
            <age>23</age>
            <position-skill ref="Java"/>
            <detail-ref ref="AndreiR"/>
        </employee>
        <!-- ... other employees ... -->
    </employee_list>
    <position_details>
        <position ID="Java">
            <role>Junior Developer</role>
            <skill_name>Java</skill_name>
            <experience>1</experience>
        </position>
        <!-- ... other positions ... -->
    </position_details>
    <employee_info>
        <detail ID="AndreiR">
            <username>AndreiR</username>
            <residence>Timisoara</residence>
            <yearOfBirth>1999</yearOfBirth>
            <phone>0</phone>
        </detail>
        <!-- ... other details ... -->
    </employee_info>
</employee>
Our goal is to parse this data and print all the linked information for each employee in one unified format.
Before parsing XML with DOM, a few standard initialization steps are required:
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.ArrayList;
import java.util.List;

public class XmlParserTutorial {
    public static void main(String[] args) {
        try {
            File xmlDoc = new File("employees.xml"); // Make sure the XML file exists in the project root or at the given path
            DocumentBuilderFactory dbFact = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuild = dbFact.newDocumentBuilder();
            Document doc = dBuild.parse(xmlDoc);
            // Optional: normalize the document, merging adjacent text nodes
            doc.getDocumentElement().normalize();
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
            System.out.println("-----------------------------------------------------------------------------");
            // The parsing logic shown later goes here
            // ...
        } catch (Exception e) {
            e.printStackTrace(); // Print the stack trace to help with debugging
        }
    }
}
Document.getElementsByTagName(tagName) searches the entire document, starting at the root node, and returns every element whose tag name matches. This can produce unexpected results. If the root element has the same tag name as one of its descendants (as in the sample above, where both the root and each entry are named employee), or if same-named elements appear at different levels, the document-wide search returns more nodes than intended.
To avoid this, call getElementsByTagName on a more specific parent element, which restricts the search to that element's descendants.
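The difference can be checked with a small self-contained sketch. It assumes a trimmed-down version of the sample document held in a string, with two employee entries; the class name `ScopedSearchSketch` and its methods are hypothetical.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing a document-wide search with a scoped one.
public class ScopedSearchSketch {

    // Reduced sample: the root is <employee> and the list holds two <employee> entries.
    public static final String SAMPLE =
            "<employee>"
          + "<employee_list>"
          + "<employee ID=\"1\"/><employee ID=\"2\"/>"
          + "</employee_list>"
          + "</employee>";

    // Parses an XML string into a DOM Document; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Document-wide search: matches every <employee>, including the root element.
    public static int countGlobal(Document doc) {
        return doc.getElementsByTagName("employee").getLength();
    }

    // Scoped search: only the <employee> descendants of the first <employee_list>.
    public static int countScoped(Document doc) {
        Element list = (Element) doc.getElementsByTagName("employee_list").item(0);
        return list.getElementsByTagName("employee").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Global search: " + countGlobal(doc)); // 3 (root + 2 entries)
        System.out.println("Scoped search: " + countScoped(doc)); // 2
    }
}
```

The extra match in the global search is the root element itself, which would otherwise be processed as if it were an employee entry.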
Incorrect example (may include the root element or elements outside the intended section):
// The root element is also named "employee", and the tag may appear elsewhere too
NodeList nList = doc.getElementsByTagName("employee"); // May return more results than expected
Correct approach: restrict the search scope
For the employee_list section, first locate the employee_list element, then search for employee beneath it:
// Get the employee_list node
NodeList employeeListNodes = doc.getElementsByTagName("employee_list");
Element employeeListElement = (Element) employeeListNodes.item(0); // Assumes there is exactly one employee_list
// Search for employee nodes under employeeListElement
NodeList employeeNodes = employeeListElement.getElementsByTagName("employee");
System.out.println("Total employees found: " + employeeNodes.getLength());
Use the same strategy for position_details and employee_info:
// Get the position_details node and search for position beneath it
NodeList positionDetailsNodes = doc.getElementsByTagName("position_details");
Element positionDetailsElement = (Element) positionDetailsNodes.item(0);
NodeList positionNodes = positionDetailsElement.getElementsByTagName("position");
System.out.println("Total positions found: " + positionNodes.getLength());
// Get the employee_info node and search for detail beneath it
NodeList employeeInfoNodes = doc.getElementsByTagName("employee_info");
Element employeeInfoElement = (Element) employeeInfoNodes.item(0);
NodeList detailNodes = employeeInfoElement.getElementsByTagName("detail");
System.out.println("Total details found: " + detailNodes.getLength());
To group the output by person, we need to combine the related data from the different XML sections into a single Java object. This takes three steps: define a class that holds one employee's complete record, index the position and detail elements by ID in Maps, and then walk the employee list and join the data through those Maps. The record class comes first:
// EmployeeRecord.java
class EmployeeRecord {
    private String id;
    private String firstname;
    private String lastname;
    private String age;
    private String role;
    private String skillName;
    private String experience;
    private String username;
    private String residence;
    private String yearOfBirth;
    private String phone;

    // Constructor
    public EmployeeRecord() {}

    // Getters and setters, written one pair per field to keep the listing compact
    public void setId(String id) { this.id = id; }
    public String getId() { return id; }
    public void setFirstname(String firstname) { this.firstname = firstname; }
    public String getFirstname() { return firstname; }
    public void setLastname(String lastname) { this.lastname = lastname; }
    public String getLastname() { return lastname; }
    public void setAge(String age) { this.age = age; }
    public String getAge() { return age; }
    public void setRole(String role) { this.role = role; }
    public String getRole() { return role; }
    public void setSkillName(String skillName) { this.skillName = skillName; }
    public String getSkillName() { return skillName; }
    public void setExperience(String experience) { this.experience = experience; }
    public String getExperience() { return experience; }
    public void setUsername(String username) { this.username = username; }
    public String getUsername() { return username; }
    public void setResidence(String residence) { this.residence = residence; }
    public String getResidence() { return residence; }
    public void setYearOfBirth(String yearOfBirth) { this.yearOfBirth = yearOfBirth; }
    public String getYearOfBirth() { return yearOfBirth; }
    public void setPhone(String phone) { this.phone = phone; }
    public String getPhone() { return phone; }

    @Override
    public String toString() {
        return "Person ID: " + id + "\n" +
               "First Name: " + firstname + "\n" +
               "Last Name: " + lastname + "\n" +
               "Age: " + age + "\n" +
               "Role: " + role + "\n" +
               "Skill Name: " + skillName + "\n" +
               "Experience: " + experience + "\n" +
               "Username: " + username + "\n" +
               "Residence: " + residence + "\n" +
               "Year of Birth: " + yearOfBirth + "\n" +
               "Phone: " + phone + "\n" +
               "--------------------------------------------------------------------------";
    }
}
In the main method of XmlParserTutorial, parse the two supporting sections before parsing employee_list:
// ... (the DOM initialization code shown earlier) ...
// Map of position details, keyed by position ID
Map<String, Element> positionDetailsMap = new HashMap<>();
NodeList positionDetailsNodes = doc.getElementsByTagName("position_details");
if (positionDetailsNodes.getLength() > 0) {
    Element positionDetailsElement = (Element) positionDetailsNodes.item(0);
    NodeList positionNodes = positionDetailsElement.getElementsByTagName("position");
    for (int i = 0; i < positionNodes.getLength(); i++) {
        Node node = positionNodes.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element positionElement = (Element) node;
            positionDetailsMap.put(positionElement.getAttribute("ID"), positionElement);
        }
    }
}
// Map of additional employee information, keyed by detail ID
Map<String, Element> employeeInfoMap = new HashMap<>();
NodeList employeeInfoNodes = doc.getElementsByTagName("employee_info");
if (employeeInfoNodes.getLength() > 0) {
    Element employeeInfoElement = (Element) employeeInfoNodes.item(0);
    NodeList detailNodes = employeeInfoElement.getElementsByTagName("detail");
    for (int i = 0; i < detailNodes.getLength(); i++) {
        Node node = detailNodes.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element detailElement = (Element) node;
            employeeInfoMap.put(detailElement.getAttribute("ID"), detailElement);
        }
    }
}
// ... (the employee_list parsing code follows) ...
Now we can iterate over each employee in employee_list and use the Maps built above to look up and attach the related data.
// List holding every complete employee record
List<EmployeeRecord> allEmployeeRecords = new ArrayList<>();
NodeList employeeListNodes = doc.getElementsByTagName("employee_list");
if (employeeListNodes.getLength() > 0) {
    Element employeeListElement = (Element) employeeListNodes.item(0);
    NodeList employeeNodes = employeeListElement.getElementsByTagName("employee");
    for (int i = 0; i < employeeNodes.getLength(); i++) {
        Node node = employeeNodes.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element employeeElement = (Element) node;
            EmployeeRecord record = new EmployeeRecord();
            // Read the data stored directly in employee_list
            record.setId(employeeElement.getAttribute("ID"));
            record.setFirstname(getTagValue("firstname", employeeElement));
            record.setLastname(getTagValue("lastname", employeeElement));
            record.setAge(getTagValue("age", employeeElement));
            // Read the position-skill ref and look up the matching position details
            String positionSkillRef = employeeElement.getElementsByTagName("position-skill").item(0).getAttributes().getNamedItem("ref").getNodeValue();
            Element positionElement = positionDetailsMap.get(positionSkillRef);
            if (positionElement != null) {
                record.setRole(getTagValue("role", positionElement));
                record.setSkillName(getTagValue("skill_name", positionElement));
                record.setExperience(getTagValue("experience", positionElement));
            }
            // Read the detail-ref and look up the matching employee_info details
            String detailRef = employeeElement.getElementsByTagName("detail-ref").item(0).getAttributes().getNamedItem("ref").getNodeValue();
            Element detailElement = employeeInfoMap.get(detailRef);
            if (detailElement != null) {
                record.setUsername(getTagValue("username", detailElement));
                record.setResidence(getTagValue("residence", detailElement));
                record.setYearOfBirth(getTagValue("yearOfBirth", detailElement));
                record.setPhone(getTagValue("phone", detailElement));
            }
            allEmployeeRecords.add(record);
        }
    }
}
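The two ref lookups in the loop chain getAttributes().getNamedItem("ref").getNodeValue(), which throws a NullPointerException when the referencing element or its ref attribute is missing. One possible way to make the lookup tolerant is sketched below. The class `RefLookupSketch` and the helper `readRef` are hypothetical names, and the sketch relies on Element.getAttribute returning an empty string for an absent attribute.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a ref lookup that tolerates missing elements and attributes.
public class RefLookupSketch {

    // Parses an XML string into a DOM Document; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the "ref" attribute of the first descendant named tagName.
    // Yields "" when no such element exists or when it has no "ref" attribute.
    public static String readRef(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        if (matches.getLength() == 0) {
            return "";
        }
        // getElementsByTagName only returns elements, so the cast is safe
        return ((Element) matches.item(0)).getAttribute("ref");
    }

    public static void main(String[] args) {
        // Sample entry with a position-skill reference but no detail-ref element
        Element employee = parse("<employee ID=\"1\"><position-skill ref=\"Java\"/></employee>")
                .getDocumentElement();
        System.out.println("position-skill ref: '" + readRef(employee, "position-skill") + "'");
        System.out.println("detail-ref ref: '" + readRef(employee, "detail-ref") + "'");
    }
}
```

An empty ref finds no entry in either Map, so the existing null checks on positionElement and detailElement would skip the related fields without any further change.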
// Helper method (declare it in the class, outside main): safely reads the text content of a child tag
private static String getTagValue(String tagName, Element element) {
    NodeList nodeList = element.getElementsByTagName(tagName);
    if (nodeList != null && nodeList.getLength() > 0) {
        Node node = nodeList.item(0);
        if (node != null && node.getNodeType() == Node.ELEMENT_NODE) {
            return node.getTextContent();
        }
    }
    return ""; // Tag not found: return an empty string
}
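Note that getTagValue also uses getElementsByTagName, so it returns the first matching descendant at any depth, not necessarily a direct child. The sample document has no nested duplicates, so this does not affect the tutorial's output. For documents that do nest same-named tags, a direct-child lookup is one option. The sketch below uses a hypothetical helper `getDirectChildText` and a made-up `<person>` sample to show the difference.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class contrasting descendant lookup with direct-child lookup.
public class DirectChildSketch {

    // Made-up sample: <name> appears both nested inside <spouse> and directly under <person>.
    public static final String SAMPLE =
            "<person><spouse><name>Inner</name></spouse><name>Outer</name></person>";

    // Parses an XML string into a DOM Document; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Descendant lookup, the same strategy getTagValue uses: first match in document order.
    public static String getFirstDescendantText(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : "";
    }

    // Direct-child lookup: only inspects the immediate children of 'parent'.
    public static String getDirectChildText(Element parent, String tagName) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE && tagName.equals(child.getNodeName())) {
                return child.getTextContent();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        Element person = parse(SAMPLE).getDocumentElement();
        System.out.println(getFirstDescendantText(person, "name")); // Inner
        System.out.println(getDirectChildText(person, "name"));     // Outer
    }
}
```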
// Print every aggregated employee record
System.out.println("\n=============================================================================================");
System.out.println("Aggregated Employee Records:");
System.out.println("=============================================================================================");
for (EmployeeRecord record : allEmployeeRecords) {
    System.out.println(record);
}
Combine all of the fragments above into a single XmlParserTutorial.java file, and make sure the EmployeeRecord class is in the same package or otherwise accessible.
// XmlParserTutorial.java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class XmlParserTutorial {
    public static void main(String[] args) {
        try {
            File xmlDoc = new File("employees.xml"); // Make sure the XML file exists in the project root or at the given path
            DocumentBuilderFactory dbFact = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuild = dbFact.newDocumentBuilder();
            Document doc = dBuild.parse(xmlDoc);
            doc.getDocumentElement().normalize(); // Normalize the XML document
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
            System.out.println("-----------------------------------------------------------------------------");

            // 1. Pre-parse position_details into a Map
            Map<String, Element> positionDetailsMap = new HashMap<>();
            NodeList positionDetailsNodes = doc.getElementsByTagName("position_details");
            if (positionDetailsNodes.getLength() > 0) {
                Element positionDetailsElement = (Element) positionDetailsNodes.item(0);
                NodeList positionNodes = positionDetailsElement.getElementsByTagName("position");
                for (int i = 0; i < positionNodes.getLength(); i++) {
                    Node node = positionNodes.item(i);
                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        Element positionElement = (Element) node;
                        positionDetailsMap.put(positionElement.getAttribute("ID"), positionElement);
                    }
                }
            }

            // 2. Pre-parse employee_info into a Map
            Map<String, Element> employeeInfoMap = new HashMap<>();
            NodeList employeeInfoNodes = doc.getElementsByTagName("employee_info");
            if (employeeInfoNodes.getLength() > 0) {
                Element employeeInfoElement = (Element) employeeInfoNodes.item(0);
                NodeList detailNodes = employeeInfoElement.getElementsByTagName("detail");
                for (int i = 0; i < detailNodes.getLength(); i++) {
                    Node node = detailNodes.item(i);
                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        Element detailElement = (Element) node;
                        employeeInfoMap.put(detailElement.getAttribute("ID"), detailElement);
                    }
                }
            }

            // 3. Iterate over employee_list and aggregate the data
            List<EmployeeRecord> allEmployeeRecords = new ArrayList<>();
            NodeList employeeListNodes = doc.getElementsByTagName("employee_list");
            if (employeeListNodes.getLength() > 0) {
                Element employeeListElement = (Element) employeeListNodes.item(0);
                NodeList employeeNodes = employeeListElement.getElementsByTagName("employee");
                for (int i = 0; i < employeeNodes.getLength(); i++) {
                    Node node = employeeNodes.item(i);
                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        Element employeeElement = (Element) node;
                        EmployeeRecord record = new EmployeeRecord();
                        // Read the data stored directly in employee_list
                        record.setId(employeeElement.getAttribute("ID"));
                        record.setFirstname(getTagValue("firstname", employeeElement));
                        record.setLastname(getTagValue("lastname", employeeElement));
                        record.setAge(getTagValue("age", employeeElement));
                        // Read the position-skill ref and look up the matching position details
                        NodeList positionSkillList = employeeElement.getElementsByTagName("position-skill");
                        if (positionSkillList.getLength() > 0) {
                            String positionSkillRef = positionSkillList.item(0).getAttributes().getNamedItem("ref").getNodeValue();
                            Element positionElement = positionDetailsMap.get(positionSkillRef);
                            if (positionElement != null) {
                                record.setRole(getTagValue("role", positionElement));
                                record.setSkillName(getTagValue("skill_name", positionElement));
                                record.setExperience(getTagValue("experience", positionElement));
                            }
                        }
                        // Read the detail-ref and look up the matching employee_info details
                        NodeList detailRefList = employeeElement.getElementsByTagName("detail-ref");
                        if (detailRefList.getLength() > 0) {
                            String detailRef = detailRefList.item(0).getAttributes().getNamedItem("ref").getNodeValue();
                            Element detailElement = employeeInfoMap.get(detailRef);
                            if (detailElement != null) {
                                record.setUsername(getTagValue("username", detailElement));
                                record.setResidence(getTagValue("residence", detailElement));
                                record.setYearOfBirth(getTagValue("yearOfBirth", detailElement));
                                record.setPhone(getTagValue("phone", detailElement));
                            }
                        }
                        allEmployeeRecords.add(record);
                    }
                }
            }

            // Print every aggregated employee record
            System.out.println("\n=============================================================================================");
            System.out.println("Aggregated Employee Records:");
            System.out.println("=============================================================================================");
            for (EmployeeRecord record : allEmployeeRecords) {
                System.out.println(record);
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

    // Helper method: safely reads the text content of a child tag (same as the snippet shown earlier)
    private static String getTagValue(String tagName, Element element) {
        NodeList nodeList = element.getElementsByTagName(tagName);
        if (nodeList != null && nodeList.getLength() > 0) {
            Node node = nodeList.item(0);
            if (node != null && node.getNodeType() == Node.ELEMENT_NODE) {
                return node.getTextContent();
            }
        }
        return ""; // Tag not found: return an empty string
    }
}
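To try the join logic without an employees.xml file on disk, the same idea can be condensed into a sketch that parses a shortened copy of the sample from a string. The class `InMemoryJoinSketch`, its helpers (`indexById`, `text`, `ref`, `join`), and the pipe-separated output format are assumptions made for this illustration, not part of the tutorial's code.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical condensed version of the Map-based join, run against an in-memory document.
public class InMemoryJoinSketch {

    // Shortened copy of the sample document (only a few fields kept).
    public static final String SAMPLE =
            "<employee>"
          + "<employee_list>"
          + "<employee ID=\"1\"><firstname>Andrei</firstname><lastname>Rus</lastname>"
          + "<position-skill ref=\"Java\"/><detail-ref ref=\"AndreiR\"/></employee>"
          + "</employee_list>"
          + "<position_details>"
          + "<position ID=\"Java\"><role>Junior Developer</role></position>"
          + "</position_details>"
          + "<employee_info>"
          + "<detail ID=\"AndreiR\"><residence>Timisoara</residence></detail>"
          + "</employee_info>"
          + "</employee>";

    // Parses an XML string into a DOM Document; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Indexes the <childTag> elements under the first <sectionTag> by their ID attribute.
    public static Map<String, Element> indexById(Document doc, String sectionTag, String childTag) {
        Map<String, Element> index = new HashMap<>();
        NodeList sections = doc.getElementsByTagName(sectionTag);
        if (sections.getLength() == 0) {
            return index;
        }
        NodeList children = ((Element) sections.item(0)).getElementsByTagName(childTag);
        for (int i = 0; i < children.getLength(); i++) {
            Element child = (Element) children.item(i);
            index.put(child.getAttribute("ID"), child);
        }
        return index;
    }

    // Text of the first descendant named 'tag', or "" when parent is null or the tag is absent.
    public static String text(Element parent, String tag) {
        if (parent == null) {
            return "";
        }
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : "";
    }

    // Value of the "ref" attribute on the first descendant named 'tag', or "" when absent.
    public static String ref(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() > 0 ? ((Element) matches.item(0)).getAttribute("ref") : "";
    }

    // Joins each employee with its position and detail entries; one output row per employee.
    public static List<String> join(String xml) {
        Document doc = parse(xml);
        Map<String, Element> positions = indexById(doc, "position_details", "position");
        Map<String, Element> details = indexById(doc, "employee_info", "detail");
        List<String> rows = new ArrayList<>();
        NodeList lists = doc.getElementsByTagName("employee_list");
        if (lists.getLength() == 0) {
            return rows;
        }
        NodeList employees = ((Element) lists.item(0)).getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element emp = (Element) employees.item(i);
            Element position = positions.get(ref(emp, "position-skill"));
            Element detail = details.get(ref(emp, "detail-ref"));
            rows.add(text(emp, "firstname") + " " + text(emp, "lastname")
                    + " | " + text(position, "role") + " | " + text(detail, "residence"));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : join(SAMPLE)) {
            System.out.println(row); // Andrei Rus | Junior Developer | Timisoara
        }
    }
}
```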