Parsing XML Easily with PHP5


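With the SAX approach you have to write three handler functions yourself and return the data directly through those three functions, which demands fairly careful logic. Whenever the XML structure changes, the three functions have to be rewritten. Tedious!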
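To make the "three functions" concrete, here is a minimal sketch of raw SAX parsing with PHP's XML Parser extension. The class name `EventLogger`, its method names, and the sample XML string are made up for illustration and are not part of the package described in this article.

```php
<?php
// Hypothetical sample input, not the article's test.xml.
$xml = '<shop><name>demo</name><goods id="g1"><price>12.90</price></goods></shop>';

class EventLogger
{
    public $events = array();
    private $depth = 0;

    // Handler 1: called for every opening tag.
    public function onStart($parser, $name, $attrs)
    {
        $line = str_repeat('  ', $this->depth) . "start $name";
        if (isset($attrs['id'])) {
            $line .= " id=" . $attrs['id'];
        }
        $this->events[] = $line;
        $this->depth++;
    }

    // Handler 2: called for every closing tag.
    public function onEnd($parser, $name)
    {
        $this->depth--;
        $this->events[] = str_repeat('  ', $this->depth) . "end $name";
    }

    // Handler 3: called for character data between tags.
    public function onText($parser, $data)
    {
        if (trim($data) !== '') {
            $this->events[] = str_repeat('  ', $this->depth) . "text $data";
        }
    }
}

$logger = new EventLogger();
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
xml_set_element_handler($parser, array($logger, 'onStart'), array($logger, 'onEnd'));
xml_set_character_data_handler($parser, array($logger, 'onText'));
xml_parse($parser, $xml, true);

echo implode("\n", $logger->events), "\n";
```

The handlers only see one event at a time, so any notion of "where am I in the document" (here the `$depth` counter) has to be tracked by hand. That bookkeeping is what has to be redone for every new XML layout.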

The DOM approach is somewhat better, but it treats everything as a node, so even simple operations take a lot of code. Also tedious!
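For comparison, a small sketch of the same kind of lookup with PHP5's built-in DOM extension. The sample XML string is hypothetical; `DOMDocument`, `getElementsByTagName()`, and `getAttribute()` are standard DOM calls.

```php
<?php
// Hypothetical sample input, not the article's test.xml.
$xml = '<shop><goods id="g1"><name>demo</name><price>12.90</price></goods></shop>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$goods = array();
foreach ($doc->getElementsByTagName('goods') as $node) {
    $item = array('id' => $node->getAttribute('id'));
    foreach ($node->childNodes as $child) {
        // Text and whitespace are nodes too, so elements must be filtered explicitly.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $item[$child->nodeName] = $child->nodeValue;
        }
    }
    $goods[] = $item;
}

print_r($goods);
```

Even reading two child values requires walking `childNodes` and checking node types, which is the verbosity complained about above.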

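There are plenty of open-source XML parsing libraries online. I looked at a few in the past, but never felt comfortable with them; it always felt like trailing along behind someone else.

I have been working on Java for the past few days and it has been tiring, so I decided to clear my head by writing some PHP. To keep XML parsing from tripping me up again, I spent a day writing the XML parsing class below.

It works by wrapping the results of SAX parsing. For my own purposes it is quite practical, performance is acceptable, and it covers most common processing needs.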
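The general idea of wrapping SAX results can be sketched with a stack that turns the event stream into a nested structure. This is an illustration of the technique only, not the class listed later in this article; `TreeBuilder`, `parseToTree()`, and the sample XML are hypothetical names and data.

```php
<?php
class TreeBuilder
{
    public $root = null;
    private $stack = array(); // elements that are currently open

    public function onStart($parser, $name, $attrs)
    {
        $this->stack[] = array('tag' => $name, 'attrs' => $attrs, 'text' => '', 'children' => array());
    }

    public function onText($parser, $data)
    {
        $top = count($this->stack) - 1;
        if ($top >= 0) {
            // Text can arrive in several chunks, so append instead of assigning.
            $this->stack[$top]['text'] .= $data;
        }
    }

    public function onEnd($parser, $name)
    {
        $node = array_pop($this->stack);
        $node['text'] = trim($node['text']);
        if (count($this->stack) === 0) {
            $this->root = $node; // the outermost element has closed
        } else {
            $this->stack[count($this->stack) - 1]['children'][] = $node;
        }
    }
}

function parseToTree($xml)
{
    $builder = new TreeBuilder();
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler($parser, array($builder, 'onStart'), array($builder, 'onEnd'));
    xml_set_character_data_handler($parser, array($builder, 'onText'));
    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $builder->root;
}

$tree = parseToTree('<shop><name>demo</name><cat id="food"><goods id="food11"><price>12.90</price></goods></cat></shop>');
echo $tree['children'][1]['children'][0]['attrs']['id'], "\n"; // prints food11
```

Once the tree exists, callers no longer need to write handlers of their own; they only navigate the result.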

Features:
1. Query, add, modify, and delete nodes of a basic XML file.
2. Export all data of an XML file into an array.
3. The whole design is object-oriented; working with the result set feels similar to DOM.

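Drawbacks:
1. Every node should preferably carry an id (see the example below). Each node name has the form "tag_id", that is, the node's tag and its id joined by an underscore. If no id is set, the program generates one automatically: the node's position number within its parent node, starting from 0.
2. A node is looked up by joining node names with the "|" character. The names are those of its ancestor nodes, written in order from the top down.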
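The naming and lookup rules above can be sketched on plain nested arrays. `nodeKey()` and `findByPath()` are hypothetical helpers written for this illustration, not methods of the package; the keys mirror the ones that appear in the sample output later in the article.

```php
<?php
// Build a node name "tag_id"; without an id, fall back to the position among siblings.
function nodeKey($tag, array $attrs, $position)
{
    return $tag . '_' . (isset($attrs['id']) ? $attrs['id'] : $position);
}

// Follow a "|"-separated path of node names through nested arrays.
function findByPath(array $tree, $path)
{
    $current = $tree;
    foreach (explode('|', $path) as $key) {
        if (!is_array($current) || !isset($current[$key])) {
            return null; // some step of the path does not exist
        }
        $current = $current[$key];
    }
    return $current;
}

$tree = array(
    nodeKey('cat', array('id' => 'food'), 0) => array(
        nodeKey('goods', array('id' => 'food11'), 0) => array('name' => 'food11', 'price' => '12.90'),
    ),
    nodeKey('cat', array(), 1) => array(), // no id: named by position, "cat_1"
);

print_r(findByPath($tree, 'cat_food|goods_food11'));
var_dump(findByPath($tree, 'cat_food|goods_missing')); // NULL
```

The position-based fallback is why ids are recommended: a name such as "cat_1" changes as soon as siblings are added or removed before it.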

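Usage:
Run the example below; the result page shows how the functions are used.

The code is written for PHP5 and does not run on PHP4.

I have only just finished writing it, so the documentation is not organized yet. The example below demonstrates only part of the functionality. The code is not hard, so read the source if you want to know more.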

Directory structure:

test.php
test.xml
xml/SimpleDocumentBase.php
xml/SimpleDocumentNode.php
xml/SimpleDocumentRoot.php
xml/SimpleDocumentParser.php

File: test.xml

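<?xml version="1.0" encoding="GB2312"?>
<!-- The element tags were lost from this listing and are restored here from the getSaveData() dump shown later. The root element name does not survive anywhere in the source, so "shop" is a placeholder. -->
<shop>
    <name>华联</name>
    <address>北京长安街-9999号</address>
    <desc>连锁超市</desc>
    <cat id="food">
        <goods id="food11">
            <name>food11</name>
            <price>12.90</price>
        </goods>
        <goods id="food12">
            <name>food12</name>
            <price>22.10</price>
            <desc creator="hahawen">好东西推荐</desc>
        </goods>
    </cat>
    <cat>
        <goods id="tel21">
            <name>tel21</name>
            <price>1290</price>
        </goods>
    </cat>
    <cat id="coat">
        <goods id="coat31">
            <name>coat31</name>
            <price>112</price>
        </goods>
        <goods id="coat32">
            <name>coat32</name>
            <price>45</price>
        </goods>
    </cat>
    <special id="hot">
        <goods>
            <name>hot41</name>
            <price>99</price>
        </goods>
    </special>
</shop>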

File: test.php

<?php
require_once "xml/SimpleDocumentParser.php";
require_once "xml/SimpleDocumentBase.php";
require_once "xml/SimpleDocumentRoot.php";
require_once "xml/SimpleDocumentNode.php";

$test = new SimpleDocumentParser();
$test->parse("test.xml");
$dom = $test->getSimpleDocument();

echo "\n\n";

// Export the whole document as an array.
echo "\n\n\n";
echo "The array of the entire XML data, returned by getSaveData():";
echo "\n\n";
print_r($dom->getSaveData());

// Add a value to the root node.
echo "\n\n\n";
echo "Use setValue() to add information to the root node, then show the resulting XML:";
echo "\n\n";
$dom->setValue("telphone", "123456789");
echo htmlspecialchars($dom->getSaveXml());

// List every product in one category.
echo "\n\n\n";
echo "Use getNode() to return the information of all products in one category:";
echo "\n\n";
$obj = $dom->getNode("cat_food");
$nodeList = $obj->getNode();
foreach ($nodeList as $node) {
    $data = $node->getValue();
    echo "Product name: " . $data['name'] . "\n";
    print_r($data);
    print_r($node->getAttribute());
}

// Look up a single product by its path.
echo "\n\n\n";
echo "Use findNodeByPath() to return the information of one product:";
echo "\n\n";
$obj = $dom->findNodeByPath("cat_food|goods_food11");
if (!is_object($obj)) {
    echo "The product does not exist";
} else {
    $data = $obj->getValue();
    echo "Product name: " . $data['name'] . "\n";
    print_r($data);
    print_r($obj->getAttribute());
}

// Add a value that carries its own attributes.
echo "\n\n\n";
echo "Use setValue() to add a value with attributes to product \"food11\", then show the result:";
echo "\n\n";
$obj = $dom->findNodeByPath("cat_food|goods_food11");
$obj->setValue("leaveword", array("value" => "这个商品不错", "attrs" => array("author" => "hahawen", "date" => date('Y-m-d'))));
echo htmlspecialchars($dom->getSaveXml());

// Change one value and delete another.
echo "\n\n\n";
echo "Use setValue()/removeValue() to change and delete values of product \"food12\", then show the result:";
echo "\n\n";
$obj = $dom->findNodeByPath("cat_food|goods_food12");
$obj->setValue("name", "new food12");
$obj->removeValue("desc");
echo htmlspecialchars($dom->getSaveXml());

// Add a new product node.
echo "\n\n\n";
echo "Use createNode() to add a product, then show the result:";
echo "\n\n";
$obj = $dom->findNodeByPath("cat_food");
$newObj = $obj->createNode("goods", array("id" => "food13"));
$newObj->setValue("name", "food13");
$newObj->setValue("price", 100);
echo htmlspecialchars($dom->getSaveXml());

// Delete a product node.
echo "\n\n\n";
echo "Use removeNode() to delete a product, then show the result:";
echo "\n\n";
$obj = $dom->findNodeByPath("cat_food");
$obj->removeNode("goods_food12");
echo htmlspecialchars($dom->getSaveXml());

?>

File: SimpleDocumentParser.php

<?php
/**
 * ================================================
 * @author hahawen(大龄青年)
 * @since 2004-12-04
 * @copyright Copyright (c) 2004, NxCoder Group
 * ================================================
 */
/**
 * class SimpleDocumentParser
 * Parses an XML file with SAX and builds a SimpleDocument object tree.
 * Everything in this package works on XML files, and the methods behave like DOM.
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentParser
{
    private $domRootObject = null;

    private $currentNO = null;        // node object currently being filled
    private $currentName = null;      // tag name of the most recently opened element
    private $currentValue = null;     // character data collected for that element
    private $currentAttribute = null; // attributes of the most recently opened element

    public function getSimpleDocument()
    {
        return $this->domRootObject;
    }

    public function parse($file)
    {
        $xmlParser = xml_parser_create();
        xml_parser_set_option($xmlParser, XML_OPTION_CASE_FOLDING, 0);
        xml_parser_set_option($xmlParser, XML_OPTION_SKIP_WHITE, 1);
        xml_parser_set_option($xmlParser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
        xml_set_object($xmlParser, $this);
        xml_set_element_handler($xmlParser, "startElement", "endElement");
        xml_set_character_data_handler($xmlParser, "characterData");

        // The whole file is passed in one call, so mark it as the final chunk.
        if (!xml_parse($xmlParser, file_get_contents($file), true))
            die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($xmlParser)),
                xml_get_current_line_number($xmlParser)));

        xml_parser_free($xmlParser);
    }

    // The handlers are public so the XML extension can invoke them as callbacks.
    public function startElement($parser, $name, $attrs)
    {
        $this->currentName = $name;
        $this->currentAttribute = $attrs;
        $this->currentValue = ''; // reset so an empty element does not inherit earlier text
        if ($this->currentNO === null) {
            $this->domRootObject = new SimpleDocumentRoot($name);
            $this->currentNO = $this->domRootObject;
        } else {
            $this->currentNO = $this->currentNO->createNode($name, $attrs);
        }
    }

    public function endElement($parser, $name)
    {
        if ($this->currentName == $name) {
            // Leaf element: store it as a value on the parent and drop its placeholder node.
            $this->currentNO = $this->currentNO->getPNodeObject();
            if (!empty($this->currentAttribute))
                $this->currentNO->setValue($name, array('value' => $this->currentValue, 'attrs' => $this->currentAttribute));
            else
                $this->currentNO->setValue($name, $this->currentValue);

            // removeNode() expects the node key ("tag_id"), not the sequence number.
            // The leaf was the last node created, so its fallback index is size - 1.
            $key = isset($this->currentAttribute['id'])
                ? $name . '_' . $this->currentAttribute['id']
                : $name . '_' . ($this->currentNO->getNodesSize() - 1);
            $this->currentNO->removeNode($key);
        } else {
            // Container element: step back up to its parent.
            $this->currentNO = (is_a($this->currentNO, 'SimpleDocumentRoot'))
                ? null
                : $this->currentNO->getPNodeObject();
        }
    }

    public function characterData($parser, $data)
    {
        // Text may arrive in several chunks, so append rather than overwrite.
        $this->currentValue .= iconv('UTF-8', 'GB2312', $data);
    }

    function __destruct()
    {
        unset($this->domRootObject);
    }
}
?>

File: SimpleDocumentBase.php

<?php
/**
 * =================================================
 * @author hahawen(大龄青年)
 * @since 2004-12-04
 * @copyright Copyright (c) 2004, NxCoder Group
 * =================================================
 */
/**
 * abstract class SimpleDocumentBase
 * Base class for XML file parsing.
 * Everything in this package works on XML files, and the methods behave like DOM.
 * 1. add/update/remove data of an XML file.
 * 2. export the data to an array.
 * 3. rebuild the XML file.
 * @package SmartWeb.common.xml
 * @abstract
 * @version 1.0
 */
abstract class SimpleDocumentBase
{
    private $nodeTag = null;
    private $attributes = array();
    private $values = array();
    private $nodes = array(); // child node key ("tag_id") => sequence number in the root's node list

    function __construct($nodeTag)
    {
        $this->nodeTag = $nodeTag;
    }

    public function getNodeTag()
    {
        return $this->nodeTag;
    }

    public function setValues($values)
    {
        $this->values = $values;
    }

    public function setValue($name, $value)
    {
        $this->values[$name] = $value;
    }

    public function getValue($name=null)
    {
        if ($name == null)
            return $this->values;
        return isset($this->values[$name]) ? $this->values[$name] : null;
    }

    public function removeValue($name)
    {
        unset($this->values["$name"]);
    }

    public function setAttributes($attributes)
    {
        $this->attributes = $attributes;
    }

    public function setAttribute($name, $value)
    {
        $this->attributes[$name] = $value;
    }

    public function getAttribute($name=null)
    {
        if ($name == null)
            return $this->attributes;
        return isset($this->attributes[$name]) ? $this->attributes[$name] : null;
    }

    public function removeAttribute($name)
    {
        unset($this->attributes["$name"]);
    }

    public function getNodesSize()
    {
        return sizeof($this->nodes);
    }

    protected function setNode($name, $nodeId)
    {
        $this->nodes[$name] = $nodeId;
    }

    public abstract function createNode($name, $attributes);

    public abstract function removeNode($name);

    public abstract function getNode($name=null);

    protected function getNodeId($name=null)
    {
        if ($name == null)
            return $this->nodes;
        return isset($this->nodes[$name]) ? $this->nodes[$name] : null;
    }

    protected function createNodeByName($rootNodeObj, $name, $attributes, $pId)
    {
        $tmpObject = $rootNodeObj->createNodeObject($pId, $name, $attributes);
        // Node key is "tag_id"; without an id, the current child count is used instead.
        $key = isset($attributes['id'])
            ? $name . '_' . $attributes['id']
            : $name . '_' . $this->getNodesSize();
        $this->setNode($key, $tmpObject->getSeq());
        return $tmpObject;
    }

    protected function removeNodeByName($rootNodeObj, $name)
    {
        $id = $this->getNodeId($name);
        if ($id === null)
            return; // unknown node name: nothing to remove
        $rootNodeObj->removeNodeById($id);
        unset($this->nodes[$name]);
    }

    protected function getNodeByName($rootNodeObj, $name=null)
    {
        if ($name == null) {
            // No name given: return all child nodes, keyed by node name.
            $tmpList = array();
            foreach ($this->getNodeId() as $key => $id)
                $tmpList[$key] = $rootNodeObj->getNodeById($id);
            return $tmpList;
        }

        $id = $this->getNodeId($name);
        if ($id === null) {
            // No exact match: fall back to the first child whose key starts with $name.
            foreach ($this->getNodeId() as $tkey => $tid) {
                if (strpos($tkey, $name) === 0) {
                    $id = $tid;
                    break;
                }
            }
        }
        return $id === null ? null : $rootNodeObj->getNodeById($id);
    }

    public function findNodeByPath($path)
    {
        $pos = strpos($path, '|');
        if ($pos === false || $pos === 0) {
            return $this->getNode($path);
        }
        // Resolve the first path segment here, then let that child resolve the rest.
        $tmpObj = $this->getNode(substr($path, 0, $pos));
        return is_object($tmpObj)
            ? $tmpObj->findNodeByPath(substr($path, $pos + 1))
            : null;
    }

    public function getSaveData()
    {
        $data = $this->values;
        if (sizeof($this->attributes) > 0)
            $data['attrs'] = $this->attributes;
        $nodeList = $this->getNode();
        if ($nodeList == null)
            return $data;
        foreach ($nodeList as $key => $node) {
            $data[$key] = $node->getSaveData();
        }
        return $data;
    }

    public function getSaveXml($level=0)
    {
        $prefixSpace = str_pad("", $level, "\t");
        $str = "$prefixSpace<$this->nodeTag";
        foreach ($this->attributes as $key => $value)
            $str .= " $key=\"$value\"";
        $str .= ">\r\n";

        // Values are written as simple child elements, optionally with attributes.
        foreach ($this->values as $key => $value) {
            $str .= "$prefixSpace\t<$key";
            if (is_array($value)) {
                foreach ($value['attrs'] as $attkey => $attvalue)
                    $str .= " $attkey=\"$attvalue\"";
                $tmpStr = $value['value'];
            } else {
                $tmpStr = $value;
            }
            $tmpStr = trim((string)$tmpStr);
            $str .= ($tmpStr === "") ? " />\r\n" : ">$tmpStr</$key>\r\n";
        }

        // Child nodes are written recursively, one level deeper.
        foreach ($this->getNode() as $node)
            $str .= $node->getSaveXml($level + 1) . "\r\n";

        $str .= "$prefixSpace</$this->nodeTag>";
        return $str;
    }

    function __destruct()
    {
        unset($this->nodes, $this->attributes, $this->values);
    }
}
?>

File: SimpleDocumentRoot.php

<?php
/**
 * ==============================================
 * @author hahawen(大龄青年)
 * @since 2004-12-04
 * @copyright Copyright (c) 2004, NxCoder Group
 * ==============================================
 */
/**
 * class SimpleDocumentRoot
 * XML root class; holds values, attributes, and subnodes.
 * Everything in this package works on XML files, and the methods behave like DOM.
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentRoot extends SimpleDocumentBase
{
    // XML declaration written in front of the document (it matches the sample output).
    private $prefixStr = '<?xml version="1.0" encoding="GB2312" ?>';
    // Flat list of every node object in the document, indexed by sequence number.
    private $nodeLists = array();

    function __construct($nodeTag)
    {
        parent::__construct($nodeTag);
    }

    public function createNodeObject($pNodeId, $name, $attributes)
    {
        $seq = sizeof($this->nodeLists);
        $tmpObject = new SimpleDocumentNode($this, $pNodeId, $name, $seq);
        $tmpObject->setAttributes($attributes);
        $this->nodeLists[$seq] = $tmpObject;
        return $tmpObject;
    }

    public function removeNodeById($id)
    {
        if ($id === null)
            return;
        unset($this->nodeLists[$id]);
    }

    public function getNodeById($id)
    {
        return isset($this->nodeLists[$id]) ? $this->nodeLists[$id] : null;
    }

    public function createNode($name, $attributes)
    {
        // -1 marks the root itself as the parent.
        return $this->createNodeByName($this, $name, $attributes, -1);
    }

    public function removeNode($name)
    {
        return $this->removeNodeByName($this, $name);
    }

    public function getNode($name=null)
    {
        return $this->getNodeByName($this, $name);
    }

    // Same signature as the parent method; the root always starts at level 0.
    public function getSaveXml($level=0)
    {
        $str = $this->prefixStr . "\r\n";
        return $str . parent::getSaveXml(0);
    }
}
?>

File: SimpleDocumentNode.php

<?php
/**
 * ===============================================
 * @author hahawen(大龄青年)
 * @since 2004-12-04
 * @copyright Copyright (c) 2004, NxCoder Group
 * ===============================================
 */
/**
 * class SimpleDocumentNode
 * XML node class; holds values, attributes, and subnodes.
 * Everything in this package works on XML files, and the methods behave like DOM.
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentNode extends SimpleDocumentBase
{
    private $seq = null;        // this node's sequence number in the root's node list
    private $rootObject = null; // the SimpleDocumentRoot that owns every node
    private $pNodeId = null;    // parent's sequence number, or -1 if the parent is the root

    function __construct($rootObject, $pNodeId, $nodeTag, $seq)
    {
        parent::__construct($nodeTag);
        $this->rootObject = $rootObject;
        $this->pNodeId = $pNodeId;
        $this->seq = $seq;
    }

    public function getPNodeObject()
    {
        return ($this->pNodeId == -1)
            ? $this->rootObject
            : $this->rootObject->getNodeById($this->pNodeId);
    }

    public function getSeq()
    {
        return $this->seq;
    }

    public function createNode($name, $attributes)
    {
        return $this->createNodeByName($this->rootObject, $name, $attributes, $this->getSeq());
    }

    public function removeNode($name)
    {
        return $this->removeNodeByName($this->rootObject, $name);
    }

    public function getNode($name=null)
    {
        return $this->getNodeByName($this->rootObject, $name);
    }
}
?>

Below is the output of running the example.


The array of the entire XML data, returned by getSaveData():


Array
(
[name] => 华联
[address] => 北京长安街-9999号
[desc] => 连锁超市
[cat_food] => Array
(
[attrs] => Array
(
[id] => food
)

        [goods_food11] => Array  
            (  
                [name] => food11  
                [price] => 12.90  
                [attrs] => Array  
                    (  
                        [id] => food11  
                    )  

            )  

        [goods_food12] => Array  
            (  
                [name] => food12  
                [price] => 22.10  
                [desc] => Array  
                    (  
                        [value] => 好东西推荐  
                        [attrs] => Array  
                            (  
                                [creator] => hahawen  
                            )  

                    )  

                [attrs] => Array  
                    (  
                        [id] => food12  
                    )  

            )  

    )  

[cat_1] => Array  
    (  
        [goods_tel21] => Array  
            (  
                [name] => tel21  
                [price] => 1290  
                [attrs] => Array  
                    (  
                        [id] => tel21  
                    )  

            )  

    )  

[cat_coat] => Array  
    (  
        [attrs] => Array  
            (  
                [id] => coat  
            )  

        [goods_coat31] => Array  
            (  
                [name] => coat31  
                [price] => 112  
                [attrs] => Array  
                    (  
                        [id] => coat31  
                    )  

            )  

        [goods_coat32] => Array  
            (  
                [name] => coat32  
                [price] => 45  
                [attrs] => Array  
                    (  
                        [id] => coat32  
                    )  

            )  

    )  

[special_hot] => Array  
    (  
        [attrs] => Array  
            (  
                [id] => hot  
            )  

        [goods_0] => Array  
            (  
                [name] => hot41  
                [price] => 99  
            )  

    )  

)


Use setValue() to add information to the root node, then show the resulting XML:


<?xml version="1.0" encoding="GB2312" ?>

华联
北京长安街-9999号
连锁超市 123456789 food11 12.90 food12 22.10 好东西推荐 tel21 1290 coat31 112 coat32 45 hot41 99

Use getNode() to return the information of all products in one category:


Product name: food11
Array
(
[name] => food11
[price] => 12.90
)
Array
(
[id] => food11
)
Product name: food12
Array
(
[name] => food12
[price] => 22.10
[desc] => Array
(
[value] => 好东西推荐
[attrs] => Array
(
[creator] => hahawen
)

    )  

)
Array
(
[id] => food12
)


Use findNodeByPath() to return the information of one product:


Product name: food11
Array
(
[name] => food11
[price] => 12.90
)
Array
(
[id] => food11
)


Use setValue() to add a value with attributes to product "food11", then show the result:


<?xml version="1.0" encoding="GB2312" ?>

华联
北京长安街-9999号
连锁超市 123456789 food11 12.90 这个商品不错 food12 22.10 好东西推荐 tel21 1290 coat31 112 coat32 45 hot41 99

Use setValue()/removeValue() to change and delete values of product "food12", then show the result:


<?xml version="1.0" encoding="GB2312" ?>

华联
北京长安街-9999号
连锁超市 123456789 food11 12.90 这个商品不错 new food12 22.10 tel21 1290 coat31 112 coat32 45 hot41 99

Use createNode() to add a product, then show the result:


<?xml version="1.0" encoding="GB2312" ?>

华联
北京长安街-9999号
连锁超市 123456789 food11 12.90 这个商品不错 new food12 22.10 food13 100 tel21 1290 coat31 112 coat32 45 hot41 99

Use removeNode() to delete a product, then show the result:


<?xml version="1.0" encoding="GB2312" ?>

华联
北京长安街-9999号
连锁超市 123456789 food11 12.90 这个商品不错 food13 100 tel21 1290 coat31 112 coat32 45 hot41 99