Retrieving XML With Curl and SimpleXML

Introduction

PHP 5 introduced SimpleXML, and it's a fitting name, because parsing XML data with it is truly simple. In this tutorial we'll use curl to retrieve XML data from a remote web server. We're going to create a class that connects to the remote server and sends it POST data; based on that POST data, the remote server returns valid XML. The class then parses the XML response and returns an array containing the data. For the sake of simplicity, this tutorial doesn't go into detail on how the remote server generates its XML responses.
The first thing we're going to cover is the simple script used to call the class.
<?php
$URL = 'http://www.example.com/XML';
$request = 'getInventory';
$parameters = array("param1" => "value1", "param2" => "value2");

$XMLClass = new getXML;
$response = $XMLClass->pullXML($URL, $request, $parameters);
 ?>
This simple script is all that is needed to use the class. We set the URL, the request, and the parameters to pass with the request. The line $XMLClass = new getXML creates an instance of the getXML class, which is discussed later in the tutorial. The response data for the request is stored in $response. For this example, the remote server returns an XML document whose root element contains two DATA elements, each carrying the attributes attribA and attribB.

The output of var_dump($response) would be:
array(2) {
  [0]=>
  array(2) {
    ["attribA"]=>
    string(6) "valueA"
    ["attribB"]=>
    string(6) "valueB"
  }
  [1]=>
  array(2) {
    ["attribA"]=>
    string(6) "valueC"
    ["attribB"]=>
    string(6) "valueD"
  }
}
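To make the example concrete, here is a minimal sketch of a response that would produce the dump above, parsed directly with SimpleXML. Only the DATA element name and the attribute names and values come from this tutorial; the root element name RESPONSE and the variable names are assumptions made for illustration.

```php
<?php
// Hypothetical response body. The root element name "RESPONSE" is an
// assumption; DATA, attribA and attribB are taken from the tutorial.
$xml = <<<SAMPLE
<RESPONSE>
    <DATA attribA="valueA" attribB="valueB" />
    <DATA attribA="valueC" attribB="valueD" />
</RESPONSE>
SAMPLE;

$doc = simplexml_load_string($xml);

$rows = array();
foreach ($doc->DATA as $row) {
    $attributes = array();
    foreach ($row->attributes() as $name => $value) {
        // Each attribute is a SimpleXMLElement, so cast it to a plain string.
        $attributes[$name] = (string) $value;
    }
    $rows[] = $attributes;
}

var_dump($rows);
```

The string cast matters: without it, the array would hold SimpleXMLElement objects rather than the strings shown in the dump.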

Retrieving Data

Now for the meat: building a simple curl class to handle retrieving the data. The PHP manual has more information on using curl in PHP.
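<?php
class getData {
   public $URL;
   public $XMLRequest;
   public $XMLResponseRaw;
   public $parameters;
   public $XPath;

   // Turn $this->parameters into a urlencoded POST body such as "a=1&b=2".
   // Returns -1 if there is nothing to send.
   function buildCurlParamString() {
       $urlstring = '';

       foreach ($this->parameters as $key => $value) {
           $urlstring .= urlencode($key).'='.urlencode($value).'&';
       }

       if (trim($urlstring) != '') {
           // Strip the trailing "&" left over from the loop.
           $urlstring = rtrim($urlstring, '&');
           return ($urlstring);
       } else {
           return (-1);
       }
   }

   // The class body continues in the next listing.
 ?>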
In this first part we declare a few variables for use in the class. Our first function, buildCurlParamString(), takes the class variable $parameters, an associative array in which each key is a field name and each value is that field's value, and turns it into the query string for our POST request to the remote server. We'll set the parameters array later; for now, just know that it stores the names and values of the fields to post. We loop through the array, urlencoding each key and value. The urlencode function escapes our data so it can be transmitted safely to the server hosting the XML feed.
Then we trim off the trailing & left by the loop. If we end up with an empty string, for example because no parameters were set, buildCurlParamString() returns -1.
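As a quick check of what the loop produces, the sketch below rewrites it as a standalone function and compares the result with PHP's built-in http_build_query(), which yields the same string for a flat array of parameters. The function name buildParamString and the sample values are hypothetical and not part of the class above.

```php
<?php
// Hypothetical parameters, including characters that need encoding.
$parameters = array('param1' => 'value 1', 'param2' => 'a&b=c');

// The loop from buildCurlParamString(), as a standalone function.
function buildParamString(array $parameters)
{
    $pairs = array();
    foreach ($parameters as $key => $value) {
        $pairs[] = urlencode($key) . '=' . urlencode($value);
    }
    // Joining with implode() avoids having to trim a trailing "&".
    return implode('&', $pairs);
}

$manual  = buildParamString($parameters);
// Pass the separator explicitly so the result does not depend on php.ini.
$builtIn = http_build_query($parameters, '', '&');

echo $manual, "\n";
echo $builtIn, "\n";
```

Both lines print param1=value+1&param2=a%26b%3Dc. The space becomes a plus sign and the & and = inside the value are percent-encoded, so they cannot be mistaken for field separators.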

Using curl

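<?php
   // POST the parameter string to the remote server and return the raw response.
   // Returns -1 if the parameters can't be encoded or the transfer fails.
   function curlRequest() {
       $urlstring = $this->buildCurlParamString();

       if ($urlstring === -1) {
           echo "Couldn't Build Parameter String<br />"."\n";
           return(-1);
       }

       $ch = curl_init();
       // The request name is appended directly to the base URL.
       curl_setopt($ch, CURLOPT_URL, $this->URL.$this->XMLRequest);
       curl_setopt($ch, CURLOPT_TIMEOUT, 180);
       curl_setopt($ch, CURLOPT_HEADER, 0);
       curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
       curl_setopt($ch, CURLOPT_POST, 1);
       curl_setopt($ch, CURLOPT_POSTFIELDS, $urlstring);
       $data = curl_exec($ch);
       curl_close($ch);

       // curl_exec() returns false when the transfer fails.
       if ($data === false) {
           return(-1);
       }

       return($data);
   }
 ?>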
This function handles the actual pulling of the data from the remote server with curl. We start by calling buildCurlParamString() to get our POST data ready. Then we use curl_init() to initialize the curl handle and curl_setopt() to configure it: CURLOPT_RETURNTRANSFER makes curl_exec() return the response as a string instead of printing it, and CURLOPT_POST with CURLOPT_POSTFIELDS sends our parameter string as the POST body. Notice we're using the class variables $this->URL and $this->XMLRequest; those will be set when we initialize our class and retrieve data. The data is pulled and returned to the calling function.
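A failed transfer is easy to miss, because curl_exec() simply returns false. The following is a sketch of one way to surface such failures, assuming exceptions are acceptable in your code. The helper name postForXml and the decision to treat HTTP status codes of 400 and above as errors are assumptions, not part of the tutorial's class.

```php
<?php
// Hypothetical helper, not part of the tutorial's class.
function postForXml($url, $postBody, $timeoutSeconds = 180)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postBody);

    $data = curl_exec($ch);
    if ($data === false) {
        // curl_exec() returns false on failure; curl_error() says why.
        $message = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('curl request failed: ' . $message);
    }

    // Read the HTTP status before closing the handle.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($status >= 400) {
        throw new RuntimeException('remote server returned HTTP ' . $status);
    }

    return $data;
}
```

A caller would wrap postForXml($URL . $request, $urlstring) in a try/catch block and decide there whether to retry, log the message, or give up.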
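<?php
   // Fetch the feed and keep the raw response for the subclass to parse.
   function getFeed() {
       $rawData = $this->curlRequest();

       if ($rawData !== -1) {
           $this->XMLResponseRaw = $rawData;
       }
   }
} // end of class getData
 ?>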
The last function in the curl class is the one that gets called from outside the class. Notice that it stores the data in a class variable; that's because we're going to extend this class with our SimpleXML class.

Finally SimpleXML

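<?php
class getXML extends getData {
   public $simpleXML;
   public $XMLXPath;

   // Fetch the XML from the remote server and return the parsed rows.
   function pullXML($URL, $request, $parameters) {
       $this->URL = $URL;
       $this->XMLRequest = $request;
       $this->parameters = $parameters;
       $this->getFeed();
       // simplexml_load_string() returns false if the response is not valid XML.
       $this->simpleXML = simplexml_load_string((string) $this->XMLResponseRaw);
       return ($this->parseXPath());
   }
 ?>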
This is the function your scripts call to get the data. You pass pullXML the URL of the site, the request, and the parameters for the request. It returns the value of our next function, parseXPath, which applies any XPath statement we might want to filter our results with.
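<?php
   // Apply the optional XPath filter and return the parsed rows,
   // FALSE if the XML could not be loaded, or -1 if there are no rows.
   function parseXPath() {
       if ($this->simpleXML === false) {
           return(false);
       }

       if ($this->XPath != '') {
           $this->XMLXPath = $this->simpleXML->xpath($this->XPath);
           if (isset($this->XMLXPath[0])) {
               $XMLParse = $this->parseSimpleXMLData($this->XMLXPath);
           } else {
               $XMLParse = -1;
           }
           return($XMLParse);
       }

       $XMLParse = $this->parseSimpleXMLData($this->simpleXML->DATA);
       if (count($XMLParse) > 0) {
           return($XMLParse);
       } else {
           return(-1);
       }
   }

   // Copy the attributes of each row into a plain array of strings.
   function parseSimpleXMLData($data) {
       $XMLParse = array();
       $i = 0;
       while (isset($data[$i])) {
           foreach ($data[$i]->attributes() as $attrib => $value) {
               // Cast the SimpleXMLElement attribute to a string.
               $XMLParse[$i][$attrib] = (string) $value;
           }
           $i++;
       }

       return($XMLParse);
   }
} // end of class getXML
 ?>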
In parseXPath we first check whether we have a valid SimpleXML object; if the XML could not be loaded, $this->simpleXML will equal FALSE, so we return FALSE. Then, if an XPath is set, we run the xpath function of SimpleXML on our object with $this->simpleXML->xpath($this->XPath). This lets SimpleXML do the work of filtering our dataset. Next we pass the data into parseSimpleXMLData, which steps through each row of the dataset with the while loop.
You'll notice that as you work your way through a SimpleXML object, the nesting uses the same names as the tags. For lists where each record has the same tag, the records can be accessed by numeric index, like an indexed array.
The foreach loop takes the attributes of the row and stores them in the $XMLParse array. If there aren't any rows in the dataset, we set $XMLParse to -1. The array $XMLParse is what is returned from the function. If there isn't an XPath statement, we do the same thing without the filtering step.
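The filtering step described above can be sketched on its own. The example below runs an XPath query against a small in-memory document and copies the attributes of each match into an array. The root element name RESPONSE and the XPath expression are assumptions chosen for illustration.

```php
<?php
// Hypothetical document; the root element name "RESPONSE" is an assumption.
$doc = simplexml_load_string(
    '<RESPONSE>'
    . '<DATA attribA="valueA" attribB="valueB" />'
    . '<DATA attribA="valueC" attribB="valueD" />'
    . '</RESPONSE>'
);

// Select only the DATA rows whose attribA equals "valueC".
$matches = $doc->xpath('//DATA[@attribA="valueC"]');

$rows = array();
foreach ($matches as $row) {
    $attributes = array();
    foreach ($row->attributes() as $name => $value) {
        // Cast each attribute to a plain string.
        $attributes[$name] = (string) $value;
    }
    $rows[] = $attributes;
}

print_r($rows);
```

Here $rows ends up with a single entry holding attribA "valueC" and attribB "valueD". With the class above, the same filter would be applied by setting $XMLClass->XPath before calling pullXML.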

SimpleXML Wrapup

simplexml_load_string($string) -- loads XML from a string variable
simplexml_load_file($filename) -- loads XML from the specified file
Once the data is loaded, child nodes are exposed as pseudo class variables and can be accessed via $simpleXML->child.
$simpleXML->xpath("XPath Statement") -- runs an XPath query by calling the xpath method on the SimpleXML element and returns the matching elements
$simpleXML->children() -- returns the child nodes of the element
$simpleXML->attributes() -- returns the attributes of the element it is called on
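The calls in this wrapup can be tried together in a few lines. The document below, with its catalog, item and note elements, is invented purely for illustration.

```php
<?php
// Hypothetical document used only to exercise the calls listed above.
$doc = simplexml_load_string(
    '<catalog><item id="1">First</item><note lang="en">Hi</note></catalog>'
);

// children() yields the child elements; getName() gives each tag name.
$names = array();
foreach ($doc->children() as $child) {
    $names[] = $child->getName();
}

// A child is reached by tag name and cast to a string for its text.
$firstItemText = (string) $doc->item;

// A single attribute can be read with array syntax on the element.
$firstItemId = (string) $doc->item['id'];

// attributes() yields every attribute of the element it is called on.
$noteAttributes = array();
foreach ($doc->note->attributes() as $name => $value) {
    $noteAttributes[$name] = (string) $value;
}

print_r($names);
echo $firstItemText, ' ', $firstItemId, "\n";
print_r($noteAttributes);
```

Casting to string is what turns SimpleXML's element and attribute objects into ordinary values you can store or compare.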



