Zebra_cURL, a high performance PHP cURL library


Zebra_cURL is a high performance PHP library that wraps PHP’s cURL extension (libcurl). It runs multiple requests asynchronously, in parallel, and lets each one be processed as soon as its thread finishes, without having to wait for the other threads in the queue.

Each time a request completes, another one is added to the queue. This keeps a constant number of threads running at all times and eliminates wasted CPU cycles from busy waiting. The result is a faster and more efficient way of processing large quantities of cURL requests (like fetching thousands of RSS feeds at once), drastically reducing processing time.
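To make the idea of a constant-size queue more concrete, here is a minimal sketch written against PHP’s plain curl_multi_* functions. It is not Zebra_cURL’s actual code: the function name fetch_rolling, its parameters and the 30-second timeout are assumptions made for illustration only.

```php
<?php

// Hypothetical helper, NOT part of Zebra_cURL: fetches $urls while keeping
// at most $window transfers in flight, and calls $onDone($url, $body, $info)
// as soon as each transfer finishes. Returns the number of finished transfers.
function fetch_rolling(array $urls, $window, $onDone)
{
    $multi   = curl_multi_init();
    $pending = array_values($urls);
    $active  = 0; // transfers currently attached to the multi handle
    $done    = 0;

    // attach the next queued URL, if there is one
    $addNext = function () use (&$pending, &$active, $multi) {
        if (!$pending) {
            return;
        }
        $handle = curl_init(array_shift($pending));
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handle, CURLOPT_TIMEOUT, 30); // assumed value
        curl_multi_add_handle($multi, $handle);
        $active++;
    };

    // fill the window
    for ($i = 0; $i < $window; $i++) {
        $addNext();
    }

    while ($active > 0) {
        curl_multi_exec($multi, $running);

        // process every transfer that has finished so far
        while ($message = curl_multi_info_read($multi)) {
            $handle = $message['handle'];
            $onDone(
                curl_getinfo($handle, CURLINFO_EFFECTIVE_URL),
                curl_multi_getcontent($handle),
                curl_getinfo($handle)
            );
            curl_multi_remove_handle($multi, $handle);
            $active--;
            $done++;

            // a slot is free again: top the window back up
            $addNext();
        }

        // wait for network activity instead of spinning in a busy loop
        if ($active > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000);
        }
    }

    curl_multi_close($multi);

    return $done;
}
```

Calling fetch_rolling($urls, 10, 'callback') would keep ten transfers running until the list is exhausted. Waiting in curl_multi_select() rather than looping on curl_multi_exec() is what avoids the busy waiting mentioned above.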

The library supports GET (with caching) and POST requests, basic downloads as well as downloads from FTP servers, HTTP authentication, and requests through proxy servers.

For maximum efficiency, downloads are streamed (downloaded bytes are written directly to disk), sparing the server the unnecessary strain of reading files into memory first and then writing them to disk.
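As an illustration of what streaming a download means at the cURL level, the sketch below hands cURL an open file pointer through CURLOPT_FILE, so the response body never has to sit in a PHP string. It uses only PHP’s standard cURL functions; the helper name stream_download is hypothetical and this is not necessarily how Zebra_cURL implements its download method.

```php
<?php

// Hypothetical helper, NOT part of Zebra_cURL: streams $url into $destination.
// Returns TRUE on success, FALSE on failure.
function stream_download($url, $destination)
{
    $file = fopen($destination, 'wb');
    if ($file === false) {
        return false;
    }

    $handle = curl_init($url);

    // write the received bytes straight to the file pointer
    // instead of returning them as a string
    curl_setopt($handle, CURLOPT_FILE, $file);

    $ok = curl_exec($handle);

    fclose($file);

    // do not leave a partial file behind if the transfer failed
    if ($ok === false) {
        unlink($destination);
    }

    return $ok !== false;
}
```

Memory use stays roughly constant regardless of the size of the downloaded file, because cURL hands over the data in small chunks as it arrives.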

Zebra_cURL requires the PHP cURL extension to be enabled.

The code is heavily commented and generates no warnings/errors/notices when PHP’s error reporting level is set to E_ALL.


Features review

  • supports GET (with caching) and POST requests, basic downloads as well as downloads from FTP servers, HTTP authentication, and requests through proxy servers
  • runs multiple requests asynchronously, in parallel, and lets each one be processed as soon as its thread finishes, without having to wait for the other threads in the queue
  • downloads are streamed (downloaded bytes are written directly to disk), sparing the server the unnecessary strain of reading files into memory first and then writing them to disk
  • provides very detailed information about the requests made
  • has comprehensive documentation
  • code is heavily commented and generates no warnings/errors/notices when PHP’s error reporting level is set to E_ALL


Requirements

PHP 5.0.2+ with the cURL extension installed


Installation

Download the latest version, unpack it, and put it in a place accessible to your scripts.


How to use

Fetch RSS feeds

<?php

function callback($result) {

    // remember, the "body" property of $result is run through
    // "htmlentities()", so you may need to "html_entity_decode" it

    // show information about the request
    echo '<pre>';
    print_r($result->info);
    echo '</pre>';

}

require 'path/to/Zebra_cURL.php';

// instantiate the Zebra_cURL class
$curl = new Zebra_cURL();

// cache results for 60 seconds in the "cache" folder
$curl->cache('cache', 60);

// get RSS feeds of some popular tech websites
$curl->get(array(
    'http://rss1.smashingmagazine.com/feed/',
    'http://allthingsd.com/feed/',
    'http://feeds.feedburner.com/nettuts',
    'http://www.webmonkey.com/feed/',
    'http://feeds.feedburner.com/alistapart/main',
), 'callback');

?>

Twitter Search

<?php

function callback($result) {

    // results from twitter are json-encoded;
    // remember, the "body" property of $result is run through
    // "htmlentities()" so we need to "html_entity_decode" it
    $result->body = json_decode(html_entity_decode($result->body));

    // show everything
    echo '<pre>';
    print_r($result);
    echo '</pre>';

}

// include the library
require 'path/to/Zebra_cURL.php';

// instantiate the Zebra_cURL class
$curl = new Zebra_cURL();

// cache results for 60 seconds in the "cache" folder
$curl->cache('cache', 60);

// search twitter for the "jquery" hashtag
$curl->get('http://search.twitter.com/search.json?q=' . urlencode('#jquery'), 'callback');

?>

Download an image

<?php

// include the library
require 'path/to/Zebra_cURL.php';

// instantiate the Zebra_cURL class
$curl = new Zebra_cURL();

// download one of the official twitter images to the "cache" folder
$curl->download('https://abs.twimg.com/a/1362101114/images/resources/twitter-bird-callout.png', 'cache');

?>


Download

version 1.2.1

Zebra_cURL is distributed under the LGPL.

In plain English, this means that you have the right to view and to modify the source code of this software, but if you modify and distribute it, you are required to license your copy under an LGPL-compatible license, and to make the entire source code of your derivation available to anybody you distribute the software to.

You also have the right to use this software together with software that has different licensing terms (including, but not limited to, commercial and closed-source software), and to distribute the combined software, as long as you state that your software contains portions licensed under the LGPL license, and provide information about where the LGPL-licensed software can be downloaded.

If you distribute copies of this software you may not change the copyright or license of this software.




Documentation

Read the comprehensive documentation.

Changelog


version 1.2.1 (November 12, 2014)
  • fixed an issue that appeared since PHP 5.3.0 where, because of how htmlentities has changed since that version, the body of a fetched page would be an empty string if the output contained invalid code unit sequences within the given encoding (utf-8 in our case);
  • fixed an issue in composer.json due to which the class was not registered for autoloading after installation; the library now also explicitly requires lib-curl; thanks to Igor Denisenko
  • fixed some documentation issues; thanks to Igor Denisenko
version 1.1.0 (June 26, 2014)
  • fixed a bug where the “post” method was not working with callback functions
  • added a workaround for PHP bug: https://bugs.php.net/bug.php?id=61141; thanks to Syed I.R
  • custom arguments can now be passed to the callback functions
  • callback functions may now return FALSE instructing the library to not cache the respective request; this makes it easy to retry failed requests without having to clear all cache;
  • added an example for FTP download
version 1.0.2 (August 29, 2013)
  • fixed a bug where the “type” argument of the “http_authentication” method could not be changed; thanks apmolsa;
  • fixed a bug where the “chmod” argument of the “cache” method could not be changed; thanks apmolsa;
  • fixed a bug where PHP’s htmlentities() function was accidentally being run on the response body of downloads;
  • the constructor now takes one argument specifying whether the response body should be run through PHP’s htmlentities() function;
version 1.0.1 (May 30, 2013)
  • fixed a bug where in PHP 5.2.7+ the library was triggering a fatal error because I was using func_num_args() as an argument to another function;
  • the project is now also available on GitHub and as a package for Composer
version 1.0 (March 02, 2013)
  • initial release;
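
The htmlentities behaviour behind the first fix in version 1.2.1 can be reproduced in isolation. The sketch below is only an illustration of the symptom and of one common remedy, the ENT_SUBSTITUTE flag (available from PHP 5.4); it is an assumption that this resembles the library’s fix, which is not described in the changelog. The sample string is made up.

```php
<?php

// "\xE9" is "é" in ISO-8859-1; on its own it is NOT a valid UTF-8 sequence
$body = "caf\xE9 <b>menu</b>";

// with neither ENT_IGNORE nor ENT_SUBSTITUTE set, a single invalid
// sequence makes htmlentities() return an empty string
$strict = htmlentities($body, ENT_QUOTES, 'UTF-8');

// ENT_SUBSTITUTE replaces each invalid sequence with U+FFFD
// and converts the rest of the string as usual
$lenient = htmlentities($body, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === ''); // bool(true)
var_dump($lenient);
```

Another option, when the real encoding of the response is known, is to convert the body to UTF-8 before running it through htmlentities().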


33 responses to “Zebra_cURL, a high performance PHP cURL library”

  • Nate, 2014-06-05, 00:02

    Hi, I followed the sample for doing a post from the documentation and I keep getting:

    Fatal error: Callback function "Array" does not exist! in /homepages/1/d146197026/htdocs/HDPhim/Zebra_cURL.php on line 1509

    Any idea why?

  • Ron, 2014-07-22, 16:34

    Great work on the script.
    Could you give a quick example of how to authenticate to a website and, once logged in, make several get requests?

    I tried using the http_authentication function but it didn’t seem to work. I must be missing something.

    Thanks,
    Ron

  • Tony, 2014-10-13, 02:59

    Dumb question: I would like to know how to simplexml_load_file the cached file.

    • Tony, 2014-10-13, 08:48

      Ok, I figured out the issue with simplexml_load_file.

      $var = unserialize (gzuncompress(file_get_contents(ZEBRACURL . 'cache/' . $Md5name)));
      $NewVar = simplexml_load_string(html_entity_decode ($var->body));

      However, it would be nice if cached files were stored in such a manner as to limit the number of files per directory. Right now, all filenames use their md5 value and are placed in one common folder .. which is great until there are like 100,000+ files in one folder. I imagine performance would suffer.

      eAccelerator has a nice and fast directory / file placement structure. You create a directory based on the first character of the file name, create a sub-directory based on the second character of the filename, and so forth. After creating directories 5 or 6 levels deep, the file would then be placed.

  • Tony, 2014-10-13, 08:50
    filename => 1dd51d2133952cea5e8202b4360aa39d
    directory => cache/1/d/d/5/1/1dd51d2133952cea5e8202b4360aa39d
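
The directory layout suggested in the comment above can be sketched in a few lines of PHP. This is purely illustrative: Zebra_cURL does not offer such an option as far as this page shows, and the function name sharded_cache_path and its default depth of 5 are assumptions.

```php
<?php

// Hypothetical helper, NOT part of Zebra_cURL: builds a nested path from the
// first $depth characters of $filename (one directory level per character).
// $depth is assumed to be at least 1.
function sharded_cache_path($base, $filename, $depth = 5)
{
    $levels = str_split(substr($filename, 0, $depth));

    return rtrim($base, '/') . '/' . implode('/', $levels) . '/' . $filename;
}

echo sharded_cache_path('cache', '1dd51d2133952cea5e8202b4360aa39d');
// cache/1/d/d/5/1/1dd51d2133952cea5e8202b4360aa39d
```

Before writing a file to such a path, the directories would have to be created first, for example with mkdir(dirname($path), 0755, true). With md5 file names each level has at most 16 sub-directories, so a depth of 5 spreads the files over up to 16^5 (about a million) leaf folders.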