Comprehensive URL Access with cURL in PHP

The file_get_contents() function, especially when combined with context options, lets you make a wide variety of HTTP requests. But when you really need control over the details of your HTTP requests and responses, turn to PHP’s cURL functions. By using a powerful underlying library, libcurl, these functions give you access to all aspects of your HTTP requests and responses.

1. Retrieving URLs via GET

Accessing a URL with cURL begins by passing the URL you want to access to curl_init(). This function doesn’t immediately go out and retrieve the URL; it returns a handle, which is a variable that you pass to other functions to set options and configure how cURL should work. You can have multiple handles in different variables at the same time. Each handle controls a different request.

The curl_setopt() function controls the PHP engine’s behavior when retrieving the URL, and the curl_exec() function actually causes the request to be retrieved. Example 11-6 uses cURL to retrieve the numbersapi.com URL from Example 11-1.

Example 11-6. Retrieving a URL with cURL

<?php
$c = curl_init('http://numbersapi.com/09/27');
// Tell cURL to return the response contents as a string
// rather than printing them out immediately
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// Execute the request
$fact = curl_exec($c);
?>
Did you know that <?= $fact ?>

In Example 11-6, the call to curl_setopt() sets the CURLOPT_RETURNTRANSFER option. This tells cURL that when it makes the HTTP request, it should return the response as a string. Otherwise, it prints out the response as it is retrieved. The curl_exec() function makes the request and returns the result.
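The three steps above (create a handle, set options, execute) can be wrapped in a small function. The following is a minimal sketch, not code from the original text: the name fetch_url() is hypothetical, and throwing a RuntimeException on failure is one possible design choice among several.

```php
<?php
// Hypothetical helper (not part of the original examples): wraps the
// curl_init() / curl_setopt() / curl_exec() pattern in one function.
function fetch_url(string $url): string
{
    // Create a handle for this URL; nothing is requested yet
    $c = curl_init($url);
    // Return the response body as a string instead of printing it
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    // Actually perform the request
    $body = curl_exec($c);
    if ($body === false) {
        // cURL itself failed (for example, the host could not be resolved)
        throw new RuntimeException('cURL error: ' . curl_error($c));
    }
    return $body;
}
```

With this in place, a call such as fetch_url('http://numbersapi.com/09/27') would return the same string that Example 11-6 stores in $fact.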

Other cURL options let you set headers. Example 11-7 uses cURL functions to make the request from Example 11-4.

Example 11-7. Using cURL with query string parameters and headers

// Just key and query term, no format specified in query string
$params = array('api_key' => NDB_API_KEY,
                'q' => 'black pepper');
$url = "http://api.nal.usda.gov/ndb/search?" . http_build_query($params);
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
print curl_exec($c);

In Example 11-7, the URL is constructed in a familiar way with http_build_query(). The query string parameters are part of the URL, so they go into the URL string passed to curl_init(). The new CURLOPT_HTTPHEADER option sets the HTTP headers to be sent with the request. Its value is an array of complete header lines, so if you have multiple headers, put multiple items in this array.
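To illustrate sending more than one header, here is a minimal sketch. The helper format_headers(), the DEMO_KEY value, the api.example.com endpoint, and the User-Agent string are all placeholders invented for this illustration; only CURLOPT_HTTPHEADER and http_build_query() come from the text above.

```php
<?php
// Hypothetical helper: turn an associative array of header names and
// values into the list of "Name: value" lines CURLOPT_HTTPHEADER expects.
function format_headers(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Placeholder key and endpoint; substitute real values
$params = ['api_key' => 'DEMO_KEY', 'q' => 'black pepper'];
$url = 'http://api.example.com/search?' . http_build_query($params);

$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// Two headers, so two items in the array
curl_setopt($c, CURLOPT_HTTPHEADER, format_headers([
    'Content-Type' => 'application/json',
    'User-Agent'   => 'example-client/1.0',
]));
// Calling curl_exec($c) at this point would send the request
```

Keeping the headers in an associative array until the last moment makes it easier to add or override a single header before the request is sent.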

There are two kinds of errors to deal with from cURL requests. The first is an error from cURL itself. This could be something such as not finding the hostname, or not being able to make a connection to the remote server. If this kind of thing happens, curl_exec() returns false and curl_errno() returns an error code. The curl_error() function returns the error message that corresponds to the code.

The second kind of error is an error from the remote server. This happens if the URL you ask for isn’t found or the server has a problem producing a response to your request. cURL still considers this a successful request because the server returned something, so you need to check the HTTP response code to see if there’s a problem. The curl_getinfo() function returns an array of information about the request. One of the elements in that array is the HTTP response code.

Example 11-8 shows cURL request-making code that handles both kinds of errors.

Example 11-8. Handling errors with cURL

// A pretend API endpoint that doesn't exist
$c = curl_init('http://api.example.com');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($c);
// Get all the connection info, whether or not it succeeded
$info = curl_getinfo($c);
// Something went wrong with the connection
if ($result === false) {
    print "Error #" . curl_errno($c) . "\n";
    print "Uh-oh! cURL says: " . curl_error($c) . "\n";
}
// HTTP response codes in the 400s and 500s mean errors
else if ($info['http_code'] >= 400) {
    print "The server says HTTP error {$info['http_code']}.\n";
}
else {
    print "A successful result!\n";
}
// The request info includes timing statistics as well
print "By the way, this request took {$info['total_time']} seconds.\n";

Example 11-8 starts out with a standard cURL request. After making the request, it stores the request info from curl_getinfo() into $info. The curl_getinfo() function needs to be passed the cURL handle it should operate on, just like curl_errno() and curl_error(). This is necessary in order to return information about the correct request.

The host api.example.com doesn’t actually exist, so cURL can’t connect to it to make a request, and curl_exec() returns false. Example 11-8 prints:

Error #6

Uh-oh! cURL says: Could not resolve host: api.example.com

By the way, this request took 0.000146 seconds.

The PHP manual page about curl_errno() has a list of all the cURL error codes.

If the request made it to the server but the server returned an error, then $result is not false, but holds whatever response the server sent back. The HTTP response code is in the http_code element of the $info array. If Example 11-8 encountered an HTTP 404 error, which means that the server couldn’t find the page the request asked for, then the example would print:

The server says HTTP error 404.

By the way, this request took 0.00567 seconds.

Both outputs from the example also include the total time it took to make the request. This is another handy bit of request data in the $info array. The PHP manual page for curl_getinfo() lists all the elements of this array.
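The two-level check in Example 11-8 can be separated from the printing so it is easier to reuse. The sketch below is an illustration only: classify_response() and request_status() are hypothetical names, and the three result strings are arbitrary labels. It also uses the form of curl_getinfo() that takes a CURLINFO_* constant as a second argument and returns just that one value instead of the whole array.

```php
<?php
// Hypothetical helper: classify the outcome of a request from the value
// curl_exec() returned and the HTTP response code.
function classify_response($result, int $httpCode): string
{
    if ($result === false) {
        return 'connection-error';   // cURL itself failed
    }
    if ($httpCode >= 400) {
        return 'http-error';         // the server answered with a 4xx or 5xx code
    }
    return 'ok';
}

// Hypothetical wrapper: make a request and report which case applies
function request_status(string $url): string
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($c);
    // Ask for the response code alone rather than the full info array
    $code = curl_getinfo($c, CURLINFO_HTTP_CODE);
    return classify_response($result, $code);
}
```

Keeping the classification in a function that takes plain values means the decision logic can be exercised without making any network request.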

2. Retrieving URLs via POST

To use the POST method with cURL, adjust the settings to change the request method and supply the request body data. The CURLOPT_POST setting tells cURL you want a POST request, and the CURLOPT_POSTFIELDS setting holds the data you want to send. Example 11-9 shows how to make a POST request with cURL.

Example 11-9. Making a POST request with cURL

$url = 'http://php7.example.com/post-server.php';
// Two variables to send via POST
$form_data = array('name' => 'black pepper',
                   'smell' => 'good');
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// This should be a POST request
curl_setopt($c, CURLOPT_POST, true);
// This is the data to send
curl_setopt($c, CURLOPT_POSTFIELDS, $form_data);
print curl_exec($c);

In Example 11-9, you don’t need to set the Content-Type header or format the data you’re sending. cURL takes care of that for you.
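One detail that may matter to the receiving server: when CURLOPT_POSTFIELDS is given an array, as in Example 11-9, cURL sends the body as multipart/form-data. When it is given a URL-encoded string, cURL sends it as application/x-www-form-urlencoded, which is what a plain HTML form submits by default. The following sketch shows the string form, reusing the placeholder URL from Example 11-9.

```php
<?php
$form_data = ['name' => 'black pepper', 'smell' => 'good'];
// URL-encode the fields into a single string
$body = http_build_query($form_data);

$c = curl_init('http://php7.example.com/post-server.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_POST, true);
// A string body is sent as application/x-www-form-urlencoded
curl_setopt($c, CURLOPT_POSTFIELDS, $body);
// Calling curl_exec($c) at this point would send the request
```

Either form arrives in $_POST on a PHP server, so the choice mostly matters when the server on the other end accepts only one of the two encodings.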

However, if you want to send a different content type than regular form data, you need to do a little more work. Example 11-10 shows how to send JSON via a POST request with cURL.

Example 11-10. Sending JSON via POST with cURL

$url = 'http://php7.example.com/post-server.php';
// Two variables to send as JSON via POST
$form_data = array('name' => 'black pepper',
                   'smell' => 'good');
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// This should be a POST request
curl_setopt($c, CURLOPT_POST, true);
// This is a request containing JSON
curl_setopt($c, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
// This is the data to send, formatted appropriately
curl_setopt($c, CURLOPT_POSTFIELDS, json_encode($form_data));
print curl_exec($c);

In Example 11-10, the CURLOPT_HTTPHEADER setting tells the server that the request body is JSON, not regular form data. Then, the value of CURLOPT_POSTFIELDS is set to json_encode($form_data) so that the request body is indeed JSON.
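If you send JSON in several places, the four options from Example 11-10 can be gathered into one array and applied with curl_setopt_array(). This is a sketch under the assumption that every JSON request needs the same options; the helper name json_post_options() is hypothetical.

```php
<?php
// Hypothetical helper: collect the options for a JSON POST in one array
function json_post_options(array $data): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        // Tell the server the body is JSON
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        // Encode the data so the body really is JSON
        CURLOPT_POSTFIELDS     => json_encode($data),
    ];
}

$c = curl_init('http://php7.example.com/post-server.php');
// curl_setopt_array() applies every option in the array in one call
curl_setopt_array($c, json_post_options(['name' => 'black pepper',
                                         'smell' => 'good']));
// Calling curl_exec($c) at this point would send the request
```

Building the options as a plain array also makes it possible to inspect the encoded body before anything is sent.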

3. Using Cookies

If the response to a cURL request includes a header that sets a cookie, cURL doesn’t do anything special with that header by default. But cURL does give you a few configuration settings that let you track cookies, even across different PHP programs or executions of the same program.

Example 11-11 is a simple page that maintains a cookie, c. Each time the page is requested, the response includes a c cookie whose value is one greater than whatever value is supplied for the c cookie in the request. If no c cookie is sent, then the response sets the c cookie to 1.

Example 11-11. Simple cookie-setting server

// Use the value sent in the cookie, if any, or 0 if no cookie supplied
$value = $_COOKIE['c'] ?? 0;
// Increment the value by 1
$value++;
// Set the new cookie in the response
setcookie('c', $value);
// Tell the user what cookies we saw
print "Cookies: " . count($_COOKIE) . "\n";
foreach ($_COOKIE as $k => $v) {
    print "$k: $v\n";
}

With no additional configuration, cURL doesn’t keep track of the cookie sent back in Example 11-11. In Example 11-12, curl_exec() is called twice on the same handle, but the cookie sent back in the response to the first request is not sent on the second request.

Example 11-12. cURL’s default cookie-handling behavior

// Retrieve the cookie server page, sending no cookies
$c = curl_init('http://php7.example.com/cookie-server.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// The first time, there are no cookies
$res = curl_exec($c);
print $res;
// The second time, there are still no cookies
$res = curl_exec($c);
print $res;

Example 11-12 prints:

Cookies: 0

Cookies: 0

Both requests get a response of Cookies: 0 because cURL sent no Cookie header with the request.

Enabling cURL’s cookie jar tells it to keep track of cookies. To keep track of cookies within the lifetime of a specific cURL handle, set CURLOPT_COOKIEJAR to true, as in Example 11-13.

Example 11-13. Enabling cURL’s cookie jar

// Retrieve the cookie server page, sending no cookies
$c = curl_init('http://php7.example.com/cookie-server.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// Turn on the cookie jar
curl_setopt($c, CURLOPT_COOKIEJAR, true);
// The first time, there are no cookies
$res = curl_exec($c);
print $res;
// The second time, there are cookies from the first request
$res = curl_exec($c);
print $res;

Example 11-13 prints:

Cookies: 0

Cookies: 1

c: 1

In Example 11-13, cURL keeps track of cookies sent in response to a request as long as the handle for that cURL request exists in your program. The second time curl_exec() is called for the handle $c, the cookie set in the first response is used.
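A related option is worth a hedged note. The libcurl documentation describes CURLOPT_COOKIEJAR as the name of a file to write cookies to, so passing a value that is not a filename may leave a stray file behind when the handle is closed. The same documentation states that setting CURLOPT_COOKIEFILE to an empty string turns on cookie handling without reading any file. The following is a sketch of that alternative, not the approach the example above takes.

```php
<?php
$c = curl_init('http://php7.example.com/cookie-server.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// An empty filename enables cookie handling in memory only:
// nothing is read from disk and no jar file is named
$enabled = curl_setopt($c, CURLOPT_COOKIEFILE, '');
// Repeated curl_exec($c) calls on this handle would now send back
// any cookies the server set in earlier responses
```

As with Example 11-13, cookies tracked this way last only as long as the handle does.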

In this mode, the cookie jar only tracks cookies within a handle. Changing the value of CURLOPT_COOKIEJAR to a filename tells cURL to write the cookie values to that file. Then you can also provide that filename as the value for CURLOPT_COOKIEFILE. Before sending a request, cURL reads in any cookies from the CURLOPT_COOKIEFILE file and uses them in subsequent requests. Example 11-14 shows the cookie jar and cookie file in action.

Example 11-14. Tracking cookies across requests

// Retrieve the cookie server page
$c = curl_init('http://php7.example.com/cookie-server.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// Save cookies to a 'saved.cookies' file in the same directory
// as this program
curl_setopt($c, CURLOPT_COOKIEJAR, __DIR__ . '/saved.cookies');
// Load cookies (if any have been previously saved) from the
// 'saved.cookies' file in this directory
curl_setopt($c, CURLOPT_COOKIEFILE, __DIR__ . '/saved.cookies');
// This request includes cookies from the file (if any)
$res = curl_exec($c);
print $res;

The first time Example 11-14 is run, it prints:

Cookies: 0

The second time Example 11-14 is run, it prints:

Cookies: 1

c: 1

The third time Example 11-14 is run, it prints:

Cookies: 1

c: 2

And so forth. Each time the program runs, it looks for a saved.cookies file, loads up any cookies stored in the file, and uses them for the request. After the request, it saves any cookies back to the same file because the CURLOPT_COOKIEFILE setting has the same value as the CURLOPT_COOKIEJAR setting. This updates the saved cookies file so it’s ready with the new value the next time the program runs.

If you’re writing a program that mimics a user logging in with a web browser and then making requests, the cookie jar is a very convenient way to have all of the server-sent cookies accompany the requests cURL makes.
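A login flow of that kind might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the function names session_handle() and login_and_fetch(), and whatever URLs and form field names you pass in, would have to match the real site. The sketch logs in with a POST, then reuses the same handle for a GET so the cookies from the login response go along with it.

```php
<?php
// Hypothetical helper: create a handle whose cookies are loaded from
// and saved to $cookieFile
function session_handle(string $url, string $cookieFile)
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_COOKIEJAR, $cookieFile);
    curl_setopt($c, CURLOPT_COOKIEFILE, $cookieFile);
    return $c;
}

// Hypothetical login flow: POST the credentials, then fetch a page
// that requires the session cookie. Returns the page body, or false
// if cURL could not complete a request.
function login_and_fetch(string $loginUrl, array $credentials,
                         string $pageUrl, string $cookieFile)
{
    $c = session_handle($loginUrl, $cookieFile);
    curl_setopt($c, CURLOPT_POST, true);
    curl_setopt($c, CURLOPT_POSTFIELDS, $credentials);
    if (curl_exec($c) === false) {
        return false;
    }
    // Reuse the handle: switch back to GET and point it at the new URL
    curl_setopt($c, CURLOPT_HTTPGET, true);
    curl_setopt($c, CURLOPT_URL, $pageUrl);
    return curl_exec($c);
}
```

A real program would also check the HTTP response code after each step, as in Example 11-8, since a rejected login usually comes back as an ordinary response rather than a cURL error.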

Source: Sklar David (2016), Learning PHP: A Gentle Introduction to the Web’s Most Popular Language, O’Reilly Media; 1st edition.
