
Parsing the HTTP Link header

Writing an HTTP Link header parser package to replace a dependency that was suddenly removed.

Suddenly missing a dependency

Recently, someone opened an issue on one of the packages I maintain, reporting that a dependency had been removed and that my package could no longer be installed as a result. To my surprise, the dependency had indeed been removed completely, and even its repository was gone. My package used it to parse the HTTP Link header for handling pagination in the API. The author probably had their reasons for removing it (or did so accidentally), but it was still a surprise to see it disappear without any warning.

Tomas Votruba wrote a very interesting article about handling obsolete packages, which I recommend reading if you want to know how to handle such a scenario properly: How to Deprecate PHP Package Without Leaving Anyone Behind.

Developing the package

As the original package had disappeared and I could not find a widely used modern alternative, I decided to write my own package. I started by reading up on the HTTP Link header.

The HTTP Link header

The HTTP Link header provides links to related resources. In APIs it is mostly used for pagination, where it typically contains links to the next and previous pages of results. The Link header has a specific format (defined in RFC 8288, which obsoletes RFC 5988) but can still be a bit tricky to parse.

The format of the Link header is described in the MDN Web Docs as:

Link: <uri-reference>; param1=value1; param2="value2"

In addition to that, the Link header can contain multiple comma-separated links. Each link consists of a URI enclosed in angle brackets, followed by a semicolon-separated list of parameters. The parameters use the well-known key-value format, where the value is either given directly or wrapped in double quotes.

For example:

Link: <https://api.example.com/issues?page=2>; rel="prev", <https://api.example.com/issues?page=4>; rel="next", <https://api.example.com/issues?page=10>; rel="last", <https://api.example.com/issues?page=1>; rel="first"
Link: </style.css>; rel=preload; as=style; fetchpriority="high"

With this knowledge, I started to write the package.

Models

When working with the HTTP Link header, we can define two separate models:

  • LinkHeader: Represents the entire Link header and could contain multiple Link objects. This resulted in the LinkHeader class.
  • Link: A single link in the Link header, which contains the URI, relation type, and any parameters when present. This resulted in the Link class.
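The two models can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the class names SketchLink and SketchLinkHeader and the relation and parameters properties are hypothetical, and it assumes PHP 8.1 or newer for readonly properties. Only the uri property and the getLink method mirror the usage shown later in this post.

```php
<?php

// Hypothetical model of a single link: URI, relation type and extra parameters.
final class SketchLink
{
    /** @param array<string, string> $parameters */
    public function __construct(
        public readonly string $uri,
        public readonly ?string $relation = null,
        public readonly array $parameters = [],
    ) {
    }
}

// Hypothetical model of the whole header: a collection of links.
final class SketchLinkHeader
{
    /** @param list<SketchLink> $links */
    public function __construct(private readonly array $links = [])
    {
    }

    // Returns the first link with the given relation type, or null if none matches.
    public function getLink(string $relation): ?SketchLink
    {
        foreach ($this->links as $link) {
            if ($link->relation === $relation) {
                return $link;
            }
        }

        return null;
    }
}
```

Returning null for a missing relation is a choice made for this sketch; the real package may behave differently.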

Parsing

The parsing of the Link header is a bit more of a challenge. The goal of the package is to correctly parse the components of the header.

The LinkHeaderFactory parses the raw Link header string and creates a LinkHeader object. The first step is to remove the Link: prefix if it is present, so that we only have to deal with a single format from then on. The next step is to split the header string into individual links. Splitting on commas alone is not enough, because a comma can also be part of a quoted string. The split method therefore implements a delimiter-aware string splitter that takes quoted substrings and escape sequences into account.
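To illustrate the idea, here is a minimal sketch of stripping the prefix and splitting with awareness of quotes. The function names stripLinkPrefix and splitOutsideQuotes are hypothetical and this is not the package's implementation; it only assumes that quoted strings use double quotes and that a backslash escapes the next character inside them.

```php
<?php

// Hypothetical helper: removes a leading "Link:" prefix (case-insensitive), if present.
function stripLinkPrefix(string $header): string
{
    return preg_replace('/^\s*Link:\s*/i', '', $header) ?? $header;
}

/**
 * Hypothetical helper: splits $input on $delimiter, but ignores delimiters
 * that appear inside double-quoted strings.
 *
 * @return list<string>
 */
function splitOutsideQuotes(string $input, string $delimiter): array
{
    $parts = [];
    $current = '';
    $inQuotes = false;
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        // Inside quotes, a backslash escapes the next character: copy both untouched.
        if ($inQuotes && $char === '\\' && $i + 1 < $length) {
            $current .= $char . $input[$i + 1];
            $i++;
            continue;
        }

        if ($char === '"') {
            $inQuotes = !$inQuotes;
        }

        // Only a delimiter outside quotes ends the current part.
        if ($char === $delimiter && !$inQuotes) {
            $parts[] = trim($current);
            $current = '';
            continue;
        }

        $current .= $char;
    }

    $parts[] = trim($current);

    // Drop empty parts, such as those caused by a trailing delimiter.
    return array_values(array_filter($parts, static fn (string $part): bool => $part !== ''));
}
```

This sketch does not treat the angle-bracketed URI specially, so a comma or semicolon inside the URI itself would still cause a split.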

Now that we have the individual link strings, we can parse each link to extract the URI, relation type, and parameters. The split method is used again, this time with a semicolon as the delimiter, to split the link string into its components. The first component is the URI, which is enclosed in angle brackets. The remaining components are parameters, which we can parse into key-value pairs.
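The last step, turning the components of one link into a URI and key-value pairs, could look something like this. It is a sketch under a few assumptions: the function name parseLinkComponents is hypothetical, the components have already been split on semicolons, PHP 8.0 or newer is available (for str_starts_with and str_ends_with), and lowercasing parameter names is a choice made for this example rather than a statement about the package.

```php
<?php

/**
 * Hypothetical helper: turns the already-split components of one link
 * into its URI and a map of parameters.
 *
 * @param list<string> $components e.g. ['<https://example.com>', 'rel="next"']
 * @return array{uri: string, parameters: array<string, string>}
 */
function parseLinkComponents(array $components): array
{
    // The first component is the URI, enclosed in angle brackets.
    $target = trim((string) array_shift($components));

    if (strlen($target) < 2 || !str_starts_with($target, '<') || !str_ends_with($target, '>')) {
        throw new InvalidArgumentException('The URI must be enclosed in angle brackets.');
    }

    $parameters = [];

    foreach ($components as $component) {
        // Split on the first "=" only; a parameter without a value gets an empty string.
        [$name, $value] = array_pad(explode('=', $component, 2), 2, '');
        $name = strtolower(trim($name));
        $value = trim($value);

        if (strlen($value) >= 2 && $value[0] === '"' && str_ends_with($value, '"')) {
            // Remove the surrounding quotes and unescape backslash-escaped characters.
            $value = preg_replace('/\\\\(.)/s', '$1', substr($value, 1, -1)) ?? $value;
        }

        if ($name !== '') {
            $parameters[$name] = $value;
        }
    }

    return ['uri' => substr($target, 1, -1), 'parameters' => $parameters];
}
```

The relation type is then simply the rel entry of the parameter map, if there is one.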

This all resulted in the LinkHeaderFactory class.

The package

The package link-header-parser is now available on Packagist and can be installed via Composer:

composer require goedemiddag/link-header-parser

In the README file of the package, you can find more information about how to use the package, but here I will show a real-life example of how I'm using this for the Moneybird API pagination. The Moneybird API returns a Link header with the previous and next pages.

When we retrieve contacts from the Moneybird API and are currently on the second page, the Link header contains:

Link: <https://moneybird.com/api/v2/contacts.json?page=3>; rel="next", <https://moneybird.com/api/v2/contacts.json?page=1>; rel="prev"

With the package, this header can be parsed easily, which allows us to retrieve the next and previous links:

use Goedemiddag\LinkHeaderParser\LinkHeaderFactory;

// Assume $response is a PSR-7 ResponseInterface
$header = $response->getHeaderLine('Link');

$linkHeader = LinkHeaderFactory::fromHeader($header);

$next = $linkHeader->getLink('next');
$previous = $linkHeader->getLink('prev');

echo $next->uri; // https://moneybird.com/api/v2/contacts.json?page=3
echo $previous->uri; // https://moneybird.com/api/v2/contacts.json?page=1

Feel free to use the package to your advantage, and if you have any feedback or suggestions for improvement, please let me know by opening an issue on GitHub! Of course, you are also welcome to contribute to the package by creating a pull request.
