Dataset Library for PHP/Composer

This library loads datasets from bundled packages. The classes are not meant to serve as a general-purpose data source, although they could probably be used that way. Instead, use the datasets to import the relevant data into your database of choice, and optionally keep track of the version numbers so that new data can be imported automatically once the dependencies have been updated and a new version has been installed.

Installing

To install the library, require it using Composer:

$ composer require noccylabs/dataset

You also need some actual datasets. Some interesting ones could be:

| Package | Description |
|---|---|
| noccylabs/dataset-postal | Patterns and info for postal (zip) code validation |
| noccylabs/dataset-calendar | Bank holidays |
| noccylabs/dataset-iso3166 | ISO 3166 country codes and names |

Example

use NoccyLabs\Dataset\DatasetManager;

$dm = new DatasetManager();

// Call getDataset() if you want access to the metadata.
// Use openDataset() instead as a shortcut for getDataset()->open().
$ds = $dm->getDataset("noccylabs/dataset-iso3166#countries");

// This is how you get the metadata
echo "Dataset ID: ".$ds->getIdentifier()."\n"; // noccylabs/dataset-iso3166#countries
echo "Dataset version: ".$ds->getVersion()."\n"; // 2022.10.1

// Get a reader by calling open()
$reader = $ds->open();
foreach ($reader as $row) {
    // row is an array
}
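
The introduction suggests tracking version numbers so that data is re-imported only after a package update. One way this could look is sketched below. The function name `importIfNewer`, the `$state` array and the `$importRow` callback are hypothetical and not part of the library; only `getIdentifier()`, `getVersion()` and `open()` come from this README. The sketch also assumes the version strings are dotted numbers such as `2022.10.1` that `version_compare()` can order.

```php
<?php

/**
 * Import a dataset only when its package version is newer than the one
 * recorded in $state (identifier => last imported version).
 *
 * $dataset is expected to expose getIdentifier(), getVersion() and open()
 * as documented in this README. $importRow is your own callback (an
 * assumption of this sketch) that writes a single row to your database.
 *
 * Returns the number of rows imported, or 0 when already up to date.
 */
function importIfNewer(object $dataset, array &$state, callable $importRow): int
{
    $id = $dataset->getIdentifier();
    $version = $dataset->getVersion();

    // Skip the import if the recorded version is the same or newer.
    if (isset($state[$id]) && version_compare($version, $state[$id], '<=')) {
        return 0;
    }

    $count = 0;
    foreach ($dataset->open() as $row) {
        $importRow($row);
        $count++;
    }

    // Record the version only after every row has been handled.
    $state[$id] = $version;

    return $count;
}
```

With the manager from the example above, this would be called as `importIfNewer($dm->getDataset("noccylabs/dataset-iso3166#countries"), $state, $callback)`, where `$state` is loaded from and saved to wherever you keep such bookkeeping.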

Documentation

DatasetManager

The DatasetManager will automatically locate and load datasets on startup.

getDataset(string $identifier): Dataset
    Return a Dataset object, or throw an exception on error
openDataset(string $identifier): Iterator
    Return a reader for a Dataset, same as getDataset()->open()
getAvailableDatasets(): array
    Returns the Dataset objects for all datasets found
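
As a rough illustration of how the result of `getAvailableDatasets()` might be used, the helper below builds a short overview of what is installed. The name `describeDatasets` is hypothetical; it relies only on `getIdentifier()` and `getVersion()`, which are documented in the Dataset section.

```php
<?php

/**
 * Build one "identifier @ version" line per dataset.
 *
 * $datasets is anything iterable whose items expose getIdentifier()
 * and getVersion(), for example the array returned by
 * getAvailableDatasets().
 */
function describeDatasets(iterable $datasets): array
{
    $lines = [];
    foreach ($datasets as $dataset) {
        $lines[] = sprintf('%s @ %s', $dataset->getIdentifier(), $dataset->getVersion());
    }

    // Sort alphabetically so the output does not depend on discovery order.
    sort($lines);

    return $lines;
}
```

A call such as `echo implode(PHP_EOL, describeDatasets($dm->getAvailableDatasets()));` would then print the overview, with `$dm` being a `DatasetManager` instance.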

Dataset

open(): Iterator
    Return an iterator to iterate over the data
filter(array|callable $condition): Iterator
    Return an iterator that yields only the rows matching the filter
getIdentifier(): string
    Return the dataset identifier (vendor/package#dataset)
getVersion(): string
    Return the package version of the dataset
getPackageName(): string
    Return the package name (vendor/package)
getDatasetName(): string
    Return the dataset name (dataset)
getLicense(): ?string
    Return the license for the dataset
getComment(): ?string
    Return the dataset comment
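
The README does not spell out how `filter()` interprets its `array|callable` argument. The standalone sketch below shows one plausible reading and is not the library's implementation: it assumes an array condition means "every listed key must equal the given value", and that a callable receives the row and returns true to keep it. The function name `filterRows` is hypothetical.

```php
<?php

/**
 * Yield only the rows that match $condition.
 *
 * Assumptions (not confirmed by the library): an array condition is a
 * map of key => expected value that must all match strictly, and any
 * other condition is called with the row and returns true to keep it.
 */
function filterRows(iterable $rows, array|callable $condition): Generator
{
    foreach ($rows as $row) {
        if (is_array($condition)) {
            $keep = true;
            foreach ($condition as $key => $expected) {
                if (!array_key_exists($key, $row) || $row[$key] !== $expected) {
                    $keep = false;
                    break;
                }
            }
        } else {
            $keep = (bool) $condition($row);
        }

        if ($keep) {
            yield $row;
        }
    }
}
```

Because it returns a generator, rows are checked lazily while you iterate, which is the same consumption pattern as the `foreach` over `open()` shown in the example section.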