# Dataset Library for PHP/Composer
This is a library for loading datasets from bundled packages. It isn't meant to
serve as a generic data source, although you could probably use it that way.
Instead, use the datasets to import the relevant data into your database of
choice, and optionally keep track of the version numbers so that new data can be
imported automatically when the dependencies have been updated and a new version
has been installed.
## Installing
To install the library, require it using Composer:
```shell
$ composer require noccylabs/dataset
```
You also need some actual datasets. Some interesting ones could be:

Package | Description
---|---
[noccylabs/dataset-postal](https://dev.noccylabs.info/noccy/dataset-postal) | Patterns and info for postal (zip) code validation
[noccylabs/dataset-calendar](https://dev.noccylabs.info/noccy/dataset-calendar) | Bank holidays
[noccylabs/dataset-iso3166](https://dev.noccylabs.info/noccy/dataset-iso3166) | ISO 3166 country codes and names
## Example
```php
// Load Composer's autoloader (path assumes this script sits in the project root)
require __DIR__ . '/vendor/autoload.php';

use NoccyLabs\Dataset\DatasetManager;

$dm = new DatasetManager();

// Call getDataset() if you want access to the metadata.
// Use openDataset() instead to quickly call getDataset()->open().
$ds = $dm->getDataset("noccylabs/dataset-iso3166#countries");

// This is how you get the metadata
echo "Dataset ID: ".$ds->getIdentifier().PHP_EOL; // noccylabs/dataset-iso3166#countries
echo "Dataset version: ".$ds->getVersion().PHP_EOL; // 2022.10.1

// Get a reader by calling open()
$reader = $ds->open();
foreach ($reader as $row) {
    // $row is an array
}
```
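
The import-and-track-versions workflow described in the introduction could be sketched roughly as follows. This is only an illustration, not part of the library: the helper name `importIfChanged()`, the two table names, and the choice to store each row as JSON are all assumptions made for the example, and SQLite via PDO stands in for your database of choice.

```php
<?php
// Hypothetical helper (not provided by the library): import the rows only
// when the stored version differs from the one passed in.
function importIfChanged(PDO $pdo, string $identifier, string $version, iterable $rows): bool
{
    // Make PDO throw on errors so a failed import rolls back cleanly
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Assumed bookkeeping schema; adapt it to your own database layout
    $pdo->exec('CREATE TABLE IF NOT EXISTS dataset_versions (identifier TEXT PRIMARY KEY, version TEXT NOT NULL)');
    $pdo->exec('CREATE TABLE IF NOT EXISTS dataset_rows (identifier TEXT NOT NULL, data TEXT NOT NULL)');

    $stmt = $pdo->prepare('SELECT version FROM dataset_versions WHERE identifier = ?');
    $stmt->execute([$identifier]);
    if ($stmt->fetchColumn() === $version) {
        return false; // this version has already been imported
    }

    $pdo->beginTransaction();
    try {
        // Replace any rows left over from a previous version
        $pdo->prepare('DELETE FROM dataset_rows WHERE identifier = ?')->execute([$identifier]);
        $insert = $pdo->prepare('INSERT INTO dataset_rows (identifier, data) VALUES (?, ?)');
        foreach ($rows as $row) {
            $insert->execute([$identifier, json_encode($row, JSON_THROW_ON_ERROR)]);
        }
        // REPLACE INTO is SQLite syntax; other databases need their own upsert
        $pdo->prepare('REPLACE INTO dataset_versions (identifier, version) VALUES (?, ?)')
            ->execute([$identifier, $version]);
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
    return true;
}
```

With the library installed, you would call this once per dataset, passing `$ds->getIdentifier()`, `$ds->getVersion()` and `$ds->open()` as the last three arguments.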
# Documentation
## DatasetManager
The `DatasetManager` will automatically locate and load datasets on startup.
```
getDataset(string $identifier): Dataset
    Return a Dataset object, or throw an exception on error
openDataset(string $identifier): Iterator
    Return a reader for a Dataset, same as getDataset()->open()
getAvailableDatasets(): array
    Return the Dataset objects for all datasets found
```
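
As a rough illustration of `getAvailableDatasets()`, the sketch below collects the installed datasets into an identifier-to-version map, which is handy for comparing against previously imported versions. The helper name `datasetVersions()` is hypothetical, and the parameter is typed as a plain `object` only so the snippet runs without the package installed; in real code you would type-hint `DatasetManager`.

```php
<?php
// Hypothetical helper: map each dataset identifier to its package version.
// It relies only on the documented getAvailableDatasets(), getIdentifier()
// and getVersion() methods.
function datasetVersions(object $manager): array
{
    $versions = [];
    foreach ($manager->getAvailableDatasets() as $ds) {
        $versions[$ds->getIdentifier()] = $ds->getVersion();
    }
    ksort($versions); // stable, alphabetical order for display or diffing
    return $versions;
}
```

Passing a `new DatasetManager()` should then yield something like `['noccylabs/dataset-iso3166#countries' => '2022.10.1']`, depending on which dataset packages are installed.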
## Dataset
```
open(): Iterator
    Return an iterator to iterate over the data
filter(array|callable $condition): Iterator
    Return an iterator that yields only the rows matching the condition
getIdentifier(): string
    Return the dataset identifier (vendor/package#dataset)
getVersion(): string
    Return the package version of the dataset
getPackageName(): string
    Return the package name (vendor/package)
getDatasetName(): string
    Return the dataset name (dataset)
getLicense(): ?string
    Return the license for the dataset
getComment(): ?string
    Return the dataset comment
```
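
Since `getLicense()` and `getComment()` are declared as `?string`, callers should be prepared for `null`. The following is a minimal sketch of a metadata summary that handles both cases. The helper name `describeDataset()` and the output format are assumptions for illustration, and the parameter is typed as `object` only so the snippet runs without the package; in real code you would type-hint `Dataset`.

```php
<?php
// Hypothetical helper: build a short, human-readable summary of a dataset
// using only the documented metadata getters.
function describeDataset(object $ds): string
{
    $lines = [
        'Dataset: ' . $ds->getDatasetName(),
        'Package: ' . $ds->getPackageName() . ' ' . $ds->getVersion(),
        // getLicense() may return null, so fall back to a placeholder
        'License: ' . ($ds->getLicense() ?? 'not specified'),
    ];

    // getComment() may also return null; only add the line when it is set
    $comment = $ds->getComment();
    if ($comment !== null) {
        $lines[] = 'Comment: ' . $comment;
    }

    return implode(PHP_EOL, $lines);
}
```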