PHP Classes

ATON Format PHP: Encode and decode values using the ATON format

How to encode and decode data in a more compact format suitable for implementing PHP LLM API clients.

Last updated: 2025-11-24
Version: aton-format-php 8
License: MIT/X Consortium ...
PHP version: 8
Categories: Compression, Data types, Artificial i..., P..., P...
Description 

Author

This package can encode and decode values using the ATON format.

It provides a class that can take a variable value and encode it as a string in the ATON (Adaptive Token-Oriented Notation) format.

The class can also take a string in the ATON format and decode it to return the original variable value.

Name: Stefano D'Agostino <contact>
Classes: 1 package
Country: Italy

Documentation

ATON - Adaptive Token-Oriented Notation (PHP)

ATON is a token-efficient data serialization format designed for LLM applications. The author reports token reductions of roughly 55% compared to JSON (55.4% to 57.6% in the benchmarks under Performance) with lossless round-trip decoding.

V2 Features

  • Compression Modes: FAST, BALANCED, ULTRA, ADAPTIVE
  • Query Language: SQL-like syntax with full AST parser
  • Streaming Encoder: Process large datasets in chunks
  • Dictionary Compression: Automatic string deduplication
  • Full PHP 8 Support: Enums, named arguments, typed properties
  • Zero Dependencies: Lightweight and fast

Installation

composer require dagost/aton-format

Quick Start

<?php

use Aton\ATON;
use Aton\Enums\CompressionMode;

// Simple encode/decode
$data = [
    'employees' => [
        ['id' => 1, 'name' => 'Alice', 'role' => 'Engineer', 'active' => true],
        ['id' => 2, 'name' => 'Bob', 'role' => 'Designer', 'active' => true],
        ['id' => 3, 'name' => 'Carol', 'role' => 'Manager', 'active' => true],
    ]
];

$atonText = ATON::encode($data);
echo $atonText;
// Output:
// @schema[id:int, name:str, role:str, active:bool]
// @defaults[active:true]
//
// employees(3):
//   1, "Alice", "Engineer"
//   2, "Bob", "Designer"
//   3, "Carol", "Manager"

// Decode back
$original = ATON::decode($atonText);

Compression Modes

use Aton\Encoder;
use Aton\Enums\CompressionMode;

// Fast: No dictionary compression, fastest encoding
$fast = new Encoder(compression: CompressionMode::FAST);

// Balanced: Good compression with reasonable speed (default)
$balanced = new Encoder(compression: CompressionMode::BALANCED);

// Ultra: Maximum compression, best for large datasets
$ultra = new Encoder(compression: CompressionMode::ULTRA);

// Adaptive: Automatically selects mode based on data size
$adaptive = new Encoder(compression: CompressionMode::ADAPTIVE);

Query Language

ATON supports SQL-like queries for filtering data:

use Aton\ATON;
use Aton\QueryEngine;

$data = [
    'products' => [
        ['id' => 1, 'name' => 'Laptop', 'price' => 999, 'category' => 'Electronics'],
        ['id' => 2, 'name' => 'Mouse', 'price' => 29, 'category' => 'Electronics'],
        ['id' => 3, 'name' => 'Desk', 'price' => 299, 'category' => 'Furniture'],
    ]
];

// Parse and execute query
$queryEngine = ATON::createQueryEngine();
$query = $queryEngine->parse("products WHERE price > 100 ORDER BY price DESC LIMIT 10");
$results = $queryEngine->execute($data, $query);

// Or encode with query directly
$filteredAton = ATON::encodeWithQuery($data, "products WHERE category = 'Electronics'");
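
For comparison, the following is a minimal sketch of what the query `products WHERE price > 100 ORDER BY price DESC LIMIT 10` would select if written with plain PHP array functions. It assumes the query engine follows ordinary SQL semantics for WHERE, ORDER BY and LIMIT; it does not use or reproduce the package's own implementation.

```php
<?php

$products = [
    ['id' => 1, 'name' => 'Laptop', 'price' => 999, 'category' => 'Electronics'],
    ['id' => 2, 'name' => 'Mouse', 'price' => 29, 'category' => 'Electronics'],
    ['id' => 3, 'name' => 'Desk', 'price' => 299, 'category' => 'Furniture'],
];

// WHERE price > 100 (array_values reindexes the filtered rows)
$matches = array_values(array_filter(
    $products,
    fn(array $p): bool => $p['price'] > 100
));

// ORDER BY price DESC
usort($matches, fn(array $a, array $b): int => $b['price'] <=> $a['price']);

// LIMIT 10
$results = array_slice($matches, 0, 10);

foreach ($results as $row) {
    echo $row['name'], ' ', $row['price'], "\n";
}
```

A hand-written filter like this can serve as a cross-check when testing query results against a small dataset.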

Query Syntax

-- Basic filtering
products WHERE price > 100

-- Multiple conditions
products WHERE price > 100 AND category = 'Electronics'

-- OR conditions
products WHERE category = 'Electronics' OR category = 'Furniture'

-- IN operator
products WHERE category IN ('Electronics', 'Furniture')

-- LIKE operator (pattern matching)
products WHERE name LIKE '%Laptop%'

-- BETWEEN
products WHERE price BETWEEN 100 AND 500

-- Sorting and pagination
products WHERE active = true ORDER BY price DESC LIMIT 10 OFFSET 5

-- Select specific fields
products SELECT id, name WHERE price > 100

Streaming Encoder

For large datasets, use the streaming encoder:

use Aton\StreamEncoder;
use Aton\Enums\CompressionMode;

$streamEncoder = new StreamEncoder(
    chunkSize: 100,
    compression: CompressionMode::BALANCED
);

$largeData = [
    'records' => array_map(
        fn($i) => ['id' => $i, 'name' => "Record $i", 'value' => rand()],
        range(1, 10000)
    )
];

// Process in chunks
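foreach ($streamEncoder->streamEncode($largeData) as $chunk) {
    // chunkId is 0-indexed, so add 1 for a human-readable counter
    echo "Chunk " . ($chunk['chunkId'] + 1) . "/{$chunk['totalChunks']}\n";
    echo "Progress: " . ($chunk['metadata']['progress'] * 100) . "%\n";

    // Process chunk data (sendToAPI() is a placeholder for your own function)
    sendToAPI($chunk['data']);
}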

Compression Statistics

use Aton\ATON;

$stats = ATON::getCompressionStats($data);

echo "Original tokens: {$stats['originalTokens']}\n";
echo "Compressed tokens: {$stats['compressedTokens']}\n";
echo "Savings: {$stats['savingsPercent']}%\n";
echo "Compression ratio: {$stats['compressionRatio']}\n";

API Reference

ATON Facade

ATON::encode(array $data, bool $compress = true, CompressionMode $compression = CompressionMode::BALANCED): string
ATON::decode(string $atonString): array
ATON::encodeWithQuery(array $data, string $queryString): string
ATON::getCompressionStats(array $data, CompressionMode $compression = CompressionMode::BALANCED): array
ATON::createEncoder(...): Encoder
ATON::createDecoder(...): Decoder
ATON::createStreamEncoder(...): StreamEncoder
ATON::createQueryEngine(): QueryEngine

Encoder Class

$encoder = new Encoder(
    optimize: true,              // Enable schema and defaults optimization
    compression: CompressionMode::BALANCED,  // Compression mode
    queryable: false,            // Add queryable markers
    validate: true               // Validate input data
);

$encoder->encode($data, $compress);           // Encode to ATON
$encoder->encodeWithQuery($data, $query);     // Encode with query filter
$encoder->estimateTokens($text);              // Estimate token count
$encoder->getCompressionStats($data);         // Get compression stats

Decoder Class

$decoder = new Decoder(validate: true);

$decoder->decode($atonString);  // Decode ATON to array

QueryEngine Class

$queryEngine = new QueryEngine();

$query = $queryEngine->parse($queryString);    // Parse query to AST
$results = $queryEngine->execute($data, $query); // Execute query

StreamEncoder Class

$streamEncoder = new StreamEncoder(
    chunkSize: 100,
    compression: CompressionMode::BALANCED
);

foreach ($streamEncoder->streamEncode($data, $tableName) as $chunk) {
    // Process chunk
}

Exceptions

use Aton\Exceptions\ATONException;
use Aton\Exceptions\ATONEncodingException;
use Aton\Exceptions\ATONDecodingException;
use Aton\Exceptions\ATONQueryException;

try {
    $aton = ATON::encode($data);
} catch (ATONEncodingException $e) {
    echo "Encoding error: " . $e->getMessage();
}

ATON Format Specification

Basic Structure

@dict[#0:"repeated string", #1:"another string"]
@schema[field1:type1, field2:type2, ...]
@defaults[field1:value1, field2:value2, ...]

entityName(count):
  value1, value2, ...
  value1, value2, ...

Supported Types

| Type | Description | Example |
|------|-------------|---------|
| int | Integer | 42 |
| float | Floating point | 3.14 |
| str | String | "hello" |
| bool | Boolean | true, false |
| null | Null value | null |
| array | Array | [1,2,3] |
| object | Object | {key:value} |
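
As an illustration of the type names above, here is a small sketch that maps a PHP value to one of them. The helper `atonTypeOf()` is hypothetical and not part of the package. The rule that sequential lists become `array` while keyed arrays become `object` is an assumption, since the table does not say how PHP arrays are classified.

```php
<?php

// Hypothetical helper: map a PHP value to a type name from the table of supported types.
function atonTypeOf(mixed $value): string
{
    if (is_array($value)) {
        // Assumption: lists (keys 0..n-1) are "array", keyed arrays are "object".
        $isList = $value === [] || array_keys($value) === range(0, count($value) - 1);
        return $isList ? 'array' : 'object';
    }

    return match (true) {
        is_int($value) => 'int',
        is_float($value) => 'float',
        is_string($value) => 'str',
        is_bool($value) => 'bool',
        is_null($value) => 'null',
        default => 'object', // PHP objects and anything else
    };
}

foreach ([42, 3.14, 'hello', true, null, [1, 2, 3], ['key' => 'value']] as $sample) {
    echo json_encode($sample), ' => ', atonTypeOf($sample), "\n";
}
```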

Performance

| Dataset | JSON Tokens | ATON Tokens | Reduction |
|---------|-------------|-------------|-----------|
| Employee Records (1K) | 12,450 | 5,280 | 57.6% |
| Product Catalog (10K) | 145,200 | 64,800 | 55.4% |
| Transaction Log (100K) | 1,856,000 | 815,000 | 56.1% |
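
The Reduction column can be reproduced from the two token counts. The sketch below assumes reduction is defined as (JSON tokens - ATON tokens) / JSON tokens; the function name `tokenSavingsPercent()` is illustrative and not part of the package.

```php
<?php

// Percentage of tokens saved relative to JSON, rounded to one decimal place.
function tokenSavingsPercent(int $jsonTokens, int $atonTokens): float
{
    if ($jsonTokens <= 0) {
        throw new InvalidArgumentException('jsonTokens must be positive');
    }

    return round(($jsonTokens - $atonTokens) / $jsonTokens * 100, 1);
}

// Token counts copied from the Performance table.
$benchmarks = [
    'Employee Records (1K)' => [12450, 5280],
    'Product Catalog (10K)' => [145200, 64800],
    'Transaction Log (100K)' => [1856000, 815000],
];

foreach ($benchmarks as $label => [$json, $aton]) {
    printf("%s: %.1f%%\n", $label, tokenSavingsPercent($json, $aton));
}
```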

Requirements

  • PHP 8.0 or higher

License

MIT License - see LICENSE for details.

Author

Stefano D'Agostino

  • GitHub: @dagoSte
  • Email: dago.stefano@gmail.com

Details

ATON Format Specification V2

Overview

ATON V2 builds on V1 with advanced features for better compression, querying, and large dataset handling.

New Features in V2

  1. Dictionary Compression: Automatic deduplication of repeated strings
  2. Default Values: Skip encoding when values match defaults
  3. Query Language: SQL-like filtering and sorting
  4. Streaming Encoder: Process large datasets in chunks
  5. Compression Modes: FAST, BALANCED, ULTRA, ADAPTIVE

Format Structure

Complete Syntax

@dict[#0:"repeated value", #1:"another repeated"]
@schema[field1:type1, field2:type2, ...]
@defaults[field1:defaultValue, field2:defaultValue]
@queryable[tableName]

tableName(recordCount):
  value1, value2, ...
  value1, value2, ...

Dictionary Compression

Purpose

Reduces token usage by replacing repeated strings with short references.

Syntax

@dict[#0:"Long repeated string", #1:"Another common value"]

Usage in Data

@dict[#0:"Electronics", #1:"In Stock"]
@schema[id:int, name:str, category:str, status:str]

products(3):
  1, "Laptop", #0, #1
  2, "Mouse", #0, #1
  3, "Keyboard", #0, "Out of Stock"

Compression Thresholds

| Mode | Min Length | Min Occurrences |
|------|------------|-----------------|
| FAST | No compression | - |
| BALANCED | 5 chars | 3 times |
| ULTRA | 3 chars | 2 times |
| ADAPTIVE | Auto-selected based on data size | - |
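
To make the thresholds concrete, here is a minimal sketch of dictionary selection under stated assumptions: a string qualifies when its length and its number of occurrences both meet the minimums, references are numbered in order of first appearance, and length is measured in bytes with `strlen()`. The function `buildDictionary()` is hypothetical; the package's `DictionaryCompression` class may work differently.

```php
<?php

/**
 * Collect strings that qualify for the dictionary.
 * The returned list index is the reference number (index 0 => #0).
 */
function buildDictionary(array $records, int $minLength, int $minOccurrences): array
{
    $counts = [];
    foreach ($records as $record) {
        foreach ($record as $value) {
            if (is_string($value) && strlen($value) >= $minLength) {
                $counts[$value] = ($counts[$value] ?? 0) + 1;
            }
        }
    }

    $dictionary = [];
    foreach ($counts as $string => $count) {
        if ($count >= $minOccurrences) {
            // Cast because PHP turns numeric-string array keys into integers.
            $dictionary[] = (string) $string;
        }
    }

    return $dictionary;
}

$products = [
    ['id' => 1, 'name' => 'Laptop', 'category' => 'Electronics', 'status' => 'In Stock'],
    ['id' => 2, 'name' => 'Mouse', 'category' => 'Electronics', 'status' => 'In Stock'],
    ['id' => 3, 'name' => 'Keyboard', 'category' => 'Electronics', 'status' => 'Out of Stock'],
];

echo 'BALANCED: ', json_encode(buildDictionary($products, 5, 3)), "\n";
echo 'ULTRA:    ', json_encode(buildDictionary($products, 3, 2)), "\n";
```

With the BALANCED thresholds only "Electronics" qualifies (three occurrences); with the ULTRA thresholds "In Stock" (two occurrences) qualifies as well.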

Default Values

Purpose

Skip encoding values that match the most common value for a field.

Syntax

@defaults[status:"active", verified:true]

Example

@schema[id:int, name:str, status:str, verified:bool]
@defaults[status:"active", verified:true]

users(4):
  1, "Alice"
  2, "Bob"
  3, "Carol", "inactive"
  4, "Dave", "active", false

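Users 1 and 2 match both defaults, so neither status nor verified is written. User 3 has a non-default status; its verified value is the default and is omitted. User 4 has a non-default verified value; its status is written even though it equals the default, which suggests that values are positional and only trailing defaults can be dropped.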
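
The row layout in this example can be sketched as follows. The sketch assumes that values are positional and that only trailing values equal to their field's default are dropped, which is what the four rows suggest. `encodeRow()` is a hypothetical helper, and using `json_encode()` for value formatting is a simplification rather than the package's actual escaping rules.

```php
<?php

/**
 * Render one record as a comma-separated row, omitting trailing defaults.
 *
 * @param array $record   field => value
 * @param array $fields   field names in schema order
 * @param array $defaults field => default value
 */
function encodeRow(array $record, array $fields, array $defaults): string
{
    $values = [];
    foreach ($fields as $field) {
        $values[] = $record[$field] ?? null;
    }

    // Walk backwards and drop values while they equal the declared default.
    for ($i = count($fields) - 1; $i >= 0; $i--) {
        $field = $fields[$i];
        if (!array_key_exists($field, $defaults) || $values[$i] !== $defaults[$field]) {
            break;
        }
        array_pop($values);
    }

    return implode(', ', array_map('json_encode', $values));
}

$fields = ['id', 'name', 'status', 'verified'];
$defaults = ['status' => 'active', 'verified' => true];
$users = [
    ['id' => 1, 'name' => 'Alice', 'status' => 'active', 'verified' => true],
    ['id' => 3, 'name' => 'Carol', 'status' => 'inactive', 'verified' => true],
    ['id' => 4, 'name' => 'Dave', 'status' => 'active', 'verified' => false],
];

foreach ($users as $user) {
    echo '  ', encodeRow($user, $fields, $defaults), "\n";
}
```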

Query Language

Syntax

tableName [SELECT fields] [WHERE conditions] [ORDER BY field [ASC|DESC]] [LIMIT n] [OFFSET n]

Operators

| Operator | Description | Example |
|----------|-------------|---------|
| = | Equals | status = 'active' |
| !=, <> | Not equals | status != 'deleted' |
| < | Less than | age < 30 |
| > | Greater than | price > 100 |
| <= | Less or equal | count <= 10 |
| >= | Greater or equal | score >= 80 |
| LIKE | Pattern match | name LIKE '%john%' |
| IN | In set | category IN ('A', 'B') |
| NOT IN | Not in set | status NOT IN ('deleted') |
| BETWEEN | Range | price BETWEEN 10 AND 100 |
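
The LIKE operator can be understood as a translation from a pattern to a regular expression. The sketch below handles only the `%` wildcard, since that is the only one shown in this document, and it matches case-sensitively. Both choices are assumptions; `likeMatches()` is a hypothetical helper, not the package's query engine.

```php
<?php

// Translate a LIKE pattern ('%' = any run of characters) into an anchored regex.
function likeMatches(string $value, string $pattern): bool
{
    $regex = '';
    foreach (str_split($pattern) as $char) {
        // Every character other than '%' is matched literally.
        $regex .= $char === '%' ? '.*' : preg_quote($char, '/');
    }

    // The "s" flag lets '.' match newlines, so '%' spans multi-line values.
    return preg_match('/^' . $regex . '$/s', $value) === 1;
}

var_dump(likeMatches('alice@gmail.com', '%@gmail.com'));
var_dump(likeMatches('Gaming Laptop Pro', '%Laptop%'));
var_dump(likeMatches('alice@gmail.com.au', '%@gmail.com'));
```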

Logical Operators

  • `AND`: Both conditions must be true
  • `OR`: Either condition must be true
  • `NOT`: Negates condition
  • Parentheses for grouping: `(a OR b) AND c`

Examples

-- Simple filter
users WHERE active = true

-- Multiple conditions
products WHERE price > 100 AND category = 'Electronics'

-- Pattern matching
users WHERE email LIKE '%@gmail.com'

-- Sorting and pagination
orders WHERE status = 'pending' ORDER BY created_at DESC LIMIT 10

-- Field selection
users SELECT id, name, email WHERE verified = true

Streaming Format

Chunk Structure

First chunk includes full schema:

@schema[id:int, name:str]

records(100):
  1, "First"
  2, "Second"
  ...

Subsequent chunks use continuation syntax:

records+(100):
  101, "Next"
  102, "Another"
  ...
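
The difference between a first chunk and a continuation chunk can be sketched as below. `renderChunk()` is a hypothetical helper that emits the schema line only for the first chunk and adds `+` to the table name for later ones. Formatting values with `json_encode()` is a simplification, not the package's actual escaping.

```php
<?php

/**
 * Render one chunk of rows in the streaming layout.
 *
 * @param array $schema field => type name
 * @param array $rows   list of positional value lists
 */
function renderChunk(string $table, array $schema, array $rows, bool $isFirst): string
{
    $lines = [];

    if ($isFirst) {
        $pairs = [];
        foreach ($schema as $field => $type) {
            $pairs[] = $field . ':' . $type;
        }
        $lines[] = '@schema[' . implode(', ', $pairs) . ']';
        $lines[] = '';
    }

    // Continuation chunks mark the table name with "+".
    $lines[] = $table . ($isFirst ? '' : '+') . '(' . count($rows) . '):';

    foreach ($rows as $row) {
        $lines[] = '  ' . implode(', ', array_map('json_encode', $row));
    }

    return implode("\n", $lines);
}

$schema = ['id' => 'int', 'name' => 'str'];
echo renderChunk('records', $schema, [[1, 'First'], [2, 'Second']], true), "\n\n";
echo renderChunk('records', $schema, [[101, 'Next'], [102, 'Another']], false), "\n";
```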

Metadata

Each chunk includes:

  • chunkId: Current chunk number (0-indexed)
  • totalChunks: Total number of chunks
  • isFirst: Boolean, true for first chunk
  • isLast: Boolean, true for last chunk
  • metadata.table: Table name
  • metadata.recordsInChunk: Records in this chunk
  • metadata.startIdx: Starting record index
  • metadata.endIdx: Ending record index
  • metadata.totalRecords: Total records across all chunks
  • metadata.progress: Completion fraction (0.0 to 1.0)
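
A minimal generator that produces these fields might look like the sketch below. It is not the package's `StreamEncoder`: `chunkRecords()` is a hypothetical function, `data` holds the raw records rather than ATON text, and treating `endIdx` as inclusive and 0-based is an assumption.

```php
<?php

// Yield fixed-size chunks of records together with the metadata fields listed above.
function chunkRecords(array $records, string $table, int $chunkSize): Generator
{
    $totalRecords = count($records);
    $chunks = array_chunk($records, $chunkSize);
    $totalChunks = count($chunks);
    $startIdx = 0;

    foreach ($chunks as $chunkId => $chunk) {
        $endIdx = $startIdx + count($chunk) - 1; // inclusive (assumption)

        yield [
            'chunkId' => $chunkId,
            'totalChunks' => $totalChunks,
            'isFirst' => $chunkId === 0,
            'isLast' => $chunkId === $totalChunks - 1,
            'data' => $chunk,
            'metadata' => [
                'table' => $table,
                'recordsInChunk' => count($chunk),
                'startIdx' => $startIdx,
                'endIdx' => $endIdx,
                'totalRecords' => $totalRecords,
                // round() always returns a float, even for the final 1.0
                'progress' => round(($endIdx + 1) / $totalRecords, 4),
            ],
        ];

        $startIdx = $endIdx + 1;
    }
}

$records = array_map(fn(int $i): array => ['id' => $i], range(1, 250));

foreach (chunkRecords($records, 'records', 100) as $chunk) {
    printf(
        "chunk %d/%d: %d records, %.0f%% done\n",
        $chunk['chunkId'] + 1,
        $chunk['totalChunks'],
        $chunk['metadata']['recordsInChunk'],
        $chunk['metadata']['progress'] * 100
    );
}
```

Using a generator keeps only one chunk in flight at a time on the consumer side, which is the main reason to stream in the first place.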

Compression Modes

FAST

  • No dictionary compression
  • Fastest encoding
  • Best for: Small datasets, real-time encoding

BALANCED (Default)

  • Dictionary compression for strings ≥5 chars appearing ≥3 times
  • Good balance of speed and compression
  • Best for: General purpose use

ULTRA

  • Aggressive dictionary compression (≥3 chars, ≥2 times)
  • Maximum compression
  • Best for: Large datasets, bandwidth-constrained scenarios

ADAPTIVE

  • Automatically selects mode based on data size:
      - < 1KB: FAST
      - 1KB - 10KB: BALANCED
      - > 10KB: ULTRA
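
The size thresholds can be expressed as a small selection function. This sketch makes several assumptions that the document does not state: 1KB means 1024 bytes, both boundaries belong to BALANCED, and size is measured on the JSON encoding of the data. It returns plain strings instead of the package's `CompressionMode` enum so that it runs on its own; `pickModeForSize()` is a hypothetical name.

```php
<?php

// Choose a mode name from a payload size in bytes (boundaries are assumptions).
function pickModeForSize(int $bytes): string
{
    return match (true) {
        $bytes < 1024 => 'FAST',
        $bytes <= 10 * 1024 => 'BALANCED',
        default => 'ULTRA',
    };
}

$data = ['records' => array_map(fn(int $i): array => ['id' => $i, 'name' => "Record $i"], range(1, 500))];

// Assumption: the JSON-encoded length stands in for "data size".
$bytes = strlen(json_encode($data));

echo $bytes, ' bytes => ', pickModeForSize($bytes), "\n";
```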

PHP Implementation

Encoder

use Aton\Encoder;
use Aton\Enums\CompressionMode;

$encoder = new Encoder(
    optimize: true,
    compression: CompressionMode::BALANCED,
    queryable: true,
    validate: true
);

// Basic encoding
$aton = $encoder->encode($data);

// With query filter
$aton = $encoder->encodeWithQuery($data, "users WHERE active = true");

// Get stats
$stats = $encoder->getCompressionStats($data);

Decoder

use Aton\Decoder;

$decoder = new Decoder(validate: true);
$data = $decoder->decode($atonString);

Query Engine

use Aton\QueryEngine;

$engine = new QueryEngine();
$query = $engine->parse("products WHERE price > 100 ORDER BY price DESC");
$results = $engine->execute($data, $query);

Stream Encoder

use Aton\StreamEncoder;
use Aton\Enums\CompressionMode;

$encoder = new StreamEncoder(
    chunkSize: 100,
    compression: CompressionMode::BALANCED
);

foreach ($encoder->streamEncode($largeData) as $chunk) {
    // processChunk() is a placeholder for your own handler
    processChunk($chunk['data']);
}

Migration from V1

V2 is fully backward compatible with V1. To use V1-style encoding:

$encoder = new Encoder(
    optimize: false,              // Disable defaults optimization
    compression: CompressionMode::FAST  // No dictionary compression
);

The V2 decoder reads both the V1 and V2 formats without any changes.


Files (36)

  • composer.json, LICENSE, phpunit.xml, README.md
  • docs (6 files): COMPRESSION.md, EXAMPLES.md, QUERY_LANGUAGE.md, SPECIFICATION_V1.md, SPECIFICATION_V2.md, STREAMING.md
  • src (5 files, 4 directories): ATON.php, Decoder.php, Encoder.php, QueryEngine.php, StreamEncoder.php
  • src/Compression (2 files): CompressionEngine.php, DictionaryCompression.php
  • src/Enums (5 files): ATONType.php, CompressionMode.php, LogicalOperator.php, QueryOperator.php, SortOrder.php
  • src/Exceptions (4 files): ATONDecodingException.php, ATONEncodingException.php, ATONException.php, ATONQueryException.php
  • src/Query (5 files): ParsedQuery.php, QueryCondition.php, QueryExpression.php, QueryParser.php, QueryTokenizer.php
  • tests (5 files): ATONFacadeTest.php, DecoderTest.php, EncoderTest.php, QueryEngineTest.php, StreamEncoderTest.php
