ThorsAnvil


ThorsSerializer Internal Documentation

Detailed architecture, traits system, and implementation details for the ThorsSerializer library.

Source: third/ThorsSerializer/src/Serialize/


Architecture

ThorsSerializer is built on a traits-based system with four layers:

  1. Traits<T> – A template specialization that describes the shape of type T (which members to read/write). Generated by macros.
  2. Exporter / Importer – Lightweight wrapper objects returned by jsonExporter() etc. Integrate with operator<< / operator>>.
  3. Serializer / DeSerializer – The internal engine that walks Traits<T> metadata and drives the printer/parser.
  4. PrinterInterface / ParserInterface – Format-specific backends (JSON, YAML, BSON) for byte-level I/O.

All serialization metadata is resolved at compile time.


File Map

File                          Purpose
Traits.h                      Core Traits<T> class and all declaration macros
SerUtil.h                     Traits<T> specializations for standard containers
Serialize.h / Serialize.tpp   Serializer and DeSerializer engines
Exporter.h                    Exporter wrapper with operator<<
Importer.h                    Importer wrapper with operator>>
PrinterInterface.h/.cpp       Abstract printer base class
ParserInterface.h/.cpp        Abstract parser base class
JsonParser.h/.cpp             JSON parser implementation
JsonPrinter.cpp               JSON printer implementation
JsonThor.h                    JSON exporter/importer factory functions
YamlParser.h/.cpp             YAML parser (wraps libyaml)
YamlPrinter.h/.cpp            YAML printer (wraps libyaml)
YamlThor.h                    YAML exporter/importer factory functions
BsonParser.h/.cpp             BSON parser
BsonPrinter.h/.cpp            BSON printer
BsonThor.h                    BSON exporter/importer factory functions
BsonConfig.h/.cpp             BSON-specific configuration
PrinterConfig.h               PrinterConfig class
ParserConfig.h                ParserConfig class
CustomSerialization.h/.tpp    DefaultCustomSerializer<T> base class
StringInput.h                 String-to-stream adapter for deserialization
StringOutput.h                Stream-to-string adapter for serialization
UnicodeIterator.h             Unicode handling for JSON strings
Format.h                      Format detection utilities
PolymorphicMarker.h           Polymorphic type registration
MongoUtility.h/.cpp           MongoDB ObjectId support

Traits System

TraitType Enum

Value              Meaning
Invalid            No traits defined (compile error if serialized)
Map                Object with named fields (structs/classes)
Array              Ordered sequence (vectors, lists)
Parent             Object that extends a serializable parent
Value              Primitive value (int, string, bool)
Enum               Enumeration type
Pointer            Pointer/smart pointer
Reference          Reference wrapper, optional
Custom_Serialize   Custom serialization handler
Variant            std::variant

Traits<T> Structure

For a Map type:

template<>
struct Traits<MyType> {
    static constexpr TraitType type = TraitType::Map;
    using Members = std::tuple<
        std::pair<char const*, int MyType::*>,
        std::pair<char const*, std::string MyType::*>
    >;
    static Members const& getMembers();
};

The getMembers() function returns a tuple of (name, member-pointer) pairs.
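To make the shape of this metadata concrete, here is a minimal hand-written sketch of such a specialization and of how the tuple can be walked. The primary template, the reduced TraitType enum, the MyType fields (id, name), and the memberNames() helper are all stand-ins invented for illustration; the library normally generates the specialization through its macros.

```cpp
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Reduced stand-in for the library's enum (assumption: only what the sketch needs).
enum class TraitType { Invalid, Map };

// Stand-in for the library's primary template.
template<typename T>
struct Traits;

// Hypothetical user type.
struct MyType {
    int         id;
    std::string name;
};

template<>
struct Traits<MyType> {
    static constexpr TraitType type = TraitType::Map;
    using Members = std::tuple<
        std::pair<char const*, int MyType::*>,
        std::pair<char const*, std::string MyType::*>
    >;
    static Members const& getMembers() {
        // One (name, member-pointer) pair per serialized field.
        static Members const members{
            std::make_pair("id", &MyType::id),
            std::make_pair("name", &MyType::name)};
        return members;
    }
};

// Hypothetical helper: collect the declared field names by walking the tuple.
template<typename T>
std::vector<std::string> memberNames() {
    std::vector<std::string> names;
    std::apply([&](auto const&... member) {
        (names.emplace_back(member.first), ...);
    }, Traits<T>::getMembers());
    return names;
}
```

A member is read through its pointer with object.*(pair.second), which is what lets generic code visit fields it knows nothing about by name.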

Macro Expansion

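ThorsAnvil_MakeTrait(Color, red, green, blue) expands to a Traits<Color> specialization in the ThorsAnvil::Serialize namespace. The specialization follows the Map structure shown above, with one (name, member-pointer) pair for each listed field.

ThorsAnvil_ExpandTrait(Base, Derived, field) declares Traits<Derived> as a Parent type, so that the members of Base are serialized along with Derived's own fields.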


Serializer Engine

Serialization Flow

  1. Serializer::print(object) dispatches based on Traits<T>::type:
    • Map/Parent: calls printer.openMap(), iterates members via printObjectMembers(), calls printer.closeMap()
    • Array: calls printer.openArray(), iterates elements, calls printer.closeArray()
    • Value: calls printer.addValue(object)
    • Pointer: dereferences and serializes the pointed-to value
    • Custom_Serialize: delegates to the registered custom serializer
  2. printObjectMembers() uses std::apply to iterate the members tuple, calling printer.addKey(name) then recursively serializing each member.
  3. For Parent types, it first serializes the parent’s members, then the derived class’s members.
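The dispatch described above can be sketched in a few lines. This is a simplified model, not the library's Serializer: TraitType is reduced to three values, TracePrinter is a hypothetical printer that only records the calls it receives, and Point and Shape are invented types. Parent, Pointer, and Custom_Serialize handling are left out.

```cpp
#include <string>
#include <tuple>
#include <utility>
#include <vector>

enum class TraitType { Map, Array, Value };   // reduced for the sketch

// Anything without a specialization is treated as a primitive value.
template<typename T>
struct Traits { static constexpr TraitType type = TraitType::Value; };

template<typename E>
struct Traits<std::vector<E>> { static constexpr TraitType type = TraitType::Array; };

struct Point { int x; int y; };
struct Shape { std::string name; std::vector<Point> points; };

template<>
struct Traits<Point> {
    static constexpr TraitType type = TraitType::Map;
    static auto const& getMembers() {
        static auto const members = std::make_tuple(
            std::make_pair("x", &Point::x), std::make_pair("y", &Point::y));
        return members;
    }
};

template<>
struct Traits<Shape> {
    static constexpr TraitType type = TraitType::Map;
    static auto const& getMembers() {
        static auto const members = std::make_tuple(
            std::make_pair("name", &Shape::name), std::make_pair("points", &Shape::points));
        return members;
    }
};

// Hypothetical printer that records each call instead of writing a format.
struct TracePrinter {
    std::vector<std::string> events;
    void openMap()                          { events.push_back("openMap"); }
    void closeMap()                         { events.push_back("closeMap"); }
    void openArray()                        { events.push_back("openArray"); }
    void closeArray()                       { events.push_back("closeArray"); }
    void addKey(std::string const& key)     { events.push_back("key:" + key); }
    void addValue(int value)                { events.push_back("int:" + std::to_string(value)); }
    void addValue(std::string const& value) { events.push_back("str:" + value); }
};

template<typename T>
void print(TracePrinter& printer, T const& object) {
    if constexpr (Traits<T>::type == TraitType::Map) {
        printer.openMap();
        // Visit each (name, member-pointer) pair: emit the key, then recurse.
        std::apply([&](auto const&... member) {
            ((printer.addKey(member.first), print(printer, object.*(member.second))), ...);
        }, Traits<T>::getMembers());
        printer.closeMap();
    } else if constexpr (Traits<T>::type == TraitType::Array) {
        printer.openArray();
        for (auto const& element : object) {
            print(printer, element);
        }
        printer.closeArray();
    } else {
        printer.addValue(object);
    }
}
```

Because the branch is chosen with if constexpr, only the code path matching Traits<T>::type is instantiated for a given T, which is consistent with the metadata being resolved at compile time.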

Deserialization Flow

  1. DeSerializer::parse(object) dispatches based on Traits<T>::type:
    • Map/Parent: calls parser.getNextToken() expecting MapStart, then loops reading key-value pairs
    • Array: calls parser.getNextToken() expecting ArrayStart, then loops reading elements
    • Value: calls parser.getValue(object)
  2. For maps, each key is looked up in the traits metadata to find the corresponding member pointer. Unknown keys are handled according to the ParseType: they are ignored, produce a warning, or cause an exception to be thrown.
  3. Exact parsing additionally checks that all declared members were seen.
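The map branch of this flow can be illustrated with a small token-driven loop. Everything here is a stand-in: the input is a pre-tokenized vector rather than a real parser, Policy is a hypothetical substitute for ParseType (the warning mode is omitted), and Config with its lookup table replaces the traits metadata.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

enum class Token { MapStart, MapEnd, Key, Value };
struct Event { Token token; std::string text; };

// Hypothetical stand-in for ParseType.
enum class Policy { IgnoreUnknown, ThrowOnUnknown, Exact };

struct Config { int port = 0; int retries = 0; };

// Stand-in for the traits lookup: key name -> member pointer.
inline std::map<std::string, int Config::*> const& configMembers() {
    static std::map<std::string, int Config::*> const members{
        {"port", &Config::port}, {"retries", &Config::retries}};
    return members;
}

inline Config parseConfig(std::vector<Event> const& events, Policy policy) {
    std::size_t position = 0;
    auto next = [&]() -> Event const& {
        if (position >= events.size()) throw std::runtime_error("unexpected end of input");
        return events[position++];
    };

    if (next().token != Token::MapStart) throw std::runtime_error("expected MapStart");

    Config result;
    std::set<std::string> seen;
    for (;;) {
        Event const& key = next();
        if (key.token == Token::MapEnd) break;
        if (key.token != Token::Key) throw std::runtime_error("expected Key");
        Event const& value = next();
        if (value.token != Token::Value) throw std::runtime_error("expected Value");

        auto found = configMembers().find(key.text);
        if (found == configMembers().end()) {
            // Unknown key: skip it or fail, depending on the policy.
            if (policy != Policy::IgnoreUnknown) throw std::runtime_error("unknown key: " + key.text);
            continue;
        }
        result.*(found->second) = std::stoi(value.text);
        seen.insert(key.text);
    }

    // Exact mode also requires every declared member to have been seen.
    if (policy == Policy::Exact && seen.size() != configMembers().size()) {
        throw std::runtime_error("missing member");
    }
    return result;
}
```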

Printer/Parser Interfaces

PrinterInterface

class PrinterInterface {
public:
    virtual void openDoc() = 0;
    virtual void closeDoc() = 0;
    virtual void openMap(std::size_t size) = 0;
    virtual void closeMap() = 0;
    virtual void openArray(std::size_t size) = 0;
    virtual void closeArray() = 0;
    virtual void addKey(std::string_view key) = 0;
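    // addValue() is overloaded per primitive type. A representative subset is
    // shown; the parameter-passing style here is illustrative.
    virtual void addValue(short value) = 0;
    virtual void addValue(int value) = 0;
    virtual void addValue(long value) = 0;
    virtual void addValue(double value) = 0;
    virtual void addValue(bool value) = 0;
    virtual void addValue(std::string const& value) = 0;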
    virtual void addNull() = 0;
};

ParserInterface

class ParserInterface {
public:
    virtual ParserToken getNextToken() = 0;
    virtual std::string getKey() = 0;
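    // getValue() is overloaded per primitive type (representative subset shown).
    virtual void getValue(short& value) = 0;
    virtual void getValue(int& value) = 0;
    virtual void getValue(long& value) = 0;
    virtual void getValue(double& value) = 0;
    virtual void getValue(bool& value) = 0;
    virtual void getValue(std::string& value) = 0;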
    virtual bool isValueNull() = 0;
};

ParserToken

enum class ParserToken { DocStart, DocEnd, MapStart, MapEnd, ArrayStart, ArrayEnd, Key, Value, Error };
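As an illustration of how these tokens compose, a document such as {"a": [1, 2]} would plausibly be reported as DocStart, MapStart, Key, ArrayStart, Value, Value, ArrayEnd, MapEnd, DocEnd (an assumption about ordering, not taken from the parser source). The sketch below copies the enum and adds a hypothetical isWellNested() helper that checks such a sequence for balanced open/close tokens.

```cpp
#include <vector>

enum class ParserToken { DocStart, DocEnd, MapStart, MapEnd, ArrayStart, ArrayEnd, Key, Value, Error };

// Hypothetical helper: true when every open token is closed by its matching
// end token, keys only appear directly inside a map, and no Error is present.
inline bool isWellNested(std::vector<ParserToken> const& tokens) {
    std::vector<ParserToken> open;   // stack of currently open containers

    auto close = [&](ParserToken expected) {
        if (open.empty() || open.back() != expected) return false;
        open.pop_back();
        return true;
    };

    for (ParserToken token : tokens) {
        switch (token) {
            case ParserToken::DocStart:
            case ParserToken::MapStart:
            case ParserToken::ArrayStart:
                open.push_back(token);
                break;
            case ParserToken::DocEnd:
                if (!close(ParserToken::DocStart)) return false;
                break;
            case ParserToken::MapEnd:
                if (!close(ParserToken::MapStart)) return false;
                break;
            case ParserToken::ArrayEnd:
                if (!close(ParserToken::ArrayStart)) return false;
                break;
            case ParserToken::Key:
                if (open.empty() || open.back() != ParserToken::MapStart) return false;
                break;
            case ParserToken::Value:
                break;
            case ParserToken::Error:
                return false;
        }
    }
    return open.empty();
}
```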

String I/O Adapters

StringInput

Allows operator>> from std::string and std::string_view. Creates a temporary std::istringstream from the string data.

StringOutput

Allows operator<< to std::string. Creates a temporary std::ostringstream and appends the result to the target string.
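Both adapters follow the same idea: route the string through a temporary string stream. The sketch below shows that idea with two hypothetical helper functions, writeTo() and readFrom(), applied to an invented streamable Point type. The library exposes this through operator<< and operator>> rather than named functions, so treat the names as assumptions.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>

// Invented streamable type used only for the demonstration.
struct Point { int x; int y; };

inline std::ostream& operator<<(std::ostream& stream, Point const& point) {
    return stream << point.x << ' ' << point.y;
}
inline std::istream& operator>>(std::istream& stream, Point& point) {
    return stream >> point.x >> point.y;
}

// Output direction: format into a temporary stream, then append to the target.
template<typename T>
void writeTo(std::string& target, T const& value) {
    std::ostringstream stream;
    stream << value;
    target += stream.str();
}

// Input direction: copy the text into a temporary stream and extract from it.
// Returns false when extraction fails.
template<typename T>
bool readFrom(std::string_view text, T& value) {
    std::istringstream stream{std::string(text)};
    stream >> value;
    return !stream.fail();
}
```

Note that the output side appends rather than overwrites, matching the description above, and that the input side copies the data, so the source string is left untouched.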


Polymorphic Serialization Internals

Registration

ThorsAnvil_PolyMorphicSerializer(Type) adds a virtual method to the class that returns the type name. The ThorsAnvil_RegisterPolyMorphicType macro registers the type in a global map.

Serialization

When serializing a pointer to a base class, the serializer:

  1. Calls the virtual type-name method to get the concrete type name.
  2. Emits "__type": "ConcreteType" as the first key-value pair.
  3. Serializes the object using the concrete type’s traits.

Deserialization

When deserializing:

  1. Reads the "__type" key.
  2. Looks up the concrete type in the global registry.
  3. Allocates an instance of the concrete type.
  4. Deserializes using the concrete type’s traits.
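The registration and lookup steps amount to a name-to-factory registry plus a virtual method that reports the concrete type name. The following is a minimal sketch of that pattern with invented names (Animal, Dog, Cat, typeName(), registry(), registerType(), create()); it does not reproduce what the macros actually generate.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    // Stand-in for the virtual method the macro adds to each class.
    virtual std::string typeName() const = 0;
};
struct Dog : Animal { std::string typeName() const override { return "Dog"; } };
struct Cat : Animal { std::string typeName() const override { return "Cat"; } };

using Factory = std::function<std::unique_ptr<Animal>()>;

// Global registry: type name -> factory that allocates the concrete type.
inline std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> instance;
    return instance;
}

template<typename T>
bool registerType(std::string const& name) {
    registry()[name] = [] { return std::make_unique<T>(); };
    return true;
}

// Deserialization side: the value read from "__type" selects the factory.
inline std::unique_ptr<Animal> create(std::string const& name) {
    auto found = registry().find(name);
    if (found == registry().end()) {
        throw std::runtime_error("unregistered type: " + name);
    }
    return found->second();
}
```

On the serialization side, calling typeName() through a base pointer yields the string to emit under "__type"; on the deserialization side, create() turns that string back into an instance that the concrete type's traits can then populate.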

shared_ptr Deduplication

When serializing std::shared_ptr<T>:

  1. Each unique pointer address is assigned an integer ID.
  2. First occurrence: serialized as {"__id": N, ...data...}.
  3. Subsequent occurrences of the same pointer: serialized as {"__ref": N}.

On deserialization, the IDs are used to reconstruct shared ownership.
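The ID assignment can be modelled with a map keyed by pointer address. In this sketch the IdTracker class, the describe() function, the "value" key, and the choice to start IDs at 0 are all assumptions made for illustration; null pointers are not handled.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical tracker: hands out one integer ID per distinct address.
class IdTracker {
    std::map<void const*, int> ids;
public:
    // Returns the ID and whether this is the first time the address was seen.
    std::pair<int, bool> lookup(void const* address) {
        auto [position, inserted] = ids.emplace(address, static_cast<int>(ids.size()));
        return {position->second, inserted};
    }
};

// First occurrence carries the data; later occurrences only carry a reference.
inline std::string describe(IdTracker& tracker, std::shared_ptr<int> const& pointer) {
    auto [id, first] = tracker.lookup(pointer.get());
    if (first) {
        return "{\"__id\": " + std::to_string(id) + ", \"value\": " + std::to_string(*pointer) + "}";
    }
    return "{\"__ref\": " + std::to_string(id) + "}";
}
```

Keying on the raw address is what makes two shared_ptr copies of the same object collapse to a single ID, while two distinct objects with equal contents still get separate IDs.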


BSON-Specific Details

BSON requires knowing the byte size of an object before writing it (the size is part of the header). The bsonGetPrintSize() function pre-computes this by walking the traits metadata and summing field sizes.

Custom serializers must implement getPrintSizeBson() for BSON support.
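To show what such a pre-computation involves, here is a small sketch that sums sizes according to the BSON layout: a 4-byte length header, then for each field one type byte, the NUL-terminated key, and the value, followed by a single terminating zero byte. The names sketchBsonSize(), valueSize(), User, and userMembers() are invented for this example and only int32, double, and string values are covered; this is not the bsonGetPrintSize() implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <tuple>
#include <utility>

// Encoded size of a value, excluding the type byte and key.
inline std::size_t valueSize(std::int32_t)            { return 4; }
inline std::size_t valueSize(double)                  { return 8; }
// A BSON string is an int32 length, the bytes, and a trailing NUL.
inline std::size_t valueSize(std::string const& text) { return 4 + text.size() + 1; }

// Walks a tuple of (name, member-pointer) pairs and sums the encoded sizes.
template<typename T, typename Members>
std::size_t sketchBsonSize(T const& object, Members const& members) {
    std::size_t size = 4 + 1;   // int32 length header + document terminator
    std::apply([&](auto const&... member) {
        // Per field: type byte + key + NUL + value.
        ((size += 1 + std::strlen(member.first) + 1 + valueSize(object.*(member.second))), ...);
    }, members);
    return size;
}

// Invented example type and its member list.
struct User { std::int32_t age; std::string name; };

inline auto const& userMembers() {
    static auto const members = std::make_tuple(
        std::make_pair("age", &User::age), std::make_pair("name", &User::name));
    return members;
}
```

For User{30, "Ann"} the sum is 5 (header and terminator) + 9 (age) + 14 (name) = 28 bytes. Nested documents would need their own size computed first, which is why a custom serializer has to report its size separately.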


Key Design Notes