User:AlexanderRichardson/Structure Definitions: Difference between revisions

    From KDE UserBase Wiki
    (add information on strings)
    (add js section)
    Line 366: Line 366:


    ==JavaScript structures==
    {{Todo|TODO}}
    A JavaScript structure definition must contain a function called init which returns an object that can be converted to one of the datatypes.
    {{Todo|0.11: list of objects}}
     
    The sample from the [[#XML_structures]] section would look as follows in JavaScript:
    <syntaxhighlight lang="javascript">
    function init() {
      var obj = struct({
        signature: array(char(), 12),
        ImageHeader: struct({
          signature: array(char(), 4),
          width: uint32(),
          height: uint32(),
          bitDepth: uint8(),
          colourType: uint8(),
          compressionMethod: uint8(),
          filterMethod: uint8(),
          interlaceMethod: uint8()
        })
      });
      obj.name = "pngHeader";
      return obj;
    }
    </syntaxhighlight>
     
    ===Type functions===
    The following functions are available to create data types:
    *'''struct(fields)''' for [[#Structures]]
    *'''union(fields)''' for [[#Unions]]
    *'''taggedUnion(fields, alternatives, defaultFields)''' for [[#Tagged_unions]]
    *'''array(type, length)''' for [[#Arrays]]
    *'''string(encoding)''' for [[#Strings]]
    *'''uint8()''', '''char()''', '''float()''', etc. for the [[#Primitive_data_types]]
    *'''bitfield(type, width)''' for [[#Bitfields]]
    *'''pointer(type, target)''' for [[#Pointers]]
    *'''enumeration(enumName, type, enumValues)''' for [[#Enumerations]]. Using ''enum'' is not possible since this is a JavaScript reserved keyword.
    *'''flags(enumName, type, enumValues)''' for [[#Bitflags]]
     
     
     
    To set additional properties that are not covered by these functions, either use standard JavaScript property assignment or call the '''set()''' function, which is defined on all the returned objects:
     
     
     
    <syntaxhighlight lang="javascript">
      var foo = string("utf8");
      foo.maxByteCount = 12;
      foo.validationFunc = function() { ... };
      //also possible like this:
      var foo2 = string("utf8").set({maxByteCount: 12, validationFunc: function() { ... }});
      //this syntax is mainly useful in inline expressions (the more deeply nested your structure is, the more useful it becomes)
      var something = struct({x: uint8(), y: uint8(), name: string("utf8").set({maxByteCount: 12})});
      //without the set function this would look like this:
      var something2 = struct({x: uint8(), y: uint8(), name: string("utf8")});
      something2.fields.name.maxByteCount = 12;
    </syntaxhighlight>
     
    There are also '''setUpdate(func)''' and '''setValidation(func)''' functions defined on all those objects to save a bit of typing:
    <syntaxhighlight lang="javascript">
      //these 3 objects are equivalent
      var x1 = array(uint8(), 12).setUpdate(function() { ... });
      var x2 = array(uint8(), 12).set({updateFunc: function() { ... }});
      var x3 = array(uint8(), 12);
      x3.updateFunc = function() { ... };
      //the same here
      var y1 = array(uint8(), 12).setValidation(function() { ... });
      var y2 = array(uint8(), 12).set({validationFunc: function() { ... }});
      var y3 = array(uint8(), 12);
      y3.validationFunc = function() { ... };
    </syntaxhighlight>
     


    ==Examples==

    Revision as of 03:18, 3 February 2013

    Writing Okteta structure definitions

    It is possible to define structures using either XML or JavaScript. Errors in the definition are viewable by opening the script console in Okteta.

    Directory layout

    Each structure definition consists of a folder containing two files. Inside this folder there must be a .desktop file (recommended name is metadata.desktop) for the metadata.

    If you decide to use XML you will need an <id>.osd file. The id is the value of the X-KDE-PluginInfo-Name entry in the metadata.

    Note

    As of Okteta 0.11 (released with KDE SC 4.11) the filename main.osd will be checked first; if it doesn't exist, <id>.osd will be used.

    If you use JavaScript instead you will need a file named main.js.

    The metadata file

    The metadata file is a standard .desktop file. It has the following entries in the [Desktop Entry] section:

    Entry details
    Encoding (required) Always use UTF-8 here
    Icon (optional) The icon that will be displayed in the configuration UI. You can use any icon name known to KDE, or alternatively an absolute filesystem path.
    Type (required) Always use Service here
    ServiceTypes (required) Always use KPluginInfo here
    Name (required) The name of this structure. Will be displayed in the selection UI.
    Comment (optional) A short description of this structure. Will be displayed in the selection UI.
    X-KDE-PluginInfo-Author (optional) Your name. Will be displayed in the selection UI.
    X-KDE-PluginInfo-Email (optional) Your email address. Will be displayed in the selection UI.
    X-KDE-PluginInfo-Name (required) This entry will be used as the ID of this structure
    X-KDE-PluginInfo-Version (required) A version number for your structure.
    X-KDE-PluginInfo-Website (optional) A website for this structure like e.g. http://kde-files.org/content/show.php/rpm+structure+definition?content=147699
    X-KDE-PluginInfo-Category (required) The value here must be either structure if you are writing an XML structure or structure/js if you are writing a JavaScript structure
    X-KDE-PluginInfo-License (optional) A license like e.g. GPLv3


    A valid sample metadata.desktop could look as follows:

    [Desktop Entry]
    Encoding=UTF-8
    Icon=application-zip
    Type=Service
    ServiceTypes=KPluginInfo
    
    Name=Foo files
    Comment=My own custom compression format (.foo)
    
    X-KDE-PluginInfo-Author=Foo Bar
    X-KDE-PluginInfo-Email=[email protected]
    X-KDE-PluginInfo-Name=compressed-file
    X-KDE-PluginInfo-Version=0.1
    X-KDE-PluginInfo-Website=http://www.example.org/
    X-KDE-PluginInfo-Category=structure/js
    X-KDE-PluginInfo-License=GPLv3
    

    Available datatypes and their properties

    Structures

    A structure is a container type in which the children are read sequentially. This is analogous to the C/C++ struct type.

    Property fields: list of datatypes

    This property holds a list with the children of this struct. They will be read in the order they are defined. This property is also available at runtime to allow modifying the field.

    Property childCount (runtime only): unsigned integer

    This read-only property holds the number of fields.

    Unions

    A union is a container type in which the children are read sequentially, but always from the same starting offset. This is analogous to the C/C++ union type.

    Unions have the same properties as structures
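
The difference between the two container types can be illustrated outside of Okteta with plain JavaScript. The following is only a sketch of the reading order described above, using the standard DataView API on made-up sample bytes; the helper names readAsStruct and readAsUnion are hypothetical and not part of Okteta.

```javascript
// Four sample bytes standing in for file contents (made-up data).
var bytes = new Uint8Array([0x01, 0x00, 0x02, 0x00]);
var view = new DataView(bytes.buffer);

// Struct-like reading: each field starts where the previous one ended.
function readAsStruct(view, start) {
  var a = view.getUint16(start, true);     // bytes 0-1, little endian
  var b = view.getUint16(start + 2, true); // bytes 2-3, little endian
  return { a: a, b: b };
}

// Union-like reading: every field starts at the same offset.
function readAsUnion(view, start) {
  var a = view.getUint16(start, true); // bytes 0-1, little endian
  var b = view.getUint8(start);        // byte 0 again
  return { a: a, b: b };
}

var structResult = readAsStruct(view, 0);
var unionResult = readAsUnion(view, 0);
```

In the struct case the two fields hold different bytes, while in the union case both fields are interpretations of the same leading bytes.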

    Tagged unions


    TODO



    Arrays

    A collection of elements which have the same type. This is analogous to the C/C++ array concept.

    Note

    Array length is limited to 10000, since larger arrays would use too much memory. If this is a problem for your file format please file a bug report.


    Property type: datatype

    The type of this array. Can be any other element, even another array.

    Property length: unsigned integer or function

    Holds the length of the array. Can be either a fixed number or a JavaScript function that returns a number. If set to a JavaScript function, the function is called every time before the array is read, and the array length is set to its return value. Since arrays are limited to 10000 elements, return values larger than that are set to the maximum. Example:

    function() { return this.parent.datalen.value }
    

    A shorthand is also available: You can specify the name of another element (must be a primitive type like integers, pointers, enums, flags)

    Note

    Referencing the name of another element is only available in JavaScript as of Okteta 0.11; XML has always supported it. For older versions the function() { return ... } syntax must be used.


    When reading at runtime it will always return the current length as a number. There is no way to obtain the length function dynamically at runtime.
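
A length function can also be written defensively. The sketch below assumes a sibling field named datalen (as in the example above) and uses the wasAbleToRead and value properties described on this page. The mock object only stands in for what Okteta would pass as this, so the function can be tried in any JavaScript interpreter.

```javascript
// Length function for an array whose element count is stored in an
// earlier sibling field called "datalen" (an assumed field name).
function dataLength() {
  var lenField = this.parent.datalen;
  // If the length field lies beyond the end of the file, read nothing.
  if (!lenField.wasAbleToRead) {
    return 0;
  }
  // Stay within the documented limit of 10000 elements.
  return Math.min(lenField.value, 10000);
}

// Mock objects standing in for the array element (not Okteta objects).
var mockArray = { parent: { datalen: { value: 25000, wasAbleToRead: true } } };
var mockTruncated = { parent: { datalen: { value: 0, wasAbleToRead: false } } };

var clampedLength = dataLength.call(mockArray);
var truncatedLength = dataLength.call(mockTruncated);
```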

    Warning

    When writing to this property currently only unsigned integers are accepted. The only way to change this function is to replace the array with a new array with a different function. This will be fixed in Okteta 0.11.


    Strings

    Represents a string with a specified encoding. By default strings will be C-style null terminated strings.

    Property encoding: string

    This property can be any of the following.

    • ascii for an ASCII encoded string
    • latin1 for an ISO 8859-1 encoded string
    • utf-8 for a UTF-8 encoded string
    • utf-16 or utf-16-le for a UTF-16 little endian encoded string
    • utf-16-be for a UTF-16 big endian encoded string
    • utf-32 or utf-32-le for a UTF-32 little endian encoded string
    • utf-32-be for a UTF-32 big endian encoded string

    Note

    The hyphens may be omitted. I.e. utf16le is the same as utf-16-le


    Property terminatedBy: unsigned integer

    This property determines the length of the string. The string extends until the current code point is equal to terminatedBy. For C-style null terminated strings set this property to zero.

    Property maxByteCount: unsigned integer

    Set the maximum number of bytes in this string.

    Note

    For UTF-16, maxByteCount is not necessarily equal to 2 * maxCharCount, since characters outside the Basic Multilingual Plane are encoded as surrogate pairs (four bytes).

    Property maxCharCount: unsigned integer

    Set the maximum number of characters (code points) in this string.

    Property byteCount (runtime only): unsigned integer

    This read-only property holds the number of bytes this string contains.

    Property charCount (runtime only): unsigned integer

    This read-only property holds the number of code points in this string.
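
The difference between byte counts and character counts for UTF-16 can be checked in plain JavaScript, whose strings are sequences of UTF-16 code units. This is only an illustration of the encoding and does not use any Okteta API.

```javascript
// "A" followed by U+1F600, a code point outside the Basic Multilingual Plane.
var text = "A" + String.fromCodePoint(0x1F600);

var codeUnits = text.length;              // UTF-16 code units (surrogates count separately)
var codePoints = Array.from(text).length; // code points, comparable to charCount
var utf16Bytes = codeUnits * 2;           // bytes in UTF-16, comparable to byteCount
```

Here two code points occupy six bytes, which is why a byte limit cannot be derived from a character limit by simply doubling it.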

    Primitive data types

    The following primitive data types are available:

    • int8, int16, int32, int64: signed integers with 8, 16, 32 or 64 bits precision
    • uint8, uint16, uint32, uint64: unsigned integers with 8, 16, 32 or 64 bits precision
    • bool8, bool16, bool32, bool64: a boolean value (0 is false, any other value is true)
    • float: a 32 bit IEEE754 floating point number
    • double: a 64 bit IEEE754 floating point number
    • char: a single ASCII character (8 bits, although only values below 0x80 are valid)

    Property value (runtime only): number or string

    Holds the value of this element.

    Warning

    Due to JavaScript limitations (every number is stored as a 64 bit floating point number), integers larger than 2^53 cannot always be represented exactly. Therefore decimal strings are used for all 64 bit values. This means you should always use strings in comparisons with 64 bit values, e.g.: if (this.value == "9007199254740993")
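
The precision problem is easy to reproduce in any JavaScript interpreter. The sketch below uses a mock object in place of a real 64 bit element; only the value property is taken from this page.

```javascript
// As a number literal, 2^53 + 1 is silently rounded to 2^53.
var asNumber = 9007199254740993;
var collides = (asNumber === 9007199254740992);

// Mock of a 64 bit element whose value is delivered as a decimal string.
var mockField = { value: "9007199254740993" };

// Comparing strings keeps every digit intact.
var matchesExact = (mockField.value === "9007199254740993");
var matchesNeighbour = (mockField.value === "9007199254740992");
```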


    Bitfields


    TODO


    Pointers

    Pointers are primitive data types (the value property is also available) that also act as containers. The children of a pointer will be read at the offset equal to the value of the pointer.

    Note

    At the moment (Okteta 0.10) only absolute pointers are supported. Relative pointers will be available in Okteta 0.11.

    Property type: datatype

    The underlying primitive type of this pointer. Must be one of uint8, uint16, uint32 or uint64.

    Property target: datatype

    This property holds the type that is being pointed to. Can be any other element, even another pointer.
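
What reading a pointer amounts to can be sketched with the standard DataView API: first the pointer value is read, then the target is read at the absolute offset given by that value. The sample bytes and the helper name followPointer are made up for illustration and are not part of Okteta.

```javascript
// Sample data: a uint32 pointer at offset 0 holding the value 8,
// and a uint16 target (0xBEEF, little endian) stored at offset 8.
var bytes = new Uint8Array([0x08, 0, 0, 0, 0, 0, 0, 0, 0xEF, 0xBE, 0, 0]);
var view = new DataView(bytes.buffer);

function followPointer(view, pointerOffset) {
  // Read the pointer itself (type: uint32).
  var address = view.getUint32(pointerOffset, true);
  // Read the target (here a uint16) at the absolute offset it points to.
  var target = view.getUint16(address, true);
  return { address: address, target: target };
}

var resolved = followPointer(view, 0);
```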

    Enumerations

    Enumerations are a primitive type where the textual value will be displayed instead of the numeric value. This is analogous to the C/C++ enum type. Since enumerations are primitive types they have the same properties as all primitive data types.

    Note

    The value property holds the numeric value, not the textual one.

    Property type: string

    This property must hold one of the strings int8, int16, int32, int64, uint8, uint16, uint32, uint64, since only integer type enumerations are allowed. A workaround for floating-point enumerations is interpreting the bit pattern of the corresponding floating-point value as an integer and using that for the enumeration value.
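
The integer values needed for the floating-point workaround can be computed with the standard DataView API. This is a sketch only; the helper name floatBits and the entry names ONE and HALF are chosen for illustration.

```javascript
// Returns the IEEE754 bit pattern of a 32 bit float as an unsigned integer.
function floatBits(value) {
  var view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value); // written big endian (the DataView default)
  return view.getUint32(0);  // read back with the same byte order
}

// Key-value pairs that could serve as enumValues for a uint32 enumeration.
var floatEnumValues = {
  ONE: floatBits(1.0),  // 0x3f800000
  HALF: floatBits(0.5)  // 0x3f000000
};
```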

    Property enumName: string

    Contains the name of the underlying enumeration. This property exists so that the type column can display the name of the enumeration referred to instead of simply the string enum.

    Property enumValues: map

    A list of key-value pairs which is used to translate the integer value to text.

    Bitflags

    Bitflags are very similar to enumerations, except that a bitwise-or of the appropriate textual values will be displayed. For flags the enumerated values will usually be single set bits (i.e. numbers that are powers of 2), but any other value is also supported. If there are enum values that completely contain the bits of other values (e.g. 7 contains 2 and 4) only that value will be displayed.

    Example: Assuming you have an enumeration representing UNIX file access rights: R = 4, W = 2, X = 1. Then the value 7 will be displayed as R | W | X. If you add another value ALL_RIGHTS = 7 to your enumeration, the value 7 will be displayed as ALL_RIGHTS instead. This can be useful if you want to have a shorter or more readable string displayed.

    Properties are the same as the properties for enumerations.
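
The display rule described above can be reproduced with a few lines of JavaScript. This is a sketch of the documented behaviour, not Okteta's actual implementation; the function name describeFlags is hypothetical.

```javascript
// Builds the display string for a flags value. Larger values are tried first,
// so an entry that contains the bits of other entries takes precedence.
function describeFlags(value, enumValues) {
  var names = Object.keys(enumValues).sort(function (a, b) {
    return enumValues[b] - enumValues[a];
  });
  var remaining = value;
  var parts = [];
  names.forEach(function (name) {
    var bits = enumValues[name];
    // An entry matches only if all of its bits are still unaccounted for.
    if (bits !== 0 && (remaining & bits) === bits) {
      parts.push(name);
      remaining &= ~bits;
    }
  });
  // Bits without a matching entry are simply ignored in this sketch.
  return parts.join(" | ");
}

var plain = describeFlags(7, { R: 4, W: 2, X: 1 });
var withAll = describeFlags(7, { R: 4, W: 2, X: 1, ALL_RIGHTS: 7 });
var partial = describeFlags(6, { R: 4, W: 2, X: 1, ALL_RIGHTS: 7 });
```

With these inputs plain is "R | W | X", withAll is "ALL_RIGHTS" and partial is "R | W".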

    Properties common to all types

    Property name: string

    The name of this element in the resulting structure. Note that if you give your element the same name as a property, you cannot access it with the normal syntax in script code. If you have an element named byteOrder you have to write parent.child("byteOrder") instead of parent.byteOrder, since the latter will return the value of the byteOrder property of parent instead of the child element.

    Property byteOrder: string

    Set the endianness. The following values are possible:

    • big-endian: Always use big endian
    • little-endian: Always use little endian
    • from-settings: Always use the value specified in the settings page
    • inherit: Use the value of the parent element. For the root element this is equivalent to from-settings

    The default value is inherit.
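
What the two explicit byte orders mean for the displayed value can be shown with the standard DataView API. This is an illustration on made-up bytes and does not use any Okteta API.

```javascript
// The same four bytes interpreted as a uint32 with both byte orders.
var bytes = new Uint8Array([0x12, 0x34, 0x56, 0x78]);
var view = new DataView(bytes.buffer);

var bigEndianValue = view.getUint32(0, false);   // most significant byte first
var littleEndianValue = view.getUint32(0, true); // least significant byte first
```

Read as big endian the bytes yield 0x12345678, read as little endian they yield 0x78563412.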

    Property updateFunc: function

    A JavaScript function which gets called every time this element is read. Allows you to modify this element and its children. This allows your structure to dynamically change its visualization depending on the data. Since this function gets called before the data for this element is read you can only read the values of elements that come before. Specifically accessing this.value will not work.
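
As a sketch of what such an update function could look like, the following switches the byte order of an element depending on a field that was read earlier. The field name isBigEndian is an assumption made for this example, and the mock object merely stands in for the element Okteta would pass as this.

```javascript
// Hypothetical update function: the parent is assumed to contain an earlier
// uint8 field named "isBigEndian" (not something defined by Okteta).
function updateByteOrder() {
  // Only fields that come before this element have been read at this point.
  var flag = this.parent.isBigEndian.value;
  this.byteOrder = (flag !== 0) ? "big-endian" : "little-endian";
}

// Mock element used to try the function outside of Okteta.
var mockElement = {
  byteOrder: "inherit",
  parent: { isBigEndian: { value: 1 } }
};
updateByteOrder.call(mockElement);
```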

    Property validationFunc: function

    A JavaScript function which gets called whenever the user presses the Validate button. This function should return a boolean value or a string. If a string is returned, that string will be displayed as the validation failure message. Returning true will mark the element as successfully validated; returning false will display a validation error without a message.

    Examples:

    function() { return this.value == 0x42 }
    
    function() { if (this.value >= 0x80) return "Invalid ASCII character"; else return true; }
    

    Warning

    When writing definitions in XML you have to escape some characters (for example < and &), since otherwise the document may be malformed.

    Property parent (runtime only): datatype

    This read-only property can be used at runtime to access the parent element. Should not be read in the root element.

    Property wasAbleToRead (runtime only): boolean

    This property holds the value true if the value could be read, or false if end of file was reached.

    Property validationError (runtime only): string

    This property can be written to inside a validation function. It is useful if you want to validate more than one child without writing a validation function for each of them. Example:

    function() {
      //ensure that the magic values hold 0xdeadbeef
      //every byte is checked separately so that each wrong byte gets its own message
      var valid = true;
      if (this.magic[0].value != 0xde) {
        this.magic[0].validationError = "This byte must have the value 0xde";
        valid = false;
      }
      if (this.magic[1].value != 0xad) {
        this.magic[1].validationError = "This byte must have the value 0xad";
        valid = false;
      }
      if (this.magic[2].value != 0xbe) {
        this.magic[2].validationError = "This byte must have the value 0xbe";
        valid = false;
      }
      if (this.magic[3].value != 0xef) {
        this.magic[3].validationError = "This byte must have the value 0xef";
        valid = false;
      }
      return valid;
    }
    

    Obviously this example does not make that much sense since it would be easier to simply write

    function() {
      //ensure that the magic values hold 0xdeadbeef
      if (this.magic[0].value == 0xde && this.magic[1].value == 0xad && this.magic[2].value == 0xbe && this.magic[3].value == 0xef) {
        return true;
      } else {
        return "Magic bytes must be equal to 0xde, 0xad, 0xbe, 0xef";
      }
    }
    

    but there may be cases where this makes sense.

    XML structures

    An .osd file is an XML file with <data> as the root element. A sample may look as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <data>
      <struct name="pngHeader">
        <array name="signature" length="12">
          <primitive name="val" type="char" />
        </array>
        <struct name="ImageHeader">
          <array name="signature" length="4">
            <primitive name="val" type="char" />
          </array>
          <primitive name="width" type="uint32" />
          <primitive name="height" type="uint32" />
          <primitive name="bitDepth" type="uint8" />
          <primitive name="colourType" type="uint8" />
          <primitive name="compressionMethod" type="uint8" />
          <primitive name="filterMethod" type="uint8" />
          <primitive name="interlaceMethod" type="uint8" />
        </struct>
      </struct>
    </data>
    

    Type elements

    The types are represented in .osd files with the following XML elements:

    Properties may be specified either as an XML attribute or as an XML child element (mostly useful for properties which contain longer text and line breaks, e.g. updateFunc). The following two declarations are equivalent:

    <primitive name="foo" type="uint32" />
    
    <primitive>
      <name>foo</name>
      <type>uint32</type>
    </primitive>
    

    Obviously properties which require a list or a datatype cannot be used as an XML attribute, but must use XML elements instead.

    Special rules

    In general the properties in the datatypes section will map one-to-one to XML attributes/elements. The following exceptions exist:

    <struct>, <union>, <taggedUnion>

    All subelements that are valid type elements are added to the fields property. A <fields> element will be ignored.

    <array>

    To reduce verbosity the type attribute may be omitted. If not specified the first child element with a valid type will be used instead.

    I.e. the following two declarations are equivalent:

    <array name="signature" length="4">
      <primitive name="val" type="char" />
    </array>
    

    and

    <array name="signature" length="4">
      <type>
        <primitive name="val" type="char" />
      </type>
    </array>
    

    As of Okteta 0.11 it is possible to write e.g. <array type="uint8"> as a further shorthand. This works for all primitive type strings.

    <pointer>

    A shorthand for using the type property exists. Instead of needing a child <type> element, it is also possible to write <pointer type="uint32" />. Remember that only unsigned integer types are allowed.

    To reduce verbosity the target attribute may be omitted. If no <target> element exists, the first child element with a valid type will be used instead.

    <enum>, <flags>

    The enumName and enumValues properties do not map directly to XML. Instead there is an XML attribute enum which references an <enumDef> element defined somewhere below the <data> element. This is so that the enum values can be reused by other <enum> elements.

    The enumName property maps to the attribute name of the <enumDef>.

    The enumValues property maps to the list of <entry> elements in the <enumDef>.

    Example:

    <data>
      <enumDef name="numbers" type="uint8">
        <entry name="ONE" value="1" />
        <entry name="TWO" value="2" />
        <entry name="THREE" value="3" />
        <entry name="FOUR" value="4" />
      </enumDef>
      <array length="4" name="enums">
        <enum name="number" enum="numbers" type="uint8" />
      </array>
    </data>
    

    Note

    The type attribute on the <enumDef> element is needed so that it can be verified that all values are within the representable range for that type. This information is pretty redundant, but is due to the way the code was originally written.


    <primitive>

    As of Okteta 0.11 it is possible to write e.g. <uint8 />, <float /> instead of <primitive type="..." />

    JavaScript structures

    A JavaScript structure definition must contain a function called init which returns an object that can be converted to one of the datatypes.


    TODO

    0.11: list of objects


    The sample from the XML structures section would look as follows in JavaScript:

    function init() {
      var obj = struct({
        signature: array(char(), 12),
        ImageHeader: struct({
          signature: array(char(), 4),
          width: uint32(),
          height: uint32(),
          bitDepth: uint8(),
          colourType: uint8(),
          compressionMethod: uint8(),
          filterMethod: uint8(),
          interlaceMethod: uint8()
        })
      });
      obj.name = "pngHeader";
      return obj;
    }
    

    Type functions

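    The following functions are available to create data types:

    • struct(fields) for Structures
    • union(fields) for Unions
    • taggedUnion(fields, alternatives, defaultFields) for Tagged unions
    • array(type, length) for Arrays
    • string(encoding) for Strings
    • uint8(), char(), float(), etc. for the Primitive data types
    • bitfield(type, width) for Bitfields
    • pointer(type, target) for Pointers
    • enumeration(enumName, type, enumValues) for Enumerations. Using enum is not possible since this is a JavaScript reserved keyword.
    • flags(enumName, type, enumValues) for Bitflags
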
    To set additional properties that are not covered by these functions, either use standard JavaScript property assignment or call the set() function, which is defined on all the returned objects:


      var foo = string("utf8");
      foo.maxByteCount = 12;
      foo.validationFunc = function() { ... };
      //also possible like this:
      var foo2 = string("utf8").set({maxByteCount: 12, validationFunc: function() { ... }});
      //this syntax is mainly useful in inline expressions (the more deeply nested your structure is, the more useful it becomes)
      var something = struct({x: uint8(), y: uint8(), name: string("utf8").set({maxByteCount: 12})});
      //without the set function this would look like this:
      var something2 = struct({x: uint8(), y: uint8(), name: string("utf8")});
      something2.fields.name.maxByteCount = 12;
    

    There are also setUpdate(func) and setValidation(func) functions defined on all those objects to save a bit of typing:

      //these 3 objects are equivalent
      var x1 = array(uint8(), 12).setUpdate(function() { ... });
      var x2 = array(uint8(), 12).set({updateFunc: function() { ... }});
      var x3 = array(uint8(), 12);
      x3.updateFunc = function() { ... };
      //the same here
      var y1 = array(uint8(), 12).setValidation(function() { ... });
      var y2 = array(uint8(), 12).set({validationFunc: function() { ... }});
      var y3 = array(uint8(), 12);
      y3.validationFunc = function() { ... };
    


    Examples

    There are some examples in SVN


    TODO

    Commented Examples on wiki