Okteta/Writing structure definitions
It is possible to define structures using either XML or JavaScript. Errors in the definition are viewable by opening the script console in Okteta.
Directory layout
Each structure definition consists of a folder containing two files. Inside this folder there must be a .desktop file (recommended name is metadata.desktop) for the metadata.
If you decide to use XML, you will need a <id>.osd file. The id is the value of the X-KDE-PluginInfo-Name entry in the metadata.
If you use JavaScript instead, you will need a file named main.js.
The metadata file
The metadata file is a standard .desktop file. It has the following entries in the [Desktop Entry] section:
Entry | Description |
---|---|
Icon (optional) | The icon that will be displayed in the configuration UI. You can use any icon name known to KDE, or alternatively an absolute filesystem path. |
Type (required) | Always use Service here. |
ServiceTypes (required) | Always use KPluginInfo here. |
Name (required) | The name of this structure. Will be displayed in the selection UI. |
Comment (optional) | A short description of this structure. Will be displayed in the selection UI. |
X-KDE-PluginInfo-Author (optional) | Your name. Will be displayed in the selection UI. |
X-KDE-PluginInfo-Email (optional) | Your email address. Will be displayed in the selection UI. |
X-KDE-PluginInfo-Name (required) | This entry will be used as the ID of this structure. |
X-KDE-PluginInfo-Version (required) | A version number for your structure. |
X-KDE-PluginInfo-Website (optional) | A website for this structure, e.g. https://kde-files.org/content/show.php/rpm+structure+definition?content=147699 |
X-KDE-PluginInfo-Category (required) | The value here must be either structure if you are writing an XML structure or structure/js if you are writing a JavaScript structure. |
X-KDE-PluginInfo-License (optional) | A license, e.g. GPLv3. |
A valid sample metadata.desktop could look as follows:
[Desktop Entry]
Icon=application-zip
Type=Service
ServiceTypes=KPluginInfo
Name=Foo files
Comment=My own custom compression format (.foo)
X-KDE-PluginInfo-Author=Foo Bar
X-KDE-PluginInfo-Email=[email protected]
X-KDE-PluginInfo-Name=compressed-file
X-KDE-PluginInfo-Version=0.1
X-KDE-PluginInfo-Website=http://www.example.org/
X-KDE-PluginInfo-Category=structure/js
X-KDE-PluginInfo-License=GPLv3
Available datatypes and their properties
Structures
A structure is a container type in which the children are read sequentially. This is analogous to the C/C++ struct type.
Property fields: list of datatypes
This property holds a list with the children of this struct. They will be read in the order they are defined.
This property is also available at runtime to allow modifying the field.
Property childCount (runtime only): unsigned integer
This read-only property holds the number of fields.
Unions
A union is a container type in which the children are read sequentially, but always from the same starting offset. This is analogous to the C/C++ union type.
Unions have the same properties as structures.
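The difference between the two container types can be illustrated outside Okteta with a plain DataView. This is only a sketch: the sample bytes, the helper names readSequential and readOverlapping, and the chosen field types are assumptions made for illustration and are not part of Okteta.

```javascript
// Four sample bytes standing in for file contents (assumed data).
const sampleBytes = new Uint8Array([0x01, 0x02, 0x03, 0x04]);
const sampleView = new DataView(sampleBytes.buffer);

// Struct-like reading: each field starts where the previous one ended.
function readSequential(view, start) {
  let offset = start;
  const a = view.getUint16(offset, false); // big-endian uint16
  offset += 2;
  const b = view.getUint16(offset, false); // next field follows directly
  offset += 2;
  return { a, b, bytesRead: offset - start };
}

// Union-like reading: every field starts at the same offset.
function readOverlapping(view, start) {
  const a = view.getUint16(start, false); // first two bytes
  const b = view.getUint32(start, false); // all four bytes, same start
  return { a, b };
}

console.log(readSequential(sampleView, 0));
console.log(readOverlapping(sampleView, 0));
```

In the sequential case the two fields cover different bytes, while in the overlapping case both fields are interpretations of the same bytes.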
Tagged unions
This datatype exists to simplify using structures that have a different layout depending on one key field. This is intended for C/C++ structures like this (example from Wikipedia):
enum ShapeKind { Square, Rectangle, Circle };
struct Shape {
    int centerx;
    int centery;
    enum ShapeKind kind;
    union {
        struct { int side; } squareData;
        struct { int length, height; } rectangleData;
        struct { int radius; } circleData;
    } shapeKindData;
};
In the structures view only the relevant type will be displayed. E.g. if kind has the value Square the structures view will display a struct Square with the fields centerx, centery, kind and side.
Property fields: list of datatypes
This is the list of fields that are common to each type. In the example, this would be
- int centerx
- int centery
- enum ShapeKind kind
Property alternatives: list of objects
This is a list of the alternatives that exist for this tagged union. Each alternative consists of three properties:
- fields: The list of fields that this alternative has
- selectIf: A function that returns true if this alternative should be selected. If there is only one field in the tagged union, an integer value is also permitted. In that case, this alternative will be selected whenever that field holds the given integer value.
- structName: The name that should be displayed for the whole structure if this alternative is selected.
fields | selectIf | structName |
---|---|---|
int side | function() { return this.kind == Square; } | "Square" |
int length, int height | function() { return this.kind == Rectangle; } | "Rectangle" |
int radius | function() { return this.kind == Circle; } | "Circle" |
Property defaultFields: list of datatypes
This property defines which fields should be displayed if none of the alternatives matched. By default, this will be none.
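The selection rule described above can be sketched in plain JavaScript. The helper selectAlternative is hypothetical and not part of Okteta; it uses plain numbers in place of runtime elements and only shows how selectIf, structName and defaultFields could interact.

```javascript
const Square = 0, Rectangle = 1, Circle = 2;

// Hypothetical helper: pick the alternative for the already-read common fields.
function selectAlternative(commonFields, alternatives, defaultFields) {
  const values = Object.values(commonFields);
  for (const alt of alternatives) {
    const matches = typeof alt.selectIf === "function"
      ? alt.selectIf.call(commonFields) // `this` is the tagged union
      // Integer shorthand: only meaningful with exactly one common field.
      : values.length === 1 && values[0] === alt.selectIf;
    if (matches) {
      return { structName: alt.structName, fields: alt.fields };
    }
  }
  // No alternative matched: fall back to defaultFields (none by default).
  return { structName: undefined, fields: defaultFields || {} };
}

const shapeAlternatives = [
  { selectIf: function () { return this.kind == Square; },
    fields: { side: "int" }, structName: "Square" },
  { selectIf: function () { return this.kind == Rectangle; },
    fields: { length: "int", height: "int" }, structName: "Rectangle" },
  { selectIf: function () { return this.kind == Circle; },
    fields: { radius: "int" }, structName: "Circle" }
];

const chosen = selectAlternative(
  { centerx: 3, centery: 4, kind: Rectangle }, shapeAlternatives);
console.log(chosen.structName, Object.keys(chosen.fields));
```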
Arrays
A collection of elements which have the same type. This is analogous to the C/C++ array concept.
Property type: datatype
The type of this array. Can be any other element, even another array.
Property length: unsigned integer or function
Holds the length of the array. Can be either a fixed number or a JavaScript function that returns a number. If set to a JavaScript function, this function will be called every time before the array is read, and the array length is set to its return value. Since arrays are limited to 10000 elements, return values larger than that are reduced to the maximum. Example:
function() { return this.parent.datalen.value }
A shorthand is also available: you can specify the name of another element (which must be a primitive type such as an integer, pointer, enum or flags).
When reading at runtime it will always return the current length as a number. There is no way to obtain the length function dynamically at runtime.
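The three ways of specifying a length, together with the upper limit, can be sketched as follows. The helper resolveArrayLength is hypothetical, and looking up the shorthand name among the siblings of the array is an assumption made for this sketch; Okteta's actual lookup may differ.

```javascript
const MAX_ARRAY_LENGTH = 10000; // limit stated in the text above

// Hypothetical helper: turn a length specification into a concrete length.
// `element` plays the role of `this` inside a length function.
function resolveArrayLength(lengthSpec, element) {
  let length;
  if (typeof lengthSpec === "function") {
    length = lengthSpec.call(element);
  } else if (typeof lengthSpec === "string") {
    // Shorthand: name of another element (sibling lookup is an assumption).
    length = element.parent[lengthSpec].value;
  } else {
    length = lengthSpec; // fixed number
  }
  return Math.min(length, MAX_ARRAY_LENGTH);
}

// Plain object standing in for an array element whose sibling datalen holds 20000.
const arrayElement = { parent: { datalen: { value: 20000 } } };

console.log(resolveArrayLength(4, arrayElement));
console.log(resolveArrayLength(function () { return this.parent.datalen.value; }, arrayElement));
console.log(resolveArrayLength("datalen", arrayElement));
```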
Strings
Represents a string with a specified encoding. By default strings will be C-style null terminated strings.
Property encoding: string
This property can be any of the following.
- ascii for a US-ASCII encoded string
- latin1 for an ISO 8859-1 encoded string
- utf-8 for a UTF-8 encoded string
- utf-16 or utf-16-le for a UTF-16 little endian encoded string
- utf-16-be for a UTF-16 big endian encoded string
- utf-32 or utf-32-le for a UTF-32 little endian encoded string
- utf-32-be for a UTF-32 big endian encoded string
Property terminatedBy: unsigned integer
This property determines the length of the string. The string extends until the current code point is equal to terminatedBy. For C-style null terminated strings set this property to zero.
Property maxByteCount: unsigned integer
Set the maximum number of bytes in this string.
Property maxCharCount: unsigned integer
Set the maximum number of characters in this string.
Property byteCount (runtime only): unsigned integer
This read-only property holds the number of bytes this string contains.
Property charCount (runtime only): unsigned integer
This read-only property holds the number of code points in this string.
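How terminatedBy and maxCharCount limit a string can be sketched on an already decoded list of code points. The helper readTerminated is hypothetical, and excluding the terminator from the returned text is an assumption of this sketch.

```javascript
// Hypothetical helper: collect code points until a stop condition is hit.
function readTerminated(codePoints, options) {
  const { terminatedBy, maxCharCount = Infinity } = options;
  const result = [];
  for (const cp of codePoints) {
    if (result.length >= maxCharCount) break; // character limit reached
    if (cp === terminatedBy) break;           // terminator ends the string
    result.push(cp);
  }
  return String.fromCodePoint(...result);
}

// "Okteta", a null terminator, then an unrelated byte.
const sampleCodePoints = [0x4f, 0x6b, 0x74, 0x65, 0x74, 0x61, 0x00, 0x21];

console.log(readTerminated(sampleCodePoints, { terminatedBy: 0 }));
console.log(readTerminated(sampleCodePoints, { terminatedBy: 0, maxCharCount: 3 }));
```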
Primitive data types
The following primitive data types are available:
- int8, int16, int32, int64: Signed integers with 8, 16, 32 or 64 bits precision
- uint8, uint16, uint32, uint64: unsigned integers with 8, 16, 32 or 64 bits precision
- bool8, bool16, bool32, bool64: A boolean value (0 is false, any other value is true)
- float: a 32 bit IEEE754 floating point number
- double: a 64 bit IEEE754 floating point number
- char: a single ASCII character (8 bits, although only values below 0x80 are valid)
Property value (runtime only): number or string
Holds the value of this element.
Bitfields
Pointers
Pointers are primitive data types (the value property is also available) that also act as containers. The children of a pointer will be read at the offset equal to the value of the pointer.
Property type: datatype
The underlying primitive type of this pointer. Must be one of uint8, uint16, uint32 or uint64.
Property target: datatype
This property holds the type that is being pointed to. Can be any other element, even another pointer.
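The idea of a pointer (read a number, then read the target at the offset given by that number) can be sketched with a DataView. The buffer contents and the helper readPointer are assumptions for illustration; the sketch treats the pointer value as an absolute offset into the data.

```javascript
// Assumed sample data: a uint32 pointer at offset 0 whose value is 8,
// and a uint16 target stored at offset 8.
const pointerBuffer = new ArrayBuffer(12);
const pointerView = new DataView(pointerBuffer);
pointerView.setUint32(0, 8, true);      // pointer value, little-endian
pointerView.setUint16(8, 0xbeef, true); // pointed-to data, little-endian

// Hypothetical helper: read the pointer itself, then its target.
function readPointer(view, offset) {
  const value = view.getUint32(offset, true); // the pointer's own value
  const target = view.getUint16(value, true); // child read at offset == value
  return { value, target };
}

console.log(readPointer(pointerView, 0));
```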
Enumerations
Enumerations are a primitive type where the textual value will be displayed instead of the numeric value.
This is analogous to the C/C++ enum type.
Since enumerations are primitive types they have the same properties as all primitive data types.
Property type: string
This property must hold one of the strings int8, int16, int32, int64, uint8, uint16, uint32, uint64, since only integer type enumerations are allowed. A workaround for floating-point enumerations is interpreting the bit pattern of the corresponding floating-point value as an integer and using that for the enumeration value.
Property enumName: string
Contains the name of the underlying enumeration. This property exists so that the type column can display the name of the enumeration referred to instead of simply the string enum.
Property enumValues: map
A list of key-value pairs which is used to perform the integer value to text translation.
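The workaround for floating-point enumerations mentioned under the type property needs the integer bit pattern of each floating-point constant. A minimal sketch of how such values could be computed in plain JavaScript follows; the helper float32Bits and the names in floatEnumValues are made up for illustration.

```javascript
// Return the IEEE 754 single-precision bit pattern of x as an unsigned integer.
function float32Bits(x) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, x);    // write as a 32-bit float
  return view.getUint32(0); // read the same bytes back as uint32
}

// Hypothetical enumValues map for a uint32 enumeration over float constants.
const floatEnumValues = {
  ONE: float32Bits(1.0),
  HALF: float32Bits(0.5),
  MINUS_TWO: float32Bits(-2.0)
};

console.log(floatEnumValues.ONE.toString(16)); // 3f800000
```

The resulting numbers can then be used as the values of an enumeration whose type is uint32.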
Bitflags
Bitflags are very similar to enumerations, except that a bitwise-or of the appropriate textual values will be displayed. For flags the enumerated values will usually be single set bits (i.e. numbers that are powers of 2), but any other value is also supported. If there are enum values that completely contain the bits of other values (e.g. 7 contains 2 and 4), only the containing value will be displayed.
Example: Assuming you have an enumeration representing UNIX file access rights: R = 4, W = 2, X = 1. Then the value 7 will be displayed as R | W | X. If you add another value ALL_RIGHTS = 7 to your enumeration, the value 7 will be displayed as ALL_RIGHTS instead. This can be useful if you want to have a shorter or more readable string displayed.
Properties are the same as the properties for enumerations.
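The display rule from the file access rights example can be sketched as follows. The helper flagsToString is hypothetical and only mimics the described behaviour for values that fit into 32 bits; it is not Okteta's implementation, and the order of the displayed names is an assumption.

```javascript
// Hypothetical helper: render a flags value as text using the rule above.
function flagsToString(value, enumValues) {
  // Try larger values first so that a combined value wins over its parts.
  const entries = Object.entries(enumValues).sort((a, b) => b[1] - a[1]);
  const names = [];
  let covered = 0; // bits already explained by a chosen name
  for (const [name, bits] of entries) {
    const isSet = bits !== 0 && (value & bits) === bits;
    const addsNewBits = (bits & ~covered) !== 0;
    if (isSet && addsNewBits) {
      names.push(name);
      covered |= bits;
    }
  }
  return names.length > 0 ? names.join(" | ") : String(value);
}

const rights = { R: 4, W: 2, X: 1 };
const rightsWithAll = { R: 4, W: 2, X: 1, ALL_RIGHTS: 7 };

console.log(flagsToString(7, rights));        // R | W | X
console.log(flagsToString(7, rightsWithAll)); // ALL_RIGHTS
console.log(flagsToString(5, rightsWithAll)); // R | X
```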
Properties common to all types
Property defaultLockOffset: number
Since Okteta 0.11
Setting this property allows you to ensure that the structure is locked at that offset whenever a new file is opened. Otherwise, it will be read from the cursor position. When this property is set you can of course still unlock the structure manually from the UI.
This is mainly intended for things like file headers, which will usually start at offset 0. This way you no longer have to move the cursor to offset 0, select the structure and then press the lock button.
The property is only useful for the root element. If you put it on any other element it will be ignored. For an example look here (XML) or here (JavaScript).
Property name: string
The name of this element in the resulting structure. Note that if you name your element the same as a property you cannot access it with the normal syntax in script code. If you have an element named byteOrder you have to write parent.child("byteOrder") instead of parent.byteOrder since the latter will return the value of the byteOrder property of parent instead of the child element.
Property byteOrder: string
Set the endianness. The following values are possible:
- big-endian: Always use big endian
- little-endian: Always use little endian
- from-settings: Always use the value specified in the settings page
- inherit: Use the value of the parent element. For the root element this is equivalent to from-settings
The default value is inherit.
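The way these four values combine along the parent chain can be sketched in plain JavaScript. The helper effectiveByteOrder and the parameter settingsByteOrder (standing for the value chosen on the settings page) are hypothetical names used only in this sketch.

```javascript
// Hypothetical helper: resolve the effective byte order of an element.
function effectiveByteOrder(element, settingsByteOrder) {
  const order = element.byteOrder || "inherit"; // inherit is the default
  if (order === "big-endian" || order === "little-endian") return order;
  if (order === "from-settings") return settingsByteOrder;
  // inherit: ask the parent; the root falls back to the settings value.
  return element.parent
    ? effectiveByteOrder(element.parent, settingsByteOrder)
    : settingsByteOrder;
}

// Plain objects standing in for a small element tree.
const rootElement = { byteOrder: "big-endian" };
const headerElement = { parent: rootElement }; // inherits from the root
const fieldElement = { parent: headerElement, byteOrder: "from-settings" };

console.log(effectiveByteOrder(headerElement, "little-endian")); // big-endian
console.log(effectiveByteOrder(fieldElement, "little-endian"));  // little-endian
```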
Property updateFunc: function
A JavaScript function which gets called every time this element is read. Allows you to modify this element and its children.
This allows your structure to dynamically change its visualization depending on the data.
Since this function gets called before the data for this element is read, you can only read the values of elements that come before. Specifically, accessing this.value will not work.
See #Update_and_validation_functions
Property validationFunc: function
A JavaScript function which gets called whenever the user presses the Validate button. This function should return a boolean value or a string. If a string is returned that string will be displayed as the validation failure message. Returning true will mark the element as successfully validated, false will display a validation error without a message.
Examples:
function() { return this.value == 0x42 }
function() { if (this.value >= 0x80) return "Invalid ASCII character"; else return true; }
Property parent (runtime only): datatype
This read-only property can be used at runtime to access the parent element. Should not be read in the root element.
Property wasAbleToRead (runtime only): boolean
This property holds the value true if the value could be read, or false if end of file was reached.
Property validationError (runtime only): string
This property can be written to inside a validation function. It is useful if you want to validate more than one child without writing a validation function for each of them. Example:
function() {
    //ensure that the magic values hold 0xdeadbeef
    var valid = true;
    if (this.magic[0] != 0xde) {
        this.magic[0].validationError = "This byte must have the value 0xde";
        valid = false;
    } else if (this.magic[1] != 0xad) {
        this.magic[1].validationError = "This byte must have the value 0xad";
        valid = false;
    } else if (this.magic[2] != 0xbe) {
        this.magic[2].validationError = "This byte must have the value 0xbe";
        valid = false;
    } else if (this.magic[3] != 0xef) {
        this.magic[3].validationError = "This byte must have the value 0xef";
        valid = false;
    }
    return valid;
}
Obviously this example does not make that much sense since it would be easier to simply write
function() {
    //ensure that the magic values hold 0xdeadbeef
    if (this.magic[0] == 0xde && this.magic[1] == 0xad && this.magic[2] == 0xbe && this.magic[3] == 0xef) {
        return true;
    } else {
        return "Magic bytes must be equal to 0xde, 0xad, 0xbe, 0xef";
    }
}
but there may be cases where this makes sense.
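One such case is marking every wrong child instead of only the first one. The following sketch does this with a loop. It assumes that each array child exposes its number through the value property, and it uses plain objects in place of the runtime elements so that it can be tried outside Okteta; the names validateMagic and mockHeader are made up for this example.

```javascript
// Validation function sketch: flag every magic byte that is wrong.
// Assumes each array child exposes its number through .value.
function validateMagic() {
  var expected = [0xde, 0xad, 0xbe, 0xef];
  var valid = true;
  for (var i = 0; i < expected.length; i++) {
    if (this.magic[i].value != expected[i]) {
      this.magic[i].validationError =
        "This byte must have the value 0x" + expected[i].toString(16);
      valid = false;
    }
  }
  return valid;
}

// Plain objects standing in for the runtime elements (second and fourth byte are wrong).
var mockHeader = {
  magic: [{ value: 0xde }, { value: 0x00 }, { value: 0xbe }, { value: 0x01 }]
};
var magicIsValid = validateMagic.call(mockHeader);
console.log(magicIsValid, mockHeader.magic[1].validationError);
```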
XML structures
An .osd file is an XML file with <data> as the root element. A sample may look as follows:
<?xml version="1.0" encoding="UTF-8"?>
<data>
    <struct name="pngHeader">
        <array name="signature" length="12">
            <primitive name="val" type="char" />
        </array>
        <struct name="ImageHeader">
            <array name="signature" length="4">
                <primitive name="val" type="char" />
            </array>
            <primitive name="width" type="uint32" />
            <primitive name="height" type="uint32" />
            <primitive name="bitDepth" type="uint8" />
            <primitive name="colourType" type="uint8" />
            <primitive name="compressionMethod" type="uint8" />
            <primitive name="filterMethod" type="uint8" />
            <primitive name="interlaceMethod" type="uint8" />
        </struct>
    </struct>
</data>
Type elements
The types are represented in .osd files with the following XML elements:
- <struct> for #Structures
- <union> for #Unions
- <taggedUnion> for #Tagged_unions
- <array> for #Arrays
- <string> for #Strings
- <primitive> for #Primitive_data_types. The XML type attribute maps to the primitive type identifiers.
- <bitfield> for #Bitfields
- <pointer> for #Pointers
- <enum> for #Enumerations
- <flags> for #Bitflags
Properties may be specified either as an XML attribute or as an XML child element (mostly useful for properties which contain longer text and linebreaks, e.g. updateFunc). The following two declarations are equivalent:
<primitive name="foo" type="uint32" />
and
<primitive>
    <name>foo</name>
    <type>uint32</type>
</primitive>
Obviously, properties which require a list or a datatype cannot be used as an XML attribute but must use XML elements instead.
Special rules
In general the properties in the datatypes section will map one-to-one to XML attributes/elements. The following exceptions exist:
<struct>, <union>, <taggedUnion>
All subelements that are valid type elements are added to the fields property. A <fields> element will be ignored.
<array>
To reduce verbosity the type attribute may be omitted. If not specified the first child element with a valid type will be used instead.
I.e. the following two declarations are equivalent:
<array name="signature" length="4">
    <primitive name="val" type="char" />
</array>
and
<array name="signature" length="4">
    <type>
        <primitive name="val" type="char" />
    </type>
</array>
As of Okteta 0.11 it is possible to write e.g. <array type="uint8"> as a further shorthand. This works for all primitive type strings.
<pointer>
A shorthand for using the type property exists. Instead of needing a child <type> element, it is also possible to write <pointer type="uint32" />. Remember that only unsigned integer types are allowed.
To reduce verbosity the target attribute may be omitted. If no <target> element exists, the first child element with a valid type will be used instead.
<enum>, <flags>
The enumName and enumValues properties do not map directly to XML. Instead there is an XML attribute enum which references an <enumDef> element defined somewhere below the <data> element. This is so that the enum values can be reused by other <enum> elements.
The enumName property maps to the attribute name of the <enumDef>.
The enumValues property maps to the list of <entry> elements in the <enumDef>.
Example:
<data>
    <enumDef name="numbers" type="uint8">
        <entry name="ONE" value="1" />
        <entry name="TWO" value="2" />
        <entry name="THREE" value="3" />
        <entry name="FOUR" value="4" />
    </enumDef>
    <array length="4" name="enums">
        <enum name="number" enum="numbers" type="uint8" />
    </array>
</data>
<primitive>
As of Okteta 0.11 it is possible to write e.g. <uint8 /> or <float /> instead of <primitive type="..." />.
JavaScript structures
A JavaScript structure definition must contain a function called init which returns an object that can be converted to one of the datatypes.
The sample from the #XML_structures section would look as follows in JavaScript:
function init() {
    var obj = struct({
        signature: array(char(), 12),
        ImageHeader: struct({
            signature: array(char(), 4),
            width: uint32(),
            height: uint32(),
            bitDepth: uint8(),    //uint8 as in the XML sample
            colourType: uint8(),  //uint8 as in the XML sample
            compressionMethod: uint8(),
            filterMethod: uint8(),
            interlaceMethod: uint8()
        })
    });
    obj.name = "pngHeader";
    return obj;
}
Type functions
The following functions are available to create data types:
struct(fields)
for #Structures
union(fields)
for #Unions
taggedUnion(fields, alternatives, defaultFields)
for #Tagged_unions
The parameters fields and defaultFields are the same as the fields parameter of struct() or union(). The parameter alternatives is a JavaScript list of objects that have the properties fields, selectIf and optionally structName. To simplify creating these objects, a function alternative(selectIf, fields, structName) exists (the third parameter is optional).
The example from the #Tagged_unions section would look as follows in JavaScript:
var shapeKinds = { Square: 0, Rectangle: 1, Circle: 2 };
var shapes = taggedUnion(
    {
        centerx: int32(),
        centery: int32(),
        kind: enumeration("ShapeKind", int32(), shapeKinds)
    },
    [
        alternative(
            function() { return this.kind == shapeKinds.Square; },
            { side: int32() },
            "Square"
        ),
        alternative(
            function() { return this.kind == shapeKinds.Rectangle; },
            { length: int32(), height: int32() },
            "Rectangle"
        ),
        alternative(
            function() { return this.kind == shapeKinds.Circle; },
            { radius: int32() },
            "Circle"
        )
    ]
);
array(type, length)
for #Arrays
string(encoding)
for #Strings
primitive types
To create each of the #Primitive_data_types a function with the same name exists, e.g. uint8(), int32(), float(), char(), etc.
bitfield(type, width)
for #Bitfields
pointer(type, target)
for #Pointers
enumeration(enumName, type, enumValues)
for #Enumerations. Using enum is not possible since this is a JavaScript reserved keyword. Enum values can be any JavaScript object that can be interpreted as a string-number map. For example:
var enumValues = { RED: 1, GREEN: "2", BLUE: "0xffeeddccbbaa9988" };
flags(enumName, type, enumValues)
for #Bitflags
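Several of these functions can be combined, for example to describe a record whose data length is stored in a preceding field. The following is only a sketch: the four stand-in definitions at the top are assumptions that exist solely so the snippet also runs outside Okteta (they should be left out of a real main.js), and the structure name lengthPrefixedRecord is made up.

```javascript
// Stand-ins (assumptions) so the sketch runs outside Okteta, where these
// type functions are not defined. Leave them out of a real main.js.
var uint8 = function () { return { type: "uint8" }; };
var uint16 = function () { return { type: "uint16" }; };
var array = function (type, length) { return { type: type, length: length }; };
var struct = function (fields) { return { fields: fields }; };

function init() {
  var record = struct({
    datalen: uint16(),
    // Length function as in the Arrays section: it is called before the
    // array is read and returns the value of the preceding datalen field.
    data: array(uint8(), function () { return this.parent.datalen.value; })
  });
  record.name = "lengthPrefixedRecord"; // hypothetical name
  return record;
}
```

Because datalen is defined before data, its value has already been read when the length function runs.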
Setting properties
To set the additional properties that are not set by these functions you can either set them using standard JavaScript property assignment or alternatively there is also a set() function defined on all the returned objects:
var foo = string("utf8");
foo.maxByteCount = 12;
foo.validationFunc = function() { ... };
//also possible like this:
var foo2 = string("utf8").set({
    maxByteCount: 12,
    validationFunc: function() { ... }
});
//this syntax is mainly useful in inline expressions (the more deeply nested your structure is, the more useful)
var something = struct({
    x: uint8(),
    y: uint8(),
    name: string("utf8").set({maxByteCount: 12})
});
//without the set function this would look like this:
var something2 = struct({
    x: uint8(),
    y: uint8(),
    name: string("utf8")
});
something2.fields.name.maxByteCount = 12;
There is also a setUpdate(func) and setValidation(func) function defined for all those objects to save a bit of typing:
//these 3 objects are equivalent
var x1 = array(uint8(), 12).setUpdate(function() { ... });
var x2 = array(uint8(), 12).set({
    updateFunc: function() { ... }
});
var x3 = array(uint8(), 12);
x3.updateFunc = function() { ... };
//the same here
var y1 = array(uint8(), 12).setValidation(function() { ... });
var y2 = array(uint8(), 12).set({
    validationFunc: function() { ... }
});
var y3 = array(uint8(), 12);
y3.validationFunc = function() { ... };
Update and validation functions
At runtime (during evaluation of updateFunc or validationFunc) accessing properties is slightly different than within the init() function, or the XML definition. This is due to the fact that at runtime you are accessing a JavaScript proxy to a C++ object, and at definition time it is just a plain JavaScript object.
An update function looks as follows:
function(root) {
    //modify this object
}
The argument root is optional; if you do not need it, it can be omitted. It holds the root element of the current structure so that you don't have to write this.parent.parent.... to access it.
Within that function this refers to the current element.
Examples for what is possible in an updateFunc:
function init() {
    var obj = struct({foo: uint32(), numbers: array(int16(), 10), childCount: int8() });
    //note that field name childCount is also the name of a property of struct
    obj.updateFunc = updateMyStruct; //does not have to be defined inline, can be anywhere in the file
    return obj;
}
function updateMyStruct(root) {
    var fooValue = this.foo.value; //reads the value of field foo
    var arrayLen = this.numbers.length; //10
    var wrongChildCountValue = this.childCount.value; //ERROR: cannot access childCount like this
    //this.childCount will return the number 3, which has no property named value
    //the correct way to access it is by writing
    var childCountValue = this.child("childCount").value;
    //to access fields whose name matches that of a property you must always use the .child("...") syntax
    //children of an array can be accessed using standard array syntax (starting at 0):
    var firstNumber = this.numbers[0];
    this.name = "updatedStruct";
}
Properties can be written to in the same way as within the init() function. However, an error will be logged every time you try to access a property that does not exist. This is not possible in the init function because there these objects are plain JavaScript objects. Every property read or write access in the update functions is handled by C++ code, which makes the logging of errors possible.
Examples
There are some examples in Git