--[ Yuefei ]--
--[ 2007-05-22 09:25 ]--

C data structure marshalling/unmarshalling tool. Generates code and
headers from a structure declaration.

yuefei takes a structure declaration as its input, along with type
information. It generates C struct definitions on its output along with
routines for packing and unpacking them.

yuefei assumes that sizeof(char) is 1, that a char is a byte, and that a
byte is eight bits. It also assumes that the C library for the target
provides the intN_t and u_intN_t types, which are used by the standard
type library.

Each input file can declare a preamble and postamble for each of the
header and implementation. These are placed explicitly first and last in
the files. You may do this to declare, for example, company standard
headers and footers.

Each type definition is of the form:

    type "yuefeitypename" {
        ctype = "C type name",
        width = widthinbytes,
        unpack = [[ code to unpack ]],
        pack = [[ code to pack ]],
    }

If the pack/unpack methods differ for big/little storage then you can say
packle and packbe (and likewise unpackle and unpackbe). If the structure
does not specify little/big then the generic pack/unpack will be used
instead.

yuefei will produce both le and be variants of the pack and unpack
routines and will also provide defines to one or the other based on the
target endianness, which is determined at build time.

You can declare {pre,post}{pack,unpack}{,le,be} statements which will be
applied as appropriate. The end{pack,unpack}{,le,be} statements will be
applied right at the end of the appropriate operation.

The {header,impl}{pre,post}amble statements will be included at the top
and bottom of the generated C output. There is no ordering guarantee on
these. They will only be included once per type used in the output. You
might use a preamble to include necessary headers, for example.
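As a rough illustration of the le/be variants and the build-time defines
described above, the generated C for a 16-bit unsigned type might look
something like the sketch below. This is not yuefei's actual output: the
routine names (pack_uint16_le and friends), the int return convention, the
use of uint16_t from <stdint.h> in place of u_int16_t, and the reliance on
the GCC/Clang __BYTE_ORDER__ macros are all assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned char byte; /* the text above defines byte as unsigned char */

/* Hypothetical names throughout; zero is returned for success. */

/* Little-endian variant: least significant byte first. */
int pack_uint16_le(byte *buffer, uint16_t input)
{
    assert(buffer != NULL);
    buffer[0] = input & 0xFF;
    buffer[1] = (input >> 8) & 0xFF;
    return 0;
}

int unpack_uint16_le(const byte *buffer, uint16_t *result)
{
    assert(buffer != NULL && result != NULL);
    *result = (uint16_t)(buffer[0] | (buffer[1] << 8));
    return 0;
}

/* Big-endian variant: most significant byte first. */
int pack_uint16_be(byte *buffer, uint16_t input)
{
    assert(buffer != NULL);
    buffer[0] = (input >> 8) & 0xFF;
    buffer[1] = input & 0xFF;
    return 0;
}

int unpack_uint16_be(const byte *buffer, uint16_t *result)
{
    assert(buffer != NULL && result != NULL);
    *result = (uint16_t)((buffer[0] << 8) | buffer[1]);
    return 0;
}

/* Pick the native variant at build time (assumes GCC/Clang macros). */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define pack_uint16   pack_uint16_be
#define unpack_uint16 unpack_uint16_be
#else
#define pack_uint16   pack_uint16_le
#define unpack_uint16 unpack_uint16_le
#endif
```

A caller that wants a fixed wire format would use the explicit _le or _be
routine; the unsuffixed define is only appropriate when the buffer is in
the target's native byte order.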

In unpack you have:

    RESULT    Replaced by the expression where this value goes
    BUFFER    Replaced by the expression of the current input buffer
              position

In pack you have:

    BUFFER    Replaced by the expression of the current output buffer
              position
    INPUT     Replaced by the expression of the value to be packed

BUFFER is always of the type 'byte *' and INPUT/RESULT are always of the
ctype of the element. (byte is unsigned char)

In both, the token MEMBER will be replaced with the name of the member
being packed/unpacked as a human comprehensible string. MEMBERPATH will be
replaced by a string which represents the member as a path from the top of
the structure being packed/unpacked.

Support routines for sign extension and the like can also be used. A full
list of them can be found later.

In any of the statements you can call ERROR. ERROR takes an integer value
which will be returned from the routine. Note that zero is considered
success, so do not put ERROR(0) unless you really *REALLY* mean it. If you
wish to assign an error string then yuefei provides the following:

    ERROR_SET("some string");
    ERROR_CAT("some string");

The form ARG(foo) gets replaced by the value of foo in the instance or
type declaration.

The form BUFPOS(foo) evaluates to a pointer to the location of member foo
in the buffer. Indeed, BUFFER is an expansion of BUFPOS(ARG(name)).

The form BUFEND(foo) evaluates to a pointer to the byte beyond the end of
member foo in the buffer. Thus BUFEND(foo) - BUFPOS(foo) is the number of
bytes member foo takes up in the stream.

These macros are only guaranteed to work if all the members before the one
referenced have been expanded or are fixed in size. Thus if you rely on
things which come after this specific value in the structure, you should
ensure that you only use these macros in the "endunpack" or "endpack"
statements.
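To make the BUFPOS/BUFEND arithmetic concrete, here is a minimal sketch
for a structure whose members are all fixed in size. The layout (members
a, b and c of 2, 2 and 4 bytes), the enum, and the bufpos/bufend functions
are invented for this example; how yuefei really expands the macros is not
shown in this document.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;

/* Hypothetical layout: a is 2 bytes, b is 2 bytes, c is 4 bytes. */
enum member { MEMBER_A, MEMBER_B, MEMBER_C, MEMBER_COUNT };

static const size_t member_width[MEMBER_COUNT] = { 2, 2, 4 };

/* Pointer to the first byte of member m: skip every earlier member. */
byte *bufpos(byte *base, enum member m)
{
    size_t offset = 0;
    int i;

    assert(base != NULL && m < MEMBER_COUNT);
    for (i = 0; i < (int)m; i++)
        offset += member_width[i];
    return base + offset;
}

/* Pointer to the byte just beyond the end of member m. */
byte *bufend(byte *base, enum member m)
{
    return bufpos(base, m) + member_width[m];
}
```

With this layout, bufend(buf, MEMBER_B) - bufpos(buf, MEMBER_B) is 2, and
bufend(buf, MEMBER_C) - bufpos(buf, MEMBER_A) is 8, the length of the whole
run from a to c. The sketch only works because every width is known up
front, which mirrors the restriction stated above.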

Here are the trivial integer types for 8, 16, 24 and 32 bits, signed and
unsigned:

    type "uint8" {
        ctype = "u_int8_t",
        width = 1,
        unpack = [[ RESULT = *BUFFER; ]],
        pack = [[ *BUFFER = INPUT; ]]
    }

    type "uint16" {
        ctype = "u_int16_t",
        width = 2,
        unpackle = [[ RESULT = *BUFFER | (*(BUFFER + 1) << 8); ]],
        packle = [[ *BUFFER = INPUT & 0xFF;
                    *(BUFFER + 1) = (INPUT >> 8) & 0xFF; ]],
        unpackbe = [[ RESULT = (*BUFFER << 8) | *(BUFFER + 1); ]],
        packbe = [[ *BUFFER = (INPUT >> 8) & 0xFF;
                    *(BUFFER + 1) = INPUT & 0xFF; ]]
    }

    type "uint24" {
        ctype = "u_int32_t",
        width = 3,
        unpackle = [[ RESULT = BUFFER[0] | (BUFFER[1] << 8) |
                               (BUFFER[2] << 16); ]],
        packle = [[ BUFFER[0] = INPUT & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = (INPUT >> 16) & 0xFF; ]],
        unpackbe = [[ RESULT = BUFFER[2] | (BUFFER[1] << 8) |
                               (BUFFER[0] << 16); ]],
        packbe = [[ BUFFER[0] = (INPUT >> 16) & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = INPUT & 0xFF; ]]
    }

    type "uint32" {
        ctype = "u_int32_t",
        width = 4,
        unpackle = [[ RESULT = BUFFER[0] | (BUFFER[1] << 8) |
                               (BUFFER[2] << 16) | (BUFFER[3] << 24); ]],
        packle = [[ BUFFER[0] = INPUT & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = (INPUT >> 16) & 0xFF;
                    BUFFER[3] = (INPUT >> 24) & 0xFF; ]],
        unpackbe = [[ RESULT = BUFFER[3] | (BUFFER[2] << 8) |
                               (BUFFER[1] << 16) | (BUFFER[0] << 24); ]],
        packbe = [[ BUFFER[0] = (INPUT >> 24) & 0xFF;
                    BUFFER[1] = (INPUT >> 16) & 0xFF;
                    BUFFER[2] = (INPUT >> 8) & 0xFF;
                    BUFFER[3] = INPUT & 0xFF; ]]
    }

    type "sint8" {
        ctype = "int8_t",
        width = 1,
        unpack = [[ RESULT = (int8_t)(*BUFFER); ]],
        pack = [[ *BUFFER = (byte)INPUT; ]]
    }

    type "sint16" {
        ctype = "int16_t",
        width = 2,
        unpackle = [[ RESULT = sign_extend(*BUFFER | (*(BUFFER + 1) << 8),
                                           16, sizeof(int16_t) * 8); ]],
        packle = [[ *BUFFER = INPUT & 0xFF;
                    *(BUFFER + 1) = (INPUT >> 8) & 0xFF; ]],
        unpackbe = [[ RESULT = sign_extend((*BUFFER << 8) | *(BUFFER + 1),
                                           16, sizeof(int16_t) * 8); ]],
        packbe = [[ *BUFFER = (INPUT >> 8) & 0xFF;
                    *(BUFFER + 1) = INPUT & 0xFF; ]]
    }

    type "sint24" {
        ctype = "int32_t",
        width = 3,
        unpackle = [[ RESULT = sign_extend(BUFFER[0] | (BUFFER[1] << 8) |
                                           (BUFFER[2] << 16),
                                           24, sizeof(int32_t) * 8); ]],
        packle = [[ BUFFER[0] = INPUT & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = (INPUT >> 16) & 0xFF; ]],
        unpackbe = [[ RESULT = sign_extend(BUFFER[2] | (BUFFER[1] << 8) |
                                           (BUFFER[0] << 16),
                                           24, sizeof(int32_t) * 8); ]],
        packbe = [[ BUFFER[0] = (INPUT >> 16) & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = INPUT & 0xFF; ]]
    }

    type "sint32" {
        ctype = "int32_t",
        width = 4,
        unpackle = [[ RESULT = sign_extend(BUFFER[0] | (BUFFER[1] << 8) |
                                           (BUFFER[2] << 16) |
                                           (BUFFER[3] << 24),
                                           32, sizeof(int32_t) * 8); ]],
        packle = [[ BUFFER[0] = INPUT & 0xFF;
                    BUFFER[1] = (INPUT >> 8) & 0xFF;
                    BUFFER[2] = (INPUT >> 16) & 0xFF;
                    BUFFER[3] = (INPUT >> 24) & 0xFF; ]],
        unpackbe = [[ RESULT = sign_extend(BUFFER[3] | (BUFFER[2] << 8) |
                                           (BUFFER[1] << 16) |
                                           (BUFFER[0] << 24),
                                           32, sizeof(int32_t) * 8); ]],
        packbe = [[ BUFFER[0] = (INPUT >> 24) & 0xFF;
                    BUFFER[1] = (INPUT >> 16) & 0xFF;
                    BUFFER[2] = (INPUT >> 8) & 0xFF;
                    BUFFER[3] = INPUT & 0xFF; ]]
    }

Declaring structures is slightly more involved because more care is
needed; however, it is neither impossible nor hard. Here is an example:
the JFFS2 node header.

    struct "JFFS2NodeHeader" {
        { type = "uint16", name = "magic" },
        { type = "uint16", name = "nodetype" },
        { type = "uint32", name = "totlen" },
        { type = "uint32", name = "header_crc" }
    }

Including structures inside structures isn't hard either:

    struct "JFFS2Node" {
        { type = "JFFS2NodeHeader", name = "header" },
        { type = "JFFS2NodeOwner", name = "owner" }
    }

Packing and unpacking do not have to be as trivial as the examples above.
For example, JFFS2 node headers contain a CRC of the header information.
We can define a type which packs to a CRC and unpacks verifying the CRC.

    type "crc32" {
        inherit = "uint32",
        prepack = [[
            INPUT = calc_crc32(BUFPOS(ARG(first)),
                               BUFEND(ARG(last)) - BUFPOS(ARG(first)));
        ]],
        postunpack = [[
            if (calc_crc32(BUFPOS(ARG(first)),
                           BUFEND(ARG(last)) - BUFPOS(ARG(first)))
                != RESULT) {
                ERROR_SET("CRC mismatch verifying ");
                ERROR_CAT(MEMBERPATH);
                ERROR(ARG(errval));
            }
        ]],
    }

That crc32 type needs three arguments: first, last and errval. Thus we can
redefine our node header as:

    struct "JFFS2NodeHeader" {
        { type = "uint16", name = "magic" },
        { type = "uint16", name = "nodetype" },
        { type = "uint32", name = "totlen" },
        { type = "crc32", name = "header_crc",
          first = "magic", last = "totlen",
          errval = "JFFS2_ERR_BAD_HDR_CRC" }
    }

Each structure can be given a pre/postamble which will be included
verbatim, as can each member of a structure. This could be used, for
example, for defining documentation comments.

yuefei supports what it calls 'typed unions' -- namely a union of types,
of which the desired branch can be determined by use of a member
previously found in the containing structure.

For example, let us assume that the type of the union is in the member
'header.nodetype' and that there are two structures, 'JFFS2SummaryContent'
and 'JFFS2OwnedContent', which we need to choose between based on that
nodetype.

yuefei does not require a one-to-one mapping of value to union branch.
Instead, each branch carries a C expression which is used to determine
whether that branch should be filled out. If no expression matches, then
the failure statement will be executed. If there's no way for this to
happen, such as in our example below, then failure does not need to be
defined and will not be called.

Let us therefore consider the JFFS2Node structure again:

    struct "JFFS2Node" {
        { type = "JFFS2NodeHeader", name = "header" },
        union {
            name = "content",
            { name = "summary",
              type = "JFFS2SummaryContent",
              selector = "header.nodetype == JFFS2_NODETYPE_SUMMARY" },
            { name = "owned",
              type = "JFFS2OwnedContent",
              selector = "header.nodetype != JFFS2_NODETYPE_SUMMARY" }
        }
    }

Due to yuefei's reliance on a previous member to define the union type, it
is not possible to have a union as a top level element, nor as the first
element in a structure.

--[ END ]--