perlxstypemap (1)
Leading comments
Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35) Standard preamble: ========================================================================
NAME
perlxstypemap - Perl XS C/Perl type mappingDESCRIPTION
The more you think about interfacing between two languages, the more you'll realize that the majority of programmer effort has to go into converting between the data structures that are native to either of the languages involved. This trumps other matter such as differing calling conventions because the problem space is so much greater. There are simply more ways to shove data into memory than there are ways to implement a function call.Perl
In your
Anatomy of a typemap
In more practical terms, the typemap is a collection of code fragments which are used by the xsubpp compiler to map C function parameters and values to Perl values. The typemap file may consist of three sections labelled "TYPEMAP", "INPUT", and "OUTPUT". An unlabelled initial section is assumed to be a "TYPEMAP" section. TheEach type of section can appear an arbitrary number of times and does not have to appear at all. For example, a typemap may commonly lack "INPUT" and "OUTPUT" sections if all it needs to do is associate additional C types with core
Traditionally, typemaps needed to be written to a separate file, conventionally called "typemap" in a
TYPEMAP: <<HERE ... HERE
where "HERE" can be replaced by other identifiers like with normal Perl HERE-docs. All details below about the typemap textual format remain valid.
The "TYPEMAP" section should contain one pair of C type and
TYPEMAP # all variants of char* is handled by the T_PV typemap char * T_PV const char * T_PV unsigned char * T_PV ...
The "INPUT" and "OUTPUT" sections have identical formats, that is, each unindented line starts a new in- or output map respectively. A new in- or output map must start with the name of the
INPUT T_PV $var = ($type)SvPV_nolen($arg) T_PTR $var = INT2PTR($type,SvIV($arg))
We'll get to the meaning of those Perlish-looking variables in a little bit.
Finally, here's an example of the full typemap file for mapping C strings of the "char *" type to Perl scalars/strings:
TYPEMAP char * T_PV INPUT T_PV $var = ($type)SvPV_nolen($arg) OUTPUT T_PV sv_setpv((SV*)$arg, $var);
Here's a more complicated example: suppose that you wanted "struct netconfig" to be blessed into the class "Net::Config". One way to do this is to use underscores (_) to separate package names, as follows:
typedef struct netconfig * Net_Config;
And then provide a typemap entry "T_PTROBJ_SPECIAL" that maps underscores to double-colons (::), and declare "Net_Config" to be of that type:
TYPEMAP Net_Config T_PTROBJ_SPECIAL INPUT T_PTROBJ_SPECIAL if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){ IV tmp = SvIV((SV*)SvRV($arg)); $var = INT2PTR($type, tmp); } else croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\") OUTPUT T_PTROBJ_SPECIAL sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\", (void*)$var);
The
The "INT2PTR" macro (defined in perl.h) casts an integer to a pointer of a given type, taking care of the possible different size of integers and pointers. There are also "PTR2IV", "PTR2UV", "PTR2NV" macros, to map the other way, which may be useful in
The Role of the typemap File in Your Distribution
The default typemap in the lib/ExtUtils directory of the Perl source contains many useful types which can be used by Perl extensions. Some extensions define additional typemaps which they keep in their own directory. These additional typemaps may referenceFor
If you're developing a
Sharing typemaps Between CPAN Distributions
Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16
and better), it is rather easy to share typemap code between multiple
- *
- Declare "ExtUtils::Typemaps::Basic" as a build-time dependency in "Makefile.PL" (use "BUILD_REQUIRES"), or in your "Build.PL" (use "build_requires").
- *
-
Include the following line in the XSsection of yourXSfile: (don't break the line)
INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd -e "print embeddable_typemap(q{Basic})"
Writing typemap Entries
EachThis means that you can embed Perl code in your typemap (C) code using constructs such as "${ perl code that evaluates to scalar reference here }". A common use case is to generate error messages that refer to the true function name even when using the
${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }
For many typemap examples, refer to the core typemap file that can be found in the perl source tree at lib/ExtUtils/typemap.
The Perl variables that are available for interpolation into typemaps are the following:
- *
-
$var - the name of the input or output variable, eg. RETVALfor return values.
- *
- $type - the raw C type of the parameter, any ":" replaced with "_". e.g. for a type of "Foo::Bar", $type is "Foo__Bar"
- *
- $ntype - the supplied type with "*" replaced with "Ptr". e.g. for a type of "Foo*", $ntype is "FooPtr"
- *
- $arg - the stack entry, that the parameter is input from or output to, e.g. ST(0)
- *
- $argoff - the argument stack offset of the argument. ie. 0 for the first argument, etc.
- *
-
$pname - the full name of the XSUB,with including the "PACKAGE" name, with any "PREFIX" stripped. This is the non-ALIAS name.
- *
- $Package - the package specified by the most recent "PACKAGE" keyword.
- *
-
$ALIAS - non-zero if the current XSUBhas any aliases declared with "ALIAS".
Full Listing of Core Typemaps
Each C type is represented by an entry in the typemap file that is responsible for converting perl variables (- T_SV
-
This simply passes the C representation of the Perl variable (an SV*)
in and out of the XSlayer. This can be used if the C code wants to deal directly with the Perl variable.
- T_SVREF
-
Used to pass in and return a reference to an SV.
Note that this typemap does not decrement the reference count when returning the reference to an SV*. See also: T_SVREF_REFCOUNT_FIXED
- T_SVREF_FIXED
-
Used to pass in and return a reference to an SV.This is a fixed variant of T_SVREF that decrements the refcount appropriately when returning a reference to an SV*. Introduced in perl 5.15.4.
- T_AVREF
-
From the perl level this is a reference to a perl array.
From the C level this is a pointer to an AV.
Note that this typemap does not decrement the reference count when returning an AV*. See also: T_AVREF_REFCOUNT_FIXED
- T_AVREF_REFCOUNT_FIXED
-
From the perl level this is a reference to a perl array.
From the C level this is a pointer to an AV.This is a fixed variant of T_AVREF that decrements the refcount appropriately when returning an AV*. Introduced in perl 5.15.4.
- T_HVREF
-
From the perl level this is a reference to a perl hash.
From the C level this is a pointer to an HV.
Note that this typemap does not decrement the reference count when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED
- T_HVREF_REFCOUNT_FIXED
-
From the perl level this is a reference to a perl hash.
From the C level this is a pointer to an HV.This is a fixed variant of T_HVREF that decrements the refcount appropriately when returning an HV*. Introduced in perl 5.15.4.
- T_CVREF
-
From the perl level this is a reference to a perl subroutine
(e.g. $sub = sub { 1 };). From the C level this is a pointer
to a CV.
Note that this typemap does not decrement the reference count when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED
- T_CVREF_REFCOUNT_FIXED
-
From the perl level this is a reference to a perl subroutine
(e.g. $sub = sub { 1 };). From the C level this is a pointer
to a CV.
This is a fixed variant of T_HVREF that decrements the refcount appropriately when returning an HV*. Introduced in perl 5.15.4.
- T_SYSRET
-
The T_SYSRET typemap is used to process return values from system calls.
It is only meaningful when passing values from C to perl (there is
no concept of passing a system return value from Perl to C).
System calls return -1 on error (setting
ERRNOwith the reason) and (usually) 0 on success. If the return value is -1 this typemap returns "undef". If the return value is not -1, this typemap translates a 0 (perl false) to ``0 but true'' (which is perl true) or returns the value itself, to indicate that the command succeeded.The
POSIXmodule makes extensive use of this type. - T_UV
- An unsigned integer.
- T_IV
-
A signed integer. This is cast to the required integer type when
passed to C and converted to an IVwhen passed back to Perl.
- T_INT
-
A signed integer. This typemap converts the Perl value to a native
integer type (the "int" type on the current platform). When returning
the value to perl it is processed in the same way as for T_IV.
Its behaviour is identical to using an "int" type in
XSwith T_IV. - T_ENUM
-
An enum value. Used to transfer an enum component
from C. There is no reason to pass an enum value to C since
it is stored as an IVinside perl.
- T_BOOL
- A boolean type. This can be used to pass true and false values to and from C.
- T_U_INT
- This is for unsigned integers. It is equivalent to using T_UV but explicitly casts the variable to type "unsigned int". The default type for "unsigned int" is T_UV.
- T_SHORT
- Short integers. This is equivalent to T_IV but explicitly casts the return to type "short". The default typemap for "short" is T_IV.
- T_U_SHORT
-
Unsigned short integers. This is equivalent to T_UV but explicitly
casts the return to type "unsigned short". The default typemap for
"unsigned short" is T_UV.
T_U_SHORT is used for type "U16" in the standard typemap.
- T_LONG
- Long integers. This is equivalent to T_IV but explicitly casts the return to type "long". The default typemap for "long" is T_IV.
- T_U_LONG
-
Unsigned long integers. This is equivalent to T_UV but explicitly
casts the return to type "unsigned long". The default typemap for
"unsigned long" is T_UV.
T_U_LONG is used for type "U32" in the standard typemap.
- T_CHAR
- Single 8-bit characters.
- T_U_CHAR
- An unsigned byte.
- T_FLOAT
- A floating point number. This typemap guarantees to return a variable cast to a "float".
- T_NV
- A Perl floating point number. Similar to T_IV and T_UV in that the return type is cast to the requested numeric type rather than to a specific type.
- T_DOUBLE
- A double precision floating point number. This typemap guarantees to return a variable cast to a "double".
- T_PV
- A string (char *).
- T_PTR
- A memory address (pointer). Typically associated with a "void *" type.
- T_PTRREF
-
Similar to T_PTR except that the pointer is stored in a scalar and the
reference to that scalar is returned to the caller. This can be used
to hide the actual pointer value from the programmer since it is usually
not required directly from within perl.
The typemap checks that a scalar reference is passed from perl to
XS. - T_PTROBJ
-
Similar to T_PTRREF except that the reference is blessed into a class.
This allows the pointer to be used as an object. Most commonly used to
deal with C structs. The typemap checks that the perl object passed
into the XSroutine is of the correct class (or part of a subclass).
The pointer is blessed into a class that is derived from the name of type of the pointer but with all '*' in the name replaced with 'Ptr'.
For "DESTROY" XSUBs only, a T_PTROBJ is optimized to a T_PTRREF. This means the class check is skipped.
- T_REF_IV_REF
-
NOT YET
- T_REF_IV_PTR
-
Similar to T_PTROBJ in that the pointer is blessed into a scalar object.
The difference is that when the object is passed back into XSit must be of the correct type (inheritance is not supported) while T_PTROBJ supports inheritance.
The pointer is blessed into a class that is derived from the name of type of the pointer but with all '*' in the name replaced with 'Ptr'.
For "DESTROY" XSUBs only, a T_REF_IV_PTR is optimized to a T_PTRREF. This means the class check is skipped.
- T_PTRDESC
-
NOT YET
- T_REFREF
-
Similar to T_PTRREF, except the pointer stored in the referenced scalar
is dereferenced and copied to the output variable. This means that
T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR. All clear?
Only the
INPUTpart of this is implemented (Perl toXSUB) and there are no known users in core or onCPAN. - T_REFOBJ
-
Like T_REFREF, except it does strict type checking (inheritance is not
supported).
For "DESTROY" XSUBs only, a T_REFOBJ is optimized to a T_REFREF. This means the class check is skipped.
- T_OPAQUEPTR
-
This can be used to store bytes in the string component of the
SV.Here the representation of the data is irrelevant to perl and the bytes themselves are just stored in theSV.It is assumed that the C variable is a pointer (the bytes are copied from that memory location). If the pointer is pointing to something that is represented by 8 bytes then those 8 bytes are stored in theSV(and length() will report a value of 8). This entry is similar to T_OPAQUE.
In principle the unpack() command can be used to convert the bytes back to a number (if the underlying type is known to be a number).
This entry can be used to store a C structure (the number of bytes to be copied is calculated using the C "sizeof" function) and can be used as an alternative to T_PTRREF without having to worry about a memory leak (since Perl will clean up the
SV). - T_OPAQUE
-
This can be used to store data from non-pointer types in the string
part of an SV.It is similar to T_OPAQUEPTR except that the typemap retrieves the pointer directly rather than assuming it is being supplied. For example, if an integer is imported into Perl using T_OPAQUE rather than T_IV the underlying bytes representing the integer will be stored in theSVbut the actual integer value will not be available. i.e. The data is opaque to perl.
The data may be retrieved using the "unpack" function if the underlying type of the byte stream is known.
T_OPAQUE supports input and output of simple types. T_OPAQUEPTR can be used to pass these bytes back into C if a pointer is acceptable.
- Implicit array
-
xsubpp supports a special syntax for returning
packed C arrays to perl. If the XSreturn type is given as
array(type, nelem)
xsubpp will copy the contents of "nelem * sizeof(type)" bytes from
RETVALto anSVand push it onto the stack. This is only really useful if the number of items to be returned is known at compile time and you don't mind having a string of bytes in yourSV.Use T_ARRAY to push a variable number of arguments onto the return stack (they won't be packed as a single string though).This is similar to using T_OPAQUEPTR but can be used to process more than one element.
- T_PACKED
-
Calls user-supplied functions for conversion. For "OUTPUT"
(XSUBto Perl), a function named "XS_pack_$ntype" is called with the output Perl scalar and the C variable to convert from. $ntype is the normalized C type that is to be mapped to Perl. Normalized means that all "*" are replaced by the string "Ptr". The return value of the function is ignored.
Conversely for "INPUT" (Perl to
XSUB) mapping, the function named "XS_unpack_$ntype" is called with the input Perl scalar as argument and the return value is cast to the mapped C type and assigned to the output C variable.An example conversion function for a typemapped struct "foo_t *" might be:
static void XS_pack_foo_tPtr(SV *out, foo_t *in) { dTHX; /* alas, signature does not include pTHX_ */ HV* hash = newHV(); hv_stores(hash, "int_member", newSViv(in->int_member)); hv_stores(hash, "float_member", newSVnv(in->float_member)); /* ... */ /* mortalize as thy stack is not refcounted */ sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash))); }
The conversion from Perl to C is left as an exercise to the reader, but the prototype would be:
static foo_t * XS_unpack_foo_tPtr(SV *in);
Instead of an actual C function that has to fetch the thread context using "dTHX", you can define macros of the same name and avoid the overhead. Also, keep in mind to possibly free the memory allocated by "XS_unpack_foo_tPtr".
- T_PACKEDARRAY
-
T_PACKEDARRAY is similar to T_PACKED. In fact, the "INPUT" (Perl
to XSUB) typemap is identical, but the "OUTPUT" typemap passes an additional argument to the "XS_pack_$ntype" function. This third parameter indicates the number of elements in the output so that the function can handle C arrays sanely. The variable needs to be declared by the user and must have the name "count_$ntype" where $ntype is the normalized C type name as explained above. The signature of the function would be for the example above and "foo_t **":
static void XS_pack_foo_tPtrPtr(SV *out, foo_t *in, UV count_foo_tPtrPtr);
The type of the third parameter is arbitrary as far as the typemap is concerned. It just has to be in line with the declared variable.
Of course, unless you know the number of elements in the "sometype **" C array, within your
XSUB,the return value from "foo_t ** XS_unpack_foo_tPtrPtr(...)" will be hard to decipher. Since the details are all up to theXSauthor (the typemap user), there are several solutions, none of which particularly elegant. The most commonly seen solution has been to allocate memory for N+1 pointers and assign "NULL" to the (N+1)th to facilitate iteration.Alternatively, using a customized typemap for your purposes in the first place is probably preferable.
- T_DATAUNIT
-
NOT YET
- T_CALLBACK
-
NOT YET
- T_ARRAY
-
This is used to convert the perl argument list to a C array
and for pushing the contents of a C array onto the perl
argument stack.
The usual calling signature is
@out = array_func( @in );
Any number of arguments can occur in the list before the array but the input and output arrays must be the last elements in the list.
When used to pass a perl list to C the
XSwriter must provide a function (named after the array type but with 'Ptr' substituted for '*') to allocate the memory required to hold the list. A pointer should be returned. It is up to theXSwriter to free the memory on exit from the function. The variable "ix_$var" is set to the number of elements in the new array.When returning a C array to Perl the
XSwriter must provide an integer variable called "size_$var" containing the number of elements in the array. This is used to determine how many elements should be pushed onto the return argument stack. This is not required on input since Perl knows how many arguments are on the stack when the routine is called. Ordinarily this variable would be called "size_RETVAL".Additionally, the type of each element is determined from the type of the array. If the array uses type "intArray *" xsubpp will automatically work out that it contains variables of type "int" and use that typemap entry to perform the copy of each element. All pointer '*' and 'Array' tags are removed from the name to determine the subtype.
- T_STDIO
- This is used for passing perl filehandles to and from C using "FILE *" structures.
- T_INOUT
-
This is used for passing perl filehandles to and from C using
"PerlIO *" structures. The file handle can used for reading and
writing. This corresponds to the "+<" mode, see also T_IN
and T_OUT.
See perliol for more information on the Perl
IOabstraction layer. Perl must have been built with "-Duseperlio".There is no check to assert that the filehandle passed from Perl to C was created with the right "open()" mode.
Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT
XStypes nicely. - T_IN
- Same as T_INOUT, but the filehandle that is returned from C to Perl can only be used for reading (mode "<").
- T_OUT
-
Same as T_INOUT, but the filehandle that is returned from C to Perl
is set to use the open mode "+>".