m3gdb is a modified version of the well-known gdb debugger, with added support for the Modula-3 programming language. Much of the function of m3gdb is the same as that of gdb, and this article makes no attempt to duplicate information found in existing gdb documentation. Instead, it documents properties of m3gdb that add to or differ from gdb. The reader is assumed to be familiar with the gdb documentation, or able to consult it as necessary. Most of the commands, command line options, environment variables, etc. of m3gdb are the same as for gdb. The differences lie in the syntax and semantics of expressions, linespecs, etc., and in the output m3gdb produces when Modula-3 code is involved.
Throughout, the term "gdb" will be used in statements that apply only to unmodified gdb. The term "m3gdb" will be used in statements that are specific to m3gdb. The term "(m3)gdb" will be used to state properties that m3gdb inherits from gdb, and thus are the same in both debuggers.
gdb is a multi-language debugger. In deriving (m3)gdb from gdb, none of gdb's function is removed. Only new support for Modula-3 is added. Thus, (m3)gdb is also a multi-language debugger, with one additional supported programming language. (m3)gdb further supports debugging mixed-language programs, where modules written in different languages are linked together.
m3gdb supports code compiled by the SRC, PM3, EZM3, and CM3 Modula-3 compilers. A single m3gdb executable dynamically detects and adapts to code compiled by any of these compilers. The SRC, PM3, and EZM3 compilers differ very little, as far as m3gdb is concerned, so the phrase "PM3 et. al." will be used in statements that apply equally to any of these three Modula-3 compilers. More information on the various compilers and other language implementation alternatives can be found in this section.
Despite its support for all four Modula-3 compilers, the current, maintained
version of m3gdb is kept only in the CM3 source repositories.
Within the repository it is found at
It is entirely written in C, so a working Modula-3 compiler is not
required to build it.
However, it is integrated into CM3,
and this integration uses
and the Modula-3 build system.
Thus, the easiest way to build it, given a working installed CM3,
is to go into subdirectory
cm3/scripts and execute
This will build and install m3gdb in the normal CM3
bin directory, usually
the installed executable will be named
This will go through the usual configure process, which will build
a debugger that both executes on and debugs programs on the machine
the build process executes on.
Building by this method will always repeat the configuration
process, which can be annoyingly time-consuming if you are doing
development work on m3gdb and have only modified a source
file or two.
Once the configure step has been done on a given machine,
it is safe to disable it for future recompiles on the same machine.
You can do this by uncommenting the line
%quick = 1, found near the top of
Note that this is quake code, where the
"%" is the comment-start character.
To build m3gdb without using a Modula-3 compiler, go into
cm3/m3-sys/m3gdb and execute
./configure and make.
This is the usual C build process. The compiled executable
will then be found in
from where you can move or copy it as desired.
(m3)gdb has a source file
that is mechanically generated from
Normally, the CM3 distribution will contain an up-to-date
ada-lex.c, but if not, you may need to
have flex installed to build m3gdb.
There are also several
files that need bison to regenerate their
*.c files, following the same
In order to be able to debug variables of Modula-3 types
m3gdb needs to be compiled by a C compiler that has an
integer size at least as wide as each of these Modula-3 types.
As with all debuggers, (m3)gdb requires that code to be debugged
be compiled with certain options that ask the compiler to insert
debug information in the object modules and executable.
(m3)gdb needs this information to
know things like the names, memory locations, and types of variables.
Without such information, (m3)gdb can still function, but only
at the machine instruction level,
or in very limited ways at the source code level.
m3gdb recognizes only the
information format. In CM3, the latter is required for full
To check whether an object or executable file has the right
kind of debug information,
execute objdump -G <filename>|head.
The file contains the needed debug information iff you
see the line
"Contents of .stab section:".
By default, the CM3 distribution is set up to compile with
debug information, at least on some platforms.
There are mechanisms for specifying either on the cm3
command line or in the
m3makefile, that it should
be produced, but these default to on, and there does not seem
to be a way to turn them off.
However, debug information can be turned off or on in
another way by omitting or adding the line
if debug args += "-gstabs+" end
in the quake
This is found in the CM3 configuration file, usually located at
For example, to enable debug information, it should look something like:
proc m3_backend (source, object, optimize, debug) is
  local args = [ "-quiet", source, "-o", object,
                 "-fPIC", "-m32", "-fno-reorder-blocks" ]
  if optimize args += "-O3" end
  if debug args += "-gstabs+" end
  if M3_PROFILING args += "-p" end
  return try_exec (m3back, args)
end
This line has been missing in the distributed configuration file for some targets, at some times in the past few years.
By default, PM3 et. al. produce debug information.
If your installation is not giving debug information, add the line
option("debuginfo","T") to your
To suppress generation of debug information, add the line
option("debuginfo","") to your
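The objdump check described above can also be scripted, for example to test several object files at once. The following is a minimal sketch in Python. It is not part of m3gdb or CM3: the function names are invented for illustration, and it assumes that objdump is on the search path and prints the header line exactly as quoted above.

```python
import subprocess

# Header line that objdump -G prints when a .stab section is present,
# as quoted in the text above.
STABS_HEADER = "Contents of .stab section:"


def output_has_stabs(objdump_output):
    """Return True if text produced by 'objdump -G' contains the stabs header."""
    return any(STABS_HEADER in line for line in objdump_output.splitlines())


def file_has_stabs(filename):
    """Run 'objdump -G <filename>' and look for the stabs header in its output."""
    result = subprocess.run(
        ["objdump", "-G", filename],
        capture_output=True,
        text=True,
        check=False,  # a failing objdump simply yields no header
    )
    return output_has_stabs(result.stdout)
```

Calling file_has_stabs on an object or executable file returns True only when the header line appears, which is the same test as reading the first lines of the objdump output by eye.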
In C, every expression is also a statement and may return a
result and/or have side-effects.
These semantics are reflected in gdb, which was first a C debugger.
In particular, the print command accepts an argument that is an
expression, which it both evaluates, printing the result, and carries out
its side-effects on the debugee program.
The expression could have no result,
if its type is void, in which case, the
print command displays
The user can execute a C assignment or call by typing it as
the argument to the print command.
This facility is, in effect, an interpreter for a significant
subset of the language, implemented by gdb.
Many of the Modula-3-specific extensions that m3gdb provides are
interpretation capabilities for Modula-3.
m3gdb follows the C-like pattern for the print
command by allowing its argument to be either an expression, an
assignment statement, or a call statement.
Note that an expression could also be a call on a function
procedure or method.
If the argument is a Modula-3 expression, the print command
evaluates it and prints the result.
If the expression contains a function call, there could also
be side-effects.
If the argument is a statement, the command executes the statement and
Section Expressions describes
the expressions m3gdb can evaluate.
When the main program is written in Modula-3,
the m3gdb start command will
execute all compiler-generated initializations and all
module initialization bodies in the
runtime system, which
is in library
and always used when there is Modula-3 code.
(but see this note).
Execution will stop before any of the other module initialization
For code compiled by PM3 et. al., m3gdb accomplishes this by setting
a breakpoint in procedure
and the stop after the start command will
appear as hitting this breakpoint.
RunMainBodies is located in library
libm3core, and if it is dynamically linked,
it may not have been loaded at the time you type the
start command, thus requiring you to go
through the usual ritual of allowing the breakpoint to be made pending,
until the containing library is loaded.
Answer y to the question.
The breakpoint will be resolved automatically.
For CM3-compiled code, m3gdb accomplishes this by setting
a breakpoint in
Main.i3, and the stop
start command will
appear as hitting this breakpoint.
(m3)gdb has commands, e.g., the break command, that take an argument called a linespec. A linespec is used to denote an executable place in the code. There is overlap between an expression and a linespec, because either can refer to a module, interface, procedure, or (anonymous) block. In a linespec, a procedure or block directly denotes its first executable statement, while any of the four can be used as a qualifier in a path eventually leading to a procedure or block. In an expression, a procedure can be named in a call, as an actual parameter, or as a procedure constant. An interface or module can similarly be used as a qualifier, leading to a procedure.
However, there are significant differences in (m3)gdb's syntax and semantics of linespecs and expressions. An expression is more general in one respect in that it can denote a variable, formal parameter, type, field, method, or procedure. But an expression needs to have a context where the debugee program is stopped, in order to get, e.g., values of variables, and, more fundamentally, to imply which language the expression is to be interpreted in.
In contrast, a complete linespec can only denote an
While this cannot be a variable, formal, type, field, or
method, it must be possible to make sense of it independent of
any execution context.
This means it could be a specification in the syntax of any of the
languages supported by (m3)gdb.
It could also have the form
<sourceFileName> could contain dots.
So (m3)gdb's parsing and analysis of expressions and linespecs
is quite different.
Nonetheless, m3gdb attempts to make them behave the same, for
cases that can occur in either kind of denotation.
A linespec with the form
denotes the specified line number of the source file. This
is no different from other gdb-supported languages.
If this line is not an executable place, (m3)gdb will use
a nearby line that is.
A linespec can take the form of a list of components, each of which is an identifier or a decimal number, separated by dots, with embedded white space allowed between the tokens. This is a kind of fully-qualified path to a procedure or block. The first component must be an identifier and must denote an interface, module, or procedure. A subsequent identifier denotes a procedure by that name, declared local to the unit denoted by the prefix. A number n denotes the n-th anonymous block nested immediately inside the unit denoted by the prefix. This system allows specification of any procedure or block.
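The syntax of a component-list linespec can be sketched as a small parser. This is an illustration in Python of the rules just stated, not m3gdb's actual parser (m3gdb is written in C). The function name and the choice to return None for an ill-formed list are assumptions made for the example.

```python
import re

# One token: optional white space, then an identifier, a decimal number, or a dot.
_TOKEN = re.compile(r"\s*(?:([A-Za-z][A-Za-z0-9_]*)|([0-9]+)|(\.))")


def parse_component_list(linespec):
    """Split e.g. 'Mod . Proc . 1' into ['Mod', 'Proc', 1].

    Identifiers are returned as strings and block numbers as ints.
    Returns None if the text is not a well-formed component list.
    """
    text = linespec.rstrip()
    pos = 0
    components = []
    expect_component = True  # components and dots must alternate
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            return None  # unexpected character
        ident, number, dot = match.groups()
        if expect_component:
            if dot is not None:
                return None  # a dot where a component was required
            if number is not None and not components:
                return None  # the first component must be an identifier
            components.append(ident if ident is not None else int(number))
        elif dot is None:
            return None  # two components without a separating dot
        expect_component = not expect_component
        pos = match.end()
    if expect_component:
        return None  # empty input or trailing dot
    return components
```

The sketch checks only the syntax. Whether each component actually denotes an interface, module, procedure, or existing block is a separate lookup step.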
If the debugee program is running but stopped, and thus has an execution context, m3gdb first tries to interpret the first identifier in that context, using the scope rules of Modula-3. If that fails, m3gdb next tries to interpret the first identifier as an interface name, looking in the entire link closure of the program. If that fails, m3gdb finally tries to interpret the first identifier as a module name, again looking in the entire link closure of the program. If the first identifier refers to a module in a not-yet-loaded, dynamically linked library, this will fail, and (m3)gdb will try to find another way to interpret the linespec.
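The order of these three attempts can be sketched as follows. This is a simplified model in Python, not m3gdb's implementation: the function name is invented, and the execution context and link closure are reduced to plain dictionaries, which ignores the real Modula-3 scope rules and the handling of not-yet-loaded libraries.

```python
def resolve_first_identifier(name, context_scope, interfaces, modules):
    """Model the lookup order for the first identifier of a linespec.

    context_scope: dict of names visible where the program is stopped,
                   or None if there is no execution context.
    interfaces, modules: dicts of the interfaces and modules in the
                   link closure (hypothetical flat tables).
    Returns a (kind, entity) pair, or None if every attempt fails.
    """
    # 1. Try the current execution context, if there is one.
    if context_scope is not None and name in context_scope:
        return ("context", context_scope[name])
    # 2. Try an interface by that name in the link closure.
    if name in interfaces:
        return ("interface", interfaces[name])
    # 3. Finally, try a module by that name.
    if name in modules:
        return ("module", modules[name])
    return None
```

In this model, a name that exists both as an interface and as a module resolves to the interface unless the execution context supplies a closer meaning.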
A subsequent identifier that denotes anything other than a procedure is a failed attempt at interpreting the linespec. Likewise, a block number that would denote a nonexistent block is a failure. Although a procedure can be referred to using an interface as prefix, any further components of the linespec are interpreted relative to the body of the procedure, i.e., within the module that exports the procedure. A procedure within a module will be found if it is either explicitly declared in the module or in an exported interface of the module. This means that, when the exporting module has a different name from an interface, a procedure can usually be referenced using either the interface name or the module name as prefix.
The initialization block of a module can
be specified as either the module name alone, or with the form
<moduleName>.1, i.e., the first
(and only, in this case) block inside the module.
As a prefix of a longer linespec (i.e., to specify some
procedure or block inside the module initialization block),
the latter form is required, because, for example,
denotes a procedure declared local to the module, not to
the module's initialization block.
See this note about
hitting breakpoints specified in this way.
m3gdb is inconsistent about what it does when an interpretation attempt fails. Sometimes it tries the next way of looking up the first identifier as Modula-3 code. This usually happens when the failure occurs within the first two components. Sometimes it abandons trying to interpret the entire linespec as Modula-3 code, but allows gdb to try other ways to interpret it, in other languages. This usually happens when the first identifier can't be found and when the component list is ill-formed. Sometimes it displays an error message and gives up altogether. This happens for bad components after the second. This inconsistency is considered a bug, but it is not obvious what the best semantics are.
All the Modula-3 compilers emit mangled linker names for procedures, and such a name can be used in a linespec as an alternative way to identify a procedure. The mangled name for any procedure is similar to the component list linespecs above, with each dot replaced by two underscores. The entire name is a single linker name, so no embedded white space is allowed. The first component is always the module the procedure's body is declared in. The block numbers have no leading zeros. It is possible for a programmer to spoof such a name by an identifier with double underscores, causing confusion to m3gdb.
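The relationship between a component list and a mangled name can be sketched in a few lines of Python. This illustrates only the rule stated above (dots become double underscores, block numbers carry no leading zeros). The function names are invented, and the compilers' real mangling may differ in details not described here.

```python
def mangled_name(components):
    """Join components such as ['Mod', 'Proc', 1] into 'Mod__Proc__1'.

    The first component is assumed to be the module in which the
    procedure's body is declared. Block numbers given as ints are
    written without leading zeros.
    """
    return "__".join(str(component) for component in components)


def split_mangled_name(name):
    """Split a mangled name back into components at each double underscore.

    An identifier that itself contains a double underscore is split too,
    which is exactly the spoofing problem mentioned in the text.
    """
    parts = name.split("__")
    return [int(part) if part.isdigit() else part for part in parts]
```

For instance, split_mangled_name applied to a hypothetical identifier such as my__helper yields two components, showing how such a name could be mistaken for a qualified path.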
The following list gives Modula-3 code constructs that are supported, to at least some degree, in m3gdb expressions. Where m3gdb semantics are not identical to Modula-3 semantics, subsequent subsections describe the differences. All of these constructs are treated as expressions by m3gdb, but some are statements in Modula-3.
+, and binary
+, and unary
m3gdb generally follows Modula-3's rules for looking up an unqualified identifier, with some exceptions.
The debug symbol data emitted by the compilers contains no information
about identifiers declared in a
EXCEPTION declaration, so you can't refer to
these in an m3gdb expression.
(But you can refer to
values of enumeration types.)
This can further change the semantics of identifier lookup.
For example, in true Modula-3 code, there could be a reference to a
named constant that is declared in some inner scope.
In an m3gdb expression, the same identifier might end up denoting
an entirely different declaration by the same name, in some outer
Since m3gdb doesn't know the constant exists, it will
find the identifier in the outer scope.
When a normal Modula-3 scope lookup of an identifier fails, m3gdb looks for an interface or module by that name, in the entire link closure of the program, excluding any not-yet-loaded dynamically linked libraries. The name of an interface or module, by itself, has no meaning in an m3gdb expression, but it can be followed by a dot and used to name a procedure, variable, or type declared in the interface or module. Identifiers known in a module via exported interfaces can be named in this way, in addition to those declared directly in the module. If there is both an interface and a module by the same name, the interface is searched first, but a procedure named in that way is treated as referring to the procedure body.
If, perversely, there were a module that did not export a same-named interface and both contained different declarations of the same identifier, this lookup order would mean you could not name the meaning declared in the module, unless the program context were somewhere inside the module.
For PM3 et. al., m3gdb fully supports references to variables that are declared in some scope outer to the referencing context.
For CM3, the debug information for following static links is inadequate. So long as nested procedures are called only as procedure constants, it should work correctly. If a nested procedure is called through a formal parameter, m3gdb might find the wrong instance of a statically containing procedure, when accessing the containing procedure's variables nonlocally. For now, m3gdb warns whenever accessing variables nonlocally. Also, see this note.
This problem does not apply to fully global variables, since they do not require the static link mechanism to address them.
Dot selections in m3gdb follow most of Modula-3's rules. You can select:
You cannot select a constant or exception of an interface or module. You cannot select a method name of an object type.
m3gdb displays the value of a procedure in two ways: a qualified path as in a linespec, and a source file and line number.
A procedure can be referred to in an m3gdb expression, either as part of a call, to pass the procedure (constant) as an actual parameter, or to assign it to a variable.
m3gdb does not check assignability of two procedure types. This is relevant for assignments and passing parameters. If both are procedure types, it assumes they are assignable. It warns when it makes this liberal assumption. Bad things will almost certainly happen if you abuse this.
A type can be referred to in an m3gdb expression.
The primary use is as parameter to builtin functions such as
Only named types work, either builtin or declared in a
m3gdb does not recognize Modula-3 type constructors.
m3gdb displays the value of a named type as a qualified path, as in a linespec, but ending with a type name.
m3gdb displays values of type
and accepts them in expressions.
It supports both the form used in PM3 et. al. and the form used in CM3.
If the program was compiled by CM3, it will handle wide
TEXT literals and also the type
WIDECHAR and its literals.
Normally, m3gdb displays
values in Modula-3 lexical syntax, with double
quotes, escape sequences, and, if appropriate, a leading
However, the /k option in a print command will
cause it instead to display the
value's internal data structure.
For PM3 et. al., this is a traced reference to an open array of
For CM3, this will be one of the several object subtypes of
m3gdb properly recognizes and displays values of CM3 type
Here, it takes the length of the text from the appropriate field,
instead of from the declared type, which contains a fixed array whose
length is almost always far too long.
Even without specifying the /k option, if you happen to know what the internal representation is, you can apply appropriate operators to it, e.g., dereferencing, field selection, subscripting, or even method calls.
There are some cases where, in order to evaluate/execute a user-typed
expression, m3gdb has to actually allocate an object in the heap
of the debugee program.
Examples are assigning a user-typed
to a variable or passing it as an actual parameter.
This can happen unexpectedly if the expression in the m3gdb command
contains the Modula-3 concatenation operator
which m3gdb evaluates by calling
in the debugee process.
This will alter the debugee's execution environment in a subtle way.
Usually this will not matter, and the garbage collector will eventually
collect such objects after they become inaccessible.
Such operations will not work if you are only debugging a
core file and don't have an executing
If you type print "ABC" & "DEF",
m3gdb will execute three calls in the target program:
two on Text.FromChars to get the
TEXT values allocated and one on
Text.Cat to do the concatenation.
TEXT strings will be allocated
in the debugee program's traced heap,
but will immediately become garbage, available for collection.
m3gdb always treats a heap object as having its allocated type, not the static type of the expression used to refer to it. This allows you to select fields, etc. of the actual object.
LOOPHOLE applied to a heap object is an
This would allow you to access a field or method of a supertype
that was hidden by a different but same-named field or method
in the allocated type.
m3gdb does not support array constructors.
For ordinal types, m3gdb can't distinguish a
parameter from a
READONLY parameter, given the
information available from any of the compilers.
In this case, it assumes
VAR, and requires
the actual and formal to have identical types.
This assumption makes
work (i.e., the actual can be changed by the called procedure), but
means m3gdb's type rule is overly strict for
In the latter case, if the actual is
assignable but not identical and you really need to pass it, you
will have to use a
LOOPHOLE on the actual
parameter in the call.
On the other hand, if you use the
and the mode is really
VAR, bad things could happen.
For all other classes of types, either m3gdb has a way to
or it doesn't matter.
m3gdb's liberal type rules allow the integer binary
(i.e., two-operand) arithmetic operations
to be performed on any subrange type and any reference type,
as well as
and mixtures thereof.
ABS and unary
can only be applied to an
LONGINT, or floating operand.
They won't even work on a
or a formal parameter passed by reference.
Since it is semantically an identity, m3gdb just ignores
+ without even bothering to type check it.
m3gdb supports code compiled by the SRC, PM3, EZM3, and CM3 Modula-3 compilers. A single m3gdb executable dynamically detects and adapts to code compiled by any of these compilers.
Programs that link together Modula-3 code produced by a mixture of CM3 and PM3 et. al. compilers are likely to confuse m3gdb's detection of what compiler was used. These compilers have significant differences in their runtime organization, and m3gdb assumes there is not a mixture. Any such problem behaviour can even depend on the order in which dynamically-linked libraries are brought in. In fact, mixing code like this is likely to cause other problems, even without m3gdb's involvement.
There are two different code generators in use by the various compilers. One is derived from the well-known gcc compiler, with modest modifications to support Modula-3. It inherits much of gcc's repertoire of supported targets and its range of optimization options. For PM3 et. al., it is derived from gcc 2.7.2. For CM3, it is derived from gcc 4.3.0. The other code generator is an i386-specific code generator, written in Modula-3 and designed for fast compilation.
There are three different thread implementations for Modula-3 threads.
and is implemented entirely within the runtime library
It subschedules the process thread that is provided by the
operating system, among the Modula-3 threads.
In recent years, it has been undermined by security-motivated changes
Only a few platforms have an updated version.
The second is much newer and uses the library
It integrates scheduling of Modula-3 threads with other threads.
It also is adapted to multi-core and multi-chip SMP systems.
The third uses Windows native threads and is used only on
There are two different garbage collectors. Both are capable of incremental collection, i.e., running collection activity in a parallel thread to the running program, thus avoiding unexpected pauses in execution. The older, virtual-memory-synchronized collector uses virtual memory to detect heap changes that would otherwise undermine the correctness of the collector. The newer, compiler-assisted collector uses compiler-inserted notifications for the same purpose.
m3gdb has a new command "Info Modula-3". This will tell you which compiler and which code generator were used to compile the debugee program and which threads implementation and which garbage collector are in use. In case you have trouble remembering which spelling of the language name to use, it accepts the cartesian product of "Modula" spelled out or abbreviated with a single "M", lowercase/uppercase "M", and with/without the hyphen. A mixture of implementations will confuse it. It may be unable to get some of the information before the runtime system has been initialized.
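Read literally, that description yields eight spellings of the language name. The short Python sketch below enumerates them. It is only one interpretation of the sentence above, not a list taken from the m3gdb sources, and the function name is invented.

```python
from itertools import product


def language_name_spellings():
    """Enumerate the spellings described above as a cartesian product.

    Three independent choices: 'Modula' spelled out or abbreviated to 'M',
    uppercase or lowercase initial letter, and hyphen or no hyphen.
    """
    spellings = []
    for word, uppercase, hyphen in product(("Modula", "M"), (True, False), ("-", "")):
        first = word[0].upper() if uppercase else word[0].lower()
        spellings.append(first + word[1:] + hyphen + "3")
    return spellings
```

The result ranges from the fully spelled Modula-3 down to the shortest form m3.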
The virtual-memory-synchronized garbage collector works by
asking the operating system to artificially hardware-protect certain
memory areas from access and to notify it when such an access occurs.
The notification is through a segment fault that the garbage
collector catches and handles.
Unfortunately, m3gdb can't distinguish this artificial segment
fault from a normal one, resulting in numerous false stops of
the debugee program.
m3gdb prevents this by automatically disabling incremental
collection when debugging with this collector.
This is the equivalent of manually adding
to the command line arguments, which you can also do redundantly
You can reenable incremental collection by typing the command
runtime system has been initialized.
libpthread thread implementation uses
By default, (m3)gdb stops when this signal is received,
resulting in numerous false stops.
When this thread implementation is in use, m3gdb automatically
executes the command
"handle SIG64 nostop noprint pass",
before the debugee program starts to run.
This causes m3gdb to silently pass this signal to the program,
preventing the false stops.
If you want to reverse this, for example, to debug this thread
implementation, type the command
"handle SIG64 stop print pass".
Some debugging commands can fail if executed too early, before certain initializations have happened. Understanding initialization can matter to you if you want to debug module body code or the runtime system itself, or try to execute calls in m3gdb commands before everything has been initialized.
m3gdb uses addresses found in debug information in executable files to address global variables and procedures. On some targets, executable code uses a different mechanism that addresses global variables and procedures indirectly, using pointers that are also stored in global locations. These pointers are initialized by compiler/linker-generated machine code, and this happens during program startup.
You can access a global variable with an m3gdb command any time, although its value may not yet be initialized. However, if you call a global procedure using an m3gdb command, and it or any other procedure it calls, directly or indirectly, accesses a global variable, it may use one of these pointers. If the pointer has not yet been initialized, this will almost certainly cause your program to suffer a segment fault that would never happen if you only executed compiler-generated code.
Other problems of more varied nature can also occur if the module body code, written by the Modula-3 programmer, has not been executed when you use an m3gdb command to call a procedure too early.
Using the m3gdb
will ensure that the runtime system has been initialized.
After that, m3gdb calls on procedures that are in
libm3core and in the closure should be safe.
To use m3gdb safely to make calls in your own code, you need
to be sure execution has proceeded at least through the first
line of the body of the module that exports
If you link in a library, all the code of the entire library is loaded
into your address space, and all the debug data is available to m3gdb,
even for modules that are in the library but not in the
IMPORT/EXPORT closure of the main program.
This means you can call procedures in such modules and access their
global variables from m3gdb commands.
However, the compilers take care of initialization
only for modules in the closure, so such modules will
never be initialized. Similarly, only certain modules in
libm3core are initialized as part of
the runtime system.
As an example, you can always call
RTTypeFP.FromFingerprint, because it
And this could be a useful thing to want to do during
a debug session (e.g., when debugging something involving pickles).
RTTypeFP is not initialized as part
of the runtime system and is usually not named in any
in a typical program, and therefore is not in the closure.
So this call, in an m3gdb command, will result in a
segment fault while trying to allocate a heap object of a type
Even the needed type definition is present in memory,
but the needed pointer to it is not set up.
To make this work, you have to put
IMPORT RTTypeFP; somewhere in your
program and recompile and also postpone the m3gdb call
until the necessary initialization has taken place.
In CM3-compiled code, if you put a breakpoint at, e.g.,
Mod.1 rather than using a
source code line number, you will see somewhat strange behaviour.
The CM3 runtime system invokes module bodies multiple times.
The compiler translates them so they do different things
on different invocations.
The compiler also gives them a parameter named
mode, whose value, in part, determines what
the module body code will do.
If execution stops at a breakpoint such as
0, the programmer-written
Modula-3 code of the module body will not be executed during this
invocation, and the initialization of the runtime system
might not have been done either.
This can happen more than once.
mode has value
then the Modula-3 code itself is about to be executed.
You can work around this by using a linespec with a source code line number in your break command.
There are many things you can do to aid debugging, just by using m3gdb commands to print and set variables and to call procedures in the runtime system.
As an example, RTTypeFP.FromFingerprint could be useful when debugging pickles themselves, or a program that uses them. However, see here for a caveat on how to use it.
In CM3, there are problems in displaying variables and formal parameters that are nonlocally accessed somewhere in the Modula-3 code (in addition to this problem). If a formal parameter is accessed nonlocally anywhere in the program, accessing it locally in m3gdb by the commands info arg, frame, or backtrace will display an incorrect value, while info loc will give two values for such a parameter, only one of which is correct. The print command appears to be correct in such cases, as far as tested to date.
m3gdb was originally developed as a modification to gdb, version 4.17, done at DEC's Stanford Research Center, by unknown authors. Over the years, the modifications have been moved to gdb versions 184.108.40.206, 5.3, 6.3, and 6.4, along with many enhancements, by Antony Hosking and Rodney Bates. Any additional information about history or contributors would be welcomed.
This document was originally written in October of 2008 by Rodney M. Bates, email@example.com.