This lists configuration options that explicitly affect the behavior of the decompiler or its output, independent of the code that is being decompiled. The bulk of these are accessible by selecting the Code Browser menu
and then picking the Decompiler sub-folder. These options are associated with the particular tool (Code Browser) being used and will apply to decompilation of any Program being analyzed by that tool. The three categories of options are:
Another source of options can be accessed by selecting the Code Browser menu
and then picking the Decompiler tab. These “Program Options” are specific to the particular Program being analyzed.
These options govern what resources are available to the Plug-in and the decompiler engine but do not affect how analysis is performed or results are displayed.
Decompilation results for a single function can be computationally intensive to produce. This option specifies the number of functions whose decompilation results can be cached simultaneously. When navigating to a function whose results are still cached, as when navigating back and forth between a few functions, a new decompilation is not triggered.
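The effect of the cache size can be pictured with a small bounded cache keyed by function address. This is only an illustrative sketch: the class `DecompileCache`, the `decompile` callback, and the least-recently-used eviction policy are assumptions made for the example, not a description of how Ghidra implements its cache.

```python
from collections import OrderedDict


class DecompileCache:
    """Hypothetical bounded cache of decompilation results (not Ghidra code)."""

    def __init__(self, capacity, decompile):
        self.capacity = capacity      # plays the role of the cache size option
        self.decompile = decompile    # expensive callback: address -> result
        self.results = OrderedDict()  # address -> result, least recent first
        self.misses = 0               # decompilations actually triggered

    def get(self, address):
        if address in self.results:
            # Cache hit: mark as most recently used, no new decompilation.
            self.results.move_to_end(address)
            return self.results[address]
        self.misses += 1
        result = self.decompile(address)
        self.results[address] = result
        if len(self.results) > self.capacity:
            # Evict the least recently used entry (assumed policy).
            self.results.popitem(last=False)
        return result


cache = DecompileCache(capacity=2, decompile=lambda addr: f"code@{addr:#x}")
for addr in (0x1000, 0x2000, 0x1000, 0x2000):
    cache.get(addr)
print(cache.misses)  # 2: only the first visit to each function does real work
```

Navigating to a third function with a capacity of two would push out the entry that was used least recently, so returning to that function later would trigger a fresh decompilation.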
This is a limit on the number of bytes that can be produced by the decompiler process as output when decompiling a single function. A payload includes the actual characters to be displayed in the window, additional token markup, symbol information, and other details of the underlying syntax tree. The limit is specified in megabytes of data. If the limit is exceeded for a single function, decompilation is aborted for that function, and an error message "Decompiler results exceeded payload limit ..." is displayed.
This option sets an upper limit on the number of seconds the decompiler spends attempting to analyze one function before aborting. It is currently not enforced for the Decompilation Window. Instead it applies to the DecompilerSwitchAnalyzer, the analyzeHeadless command, scripts, or other plug-ins that make use of the decompiler service.
These options directly affect how the decompiler performs its analysis, either by toggling specific analysis passes or changing how it treats various annotations.
When deciding if an individual stack location has become dead, the decompiler must consider aliases: pointers onto the stack that could be used to modify the location within a called function. One strong heuristic the decompiler uses is this: if the user has explicitly created a variable on the stack between the base location referenced by the pointer and the individual stack location, then the decompiler can assume that the pointer is not an alias of the stack location. The alias is blocked by the explicit variable. However, if the user's explicit variable is labeling something that isn't really an explicit variable, such as a field within a larger structure, the decompiler may incorrectly consider the stack location dead and start removing live code.
In order to support the exploratory labeling of stack locations, the user can use this setting to specify what data-types should be considered blocking. The four options are:
Selecting None is the equivalent of turning off the heuristic. Selecting anything except All Data-types allows users to safely label small variables without knowing immediately if the stack location is part of a larger structure or array.
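The blocking heuristic and its setting can be sketched roughly as follows. This is a simplified model for illustration, not the decompiler's implementation: the function `alias_is_blocked`, the representation of stack variables as (offset, kind) pairs, the kind names, and the assumption that the pointer base sits at a lower offset than the location in question are all hypothetical.

```python
def alias_is_blocked(pointer_base, location, explicit_vars, blocking_kinds):
    """Return True if a user-defined variable of a blocking kind starts
    after the pointer's base offset and at or before the stack location.

    explicit_vars  -- iterable of (stack_offset, kind) pairs (hypothetical model)
    blocking_kinds -- set of kinds the current setting treats as blocking;
                      an empty set models the None setting
    """
    return any(
        pointer_base < offset <= location and kind in blocking_kinds
        for offset, kind in explicit_vars
    )


# A pointer to offset 0 escapes into a call; can it still reach offset 16?
user_vars = [(8, "int")]  # the user labeled a small integer at offset 8

print(alias_is_blocked(0, 16, user_vars, {"int", "struct", "array"}))  # True
print(alias_is_blocked(0, 16, user_vars, {"struct"}))                  # False
print(alias_is_blocked(0, 16, user_vars, set()))                       # False
```

With a restrictive setting, the exploratory integer label does not block the alias, so in this model the location at offset 16 is still treated as potentially modified through the pointer and is not considered dead.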
When this is toggled on, the decompiler eliminates code that it considers unreachable. This usually happens when, due to constant propagation and other analysis, the decompiler decides that a boolean value controlling a conditional branch can only take one possible value and removes the branch corresponding to the other value. Toggling this to off lets the user see the dead code, which is typically demarcated by the control-flow structure if (false) { ... }.
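A toy model may help show what the toggle changes. The sketch below assumes a made-up statement representation (plain strings, plus `("if", condition, then_body, else_body)` tuples) and a hypothetical function `eliminate_unreachable`; it only illustrates the idea of dropping a branch whose condition has been resolved to a constant and says nothing about the decompiler's internal data structures.

```python
def eliminate_unreachable(stmts, enabled=True):
    """Drop branches whose condition is a known boolean constant.

    A statement is either a string or a tuple
    ("if", condition, then_body, else_body); this shape is hypothetical.
    """
    out = []
    for stmt in stmts:
        if isinstance(stmt, tuple) and stmt[0] == "if":
            _, cond, then_body, else_body = stmt
            if enabled and isinstance(cond, bool):
                # Only one branch can execute: keep it, discard the other.
                live = then_body if cond else else_body
                out.extend(eliminate_unreachable(live, enabled))
            else:
                out.append(("if", cond,
                            eliminate_unreachable(then_body, enabled),
                            eliminate_unreachable(else_body, enabled)))
        else:
            out.append(stmt)
    return out


body = ["a = 1;", ("if", False, ["dead();"], []), "return a;"]
print(eliminate_unreachable(body, enabled=True))   # dead branch removed
print(eliminate_unreachable(body, enabled=False))  # if (false) block kept
```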
When toggled on, the decompiler treats instructions whose semantics have been formally marked unimplemented as if they do nothing (no operation). Crucially, control-flow falls through to the next instruction. In this case, the decompiler inserts the warning "Control flow ignored unimplemented instructions" as a comment in the function header, but the exact point at which the instruction was ignored may not be clear.
If this option is toggled off, the decompiler inserts the built-in function halt_unimplemented() at the point of the unimplemented instruction, and control-flow does not fall through.
When toggled on, the decompiler infers a data-type for constants it determines are likely pointers. In the basic heuristic, each constant is considered as an address, and if that address starts a known data or function element in the program, the constant is assumed to be a pointer. The constants are treated like any other source of data-type information, and the inferred data-types are freely propagated by the decompiler to other parts of the function.
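The basic heuristic can be sketched in a few lines. The function `infer_pointer_constants` and the dictionary of element start addresses are hypothetical names used only for this illustration; the real analysis works on the Program's symbol and data information rather than a plain dictionary.

```python
def infer_pointer_constants(constants, element_starts):
    """Return {constant: element_name} for constants that are exactly the
    start address of a known data or function element (hypothetical model)."""
    return {c: element_starts[c] for c in constants if c in element_starts}


# Start addresses of known elements in an imaginary program.
element_starts = {0x401000: "main", 0x404010: "g_table"}

# Constants appearing in the function being decompiled.
constants = [0x10, 0x404010, 0x401004]

print(infer_pointer_constants(constants, element_starts))
```

In this model only 0x404010 is treated as a pointer. The constant 0x401004 falls inside main but does not start an element, so it is left as an ordinary integer.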
When toggled on, the decompiler treats any values in memory marked read-only as constant. If a read-only memory location is explicitly referenced by the function being decompiled, it is considered to be unchanging, and the initial value present in the Program is pulled into the data-flow of the function as a constant. Due to Constant Propagation and other transformations, read-only memory can have a large effect on decompiler output.
Typically as part of the import process, Ghidra marks memory blocks as read-only if they are tagged as such by a section header or other meta-data in the original binary. Users can actively set whether specific memory regions are considered read-only through the Memory Manager, and individual data elements can be marked as constant via the Mutability setting (See “Data Mutability”).
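As a rough illustration of the read-only rule, the sketch below folds a load into a constant only when the address lies in a range marked read-only and the option is enabled. The function `load_value`, the dictionary standing in for initialized memory, and the list of read-only ranges are assumptions made for the example, not Ghidra data structures.

```python
def load_value(address, memory, readonly_ranges, option_enabled=True):
    """Return ("const", value) when a load can be replaced by the initial
    value stored in the program, else ("load", address). Hypothetical model."""
    in_readonly = any(start <= address < end for start, end in readonly_ranges)
    if option_enabled and in_readonly and address in memory:
        return ("const", memory[address])
    return ("load", address)


memory = {0x404000: 42, 0x500000: 7}      # initial values in the program
readonly_ranges = [(0x404000, 0x405000)]  # e.g. a block marked read-only

print(load_value(0x404000, memory, readonly_ranges))         # ('const', 42)
print(load_value(0x404000, memory, readonly_ranges, False))  # ('load', 4210688)
print(load_value(0x500000, memory, readonly_ranges))         # ('load', 5242880)
```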
This toggles whether the decompiler attempts to simplify double precision arithmetic operations, where a single logical operation is split into two parts, calculating the high and low pieces of the result in separate instructions. Decompiler support for this kind of transform is currently limited, and only certain constructions are simplified.
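To make the split concrete, the sketch below performs a 64-bit addition the way a 32-bit processor typically would, as an add on the low pieces followed by an add-with-carry on the high pieces, and compares it with the single logical operation it represents. The function names and the choice of a 64-bit add built from 32-bit pieces are illustrative assumptions; the text above does not say which constructions the decompiler recognizes.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_split(a_lo, a_hi, b_lo, b_hi):
    """64-bit add expressed as two 32-bit operations with a carry."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0        # unsigned overflow of the low piece
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi


def add64(a, b):
    """The single logical operation that the two pieces implement."""
    return (a + b) & MASK64


a, b = 0x00000001FFFFFFFF, 0x0000000000000001
lo, hi = add64_split(a & MASK32, a >> 32, b & MASK32, b >> 32)
print(hex((hi << 32) | lo))  # 0x200000000
print(hex(add64(a, b)))      # 0x200000000
```

When the simplification succeeds, output resembling the two-piece form is replaced by something closer to the single addition.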
When this option is active, the decompiler simplifies code sequences containing predicated instructions. A predicated instruction is executed conditionally based on a boolean value, the predicate, and a sequence of instructions can share the same predicate. The decompiler merges the resulting if/else blocks that share the same predicate so that the condition is only printed once.
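The merging step can be illustrated with a toy model in which each predicated instruction is a (predicate, sense, statement) triple. The functions `merge_predicated` and `render`, the triple format, and the assumption that no statement in a run modifies the predicate are hypothetical simplifications for this sketch.

```python
def merge_predicated(instructions):
    """Group consecutive instructions that share a predicate into one
    [predicate, then_statements, else_statements] entry (hypothetical model)."""
    groups = []
    for predicate, sense, stmt in instructions:
        if not groups or groups[-1][0] != predicate:
            groups.append([predicate, [], []])
        # sense=True runs when the predicate holds, False when it does not.
        groups[-1][1 if sense else 2].append(stmt)
    return groups


def render(groups):
    """Print each group as a single if/else block."""
    lines = []
    for predicate, then_stmts, else_stmts in groups:
        lines.append(f"if ({predicate}) {{")
        lines.extend("  " + s for s in then_stmts)
        if else_stmts:
            lines.append("} else {")
            lines.extend("  " + s for s in else_stmts)
        lines.append("}")
    return "\n".join(lines)


instructions = [
    ("zero_flag", True, "r0 = r0 + 1;"),
    ("zero_flag", True, "r1 = 0;"),
    ("zero_flag", False, "r1 = r2;"),
]
print(render(merge_predicated(instructions)))
```

Without merging, each of the three instructions would appear in its own conditional block with the condition repeated each time.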
When toggled on, the decompiler employs in-place assignment operators, such as += and <<=, in its output syntax.
These options do not change the decompiler's analysis but only affect how the results are presented.
Assign the background color for the Decompiler window.
Assign colors to the different types of language tokens emitted by the decompiler. These include:
Assign the color to any characters emitted by the decompiler that do not fall into one of the token types listed above. This includes delimiter characters like commas and parentheses as well as various operator characters.
Assign the background color used to highlight the token currently under the cursor in a Decompiler Window.
Assign the background color used to highlight characters matching the current Find pattern. See “Find ...”.
Set the number of characters that comment lines are indented within decompiler output. This applies only to comments within the body of the function being displayed. Comments at the head of the function are not indented.
Set the language syntax used to delimit comments emitted as part of decompiler output. For C and Java, the choices are /* C style comments */ and // C++ style comments.
Set whether the syntax for type casts is emitted in decompiler output. If this is toggled on, type cast syntax is never displayed, even when rules of the language require it, so individual statements may no longer be formally accurate.
Set whether a specific kind of comment can be incorporated into decompiler output. Comments in Ghidra are categorized based on their placement within the Listing Window, and the decompiler in general tries to display comments where appropriate. See the discussion in “Comments”. Each kind of comment has its own toggle and can be individually included or excluded from decompiler output.
Toggle whether the decompiler emits comments at the head (before the beginning) of a function. The header is built from Plate comments placed at the entry point of the function. See the discussion in “Comments”. The inclusion of other Plate comments is controlled by the Display PLATE comments toggle, described above.
Toggle whether line numbers are displayed in any Decompiler Window. If toggled on, each Decompiler Window reserves space to display numbers down the left side of the window, labeling each line of output produced by the decompiler. Line numbers are associated with the window itself and are not formally part of the decompiler's output.
Control how the decompiler displays namespace information associated with function and variable symbols. The possible settings are:
The Minimally setting, which is the default, will only emit the portion of the namespace path necessary to distinguish the symbol from other symbols with the same base name used by the function, or if a portion of the path is completely outside the function's scope.
The Never setting never displays any of the namespace path under any circumstances and may produce output that is ambiguous and doesn't formally parse.
Toggle whether decompiler generated WARNING comments are displayed as part of the output. The decompiler generates these comments, independent of those laid down by users, to indicate unusual conditions or possible errors (See “Warning Comments”).
Set the typeface used to render characters in any Decompiler Window. Indentation is generally clearer using a monospaced (fixed width) font, but any font available to the system can be used. The size of the font can also be controlled from this option.
Set how integer constants are formatted in the decompiler output. The possible settings are:
For Best Fit, a representation is selected based on how close it is to either a round decimal value (10, 100, 1000, etc.) or a round hexadecimal value (0x10, 0x100, 0x1000, etc.)
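One way to picture a closeness test of this kind is sketched below. The scoring used here, the absolute distance to the nearest power of 10 versus the nearest power of 16 with ties going to decimal, is an assumption invented for the example; the text above does not specify the decompiler's actual measure, and the names `distance_to_round` and `best_fit` are hypothetical.

```python
def distance_to_round(value, base):
    """Distance from value to the nearest power of base (base**1 and up)."""
    power = base
    best = abs(value - power)
    while power < value:
        power *= base
        best = min(best, abs(value - power))
    return best


def best_fit(value):
    """Format a non-negative integer in whichever base it looks rounder in."""
    if distance_to_round(value, 10) <= distance_to_round(value, 16):
        return str(value)   # ties go to decimal (assumed)
    return hex(value)


print(best_fit(1000))  # 1000
print(best_fit(4095))  # 0xfff
print(best_fit(99))    # 99
```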
Set the maximum number of characters in a line of code emitted by the decompiler before a line break is forced. Because the decompiler will not split an individual token across lines, line breaks frequently come before the maximum number of characters is reached, and a single long token can extend a line beyond the maximum.
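The interaction between the maximum width and unsplittable tokens can be shown with a simple greedy wrapper. The function `wrap_tokens` is a hypothetical sketch that ignores indentation and spacing; it is not the decompiler's pretty-printing algorithm.

```python
def wrap_tokens(tokens, max_width):
    """Greedily pack tokens into lines without ever splitting a token."""
    lines, current = [], ""
    for token in tokens:
        if current and len(current) + len(token) > max_width:
            # The token does not fit: break before it, never inside it.
            lines.append(current)
            current = token
        else:
            current += token
    if current:
        lines.append(current)
    return lines


print(wrap_tokens(["foo", "(", "alpha", ",", "beta", ")", ";"], 8))
print(wrap_tokens(["x", "=", "averyveryverylongidentifier", ";"], 8))
```

In the first call every line ends before the eight-character limit because the next token would not fit. In the second, the long identifier sits on a line of its own and still exceeds the limit.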
Set the amount of indenting used to print statements within a nested scope in the decompiler output. Each level of nesting (for function bodies, loop bodies, if/else bodies, etc.) adds this number of characters.
Set how null pointers are displayed in decompiler output. If this is toggled on, the decompiler will print a constant pointer value of zero (a null pointer) using the special token NULL. Otherwise the pointer value is represented with the '0' character, which is then type cast into a pointer.
Set whether the calling convention is printed as part of the function declaration in decompiler output. If this option is turned on, the name of the calling convention is printed just prior to the return value data-type within the function declaration. All functions in Ghidra have an associated calling convention (or prototype model) that is used during decompiler analysis. See the discussion in “Prototype Model”.
Changes to these options affect only the current Program being analyzed.
Sets the calling convention (prototype model) used when decompiling a function where the convention is not known (i.e. marked as "unknown"). Many architectures have multiple calling conventions, such as __stdcall and __thiscall. See the discussion in “Prototype Model”.