PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
NAME
PCRE2 - Perl-compatible regular expressions (revised API)
SIZE AND OTHER LIMITATIONS
PCRE2 has some size limitations, but it is hoped that they will
never be relevant in practice.
The maximum size of a compiled pattern is approximately 64
thousand code units for the 8-bit and 16-bit libraries if PCRE2 is
compiled with the default internal linkage size, which is 2 bytes
for these libraries. If you want to process regular expressions
that are truly enormous, you can compile PCRE2 with an internal
linkage size of 3 or 4 (when building the 16-bit library, 3 is
rounded up to 4). See the README file in the source distribution
and the pcre2build documentation for details. With a larger linkage
size the limit is substantially higher, but execution is slower. In
the 32-bit library, the internal linkage size is always 4.
The maximum length of a source pattern string is essentially
unlimited; it is the largest number a PCRE2_SIZE variable can
hold. However, the program that calls pcre2_compile() can specify
a smaller limit.
The maximum length (in code units) of a subject string is one less
than the largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE
is an unsigned integer type, usually defined as size_t. Its
maximum value (that is ~(PCRE2_SIZE)0) is reserved as a special
indicator for zero-terminated strings and unset offsets.
All values in repeating quantifiers must be less than 65536.
There are two different limits that apply to branches of
lookbehind assertions. If every branch in such an assertion
matches a fixed number of characters, the maximum length of any
branch is 65535 characters. If any branch matches a variable
number of characters, then the maximum matching length for every
branch is limited. The default limit is set when PCRE2 is built
(255 unless configured otherwise), but the calling program can
change it.
There is no limit to the number of parenthesized groups, but there
can be no more than 65535 capture groups, and there is a limit to
the depth of nesting of parenthesized subpatterns of all kinds.
This is imposed in order to limit the amount of system stack used
at compile time. The default limit can be specified when PCRE2 is
built; if not, the default is set to 250. An application can
change this limit by calling pcre2_set_parens_nest_limit() to set
the limit in a compile context.
The maximum length of a name for a named capture group is 32 code
units, and the maximum number of such groups is 10000.
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or
(*THEN) verb is 255 code units for the 8-bit library and 65535
code units for the 16-bit and 32-bit libraries.
The maximum length of a string argument to a callout is the
largest number a 32-bit unsigned integer can hold.
The maximum amount of heap memory used for matching is controlled
by the heap limit, which can be set in a pattern or in a match
context. The default is a very large number, effectively
unlimited.
AUTHOR
Philip Hazel
Retired from University Computing Service
Cambridge, England.
REVISION
Last updated: 16 August 2023
Copyright (c) 1997-2023 University of Cambridge.
COLOPHON
This page is part of the PCRE (Perl Compatible Regular
Expressions) project. Information about the project can be found
at ⟨http://www.pcre.org/⟩. If you have a bug report for this
manual page, see
⟨http://bugs.exim.org/enter_bug.cgi?product=PCRE⟩. This page was
obtained from the tarball fetched from
⟨https://github.com/PhilipHazel/pcre2.git⟩ on 2025-08-11. If you
discover any rendering problems in this HTML version of the page,
or you believe there is a better or more up-to-date source for the
page, or you have corrections or improvements to the information
in this COLOPHON (which is not part of the original manual page),
send a mail to [email protected]
PCRE2 10.46-DEV 16 August 2023 PCRE2LIMITS(3)