
cgmemmgr: CodeModel-aware memory management#60915

Open
xal-0 wants to merge 12 commits into JuliaLang:master from xal-0:codemodel-cgmemmgr


@xal-0
Member

@xal-0 xal-0 commented Feb 3, 2026

As described in #60245, we're unable to use the optimized memory manager on architectures where the default CodeModel is not Large. Since mmap() will generally put new mappings close to each other, allocations will succeed until the address space becomes fragmented enough that an RW allocation lands sufficiently far from the corresponding RX allocation, such that a relocation in the code fails.
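
To make the failure mode concrete, here is a minimal sketch of the range check that a 32-bit PC-relative relocation has to pass, as on x86-64 with a non-large code model. The helper name `fits_pcrel32` and the addresses used with it are hypothetical and do not come from this PR; other architectures have different reach limits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, for illustration only: can a signed 32-bit PC-relative
// displacement reach `target` from the fixup address `site`?
// Assumes x86-64 style rel32 relocations (roughly +/-2 GiB of reach).
static bool fits_pcrel32(uint64_t site, uint64_t target)
{
    // Compute the displacement in 64 bits, then check that it survives
    // truncation to a signed 32-bit field.
    int64_t disp = static_cast<int64_t>(target - site);
    return disp >= std::numeric_limits<int32_t>::min() &&
           disp <= std::numeric_limits<int32_t>::max();
}
```

Two mappings a few pages apart pass this check in either direction. Once fragmentation pushes an RW mapping 4 GiB away from its RX mapping, the check fails and so does the link.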

This PR implements a new JLJITLinkMemoryManager, which is intended to have lower memory use than orc::MapperJITLinkMemoryManager and support non-large code models, unlike the old Julia CG memory manager. We accomplish this by reserving a block in the address space that is smaller than the maximum distance for relocations on the current architecture, putting the code and data sections inside this block:

    v-- 2^JULIA_CGMEM_BLOCK_SIZE --v

 ---+---------------------+--------+----
    | RX                  | RW     |
 ---+---------------------+--------+----

We make allocations inside the RX and RW regions (read-only data goes into the RX region for compactness) until one or the other is full, then we unmap unused pages and make a new block elsewhere. The size of a block can be set with JULIA_CGMEM_BLOCK_SIZE, while the ratio of RX to RW within the block is fixed, based on empirical measurements. If an allocation requires more space than is available in a single block, contiguous pages are allocated at an arbitrary address for it.
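
The layout above can be modelled with plain integer arithmetic. The sketch below is illustrative only: nothing is actually mapped, the names `Region`, `Block`, and `make_block` are hypothetical, and the 3:1 RX:RW split is an assumed placeholder, since the ratio used by the PR is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: addresses are plain integers and nothing is mapped.
struct Region {
    uint64_t cur;  // next free address
    uint64_t end;  // one past the last usable address

    // Bump-allocate `size` bytes aligned to `align` (a power of two).
    // Returns false when the region cannot hold the request.
    bool alloc(uint64_t size, uint64_t align, uint64_t &out)
    {
        assert(align != 0 && (align & (align - 1)) == 0);
        uint64_t start = (cur + align - 1) & ~(align - 1);
        if (start > end || size > end - start)
            return false;
        cur = start + size;
        out = start;
        return true;
    }
};

struct Block {
    Region rx;  // code and read-only data
    Region rw;  // writable data
};

// Split a block of 2^log2_size bytes at `base` into an RX region followed by
// an RW region. The 3:1 ratio is an assumption made for this sketch.
static Block make_block(uint64_t base, unsigned log2_size)
{
    assert(log2_size >= 2 && log2_size < 64);
    uint64_t size = uint64_t(1) << log2_size;
    uint64_t rx_size = size / 4 * 3;
    return Block{{base, base + rx_size}, {base + rx_size, base + size}};
}
```

Because both regions live inside one block, any RX and RW addresses handed out are less than the block size apart. When either `alloc` returns false, the manager described above would release the unused pages and start a new block; a request larger than a whole block takes the separate contiguous-pages path.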

TODOs:

  • Basic stress tests with small block size
  • Test all implemented ROBlockMappers
  • Add a fallback to MapperJITLinkMemoryManager. This is important on macOS, where it's more likely that we aren't able to make a temporary file for DualBlockMapper.
  • Better tests that intentionally fragment the address space
  • Add a ProtectionKeyMapper that uses MAP_JIT on macOS and pkeys on Linux. Still want to do this, but can do it later.

@xal-0 xal-0 force-pushed the codemodel-cgmemmgr branch from 73c4991 to d19dd1a Compare February 3, 2026 18:09
@xal-0 xal-0 force-pushed the codemodel-cgmemmgr branch from 02a49e6 to 7b7d65a Compare February 3, 2026 20:18
@xal-0 xal-0 added the performance (Must go faster) and compiler:llvm (For issues that relate to LLVM) labels Feb 4, 2026
@gbaraldi
Member

gbaraldi commented Feb 4, 2026

We should look into upstreaming this

@xal-0 xal-0 force-pushed the codemodel-cgmemmgr branch 2 times, most recently from 9a88404 to 0d6d4d4 Compare February 5, 2026 01:45
# endif
# ifdef MAP_NORESERVE
flags |= MAP_NORESERVE;
# endif
Member

@vtjnash vtjnash Feb 7, 2026

might be worth adding MAP_FIXED_NOREPLACE too, where that exists (Linux mainly, though treated as equivalent to MAP_FIXED in 4.17 to 4.19 by accident)
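
A minimal sketch of what that could look like, assuming a POSIX `mmap`. The helper name `reserve_exactly_at` is hypothetical and not part of this PR. Kernels that do not recognize `MAP_FIXED_NOREPLACE` typically treat the address as a plain hint, so the returned address is compared against the requested one.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper (not from the PR): reserve `size` bytes exactly at
// `addr`, or return nullptr without disturbing any existing mapping.
static void *reserve_exactly_at(void *addr, size_t size)
{
    assert(size > 0);
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;
#endif
#ifdef MAP_FIXED_NOREPLACE
    // Where supported, fail instead of replacing an existing mapping.
    flags |= MAP_FIXED_NOREPLACE;
#endif
    void *p = mmap(addr, size, PROT_NONE, flags, -1, 0);
    if (p == MAP_FAILED)
        return nullptr;
    if (p != addr) {
        // The flag was unavailable or ignored, so `addr` was only a hint
        // and the kernel picked somewhere else. Undo and report failure.
        munmap(p, size);
        return nullptr;
    }
    return p;
}
```

On platforms without the flag this degrades to a hint-only `mmap` plus an address check, so the caller sees the same contract either way.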

};
// Until Win10+, we'll put the allocation wherever.
(void)addr;
void *ret = VirtualAlloc(nullptr, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
Member

Why aren't we mimicking the mmap API and passing the hint here too?

Member Author

To work around not having MEM_RESERVE_PLACEHOLDER until we bump the minimum Windows version, I have map_reserve return 0. CodeAllocator::map_blocks still calls map_rw with a non-zero address (it takes the 0 from map_reserve and adds the size of the RX region), so I'd have to ifdef that in two places instead.

Comment on lines +176 to +179
#ifdef _OS_WINDOWS_
// Noop: we can do better when we bump the minimum OS version to Windows 10,
// where we can use VirtualAlloc2 with MEM_RESERVE_PLACEHOLDER and
// MemExtendedParameterAddressRequirements.
Member

Wow, an API for exactly this feature. MemExtendedParameterAddressRequirements looks very nice to have.

@xal-0 xal-0 marked this pull request as ready for review February 18, 2026 18:37