aboutsummaryrefslogtreecommitdiff
path: root/ARMeilleure
AgeCommit message (Collapse)Author
2023-01-12Ptc: Check process architecture (#4272)1.1.545merry
2023-01-12Arm64: Cpu feature detection (#4264)1.1.544merry
* Arm64: Cpu feature detection * Ptc: Add Arm64 feature info * nits * simplify CheckSysctlName * restore some macos flags * feedback
2023-01-10Implement JIT Arm64 backend (#4114)1.1.536gdkchan
* Implement JIT Arm64 backend * PPTC version bump * Address some feedback from Arm64 JIT PR * Address even more PR feedback * Remove unused IsPageAligned function * Sync Qc flag before calls * Fix comment and remove unused enum * Address riperiperi PR feedback * Delete Breakpoint IR instruction that was only implemented for Arm64
2023-01-05Make PPTC state non-static (#4157)1.1.507gdkchan
* Make PPTC state non-static * DiskCacheLoadState can be null
2022-12-27Use new ArgumentNullException and ObjectDisposedException throw-helper API ↵1.1.493Berkan Diler
(#4163)
2022-12-24Some minor cleanups and optimizations (#4174)1.1.489Berkan Diler
* Replace Array.Clear(x, 0, x.Length) with Array.Clear(x) * Use DateTime.UnixEpoch field * Replace SHA256.ComputeHash calls with static SHA256.HashData call More performant and avoids the need to initialize a SHA256 instance.
2022-12-21Fix CPU FCVTN instruction implementation (slow path) (#4159)1.1.487gdkchan
* Fix CPU FCVTN instruction implementation (slow path) * PPTC version bump
2022-12-21ARMeilleure: Hash _data pointer instead of value for Operand (#4156)1.1.482riperiperi
I noticed a weirdly high cost for dictionary accesses from MarkLabel etc. Turns out that the hash code was always the same for labels, so the whole point of having a dictionary was missed and it was putting everything in the same bucket. I made it always hash the _data pointer as that's a good source of identifiable and "random" data.
2022-12-19Eliminate zero-extension moves in more cases on 32-bit games (#4140)1.1.480gdkchan
* Eliminate zero-extension moves in more cases on 32-bit games * PPTC version bump * Revert X86Optimizer changes
2022-12-18Revert "ARMeilleure: Add initial support for AVX512(EVEX encoding) (#3663)" ↵1.1.479gdkchan
(#4145) This reverts commit 295fbd0542a93ac50e558054a3f0c8c64286b764.
2022-12-18ARMeilleure: Add initial support for AVX512(EVEX encoding) (#3663)1.1.478Wunk
* ARMeilleure: Add AVX512{F,VL,DQ,BW} detection Add `UseAvx512Ortho` and `UseAvx512OrthoFloat` optimization flags as short-hands for `F+VL` and `F+VL+DQ`. * ARMeilleure: Add initial support for EVEX instruction encoding Does not implement rounding, or exception controls. * ARMeilleure: Add `X86Vpternlogd` Accelerates the vector-`Not` instruction. * ARMeilleure: Add check for `OSXSAVE` for AVX{2,512} * ARMeilleure: Add check for `XCR0` flags Add XCR0 register checks for AVX and AVX512F, following the guidelines from section 14.3 and 15.2 from the Intel Architecture Software Developer's Manual. * ARMeilleure: Increment InternalVersion * ARMeilleure: Remove redundant `ReProtect` and `Dispose`, formatting * ARMeilleure: Move XCR0 procedure to GetXcr0Eax * ARMeilleure: Add `XCR0` to `FeatureInfo` structure * ARMeilleure: Utilize `ReadOnlySpan` for Xcr0 assembly Avoids an additional allocation * ARMeilleure: Formatting fixes
2022-12-15Replace `DllImport` usage with `LibraryImport` (#4084)1.1.472Isaac Marovitz
* Replace usage of `DllImport` with `LibraryImport` * Mark methods as `partial` * Marshalling * More `partial` & marshalling * More `partial` and marshalling * More partial and marshalling * Update GdiPlusHelper to LibraryImport * Unicorn * More Partial * Marshal * Specify EntryPoint * Specify EntryPoint * Change GlobalMemoryStatusEx to LibraryImport * Change RegisterClassEx to LibraryImport * Define EntryPoints * Update Ryujinx.Ava/Ui/Controls/Win32NativeInterop.cs Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com> * Update Ryujinx.Graphics.Nvdec.FFmpeg/Native/FFmpegApi.cs Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com> * Move return mashal * Remove calling convention specification * Remove calling conventions * Update Ryujinx.Common/SystemInfo/WindowsSystemInfo.cs Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com> * Update Ryujinx/Modules/Updater/Updater.cs Co-authored-by: Mary-nyan <thog@protonmail.com> * Update Ryujinx.Ava/Modules/Updater/Updater.cs Co-authored-by: Mary-nyan <thog@protonmail.com> Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com> Co-authored-by: Mary-nyan <thog@protonmail.com>
2022-12-10Fix Lambda Explicit Type Specification Warning (#4090)1.1.460Isaac Marovitz
2022-12-05Make structs readonly when applicable (#4002)1.1.426Andrey Sukharev
* Make all structs readonly when applicable. It should reduce amount of needless defensive copies * Make structs with trivial boilerplate equality code record structs * Remove unnecessary readonly modifiers from TextureCreateInfo * Make BitMap structs readonly too
2022-11-18Use ReadOnlySpan<byte> compiler optimization in more places (#3853)1.1.358Berkan Diler
* Use ReadOnlySpan<byte> compiler optimization in more places * Revert changes in ShaderBinaries.cs * Remove unused using; * Use ReadOnlySpan<byte> compiler optimization in more places
2022-11-16Update units of memory from decimal to binary prefixes (#3716)1.1.349Wunk
`MB` and `GB` can either be interpreted as having base-10 units, or base-2. `MiB` and `GiB` removes this discrepancy so that units of memory are always interpreted using base-2 units.
2022-11-09infra: Migrate to .NET 7 (#3795)1.1.339Mary-nyan
* Update readme to mention .NET 7 * infra: Migrate to .NET 7 .NET 7 is still in preview but this prepare for the release coming up next month. * Use Random.Shared in CreateRandom * Move UInt128Utils.cs to Ryujinx.Common project * Fix inverted parameters in System.UInt128 constructor * Fix Visual Studio complains on Ryujinx.Graphics.Vic * time: Fix missing alignment enforcement in SystemClockContext Fixes at least Smash * time: Fix missing alignment enforcement in SteadyClockContext Fix games (like recent version of Smash) using time shared memory * Switch to .NET 7.0.100 release * Enable Tiered PGO * Ensure CreateId validity requirements are meet when doing random generation Also enforce correct packing layout for other Mii structures. This fix a Mario Kart 8 crashes related to the default Miis.
2022-10-19Do not clear the rejit queue when overlaps count is equal to 0. (#3721)1.1.319LDj3SNuD
* Do not clear the rejit queue when overlaps count is equal to 0. * Ptc and PtcProfiler must be invalidated. * Revert "Ptc and PtcProfiler must be invalidated." This reverts commit f5b0ad9d7dc3c0b3a0da184de4d04d7234939c81. * Fix #3710 slow path due to #3701.
2022-10-19A32: Implement VCVTT, VCVTB (#3710)1.1.315merry
* A32: Implement VCVTT, VCVTB * A32: F16C implementation of VCVTT/VCVTB
2022-10-19A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V in… (#3712)1.1.314LDj3SNuD
* A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V instructions; they use "Round to Nearest with Ties to Away" rounding mode not supported in x86. All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. The titles Mario Strikers and Super Smash Bros. U. use these instructions intensively. * Update Ptc.cs * A32: Add fast path for Vcvta_RM, Vrinta_RM and Vrinta_V instructions aswell.
2022-10-02ARMeilleure: Add `gfni` acceleration (#3669)1.1.286Wunk
* ARMeilleure: Add `GFNI` detection This is intended for utilizing the `gf2p8affineqb` instruction * ARMeilleure: Add `gf2p8affineqb` Not using the VEX or EVEX-form of this instruction is intentional. There are `GFNI`-chips that do not support AVX(so no VEX encoding) such as Tremont(Lakefield) chips as well as Jasper Lake. https://github.com/InstLatx64/InstLatx64/blob/13df339fe7150b114929f71b19a6b2fe72fc751e/GenuineIntel/GenuineIntel00806A1_Lakefield_LC_InstLatX64.txt#L1297-L1299 https://github.com/InstLatx64/InstLatx64/blob/13df339fe7150b114929f71b19a6b2fe72fc751e/GenuineIntel/GenuineIntel00906C0_JasperLake_InstLatX64.txt#L1252-L1254 * ARMeilleure: Add `gfni` acceleration of `Rbit_V` Passes all `Rbit_V*` unit tests on my `i9-11900k` * ARMeilleure: Add `gfni` acceleration of `S{l,r}i_V` Also added a fast-path for when the shift amount is greater than the size of the element. * ARMeilleure: Add `gfni` acceleration of `Shl_V` and `Sshr_V` * ARMeilleure: Increment InternalVersion * ARMeilleure: Fix Intrinsic and Assembler Table alignment `gf2p8affineqb` is the longest instruction name I know of. It shouldn't get any wider than this. * ARMeilleure: Remove SSE2+SHA requirement for GFNI * ARMeilleure Add `X86GetGf2p8LogicalShiftLeft` Used to generate GF(2^8) 8x8 bit-matrices for bit-shifting for the `gf2p8affineqb` instruction. * ARMeilleure: Append `FeatureInfo7Ecx` to `FeatureInfo`
2022-09-20Fpsr and Fpcr freed. (#3701)1.1.279LDj3SNuD
* Implemented in IR the managed methods of the Saturating region ... ... of the SoftFallback class (the SatQ ones). The need to natively manage the Fpcr and Fpsr system registers is still a fact. Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Ptc.InternalVersion = 3665 * Addressed PR feedback. * Implemented in IR the managed methods of the ShlReg region of the SoftFallback class. It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Fpsr and Fpcr freed. Handling/isolation of Fpsr and Fpcr via register for IR and via memory for Tests and Threads, with synchronization to context exchanges (explicit for SoftFloat); without having to call managed methods. Thanks to the inlining work of the previous two PRs and others in this. Tests performed locally in both release and debug modes, in both lowcq and highcq, with FastFP to true and false (explicit FP tests included). Tested with the title Tony Hawk's PS. Depends on shlreg. * Update InstEmitSimdHelper.cs * De-magic Masks. Remove the Stride and Len flags; Fpsr.NZCV are A32 only, then moved to Fpscr: this leads to emitting less IR in reference to Get/Set Fpsr/Fpcr/Fpscr methods in reference to Mrs/Msr (A64) and Vmrs/Vmsr (A32) instructions. * Addressed PR feedback.
2022-09-19Implemented in IR the managed methods of the ShlReg region of the ↵1.1.273LDj3SNuD
SoftFallback class. (#3700) * Implemented in IR the managed methods of the Saturating region ... ... of the SoftFallback class (the SatQ ones). The need to natively manage the Fpcr and Fpsr system registers is still a fact. Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Ptc.InternalVersion = 3665 * Addressed PR feedback. * Implemented in IR the managed methods of the ShlReg region of the SoftFallback class. It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Update InstEmitSimdHelper.cs
2022-09-14A32/T32/A64: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) ↵1.1.272merry
(#3694) * OpCodeTable: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) * A64: Remove catch-all Hint instruction * T16: Handle unallocated hint instructions Some thumb tests execute these assuming that they're nops. * T32: Fill out other Hint instructions * A32: Fill out other hint instructions
2022-09-13Implement PLD and SUB (imm16) on T32, plus UADD8, SADD8, USUB8 and SSUB8 on ↵1.1.269gdkchan
both A32 and T32 (#3693)
2022-09-13T32: Implement Asimd instructions (#3692)1.1.268merry
2022-09-13Fix increment on Arm32 NEON VLDn/VSTn instructions with regs > 1 (#3695)1.1.266gdkchan
* Fix increment on Arm32 NEON VLDn/VSTn instructions with regs > 1 * PPTC version bump * PR feedback
2022-09-11Implement VRINT (vector) Arm32 NEON instructions (#3691)1.1.263gdkchan
2022-09-10T32: Add Vfp instructions (#3690)1.1.262merry
2022-09-10Implement Thumb (32-bit) memory (ordered), multiply, extension and bitfield ↵1.1.261gdkchan
instructions (#3687) * Implement Thumb (32-bit) memory (ordered), multiply and bitfield instructions * Remove public from interface * Fix T32 BL immediate and implement signed and unsigned extend instructions
2022-09-09Add ADD (zx imm12), NOP, MOV (rs), LDA, TBB, TBH, MOV (zx imm16) and CLZ ↵1.1.256gdkchan
thumb instructions (#3683) * Add ADD (zx imm12), NOP, MOV (register shifted), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions, fix LDRD, STRD, CBZ, CBNZ and BLX (reg) * Bump PPTC version
2022-09-09Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, ↵1.1.255gdkchan
VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions (#3677) * Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions * PPTC version * Fix VQADD/VQSUB * Improve MRC/MCR handling and exception messages In case data is being recompiled as code, we don't want to throw at emit stage, instead we should only throw if it actually tries to execute
2022-09-08Clean up rejit queue (#2751)1.1.253FICTURE7
2022-09-08Implemented in IR the managed methods of the Saturating region ... (#3665)1.1.252LDj3SNuD
* Implemented in IR the managed methods of the Saturating region ... ... of the SoftFallback class (the SatQ ones). The need to natively manage the Fpcr and Fpsr system registers is still a fact. Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Ptc.InternalVersion = 3665 * Addressed PR feedback.
2022-08-26Optimize kernel memory block lookup and consolidate RBTree implementations ↵1.1.240gdkchan
(#3410) * Implement intrusive red-black tree, use it for HLE kernel block manager * Implement TreeDictionary using IntrusiveRedBlackTree * Implement IntervalTree using IntrusiveRedBlackTree * Implement IntervalTree (on Ryujinx.Memory) using IntrusiveRedBlackTree * Make PredecessorOf and SuccessorOf internal, expose Predecessor and Successor properties on the node itself * Allocation free tree node lookup
2022-08-25ARMeilleure: Hardware accelerate SHA256 (#3585)1.1.230merry
* ARMeilleure/HardwareCapabilities: Add Sha * ARMeilleure/Intrinsic: Add X86Sha256Rnds2 * ARmeilleure: Hardware accelerate SHA256H/SHA256H2 * ARMeilleure/Intrinsic: Add X86Sha256Msg1, X86Sha256Msg2 * ARMeilleure/Intrinsic: Add X86Palignr * ARMeilleure: Hardware accelerate SHA256SU0, SHA256SU1 * PTC: Bump InternalVersion
2022-08-25Implement some 32-bit Thumb instructions (#3614)1.1.229gdkchan
* Implement some 32-bit Thumb instructions * Optimize OpCode32MemMult using PopCount
2022-08-19A few minor documentation fixes. (#3599)1.1.224Nicholas Rodine
* A few minor documentation fixes. * Removed more invalid inheritdoc instances.
2022-08-18Removed unused usings. (#3593)1.1.223Nicholas Rodine
* Removed unused usings. * Added back using, now that it's used. * Removed extra whitespace.
2022-08-14PreAllocator: Check if instruction supports a Vex prefix in ↵1.1.215merry
IsVexSameOperandDestSrc1 (#3587)
2022-08-05Implement Arm32 Sha256 and MRS Rd, CPSR instructions (#3544)1.1.208gdkchan
* Implement Arm32 Sha256 and MRS Rd, CPSR instructions * Add tests using Arm64 outputs
2022-07-29Move partial unmap handler to the native signal handler (#3437)1.1.199riperiperi
* Initial commit with a lot of testing stuff. * Partial Unmap Cleanup Part 1 * Fix some minor issues, hopefully windows tests. * Disable partial unmap tests on macos for now Weird issue. * Goodbye magic number * Add COMPlus_EnableAlternateStackCheck for tests `COMPlus_EnableAlternateStackCheck` is needed for NullReferenceException handling to work on linux after registering the signal handler, due to how dotnet registers its own signal handler. * Address some feedback * Force retry when memory is mapped in memory tracking This case existed before, but returning `false` no longer retries, so it would crash immediately after unprotecting the memory... Now, we return `true` to deliberately retry. This case existed before (was just broken by this change) and I don't really want to look into fixing the issue right now. Technically, this means that on guest code partial unmaps will retry _due to this_ rather than hitting the handler. I don't expect this to cause any issues. This should fix random crashes in Xenoblade Chronicles 2. * Use IsRangeMapped * Suppress MockMemoryManager.UnmapEvent warning This event is not signalled by the mock memory manager. * Remove 4kb mapping
2022-07-06Implement CPU FCVT Half <-> Double conversion variants (#3439)1.1.165gdkchan
* Half <-> Double conversion support * Add tests, fast path and deduplicate SoftFloat code * PPTC version
2022-06-05Extend uses count from ushort to uint on Operand Data structure (#3374)1.1.140gdkchan
2022-05-31Refactor CPU interface to allow the implementation of other CPU emulators ↵1.1.134gdkchan
(#3362) * Refactor CPU interface * Use IExecutionContext interface on SVC handler, change how CPU interrupts invokes the handlers * Make CpuEngine take a ITickSource rather than returning one The previous implementation had the scenario where the CPU engine had to implement the tick source in mind, like for example, when we have a hypervisor and the game can read CNTPCT on the host directly. However given that we need to do conversion due to different frequencies anyway, it's not worth it. It's better to just let the user pass the tick source and redirect any reads to CNTPCT to the user tick source * XML docs for the public interfaces * PPTC invalidation due to NativeInterface function name changes * Fix build of the CPU tests * PR feedback
2022-05-02Support memory aliasing (#2954)1.1.110gdkchan
* Back to the origins: Make memory manager take guest PA rather than host address once again * Direct mapping with alias support on Windows * Fixes and remove more of the emulated shared memory * Linux support * Make shared and transfer memory not depend on SharedMemoryStorage * More efficient view mapping on Windows (no more restricted to 4KB pages at a time) * Handle potential access violations caused by partial unmap * Implement host mapping using shared memory on Linux * Add new GetPhysicalAddressChecked method, used to ensure the virtual address is mapped before address translation Also align GetRef behaviour with software memory manager * We don't need a mirrorable memory block for software memory manager mode * Disable memory aliasing tests while we don't have shared memory support on Mac * Shared memory & SIGBUS handler for macOS * Fix typo + nits + re-enable memory tests * Set MAP_JIT_DARWIN on x86 Mac too * Add back the address space mirror * Only set MAP_JIT_DARWIN if we are mapping as executable * Disable aliasing tests again (still fails on Mac) * Fix UnmapView4KB (by not casting size to int) * Use ref counting on memory blocks to delay closing the shared memory handle until all blocks using it are disposed * Address PR feedback * Make RO hold a reference to the guest process memory manager to avoid early disposal Co-authored-by: nastys <nastys@users.noreply.github.com>
2022-04-21T32: Implement load/store single (immediate) (#3186)1.1.106merry
* T32: Implement load/store single (immediate) * tests * tidy formatting * address comments
2022-04-09Fix tail merge from block with conditional jump to multiple returns (#3267)1.1.100gdkchan
* Fix tail merge from block with conditional jump to multiple returns * PPTC version bump
2022-03-19InstEmitMemoryEx: Barrier after write on ordered store (#3193)1.1.77merry
* InstEmitMemoryEx: Barrier after write on ordered store * increment ptc version * 32
2022-03-11KThread: Fix GetPsr mask (#3180)1.1.65merry
* ExecutionContext: GetPstate / SetPstate * Put it in NativeContext * KThread: Fix GetPsr mask * ExecutionContext: Turn methods into Pstate property * Address nit