Misc tweaks and fixes to the interpreter #2125

Open. Wants to merge 121 commits into base: master.

Commits
065573f
fix writebacks overwriting registers swapped with spsr
Jaklyy May 31, 2024
960f063
improve data aborts for ldm
Jaklyy Jun 2, 2024
63d4b78
improve implementation
Jaklyy Jun 2, 2024
b5c1ee3
implement stm
Jaklyy Jun 2, 2024
5e760a1
slightly cleaner code
Jaklyy Jun 2, 2024
c2a57b7
fix stmd(a/b) writeback
Jaklyy Jun 3, 2024
1e8194e
fix ldr and str
Jaklyy Jun 4, 2024
317a8c6
data abort handling for (almost) all (arm) instructions
Jaklyy Jun 5, 2024
1871c48
fix double data aborts with strd
Jaklyy Jun 5, 2024
7c3108e
handle swp instruction aborts
Jaklyy Jun 5, 2024
13ae96b
simple thumb instructions (untested but probably right)
Jaklyy Jun 5, 2024
d6cd189
rework data abort handling for ldm/stm; implement thumb stmia+push
Jaklyy Jun 6, 2024
8bc7e45
thumb ldmia/pop data aborts
Jaklyy Jun 6, 2024
bd3611b
unaligned registers with strd/ldrd raise an exception
Jaklyy Jun 8, 2024
2b0ed45
fully implement r15 stores being +12 of addr
Jaklyy Jun 8, 2024
7350762
idk why it took me two tries to get these instructions to work properly
Jaklyy Jun 8, 2024
0c88720
fix some more instructions?
Jaklyy Jun 8, 2024
8191f92
mcr is also affected
Jaklyy Jun 8, 2024
5f97dfc
fix bits fixed to 0 for pu region sizing being set
Jaklyy Jun 8, 2024
3699768
most cpsr bits can't actually be updated (or at least can't be read?)
Jaklyy Jun 8, 2024
659763f
clarification
Jaklyy Jun 8, 2024
849d4e5
imma be real, i have no idea what is going on here
Jaklyy Jun 9, 2024
b846c6f
remove out of date comments
Jaklyy Jun 9, 2024
be60c68
more weirdness
Jaklyy Jun 9, 2024
b90d5c2
what the actual F*** is going on
Jaklyy Jun 9, 2024
ae0824f
it all makes sense now...
Jaklyy Jun 9, 2024
ca04710
ldrd is just ldm
Jaklyy Jun 10, 2024
3ddccde
verified
Jaklyy Jun 10, 2024
048b0b8
swp/swpb jumps work on the arm 7?
Jaklyy Jun 10, 2024
4221810
verify writable msr bits
Jaklyy Jun 11, 2024
5a174a2
track interlock cycles for load instructions
Jaklyy Jun 14, 2024
aa1217a
track interlock cycles for the ALU
Jaklyy Jun 14, 2024
a973c0b
initial implementation of interlock cycles
Jaklyy Jun 15, 2024
4495576
don't do interlocks for the arm7
Jaklyy Jun 15, 2024
debaaa0
fix performance regression for disabling interlock emulation path
Jaklyy Jun 15, 2024
5b37ca7
implement correct/guess interlocks for remaining instructions
Jaklyy Jun 17, 2024
f00f1f6
im smart
Jaklyy Jun 17, 2024
a9e2c7e
implement two regs i missed
Jaklyy Jun 17, 2024
c5258d6
verify interlocks for alu and load/store
Jaklyy Jun 17, 2024
e6ba407
correct interlocked reg for umlal
Jaklyy Jun 18, 2024
f1b71fe
implement configurable vram bus width
Jaklyy Jun 24, 2024
3583d82
disable interlock emulation, needs more research
Jaklyy Jun 24, 2024
109bbed
improve ldm timings
Jaklyy Jun 24, 2024
dbe00e7
improve stm timings
Jaklyy Jun 25, 2024
541e1e6
proper timings for ldr/str
Jaklyy Jun 25, 2024
c5b035a
SWP and SWPB use the same behavior as STR on the ARM9
Jaklyy Jun 25, 2024
88e5584
fix clz r15
Jaklyy Jun 27, 2024
0060958
Merge remote-tracking branch 'upstream/master' into jump-after-writeback
Jaklyy Jul 3, 2024
a549977
fix clz for realsies
Jaklyy Jul 4, 2024
bd1665c
minor timing tweaks
Jaklyy Jul 4, 2024
ea429a1
improve interlock emulation
Jaklyy Jul 4, 2024
0f02c0b
disable interlock emulation again again
Jaklyy Jul 6, 2024
3837506
doesn't really matter but idk it's more correct?
Jaklyy Jul 6, 2024
e2be0b4
actually no it was not more correct
Jaklyy Jul 7, 2024
1fdac1d
...why am i checking for dtcm?
Jaklyy Jul 11, 2024
038ffa3
revert the *entire* interlock implemention
Jaklyy Jul 12, 2024
4fcd52e
someday i will learn to test things before pushing them
Jaklyy Jul 12, 2024
789ef21
improve timings for S variants of multiply instructions on arm9
Jaklyy Jul 13, 2024
764ee9e
improve timings further
Jaklyy Jul 13, 2024
4f6db5a
Merge remote-tracking branch 'upstream/master' into jump-after-writeback
Jaklyy Jul 18, 2024
36f4f2c
Revert "improve timings further"
Jaklyy Jul 19, 2024
13578a3
Revert "improve timings for S variants of multiply instructions on arm9"
Jaklyy Jul 19, 2024
7cd50e7
fix some multiply timings
Jaklyy Jul 19, 2024
3c936d8
improve mrs, mrc timings
Jaklyy Aug 3, 2024
2e421e2
cache should be disabled when pu is disabled
Jaklyy Jul 27, 2024
4b703d2
improve msr timings for arm9
Jaklyy Jul 22, 2024
ab2a8f1
revert timing tweaks, finish thumb interwork code
Jaklyy Aug 4, 2024
346ac13
forgot to remove a thingy when removing timing reworks
Jaklyy Aug 4, 2024
fe69cfa
Merge remote-tracking branch 'upstream/master' into interpreter-fixes
Jaklyy Aug 5, 2024
587958e
Improve accuracy of prefetch aborts
Jaklyy Aug 5, 2024
0dc619d
Revert "Improve accuracy of prefetch aborts"
Jaklyy Aug 5, 2024
eedd280
Reapply "Improve accuracy of prefetch aborts"
Jaklyy Aug 5, 2024
a85b2bf
tweak when irqs are triggered and fix prefetch aborts
Jaklyy Aug 5, 2024
332a39d
fix JIT being borked
Jaklyy Aug 5, 2024
40e8e8e
rework single load/stores to use a shared instruction
Jaklyy Aug 23, 2024
f692e73
the docs lied to me (again)
Jaklyy Aug 26, 2024
a9aad74
implement user mode load/stores
Jaklyy Aug 27, 2024
be290da
de-duplicate swp(b)
Jaklyy Aug 27, 2024
685c482
try not forgetting about stores lol
Jaklyy Aug 28, 2024
0003821
apparently i never tested this
Jaklyy Aug 29, 2024
f0bd2b9
Merge remote-tracking branch 'upstream/master' into interpreter-fixes
Jaklyy Aug 30, 2024
c5ac682
improve data abort handling further
Jaklyy Sep 12, 2024
a0d7113
very minor optimization attempt
Jaklyy Sep 13, 2024
3b9a9e4
multiply instructions can't write to r15
Jaklyy Sep 16, 2024
ac8c942
sat add/sub also fail to jump
Jaklyy Sep 16, 2024
e2f3dd1
clarify
Jaklyy Sep 16, 2024
e5654ec
r15 mrc mrs
Jaklyy Sep 16, 2024
89e8549
implement comparison instrs w/ rd == 15
Jaklyy Sep 17, 2024
6ebabde
implement changing thumb bit. and bkpt ig
Jaklyy Sep 18, 2024
45f87a1
prevent t bit changes without pipeline flush on arm7
Jaklyy Sep 20, 2024
c133814
some day i will remember to test before pushing
Jaklyy Sep 20, 2024
7afa805
slightly better code
Jaklyy Sep 20, 2024
157e9c5
reimplement changing t bit with arm7
Jaklyy Sep 20, 2024
8d451df
misaligned pc..........
Jaklyy Sep 21, 2024
7b0d71d
Revert T bit changing support for arm7
Jaklyy Sep 22, 2024
8af790b
ldm/str with empty rlist
Jaklyy Sep 23, 2024
3b73f21
str r15 is incremented by +2/+4 oop
Jaklyy Sep 23, 2024
7fb18b1
clean up code
Jaklyy Sep 24, 2024
e1d4fbe
i can't reproduce this anymore
Jaklyy Sep 24, 2024
3065141
probably not faster
Jaklyy Sep 24, 2024
a11208e
oops
Jaklyy Sep 25, 2024
19e0b18
Merge remote-tracking branch 'upstream/master' into interpreter-fixes
Jaklyy Sep 30, 2024
53b38c3
ok no it didn't lie to me
Jaklyy Oct 10, 2024
3870216
correction:
Jaklyy Oct 10, 2024
93dce82
implement cmp with "rd == 15" on arm9
Jaklyy Oct 10, 2024
787d0c9
mrc r15 updates flags
Jaklyy Oct 10, 2024
e0e78a2
make empty r-list instructions a bit nicer
Jaklyy Oct 12, 2024
5f003eb
fix builds with jit disabled
Jaklyy Oct 16, 2024
3c7db9b
correct thumb multiply timings
Jaklyy Nov 6, 2024
3d49f5f
arm7 muls carry flag emulation.
Jaklyy Nov 6, 2024
3bd6274
Merge remote-tracking branch 'upstream/master' into interpreter-fixes
Jaklyy Nov 6, 2024
ef5de60
t blx long with bit 0 set should raise an exception
Jaklyy Nov 7, 2024
5091061
improve accuracy of prefetch abort handling slightly
Jaklyy Nov 8, 2024
60a819c
correct handling of T bit changes w/o pipeline flush on arm9
Jaklyy Nov 8, 2024
676f471
fix edge case with thumb prefetch aborts
Jaklyy Nov 8, 2024
9f8cf8d
ldm base writeback fails with r15
Jaklyy Nov 9, 2024
e4dd913
arm7 RORs unaligned ldr(s)h
Jaklyy Nov 9, 2024
bdc3151
T_LDR_SPREL does ROR + misc cleanup
Jaklyy Nov 9, 2024
ec241a8
im smrat :D
Jaklyy Nov 9, 2024
fce0555
slightly fix error in writeback handling
Jaklyy Nov 10, 2024
9d92b87
r15 writeback is very weird with ldr/str
Jaklyy Nov 10, 2024
125 changes: 78 additions & 47 deletions src/ARM.cpp
@@ -222,7 +222,7 @@ void ARM::DoSavestate(Savestate* file)
     file->VarArray(R_ABT, 3*sizeof(u32));
     file->VarArray(R_IRQ, 3*sizeof(u32));
     file->VarArray(R_UND, 3*sizeof(u32));
-    file->Var32(&CurInstr);
+    file->Var64(&CurInstr);
 #ifdef JIT_ENABLED
     if (file->Saving && NDS.IsJITEnabled())
     {
@@ -232,7 +232,7 @@ void ARM::DoSavestate(Savestate* file)
         FillPipeline();
     }
 #endif
-    file->VarArray(NextInstr, 2*sizeof(u32));
+    file->VarArray(NextInstr, 2*sizeof(u64));

     file->Var32(&ExceptionBase);

@@ -344,12 +344,6 @@ void ARMv5::JumpTo(u32 addr, bool restorecpsr)
         CPSR &= ~0x20;
     }

-    if (!(PU_Map[addr>>12] & 0x04))
-    {
-        PrefetchAbort();
-        return;
-    }
-
     NDS.MonitorARM9Jump(addr);
 }

@@ -518,6 +512,7 @@ void ARM::UpdateMode(u32 oldmode, u32 newmode, bool phony)
     }
 }

+template <CPUExecuteMode mode>
 void ARM::TriggerIRQ()
 {
     if (CPSR & 0x80)
@@ -529,7 +524,12 @@ void ARM::TriggerIRQ()
     UpdateMode(oldcpsr, CPSR);

     R_IRQ[2] = oldcpsr;
-    R[14] = R[15] + (oldcpsr & 0x20 ? 2 : 0);
+#ifdef JIT_ENABLED
+    if constexpr (mode == CPUExecuteMode::JIT)
+        R[14] = R[15] + (oldcpsr & 0x20 ? 2 : 0);
+    else
+#endif
+        R[14] = R[15] - (oldcpsr & 0x20 ? 0 : 4);
     JumpTo(ExceptionBase + 0x18);

     // ARDS cheat support
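
To make the two link-register expressions in this hunk concrete, here is a minimal sketch that only restates their arithmetic. `IrqLinkValue` is a hypothetical helper invented for illustration and is not part of this PR. The interpreter expression appears to rely on the IRQ being taken in place of the instruction that was about to execute (see the `Execute()` hunks in this diff), while the JIT path keeps the previous formula; that reading is an assumption drawn from the diff, not something the PR states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (illustration only): the value written to R[14] on IRQ
// entry, given R[15] at that moment and the T bit of the old CPSR.
constexpr std::uint32_t IrqLinkValue(std::uint32_t r15, bool thumb, bool jit)
{
    return jit ? r15 + (thumb ? 2u : 0u)    // R[15] + (oldcpsr & 0x20 ? 2 : 0)
               : r15 - (thumb ? 0u : 4u);   // R[15] - (oldcpsr & 0x20 ? 0 : 4)
}

// Usage: the same R[15] yields different link values on the two paths.
inline void CheckIrqLinkValue()
{
    assert(IrqLinkValue(0x02000008u, false, false) == 0x02000004u); // interpreter, ARM
    assert(IrqLinkValue(0x02000004u, true,  false) == 0x02000004u); // interpreter, Thumb
    assert(IrqLinkValue(0x02000008u, false, true)  == 0x02000008u); // JIT, ARM
    assert(IrqLinkValue(0x02000004u, true,  true)  == 0x02000006u); // JIT, Thumb
}
```

The addresses are arbitrary example values; only the offsets matter.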
@@ -540,6 +540,11 @@ void ARM::TriggerIRQ()
         NDS.AREngine.RunCheats();
     }
 }
+template void ARM::TriggerIRQ<CPUExecuteMode::Interpreter>();
+template void ARM::TriggerIRQ<CPUExecuteMode::InterpreterGDB>();
+#ifdef JIT_ENABLED
+template void ARM::TriggerIRQ<CPUExecuteMode::JIT>();
+#endif

 void ARMv5::PrefetchAbort()
 {
@@ -550,17 +555,8 @@ void ARMv5::PrefetchAbort()
     CPSR |= 0x97;
     UpdateMode(oldcpsr, CPSR);

-    // this shouldn't happen, but if it does, we're stuck in some nasty endless loop
-    // so better take care of it
-    if (!(PU_Map[ExceptionBase>>12] & 0x04))
-    {
-        Log(LogLevel::Error, "!!!!! EXCEPTION REGION NOT EXECUTABLE. THIS IS VERY BAD!!\n");
-        NDS.Stop(Platform::StopReason::BadExceptionRegion);
-        return;
-    }
-
     R_ABT[2] = oldcpsr;
-    R[14] = R[15] + (oldcpsr & 0x20 ? 2 : 0);
+    R[14] = R[15] - (oldcpsr & 0x20 ? 0 : 4);
     JumpTo(ExceptionBase + 0x0C);
 }

@@ -599,7 +595,13 @@ void ARMv5::Execute()
         {
             Halted = 0;
             if (NDS.IME[0] & 0x1)
-                TriggerIRQ();
+            {
+#ifdef JIT_ENABLED
+                if constexpr (mode == CPUExecuteMode::JIT) TriggerIRQ<mode>();
+                else
+#endif
+                    IRQ = 1;
+            }
         }
         else
         {
@@ -634,7 +636,7 @@ void ARMv5::Execute()
             {
                 // this order is crucial otherwise idle loops waiting for an IRQ won't function
                 if (IRQ)
-                    TriggerIRQ();
+                    TriggerIRQ<mode>();

                 if (Halted || IdleLoop)
                 {
@@ -662,10 +664,18 @@ void ARMv5::Execute()
                 NextInstr[0] = NextInstr[1];
                 if (R[15] & 0x2) { NextInstr[1] >>= 16; CodeCycles = 0; }
                 else NextInstr[1] = CodeRead32(R[15], false);
-
-                // actually execute
-                u32 icode = (CurInstr >> 6) & 0x3FF;
-                ARMInterpreter::THUMBInstrTable[icode](this);
+
+
+                if (IRQ && !(CPSR & 0x80)) TriggerIRQ<mode>();
+                else if (CurInstr > 0xFFFFFFFF) [[unlikely]] // handle aborted instructions
+                {
+                    PrefetchAbort();
+                }
+                else [[likely]] // actually execute
+                {
+                    u32 icode = (CurInstr >> 6) & 0x3FF;
+                    ARMInterpreter::THUMBInstrTable[icode](this);
+                }
             }
             else
             {
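
The ARMv5 paths in this diff detect an aborted fetch from the widened 64-bit `CurInstr`: the Thumb path tests `CurInstr > 0xFFFFFFFF` and the ARM path tests bit 63. The code that sets the marker is not visible in this diff, so the sketch below assumes the fetch path sets bit 63 for a non-executable address; `TagFetch` and `kAbortFlag` are hypothetical names. Under that assumption it also shows why a "greater than 32 bits" test still fires after the `NextInstr[1] >>= 16` step used for the upper halfword.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Assumption (not shown in the diff): a failed fetch is marked by setting bit 63.
constexpr u64 kAbortFlag = u64{1} << 63;

// Hypothetical helper: combine a fetched opcode with the abort marker.
constexpr u64 TagFetch(u32 opcode, bool executable)
{
    return executable ? u64{opcode} : (u64{opcode} | kAbortFlag);
}

// The two checks, written as they appear in the diff.
constexpr bool ThumbAborted(u64 curInstr) { return curInstr > 0xFFFFFFFF; }
constexpr bool ArmAborted(u64 curInstr)   { return (curInstr & ((u64)1 << 63)) != 0; }

inline void CheckAbortTagging()
{
    // A normal fetch passes both checks.
    assert(!ArmAborted(TagFetch(0xE1A00000u, true)));
    assert(!ThumbAborted(TagFetch(0x46C046C0u, true)));
    // A tagged fetch is caught by the ARM check...
    assert(ArmAborted(TagFetch(0xE1A00000u, false)));
    // ...and by the Thumb check, even after shifting down to the upper halfword.
    assert(ThumbAborted(TagFetch(0x46C046C0u, false) >> 16));
}
```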
@@ -677,9 +687,14 @@ void ARMv5::Execute()
                 CurInstr = NextInstr[0];
                 NextInstr[0] = NextInstr[1];
                 NextInstr[1] = CodeRead32(R[15], false);
-
+
                 // actually execute
-                if (CheckCondition(CurInstr >> 28))
+                if (IRQ && !(CPSR & 0x80)) TriggerIRQ<mode>();
+                else if (CurInstr & ((u64)1<<63)) [[unlikely]] // handle aborted instructions
+                {
+                    PrefetchAbort();
+                }
+                else if (CheckCondition(CurInstr >> 28)) [[likely]] // actually execute
                 {
                     u32 icode = ((CurInstr >> 4) & 0xF) | ((CurInstr >> 16) & 0xFF0);
                     ARMInterpreter::ARMInstrTable[icode](this);
@@ -688,6 +703,10 @@ void ARMv5::Execute()
                 {
                     ARMInterpreter::A_BLX_IMM(this);
                 }
+                else if ((CurInstr & 0x0FF000F0) == 0x01200070)
+                {
+                    ARMInterpreter::A_BKPT(this); // always passes regardless of condition code
+                }
                 else
                     AddCycles_C();
             }
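
As a worked example of the decode expressions used in `ARMv5::Execute()`, the sketch below copies the ARM and Thumb table-index formulas and the BKPT mask test from the diff into standalone functions. The function names are hypothetical; the opcodes are the standard encodings of `MOV r0, r0`, `BKPT #0` and Thumb `MOV r8, r8`.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Hypothetical wrappers around the expressions in the diff.
constexpr u32 ArmTableIndex(u64 curInstr)
{
    // Bits 4-7 form the low nibble, bits 20-27 form the upper eight bits.
    return static_cast<u32>(((curInstr >> 4) & 0xF) | ((curInstr >> 16) & 0xFF0));
}

constexpr u32 ThumbTableIndex(u64 curInstr)
{
    // Top ten bits of the 16-bit opcode.
    return static_cast<u32>((curInstr >> 6) & 0x3FF);
}

constexpr bool MatchesBkpt(u64 curInstr)
{
    // The condition field (bits 28-31) is masked out, so any condition matches.
    return (curInstr & 0x0FF000F0) == 0x01200070;
}

inline void CheckDecode()
{
    assert(ArmTableIndex(0xE1A00000u) == 0x1A0u);  // MOV r0, r0
    assert(ArmTableIndex(0xE1200070u) == 0x127u);  // BKPT #0
    assert(ThumbTableIndex(0x46C0u) == 0x11Bu);    // Thumb MOV r8, r8
    assert(MatchesBkpt(0xE1200070u));              // condition AL
    assert(MatchesBkpt(0x01200070u));              // condition EQ, same pattern
    assert(!MatchesBkpt(0xE1A00000u));
}
```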
@@ -704,10 +723,8 @@ void ARMv5::Execute()
             /*if (NDS::IF[0] & NDS::IE[0])
             {
                 if (NDS::IME[0] & 0x1)
-                    TriggerIRQ();
+                    TriggerIRQ<mode>();
             }*/
-            if (IRQ) TriggerIRQ();
-
         }

         NDS.ARM9Timestamp += Cycles;
@@ -739,7 +756,10 @@ void ARMv4::Execute()
         {
             Halted = 0;
             if (NDS.IME[1] & 0x1)
-                TriggerIRQ();
+            {
+                if constexpr (mode == CPUExecuteMode::JIT) TriggerIRQ<mode>();
+                else IRQ = 1;
+            }
         }
         else
         {
@@ -773,7 +793,7 @@ void ARMv4::Execute()
             if (StopExecution)
             {
                 if (IRQ)
-                    TriggerIRQ();
+                    TriggerIRQ<mode>();

                 if (Halted || IdleLoop)
                 {
@@ -801,9 +821,13 @@ void ARMv4::Execute()
                 NextInstr[0] = NextInstr[1];
                 NextInstr[1] = CodeRead16(R[15]);

-                // actually execute
-                u32 icode = (CurInstr >> 6);
-                ARMInterpreter::THUMBInstrTable[icode](this);
+                if (IRQ && !(CPSR & 0x80)) TriggerIRQ<mode>();
+                else
+                {
+                    // actually execute
+                    u32 icode = (CurInstr >> 6);
+                    ARMInterpreter::THUMBInstrTable[icode](this);
+                }
             }
             else
             {
@@ -816,8 +840,8 @@ void ARMv4::Execute()
                 NextInstr[0] = NextInstr[1];
                 NextInstr[1] = CodeRead32(R[15]);

-                // actually execute
-                if (CheckCondition(CurInstr >> 28))
+                if (IRQ && !(CPSR & 0x80)) TriggerIRQ<mode>();
+                else if (CheckCondition(CurInstr >> 28)) // actually execute
                 {
                     u32 icode = ((CurInstr >> 4) & 0xF) | ((CurInstr >> 16) & 0xFF0);
                     ARMInterpreter::ARMInstrTable[icode](this);
@@ -838,9 +862,8 @@ void ARMv4::Execute()
             /*if (NDS::IF[1] & NDS::IE[1])
             {
                 if (NDS::IME[1] & 0x1)
-                    TriggerIRQ();
+                    TriggerIRQ<mode>();
             }*/
-            if (IRQ) TriggerIRQ();
         }

         NDS.ARM7Timestamp += Cycles;
@@ -1113,70 +1136,78 @@ u32 ARMv5::ReadMem(u32 addr, int size)
 }
 #endif

-void ARMv4::DataRead8(u32 addr, u32* val)
+bool ARMv4::DataRead8(u32 addr, u32* val)
 {
     *val = BusRead8(addr);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][0];
+    return true;
 }

-void ARMv4::DataRead16(u32 addr, u32* val)
+bool ARMv4::DataRead16(u32 addr, u32* val)
 {
     addr &= ~1;

     *val = BusRead16(addr);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][0];
+    return true;
 }

-void ARMv4::DataRead32(u32 addr, u32* val)
+bool ARMv4::DataRead32(u32 addr, u32* val)
 {
     addr &= ~3;

     *val = BusRead32(addr);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][2];
+    return true;
 }

-void ARMv4::DataRead32S(u32 addr, u32* val)
+bool ARMv4::DataRead32S(u32 addr, u32* val)
 {
     addr &= ~3;

     *val = BusRead32(addr);
     DataCycles += NDS.ARM7MemTimings[addr >> 15][3];
+    return true;
 }

-void ARMv4::DataWrite8(u32 addr, u8 val)
+bool ARMv4::DataWrite8(u32 addr, u8 val)
 {
     BusWrite8(addr, val);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][0];
+    return true;
 }

-void ARMv4::DataWrite16(u32 addr, u16 val)
+bool ARMv4::DataWrite16(u32 addr, u16 val)
 {
     addr &= ~1;

     BusWrite16(addr, val);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][0];
+    return true;
 }

-void ARMv4::DataWrite32(u32 addr, u32 val)
+bool ARMv4::DataWrite32(u32 addr, u32 val)
 {
     addr &= ~3;

     BusWrite32(addr, val);
     DataRegion = addr;
     DataCycles = NDS.ARM7MemTimings[addr >> 15][2];
+    return true;
 }

-void ARMv4::DataWrite32S(u32 addr, u32 val)
+bool ARMv4::DataWrite32S(u32 addr, u32 val)
 {
     addr &= ~3;

     BusWrite32(addr, val);
     DataCycles += NDS.ARM7MemTimings[addr >> 15][3];
+    return true;
 }
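
The ARMv4 accessors in this hunk now return `bool` and always return `true`. No caller is visible in this part of the diff, so the following is only a sketch of how a caller could use such a return value to leave the destination register untouched when an access fails. `FakeCore`, `FaultFrom`, `Aborted` and `LoadWord` are hypothetical names invented for illustration, and the fault rule and dummy read value are placeholders.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// Hypothetical stand-in for a CPU core; only the bool-returning read matters.
struct FakeCore
{
    u32 R[16] = {};
    bool Aborted = false;
    u32 FaultFrom = 0xFFFFFFFF; // illustrative rule: aligned addresses >= this fail

    bool DataRead32(u32 addr, u32* val)
    {
        addr &= ~3u;                 // same word alignment as the accessors in the diff
        if (addr >= FaultFrom) return false;
        *val = addr ^ 0xA5A5A5A5u;   // dummy memory contents
        return true;
    }
};

// Hypothetical LDR-style helper: the destination is written only on success.
inline bool LoadWord(FakeCore& cpu, u32 rd, u32 addr)
{
    u32 val = 0;
    if (!cpu.DataRead32(addr, &val))
    {
        cpu.Aborted = true;          // a real core would enter its abort handler here
        return false;
    }
    cpu.R[rd] = val;
    return true;
}

inline void CheckLoadWord()
{
    FakeCore cpu;
    cpu.FaultFrom = 0x1000;
    assert(LoadWord(cpu, 0, 0x10));
    assert(cpu.R[0] == (0x10u ^ 0xA5A5A5A5u));
    cpu.R[1] = 7;
    assert(!LoadWord(cpu, 1, 0x2000)); // failing read
    assert(cpu.R[1] == 7);             // destination left unchanged
    assert(cpu.Aborted);
}
```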

