Added support for SMP. Implementation of ARCONNECT. #75
base: master
Modules implemented:
- Inter-core Interrupt Unit (ICI)
- Interrupt Distribution Unit (IDU) (not fully tested)
- Global Free-Running Counter (GFRC) (not tested)
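For readers unfamiliar with the ICI, here is a minimal toy model of its command flow. It is a sketch under stated assumptions, not the code from this PR. The command numbers and the packing of the command word (command in bits 0..7, parameter in bits 8..23) follow what the Linux MCIP driver uses as far as I recall, so treat them as assumptions. `IciModel`, `mcip_cmd_word` and `ici_write_cmd` are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CORES 4

/* Command numbers assumed to match the Linux MCIP driver. */
enum {
    CMD_INTRPT_GENERATE_IRQ = 0x01, /* issuer raises an IPI on core <param> */
    CMD_INTRPT_GENERATE_ACK = 0x02, /* issuer acks the IPI sent by core <param> */
    CMD_INTRPT_READ_STATUS  = 0x03, /* is issuer's IPI to core <param> pending? */
    CMD_INTRPT_CHECK_SOURCE = 0x04, /* which cores interrupted the issuer? */
};

/* Hypothetical toy state: bit s of pending[r] means core s interrupted core r. */
typedef struct {
    uint32_t pending[MAX_CORES];
    uint32_t readback; /* stands in for the mcip_readback aux register */
} IciModel;

/* Pack a command word: cmd in bits 0..7, param in bits 8..23 (assumed layout). */
static uint32_t mcip_cmd_word(uint8_t cmd, uint16_t param)
{
    return (uint32_t)cmd | ((uint32_t)param << 8);
}

/* Model a write to mcip_cmd performed by core `issuer`. */
static void ici_write_cmd(IciModel *m, unsigned issuer, uint32_t word)
{
    uint8_t cmd = word & 0xff;
    unsigned param = (word >> 8) & 0xffff;

    assert(issuer < MAX_CORES);

    switch (cmd) {
    case CMD_INTRPT_GENERATE_IRQ:
        if (param < MAX_CORES) {
            m->pending[param] |= 1u << issuer;
        }
        break;
    case CMD_INTRPT_GENERATE_ACK:
        if (param < MAX_CORES) {
            m->pending[issuer] &= ~(1u << param);
        }
        break;
    case CMD_INTRPT_READ_STATUS:
        m->readback = (param < MAX_CORES)
                      ? (m->pending[param] >> issuer) & 1u
                      : 0;
        break;
    case CMD_INTRPT_CHECK_SOURCE:
        m->readback = m->pending[issuer];
        break;
    default:
        /* Unknown commands are ignored in this sketch. */
        break;
    }
}
```

In this model a sender first issues READ_STATUS to avoid re-raising an IPI that is still pending, and the receiver issues CHECK_SOURCE followed by one GENERATE_ACK per set bit. The real unit also drives an interrupt line into each core, which is left out here.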
|
Cannot compile it on Ubuntu 20.04:
For the record:
abrodkin
left a comment
Indeed, quad-core Linux on HS3x/4x does boot and runs hackbench, which is pretty cool in itself!
$ ./build/qemu-system-arc -M virt -serial mon:stdio -display none -kernel vmlinux -cpu archs -smp 4
Linux version 5.17.13 (abrodkin@abrodkin-5550) (arc-buildroot-linux-gnu-gcc.br_real (Buildroot 2022.05-rc2-58-g5821e96bd3) 11.3.0, GNU ld (GNU Binutils) 2.38) #2 SMP PREEMPT Mon Jun 6 15:30:47 +04 2022
Memory @ 80000000 [512M]
OF: fdt: Machine model: snps,zebu_hs-smp
earlycon: uart8250 at MMIO32 0xf0000000 (options '115200n8')
printk: bootconsole [uart8250] enabled
Failed to get possible-cpus from dtb, pretending all 4 cpus exist
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x0] CHIPID [0xffff]
processor [0] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
Zone ranges:
Normal [mem 0x0000000080000000-0x000000009fffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000080000000-0x000000009fffffff]
Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
percpu: Embedded 6 pages/cpu s14848 r8192 d26112 u49152
pcpu-alloc: s14848 r8192 d26112 u49152 alloc=6*8192
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Built 1 zonelists, mobility grouping on. Total pages: 65248
Kernel command line: earlycon=uart8250,mmio32,0xf0000000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1
Dentry cache hash table entries: 65536 (order: 5, 262144 bytes, linear)
Inode-cache hash table entries: 32768 (order: 4, 131072 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 513600K/524288K available (3636K kernel code, 595K rwdata, 776K rodata, 1560K init, 241K bss, 10688K reserved, 0K cma-reserved)
rcu: Preemptible hierarchical RCU implementation.
rcu: RCU event tracing is enabled.
Trampoline variant of Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
NR_IRQS: 512
MCIP: IDU supports 4 common irqs
Global-64-bit-Ctr clocksource not detected
Failed to initialize '/gfrc': -6
Console: colour dummy device 80x25
sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every 21474836475000000ns
Calibrating delay loop... 190.87 BogoMIPS (lpj=954368)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 0, 8192 bytes, linear)
Mountpoint-cache hash table entries: 2048 (order: 0, 8192 bytes, linear)
cblist_init_generic: Setting adjustable number of callback queues.
cblist_init_generic: Setting shift to 2 and lim to 1.
rcu: Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
Idle Task [1] (ptrval)
Trying to bring up CPU1 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x1] CHIPID [0xffff]
processor [1] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU1 LIVE ##: Executing Code...
Idle Task [2] (ptrval)
Trying to bring up CPU2 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x2] CHIPID [0xffff]
processor [2] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU2 LIVE ##: Executing Code...
Idle Task [3] (ptrval)
Trying to bring up CPU3 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x3] CHIPID [0xffff]
processor [3] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU3 LIVE ##: Executing Code...
smp: Brought up 1 node, 4 CPUs
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 1024 (order: 4, 131072 bytes, linear)
NET: Registered PF_NETLINK/PF_ROUTE protocol family
DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
NET: Registered PF_INET protocol family
IP idents hash table entries: 8192 (order: 3, 65536 bytes, linear)
tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 12288 bytes, linear)
TCP established hash table entries: 4096 (order: 1, 16384 bytes, linear)
TCP bind hash table entries: 4096 (order: 2, 32768 bytes, linear)
TCP: Hash tables configured (established 4096 bind 4096)
UDP hash table entries: 256 (order: 0, 8192 bytes, linear)
UDP-Lite hash table entries: 256 (order: 0, 8192 bytes, linear)
NET: Registered PF_UNIX/PF_LOCAL protocol family
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
arc-pct fpga:pct: use noncoherent DMA ops
This core does not have performance counters!
workingset: timestamp_bits=30 max_order=16 bucket_order=0
io scheduler mq-deadline registered
io scheduler kyber registered
simple-pm-bus fpga: use noncoherent DMA ops
Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
of_serial f0000000.serial: use noncoherent DMA ops
printk: console [ttyS0] disabled
f0000000.serial: ttyS0 at MMIO 0xf0000000 (irq = 1, base_baud = 3125000) is a 16550A
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [uart8250] disabled
printk: bootconsole [uart8250] disabled
NET: Registered PF_PACKET protocol family
NET: Registered PF_KEY protocol family
Freeing unused kernel image (initmem) memory: 1560K
This architecture does not have kernel memory protection.
Run /init as init process
with arguments:
/init
with environment:
HOME=/
TERM=linux
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Saving random seed: random: dd: uninitialized urandom read (32 bytes read)
OK
Starting network: OK
Welcome to Buildroot
buildroot login: random: crng init done
buildroot login: root
# hackbench
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 100 messages of 100 bytes
Time: 5.000
|
Still, would be good to see:
|
|
This was only tested on ARCv2. ARConnect might not even be enabled on ARCv3. It just doesn't properly complain when the -smp option is used.
|
@cupertinomiranda AFAIK it should be exactly the same for ARCv3. And that's what I have in the log: |
|
OK, with that trivial fix:
diff --git a/target/arc/regs-detail.def b/target/arc/regs-detail.def
index d0ab800f30..3ce3bf1a67 100644
--- a/target/arc/regs-detail.def
+++ b/target/arc/regs-detail.def
@@ -406,9 +406,9 @@ DEF(0x545, ARC_OPCODE_ARC700, NONE, aux_cabac_misc2)
/* ARConnect */
DEF (0xd0, ARC_OPCODE_ARCALL, NONE, mcip_bcr)
-DEF(0x600, ARC_OPCODE_ARCV2, NONE, mcip_cmd)
-DEF(0x601, ARC_OPCODE_ARCV2, NONE, mcip_wdata)
-DEF(0x602, ARC_OPCODE_ARCV2, NONE, mcip_readback)
+DEF(0x600, ARC_OPCODE_ARCALL, NONE, mcip_cmd)
+DEF(0x601, ARC_OPCODE_ARCALL, NONE, mcip_wdata)
+DEF(0x602, ARC_OPCODE_ARCALL, NONE, mcip_readback)
DEF(0x700, ARC_OPCODE_ARCALL, NONE, smart_control)
/*
I have much more success, see:
So basically, it works exactly as on a UP HS5x, in that it fails on user-space stuff. Great work anyway!
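Why the one-word change in the diff matters can be sketched with a toy lookup. This assumes the second `DEF` argument acts as a bitmask of CPU families for which the aux register exists. The bit values, the `AuxRegDef` type and `aux_lookup` are hypothetical and only illustrate the idea; they are not QEMU's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical family bits; the real ARC_OPCODE_* values may differ. */
enum {
    OPCODE_ARC700 = 1u << 0,
    OPCODE_ARCV2  = 1u << 1,
    OPCODE_ARCV3  = 1u << 2,
    OPCODE_ARCALL = OPCODE_ARC700 | OPCODE_ARCV2 | OPCODE_ARCV3,
};

typedef struct {
    uint32_t address;   /* aux register number */
    uint32_t cpu_mask;  /* families on which the register is defined */
    const char *name;
} AuxRegDef;

/* Return the definition visible to `cpu_family`, or NULL if there is none. */
static const AuxRegDef *aux_lookup(const AuxRegDef *table, size_t n,
                                   uint32_t address, uint32_t cpu_family)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].address == address && (table[i].cpu_mask & cpu_family)) {
            return &table[i];
        }
    }
    return NULL;
}

/* Before the fix: command/data/readback registers exist only for ARCv2. */
static const AuxRegDef before_fix[] = {
    { 0xd0,  OPCODE_ARCALL, "mcip_bcr" },
    { 0x600, OPCODE_ARCV2,  "mcip_cmd" },
    { 0x601, OPCODE_ARCV2,  "mcip_wdata" },
    { 0x602, OPCODE_ARCV2,  "mcip_readback" },
};

/* After the fix: the same registers are defined for every family. */
static const AuxRegDef after_fix[] = {
    { 0xd0,  OPCODE_ARCALL, "mcip_bcr" },
    { 0x600, OPCODE_ARCALL, "mcip_cmd" },
    { 0x601, OPCODE_ARCALL, "mcip_wdata" },
    { 0x602, OPCODE_ARCALL, "mcip_readback" },
};
```

Under this assumption an ARCv3 guest could read `mcip_bcr` before the fix, since it was already `ARCALL`, but could not reach `mcip_cmd`, `mcip_wdata` or `mcip_readback`. That would be consistent with the kernel detecting ARConnect yet not getting further.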
|
@cupertinomiranda FWIW rebased on top of today's |