Xeon Phi Co-processor Testing, Part 1 - Rocket Science
Wednesday, March 21, 2018
Xeon Phi Co-processor Testing, part 1
I won an auction for twelve Xeon Phi 71S1P coprocessor cards.
These are awesome: each one is basically a 61-core, ~1.1 GHz Linux server with on-board high-bandwidth memory, in a double-wide PCIe card package. They're very picky about motherboards and thermals, though. The motherboard BIOS must support "Above 4G Decoding", sometimes labeled "large memory BAR support" or something similar. Most Supermicro boards do, and all of the ASUS WS boards seem to. Even then, it's not guaranteed that every Phi will work.
The "P" Phis are passively cooled, which means they're really meant for server chassis. You can build cooling fan ducts for them, or buy or 3D print them.
I tested them in another DL380p G8 I had purchased (with 2x E5-2690s, which I later harvested). I bought the complete GPU package (see this post) for it, but the DL380p G8 only supports the 5110P Phi, which has a lower wattage rating than the 71S1P. To power the cards, I hacked an extra ATX PSU (shorted the sense pins) and ran the cables out through the other PCI riser's slots so I could close the lid.
The testing and firmware update procedure was fairly straightforward once I figured it out, and it can probably be adapted for your own system. Most of it follows the readme text file and user guide that come with the MPSS software.
- Update DL380p's firmware
- Install CentOS 7
- Install a Phi
- Change BIOS settings to enable large BAR support (in the advanced menu) and set the fans to max.
- Disable SELinux (re-enable it once you are done testing the Phis).
- Log in as root and create an RSA key with ssh-keygen so you can use SSH later. Do this before configuring MPSS for the first time; otherwise you have to load the key manually (see the readme text file).
- Download the MPSS software, readme, and user guide. If your firmware is older than that in the readme, try starting with an older MPSS. If you're using a kernel that isn't listed, then you can recompile the rpms using the instructions in the readme.
- Install MPSS (see the readme and user guide). I suggest rebooting.
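The kernel check in the download step above can be sketched in a few lines of shell. This is only an illustration: the function name kernel_supported and the SUPPORTED_KERNELS values are placeholders, not taken from the MPSS readme, so copy the real list from the readme that ships with your MPSS version.

```shell
# Return 0 if the kernel release in $1 appears in the
# whitespace-separated list in $2, and 1 otherwise.
kernel_supported() {
    running="$1"
    for k in $2; do
        [ "$k" = "$running" ] && return 0
    done
    return 1
}

# Placeholder list (assumption): replace with the kernels named in the readme.
SUPPORTED_KERNELS="3.10.0-327.el7.x86_64 3.10.0-514.el7.x86_64"

if kernel_supported "$(uname -r)" "$SUPPORTED_KERNELS"; then
    echo "kernel is on the list, the stock rpms should install"
else
    echo "kernel is not on the list, rebuild the rpms per the readme"
fi
```

The comparison is an exact string match on the output of uname -r, so a minor kernel update counts as "not listed", which is the cautious answer here.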
With MPSS installed, check that the host can see the card:
lspci | grep -i Co-processor
That will tell you which PCI bus address the card is at. Mine was 24:00.0, so I did:
lspci -s 24:00.0 -vv
If lspci doesn't recognize it, then there's a problem with your card (assuming your motherboard is compatible). A likely culprit is thermal overload, especially if you're trying to use a passive "P" card without a cooling system. I actually went back to the BIOS and enabled maximum cooling to help with this. If you have a desktop, you'll need to construct a custom cooling system (see above). Another possibility is that the card isn't seated well, so try reseating it. When none of that worked, I gave up on the card. I'm sure there is more advanced troubleshooting you could do, but I don't know how to do it. Intel tech support seems to be pretty good, so it might be worth asking them. Next, type:
modprobe mic
This loads the mic kernel module. If you have just installed or reinstalled MPSS, then you need to do:
micctrl --initdefaults
Then:
micflash -getversion
This reports the flash version currently on the card. It must be 375 for the latest MPSS release. Mine were 390. Then:
micctrl -s
This should return "ready". I'm not sure what to do if it does not. Run:
micinfo -group Board
This should return a bunch of information about your Phi, though not all of it will be available because MPSS isn't running yet. Next, update the firmware:
micflash -update -device all -smcbootloader
Then restart the host, and:
modprobe mic
micflash -getversion
This should show the new firmware version. Next, start MPSS:
systemctl start mpss
Now you should be able to ssh into the Phi's filesystem:
ssh mic0
If that didn't work, you need to see the readme section about ssh keys and loading them. Now, from the host, run:
miccheck
This should show all passes. Then run:
micinfo
This will show a lot of information about your Phi. You can launch a monitoring GUI with:
micsmc
That's it. If your Phi passed all of that, you should be able to install software on it. I haven't done this yet; that will be the topic of another post. You should also go to /etc/sysconfig/network-scripts/micX and change "onboot" to "no". I can't remember the exact reason for this, but it's in my notes.
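The "onboot" edit can be done by hand in any editor. With several cards, a small helper is one way to script it. The sketch below assumes the file uses the usual network-scripts key spelling ONBOOT=yes. The function name disable_onboot is made up for illustration, and you pass in the path of the actual micX file on your system.

```shell
# Set ONBOOT=no in the interface config file given as $1.
# Goes through a temporary file instead of relying on GNU-only "sed -i".
# If the file has no ONBOOT= line, it is left unchanged.
disable_onboot() {
    cfg="$1"
    tmp="$(mktemp)" || return 1
    if sed 's/^ONBOOT=.*/ONBOOT=no/' "$cfg" > "$tmp"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$cfg"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Run it as root, once per card, with that card's file as the only argument, and keep a backup copy of the original file until the next reboot has gone cleanly.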
For my lot, after all was said and done, 8 of the 12 were recognized by lspci and tested to be working. The lspci recognition was spotty, though, probably because these weren't 5110P's. I managed to sell them for a profit and kept one, the 7110P that was in the lot. While probably not useful for conventional CFD, it might be useful for something like OpenLB or anything super vectorizable that needs more oomph per core than a GPU can provide. They're also supposedly really good for mining some cryptocurrencies, though I haven't tried.
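When sorting through a whole lot of cards, the lspci detection step from the procedure above is the one that gets repeated most. A minimal sketch of pulling the bus address out of the lspci output follows. The function name phi_slots is hypothetical, and the filter simply reuses the same "Co-processor" match as the grep command above.

```shell
# Read lspci output on stdin and print the PCI address (first field)
# of every line that mentions a co-processor.
phi_slots() {
    grep -i 'co-processor' | awk '{ print $1 }'
}

# Only query the bus if lspci is actually installed on this machine.
if command -v lspci >/dev/null 2>&1; then
    for slot in $(lspci | phi_slots); do
        echo "co-processor found at $slot"
    done
fi
```

Each address it prints can be passed to lspci -s with -vv for the detailed view. An empty result means the host does not see a card at all.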
About me: Jed Storey. I like designing and building things. I need more space and machine tools...