@ratailor ratailor commented Jun 4, 2020

This change adds fpga-rsu ansible-role to install
OPAE FPGA packages on hosts where fpga hardware
is already installed.

lsmod | grep pac_n3000_net
register: verify_opae_driver

- name: Verify module is loaded in kernel and update flash

Looks good in general, a few comments:

  1. Depending on the currently installed firmware version, the upgrade procedure is different.
    See https://www.intel.com/content/www/us/en/programmable/documentation/xgz1560360700260.html#hpo1573151952874

  2. There are a number of checks that should be performed before firmware upgrade. From the same document above:

Note: These upgrades erase the Static Region (SR) root entry hash and any CSK cancellation IDs previously programmed in the flash of the Intel® FPGA PAC N3000.
Remember:
Stop any service or daemon accessing the FPGA or XL710 before updating the Intel® FPGA PAC N3000 such as fpgad.
PLDM requests may return stale data. Avoid Host PLDM requests.
Ensure cooling requirements are met. The server can reboot if the FPGA Core temperature exceeds 95°C. For more information, refer to Cooling Requirements.

Tip: Before you proceed with upgrade, ensure that the FPGA Die Temperature is below 80°C using the following command:

sudo fpgainfo bmc

If it is higher than the threshold value, increase the fan speed to improve thermal condition.
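
The temperature check above could be automated roughly as follows. This is only a sketch: the helper names `fpga_die_temp` and `temp_below` are hypothetical, and the parsing assumes `fpgainfo bmc` prints a line of the form `(12) FPGA Die Temperature : 62.50 Celsius`, which should be confirmed against the real output on the target host.

```shell
#!/usr/bin/env bash
# Sketch: gate the upgrade on the FPGA die temperature.
# ASSUMPTION: `fpgainfo bmc` prints a line such as
#   "(12) FPGA Die Temperature : 62.50 Celsius"
# Adjust the pattern if the output on your system differs.

# Hypothetical helper: reads `fpgainfo bmc` output on stdin and prints
# the numeric die temperature (first match only).
fpga_die_temp() {
    awk -F: '/FPGA Die Temperature/ { split($2, f, " "); print f[1]; exit }'
}

# Hypothetical helper: succeeds only if a reading was found and it is
# strictly below the limit (default 80, the value quoted in the tip).
temp_below() {
    local reading="$1" limit="${2:-80}"
    [ -n "$reading" ] &&
        awk -v t="$reading" -v l="$limit" 'BEGIN { exit !(t + 0 < l + 0) }'
}

# Usage on a host with OPAE installed:
#   temp="$(sudo fpgainfo bmc | fpga_die_temp)"
#   temp_below "$temp" 80 || { echo "FPGA die too hot or unreadable" >&2; exit 1; }
```

Treating a missing reading as a failure is deliberate: if the line cannot be parsed, it seems safer to stop than to proceed with the update.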

Also, perhaps it's best to attempt the RPM installation only once we know the packages work on RHEL-8.

The current packages work on RHEL-7 only.
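
One way to express that guard is to check the OS major version before installing anything. The sketch below assumes the host provides an os-release style file with a `VERSION_ID` field. The helper names are hypothetical, and the default of 7 reflects only the statement above that the current packages work on RHEL-7.

```shell
#!/usr/bin/env bash
# Sketch: only attempt the RPM installation on a supported major release.
# ASSUMPTION: the supported major version is 7, per the comment above.

# Hypothetical helper: prints the major version from an os-release style
# file (default /etc/os-release).
os_major_version() {
    local file="${1:-/etc/os-release}"
    # Source in a subshell so the file's variables do not leak into the caller.
    ( . "$file" && printf '%s\n' "${VERSION_ID%%.*}" )
}

# Hypothetical helper: succeeds only when the detected major version
# equals the expected one (default 7).
is_supported_release() {
    [ "$(os_major_version "$1")" = "${2:-7}" ]
}

# Usage:
#   is_supported_release /etc/os-release 7 || { echo "unsupported release" >&2; exit 1; }
```

The check looks at the version only. A fuller guard would probably also compare the `ID` field against `rhel`.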

Collaborator Author

Thank you, Marcelo, for the review.

This change only installs the RPMs. I will update the README to improve the documentation and then update the pull request.

matosatti commented Jun 24, 2020 via email

@ratailor
Collaborator Author

On Tue, Jun 23, 2020 at 10:19:22PM -0700, ratailor wrote:
> @ratailor commented on this pull request.
>
> > +      when: opae_packages_installed.rc != 0
> > +
> > +    - name: Verify opae and intel packages installed
> > +      shell: |-
> > +        set -o pipefail
> > +        rpm -qa | grep 'opae'
> > +      when: opae_packages_installed.rc == 0
> > +
> > +    - name: Verify opae driver installation
> > +      shell: |-
> > +        set -o pipefail
> > +        lsmod | grep fpga
> > +        lsmod | grep pac_n3000_net
> > +      register: verify_opae_driver
> > +
> > +- name: Verify module is loaded in kernel and update flash
>
> Thank you Marcelo for review.
>
> We are only trying to install rpm in this and I will update the README to improve doc and update the pull-request.

Right, but:

1) Installing the RPM triggers a firmware update.
2) To perform the firmware update, a number of steps must be done (they are not difficult; I can help you find the exact details if needed, drop me an email).

So I assume this automation code has to perform those steps (stop any daemons using the card, check the temperature).
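
As a rough illustration of the "stop any daemons using the card" step, a pre-install hook might look like the sketch below. It assumes the daemons run as systemd units (for example a unit named `fpgad`, which should be verified on the target host). The helper names `run` and `stop_fpga_services` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: stop services that may access the FPGA before the RPM install
# (and the firmware update it triggers) begins.
# ASSUMPTION: the daemons are managed by systemd; unit names such as
# "fpgad" must be checked on the real system.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop each named service in order; return on the first failure so the
# install never starts while a service may still hold the device.
stop_fpga_services() {
    local svc
    for svc in "$@"; do
        run systemctl stop "$svc" || return 1
    done
}

# Usage:
#   DRY_RUN=1 stop_fpga_services fpgad    # prints: systemctl stop fpgad
#   stop_fpga_services fpgad || exit 1    # really stops the service
```

The dry-run switch makes it possible to review the exact commands before running them against a host with the card installed.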

Yes, it would be good if you could provide the steps to perform after the RPM installation is complete.

