Skip to content

Commit

Permalink
Merge branch 'Add-new-PCI_DEV_FLAGS_NO_RELAXED_ORDERING-flag'
Browse files Browse the repository at this point in the history
Ding Tianhong says:

====================
Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

Some devices have problems with Transaction Layer Packets with the Relaxed
Ordering Attribute set.  This patch set adds a new PCIe Device Flag,
PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
devices with Relaxed Ordering issues, and a use of this new flag by the
cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
Ports.

It's been years since I've submitted kernel.org patches, I appolgise for the
almost certain submission errors.

v2: Alexander point out that the v1 was only a part of the whole solution,
    some platform which has some issues could use the new flag to indicate
    that it is not safe to enable relaxed ordering attribute, then we need
    to clear the relaxed ordering enable bits in the PCI configuration when
    initializing the device. So add a new second patch to modify the PCI
    initialization code to clear the relaxed ordering enable bit in the
    event that the root complex doesn't want relaxed ordering enabled.

    The third patch was base on the v1's second patch and only be changed
    to query the relaxed ordering enable bit in the PCI configuration space
    to allow the Chelsio NIC to send TLPs with the relaxed ordering attributes
    set.

    This version didn't plan to drop the defines for Intel Drivers to use the
    new checking way to enable relaxed ordering because it is not the hardest
    part of the moment, we could fix it in next patchset when this patches
    reach the goal.

v3: Redesigned the logic for pci_configure_relaxed_ordering when configuration,
    If a PCIe device didn't enable the relaxed ordering attribute default,
    we should not do anything in the PCIe configuration, otherwise we
    should check if any of the devices above us do not support relaxed
    ordering by the PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag, then base on
    the result if we get a return that indicate that the relaxed ordering
    is not supported we should update our device to disable relaxed ordering
    in configuration space. If the device above us doesn't exist or isn't
    the PCIe device, we shouldn't do anything and skip updating relaxed ordering
    because we are probably running in a guest.

v4: Rename the functions pcie_get_relaxed_ordering and pcie_disable_relaxed_ordering
    according John's suggestion, and modify the description, use the true/false
    as the return value.

    We shouldn't enable relaxed ordering attribute by the setting in the root
    complex configuration space for PCIe device, so fix it for cxgb4.

    Fix some format issues.

v5: Removed the unnecessary code for some function which only return the bool
    value, and add the check for VF device.

    Make this patch set base on 4.12-rc5.

v6: Fix the logic error in the need to enable the relaxed ordering attribute for cxgb4.

v7: The cxgb4 drivers will enable the PCIe Capability Device Control[Relaxed
    Ordering Enable] in PCI Probe() routine, this will break our current
    solution for some platform which has problematic when enable the relaxed
    ordering attribute. According to the latest recommendations, remove the
    enable_pcie_relaxed_ordering(), although it could not cover the Peer-to-Peer
    scene, but we agree to leave this problem until we really trigger it.

    Make this patch set base on 4.12 release version.

v8: Change the second patch title and description to make it more reasonable,
    add the acked-by from Alex and Ashok.

    Add a new patch to enable the Relaxed Ordering Attribute for cxgb4vf driver.

    Make this patch set base on 4.13-rc2.

v9: The document (https://software.intel.com/sites/default/files/managed/9e/
    bc/64-ia-32-architectures-optimization-manual.pdf) indicate that the Xeon
    processors based on Broadwell/Haswell microarchitecture has the problem
    with Relaxed Ordering Attribute enabled, so add the whole list Device ID
    from Intel to the patch.

v10: Significant rework based on Bjorn's feedback, reorganize the first 2 patches,
     now the Intel and AMD erratum soc has been divided to the different patches,
     rename the pcie_relaxed_ordering_supported() to pcie_relaxed_ordering_enabled(),
     and no need to check every intervening switch except the root ports, update
     some commits.

v11: We shouldn't let the Intel engineer to acked the AMD's erratum patch, fix the
     funny mistake.
====================

Signed-off-by: David S. Miller <[email protected]>
  • Loading branch information
davem330 committed Aug 15, 2017
2 parents 59a361b + b629276 commit bae514a
Show file tree
Hide file tree
Showing 9 changed files with 178 additions and 8 deletions.
1 change: 1 addition & 0 deletions drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
Original file line number Diff line number Diff line change
Expand Up @@ -529,6 +529,7 @@ enum { /* adapter flags */
USING_SOFT_PARAMS = (1 << 6),
MASTER_PF = (1 << 7),
FW_OFLD_CONN = (1 << 9),
ROOT_NO_RELAXED_ORDERING = (1 << 10),
};

enum {
Expand Down
23 changes: 17 additions & 6 deletions drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -4654,11 +4654,6 @@ static void print_port_info(const struct net_device *dev)
dev->name, adap->params.vpd.id, adap->name, buf);
}

static void enable_pcie_relaxed_ordering(struct pci_dev *dev)
{
pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
}

/*
* Free the following resources:
* - memory used for tables
Expand Down Expand Up @@ -4908,7 +4903,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
}

pci_enable_pcie_error_reporting(pdev);
enable_pcie_relaxed_ordering(pdev);
pci_set_master(pdev);
pci_save_state(pdev);

Expand Down Expand Up @@ -4947,6 +4941,23 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
adapter->msg_enable = DFLT_MSG_ENABLE;
memset(adapter->chan_map, 0xff, sizeof(adapter->chan_map));

/* If possible, we use PCIe Relaxed Ordering Attribute to deliver
* Ingress Packet Data to Free List Buffers in order to allow for
* chipset performance optimizations between the Root Complex and
* Memory Controllers. (Messages to the associated Ingress Queue
* notifying new Packet Placement in the Free Lists Buffers will be
* send without the Relaxed Ordering Attribute thus guaranteeing that
* all preceding PCIe Transaction Layer Packets will be processed
* first.) But some Root Complexes have various issues with Upstream
* Transaction Layer Packets with the Relaxed Ordering Attribute set.
* The PCIe devices which under the Root Complexes will be cleared the
* Relaxed Ordering bit in the configuration space, So we check our
* PCIe configuration space to see if it's flagged with advice against
* using Relaxed Ordering.
*/
if (!pcie_relaxed_ordering_enabled(pdev))
adapter->flags |= ROOT_NO_RELAXED_ORDERING;

spin_lock_init(&adapter->stats_lock);
spin_lock_init(&adapter->tid_release_lock);
spin_lock_init(&adapter->win0_lock);
Expand Down
5 changes: 3 additions & 2 deletions drivers/net/ethernet/chelsio/cxgb4/sge.c
Original file line number Diff line number Diff line change
Expand Up @@ -2719,6 +2719,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
struct fw_iq_cmd c;
struct sge *s = &adap->sge;
struct port_info *pi = netdev_priv(dev);
int relaxed = !(adap->flags & ROOT_NO_RELAXED_ORDERING);

/* Size needs to be multiple of 16, including status entry. */
iq->size = roundup(iq->size, 16);
Expand Down Expand Up @@ -2772,8 +2773,8 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,

flsz = fl->size / 8 + s->stat_len / sizeof(struct tx_desc);
c.iqns_to_fl0congen |= htonl(FW_IQ_CMD_FL0PACKEN_F |
FW_IQ_CMD_FL0FETCHRO_F |
FW_IQ_CMD_FL0DATARO_F |
FW_IQ_CMD_FL0FETCHRO_V(relaxed) |
FW_IQ_CMD_FL0DATARO_V(relaxed) |
FW_IQ_CMD_FL0PADEN_F);
if (cong >= 0)
c.iqns_to_fl0congen |=
Expand Down
1 change: 1 addition & 0 deletions drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
Original file line number Diff line number Diff line change
Expand Up @@ -408,6 +408,7 @@ enum { /* adapter flags */
USING_MSI = (1UL << 1),
USING_MSIX = (1UL << 2),
QUEUES_BOUND = (1UL << 3),
ROOT_NO_RELAXED_ORDERING = (1UL << 4),
};

/*
Expand Down
18 changes: 18 additions & 0 deletions drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -2888,6 +2888,24 @@ static int cxgb4vf_pci_probe(struct pci_dev *pdev,
*/
adapter->name = pci_name(pdev);
adapter->msg_enable = DFLT_MSG_ENABLE;

/* If possible, we use PCIe Relaxed Ordering Attribute to deliver
* Ingress Packet Data to Free List Buffers in order to allow for
* chipset performance optimizations between the Root Complex and
* Memory Controllers. (Messages to the associated Ingress Queue
* notifying new Packet Placement in the Free Lists Buffers will be
* send without the Relaxed Ordering Attribute thus guaranteeing that
* all preceding PCIe Transaction Layer Packets will be processed
* first.) But some Root Complexes have various issues with Upstream
* Transaction Layer Packets with the Relaxed Ordering Attribute set.
* The PCIe devices which under the Root Complexes will be cleared the
* Relaxed Ordering bit in the configuration space, So we check our
* PCIe configuration space to see if it's flagged with advice against
* using Relaxed Ordering.
*/
if (!pcie_relaxed_ordering_enabled(pdev))
adapter->flags |= ROOT_NO_RELAXED_ORDERING;

err = adap_init0(adapter);
if (err)
goto err_unmap_bar;
Expand Down
3 changes: 3 additions & 0 deletions drivers/net/ethernet/chelsio/cxgb4vf/sge.c
Original file line number Diff line number Diff line change
Expand Up @@ -2205,6 +2205,7 @@ int t4vf_sge_alloc_rxq(struct adapter *adapter, struct sge_rspq *rspq,
struct port_info *pi = netdev_priv(dev);
struct fw_iq_cmd cmd, rpl;
int ret, iqandst, flsz = 0;
int relaxed = !(adapter->flags & ROOT_NO_RELAXED_ORDERING);

/*
* If we're using MSI interrupts and we're not initializing the
Expand Down Expand Up @@ -2300,6 +2301,8 @@ int t4vf_sge_alloc_rxq(struct adapter *adapter, struct sge_rspq *rspq,
cpu_to_be32(
FW_IQ_CMD_FL0HOSTFCMODE_V(SGE_HOSTFCMODE_NONE) |
FW_IQ_CMD_FL0PACKEN_F |
FW_IQ_CMD_FL0FETCHRO_V(relaxed) |
FW_IQ_CMD_FL0DATARO_V(relaxed) |
FW_IQ_CMD_FL0PADEN_F);

/* In T6, for egress queue type FL there is internal overhead
Expand Down
43 changes: 43 additions & 0 deletions drivers/pci/probe.c
Original file line number Diff line number Diff line change
Expand Up @@ -1762,13 +1762,56 @@ static void pci_configure_extended_tags(struct pci_dev *dev)
PCI_EXP_DEVCTL_EXT_TAG);
}

/**
* pcie_relaxed_ordering_enabled - Probe for PCIe relaxed ordering enable
* @dev: PCI device to query
*
* Returns true if the device has enabled relaxed ordering attribute.
*/
bool pcie_relaxed_ordering_enabled(struct pci_dev *dev)
{
u16 v;

pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &v);

return !!(v & PCI_EXP_DEVCTL_RELAX_EN);
}
EXPORT_SYMBOL(pcie_relaxed_ordering_enabled);

static void pci_configure_relaxed_ordering(struct pci_dev *dev)
{
struct pci_dev *root;

/* PCI_EXP_DEVICE_RELAX_EN is RsvdP in VFs */
if (dev->is_virtfn)
return;

if (!pcie_relaxed_ordering_enabled(dev))
return;

/*
* For now, we only deal with Relaxed Ordering issues with Root
* Ports. Peer-to-Peer DMA is another can of worms.
*/
root = pci_find_pcie_root_port(dev);
if (!root)
return;

if (root->dev_flags & PCI_DEV_FLAGS_NO_RELAXED_ORDERING) {
pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
PCI_EXP_DEVCTL_RELAX_EN);
dev_info(&dev->dev, "Disable Relaxed Ordering because the Root Port didn't support it\n");
}
}

static void pci_configure_device(struct pci_dev *dev)
{
struct hotplug_params hpp;
int ret;

pci_configure_mps(dev);
pci_configure_extended_tags(dev);
pci_configure_relaxed_ordering(dev);

memset(&hpp, 0, sizeof(hpp));
ret = pci_get_hp_params(dev, &hpp);
Expand Down
89 changes: 89 additions & 0 deletions drivers/pci/quirks.c
Original file line number Diff line number Diff line change
Expand Up @@ -4015,6 +4015,95 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6868, PCI_CLASS_NOT_DEFINED, 8,
DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6869, PCI_CLASS_NOT_DEFINED, 8,
quirk_tw686x_class);

/*
* Some devices have problems with Transaction Layer Packets with the Relaxed
* Ordering Attribute set. Such devices should mark themselves and other
* Device Drivers should check before sending TLPs with RO set.
*/
static void quirk_relaxedordering_disable(struct pci_dev *dev)
{
dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING;
dev_info(&dev->dev, "Disable Relaxed Ordering Attributes to avoid PCIe Completion erratum\n");
}

/*
* Intel Xeon processors based on Broadwell/Haswell microarchitecture Root
* Complex has a Flow Control Credit issue which can cause performance
* problems with Upstream Transaction Layer Packets with Relaxed Ordering set.
*/
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f01, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f02, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f03, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f04, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f05, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f06, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f07, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f08, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f09, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0a, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0b, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0c, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0d, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0e, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f01, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f02, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f03, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f04, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f05, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f06, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f07, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f08, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f09, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0a, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0b, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0c, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0d, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0e, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);

/*
* The AMD ARM A1100 (AKA "SEATTLE") SoC has a bug in its PCIe Root Complex
* where Upstream Transaction Layer Packets with the Relaxed Ordering
* Attribute clear are allowed to bypass earlier TLPs with Relaxed Ordering
* set. This is a violation of the PCIe 3.0 Transaction Ordering Rules
* outlined in Section 2.4.1 (PCI Express(r) Base Specification Revision 3.0
* November 10, 2010). As a result, on this platform we can't use Relaxed
* Ordering for Upstream TLPs.
*/
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a00, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a01, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a02, PCI_CLASS_NOT_DEFINED, 8,
quirk_relaxedordering_disable);

/*
* Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same
* values for the Attribute as were supplied in the header of the
Expand Down
3 changes: 3 additions & 0 deletions include/linux/pci.h
Original file line number Diff line number Diff line change
Expand Up @@ -188,6 +188,8 @@ enum pci_dev_flags {
* the direct_complete optimization.
*/
PCI_DEV_FLAGS_NEEDS_RESUME = (__force pci_dev_flags_t) (1 << 11),
/* Don't use Relaxed Ordering for TLPs directed at this device */
PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 12),
};

enum pci_irq_reroute_variant {
Expand Down Expand Up @@ -1125,6 +1127,7 @@ bool pci_check_pme_status(struct pci_dev *dev);
void pci_pme_wakeup_bus(struct pci_bus *bus);
void pci_d3cold_enable(struct pci_dev *dev);
void pci_d3cold_disable(struct pci_dev *dev);
bool pcie_relaxed_ordering_enabled(struct pci_dev *dev);

/* PCI Virtual Channel */
int pci_save_vc_state(struct pci_dev *dev);
Expand Down

0 comments on commit bae514a

Please sign in to comment.