Discussion:
[PATCH v10 01/10] Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
Liviu Dudau
2014-09-08 14:00:02 UTC
The inline version of ioport_map() that gets used when !CONFIG_GENERIC_IOMAP
is wrong. It returns a mapped (i.e. virtual) address that can start from
zero and completely ignores the PCI_IOBASE and IO_SPACE_LIMIT that most
architectures that use !CONFIG_GENERIC_IOMAP define.
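
As an illustration only (the PCI_IOBASE and IO_SPACE_LIMIT values below are
made up; both are architecture specific):

/*
 * With, say, PCI_IOBASE = 0xfee00000 and IO_SPACE_LIMIT = 0xffff:
 *
 *   old: ioport_map(0x1f0, 8) == (void __iomem *)0x1f0      (not a mapping)
 *   new: ioport_map(0x1f0, 8) == (void __iomem *)0xfee001f0 (inside the
 *        remapped PCI I/O window, after masking with IO_SPACE_LIMIT)
 */
void __iomem *regs = ioport_map(0x1f0, 8);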

Signed-off-by: Liviu Dudau <***@arm.com>
Reviewed-by: Catalin Marinas <***@arm.com>
Acked-by: Arnd Bergmann <***@arndb.de>
Tested-by: Tanmay Inamdar <***@apm.com>
---
include/asm-generic/io.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 975e1cc..2e2161b 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -331,7 +331,7 @@ static inline void iounmap(void __iomem *addr)
#ifndef CONFIG_GENERIC_IOMAP
static inline void __iomem *ioport_map(unsigned long port, unsigned int nr)
{
- return (void __iomem *) port;
+ return (void __iomem *)(PCI_IOBASE + (port & IO_SPACE_LIMIT));
}

static inline void ioport_unmap(void __iomem *p)
--
2.0.4

Liviu Dudau
2014-09-08 14:00:02 UTC
If the firmware has not assigned all the bus resources and
we are not just probing the PCIe busses, it makes sense to
assign the unassigned resources in pci_scan_root_bus().
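
As a hedged sketch of the interaction (the driver-side names my_pci_ops,
sysdata and resources are hypothetical):

/* set before scanning if firmware-assigned resources must be kept */
pci_add_flags(PCI_PROBE_ONLY);

bus = pci_scan_root_bus(&pdev->dev, 0, &my_pci_ops, sysdata, &resources);
/* with the hunk below, pci_assign_unassigned_bus_resources() only runs
 * when PCI_PROBE_ONLY is not set */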

Cc: Bjorn Helgaas <***@google.com>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Jason Gunthorpe <***@obsidianresearch.com>
Cc: Rob Herring <robh+***@kernel.org>
Signed-off-by: Liviu Dudau <***@arm.com>
---
drivers/pci/probe.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef891d2..508cf61 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1953,6 +1953,9 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
if (!found)
pci_bus_update_busn_res_end(b, max);

+ if (!pci_has_flag(PCI_PROBE_ONLY))
+ pci_assign_unassigned_bus_resources(b);
+
pci_bus_add_devices(b);
return b;
}
--
2.0.4

Suravee Suthikulpanit
2014-09-12 10:20:02 UTC
Post by Liviu Dudau
If the firmware has not assigned all the bus resources and
we are not just probing the PCIe busses, it makes sense to
assign the unassigned resources in pci_scan_root_bus().
---
drivers/pci/probe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef891d2..508cf61 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1953,6 +1953,9 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
if (!found)
pci_bus_update_busn_res_end(b, max);
+ if (!pci_has_flag(PCI_PROBE_ONLY))
+ pci_assign_unassigned_bus_resources(b);
+
pci_bus_add_devices(b);
return b;
}
Liviu,

Besides the check for PCI_PROBE_ONLY here, I think we also need to avoid calling
"pci_enable_resources()" in the "driver/pci/pci.c: pcibios_enable_device()" for
PCI_PROBE_ONLY mode since the resource is not assigned by Linux. Otherwise, the
"drivers/pci/setup-res.c: pci_enable_resource()" would fail w/ error:

can't enable device: BAR ..... not assigned

Actually, in "arch/arm/kernel/bios32.c:", the weak "pcibios_enable_device()" function
also has the check for PCI_PROBE_ONLY mode before calling pci_enable_resources().
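
For reference, the check I am referring to looks roughly like this
(paraphrased from arch/arm/kernel/bios32.c, not a verbatim copy):

int pcibios_enable_device(struct pci_dev *dev, int mask)
{
	if (pci_has_flag(PCI_PROBE_ONLY))
		return 0;

	return pci_enable_resources(dev, mask);
}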

Thanks,

Suravee

Liviu Dudau
2014-09-12 10:40:02 UTC
Post by Suravee Suthikulpanit
Post by Liviu Dudau
If the firmware has not assigned all the bus resources and
we are not just probing the PCIe busses, it makes sense to
assign the unassigned resources in pci_scan_root_bus().
---
drivers/pci/probe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef891d2..508cf61 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1953,6 +1953,9 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
if (!found)
pci_bus_update_busn_res_end(b, max);
+ if (!pci_has_flag(PCI_PROBE_ONLY))
+ pci_assign_unassigned_bus_resources(b);
+
pci_bus_add_devices(b);
return b;
}
Liviu,
Besides the check for PCI_PROBE_ONLY here, I think we also need to avoid calling
"pci_enable_resources()" in the "driver/pci/pci.c: pcibios_enable_device()" for
PCI_PROBE_ONLY mode since the resource is not assigned by Linux. Otherwise, the
can't enable device: BAR ..... not assigned
Hmm, are you sure that is not because the host bridge resources have not been
requested? pci-host-generic.c uses PCI_PROBE_ONLY and yet it works with the series as
is. But it does request_resource() for the host bridge resources inside the driver.
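
In other words, the driver claims its bridge windows so that child BARs can
nest under them and pci_enable_resources() finds them assigned. A hedged
sketch of that step (the field names are hypothetical):

err = request_resource(&iomem_resource, &pcie->mem_window);
if (!err)
	err = request_resource(&ioport_resource, &pcie->io_window);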

Best regards,
Liviu
Post by Suravee Suthikulpanit
Actually, in "arch/arm/kernel/bios32.c:", the weak "pcibios_enable_device()" function
also has the check for PCI_PROBE_ONLY mode before calling pci_enable_resources().
Thanks,
Suravee
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯

Liviu Dudau
2014-09-08 14:00:03 UTC
Provide a function to parse the PCI DT ranges that can be used to
create a pci_host_bridge structure together with its associated
bus.
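
A hedged sketch of how a host controller probe path might use the helper
(error handling trimmed, variable names hypothetical):

LIST_HEAD(res);
resource_size_t iobase = 0;
int err;

err = of_pci_get_host_bridge_resources(dev->of_node, 0, 255, &res, &iobase);
if (err)
	return err;
/* 'res' now holds the bus range plus the IO/MEM windows parsed from
 * "ranges", ready to be handed to pci_scan_root_bus() */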

Cc: Bjorn Helgaas <***@google.com>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Grant Likely <***@linaro.org>
Cc: Rob Herring <robh+***@kernel.org>
Cc: Catalin Marinas <***@arm.com>
Signed-off-by: Liviu Dudau <***@arm.com>
---
drivers/of/of_pci.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 11 ++++++
2 files changed, 113 insertions(+)

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index a107edb..36701da 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -1,7 +1,9 @@
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_pci.h>
+#include <linux/slab.h>

static inline int __of_pci_pci_compare(struct device_node *node,
unsigned int data)
@@ -123,6 +125,106 @@ int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
}
EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);

+/**
+ * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT
+ * @dev: device node of the host bridge having the range property
+ * @busno: bus number associated with the bridge root bus
+ * @bus_max: maximum number of busses for this bridge
+ * @resources: list where the range of resources will be added after DT parsing
+ * @io_base: pointer to a variable that will contain on return the physical
+ * address for the start of the I/O range.
+ *
+ * It is the callers job to free the @resources list.
+ *
+ * This function will parse the "ranges" property of a PCI host bridge device
+ * node and setup the resource mapping based on its content. It is expected
+ * that the property conforms with the Power ePAPR document.
+ *
+ */
+int of_pci_get_host_bridge_resources(struct device_node *dev,
+ unsigned char busno, unsigned char bus_max,
+ struct list_head *resources, resource_size_t *io_base)
+{
+ struct resource *res;
+ struct resource *bus_range;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
+ char range_type[4];
+ int err;
+
+ bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
+ if (!bus_range)
+ return -ENOMEM;
+
+ pr_info("PCI host bridge %s ranges:\n", dev->full_name);
+
+ err = of_pci_parse_bus_range(dev, bus_range);
+ if (err) {
+ bus_range->start = busno;
+ bus_range->end = bus_max;
+ bus_range->flags = IORESOURCE_BUS;
+ pr_info(" No bus range found for %s, using %pR\n",
+ dev->full_name, &bus_range);
+ } else {
+ if (bus_range->end > bus_range->start + bus_max)
+ bus_range->end = bus_range->start + bus_max;
+ }
+ pci_add_resource(resources, bus_range);
+
+ /* Check for ranges property */
+ err = of_pci_range_parser_init(&parser, dev);
+ if (err)
+ goto parse_failed;
+
+ pr_debug("Parsing ranges property...\n");
+ for_each_of_pci_range(&parser, &range) {
+ /* Read next ranges element */
+ if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_IO)
+ snprintf(range_type, 4, " IO");
+ else if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_MEM)
+ snprintf(range_type, 4, "MEM");
+ else
+ snprintf(range_type, 4, "err");
+ pr_info(" %s %#010llx..%#010llx -> %#010llx\n", range_type,
+ range.cpu_addr, range.cpu_addr + range.size - 1,
+ range.pci_addr);
+
+ /*
+ * If we failed translation or got a zero-sized region
+ * then skip this range
+ */
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+ continue;
+
+ res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!res) {
+ err = -ENOMEM;
+ goto parse_failed;
+ }
+
+ err = of_pci_range_to_resource(&range, dev, res);
+ if (err) {
+ kfree(res);
+ goto parse_failed;
+ }
+
+ if (resource_type(res) == IORESOURCE_IO) {
+ if (*io_base)
+ pr_warn("More than one I/O resource converted. CPU offset for old range lost!\n");
+ *io_base = range.cpu_addr;
+ }
+
+ pci_add_resource_offset(resources, res, res->start - range.pci_addr);
+ }
+
+ return 0;
+
+parse_failed:
+ pci_free_resource_list(resources);
+ return err;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
#ifdef CONFIG_PCI_MSI

static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 3a3824c..a4b8a85 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -16,6 +16,10 @@ int of_pci_get_devfn(struct device_node *np);
int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing);
+int of_pci_get_host_bridge_resources(struct device_node *dev,
+ unsigned char busno, unsigned char bus_max,
+ struct list_head *resources, resource_size_t *io_base);
+
#else
static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
{
@@ -50,6 +54,13 @@ of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
{
return -1;
}
+
+int of_pci_get_host_bridge_resources(struct device_node *dev,
+ unsigned char busno, unsigned char bus_max,
+ struct list_head *resources, resource_size_t *io_base);
+{
+ return -EINVAL;
+}
#endif

#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
--
2.0.4

Lorenzo Pieralisi
2014-09-09 13:40:03 UTC
Post by Liviu Dudau
Provide a function to parse the PCI DT ranges that can be used to
create a pci_host_bridge structure together with its associated
bus.
---
drivers/of/of_pci.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 11 ++++++
2 files changed, 113 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index a107edb..36701da 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -1,7 +1,9 @@
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_pci.h>
+#include <linux/slab.h>
static inline int __of_pci_pci_compare(struct device_node *node,
unsigned int data)
@@ -123,6 +125,106 @@ int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
}
EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
+/**
+ * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT
+ * address for the start of the I/O range.
+ *
+ *
+ * This function will parse the "ranges" property of a PCI host bridge device
+ * node and setup the resource mapping based on its content. It is expected
+ * that the property conforms with the Power ePAPR document.
You should also define what it returns and when.
Post by Liviu Dudau
+ *
+ */
+int of_pci_get_host_bridge_resources(struct device_node *dev,
+ unsigned char busno, unsigned char bus_max,
+ struct list_head *resources, resource_size_t *io_base)
+{
+ struct resource *res;
+ struct resource *bus_range;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
+ char range_type[4];
+ int err;
+
+ bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
+ if (!bus_range)
+ return -ENOMEM;
+
+ pr_info("PCI host bridge %s ranges:\n", dev->full_name);
+
+ err = of_pci_parse_bus_range(dev, bus_range);
+ if (err) {
+ bus_range->start = busno;
+ bus_range->end = bus_max;
+ bus_range->flags = IORESOURCE_BUS;
+ pr_info(" No bus range found for %s, using %pR\n",
+ dev->full_name, &bus_range);
+ } else {
+ if (bus_range->end > bus_range->start + bus_max)
+ bus_range->end = bus_range->start + bus_max;
+ }
+ pci_add_resource(resources, bus_range);
This means that eg in the PCI generic host controller I cannot "filter"
the bus resource, unless I remove it, "filter" it, and add it again.

I certainly can't filter a resource that has been already added without
removing it first.

Thoughts ?
Post by Liviu Dudau
+
+ /* Check for ranges property */
+ err = of_pci_range_parser_init(&parser, dev);
+ if (err)
+ goto parse_failed;
+
+ pr_debug("Parsing ranges property...\n");
+ for_each_of_pci_range(&parser, &range) {
+ /* Read next ranges element */
+ if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_IO)
+ snprintf(range_type, 4, " IO");
+ else if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_MEM)
+ snprintf(range_type, 4, "MEM");
+ else
+ snprintf(range_type, 4, "err");
+ pr_info(" %s %#010llx..%#010llx -> %#010llx\n", range_type,
+ range.cpu_addr, range.cpu_addr + range.size - 1,
+ range.pci_addr);
+
+ /*
+ * If we failed translation or got a zero-sized region
+ * then skip this range
+ */
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+ continue;
+
+ res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!res) {
+ err = -ENOMEM;
+ goto parse_failed;
+ }
+
+ err = of_pci_range_to_resource(&range, dev, res);
+ if (err) {
+ kfree(res);
You might want to add a label to free res to make things more uniform.
Post by Liviu Dudau
+ goto parse_failed;
+ }
+
+ if (resource_type(res) == IORESOURCE_IO) {
+ if (*io_base)
You do not zero io_base in the first place so you should ask the API
user to do that. Is 0 a valid value BTW ? If it is you've got to resort
to something else to detect multiple IO resources.

Lorenzo

Lorenzo Pieralisi
2014-09-10 16:40:01 UTC
On Wed, Sep 10, 2014 at 04:32:33PM +0100, Liviu Dudau wrote:

[...]
Post by Lorenzo Pieralisi
Post by Liviu Dudau
+ /*
+ * If we failed translation or got a zero-sized region
+ * then skip this range
+ */
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+ continue;
+
+ res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!res) {
+ err = -ENOMEM;
+ goto parse_failed;
+ }
+
+ err = of_pci_range_to_resource(&range, dev, res);
+ if (err) {
+ kfree(res);
You might want to add a label to free res to make things more uniform.
Sorry, not following you. How would a label help here?
It was just a suggestion so ignore it if you do not think it is cleaner.
It is to make code more uniform and undo operations in one place instead of
doing it piecemeal (you kfree the res here and jump to complete the clean-up,
whereas you might want to add a different label and a different goto
destination and carry out the kfree there).

I do not mind either, it is just what I noticed.
Post by Lorenzo Pieralisi
Post by Liviu Dudau
+ goto parse_failed;
+ }
+
+ if (resource_type(res) == IORESOURCE_IO) {
+ if (*io_base)
You do not zero io_base in the first place so you should ask the API
user to do that. Is 0 a valid value BTW ? If it is you've got to resort
to something else to detect multiple IO resources.
No, zero is not a valid value. It is the cpu_addr value from the IO range, and I'm
hoping that no one is crazy enough to map PCI address space at CPU address zero.
Thanks for spotting the lack of initialisation though, I need to fix it.
Mmm...wasn't a trick question sorry :D

PCI host bridge /pci ranges:
IO 0x00000000..0x0000ffff -> 0x00000000
More than one I/O resource converted. CPU offset for old range lost!
MEM 0x41000000..0x7fffffff -> 0x41000000
pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-01]
pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x41000000-0x7fffffff]


Liviu Dudau
2014-09-08 14:00:04 UTC
The ranges property for a host bridge controller in DT describes
the mapping between the PCI bus address and the CPU physical address.
The resources framework however expects that the IO resources start
at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
The conversion from pci ranges to resources failed to take that into account.

In the process move the function into drivers/of/address.c as it now
depends on pci_address_to_pio() code and make it return an error code.
Also fix all the drivers that depend on the old behaviour by fetching
the CPU physical address based on the port number.
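
The driver-side consequence, as a short sketch (assuming res is an IO window
obtained through of_pci_range_to_resource()):

/* res->start is now a logical port number (0..IO_SPACE_LIMIT), not a CPU
 * physical address, so drivers that program hardware with the CPU address
 * have to convert it back: */
phys_addr_t io_start = pci_pio_to_address(res->start);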

Cc: Grant Likely <***@linaro.org>
Cc: Rob Herring <robh+***@kernel.org>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Linus Walleij <***@linaro.org>
Cc: Thierry Reding <***@gmail.com>
Cc: Simon Horman <***@verge.net.au>
Cc: Catalin Marinas <***@arm.com>
Signed-off-by: Liviu Dudau <***@arm.com>
---
arch/arm/mach-integrator/pci_v3.c | 23 ++++++++++----------
drivers/of/address.c | 46 +++++++++++++++++++++++++++++++++++++++
drivers/pci/host/pci-tegra.c | 10 ++++++---
drivers/pci/host/pcie-rcar.c | 21 +++++++++++++-----
include/linux/of_address.h | 13 ++---------
5 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/arch/arm/mach-integrator/pci_v3.c b/arch/arm/mach-integrator/pci_v3.c
index 05e1f73..3321e1b 100644
--- a/arch/arm/mach-integrator/pci_v3.c
+++ b/arch/arm/mach-integrator/pci_v3.c
@@ -660,6 +660,7 @@ static void __init pci_v3_preinit(void)
{
unsigned long flags;
unsigned int temp;
+ phys_addr_t io_address = pci_pio_to_address(io_mem.start);

pcibios_min_mem = 0x00100000;

@@ -701,7 +702,7 @@ static void __init pci_v3_preinit(void)
/*
* Setup window 2 - PCI IO
*/
- v3_writel(V3_LB_BASE2, v3_addr_to_lb_base2(io_mem.start) |
+ v3_writel(V3_LB_BASE2, v3_addr_to_lb_base2(io_address) |
V3_LB_BASE_ENABLE);
v3_writew(V3_LB_MAP2, v3_addr_to_lb_map2(0));

@@ -742,6 +743,7 @@ static void __init pci_v3_preinit(void)
static void __init pci_v3_postinit(void)
{
unsigned int pci_cmd;
+ phys_addr_t io_address = pci_pio_to_address(io_mem.start);

pci_cmd = PCI_COMMAND_MEMORY |
PCI_COMMAND_MASTER | PCI_COMMAND_INVALIDATE;
@@ -758,7 +760,7 @@ static void __init pci_v3_postinit(void)
"interrupt: %d\n", ret);
#endif

- register_isa_ports(non_mem.start, io_mem.start, 0);
+ register_isa_ports(non_mem.start, io_address, 0);
}

/*
@@ -867,33 +869,32 @@ static int __init pci_v3_probe(struct platform_device *pdev)

for_each_of_pci_range(&parser, &range) {
if (!range.flags) {
- of_pci_range_to_resource(&range, np, &conf_mem);
+ ret = of_pci_range_to_resource(&range, np, &conf_mem);
conf_mem.name = "PCIv3 config";
}
if (range.flags & IORESOURCE_IO) {
- of_pci_range_to_resource(&range, np, &io_mem);
+ ret = of_pci_range_to_resource(&range, np, &io_mem);
io_mem.name = "PCIv3 I/O";
}
if ((range.flags & IORESOURCE_MEM) &&
!(range.flags & IORESOURCE_PREFETCH)) {
non_mem_pci = range.pci_addr;
non_mem_pci_sz = range.size;
- of_pci_range_to_resource(&range, np, &non_mem);
+ ret = of_pci_range_to_resource(&range, np, &non_mem);
non_mem.name = "PCIv3 non-prefetched mem";
}
if ((range.flags & IORESOURCE_MEM) &&
(range.flags & IORESOURCE_PREFETCH)) {
pre_mem_pci = range.pci_addr;
pre_mem_pci_sz = range.size;
- of_pci_range_to_resource(&range, np, &pre_mem);
+ ret = of_pci_range_to_resource(&range, np, &pre_mem);
pre_mem.name = "PCIv3 prefetched mem";
}
- }

- if (!conf_mem.start || !io_mem.start ||
- !non_mem.start || !pre_mem.start) {
- dev_err(&pdev->dev, "missing ranges in device node\n");
- return -EINVAL;
+ if (ret < 0) {
+ dev_err(&pdev->dev, "missing ranges in device node\n");
+ return -EINVAL;
+ }
}

pci_v3.map_irq = of_irq_parse_and_map_pci;
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 2373a92..ff10b64 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -947,3 +947,49 @@ bool of_dma_is_coherent(struct device_node *np)
return false;
}
EXPORT_SYMBOL_GPL(of_dma_is_coherent);
+
+/*
+ * of_pci_range_to_resource - Create a resource from an of_pci_range
+ * @range: the PCI range that describes the resource
+ * @np: device node where the range belongs to
+ * @res: pointer to a valid resource that will be updated to
+ * reflect the values contained in the range.
+ *
+ * Returns EINVAL if the range cannot be converted to resource.
+ *
+ * Note that if the range is an IO range, the resource will be converted
+ * using pci_address_to_pio() which can fail if it is called too early or
+ * if the range cannot be matched to any host bridge IO space (our case here).
+ * To guard against that we try to register the IO range first.
+ * If that fails we know that pci_address_to_pio() will do too.
+ */
+int of_pci_range_to_resource(struct of_pci_range *range,
+ struct device_node *np, struct resource *res)
+{
+ int err;
+ res->flags = range->flags;
+ res->parent = res->child = res->sibling = NULL;
+ res->name = np->full_name;
+
+ if (res->flags & IORESOURCE_IO) {
+ unsigned long port = -1;
+ err = pci_register_io_range(range->cpu_addr, range->size);
+ if (err)
+ goto invalid_range;
+ port = pci_address_to_pio(range->cpu_addr);
+ if (port == (unsigned long)-1) {
+ err = -EINVAL;
+ goto invalid_range;
+ }
+ res->start = port;
+ } else {
+ res->start = range->cpu_addr;
+ }
+ res->end = res->start + range->size - 1;
+ return 0;
+
+invalid_range:
+ res->start = (resource_size_t)OF_BAD_ADDR;
+ res->end = (resource_size_t)OF_BAD_ADDR;
+ return err;
+}
diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 0fb0fdb..946935d 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -626,13 +626,14 @@ DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_relax_enable);
static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
{
struct tegra_pcie *pcie = sys_to_pcie(sys);
+ phys_addr_t io_start = pci_pio_to_address(pcie->io.start);

pci_add_resource_offset(&sys->resources, &pcie->mem, sys->mem_offset);
pci_add_resource_offset(&sys->resources, &pcie->prefetch,
sys->mem_offset);
pci_add_resource(&sys->resources, &pcie->busn);

- pci_ioremap_io(nr * SZ_64K, pcie->io.start);
+ pci_ioremap_io(nr * SZ_64K, io_start);

return 1;
}
@@ -737,6 +738,7 @@ static irqreturn_t tegra_pcie_isr(int irq, void *arg)
static void tegra_pcie_setup_translations(struct tegra_pcie *pcie)
{
u32 fpci_bar, size, axi_address;
+ phys_addr_t io_start = pci_pio_to_address(pcie->io.start);

/* Bar 0: type 1 extended configuration space */
fpci_bar = 0xfe100000;
@@ -749,7 +751,7 @@ static void tegra_pcie_setup_translations(struct tegra_pcie *pcie)
/* Bar 1: downstream IO bar */
fpci_bar = 0xfdfc0000;
size = resource_size(&pcie->io);
- axi_address = pcie->io.start;
+ axi_address = io_start;
afi_writel(pcie, axi_address, AFI_AXI_BAR1_START);
afi_writel(pcie, size >> 12, AFI_AXI_BAR1_SZ);
afi_writel(pcie, fpci_bar, AFI_FPCI_BAR1);
@@ -1520,7 +1522,9 @@ static int tegra_pcie_parse_dt(struct tegra_pcie *pcie)
}

for_each_of_pci_range(&parser, &range) {
- of_pci_range_to_resource(&range, np, &res);
+ err = of_pci_range_to_resource(&range, np, &res);
+ if (err < 0)
+ return err;

switch (res.flags & IORESOURCE_TYPE_BITS) {
case IORESOURCE_IO:
diff --git a/drivers/pci/host/pcie-rcar.c b/drivers/pci/host/pcie-rcar.c
index 4884ee5..61158e0 100644
--- a/drivers/pci/host/pcie-rcar.c
+++ b/drivers/pci/host/pcie-rcar.c
@@ -323,6 +323,7 @@ static void rcar_pcie_setup_window(int win, struct rcar_pcie *pcie)

/* Setup PCIe address space mappings for each resource */
resource_size_t size;
+ resource_size_t res_start;
u32 mask;

rcar_pci_write_reg(pcie, 0x00000000, PCIEPTCTLR(win));
@@ -335,8 +336,13 @@ static void rcar_pcie_setup_window(int win, struct rcar_pcie *pcie)
mask = (roundup_pow_of_two(size) / SZ_128) - 1;
rcar_pci_write_reg(pcie, mask << 7, PCIEPAMR(win));

- rcar_pci_write_reg(pcie, upper_32_bits(res->start), PCIEPARH(win));
- rcar_pci_write_reg(pcie, lower_32_bits(res->start), PCIEPARL(win));
+ if (res->flags & IORESOURCE_IO)
+ res_start = pci_pio_to_address(res->start);
+ else
+ res_start = res->start;
+
+ rcar_pci_write_reg(pcie, upper_32_bits(res_start), PCIEPARH(win));
+ rcar_pci_write_reg(pcie, lower_32_bits(res_start), PCIEPARL(win));

/* First resource is for IO */
mask = PAR_ENABLE;
@@ -363,9 +369,10 @@ static int rcar_pcie_setup(int nr, struct pci_sys_data *sys)

rcar_pcie_setup_window(i, pcie);

- if (res->flags & IORESOURCE_IO)
- pci_ioremap_io(nr * SZ_64K, res->start);
- else
+ if (res->flags & IORESOURCE_IO) {
+ phys_addr_t io_start = pci_pio_to_address(res->start);
+ pci_ioremap_io(nr * SZ_64K, io_start);
+ } else
pci_add_resource(&sys->resources, res);
}
pci_add_resource(&sys->resources, &pcie->busn);
@@ -935,8 +942,10 @@ static int rcar_pcie_probe(struct platform_device *pdev)
}

for_each_of_pci_range(&parser, &range) {
- of_pci_range_to_resource(&range, pdev->dev.of_node,
+ err = of_pci_range_to_resource(&range, pdev->dev.of_node,
&pcie->res[win++]);
+ if (err < 0)
+ return err;

if (win > RCAR_PCI_MAX_RESOURCES)
break;
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index f8cc7da..c9d70deb 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -23,17 +23,8 @@ struct of_pci_range {
#define for_each_of_pci_range(parser, range) \
for (; of_pci_range_parser_one(parser, range);)

-static inline void of_pci_range_to_resource(struct of_pci_range *range,
- struct device_node *np,
- struct resource *res)
-{
- res->flags = range->flags;
- res->start = range->cpu_addr;
- res->end = range->cpu_addr + range->size - 1;
- res->parent = res->child = res->sibling = NULL;
- res->name = np->full_name;
-}
-
+extern int of_pci_range_to_resource(struct of_pci_range *range,
+ struct device_node *np, struct resource *res);
/* Translate a DMA address from device space to CPU space */
extern u64 of_translate_dma_address(struct device_node *dev,
const __be32 *in_addr);
--
2.0.4

Liviu Dudau
2014-09-08 14:00:04 UTC
Before commit 7b5436635800 the pci_host_bridge was created before the root bus.
That commit added a needless dependency on the bus for pci_alloc_host_bridge()
and changed the creation order for no good reason. Revert the order of
creation as we are going to depend on the pci_host_bridge structure to retrieve the
domain number of the root bus.
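
The resulting flow in pci_create_root_bus(), summarised as a sketch:

bridge = pci_alloc_host_bridge();	/* 1. allocate the bridge first  */
b = pci_alloc_bus();			/* 2. then the root bus          */
bridge->bus = b;			/* 3. link the bridge to its bus */
error = device_register(&bridge->dev);	/* 4. register the bridge device */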

Cc: Bjorn Helgaas <***@google.com>
Signed-off-by: Liviu Dudau <***@arm.com>
Acked-by: Grant Likely <***@linaro.org>
Reviewed-by: Catalin Marinas <***@arm.com>
Tested-by: Tanmay Inamdar <***@apm.com>
---
drivers/pci/probe.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2..5ff72ec 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -515,7 +515,7 @@ static void pci_release_host_bridge_dev(struct device *dev)
kfree(bridge);
}

-static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
+static struct pci_host_bridge *pci_alloc_host_bridge(void)
{
struct pci_host_bridge *bridge;

@@ -524,7 +524,8 @@ static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
return NULL;

INIT_LIST_HEAD(&bridge->windows);
- bridge->bus = b;
+ bridge->dev.release = pci_release_host_bridge_dev;
+
return bridge;
}

@@ -1761,9 +1762,15 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
char bus_addr[64];
char *fmt;

+ bridge = pci_alloc_host_bridge();
+ if (!bridge)
+ return NULL;
+
+ bridge->dev.parent = parent;
+
b = pci_alloc_bus();
if (!b)
- return NULL;
+ goto err_out;

b->sysdata = sysdata;
b->ops = ops;
@@ -1772,26 +1779,19 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
if (b2) {
/* If we already got to this bus through a different bridge, ignore it */
dev_dbg(&b2->dev, "bus already known\n");
- goto err_out;
+ goto err_bus_out;
}

- bridge = pci_alloc_host_bridge(b);
- if (!bridge)
- goto err_out;
-
- bridge->dev.parent = parent;
- bridge->dev.release = pci_release_host_bridge_dev;
+ bridge->bus = b;
dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
error = pcibios_root_bridge_prepare(bridge);
- if (error) {
- kfree(bridge);
+ if (error)
goto err_out;
- }

error = device_register(&bridge->dev);
if (error) {
put_device(&bridge->dev);
- goto err_out;
+ goto err_bus_out;
}
b->bridge = get_device(&bridge->dev);
device_enable_async_suspend(b->bridge);
@@ -1848,8 +1848,10 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
class_dev_reg_err:
put_device(&bridge->dev);
device_unregister(&bridge->dev);
-err_out:
+err_bus_out:
kfree(b);
+err_out:
+ kfree(bridge);
return NULL;
}
--
2.0.4

Liviu Dudau
2014-09-08 14:00:05 UTC
The handling of PCI domains (or PCI segments in ACPI speak) is
usually a straightforward affair but its implementation is
currently left to the architectural code, with pci_domain_nr(b)
querying the value of the domain associated with bus b.

This patch introduces CONFIG_PCI_DOMAINS_GENERIC as an
option that can be selected if an architecture wants a
simple implementation where the value of the domain
associated with a bus is stored in struct pci_bus.

The architectures that select CONFIG_PCI_DOMAINS_GENERIC will
then have to implement pci_bus_assign_domain_nr() as a way
of setting the domain number associated with a root bus.
All child busses except the root bus will inherit the domain_nr
value from their parent.
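
A hedged sketch of the kind of hook such an architecture could provide (only
the shape of it, not the implementation from this series):

void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent)
{
	bus->domain_nr = (parent && parent->of_node) ?
			 of_pci_get_domain_nr(parent->of_node, true) : 0;
}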

Cc: Bjorn Helgaas <***@google.com>
Cc: Arnd Bergmann <***@arndb.de>
Signed-off-by: Catalin Marinas <***@arm.com>
[Renamed pci_set_domain_nr() to pci_bus_assign_domain_nr()]
Signed-off-by: Liviu Dudau <***@arm.com>
---
drivers/pci/probe.c | 11 ++++++++---
include/linux/pci.h | 21 +++++++++++++++++++++
2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 5ff72ec..ef891d2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -485,7 +485,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
}
}

-static struct pci_bus *pci_alloc_bus(void)
+static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
{
struct pci_bus *b;

@@ -500,6 +500,10 @@ static struct pci_bus *pci_alloc_bus(void)
INIT_LIST_HEAD(&b->resources);
b->max_bus_speed = PCI_SPEED_UNKNOWN;
b->cur_bus_speed = PCI_SPEED_UNKNOWN;
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+ if (parent)
+ b->domain_nr = parent->domain_nr;
+#endif
return b;
}

@@ -672,7 +676,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
/*
* Allocate a new bus, and inherit stuff from the parent..
*/
- child = pci_alloc_bus();
+ child = pci_alloc_bus(parent);
if (!child)
return NULL;

@@ -1768,13 +1772,14 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,

bridge->dev.parent = parent;

- b = pci_alloc_bus();
+ b = pci_alloc_bus(NULL);
if (!b)
goto err_out;

b->sysdata = sysdata;
b->ops = ops;
b->number = b->busn_res.start = bus;
+ pci_bus_assign_domain_nr(b, parent);
b2 = pci_find_bus(pci_domain_nr(b), bus);
if (b2) {
/* If we already got to this bus through a different bridge, ignore it */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 61978a4..a494e5d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -456,6 +456,9 @@ struct pci_bus {
unsigned char primary; /* number of primary bridge */
unsigned char max_bus_speed; /* enum pci_bus_speed */
unsigned char cur_bus_speed; /* enum pci_bus_speed */
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+ int domain_nr;
+#endif

char name[48];

@@ -1288,6 +1291,24 @@ static inline int pci_domain_nr(struct pci_bus *bus) { return 0; }
static inline int pci_proc_domain(struct pci_bus *bus) { return 0; }
#endif /* CONFIG_PCI_DOMAINS */

+/*
+ * Generic implementation for PCI domain support. If your
+ * architecture does not need custom management of PCI
+ * domains then this implementation will be used
+ */
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+static inline int pci_domain_nr(struct pci_bus *bus)
+{
+ return bus->domain_nr;
+}
+void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent);
+#else
+static inline void pci_bus_assign_domain_nr(struct pci_bus *bus,
+ struct device *parent)
+{
+}
+#endif
+
/* some architectures require additional setup to direct VGA traffic */
typedef int (*arch_set_vga_state_t)(struct pci_dev *pdev, bool decode,
unsigned int command_bits, u32 flags);
--
2.0.4

Liviu Dudau
2014-09-08 14:10:03 UTC
Post by Liviu Dudau
The handling of PCI domains (or PCI segments in ACPI speak) is
usually a straightforward affair but its implementation is
currently left to the architectural code, with pci_domain_nr(b)
querying the value of the domain associated with bus b.
This patch introduces CONFIG_PCI_DOMAINS_GENERIC as an
option that can be selected if an architecture want a
simple implementation where the value of the domain
associated with a bus is stored in struct pci_bus.
The architectures that select CONFIG_PCI_DOMAINS_GENERIC will
then have to implement pci_bus_assign_domain_nr() as a way
of setting the domain number associated with a root bus.
All child busses except the root bus will inherit the domain_nr
value from their parent.
[Renamed pci_set_domain_nr() to pci_bus_assign_domain_nr()]
I'm probably the author here, but the patch log doesn't say so (I don't
mind, just a remark).
I initially took your patch from an email fragment, sorry, I should've
altered the committer accordingly.

Best regards,
Liviu
--
Catalin
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯

Liviu Dudau
2014-09-08 14:10:02 UTC
This is needed for calls into OF code that parses PCI ranges.
It signals support for memory mapped PCI I/O accesses that
are described by device trees.

Cc: Russell King <***@arm.linux.org.uk>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Rob Herring <robh+***@kernel.org>
Reviewed-by: Catalin Marinas <***@arm.com>
Signed-off-by: Liviu Dudau <***@arm.com>
---
arch/arm/include/asm/io.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 3d23418..22b7529 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -178,6 +178,7 @@ static inline void __iomem *__typesafe_io(unsigned long addr)

/* PCI fixed i/o mapping */
#define PCI_IO_VIRT_BASE 0xfee00000
+#define PCI_IOBASE PCI_IO_VIRT_BASE

#if defined(CONFIG_PCI)
void pci_ioremap_set_mem_type(int mem_type);
--
2.0.4

Liviu Dudau
2014-09-08 14:10:03 UTC
Some architectures do not have a simple view of the PCI I/O space
and instead use a range of CPU addresses that map to bus addresses.
For some architectures these ranges will be expressed by OF bindings
in a device tree file.

This patch introduces a pci_register_io_range() helper function with
a generic implementation that can be used by such architectures to
keep track of the I/O ranges described by the PCI bindings. If the
PCI_IOBASE macro is not defined that signals lack of support for PCI
and we return an error.

In order to retrieve the CPU address associated with an I/O port, a
new helper function pci_pio_to_address() is introduced. This will
search in the list of ranges registered with pci_register_io_range()
and return the CPU address that corresponds to the given port.
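
A short usage sketch (the addresses are made up for illustration):

/* record a 64K CPU window for PCI I/O, as described by a DT "ranges" entry */
err = pci_register_io_range(0x3eff0000, SZ_64K);

/* later, translate logical port 0x1000 back to its CPU physical address;
 * with only the range above registered this yields 0x3eff1000 */
phys_addr_t pa = pci_pio_to_address(0x1000);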

Cc: Grant Likely <***@linaro.org>
Cc: Arnd Bergmann <***@arndb.de>
Reviewed-by: Catalin Marinas <***@arm.com>
Acked-by: Rob Herring <***@kernel.org>
Signed-off-by: Liviu Dudau <***@arm.com>
---
drivers/of/address.c | 100 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_address.h | 2 +
2 files changed, 102 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index e371825..2373a92 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -5,6 +5,7 @@
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/pci_regs.h>
+#include <linux/slab.h>
#include <linux/string.h>

/* Max address size we deal with */
@@ -601,12 +602,111 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
}
EXPORT_SYMBOL(of_get_address);

+#ifdef PCI_IOBASE
+struct io_range {
+ struct list_head list;
+ phys_addr_t start;
+ resource_size_t size;
+};
+
+static LIST_HEAD(io_range_list);
+static DEFINE_SPINLOCK(io_range_lock);
+#endif
+
+/*
+ * Record the PCI IO range (expressed as CPU physical address + size).
+ * Return a negative value if an error has occured, zero otherwise
+ */
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifdef PCI_IOBASE
+ struct io_range *range;
+ resource_size_t allocated_size = 0;
+
+ /* check if the range hasn't been previously recorded */
+ spin_lock(&io_range_lock);
+ list_for_each_entry(range, &io_range_list, list) {
+ if (addr >= range->start && addr + size <= range->start + size)
+ return 0;
+ allocated_size += range->size;
+ }
+ spin_unlock(&io_range_lock);
+
+ /* range not registed yet, check for available space */
+ if (allocated_size + size - 1 > IO_SPACE_LIMIT) {
+ /* if it's too big check if 64K space can be reserved */
+ if (allocated_size + SZ_64K - 1 > IO_SPACE_LIMIT)
+ return -E2BIG;
+
+ size = SZ_64K;
+ pr_warn("Requested IO range too big, new size set to 64K\n");
+ }
+
+ /* add the range to the list */
+ range = kzalloc(sizeof(*range), GFP_KERNEL);
+ if (!range)
+ return -ENOMEM;
+
+ range->start = addr;
+ range->size = size;
+
+ spin_lock(&io_range_lock);
+ list_add_tail(&range->list, &io_range_list);
+ spin_unlock(&io_range_lock);
+#endif
+
+ return 0;
+}
+
+phys_addr_t pci_pio_to_address(unsigned long pio)
+{
+ phys_addr_t address = (phys_addr_t)OF_BAD_ADDR;
+
+#ifdef PCI_IOBASE
+ struct io_range *range;
+ resource_size_t allocated_size = 0;
+
+ if (pio > IO_SPACE_LIMIT)
+ return address;
+
+ spin_lock(&io_range_lock);
+ list_for_each_entry(range, &io_range_list, list) {
+ if (pio >= allocated_size && pio < allocated_size + range->size) {
+ address = range->start + pio - allocated_size;
+ break;
+ }
+ allocated_size += range->size;
+ }
+ spin_unlock(&io_range_lock);
+#endif
+
+ return address;
+}
+
unsigned long __weak pci_address_to_pio(phys_addr_t address)
{
+#ifdef PCI_IOBASE
+ struct io_range *res;
+ resource_size_t offset = 0;
+ unsigned long addr = -1;
+
+ spin_lock(&io_range_lock);
+ list_for_each_entry(res, &io_range_list, list) {
+ if (address >= res->start && address < res->start + res->size) {
+ addr = res->start - address + offset;
+ break;
+ }
+ offset += res->size;
+ }
+ spin_unlock(&io_range_lock);
+
+ return addr;
+#else
if (address > IO_SPACE_LIMIT)
return (unsigned long)-1;

return (unsigned long) address;
+#endif
}

static int __of_address_to_resource(struct device_node *dev,
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index fb7b722..f8cc7da 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -55,7 +55,9 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
extern const __be32 *of_get_address(struct device_node *dev, int index,
u64 *size, unsigned int *flags);

+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
extern unsigned long pci_address_to_pio(phys_addr_t addr);
+extern phys_addr_t pci_pio_to_address(unsigned long pio);

extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
struct device_node *node);
--
2.0.4

Liviu Dudau
2014-09-08 14:10:02 UTC
Add of_pci_get_domain_nr() to retrieve the PCI domain number
of a given device from DT. If the information is not present,
the function can be requested to allocate a new domain number.
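
A hedged usage sketch (the alias follows the "pci-domain" stem passed to
of_alias_get_id(); the node is illustrative):

/* DT:  aliases { pci-domain0 = &pcie0; };  ->  domain 0 for that node */
int domain = of_pci_get_domain_nr(np, true);
if (domain < 0)
	return domain;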

Cc: Bjorn Helgaas <***@google.com>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Grant Likely <***@linaro.org>
Cc: Rob Herring <robh+***@kernel.org>
Reviewed-by: Catalin Marinas <***@arm.com>
Signed-off-by: Liviu Dudau <***@arm.com>
---
drivers/of/of_pci.c | 34 ++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 7 +++++++
2 files changed, 41 insertions(+)

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..a107edb 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);

+static atomic_t of_domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used. The local allocator can
+ * be requested to return a new domain_nr if the information is missing
+ * from the device tree.
+ *
+ * @node: device tree node with the domain information
+ * @allocate_if_missing: if DT lacks information about the domain nr,
+ * allocate a new number.
+ *
+ * Returns the associated domain number from DT, or a new domain number
+ * if DT information is missing and @allocate_if_missing is true. If
+ * @allocate_if_missing is false then the last allocated domain number
+ * will be returned.
+ */
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+ int domain;
+
+ domain = of_alias_get_id(node, "pci-domain");
+ if (domain == -ENODEV) {
+ if (allocate_if_missing)
+ domain = atomic_inc_return(&of_domain_nr);
+ else
+ domain = atomic_read(&of_domain_nr);
+ }
+
+ return domain;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
+
#ifdef CONFIG_PCI_MSI

static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index dde3a4a..3a3824c 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -15,6 +15,7 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
int of_pci_get_devfn(struct device_node *np);
int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing);
#else
static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
{
@@ -43,6 +44,12 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
{
return -EINVAL;
}
+
+static inline int
+of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+ return -1;
+}
#endif

#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
--
2.0.4

Liviu Dudau
2014-09-08 15:00:02 UTC
Post by Liviu Dudau
Add of_pci_get_domain_nr() to retrieve the PCI domain number
of a given device from DT. If the information is not present,
the function can be requested to allocate a new domain number.
---
drivers/of/of_pci.c | 34 ++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 7 +++++++
2 files changed, 41 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..a107edb 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+static atomic_t of_domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used. The local allocator can
+ * be requested to return a new domain_nr if the information is missing
+ * from the device tree.
+ *
+ * allocate a new number.
+ *
+ * Returns the associated domain number from DT, or a new domain number
+ * will be returned.
+ */
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+ int domain;
+
+ domain = of_alias_get_id(node, "pci-domain");
+ if (domain == -ENODEV) {
+ if (allocate_if_missing)
+ domain = atomic_inc_return(&of_domain_nr);
+ else
+ domain = atomic_read(&of_domain_nr);
This function seems a bit broken to me. It is overloaded with too many
different outcomes. Think about how this would work if you have
multiple PCI buses and a mixture of having pci-domain aliases or not.
Aren't domain numbers global? Allocation should then start outside of
the range of alias ids.
Rob
Rob,

Would this version make more sense?

int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
{
int domain;

domain = of_alias_get_id(node, "pci-domain");
if (domain == -ENODEV) {
if (allocate_if_missing)
domain = atomic_inc_return(&of_domain_nr);
else
domain = atomic_read(&of_domain_nr);
} else {
/* remember the largest value seen */
int d = atomic_read(&of_domain_nr);
atomic_set(&of_domain_nr, max(domain, d));
}

return domain;
}
EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);

It would still create gaps and possible duplicates, but this is just a number
and trying to create a new root bus in an existing domain should fail. I have
no clue on how to generate unique values without parsing the DT and filling
a sparse array with values found there and then checking for allocated values
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.


Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯

Rob Herring
2014-09-08 15:30:02 UTC
Post by Liviu Dudau
Post by Liviu Dudau
Add of_pci_get_domain_nr() to retrieve the PCI domain number
of a given device from DT. If the information is not present,
the function can be requested to allocate a new domain number.
---
drivers/of/of_pci.c | 34 ++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 7 +++++++
2 files changed, 41 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..a107edb 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+static atomic_t of_domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used. The local allocator can
+ * be requested to return a new domain_nr if the information is missing
+ * from the device tree.
+ *
+ * allocate a new number.
+ *
+ * Returns the associated domain number from DT, or a new domain number
+ * will be returned.
+ */
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+ int domain;
+
+ domain = of_alias_get_id(node, "pci-domain");
+ if (domain == -ENODEV) {
+ if (allocate_if_missing)
+ domain = atomic_inc_return(&of_domain_nr);
+ else
+ domain = atomic_read(&of_domain_nr);
This function seems a bit broken to me. It is overloaded with too many
different outcomes. Think about how this would work if you have
multiple PCI buses and a mixture of having pci-domain aliases or not.
Aren't domain numbers global? Allocation should then start outside of
the range of alias ids.
Rob
Rob,
Would this version make more sense?
No.
Post by Liviu Dudau
int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
{
int domain;
domain = of_alias_get_id(node, "pci-domain");
if (domain == -ENODEV) {
if (allocate_if_missing)
domain = atomic_inc_return(&of_domain_nr);
else
domain = atomic_read(&of_domain_nr);
} else {
/* remember the largest value seen */
int d = atomic_read(&of_domain_nr);
atomic_set(&of_domain_nr, max(domain, d));
}
return domain;
}
EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
It would still create gaps and possible duplicates, but this is just a number
and trying to create a new root bus in an existing domain should fail. I have
Is failure okay in that case?
Post by Liviu Dudau
no clue on how to generate unique values without parsing the DT and filling
a sparse array with values found there and then checking for allocated values
You really only need to know the maximum value and then start the
non-DT allocations from there.
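
Something like the following illustrative sketch (no locking, and not what
the posted patch does):

static int of_domain_next;	/* next number for nodes without an alias */

int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
{
	int domain = of_alias_get_id(node, "pci-domain");

	if (domain >= 0) {
		/* start non-DT allocations above the largest alias id seen */
		of_domain_next = max(of_domain_next, domain + 1);
		return domain;
	}

	return allocate_if_missing ? of_domain_next++ : -ENODEV;
}
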
Post by Liviu Dudau
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.
Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.

I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT? This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.

Rob
Jason Gunthorpe
2014-09-08 16:40:01 UTC
Post by Rob Herring
I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT?
ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.
A PCI domain qualifies the bus:device.function addressing so that we
can have duplicate BDFs in the system.

So, yes, they belong in DT - each 'top level' PCI node naturally
represents a unique set of BDF addressing below it, and could alias
other top level nodes. The first step to resolve a BDF to a DT node is
to find the correct 'top level' by mapping the domain number.

In an ideal world no small scale system should ever have domains. They
were primarily introduced for large scale NUMA system that actually
have more than 256 PCI busses.

However, from a DT prespective, we've been saying that if the SOC
presents a PCI-E root port that shares nothing with other root ports
(ie aperture, driver instance, config space) then those ports must be
represented by unique 'top level' DT nodes, and each DT node must be
assigned to a Linux PCI domain.

IMHO, designing such a SOC ignores the API guidelines provides by the
PCI-E spec itself, and I hope such a thing won't be considered SBSA
compatible..

Jason
Yijing Wang
2014-09-09 06:00:01 UTC
Post by Rob Herring
Post by Liviu Dudau
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.
Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.
The problem appears for DTs that don't have the pci-domain info. Then the cached
value is left at the default non-valid value and attempts to rescan the DT will
be made every time the function is called.
Post by Rob Herring
I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT?
ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.
A PCI domain (or segment, as ACPI names it) is provided by firmware; on ACPI systems we
evaluate it via the "_SEG" method. On IA64 with ACPI, the PCI host bridge driver retrieves
the domain from ACPI and, if it is absent, the default domain is zero. So I wonder why,
in DTS, we get an auto-incremented domain value when it is absent.

PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
......
status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
&segment);
if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
dev_err(&device->dev, "can't evaluate _SEG\n");
result = -ENODEV;
goto end;
}
.......

Thanks!
Yijing.
Post by Rob Herring
This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.
It is meant to do allocation with the retrieval being a short-cut (or fine
control if you want).
I need to think a bit more for a better solution.
Best regards,
Liviu
Post by Rob Herring
Rob
--
Thanks!
Yijing

Liviu Dudau
2014-09-09 08:50:02 UTC
Post by Yijing Wang
Post by Rob Herring
Post by Liviu Dudau
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.
Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.
The problem appears for DTs that don't have the pci-domain info. Then the cached
value is left at the default non-valid value and attempts to rescan the DT will
be made every time the function is called.
Post by Rob Herring
I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT?
ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.
PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
a auto increment domain value.
Because you can have more than one host bridge (rare, but not impossible), and unless you
want to join the two segments you might want to give each one a different domain.

Best regards,
Liviu
Post by Yijing Wang
PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
......
status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
&segment);
if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
dev_err(&device->dev, "can't evaluate _SEG\n");
result = -ENODEV;
goto end;
}
.......
Thanks!
Yijing.
Post by Rob Herring
This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.
It is meant to do allocation with the retrieval being a short-cut (or fine
control if you want).
I need to think a bit more for a better solution.
Best regards,
Liviu
Post by Rob Herring
Rob
--
Thanks!
Yijing
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯

Yijing Wang
2014-09-09 09:40:02 UTC
Permalink
Post by Liviu Dudau
Post by Yijing Wang
Post by Rob Herring
Post by Liviu Dudau
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.
Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.
The problem appears for DTs that don't have the pci-domain info. Then the cached
value is left at the default non-valid value and attempts to rescan the DT will
be made every time the function is called.
Post by Rob Herring
I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT?
ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.
PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
a auto increment domain value.
Because you can have more than one hostbridge (rare, but not impossible) and unless you
want to join the two segments, you might want to give it a different domain.
OK. Sorry, I have one last question: the domain will be used to calculate the address used to
access PCI hardware config registers. So if the DTS file doesn't report the domain, how do we know
we will access the right registers when we use the auto-incremented domain value?
Is there a mechanism to make sure the system can access the correct registers for the domain?

Thanks!
Yijing.
Post by Liviu Dudau
Best regards,
Liviu
Post by Yijing Wang
PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
......
status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
&segment);
if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
dev_err(&device->dev, "can't evaluate _SEG\n");
result = -ENODEV;
goto end;
}
.......
Thanks!
Yijing.
Post by Rob Herring
This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.
It is meant to do allocation with the retrieval being a short-cut (or fine
control if you want).
I need to think a bit more for a better solution.
Best regards,
Liviu
Post by Rob Herring
Rob
--
Thanks!
Yijing
--
Thanks!
Yijing

Bjorn Helgaas
2014-09-09 14:30:01 UTC
Permalink
Post by Yijing Wang
Post by Liviu Dudau
Post by Yijing Wang
Post by Rob Herring
Post by Liviu Dudau
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.
Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.
The problem appears for DTs that don't have the pci-domain info. Then the cached
value is left at the default non-valid value and attempts to rescan the DT will
be made every time the function is called.
Post by Rob Herring
I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT?
ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.
PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
a auto increment domain value.
Because you can have more than one hostbridge (rare, but not impossible) and unless you
want to join the two segments, you might want to give it a different domain.
OK. Sorry, I have one last question: the domain will be used to calculate the address used to
access PCI hardware config registers. So if the DTS file doesn't report the domain, how do we know
we will access the right registers when we use the auto-incremented domain value?
Is there a mechanism to make sure the system can access the correct registers for the domain?
I think you're referring to config access via ECAM (PCIe r3.0, sec
7.2.2). In that case, I don't think the domain should be used to
compute the memory-mapped configuration address. Each host bridge is
in exactly one domain, and each host bridge has an associated ECAM
area base. The address calculation uses the bus number, device
number, function number, and register number to compute an offset into
the ECAM area.

So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient. The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.
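
To illustrate (this is just the standard ECAM layout from the spec, not code
from this series): each bus gets 1 MiB of the ECAM window, each device 32 KiB
and each function 4 KiB, so a config accessor typically does something like

    static void __iomem *ecam_cfg_addr(void __iomem *ecam_base,
                                       unsigned int busnr,
                                       unsigned int devfn, int where)
    {
            /* busnr << 20 | device << 15 | function << 12 | register */
            return ecam_base + ((busnr << 20) | (devfn << 12) | where);
    }

with the domain only selecting which host bridge (and therefore which
ecam_base) is used.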

Bjorn
Jason Gunthorpe
2014-09-09 15:50:04 UTC
Permalink
Post by Bjorn Helgaas
So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient. The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.
I think this is right for DT systems - the domain is purely internal
to the kernel and userspace, it is used to locate the proper host
bridge driver instance, which contains the proper config accessor (and
register bases, etc).

AFAIK the main reason to have a DT alias to learn the domain number is
to make it stable so things like udev/etc can reliably match on the
PCI location.

This is similar to i2c, etc that use the alias scheme, so IMHO
whatever they do to assign ID's to drivers should be copied for domain
numbers.

Jason
Rob Herring
2014-09-10 02:50:01 UTC
Permalink
On Tue, Sep 9, 2014 at 10:41 AM, Jason Gunthorpe
Post by Jason Gunthorpe
Post by Bjorn Helgaas
So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient. The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.
I think this is right for DT systems - the domain is purely internal
to the kernel and userspace, it is used to locate the proper host
bridge driver instance, which contains the proper config accessor (and
register bases, etc).
AFAIK the main reason to have a DT alias to learn the domain number is
to make it stable so things like udev/etc can reliably match on the
PCI location.
For what purpose?
Post by Jason Gunthorpe
This is similar to i2c, etc that use the alias scheme, so IMHO
whatever they do to assign ID's to drivers should be copied for domain
numbers.
IMO they should not. We really want to move away from aliases, not
expand their use. They are used for serial because there was no good
way to not break things like "console=ttyS0". I2C I think was more
internal, but may have been for i2c-dev. What are we going to break if
we don't have consistent domain numbering? If the domain goes into the
DT, I'd rather see it as part of the PCI root node. But I'm not
convinced it is needed.

It doesn't really sound like we have any actual need to solve this for
DT ATM. It's not clear to me whether all buses should be in domain 0 or get a
simple incrementing index per bus in the absence of any firmware-set
value.

Rob
Jason Gunthorpe
2014-09-10 16:40:02 UTC
Permalink
Post by Rob Herring
On Tue, Sep 9, 2014 at 10:41 AM, Jason Gunthorpe
Post by Jason Gunthorpe
Post by Bjorn Helgaas
So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient. The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.
I think this is right for DT systems - the domain is purely internal
to the kernel and userspace, it is used to locate the proper host
bridge driver instance, which contains the proper config accessor (and
register bases, etc).
AFAIK the main reason to have a DT alias to learn the domain number is
to make it stable so things like udev/etc can reliably match on the
PCI location.
For what purpose?
udev places PCI D:B:D.F's all over the place, e.g.:

$ ls /dev/serial/by-path/
pci-0000:00:1a.0-usb-0:1.4.2.5:1.0@
pci-0000:00:1a.0-usb-0:1.4.2.6:1.0@
$ ls /dev/disk/by-path/
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part1@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part2@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part3@
^^^^
domain number

It is part of the stable naming scheme udev uses - and that scheme is
predicated on the PCI location being stable.
Post by Rob Herring
IMO they should not. We really want to move away from aliases, not
expand their use. They are used for serial because there was no good
way to not break things like "console=ttyS0". I2C I think was more
internal, but may have been for i2c-dev. What are we going to break if
we don't have consistent domain numbering?
Well, DT needs some kind of solution to produce stable names for
things. It is no good if the names unpredictably randomize.

Lets use I2C as an example. My embedded systems have multiple I2C
busses, with multiple drivers (so load order is not easily ensured).
I am relying on DT aliases to force the bus numbers to stable values,
but if I don't do that - then how do I find things in user space?

$ ls -l /sys/bus/i2c/devices
lrwxrwxrwx 1 root root 0 Sep 10 16:12 i2c-2 -> ../../../devices/pci0000:00/0000:00:01.0/i2c_qsfp.4/i2c-2

Oh, I have to search based on the HW path, which includes the domain
number.

So that *must* be really stable, or user space has no hope of mapping
real hardware back to an I2C bus number.

The same is true for pretty much every small IDR scheme in the kernel
- they rely on the HW path for stable names.

As an aside, I think for embedded being able to directly specify
things like the bus number for I2C, ethX for ethernet, etc is very
valuable, I would be sad to see that go away.
Post by Rob Herring
DT, I'd rather see it as part of the PCI root node. But I'm not
convinced it is needed.
It doesn't really sound like we have any actual need to solve this for
DT ATM. It's not clear to me if all buses should be domain 0 or a
simple incrementing index for each bus in absence of any firmware set
value.
There are already ARM DTs in the kernel that require multiple domains,
so that ship has sailed.

I don't think it really matters where the number comes from, so long
as it doesn't change after a kernel upgrade, or with a different
module load order, or if the DT is recompiled, or whatever. It needs
to be stable.

Jason
Yijing Wang
2014-09-10 02:10:02 UTC
Permalink
Post by Bjorn Helgaas
Post by Yijing Wang
OK. Sorry, I have one last question: the domain will be used to calculate the address used to
access PCI hardware config registers. So if the DTS file doesn't report the domain, how do we know
we will access the right registers when we use the auto-incremented domain value?
Is there a mechanism to make sure the system can access the correct registers for the domain?
I think you're referring to config access via ECAM (PCIe r3.0, sec
7.2.2). In that case, I don't think the domain should be used to
compute the memory-mapped configuration address. Each host bridge is
in exactly one domain, and each host bridge has an associated ECAM
area base. The address calculation uses the bus number, device
number, function number, and register number to compute an offset into
the ECAM area.
So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient. The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.
Hi Bjorn, you are right, thanks for your detailed explanation! :)

Thanks!
Yijing.
Post by Bjorn Helgaas
Bjorn
.
--
Thanks!
Yijing

Yijing Wang
2014-09-10 01:50:02 UTC
Permalink
Post by Yijing Wang
OK. Sorry, I have one last question: the domain will be used to calculate the address used to
access PCI hardware config registers. So if the DTS file doesn't report the domain, how do we know
we will access the right registers when we use the auto-incremented domain value?
That's a good question and sides with Arnd's suggestion to try to mandate the presence of the PCI
domain in the DTS. However, by grepping through the source code, it looks like the architectures
that use the domain number for reading config registers (x86-based) are non-DT architectures,
while DT-aware arches seem to ignore the domain number except when printing out messages. Is that
another confirmation that most DT-aware architectures have only run with domain_nr = 0?
Arnd's suggestion makes sense to me; thanks for Bjorn's detailed explanation. Now I know domain_nr
is purely internal to the kernel on DT-aware platforms, and it's not needed to access PCI config space.

Thanks!
Yijing.
Post by Yijing Wang
Is there a mechanism to make sure the system can access the correct registers for the domain?
Not as such, if you look at it with x86 glasses. With the exception of powerpc, all other
architectures seem to happily assume domain_nr = 0 and ignore it in the computation of
configuration register offsets.
Best regards,
Liviu
Post by Yijing Wang
Thanks!
Yijing.
Post by Liviu Dudau
Best regards,
Liviu
Post by Yijing Wang
PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
......
status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
&segment);
if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
dev_err(&device->dev, "can't evaluate _SEG\n");
result = -ENODEV;
goto end;
}
.......
Thanks!
Yijing.
Post by Liviu Dudau
This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.
It is meant to do allocation with the retrieval being a short-cut (or fine
control if you want).
I need to think a bit more for a better solution.
Best regards,
Liviu
Post by Liviu Dudau
Rob
--
Thanks!
Yijing
--
Thanks!
Yijing
--
Thanks!
Yijing

Liviu Dudau
2014-09-10 13:10:02 UTC
Permalink
Post by Liviu Dudau
Post by Liviu Dudau
Add of_pci_get_domain_nr() to retrieve the PCI domain number
of a given device from DT. If the information is not present,
the function can be requested to allocate a new domain number.
---
drivers/of/of_pci.c | 34 ++++++++++++++++++++++++++++++++++
include/linux/of_pci.h | 7 +++++++
2 files changed, 41 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..a107edb 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+static atomic_t of_domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used. The local allocator can
+ * be requested to return a new domain_nr if the information is missing
+ * from the device tree.
+ *
+ * Returns the domain number associated with the node in the DT, or a
+ * newly allocated domain number otherwise.
+ */
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+ int domain;
+
+ domain = of_alias_get_id(node, "pci-domain");
+ if (domain == -ENODEV) {
+ if (allocate_if_missing)
+ domain = atomic_inc_return(&of_domain_nr);
+ else
+ domain = atomic_read(&of_domain_nr);
This function seems a bit broken to me. It is overloaded with too many
different outcomes. Think about how this would work if you have
multiple PCI buses and a mixture of having pci-domain aliases or not.
Aren't domain numbers global? Allocation should then start outside of
the range of alias ids.
Rob
Rob,
Would this version make more sense?
No.
Hi Rob,

Here is an updated version of the patch. I have implemented a separate function
for getting the maximum domain number from the device tree, and the allocator now
starts at that value plus one.
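
Roughly, the end result looks like this (sketch only; of_pci_max_domain_nr() is
a placeholder name for the new helper that walks the "pci-domain" aliases and
returns the highest id found, or -1 if there are none):

    static atomic_t of_domain_nr = ATOMIC_INIT(-1);

    /* Seed the allocator above the highest "pci-domain" alias id so that
     * allocated numbers can never clash with DT-assigned ones.
     * of_pci_max_domain_nr() is hypothetical, shown only for this sketch. */
    void of_pci_init_domain_allocator(void)
    {
            atomic_set(&of_domain_nr, of_pci_max_domain_nr());
    }

    int of_pci_get_domain_nr(struct device_node *node)
    {
            int domain = of_alias_get_id(node, "pci-domain");

            if (domain < 0)         /* no alias: hand out a new domain */
                    domain = atomic_inc_return(&of_domain_nr);
            return domain;
    }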

Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
Liviu Dudau
2014-09-08 16:10:02 UTC
Permalink
This is my version 10 of the attempt at adding support for generic PCI host
bridge controllers that make use of device tree information to
configure themselves. This version reverses v9's attempt to create one function
to drive the whole process of extracting the host bridge ranges, setup the
host bridge driver and then scan the root bus behind the host bridge. While it
would've been quite user friendly, I agree that it would've caused a lot of pain
in the future.
I would like to get ACKs for the remaining patches as I would like to integrate
this into -next in the following week.
This version marks an implementation break with the previous versions, as
of_create_pci_host_bridge() is now gone. It is replaced by
of_pci_get_host_bridge_resources(), which only parses the DT, extracts the
relevant ranges and converts them to resources. The updated host bridge drivers
now follow a pattern like this:
static int foohb_probe(struct platform_device *pdev)
{
struct device_node *dn = pdev->dev.of_node;
struct foohb_drv *drv;
resource_size_t io_base = 0; /* phys address for start of IO */
struct pci_bus *bus;
int err = 0;
LIST_HEAD(res);
.....
err = of_pci_get_host_bridge_resources(dn, 0, 255, &res, &io_base);
if (err)
goto err_handling;
err = foohb_setup(drv, ...., &res, &io_base);
if (err)
goto err_handling;
.....
pci_add_flags(....);
bus = pci_scan_root_bus(&pdev->dev, 0, &foohb_ops, drv, &res);
if (!bus)
goto err_handling;
....
return 0;
......
return err;
}
Forgot to mention: the git tree containing the code can be found here:

git://linux-arm.org/linux-ld.git for-upstream/pci_v10

Best regards,
Liviu
- Moved the DT parsing and IRQ assignment patch from this series into an
arm64-specific patch. This keeps existing pcibios_add_device() unchanged and adds
an arch-specific version that can later be expanded to cater for dma_ops.
- Incorporated the fix for users of of_pci_range_to_resources() into the patch
that changes the behaviour for easier bisection.
- Added fixes for tegra and rcar host drivers in their usage of
of_pci_range_to_resources()
- Broke up of_create_pci_host_bridge() to remove the callback function. The
remaining function has been renamed to of_pci_get_host_bridge_resources(). The
added benefit is that the architectural hook for fixing up host bridge
resources now disappears.
- Reshuffled the way pgprot_device gets introduced. It is now part of the patch
that adds pci_remap_iospace() function. The arm64 specific override is moved
into the arm64 patchset.
- Added a patch to pci_scan_root_bus() to assign unassigned resources if PCI
flags are not PCI_PROBE_ONLY
v9 thread here, with links to previous threads: https://lkml.org/lkml/2014/8/12/361
Best regards,
Liviu
Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
PCI: Introduce helper functions to deal with PCI I/O ranges.
ARM: Define PCI_IOBASE as the base of virtual PCI IO space.
PCI: OF: Fix the conversion of IO ranges into IO resources.
PCI: Create pci_host_bridge before its associated bus in
pci_create_root_bus.
PCI: Introduce generic domain handling for PCI busses.
OF: Introduce helper function for getting PCI domain_nr
OF: PCI: Add support for parsing PCI host bridge resources from DT
PCI: Assign unassigned bus resources in pci_scan_root_bus()
PCI: Introduce pci_remap_iospace() for remapping PCI I/O bus resources
into CPU space
arch/arm/include/asm/io.h | 1 +
arch/arm/mach-integrator/pci_v3.c | 23 +++---
drivers/of/address.c | 146 ++++++++++++++++++++++++++++++++++++++
drivers/of/of_pci.c | 136 +++++++++++++++++++++++++++++++++++
drivers/pci/host/pci-tegra.c | 10 ++-
drivers/pci/host/pcie-rcar.c | 21 ++++--
drivers/pci/pci.c | 33 +++++++++
drivers/pci/probe.c | 46 +++++++-----
include/asm-generic/io.h | 2 +-
include/asm-generic/pgtable.h | 4 ++
include/linux/of_address.h | 15 ++--
include/linux/of_pci.h | 18 +++++
include/linux/pci.h | 24 +++++++
13 files changed, 429 insertions(+), 50 deletions(-)
--
2.0.4
--
-------------------
.oooO
( )
\ ( Oooo.
\_) ( )
) /
(_/

One small step
for me ...

Arnd Bergmann
2014-09-10 18:30:01 UTC
Permalink
We can assume that if a domain is not specified and there is a single
top level PCIe node, the domain defaults to 0. Are there any arm32
platforms that require multiple domains (and do not specify a number in
the DT)?
In theory, I think all of them could work with a single domain, but then
you need to partition the bus number space between the host controllers,
so you end up in the exact same situation: you either need to make up
bus numbers or put them in the DT.

Using multiple domains is way cleaner for this, even if we have to
make up the numbers.
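
For example, with a single domain, two host bridge drivers would each have to
claim a made-up slice of the 0-255 bus space (sketch; all names except
pci_scan_root_bus() are invented):

    struct pci_bus *bus_a, *bus_b;

    /* host bridge A gets buses 0x00-0x7f, host bridge B gets 0x80-0xff */
    bus_a = pci_scan_root_bus(&pdev_a->dev, 0x00, &hb_a_ops, hb_a, &res_a);
    bus_b = pci_scan_root_bus(&pdev_b->dev, 0x80, &hb_b_ops, hb_b, &res_b);

With separate domains, both can simply start at bus 0.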

Arnd
Phil Edworthy
2014-09-11 14:20:03 UTC
Permalink
Hi,
Post by Arnd Bergmann
We can assume that if a domain is not specified and there is a single
top level PCIe node, the domain defaults to 0. Are there any arm32
platforms that require multiple domains (and do not specify a number in
the DT)?
In theory, I think all of them could work with a single domain, but then
you need to partition the bus number space between the host controllers,
so you have the exact same situation that you either need to make up
random bus numbers or put them in DT.
Using multiple domains is way cleaner for this, even if we have to
make up the numbers.
Maybe this is a stupid question, but why would you want to specify the domain
in the DT at all? Doesn't every instance of a driver imply a separate domain?

Thanks
Phil
Suravee Suthikulpanit
2014-09-12 08:30:02 UTC
Permalink
This is my version 10 of the attempt at adding support for generic PCI host
bridge controllers that make use of device tree information to
configure themselves. This version reverses v9's attempt to create one function
to drive the whole process of extracting the host bridge ranges, setup the
host bridge driver and then scan the root bus behind the host bridge. While it
would've been quite user friendly, I agree that it would've caused a lot of pain
in the future.
I would like to get ACKs for the remaining patches as I would like to integrate
this into -next in the following week.
This version marks an implementation break with the previous versions as
of_create_pci_host_bridge() is now gone. It gets replaced by
of_pci_get_host_bridge_resources() that only parses the DT and extracts the
relevant ranges and converts them to resources. The updated host bridge drivers
static int foohb_probe(struct platform_device *pdev)
{
struct device_node *dn = pdev->dev.of_node;
struct foohb_drv *drv;
resource_size_t io_base = 0; /* phys address for start of IO */
struct pci_bus *bus;
int err = 0;
LIST_HEAD(res);
.....
err = of_pci_get_host_bridge_resources(dn, 0, 255, &res, &io_base);
if (err)
goto err_handling;
err = foohb_setup(drv, ...., &res, &io_base);
if (err)
goto err_handling;
My understanding is that the "foohb_setup" above is supposed to be equivalent
to the "int (*setup)(struct pci_host_bridge *, resource_size_t)" in v9 that was
passed in as an argument of "of_create_pci_host_bridge()".

The problem I have is that I need an intermediate step between "pci_create_root_bus()"
and "pci_scan_child_bus()" in order to update information such as "pci_bus->msi"
before it is propagated down to the child buses by "pci_scan_child_bus()",
which is also called from the pci_scan_root_bus() function.

Does this mean that I should not be calling pci_scan_root_bus(), and should instead
re-implement the same logic in my driver?

Thanks,

Suravee
.....
pci_add_flags(....);
bus = pci_scan_root_bus(&pdev->dev, 0, &foohb_ops, drv, &res);
if (!bus)
goto err_handling;
....
return 0;
......
return err;
}
Suravee Suthikulpanit
2014-09-12 10:10:02 UTC
Permalink
Post by Suravee Suthikulpanit
This is my version 10 of the attempt at adding support for generic PCI host
bridge controllers that make use of device tree information to
configure themselves. This version reverses v9's attempt to create one function
to drive the whole process of extracting the host bridge ranges, setup the
host bridge driver and then scan the root bus behind the host bridge. While it
would've been quite user friendly, I agree that it would've caused a lot of pain
in the future.
I would like to get ACKs for the remaining patches as I would like to integrate
this into -next in the following week.
This version marks an implementation break with the previous versions as
of_create_pci_host_bridge() is now gone. It gets replaced by
of_pci_get_host_bridge_resources() that only parses the DT and extracts the
relevant ranges and converts them to resources. The updated host bridge drivers
static int foohb_probe(struct platform_device *pdev)
{
struct device_node *dn = pdev->dev.of_node;
struct foohb_drv *drv;
resource_size_t io_base = 0; /* phys address for start of IO */
struct pci_bus *bus;
int err = 0;
LIST_HEAD(res);
.....
err = of_pci_get_host_bridge_resources(dn, 0, 255, &res, &io_base);
if (err)
goto err_handling;
err = foohb_setup(drv, ...., &res, &io_base);
if (err)
goto err_handling;
Hi Suravee,
Post by Suravee Suthikulpanit
My understanding is that the "foohb_setup" above is supposed to be equivalent
to the "int (*setup)(struct pci_host_bridge *, resource_size_t)" in v9 that was
passed in as an argument of "of_create_pci_host_bridge()".
Correct. Parameters are probably different, but it is internal to your driver so
you can do whatever you want there.
Post by Suravee Suthikulpanit
The problem I have is that I need an intermediate step between "pci_create_root_bus()"
and "pci_scan_child_bus()" in order to update information such as "pci_bus->msi"
before it is propagated down to the child buses by "pci_scan_child_bus()",
which is also called from the pci_scan_root_bus() function.
How did that work with my v9 patchset? How does it work for other MSI-aware platforms?
Are they not using pci_scan_child_bus()?
In of_create_pci_host_bridge() in v9, you call the setup function between "pci_create_root_bus()"
and "pci_scan_child_bus()". At that point, I can update "root_bus->msi" to point to my
"struct msi_chip", which was created during GICv2m initialization
(please see http://marc.info/?l=linux-kernel&m=141034053331632&w=2).
Then, when a child bus is created, it propagates this msi_chip pointer from the parent bus.
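
In other words, the ordering needed is roughly this (sketch, reusing the made-up
foohb_* names from the cover letter; bus->msi is the struct msi_chip pointer
mentioned above):

    struct pci_bus *bus;

    bus = pci_create_root_bus(&pdev->dev, 0, &foohb_ops, drv, &res);
    if (!bus)
            return -ENOMEM;

    bus->msi = &foohb_msi_chip;     /* child buses inherit this during the scan */

    pci_scan_child_bus(bus);
    pci_assign_unassigned_bus_resources(bus);
    pci_bus_add_devices(bus);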

Suravee

