Xen Zynq Distribution Support Forums
SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - Printable Version


SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - brettstahlman - 10-19-2017

I've written a small app, designed to run in dom0, which uses the Xen "foreignmemory" interface to read arbitrary pages within a user domain. The call to perform the memory mapping succeeds, and a pointer to the mapped buffer is returned by xenforeignmemory_map(). But I get a SIGBUS as soon as I attempt to access the data in the buffer:


Code:
(XEN) traps.c:2508:d0v1 <register values...>
[    62.1234413] Unhandled fault: ttbr address size fault (0x92000000) at 0x0000007fxxxxxxxx
Bus error

Note that the timestamped "Unhandled fault" message is actually from the Linux kernel (mm/fault.c). I suspect the "address size fault" occurs because something is telling Xen that 0x7fxxxxxxxx is an invalid (too large) address. Looking in $(RELEASE_DIR)/dts/xen-zcu102.dts, I see the following:
Code:
    memory {
        device_type = "memory";
        reg = <0x0 0x0 0x0 0x80000000 0x8 0x0 0x0 0x80000000>;
    };

...which, IIUC, defines two 2G memory regions: one at address 0, the other at address 0x800000000. Adding 2G to the start of the upper range gives an end address of 0x880000000, which is significantly below the address of the buffer I'm attempting to access. Accordingly, I tried modifying the memory node to increase the sizes from 0x80000000 to 0x8000000000, which should have placed the offending address within both the upper and lower ranges, but the error persisted.

Is there another DTB file I would need to modify? The one I modified is the one that gets copied to the SD card's boot partition, which I'm assuming is used by Xen itself, not dom0. I just looked at zynqmp-zcu102.dts (under $RELEASE_DIR/components/linux-kernel/xlnx-4.6/arch/arm64/boot/dts) and saw that its memory node looks identical to the one above, so perhaps that's the problem. But if so, it seems odd that mmap() (used by xenforeignmemory_map() to allocate the buffer) would return an address outside the default ranges defined in the kernel's device tree. Isn't the kernel supposed to use the information in the device tree to configure its memory management? As a test, I allocated a buffer with malloc(), and it was placed at 0x3xxxxxxx, well within the configured limits. (Unfortunately, xenforeignmemory_map() doesn't allow you to pass a pointer to the desired buffer the way mmap() does...)

Is the device tree the likely cause of the SIGBUS error? If so, is changing the memory node's "reg" property the right fix, or does the fact that mmap() returns an address above 4G point to a problem elsewhere?
Thanks,
Brett S.
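
For reference, here is a minimal sketch of how the "reg" cells in the memory node above decode into (base, size) pairs. It assumes #address-cells = <2> and #size-cells = <2> on the parent node, which is not shown in the post; decode_reg() and struct mem_region are names made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded (base, size) entry of a memory node's "reg" property. */
struct mem_region {
    uint64_t base;
    uint64_t size;
};

/*
 * Decode "reg" cells into regions, ASSUMING #address-cells = <2> and
 * #size-cells = <2> (the parent node is not shown in the post).
 * Each region then takes four 32-bit cells:
 *   base-high, base-low, size-high, size-low.
 * Returns the number of regions written to out.
 */
static size_t decode_reg(const uint32_t *cells, size_t ncells,
                         struct mem_region *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i + 4 <= ncells && n < max_out; i += 4, n++) {
        out[n].base = ((uint64_t)cells[i] << 32) | cells[i + 1];
        out[n].size = ((uint64_t)cells[i + 2] << 32) | cells[i + 3];
    }
    return n;
}
```

Feeding it the eight cells from the node above yields two regions: base 0x0 with size 0x80000000, and base 0x800000000 with size 0x80000000, so the upper region ends at 0x880000000. Note that these are physical ranges; they say nothing about which virtual addresses mmap() may hand out.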


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - Nathan.Studer - 10-25-2017

(10-19-2017, 05:04 PM)brettstahlman Wrote: I've written a small app, designed to run in dom0, which uses the Xen "foreignmemory" interface to read arbitrary pages within a user domain. The call to perform the memory mapping succeeds, and a pointer to the mapped buffer is returned by xenforeignmemory_map(). But I get a SIGBUS as soon as I attempt to access the data in the buffer:

Can you paste the code from your application?   It's not clear from your description how your application is populating the arguments to that function or determining that the mapping fully succeeded.

(10-19-2017, 05:04 PM)brettstahlman Wrote: Is the device tree the likely cause of the SIGBUS error? If so, is changing the memory node's "reg" property the right fix, or does the fact that mmap() returns an address above 4G point to a problem elsewhere?
Thanks,
Brett S.

It's unlikely to be the device tree.  The address being returned should be a virtual one, so it doesn't follow that it must be an address within physical memory, but the exception does seem to indicate that the mapping didn't completely succeed.

The toolstack save/restore functions should have an example of accessing foreign pages, if that helps at all.

     Nate


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - brettstahlman - 10-25-2017

Nathan,
I've pasted the sample code below. I'll check out the toolstack save/restore functions. As for the sample app, I've just been feeding it arbitrary guest-physical pages to map. The idea is to be able to read (from dom0) guest pages that would be at fixed (non-paged) addresses: e.g., kernel code and data. As far as you know, does the basic concept seem valid? Note that although the ultimate plan is to map pages from user domains, most of my tests thus far have attempted to map pages from dom0 itself.

Thanks,
Brett S.
Code:
/*
 * Map one page of a domain into dom0 and read from it.
 * (Started from the placeholder PetaLinux user application.)
 */
#include <inttypes.h>           /* PRId32, PRId64 */
#include <stdint.h>             /* uint32_t, int32_t, int64_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>           /* PROT_READ */
#include <xenforeignmemory.h>

#define PAGE_SHIFT 12u
#define PAGE_SIZE (1u << PAGE_SHIFT)

int main(int argc, char *argv[])
{
    char *endptr = NULL;
    /* Since we're mapping only 1 page, use int as array of 1 element. */
    int err = 0;

    if (argc < 3) {
        fprintf(stderr, "Usage: %s <domid> <address>\n", argv[0]);
        return -1;
    }

    /* From which dom do we read? */
    uint32_t domid = (uint32_t)strtoul(argv[1], &endptr, 0);
    if (argv[1] == endptr) {
        printf("Invalid domid: %s\n", argv[1]);
        return -1;
    }
    /* At what address do we read? */
    xen_pfn_t pfn = (xen_pfn_t)strtoull(argv[2], &endptr, 0);
    if (argv[2] == endptr) {
        printf("Invalid addr: %s\n", argv[2]);
        return -1;
    }

    /* Convert address to pfn. */
    pfn >>= PAGE_SHIFT;
    printf("Reading page %llx in domain %u\n",
           (unsigned long long)pfn, domid);

    /* Open a handle for use with map/unmap. */
    xenforeignmemory_handle *xfmh = xenforeignmemory_open(NULL, 0);
    if (!xfmh) {
        perror("xenforeignmemory_open");
        return -1;
    }

    /* Perform the memory mapping. */
    void *addr =
        xenforeignmemory_map(xfmh, domid, PROT_READ, 1, &pfn, &err);
    if (!addr) {
        printf("Error: Unable to map guest page frame %llx"
               " due to error: %d\n", (unsigned long long)pfn, err);
        xenforeignmemory_close(xfmh);
        return -1;
    }
    printf("Success: Page mapped at %p\n", addr);
    printf("Trying to read 32-bit: %" PRId32 "\n", *(int32_t *)addr);
    printf("Trying to read 64-bit: %" PRId64 "\n", *(int64_t *)addr);
    err = xenforeignmemory_unmap(xfmh, addr, 1);
    if (err)
        printf("Unmap attempt failed: %d\n", err);
    xenforeignmemory_close(xfmh);

    return 0;
}



RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - brettstahlman - 10-25-2017

Just an extra bit of information... After sending the previous message, I noticed a comment that indicated xenforeignmemory_map() could return page-specific error codes, even when a non-NULL pointer was returned. Accordingly, I added a printf and discovered that the code for the first (and only) mapped page was -22 (EINVAL). Not sure what it means yet. I'll try to instrument the Xen foreignmemory lib to see whether I can get some additional insight...

Thanks,
Brett S.
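
A small sketch of the check described above: inspect the per-page error array before touching the mapped buffer, even when the returned pointer is non-NULL. first_page_error() is a hypothetical helper written for this illustration; it is not part of libxenforeignmemory.

```c
#include <stddef.h>

/*
 * Scan the per-page error array filled in by a foreign mapping call.
 * Returns the index of the first page with a nonzero error code and
 * stores that code in *code (if code is non-NULL), or returns -1 if
 * every page mapped cleanly.
 * Hypothetical helper, not a libxenforeignmemory function.
 */
static long first_page_error(const int *err, size_t pages, int *code)
{
    for (size_t i = 0; i < pages; i++) {
        if (err[i] != 0) {
            if (code)
                *code = err[i];
            return (long)i;
        }
    }
    return -1;
}
```

In the posted program this would be called as first_page_error(&err, 1, &code) right after xenforeignmemory_map() returns; a result of 0 or more means the page should be unmapped and not dereferenced.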


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - Nathan.Studer - 10-27-2017

(10-25-2017, 08:11 PM)brettstahlman Wrote: Nathan,
I've pasted the sample code below. I'll check out the toolstack save/restore functions. As for the sample app, I've just been feeding it arbitrary guest-physical pages to map. 

Thanks,
Brett S.

It looks like you're treating the guest frame number (gfn) as a page frame number (pfn).  (The argument being named pfn is really confusing in some cases.)  Also, your approach of taking an arbitrary physical address and converting it to a gfn won't work without a lot of additional work, because it would be hard to sort out whether the address actually belonged to the domain, especially because the domain may not be mapped 1 to 1 or be contiguous in main memory.

(10-25-2017, 08:11 PM)brettstahlman Wrote: As far as you know, does the basic concept seem valid? Note that although the ultimate plan is to map pages from user domains, most of my tests thus far have attempted to map pages from dom0 itself.

Thanks,
Brett S.

The concept is valid, since it is used to save and restore domains as well as to do virtual machine introspection.  I'm not sure it was intended to be used on dom0, and I've never tried to do so, so I can't tell you if that should work or not.

(10-25-2017, 08:11 PM)brettstahlman Wrote: Just an extra bit of information... After sending the previous message, I noticed a comment that indicated xenforeignmemory_map() could return page-specific error codes, even when a non-NULL pointer was returned. Accordingly, I added a printf and discovered that the code for the first (and only) mapped page was -22 (EINVAL). Not sure what it means yet. I'll try to instrument the Xen foreignmemory lib to see whether I can get some additional insight...

Thanks,
Brett S.
 
From above I'm guessing either your pfn argument isn't correct or it doesn't work on dom0. 

     Nate


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - Nathan.Studer - 10-27-2017

We're more than happy to help where we can and will continue to do so in this thread, but you may also want to ask this question on the Xen User mailing list as well.  The toolstack experts there may be able to give you additional information that will help you reach your intended goal.

Just be sure to adhere to the mailing list rules when posting and be prepared to justify your approach, since the first question will likely be "why?".

   Nate


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - brettstahlman - 10-27-2017

Quote:It looks like you're treating the guest frame number (gfn) as a page frame number (pfn).  (The argument being named pfn is really confusing in some cases.)  Also, your approach of taking an arbitrary physical address and converting it to a gfn won't work without a lot of additional work, because it would be hard to sort out whether the address actually belonged to the domain, especially because the domain may not be mapped 1 to 1 or be contiguous in main memory.


I've really struggled to find good documentation on Xen's virtual memory terminology... I had been thinking of "PFN" as "physical frame number" (i.e., guest physical)... I found some documentation earlier today that suggests that in a PV such as dom0, the page tables managed by the guest map PFN -> MFN, and that GFN == MFN. IIUC, this would preclude the possibility of loading something like a guest kernel at a known, fixed, non-paged address in a PV, since such a strategy would require the kernels for multiple guests to be loaded at the same machine address.
Quote:The concept is valid, since it is used to save and restore domains as well as to do virtual machine introspection.  I'm not sure it was intended to be used on dom0, and I've never tried to do so, so I can't tell you if that should work or not.


Ok. I'll try it out on a user domain.
Quote:From above I'm guessing either your pfn argument isn't correct or it doesn't work on dom0. 

     Nate


Perhaps it would help if I clarified the intent with a specific example... Suppose I know that one of my guests is running a kernel that's traditionally loaded at 0x100000, and I want to read that kernel's "zero page" from an app running in dom0... I had been thinking I could count on the guest kernel being loaded at a "GFN" of 0x100, and I further assumed (apparently incorrectly) that this was the "PFN" that would be used with xenforeignmemory_map(). Now I'm thinking that was wrong. Is it wrong because a PV guest can't be permitted to load things at fixed guest-physical locations (since GFN == MFN)? If the guest in question were an HVM, would the strategy be different? In an HVM, IIUC, Xen maintains shadow page tables (or perhaps H/W virtualization page tables) to map PFN -> MFN, and GFN == PFN. So in the HVM case, I'm thinking the kernel might actually be loaded at a guest-physical address of 0x100000, and 0x100 would actually be a PFN. I guess what would clear things up is a better understanding of what "PFN" means... It seems to me it means something different in PV and HVM cases: in PVs, it seems to be more like a virtual address than physical, whereas in HVMs, it seems to be equivalent to "guest-physical"...
Thanks,
Brett S.


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - Nathan.Studer - 11-01-2017

Xen on ARM isn't PV or HVM.  It's a hybrid called PVH.  For your case it resembles HVM in how memory mapping is handled.  Guest Physical Frame Number (GPFN) = Guest Machine Frame Number (GMFN), and Xen sets up the second stage page tables to "auto-translate" guest addresses to machine ones.

Can you give an example of how you're using the program you posted to try and access a DomU's "zero page"?

     Nate
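
To make the "second stage" idea concrete, here is a toy model only, not how Xen actually stores its stage-2 tables: a flat table maps guest frame numbers to machine frame numbers, and a lookup for a frame with no entry fails, which is roughly what happens when a guest frame with no stage-2 mapping is accessed. The struct, the function name, and the frame numbers used with it are all invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stage-2 entry: one guest frame mapped to one machine frame. */
struct s2_entry {
    uint64_t gfn;   /* guest (intermediate-physical) frame number */
    uint64_t mfn;   /* machine frame number backing it */
};

/*
 * Look up gfn in a flat toy table (real stage-2 tables are multi-level
 * page tables walked by hardware).
 * Returns 0 and stores the machine frame in *mfn on a hit;
 * returns -1 when the guest frame has no stage-2 mapping.
 */
static int stage2_lookup(const struct s2_entry *tbl, size_t n,
                         uint64_t gfn, uint64_t *mfn)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].gfn == gfn) {
            *mfn = tbl[i].mfn;
            return 0;
        }
    }
    return -1;
}
```

The point of the model is that the guest only ever names guest frames; whether a given frame number is usable depends entirely on whether the hypervisor installed a mapping for it.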


RE: SIGBUS: "ttbr address size fault", possibly caused by .dtb issue - brettstahlman - 11-01-2017

(10-27-2017, 03:29 PM)Nathan.Studer Wrote: We're more than happy to help where we can and will continue to do so in this thread, but you may also want to ask this question on the Xen User mailing list as well.  The toolstack experts there may be able to give you additional information that will help you reach your intended goal.

Just be sure to adhere to the mailing list rules when posting and be prepared to justify your approach, since the first question will likely be "why?".

   Nate

(11-01-2017, 06:33 PM)Nathan.Studer Wrote:
Xen on ARM isn't PV or HVM.  It's a hybrid called PVH.  For your case it resembles HVM in how memory mapping is handled.  Guest Physical Frame Number (GPFN) = Guest Machine Frame Number (GMFN), and Xen sets up the second stage page tables to "auto-translate" guest addresses to machine ones.

Can you give an example of how you're using the program you posted to try and access a DomU's "zero page"?

     Nate

Thanks. Julien Grall (Xen ARM maintainer) made me aware of this peculiarity of Xen on ARM via the xen-users mailing list the other day. With his and Stefano Stabellini's help, I came to understand that the pfns expected by xenforeignmemory_map() are frame numbers of the "guest physical" (aka "pseudo physical" or "intermediate physical") addresses produced by ARM's first stage address translation. Oddly enough, my approach was closer to the mark than I realized at the time of my last post on this list (when I was still thinking my domain was pure "PV"). But I mistakenly assumed that Xen would load the Linux kernel at a "fixed" guest physical address (like 0x100000 on x86). Julien indicated that on ARM, unlike x86, the kernel's memory was not direct-mapped, but that I could determine its virtual start address by starting my guest domain with `xl -vvv create`. The resulting debug output revealed a kernel start address of 0x40080000 (pfn=0x40080), and when I used the code I posted earlier to map this page, I was able to read the expected data (corresponding to the head of my guest domain's kernel image).
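
The address-to-frame arithmetic used here is just a shift by the page size. A minimal sketch, assuming 4 KiB pages as in the posted program (the helper names are made up for this example):

```c
#include <stdint.h>

#define GUEST_PAGE_SHIFT 12u    /* assumes 4 KiB pages, as in the posted code */

/* Convert a guest-physical address to a guest frame number. */
static uint64_t gpa_to_gfn(uint64_t gpa)
{
    return gpa >> GUEST_PAGE_SHIFT;
}

/* Offset of the address within its page. */
static uint64_t gpa_page_offset(uint64_t gpa)
{
    return gpa & ((UINT64_C(1) << GUEST_PAGE_SHIFT) - 1);
}
```

With these, 0x40080000 gives frame 0x40080 with offset 0, and the x86-style address 0x100000 gives frame 0x100.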

Given that my earlier attempts failed because I was attempting to map GPFNs that had no second stage translation defined, the error message generated by the kernel's handler ("ttbr address size fault") was not very helpful, and actually quite misleading. The misleading error message was generated because 64-bit Xen doesn't populate the FSC (Fault Status Code) field in the syndrome register the guest kernel reads (ESR_EL1), and 0 happens to be the value corresponding to "address size fault". Julien indicated that he would submit a patch to improve matters, perhaps by reporting a "synchronous external abort".
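
As an illustration, the value 0x92000000 from the kernel log at the top of the thread can be picked apart as follows, assuming it is an AArch64 exception syndrome value: the exception class sits in bits [31:26] and, for a data abort, the fault status code sits in bits [5:0]. The helper names are invented for this sketch, and dfsc_name() covers only the two codes mentioned in this thread, using the strings the Linux arm64 fault table prints for them.

```c
#include <stdint.h>

/* Exception class: bits [31:26] of an AArch64 syndrome value. */
static unsigned esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3f;
}

/* Data fault status code: bits [5:0] of the syndrome for a data abort. */
static unsigned esr_dfsc(uint32_t esr)
{
    return esr & 0x3f;
}

/* Names for the two fault status codes discussed in this thread. */
static const char *dfsc_name(unsigned dfsc)
{
    switch (dfsc) {
    case 0x00: return "ttbr address size fault";
    case 0x10: return "synchronous external abort";
    default:   return "other";
    }
}
```

For 0x92000000 this gives an exception class of 0x24 (data abort from a lower exception level) and a fault status code of 0, which is why the kernel printed "ttbr address size fault" even though no address size problem was involved.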

Anyways, I appreciate all of your help on this.

Thanks,
Brett S.