IRC logs for #openrisc Tuesday, 2013-07-30

--- Log opened Tue Jul 30 00:00:49 2013
stekernaargh, the bits in the IMMUPR register are the wrong way around in the arch spec!02:36
stekernDMMUPR i meant02:37
stekernactually, it's in the translate register that they are the wrong way around...02:38
stekernso, DMMUPR is completely useless now...02:41
stekernwell, perhaps not completely useless, they are easily swapped in hardware, and in software emulation of them I can just swap them in the definition03:22
stekernok, so Linux at least boots with the PTE proposed in the ml discussion06:58
jonibostekern: available?07:59
stekernyes, but I'm leaving for lunch in a minute08:00
joniboare you following the arch spec 8 + 11 bits indexing when walking the page tables?08:01
jonibothis results in less than optimal memory usage, right?08:01
joniboif you've got time to discuss after lunch, ping me08:01
stekernright now I'm basically doing exactly what's done in the tlb miss vectors08:01
jonibootherwise we can do this by email08:01
jonibook, i'll take a look what's going on there08:02
jonibothe arch spec for TLB/page table handling needs a bit of a cleanup08:02
stekern"a bit" =P08:25
stekernjonibo: if you mean using bits 31:24 from VPN as the offset relative to the base address and bits 23:13 from VPN as the offset from the pte table base with 8 + 11 bits indexing, then yes08:32
stekernthat's what is currently done in the kernel (and mor1kx)08:32
jonibook, got it08:32
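
[Editor's note: the 8 + 11 bit split described above can be sketched as plain bit arithmetic. This is only an illustration, assuming a 32-bit virtual address and 8 KB pages as mentioned elsewhere in this log; the sk_ names are hypothetical and are not taken from the kernel or mor1kx.]

```c
#include <stdint.h>

/* Sketch only: all sk_* names are hypothetical. */
#define SK_PAGE_SHIFT      13u  /* 8 KB pages: VA bits 12:0 are the offset */
#define SK_PTE_INDEX_BITS  11u  /* second-level index: VA bits 23:13 */
#define SK_PGD_SHIFT       (SK_PAGE_SHIFT + SK_PTE_INDEX_BITS)  /* 24 */

/* First-level index: VA bits 31:24 (8 bits, 256 entries). */
static inline uint32_t sk_pgd_index(uint32_t va)
{
    return va >> SK_PGD_SHIFT;
}

/* Second-level index: VA bits 23:13 (11 bits, 2048 entries). */
static inline uint32_t sk_pte_index(uint32_t va)
{
    return (va >> SK_PAGE_SHIFT) & ((1u << SK_PTE_INDEX_BITS) - 1u);
}

/* Byte offset inside the 8 KB page: VA bits 12:0. */
static inline uint32_t sk_page_offset(uint32_t va)
{
    return va & ((1u << SK_PAGE_SHIFT) - 1u);
}
```

[For the kernel base 0xc0000000 the first-level index is 0xc0, and the second-level index and page offset are both 0.]
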
jonibowhat's with the PL1 bit?08:33
jonibolevel 1/2 indicator08:33
jonibowhat i'd really like to do is be able to mix 1-level and 2-level page tables08:33
stekernit's probably a mess of inbetween decisions how they should implement things08:33
stekerncouldn't that be done if the L flag would be used?08:34
jonibothe kernel maps to 0xc0000000 + 16MB08:34
joniboso if we could set L to 1 for the PGD for address 0xc0000000 we have a huge page that maps linearly08:34
joniboand L sets to 0 for all other PGD entries08:34
joniboin SW it would be possible to dedicate a TLB entry to the kernel mapping so you never take a TLB exception for kernel pages...08:35
jonibo...but for the HW implementation, that doesn't seem possible08:35
jonibobut if it's just 1 level then it's reasonably cheap anyway08:36
stekerncould it be done if support for CID existed? (if only for the MMU)08:37
jonibono, doesn't need CID08:37
jonibowhat's the point of the L bit?08:37
jonibothe HW implementation has to have special knowledge of the two-level structure in order to handle 8/11 bit indexes properly anyway08:38
jonibo(aside: the HW TLB mapper has to write out the A/D bits to the PTE's when replacing an entry...)08:39
jonibo(that aside just so I don't forget to mention it later)08:39
stekernmy point with the CID was that, could you use that to lock down TLB's for the kernel. Assign one CID for the kernel08:41
joniboproblem with CID is that the kernel often wants to access userspace context and if that's in a different CID then you're kind of hooped08:41
stekernhmm, ok08:42
joniboCID might work for process context separation, but not kernel/userspace separations08:42
joniboand then there's the ATB that's not implemented anywhere... it allows 16MB pages, but why bother with that when 1-level page directories buy you that08:43
jonibohow much can one change the arch spec here?08:43
joniboxTLBWyMRz regs have a PL1 bit.... I don't get what that's for08:44
joniboand all the match/translate regs use 22bit page frames... is that an error?  you said the implementation uses the high 19 bits of these?  why the high bits?08:46
stekernme neither, but I didn't get your point with the "hw implementation has to have special knowledge of the two-level structure to handle 8/11 bit..."08:47
joniboit needs to know to index the first level by 8 bits and the next level by 11 bits08:47
joniboso it treats level 1 and level 2 differently08:48
stekernyes, ok, I agree on that08:48
joniboanyway, the point of all this is that i'd like to pre-create the kernel page directory so that the HW walker can be used as soon as you enable MMU's08:49
joniboit can be done with 2 pages... the initial PGD (swapper_pg_dir) filled with a single entry pointing at the PMD for the kernel space (16 MB)08:50
joniboalternatively, if we can set L to 1 in the PGD and get a 24-bit "page" for the kernel, even better as then we don't need to create the PMD page at all08:50
jonibo(but that last bit is not necessarily according to spec)08:51
jonibothis would allow us to get rid of the boot_xtlb_miss handlers altogether, too, for the SW implementation08:52
stekernI was just about to write, that if you assume that the hw page walker _has_ knowledge about the 8/11 split, then you'd be able to do that08:52
stekernset the L to 1 in PGD and use that for a 24-bit page08:52
jonibothat's what i'd like to do, yes08:52
jonibobut then what's the point of the 16MB pages in the ATB?08:53
joniboso if we do that, the L doesn't really mean "last" (though it does), it really means 16MB page or 8kb page08:55
joniboi.e. normal page/huge page08:55
stekernumm, but even if you have your page table set up like that, the tlb entries will still be for 8kb pages08:55
joniboright... of course... sorry08:56
joniboi think i was thinking there was equivalence in the pgd and pmd entries, but that's not the case08:57
joniboor is it... this is confusing... they are equivalent!?!09:01
stekern(22bit page frames) I think that's an error, at least my implementation in mor1kx only use bits 31:1309:02
jonibook, if or1200 does the same we should modify the arch spec...09:02
stekernwhat does equivalent mean in that context? pmd is folded into pgd, so the pte lookup is (pgd_base + offset)->(pte_base + offset)->pte09:04
jonibocontain same "data"09:04
jonibodata format09:04
joniboanyway, forget that, I think I was confused09:04
joniboso the TLB entries will be for 8kb pages... can that be changed?09:05
jonibomaybe that's what the PL1 bit is for?09:05
stekernbelieve me, I've been (and still am) plenty confused while trying to grasp how the page table works ;)09:05
jonibofor the match register, if "page level is 1" (PL1=1), then you've got a 16MB page09:06
stekernbut isn't that the whole point with the ATLBs?09:06
joniboi think the TLB and ATB design is convoluted09:06
joniboi'd say the TLB part of the spec contains enough to do huge pages09:07
joniboand the ATB is only needed for 32GB pages, but who on earth needs those????????09:07
joniboif you need 32GB pages then you might as well turn off the MMU09:07
joniboi'd rip the ATB out of the spec completely09:08
joniboanyway, i'd change the name of PL1 to HP (huge page)09:08
joniboso if level-1 L=1, then PL1=1 (16MB page) otherwise you've got to hit level-2 and then set PL1=0 (8kb page)...09:10
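
[Editor's note: a rough sketch of the walk being proposed here, where a level-1 entry with L=1 ends the walk as a 16MB page (PL1=1) and any other entry leads to a level-2 lookup for an 8kb page (PL1=0). The bit positions of L and PRESENT, the sk_ names, and the way a level-1 entry refers to a level-2 table (an index into an array that stands in for physical memory) are all assumptions made for illustration, not the arch spec layout.]

```c
#include <stdint.h>

/* Sketch only: bit positions and names are hypothetical. */
#define SK_PTE_L         0x1u         /* assumed position of the L flag */
#define SK_PTE_PRESENT   0x2u         /* assumed position of a PRESENT flag */
#define SK_FRAME_MASK    0xffffe000u  /* bits 31:13, 8 KB frame */
#define SK_HUGE_MASK     0xff000000u  /* bits 31:24, 16 MB frame */
#define SK_PTES_PER_TBL  2048u        /* 11-bit second-level index */

struct sk_tlb_fill {
    int      valid;  /* 0 means the walker must raise a miss/fault */
    int      pl1;    /* 1 = 16 MB page from level 1, 0 = 8 KB page */
    uint32_t phys;   /* translated physical address */
};

/*
 * pte_tables[] stands in for physical memory: a level-1 entry without L
 * stores the index of its level-2 table in the frame field.
 */
static struct sk_tlb_fill sk_walk(const uint32_t *pgd,
                                  uint32_t (*pte_tables)[SK_PTES_PER_TBL],
                                  uint32_t va)
{
    struct sk_tlb_fill out = { 0, 0, 0 };
    uint32_t pgd_entry = pgd[va >> 24];

    if (!(pgd_entry & SK_PTE_PRESENT))
        return out;

    if (pgd_entry & SK_PTE_L) {
        /* Level-1 entry is the last one: 16 MB page, so PL1 = 1. */
        out.valid = 1;
        out.pl1 = 1;
        out.phys = (pgd_entry & SK_HUGE_MASK) | (va & ~SK_HUGE_MASK);
        return out;
    }

    /* Otherwise go to level 2, indexed by VA bits 23:13. */
    uint32_t table = (pgd_entry & SK_FRAME_MASK) >> 13;
    uint32_t pte = pte_tables[table][(va >> 13) & (SK_PTES_PER_TBL - 1u)];

    if (!(pte & SK_PTE_PRESENT))
        return out;

    out.valid = 1;
    out.pl1 = 0;
    out.phys = (pte & SK_FRAME_MASK) | (va & ~SK_FRAME_MASK);
    return out;
}
```

[With this shape the top-level directory can mix huge and normal mappings, which is the case the PL1 bit in the match register would have to distinguish.]
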
stekernI agree, it looks to me too that you could do huge pages fine with only the tlb09:10
stekernI realised another problem with the arch spec pte too, it has no PRESENT/VALID bit...09:11
jonibogiven that L and PL1 exist, I think somebody has already thought a bit about this... the names suck but that can be changed09:11
stekernthe hw walker needs to know that09:12
joniboyeah, PRESENT/VALID are needed but can probably be squeezed into bits 12-10 somewhere, right?09:12
jonibo(in the PTE)09:12
jonibobut for that we need to make sure that everybody uses bits 31-13 for the page frame09:13
stekern(PRESENT in 12-10) yes, that's not a problem, just that we need add that to the arch spec09:14
joniboright... which is doable iff the or1200 also uses bits 31-13 for the frame... what if it uses bits 28-10?09:15
joniboi'm not sure we need a valid bit...09:15
joniboin the PTE, I mean... isn't that a TLB detail?09:16
stekernwhat should the hw walker do when there's a pte without the present bit (if it doesn't know about it)09:17
stekernas I have understood it, all other bits can basically be "random" when present is not set.09:18
joniboit needs to know about it... you're right, I think the rest is garbage to help the swapper find the page when it's not set09:19
stekernanyways, I mentioned it before you logged on, I tested redoing the PTE layout as per my suggestion in the mail yesterday and rewrote the tlb miss handlers for it this morning09:21
stekernit boots at least =P09:22
jonibovery good09:22
stekernI managed to shave of one of the register saves in the exception context switch while I rewrote the tlb miss handlers09:22
stekern*shave off09:22
joniboalso good!09:23
jonibowhich one, r12?09:23
stekernno, the tlb miss handlers only save(d) r2,r3,r4,r5,r609:23
jonibooh yeah, right, that one's special09:24
stekernby reorganizing slightly the r6 uses could be removed09:24
joniboi think that stuff could be redone in C, really09:25
stekernI'm inclined to agree09:25
joniboit's really just a matter of making sure the kernel pages are mapped into the TLB and then turn on the MMU again and run the C function09:26
jonibothat's why i'd really like the 16MB kernel page... it's quick to map it if it's not there09:26
joniboof course, we might get into trouble if we end up overwriting the TLB entry that the kernel is using... that's really why a dedicated "kernel TLB entry" would be nice... can be done in SW if you dedicate entry 0 in the TLB to the huge kernel page09:28
stekern(garbage to help the swapper) so is it cool to have the present bit wherever in the pte?09:31
joniboi was wondering the same thing... I don't know but I think it should be ok09:33
joniboother arches have it "anywhere" so it's certainly fine09:35
stekernok, good09:35
joniboare you doing the D and A bits write-back when swapping out a TLB entry?09:36
jonibothose bits are supposed to be sync'ed to the PTE when the TLB entry goes away09:36
stekernno, not currently09:36
stekernshouldn't the tlb miss handler do that too then?09:36
stekernand isn't the dirty bit supposed to be set when something writes to the page?09:37
jonibothe dirty bit should be set when there's a write access to the page (by the CPU)09:38
joniboyes, the TLB miss handler should do that...09:38
stekern(which is pretty much impossible to do in sw, unless the miss itself was a write access)09:38
stekernbut if you do that properly, you can just write-through the tlb, no?09:40
stekernin hw I mean?09:40
jonibothen the TLB would need to look exactly like the PTE...???09:41
stekernumm, yeah....09:41
stekernor just do rmw on the pte on the first access09:41
joniboi wonder if the kernel even uses those bits...???09:42
jonibomaybe it's moot09:42
jonibosince it works today, it would seem so :)09:42
jonibowill need to dig into that too09:42
stekernwell, isn't those just performance tweaks?09:42
jonibonot necessarily, but in this case, perhaps yes09:43
jonibohow does the kernel know a page has been written to?09:43
joniboit'd have to set it RO, wait for the page fault, set the page RW and set the D bit09:43
joniboseems like a fair bit of work to do for the first page write09:44
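
[Editor's note: the "set it RO, wait for the page fault, set the page RW and set the D bit" sequence can be sketched as below. The flag names and bit positions are hypothetical and are not the OpenRISC PTE layout; the sketch assumes a software-only "may be written" flag kept apart from the write permission the MMU actually sees.]

```c
#include <stdint.h>

/* Sketch only: flag names and positions are hypothetical. */
#define SK_PTE_SW_WRITE  0x1u  /* the OS allows writes to this page */
#define SK_PTE_HW_WRITE  0x2u  /* write permission visible to the MMU */
#define SK_PTE_DIRTY     0x4u  /* page has been written since mapping */

/* Map a page clean: even a writable page starts read-only in hardware. */
static inline uint32_t sk_map_clean(int writable)
{
    return writable ? SK_PTE_SW_WRITE : 0u;
}

/*
 * Called on a write-protection fault. Returns 1 if the access should be
 * retried (first write to a writable page), 0 for a genuine violation.
 */
static inline int sk_handle_write_fault(uint32_t *pte)
{
    if (!(*pte & SK_PTE_SW_WRITE))
        return 0;
    *pte |= SK_PTE_HW_WRITE | SK_PTE_DIRTY;
    return 1;
}
```

[The cost is one extra fault for the first write to each clean page; later writes hit the now-writable mapping directly.]
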
jonibois PL1 not implemented on mor1kx/or1200?  that must be a bug...???09:45
jonibo(my huge page bit)09:45
stekernno, not implemented in either (or well, you can write the bit in mor1kx, but it's not interpreted by anything inside mor1kx)09:47
jonibodo you agree that's a bug?09:48
stekernI'm not sure what it should do, so I can't agree ;)09:49
joniboi was just going to check what or1ksim does09:49
joniboor1ksim seem to do nothing with it09:50
stekern*but*, an interesting fact is that the defines for xMMUCR in spr_defs.h that was wrong, mentioned "Level1 and Level2 page size" and "Vaddr and Paddr widths"09:52
jonibohmmm, interesting09:53
jonibo"A PTE translates a virtual memory area into a physical memory area. How much09:59
jonibovirtual memory is translated depends on which level the PTE resides. PTEs are either in09:59
jonibopage directories with L bit zeroed or in page tables with L bit set. PTEs in page10:00
jonibodirectories point to next level page directory or to final page table that containts PTEs for10:00
joniboactual address translation."10:00
jonibothat's from the arch spec10:00
joniboI think that somehow replaced the old DMMUCR stuff from spr-defs10:00
joniboit's now _defined_ to 8 + 11 bits10:00
jonibo_and_ the L bit decides whether it's a huge page or not, _and_ the top-level page directory can contain mixed huge and normal pages10:01
joniboso the TLB needs to know this too, hence PL110:02
joniboPL1 isn't optional so that lack of it is a bug10:02
jonibokthxbye :)10:02
joniboPTBP in DMMUCR is 22bits, again that's got to be wrong10:03
joniboit must be 19 bits, no?10:03
stekernbut in a software tlb reload environment, isn't that the job of the software to set that bit then?10:04
joniboyes, absolutely10:04
joniboif L = 1 at level 1 then you need to set that bit10:04
joniboi'm not saying that the SW implementation is correct... I'm just trying to figure out how we can unify the SW and HW implementations as best as possible and how to make it all sane with respect to the arch spec10:05
stekernyeah I know10:05
jonibothe SW implementation doesn't use PL1 at all either, but then again it's meaningless if the HW doesn't implement it10:05
jonibobut how can we make this change backwards-compatible with the or1200 which I presume is abandoned?10:06
jonibokernel config option to special-case OR1200's "broken" behaviour10:07
stekernthere is already that kind of kernel config10:07
joniboI suppose we could actually test for it by mapping a huge page and checking if an access to the second normal page within it generates an exception10:09
jonibo(it shouldn't)10:09
jonibowhy do the protection registers (PPI index) map from 1-7, why not 0-7?10:12
stekern(PTBP 22 bits) hmm, I'm not sure, if you have 8-bits from VPN, you'll need 22 bits "to fill the rest of the bits"10:13
stekernbut I guess that's always page aligned anyway10:13
jonibothe page tables/directories will always be complete pages (8kb), so using 19 bits is sufficient10:14
stekernbut is that OS independent?10:16
joniboisn't it?10:16
stekernI'm the one asking10:16
joniboi'd say it is10:16
stekernbut that's, as you said, waste of memory10:17
joniboespecially given that we enforce 8 + 11 bit indexing10:17
jonibono, not in this case10:17
joniboyou can't realistically allocate less than a page at a time anyway10:18
stekernhmm, yeah10:20
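
[Editor's note: the arithmetic behind "19 bits is sufficient" and the memory-usage question, as a small sketch. It assumes 32-bit table entries and 8 KB pages; the sk_ names are hypothetical, and the shift used for the base pointer is an illustration of the argument, not the DMMUCR field layout.]

```c
#include <stdint.h>

/* Sketch only: names are hypothetical, entry size is an assumption. */
#define SK_PAGE_SHIFT   13u         /* 8 KB pages */
#define SK_ENTRY_BYTES  4u          /* assumed 32-bit table entries */
#define SK_PGD_ENTRIES  (1u << 8)   /* 8-bit first-level index */
#define SK_PTE_ENTRIES  (1u << 11)  /* 11-bit second-level index */

/* Bytes actually used by a first-level directory. */
static inline uint32_t sk_pgd_bytes(void)
{
    return SK_PGD_ENTRIES * SK_ENTRY_BYTES;
}

/* Bytes used by one second-level table. */
static inline uint32_t sk_pte_table_bytes(void)
{
    return SK_PTE_ENTRIES * SK_ENTRY_BYTES;
}

/*
 * A page-aligned base address has 13 zero low bits, so only the upper
 * 32 - 13 = 19 bits carry information.
 */
static inline uint32_t sk_base_to_ptbp(uint32_t base)
{
    return base >> SK_PAGE_SHIFT;
}

static inline uint32_t sk_ptbp_to_base(uint32_t ptbp)
{
    return ptbp << SK_PAGE_SHIFT;
}
```

[Under these assumptions a second-level table fills exactly one 8 KB page, while the first-level directory only uses 1 KB of the page it is allocated in.]
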
stekern(PPI) oh... no...10:21
stekernok, so we have a "VALID/PRESENT" bit in the arch spec, PPI = 0...10:21
stekernso the whole idea with X W U bits mapping nicely into PPI fails then10:24
joniboi don't see why...10:24
stekernbecause X=0 & W=0 & U=0 was supposed to imply R10:25
joniboright... I see the problem10:25
joniboagain, the arch spec needs to be sane there10:25
jonibowe need to change it so that there are 8 sets of protection... what's the point of not using set 010:26
joniboPPI=0 must be a valid value10:26
joniboPPI=0 = invalid... hmmm10:27
joniboso it might be even trickier yet, or better...10:35
stekernso to do that properly according to the arch spec, you'd have to map the protection into numbers, not bitmasks10:35
jonibothe ITLB and DTLB uses separate protection regs10:35
jonibobut they use the same page tables10:36
joniboso the meaning of the PPI field varies depending on whether the instruction tlb or DTLB are in play10:36
stekernin essence, yes10:36
joniboso we really only have 2 bits relevant for DTLB... Writable/User.  And two bits relevant for ITLB: Exec/User10:39
stekernbut you can of course look at it in the way of IMMUPR[PPI] | DMMUPR[PPI]10:39
jonibowait... for itlb it's just 1 option... user... it's always executable10:42
jonibono it's not10:42
stekernwell, in the current itlb miss handler it is... but that's wrong10:43
jonibofor the itlb it's "invalid" if it's not executable10:43
joniboit makes no sense to load a page into the ITLB if it's not executable10:44
stekernare the PAGE_XXX defines used directly in generic kernel code? or are the pte_xxx accessor functions used for that?10:52
jonibopte_ accessors10:53
joniboI think10:54
stekernI'm just wondering how big of a deal it would be to do the PPI field properly10:54
jonibohow properly?10:55
jonibodo you mean "skip the protection registers"?10:56
jonibothat would probably be better10:56
joniboI think part of the question is "has anybody already implemented something around this before"... if not, we don't need to worry so much about compatibility10:57
jonibowe could just introduce a new, parallel MMU system with its own registers, too and leave the old stuff as it is... just a thought10:58
joniboi need to get some lunch... back in a bit10:58
stekernI meant, like PAGE_PRESENT=1, PAGE_READ=2, PAGE_WRITE=3 etc10:59
jonibowhat about this for PPI:  bit 2 = writable, bit 1 = user, bit 0 = valid/readable/exec11:26
jonibofor dtlb valid = readable11:26
jonibofor itlb valid = executable11:26
jonibofor dtlb valid = readable + maybe user + maybe writable (really only four cases); invalid = 011:28
jonibofor itlb, valid = executable + maybe user (really only 2 cases), invalid = 011:28
joniboso this way we can map the two different uses onto one bitmask... I think it works11:29
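
[Editor's note: the proposed PPI bitmask (bit 2 = writable, bit 1 = user, bit 0 = valid, read as "readable" by the DTLB and "executable" by the ITLB) could be checked roughly as follows. This is a sketch of the proposal in this log only; the function and macro names are hypothetical.]

```c
/* Sketch only: names are hypothetical, bit meanings are from the proposal. */
#define SK_PPI_VALID  0x1u  /* bit 0: readable (DTLB) or executable (ITLB) */
#define SK_PPI_USER   0x2u  /* bit 1: accessible from user mode */
#define SK_PPI_WRITE  0x4u  /* bit 2: writable, only meaningful to the DTLB */

/* Data access check: four meaningful cases, PPI = 0 is invalid. */
static inline int sk_dtlb_allows(unsigned ppi, int is_write, int is_user)
{
    if (!(ppi & SK_PPI_VALID))
        return 0;
    if (is_user && !(ppi & SK_PPI_USER))
        return 0;
    if (is_write && !(ppi & SK_PPI_WRITE))
        return 0;
    return 1;
}

/* Instruction fetch check: valid means executable, write bit is ignored. */
static inline int sk_itlb_allows(unsigned ppi, int is_user)
{
    if (!(ppi & SK_PPI_VALID))
        return 0;
    if (is_user && !(ppi & SK_PPI_USER))
        return 0;
    return 1;
}
```

[Because both TLBs read bit 0 from the same PTE, any readable page is also executable in this encoding.]
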
stekernhmm, yeah, maybe, it kind of feels right at least ;)11:33
stekernthe only thing that feels a bit off is that READ == EXECUTE, but perhaps that's not an issue in practice11:34
stekernI'll play with that in the WIP pte rework I have in my working tree11:36
stekerntonight or latest tomorrow morning11:36
joniboread == execute... hmm, yeah, that's kind of ugly11:36
jonibobut we'd almost need separate page tables for data and instructions to get around that11:37
jonibobut really, I think the best solution would be to redefine the meaning of the PPI field to make it explicitly W/U/X and put the Valid bit elsewhere11:43
jonibothen drop the protection regs altogether11:44
jonibothe protection regs are overdimensioned anyway11:44
joniboIMMUPR provides 7 sets with 2 bits each!11:45
joniboDMMUPR provides 7 sets with 4 bit each, but many of those combinations really don't make sense11:45
jonibook, how about this:11:58
jonibowhat if we made the page table indices 11 + 8 bits instead of 8 + 11... that would get our huge page size down to 2 MB which is more reasonable11:58
jonibo(i'd prefer 1 MB, even, if possible)11:58
jonibounfortunately, we'd end up with more second level page tables and they'd not be as full... bad tradeoff11:59
joniboor:  the top level page directory is sparsely populated (as it's just 8 bit indexed)... if the L bit is set, it could mean that the PTE is found between the 8 bit entries in the top-level directory and is 11 bit indexed... that PTE would point to a 21 bit huge page (2MB)12:04
jonibothat's reasonably elegant12:05
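
[Editor's note: the page sizes quoted here follow from how many address bits the top-level index consumes. A small sketch of that arithmetic, assuming a 32-bit virtual address and 8 KB base pages; the function names are hypothetical.]

```c
#include <stdint.h>

/* Sketch only: names are hypothetical. Valid for 1..19 top-level bits. */

/* Bytes covered by one top-level entry, i.e. the huge page size. */
static inline uint32_t sk_huge_page_bytes(unsigned top_index_bits)
{
    return 1u << (32u - top_index_bits);
}

/* Entries in each second-level table for a given top-level width. */
static inline uint32_t sk_second_level_entries(unsigned top_index_bits)
{
    return 1u << (32u - 13u - top_index_bits);
}
```

[An 8-bit top-level index gives 16 MB huge pages and 2048-entry second-level tables; an 11-bit one gives 2 MB huge pages and 256-entry tables; 1 MB huge pages would need 12 bits at the top level.]
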
jonibostekern: thanks for the discussion, I tried to summarize the above in an email and sent it to the list(s)... let's continue the discussion there in case it leads to something useful/concrete12:56
stekernjonibo: agreed, and thanks for your input too, it helped me a lot!13:24
stekernand I'm leaning towards the W/U/X + PRESENT bit, I've got a feeling that !(PPI) is going to be messy to have as PRESENT, but I still want to investigate that a bit more13:26
--- Log closed Wed Jul 31 00:00:50 2013
