Agner`s CPU blog

Paging

Author: Kurt Baumgardner

Date: 2016-09-09 23:40

I have received this mail from Kurt Baumgardner and I think it belongs on the forum so that others can join the discussion:

Hi Agner,

My name is Kurt Baumgardner, and I've been reading your processor design draft. Very impressive! Quite obviously a labor of love. You've done your research, which is also obvious from the rest of your site. I highly respect your work and fndings.

Please understand, I don't mean to be critical. But I had to mention my worry about the decision to drop paging from the memory model, which I think is going to be missed. In a handful of 'controversial' choices, you provided a logical rationale for going with one method vs. another, but, either I missed it, or you didn't go into detail about why no paging. I think it is related to your application memory map model, which, if I read correctly, is a simpler (and possibly tighter) app-based, "just what's needed" model. Such a model runs counter to a typical paging scheme, where most mallocs receive pages vs. arbitrary-sized blocks. I will be the first to admit that I must read that section a few more times And, even then, it's a deep subject to follow.

Regardless, I feel that, from what I understand so far, the memory model might be a bit too simple. You described an idea of the compiler calculating the needed stack size, and/or profiling to attempt to determine the needs. And, this is made easier with the 2 stack approach (which I really like). I understand this pre-calculation approach for optimization, but is its purpose also to help reduce complexities that would make your memory model seem less atractive? (I'm not sure if I described that understandably).

Let me take a step back. My biggest issue/fear/disappointment is with the removal of paging. It always nice to fit everything in memory (64Mb is enough for anyone, right? , but you can't always do it. With all the ideas of future compatibility, like with software automatically taking advantage of larger vectors, the loss of paging seems like a huge step backwards, as it seriously limits machines without large amounts of installed memory. I know of a small percentage of server apps that allocate memory with disregard for actual physical memory size, depending on and expecting the paging hardware and software to handle the otherwise messy details of managing extremely large datasets.

In my opinion, 2 of the most important inventions of computing are: #1: using free memory as a disk cache, and #2: built-in automatic paging. There's really no other practical way to provide apps the freedom to allocate memory as needed. Apps have no business trying to home bake their own paging scheme.

Without automatic paging, all the work you did for future compatibility is wasted. Why? Because developers will not write the big programs that could take advantage of future improvements, because they're afraid of running out of memory!

I know the goal is to simplify the system, for elegance, for cost, for power efficiency, for performance, and many other good reasons. But, your future chip must at least be able to support the hugely-successful and useful capabilities of today's processors. The loss of paging cripples the system severely as a whole.

Or, maybe I'm missing something huge. With the current spec, would you simply have to crash a launching (or running) program with an "Out-of-memory" error? Memory is handled so well currently that many programmers don't even test for the possibility of memory allocation to fail (which is awful, but also amazing that they can get away with it).

Thanks for your time reading, and, if you have a minute or two, please let me know what you think, or your rationales, or your thoughts. If you wish, this text could be placed into your forum. Have a nice day!

By the way, I *really* like the idea of programmer-built extensions to the instruction set. I would love to have the following capabilities:

1. Software gate connection (direct gate array access).
2a. Microcode programming (if so equipped), or
2b. the ability to control load/store, ALU, bit shift, cache control, vectoring, etc hardware.
3. An on-chip cache dedicated to user instructions. Could be used for extremely fast lookups/translations, pre-calculated math, etc.

(one can dream

Kurt

Reply To This Message

Previous Message

Next Message