
[kernel,cmd] heap fix, meminfo update, discussion #105

Merged
merged 4 commits into master on Nov 22, 2024
Conversation

@Mellvik (Owner) commented Nov 19, 2024

This PR fixes a small bug in the heap code that appears when adding more memory to the heap after the initial (big) boot-time allocation. It applies only to the code doing SEG-descriptor allocations from the top of the heap. Under some circumstances, the variable that keeps track of the highest free block in memory would get disturbed by a heap_add() and lose its intended function. A minimal change fixes that.

Just as interesting though: How does such an addition (a recent introduction to the system with the changes in low memory usage during boot and startup) to the heap space influence heap allocation in general? The typical situation is that the 1k options block is added to the heap at some point during startup. The block is separate from (not contiguous with) the rest of the heap. At this point in the startup process, a few SEG-descriptors have been allocated from the top of the heap as they should (see snippet below). After the new 1k block is added, new small (<1k) heap allocations naturally gravitate into that block, which fills up fast with SEG-descriptors and TTY buffers, sometimes other small allocations. When full, SEG-descriptors will again be allocated from the top of the 'main' heap.

This works out really well for the system. The newly added 1k block lends itself perfectly to such small allocations, keeping them away from the 'main' heap, where they would otherwise (without the special TLVC allocation mechanism) fragment the heap quite fast. In fact, the 1k block is large enough to hold most SEG descriptors normally needed on a running system, making maximal use of the freed block.

This PR also updates meminfo with a number of enhancements from @ghaerr/ELKS (ghaerr/elks#1847) - thank you! - plus TLVC adaptations, including the removal of tiny_printf, which doesn't work well with long decimals when they get big. A bug for another time and day.

There are still a few display adjustments to be made in meminfo depending on which option(s) is in use.

tlvc-qemu# meminfo
  HEAP   TYPE  SIZE    SEG   STYPE   SSIZE CNT  NAME
  6bee   TASK  9042
  8f4c   CACH  6144
  a758   BUFH   400
  a8f4   DRVR  1024
  ad00   DRVR  1024
  b10c   TTY   1024
  b518   TTY     80
  b574   NETB  1536
  bb80   NETB  1536
  c18c   free 15904
  ffb8   MEM     16   1274   CSEG    7056    1  /bin/init 
  ffd4   MEM     16   25e7   DSEG    5872    1  /bin/init 
  fff0   MEM     16   24ed   BUF     4000    1  
  3ea8   MEM     16   0050   free    1536    0  
  3ec4   MEM     16   142d   DSEG    3056    1  /bin/getty 
  3ee0   MEM     16   2756   CSEG    4096    1  telnetd 
  3efc   MEM     16   4c71   free     816    0  
  3f18   MEM     16   557c   CSEG    5920    1  /bin/getty 
  3f34   MEM     16   56ee   DSEG    5328    1  meminfo 
  3f50   MEM     16   14ec   free      16    0  
  3f6c   MEM     16   467c   DSEG    5680    1  telnetd 
  3f88   MEM     16   4ca4   CSEG   18608    1  ktcp 
  3fa4   MEM     16   512f   free   17616    0  
  3fc0   MEM     16   5f8c   DSEG   58832    1  ktcp 
  3fdc   NETB    12
  3ff4   MEM     16   6de9   free  204144    0  
  4010   MEM     16   2856   CSEG   46064    1  -/bin/sh 
  402c   free    16
  4048   MEM     16   47df   CSEG   14048    1  ftpd 
  4064   MEM     16   4b4d   CSEG    4672    1  meminfo 
  4080   MEM     16   5aea   DSEG   14640    1  ftpd 
  409c   MEM     16   5e7d   free    4336    0  
  40b8   TTY     80
  4114   TTY     80
  4170   MEM     16   3395   DSEG   25216    1  -/bin/sh 
  418c   MEM     16   39bd   free   52208    0  
  41a8   MEM     16   583b   free   10992    0  
  41c4   free   216
  Heap/free   38942/16136, Total mem 514752
  Memory:  503KB total,  218KB used,  285KB free
tlvc-qemu# 
tlvc17-386sx# ./meminfo
  HEAP   TYPE  SIZE    SEG   STYPE   SSIZE CNT  NAME
  6bee   TASK 13974
  a290   CACH 12288
  d29c   BUFH   200
  d370   DRVR  1024
  d77c   DRVR  1024
  db88   TTY   1024
  df94   TTY     80
  dff0   TTY   1024
  e3fc   free  7088
  ffb8   MEM     16   1274   CSEG    7120    1  /bin/init 
  ffd4   MEM     16   256a   DSEG    5920    1  /bin/init 
  fff0   MEM     16   24ed   BUF     2000    1  
  3ea8   MEM     16   0050   free    1536    0  
  3ec4   MEM     16   1431   free    3008    0  
  3ee0   MEM     16   26dc   CSEG    4096    1  telnetd 
  3efc   free    16
  3f18   MEM     16   550c   CSEG    5840    2  /bin/getty 
  3f34   MEM     16   5679   free   16240    0  
  3f50   MEM     16   5ec2   free    1360    0  
  3f6c   MEM     16   4601   DSEG    5648    1  telnetd 
  3f88   MEM     16   4c28   CSEG   18656    1  ktcp 
  3fa4   MEM     16   50b6   DSEG    3056    1  /bin/getty 
  3fc0   MEM     16   5f17   DSEG   58896    1  ktcp 
  3fdc   MEM     16   6d78   free  205952    0  
  3ff8   MEM     16   27dc   CSEG   46048    1  -/bin/sh 
  4014   MEM     16   5175   DSEG    5296    1  ./meminfo 
  4030   MEM     16   4762   CSEG   14032    1  ftpd 
  404c   MEM     16   4acf   CSEG    4672    1  ./meminfo 
  4068   MEM     16   5a70   DSEG   14640    1  ftpd 
  4084   MEM     16   5e03   DSEG    3056    1  /bin/getty 
  40a0   TTY     80
  40fc   TTY     80
  4158   MEM     16   331a   DSEG   25232    1  -/bin/sh 
  4174   TTY     80
  41d0   MEM     16   4bf3   free     848    0  
  41ec   MEM     16   3943   free   52192    0  
  4208   MEM     16   52c0   free    9408    0  
  4224   free   120
  Heap/free   38942/7224, Total mem 514752
  Memory:  503KB total,  219KB used,  284KB free
tlvc17-386sx# 

@ghaerr commented Nov 19, 2024

the removal of tiny_printf which doesn't work well with long decimals when they get big.

Can you describe what the actual problem is? I recently updated tiny_printf so that it interprets almost all of the decimal and string format descriptors (e.g. field width and precision) the same as printf.

There are still a few display adjustments to be made in meminfo depending on which option(s) is in use.

What are you saying needs to be done here? I'm not yet seeing the immediate difference.

@Mellvik (Owner, Author) commented Nov 19, 2024

the removal of tiny_printf which doesn't work well with long decimals when they get big.

Can you describe what the actual problem is? I recently updated tiny_printf so that it interprets almost all of the decimal and string format descriptors (e.g. field width and precision) the same as printf.

I'm using an older version of tiny_printf, that's probably it. The new one does not compile after what seems like some Watcom changes, or maybe not...

tiny_vfprintf.c:54:32: warning: unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
 #pragma GCC diagnostic ignored "-Wprio-ctor-dtor"
                                ^~~~~~~~~~~~~~~~~~
In file included from tiny_vfprintf.c:22:0:
/Users/helge/src/TLVC/libc/include/sys/rtinit.h:17:25: error: expected ‘)’ before numeric constant
 #define _INIT_PRI_STDIO 20 /* stdio stream setup/flush */
                         ^
tiny_vfprintf.c:55:26: note: in expansion of macro ‘_INIT_PRI_STDIO’
 DESTRUCTOR(__exit_flush, _INIT_PRI_STDIO);
                          ^~~~~~~~~~~~~~~

@Mellvik (Owner, Author) commented Nov 19, 2024

There are still a few display adjustments to be made in meminfo depending on which option(s) is in use.

What are you saying needs to be done here? I'm not yet seeing the immediate difference.

This is a 'TODO' statement. You've made a lot of enhancements in meminfo, and I need to familiarize myself with them. Like, in meminfo -a, displaying 'heap/free' data doesn't seem meaningful. And in meminfo -m there are occasional blank lines in the display. Etc. This may be different in TLVC than in ELKS; after all, there are some differences in the data structures (at least in the SEG types, IIRC).

@ghaerr commented Nov 19, 2024

The new one does not compile after what seems like some Watcom changes, or maybe not...

It looks like it might be trying to compile "DESTRUCTOR" as a function definition? Perhaps you haven't ported over the DESTRUCTOR macro from include/arch/cdefs.h:

#ifdef __GNUC__
#define noreturn        __attribute__((__noreturn__)) /* don't require <stdnoreturn.h> */
#define stdcall         __attribute__((__stdcall__))
#define restrict        __restrict
#define printfesque(n)  __attribute__((__format__(__gnu_printf__, n, n + 1)))
#define noinstrument    __attribute__((no_instrument_function))
#define CONSTRUCTOR(fn,pri) void fn(void) __attribute__((constructor(pri)))
#define DESTRUCTOR(fn,pri)  void fn(void) __attribute__((destructor(pri)))
#define __wcfar
#define __wcnear
#endif

Things got a little complicated between GCC and OWC's differing ways of specifying constructors/destructors, which are very useful in ordering the initialization and finalization of C library routines on startup and exit. Supporting multiple compilers in a C library can be messy.

@ghaerr commented Nov 19, 2024

And in meminfo -m there are occasional blank lines in the display. Etc.

Thanks, I'll take a look at them again. I was finding that the general meminfo display was getting too long in some instances, and thought more options would be useful. Of course, I can't remember what they are, but I did IIRC write a man page for them :)

In particular I did find I wanted to see a sorted list of main memory, which is what meminfo -m is supposed to do.

@Mellvik (Owner, Author) commented Nov 19, 2024

The new one does not compile after what seems like some Watcom changes, or maybe not...

It looks like it might be trying to compile "DESTRUCTOR" as a function definition? Perhaps you haven't ported over the DESTRUCTOR macro from include/arch/cdefs.h:

#ifdef __GNUC__
#define noreturn        __attribute__((__noreturn__)) /* don't require <stdnoreturn.h> */
#define stdcall         __attribute__((__stdcall__))
#define restrict        __restrict
#define printfesque(n)  __attribute__((__format__(__gnu_printf__, n, n + 1)))
#define noinstrument    __attribute__((no_instrument_function))
#define CONSTRUCTOR(fn,pri) void fn(void) __attribute__((constructor(pri)))
#define DESTRUCTOR(fn,pri)  void fn(void) __attribute__((destructor(pri)))
#define __wcfar
#define __wcnear
#endif

That's probably it. I'll have another look. Thanks.

Things got a little complicated between GCC and OWC's differing ways of specifying constructors/destructors, which are very useful in ordering the initialization and finalization of C library routines on startup and exit. Supporting multiple compilers in a C library can be messy.

Indeed. I wanted to avoid that (gcc only), yet here I am because I'm too lazy to go in and understand it all, which is required in order to ditch what isn't relevant. Deserved obstacles, I suppose :-)

@ghaerr commented Nov 19, 2024

in meminfo -m there are occasional blank lines in the display. Etc.

Ok, I remember now: I added the blank lines to segregate separate "arenas" of main memory that are not contiguous, since
-m is supposed to show a sorted list:
[screenshot: meminfo -m output showing three separate arenas divided by blank lines]
In the above, the three separate arenas, the result of three seg_adds, correspond to main memory, the INITSEG segment, and the OPTIONS segment. They wouldn't be sorted without the blank line...

@Mellvik (Owner, Author) commented Nov 19, 2024

Ah, that explains it, and makes sense - but hard to understand for an occasional user (who probably wouldn't benefit from the tool in the first place...)

Man page: I noticed - I haven't imported it yet. That helps a lot! Thanks.

@ghaerr commented Nov 19, 2024

I wanted to avoid that (gcc only), yet here I am because I'm too lazy to go in and understand it all,

I get it - delete code and move forward! The big benefit that bringing OWC on was forcing the kernel to accept the idea of multiple (any number of) code and data segments, not just two fixed code and one data segment per process. The OS/2 executable format happened to allow for any number of code and data segments (used to provide virtually unlimited application size up to 8086 limits) and I just adopted that executable format and now ELKS will load and execute a .os2 executable directly. Of course, that only works when using the OWC linker, which natively outputs OS/2 format. (Yes, OS/2 tools could do the same, but we definitely won't be emulating OS/2 system calls!)

@ghaerr commented Nov 19, 2024

but hard to understand for an occasional user (who probably wouldn't benefit from the tool in the first place...)

Yeah, they'll likely never run meminfo -m either. I found the blank lines a bit unsettling, but opted to keep them since the whole idea was a quick visual on main memory. The default listing is very hard to follow for main memory since it's sorted by near heap, not far heap.

@ghaerr commented Nov 19, 2024

but hard to understand for an occasional user

Another idea would be to replace the confusing blank lines with the SEG TYPE SIZE header again instead. You're pretty good at this sort of thing, let me know what you think after playing with it for a while.

@Mellvik (Owner, Author) commented Nov 19, 2024

I wanted to avoid that (gcc only), yet here I am because I'm too lazy to go in and understand it all,

I get it - delete code and move forward! The big benefit that bringing OWC on was forcing the kernel to accept the idea of multiple (any number of) code and data segments, not just two fixed code and one data segment per process. The OS/2 executable format happened to allow for any number of code and data segments (used to provide virtually unlimited application size up to 8086 limits) and I just adopted that executable format and now ELKS will load and execute a .os2 executable directly. Of course, that only works when using the OWC linker, which natively outputs OS/2 format. (Yes, OS/2 tools could do the same, but we definitely won't be emulating OS/2 system calls!)

I certainly like the idea (and I did include multi-segment 'support' when I ported over a zillion header files and accompanying code last week (PR pending), but skipped fork/exec). Do you see any use for it with gcc?

@Mellvik (Owner, Author) commented Nov 19, 2024

but hard to understand for an occasional user

Another idea would be to replace the confusing blank lines with the SEG TYPE SIZE header again instead. You're pretty good at this sort of thing, let me know what you think after playing with it for a while.

I will take a shot at tiny_printf first, then take a closer look at the more 'advanced' features and UI of meminfo. Interesting (from your screenshot) that the size is that of an ethernet segment. Isn't OPTSEG 1k?

@ghaerr commented Nov 19, 2024

I did include multi-segment 'support'
Do you see any use for it with gcc?

Not without more big enhancements to gcc, which don't look like they're forthcoming. The compiler is using a special mechanism to force a second (.fartext) code segment; I'm not sure if it could be easily extended to > 2 code segments. Large data model likely won't ever happen in gcc. Looking at gcc internals I see it is extremely complicated, so I won't be doing it!

then take a closer look at the more 'advanced' features and UI of meminfo

I came up with the following instead of the blank lines, what do you think?
[screenshot: meminfo -m output with the SEG TYPE SIZE header repeated between arenas instead of blank lines]

that the size is that of an ethernet segment. Isn't OPTSEG 1k?

LOL, you're starting to see all numbers with some of their other meanings... 1536 also happens to be 1024+512, and the released low core memory not only includes OPTSEG, but setup's 512-byte data segment. Actually, all memory from OPTSEG up until REL_INITSEG is released, which for now happens to be just those two. This is the reason why SETUP_xxx variables now need to be copied after kernel init, since their previous memory segment is no longer meaningful using setupb/setupw.

@Mellvik (Owner, Author) commented Nov 19, 2024

I came up with the following instead of the blank lines, what do you think?

I like it, this certainly works for me. I would probably have chosen 'block' ('segment' being out of the question) instead of 'arena' - having a smaller vocabulary (non-native) means gravitating towards simpler terms. I like 'arena'!

Thanks for the reminder about releasing segments to the heap; I haven't completed that part yet. As to 1536, I have that number twice or more in my meminfo listings (qemu) - netbuffers - so it immediately caught my eye. Blindness has many facets...

@ghaerr commented Nov 19, 2024

thanks for the reminder about releasing segments to the heap, I haven't completed that part yet.

I'm a little worried about the correctness of the multiple heap_add and seg_add calls in their respective allocators. I'm pretty sure they both use the same algorithm: they try to automatically merge segments found contiguous with each other. The UNIX V7 malloc is similar (I have gotten very familiar with it), except that it allocates a "busy" block in between sbrk calls in the case that the calls don't return contiguous memory; these allocators have no such thing.

Our kernel allocators "split" an existing segment, then try to "unsplit/merge" them back together when possible. I'm not exactly sure how this works, but there was a lot of talk about it when it was implemented. While V7 uses a pointer internal to the memory being managed, our main memory allocator uses the near heap to manage the segment splitting, which of course is why there's a near heap entry for every main memory segment. I'm not sure how the near heap allocator internally differs, other than of course it has its own header which holds the managed memory information.

All of this is fine, except the issue I'm concerned about is whether either of the heap allocators might try to merge memory that was never added, thereby stomping into the middle of the kernel data or main memory segments. I think we're OK, but a potential merge could likely only occur if the additional arenas happened to come at higher addresses, which, by luck, just happen to be the reverse - the additionally added near and far heap memory arenas each have lower addresses. I suppose I need to study the allocator code again.

All this can probably be resolved with testing, by purposely finding test cases where, say, main memory is allocated in a secondary arena, and another allocation comes along that requires a merge, all the while watching through meminfo (which of course disturbs the heaps just to run it). Of course, any new system crashes would quickly prove we have a problem.

What do you think? I noticed you've been into the allocator code recently. Do the algorithms seem safe with multiple arenas?

@Mellvik (Owner, Author) commented Nov 20, 2024

Thanks for the new version of tiny_printf @ghaerr. Getting it to compile opened a can of worms, but now we're here!! Still, my meminfo clocks in at 4304 bytes, but then again I have one more SEG type to handle.

Also thanks for the hint about the (new) compiler options for apps, TLVC didn't have either of those.

@ghaerr commented Nov 20, 2024

Thanks for the new version of tiny_printf @ghaerr. Getting it to compile opened a can of worms, but now we're here!!

You're welcome. You likely ran into a bunch of reorganization required when adding OWC as an alternative toolchain. I tried to compartmentalize most of it, but there were lots of changes...

my meminfo clocks in at 4304 bytes, but then again I have one more SEG type to handle.

That alone shouldn't account for the ~280 byte increase in size; I would suspect it's the C library differences between systems at this point. I've been a slave to tiny executables since our distribution images are almost all full. I am happy to report that as of very recently, our 1440k uncompressed version has 14 blocks free and is loaded with good stuff. There's still no way I've found to get even the compressed man pages onto the boot floppy though. Perhaps the time will come for a two-floppy distribution with games and man pages on the second floppy.

There's some exciting stuff going on: @rafael2k got what seems to be a full-featured C compiler running on ELKS just today in ghaerr/elks#1443 (comment), and a NASM port running yesterday. Both of these are taking advantage of the much-needed large model Open Watcom C / NE executable (OS/2) format that ELKS now supports. The magic that seems to be working is an experimental memory allocator that uses the near heap for small allocations, and the main memory heap for large allocations. Since the application is running in large model, the pointer size remains the same - 32 bit. OWC supports "huge" pointers, and I've just figured out how they work. That will soon allow for pointer blocks up to 384k+ of memory, should that much contiguous memory be found.

@Mellvik (Owner, Author) commented Nov 20, 2024

You likely ran into a bunch of reorganization required when adding OWC as an alternative toolchain. I tried to compartmentalize most of it, but there were lots of changes...

Tell me about it! My upcoming 'header file synchronization' PR has more than 20 files... :-)

my meminfo clocks in at 4304 bytes, but then again I have one more SEG type to handle.

That alone shouldn't account for the ~280 byte increase in size; I would suspect it's the C library differences between systems at this point. I've been a slave to tiny executables since our distribution images are almost all full.

Sounds like a libc update should be on my list too!

There's some exciting stuff going on: @rafael2k got what seems to be a full-featured C compiler running on ELKS just today in ghaerr/elks#1443 (comment), and a NASM port running yesterday. Both of these are taking advantage of the much-needed large model Open Watcom C / NE executable (OS/2) format that ELKS now supports. The magic that seems to be working is an experimental memory allocator that uses the near heap for small allocations, and the main memory heap for large allocations. Since the application is running in large model, the pointer size remains the same - 32 bit. OWC supports "huge" pointers, and I've just figured out how they work. That will soon allow for pointer blocks up to 384k+ of memory, should that much contiguous memory be found.

I've been following this one (apparently I made a comment early in that thread which put me on the mailing list). Fascinating indeed, and admittedly I didn't give it much of a chance. Happy to be proven wrong! I'm still not convinced that having a native compiler suite is practically useful, but as an enticing project (and proving skeptics like me wrong), it's undoubtedly great. The energy going into it right now is incredible!

I keep coming back to Venix86 when these messages fly by. K&R of course, V7-style make, but working well on very 'thin' systems. I just looked at them (sizes) for fun. Like, cpp is 18022 bytes, as is 34752, etc. If I needed something to do, I'd disassemble the key components: as, c0 and copt, possibly ld. cc, cpp and make should be plain V7. For another life.

Thank you.

@Mellvik (Owner, Author) commented Nov 21, 2024

I'm a little worried about the correctness of the multiple heap_add and seg_add calls in their respective allocators. ....

Thank you for bringing up this issue, @ghaerr ! I've been buried in other updates, but don't let this one slip. I'll be back with some ponderings and tests shortly. It's complicated - not the code but creating robustness 'in retrospect'.

@Mellvik (Owner, Author) commented Nov 22, 2024

@ghaerr, I've put some thought into (and tested) the heap allocation mechanisms since you brought up the issue. There are indeed some interesting scenarios, as you point out. Such as whether a deallocation will attempt to merge disjunct blocks of a multi-block heap.

It will not. When attempting to merge a newly freed block, the code will search the free list for a match, above and below. So the 'upper' end of a lower arena will not merge with the lower end of the next arena up, since there is never going to be a match. This goes both ways, up and down, and additional arenas added above even the main one will work just as well as those added below. IOW, adding a multi-block (sorry, multi-arena) heap does not introduce new risks in that regard.

An entirely different question is what happens if heap_free is called on a random address. There is currently no sanity check of the pointer in the _free call. It is extremely unlikely (but not impossible) that such a call may cause a bogus block to be merged with an existing free block. OTOH the block WILL be added to the free list, and thus create havoc when some activity reuses it. And even before that, since _free updates pointers in the bogus block.

Whether this is a problem or not is debatable. These calls are kernel-internal only, and it would be assumed that a kernel developer will/should catch and fix bugs like that. OTOH the type of symptoms a bug like this causes may be really hard to diagnose. Also, a reasonable fix shouldn't be all that hard or complicated. Like (say) adding a 5-element array to hold the start and end of each arena, which _free can use to at least keep the damage inside the heap itself. An improvement? Just a little. A damaged heap is as bad as damaged memory anywhere. [I went down this alley first because I've been looking for an easy way to get meminfo -m to display the arenas with start and end addresses, without having to traverse the entire list every time ...]

A more reliable (and costly) alternative is to traverse the list of INUSE blocks, looking for a match before actually freeing it. And panic if it's bad. The cost is a search through half the list (average) for every _free. Something like this (which works):

...
        heap_s *h = ((heap_s *)(data)) - 1;  // back to header

        // Free block will be inserted to free list:
        //   - tail if merged to previous or next free block
        //   - head if still alone to increase 'exact hit'
        //     chance on next allocation of same size

        list_s *i = &_heap_free;
        list_s *n = _heap_all.prev;
        heap_s *this;
        debug_heap("free 0x%x/%u; ", h, h->size);

        /* Sanity check */
        while (n != &_heap_all) {
                this = structof(n, heap_s, all);
                //printk("this+1 %x data %x tag %x\n", this+1, data, this->tag);
                if ((this->tag & HEAP_TAG_USED) && data == (this+1))
                        break;
                n = n->prev;    /* going backwards is slightly faster */
        }
        if (n == &_heap_all)
                panic("bogus heap_free");

        // Try to merge with previous block if free
...

Then of course there is main memory allocation, I haven't looked into that.

@Mellvik merged commit 9b184bb into master Nov 22, 2024
@ghaerr commented Nov 22, 2024

whether a deallocation will attempt to merge disjunct blocks of a multi-block heap.
It will not.
the code will search the free list for a match

Thank you very much for thinking about this and drawing your conclusion, which I now understand and agree with. In retrospect it seems simple (of course): the big difference between the heap routines and my referenced V7 allocator is that the kernel heap routines keep an explicit free list, whereas the V7 allocator keeps only a single list and identifies the in-use or free blocks within an arena using bit 0 in the list pointer.

The explicit free list guarantees no "outside" memory not having ever been added to the heap will ever get merged. Thank you. Both the near and main memory heaps use the same free-list implementation, so it looks like we're safe adding new kernel data segment arenas (actually, free-list entries) and main memory (UMB, bootopts) in any order. Nice.

An entirely different question is what happens if a heap_free is called on a random address.

Another good analysis, thank you. I would say that there's not much usefulness in always checking a free pointer; if we're going to do that, we might as well start checking all pointers. Of course, the kernel always checks any user-mode passed pointer. That said, executing your suggested checking code on a debug kernel when CONFIG_TRACE and a new CHECK_ALLOC is set would add another nice integrity check during development. Having this sort of integrity check in application malloc/free would go a very long way, however. I'm working on a debug allocator now and will look into adding this kind of integrity check to it.

Thank you!

@ghaerr commented Nov 22, 2024

to get meminfo -m to display the arenas with start and end addresses, without having to traverse the entire list every time

I would say that given the infrequency of running meminfo, pre-traversing the list would be a small price to pay if you're trying to show the arena size (on the previously blank lines)? Lol, a bigger problem might be how to show the size of the first (main) arena without using another valuable line of screen space scrolling off the screen...

Then of course there is main memory allocation

The main memory allocator's fundamental algorithm looks the same as the near heap allocator, except that the main memory allocator actually calls the near heap allocator also for every allocation, for the header. This is so it doesn't have to use far pointers to access the header, which would then cascade the need for far pointers throughout the kernel when accessing the information about any particular memory allocation.

@Mellvik (Owner, Author) commented Nov 22, 2024

An entirely different question is what happens if a heap_free is called on a random address.

Another good analysis, thank you. I would say that there's not much usefulness in always checking a free pointer; if we're going to do that, we might as well start checking all pointers.

I'm not following you. Aside from heap_add and seg_add, which we're not discussing in this context, the pointer submitted to heap_free is the one case where the heap subsystem may be screwed up from 'the outside'. Of course, and as pointed out in the analysis, everything here is kernel-internal, and I agree that such a mechanism is for debugging much more than 'regular use'. How this is connected with 'might as well start checking all pointers', I don't get.

.... executing your suggested checking code on a debug kernel when CONFIG_TRACE and a new CHECK_ALLOC is set would add another nice integrity check during development. Having this sort of integrity check in application malloc/free would go a very long way however. I'm working on a debug allocator now and will look into adding this kind of integrity check to it.

I'm losing you again here. We're discussing the kernel internal heap subsystem, not memory management for user processes, right?

Anyway, I'll make the added code conditional with a comment for now, to be included into debugging infrastructure later.

Thanks.

@Mellvik (Owner, Author) commented Nov 22, 2024

to get meminfo -m to display the arenas with start and end addresses, without having to traverse the entire list every time

I would say that given the infrequency of running meminfo, pre-traversing the list would be a small price to pay if you're trying to show the arena size (on the previously blank lines)?

Hard to disagree with that, and that's what I have right now, but it turned ugly, so I'm ripping it out and taking a different angle. Sometimes trying too hard to get more functionality out of the code already in place gets really messy.

Lol, a bigger problem might be how to show the size of the first (main) arena without using another valuable line of screen space scrolling off the screen...

So true, and there may be no truly elegant solution to that. One alternative I've had on my mind is to add a column to the display containing the Arena #, then at the bottom, before the final total memory display, print a line with start and end addresses per Arena, probably size too. Just one line. The next alternative on the list is to keep the display as is, but extend the first data line per Arena with a '[3456-FFFF]' indicating start and end. The third alternative is to forget about it and add an option (like -H) to list the heap resources - start, end, size, maybe current free/in-use percentages per Arena.

Will see when I get there.

Then of course there is main memory allocation

The main memory allocator's fundamental algorithm looks the same as the near heap allocator's, except that the main memory allocator also calls the near heap allocator for every allocation, to hold the header. This is so it doesn't have to use far pointers to access the header, which would otherwise cascade the need for far pointers throughout the kernel when accessing the information about any particular memory allocation.

It's a good (and robust) solution I think, and given the frequency of heap_alloc/heap_free's, keeping the # of checkpoints low is very important, for XT class systems in particular.

@ghaerr

ghaerr commented Nov 22, 2024

We're discussing the kernel internal heap subsystem, not memory management for user processes, right?

I was discussing the two kernel heap managers - the near heap manager, used for managing the kernel data segment heap and initialized with heap_add, and the "main memory" heap manager, used for managing the memory outside the kernel data segment and initialized with seg_add. In my previous comments I was contrasting these two algorithms with the V7 malloc memory manager (used only in application programs); that was probably confusing. I suppose I was rambling too much, comparing the ways that different memory management algorithms manage their arena(s).

how this connects with 'might as well check all pointers', I don't get.

Given that any pointer returned from heap_alloc is just "another pointer" used in the kernel, I don't understand how that particular pointer would be subject to overwriting any more than any other pointer passed around in the kernel. That is, the kernel doesn't validate any of its own pointers.

the pointer submitted to heap_free is the one case where the heap subsystem may be screwed up from 'the outside'.

Now I'm losing you: you mean that there is something special about a heap pointer versus any other pointer in the kernel data segment? There isn't really. The kernel doesn't validate (e.g. check a pointer for valid range within kernel data segment) any pointers used internally - so why start with just validating a to-be-freed pointer? What I'm getting at here is that if we really wanted to be pedantic and check pointers, every pointer passed to every kernel function should be passed to, say, validate(ptr) for range checking, that's all.

Of course, the biggest problem on 8086 real mode, which you've already pointed out, is by the time a pointer is found to be invalid, it has likely damaged memory elsewhere.
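A validate(ptr) range check of the sort mentioned above could look like the following. This is a minimal sketch under stated assumptions: the names (heap_init, validate, demo_arena) are illustrative, and it models a single contiguous arena, whereas the real heap holds multiple non-contiguous arenas and would walk a list instead.

```c
#include <assert.h>
#include <stddef.h>

static char *heap_start, *heap_end;     /* bounds of the (single) arena */

void heap_init(char *start, size_t len)
{
    heap_start = start;
    heap_end = start + len;
}

/* 1 = plausible heap pointer, 0 = definitely outside the arena */
int validate(void *ptr)
{
    char *p = (char *)ptr;
    return p >= heap_start && p < heap_end;
}

static char demo_arena[256];            /* stand-in for the kernel heap area */
```

Note this only proves a pointer lies inside the arena, not that it points at a valid allocation header - which is why the canary idea below is a stronger (and cheaper) check.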

Another thought (in addition to validating allocation pointers) would be to add a "canary" word (or byte) in the heap header. Since the heap is comprised of a linked list of headers+alloc'd memory, we wouldn't need a separate canary at the end of the allocated memory, as is sometimes done in user-mode malloc overwrite instrumentation.
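The canary check is nearly free at heap_free time, since the header is adjacent to the returned pointer anyway. A hedged host-runnable sketch (struct layout and function names are illustrative, malloc stands in for the arena; 0xA5 is the value settled on later in this thread):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HEAP_CANARY 0xA5            /* value agreed on later in the thread */

struct hdr {
    size_t size;
    uint8_t tag;
    uint8_t canary;                 /* the spare byte in the heap header */
};

void *heap_alloc(size_t n)
{
    struct hdr *h = malloc(sizeof *h + n);  /* malloc stands in for the arena */
    if (!h)
        return NULL;
    h->size = n;
    h->tag = 0;
    h->canary = HEAP_CANARY;
    return h + 1;                   /* caller sees memory after the header */
}

/* returns 0 on success, -1 on corrupted/bogus header */
int heap_free(void *p)
{
    struct hdr *h = (struct hdr *)p - 1;
    if (h->canary != HEAP_CANARY)
        return -1;                  /* header overwritten: warn, don't free */
    free(h);
    return 0;
}
```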

@ghaerr

ghaerr commented Nov 22, 2024

add a "canary" word (or byte) in the heap header.

Hey, what do you know, I thought there might be a spare byte in the near heap header!:

struct heap {
        list_s all;
        list_s free;
        word_t size;
        byte_t tag;
        byte_t unused;
};

In reality, I'm not really sure how helpful this would be - I suppose about as much as checking the 'free' pointer. But when a system is not working, these kinds of things can come in real handy.

@Mellvik

Mellvik commented Nov 22, 2024

how this connected with 'might as well checking all pointers', I don't get.

Given that any pointer returned from heap_alloc is just "another pointer" used in the kernel, I don't understand how that particular pointer would be subject to overwriting any more than any other pointer passed around in the kernel. That is, the kernel doesn't validate any of its own pointers.

OK, I get it. Different angles, different perspectives. I'm compartmentalizing kernel 'services' and discussing from the perspective that each service has an API and is - for the purpose of the discussion - readable in itself. It's a stretch, I know. When viewing the heap code and the resource it manages as a functional unit (or a service), what comes out (the alloc pointer) is by definition (for the discussion) reliable, while what comes in from the outside (drivers in particular) is prone to dynamism - and bugs. That's why I've been focusing on the _free pointer as less reliable than whatever the heap subsystem returns when a service is requested.

@Mellvik

Mellvik commented Nov 22, 2024

add a "canary" word (or byte) in the heap header.

Hey, what do you know, I thought there might be a spare byte in the near heap header!:

struct heap {
        list_s all;
        list_s free;
        word_t size;
        byte_t tag;
        byte_t unused;
};

In reality, I'm not really sure how helpful this would be - I suppose about as much as checking the 'free' pointer. But when a system is not working, these kinds of things can come in real handy.

Completely agree, I like the idea a lot. Much easier than traversing a list every time, actually almost free. What do you suggest as the 'canary' content?

@ghaerr

ghaerr commented Nov 22, 2024

Well, certainly not FF or 00. 55 is used for these kinds of things and has alternating bits, like AA. (And yes, its use in a boot block comes to mind.)

@ghaerr

ghaerr commented Nov 22, 2024

And panic if it's bad.

However, I disagree with using panic (as you did once, when I initially used panic for bad /bootopts) - far better to issue an upper-case warning to the console than just stop, IMO.

@Mellvik

Mellvik commented Nov 22, 2024

Well, certainly not FF or 00. 55 is used for these kinds of things and has alternating bits, like AA. (And yes, its use in a boot block comes to mind.)

So A5 it is, then? :-)

@Mellvik

Mellvik commented Nov 22, 2024

And panic if it's bad.

However, I disagree with using panic (as you did once, when I initially used panic for bad /bootopts) - far better to issue an upper-case warning to the console than just stop, IMO.

I admittedly don't recall the discussion from back then, but the reasoning is that if the situation is 'impossible' or unpredictable, a panic is in order. In this case it's unpredictable, and continuing may crash the system at some point if not right away. IMHO it's then better to panic.

@Mellvik

Mellvik commented Nov 23, 2024

@ghaerr, on second thought I agree with you. The case does not qualify per the stated requirements for a panic. Corruption has been prevented, and the consequences will only come later (when the system runs out of heap) or never - depending on the size of the failed deallocation. By that time, there will (presumably) have been many warnings.

@Mellvik

Mellvik commented Nov 24, 2024

@ghaerr,
this being the thread in which we discuss meminfo, heap, segs etc., I'd like to discuss heap vs. segs in the context of adding available memory chunks after boot/startup.

What we currently have in both TLVC and ELKS is a combination - some freed up memory being added to the heap, some to the main memory pool (segs). Is there a meaningful reasoning behind the selection of which goes where? Size? Location? Something else?

It seems to me that adding small chunks of memory to the heap is more meaningful than adding them to segs, because seg allocations are typically large (> 1536 bytes), while small heap allocations are the rule rather than the exception, in particular after startup. As part of the heap, smaller additions become useful right away, sucking up small allocations so they don't fragment the main heap.
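The "gravitation" of small allocations into a newly added block can be modeled in a few lines. This is a hedged sketch, not TLVC's allocator: it assumes a best-fit policy over two free regions (names heap_pick, main_arena, added_block and all numbers are illustrative), under which small requests naturally land in the small added block, sparing the main arena from fragmentation.

```c
#include <assert.h>
#include <stddef.h>

struct region { size_t off, free; };    /* start offset + bytes still free */

static struct region main_arena  = { 0x4000, 0x8000 }; /* big main heap */
static struct region added_block = { 0x2C00, 0x0400 }; /* the 1K addition */

/* best fit over the two regions; returns the region chosen, or NULL */
struct region *heap_pick(size_t n)
{
    struct region *r[2] = { &main_arena, &added_block };
    struct region *best = NULL;
    for (int i = 0; i < 2; i++)
        if (r[i]->free >= n && (!best || r[i]->free < best->free))
            best = r[i];                /* smallest region that still fits */
    if (best)
        best->free -= n;
    return best;
}
```

Small requests keep hitting the 1K block until it fills up; anything it can't satisfy falls back to the main arena, matching the behavior described above.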

What I'm seeing is that the SEG addition from low memory (1536 bytes) might be more useful as heap space than as a seemingly permanently empty SEG. Or, if low memory were rearranged once again and - in a setting with no floppy cache - merged with the options block, a larger segment could be added to SEG space, becoming usable (and shrinking the heap, of course).

What's your take?

@ghaerr

ghaerr commented Nov 24, 2024

Is there a meaningful reasoning behind the selection of which goes where?

Yes - the near heap manager can only manage memory that's in the kernel data segment, while the main memory manager manages everything else. Obviously this depends on what and where exactly the kernel data segment is... read on.

adding small chunks of memory to the heap is more meaningful than adding it to segs

That would be nice, but that can't be made to work unless the data being "added" is already part of the kernel data segment: "heap_add" takes memory that is already in the kernel data segment and adds it to the free list. In normal practice, the kernel data segment is entirely 64k, but the heap itself is always less than that by the size of .data and .bss.

if low memory was rearranged once again and - in a setting with no floppy cache - merged with the options block

Rearranging low memory could open up some possibilities for a different main memory seg_add, but I can't think of any. Also, keep in mind there are two "option blocks": the low far memory one (OPT_SEG) and the one that's already in the kernel data segment, prior to the heap start (char options[1024]).

Main memory is laid out as : 1K interrupt vectors, BIOS data, low memory, kernel code, kernel INITSEG, kernel data seg, then free main memory.

Low memory is laid out as: far options buffer OPT_SEG, setup data REL_SEG, DMASEG, track cache segs.

The kernel data segment is laid out as .data, .bss, then free memory (i.e. only that memory following .bss that can be accessed through DS). Since the kernel data is arranged as the last far segment, the setup_arch function, called very early in kernel init, reserves the maximum kernel heap, which has to directly follow the kernel .data and .bss. That memory is added as the initial near heap with the first heap_add. The memory following the kernel data segment is then added with the first seg_add as the main memory block.
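The boot-time split described above can be modeled with plain numbers. This is a hedged sketch: the constants (kernel DS paragraph, end of .data+.bss, 640k top of memory) and the names setup_mem, heap_off etc. are illustrative stand-ins, not TLVC's setup_arch, which works from linker symbols.

```c
#include <assert.h>

#define KERNEL_DS    0x2000u    /* paragraph address of kernel data segment */
#define DATA_BSS_END 0x4000u    /* byte offset of end of .data + .bss */
#define DS_BYTES     0x10000u   /* a full 64k data segment */
#define MEM_TOP_PARA 0xA000u    /* 640k machine, in paragraphs */

static unsigned long heap_off, heap_size;   /* near heap, DS-relative */
static unsigned long main_seg, main_paras;  /* far main memory */

void setup_mem(void)
{
    /* near heap: everything in DS after .data and .bss -> first heap_add() */
    heap_off  = DATA_BSS_END;
    heap_size = DS_BYTES - DATA_BSS_END;

    /* main memory: everything after the kernel data segment -> first seg_add() */
    main_seg   = KERNEL_DS + DS_BYTES / 16;
    main_paras = MEM_TOP_PARA - main_seg;
}
```

With these example numbers, the near heap gets the 48k of DS left over after .data/.bss, and the main memory pool starts at the paragraph right after the 64k data segment.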

Hopefully this all makes sense - there isn't any latitude in giving a far memory block to the near kernel heap manager, unless the data is already addressable by DS. This was the key insight when I realized that we could trickily add another 1K to the kernel heap while I was searching for ways to save memory; when /bootopts was increased from 512 to 1024 bytes, the actual cost was 1K bytes - 512 from low memory and another 512 from the kernel .data segment. I thought that was too much and started thinking harder about how to get that memory back. Far pointers to OPT_SEG were considered, but the problem is that parse_options() modifies that data with NUL bytes, and the options data has to be parsed in two passes, so that wouldn't work.

We can't really move the low memory stuff to higher memory for two reasons: the first is that DMASEG has be to below 64k in order to work, and the second is that the bootstrap needs to have a fixed location for OPT_SEG and has no idea how big the kernel is.

There you have it! It's interesting coming to realize that all the C and ASM code in an operating system, at the end of the day, just deals with changing memory values in a master memory arrangement that has to be carefully thought out. IMO, teaching about operating systems should start with the low-level layout, then go to the code, rather than the reverse. It took me years to learn the ELKS memory layout, because it's not stated anywhere in the code (although it is documented in the documentation). We have the same issue understanding the layout of the newer 64-bit Linux kernels, where the precise layout of the (kernel and user) virtual address spaces needs to be known in order to understand how it works. I'm still working on that one.

@Mellvik

Mellvik commented Nov 24, 2024

Sorry for the confusion @ghaerr. I realized - just after posting - that in my mind I had erroneously placed DEF_OPTSEG inside the kernel DS. My bad!

OTOH - I'm glad I did, thanks for the rundown! If it's OK with you, I'll use parts of it in a wiki about the memory structure, badly needed (even if there is an old one), not least to refresh my own memory every once in a while.

Revisiting the heap code and gaining (finally) some insight into the main memory allocation has been really useful for getting a better read on the whole picture. I still hate the segmented architecture, but can now admit that it has its uses (and in some settings, advantages). BTW - part of the rabbit chase I entered into after the major header update was kind of interesting: init.h was not including config.h (i.e. CONFIG_FARTEXT_KERNEL was always undefined), so INITPROC was undefined without my knowing, and a driver (which one depended on configuration) would enter its init routine and then return to oblivion. Only when I removed INITPROC from the init routine and didn't get a compiler error did I finally smell the rabbit. [Thinking about it, it may be interesting to go back and reintroduce the problem to figure out what's really happening, since there was no compiler/linker warning about whatever was wrong.]

Anyway, I'm creating a display in meminfo listing the heap blocks and the seg blocks (arenas) w/sizes. I've missed that when working on this stuff and it may be useful later.

Thanks.

@ghaerr

ghaerr commented Nov 24, 2024

I'll use parts of it in a wiki about the memory structure, badly needed (even if there is an old one)

That'd be great - and I might then just copy that back into the ELKS wiki as well. Thanks.

init.h was not including config.h (i.e. CONFIG_FARTEXT_KERNEL always undefined) so INITPROC was undefined without me knowing

I'm well aware of this one. Includes pulling in more includes has problems of its own, and I solved this by, in most cases, not including more includes unless necessary. What "necessary" means here is a set of stricter criteria. Admittedly, init.h used to be a borderline case; however, the ordering of kernel includes is not as easily solved as one might think. What I finally did was always include config.h as the first include in any kernel .c file that uses any CONFIG_ option, including those that include init.h. It is then explicit, as chained includes are not.
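The failure mode being avoided can be shown with a minimal stand-in (the macro and function names here are illustrative, not the actual TLVC/ELKS headers): an #ifdef on a CONFIG_ option never warns when the option is simply unseen - it silently takes the other branch, which is exactly how INITPROC ended up undefined.

```c
#include <assert.h>

/* stand-in for config.h - must be seen before any header testing CONFIG_*: */
#define CONFIG_FARTEXT_KERNEL 1

/* stand-in for what init.h derives from it: if config.h were not included
 * first, this #ifdef would silently fall through to "near" - no compiler
 * warning, just wrong code placement at run time. */
#ifdef CONFIG_FARTEXT_KERNEL
#define INITPROC_IS_FAR 1
#else
#define INITPROC_IS_FAR 0
#endif

int initproc_is_far(void) { return INITPROC_IS_FAR; }
```

Including config.h explicitly at the top of every .c file that tests a CONFIG_ option makes the dependency visible, instead of relying on a chain of includes that may quietly break.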

You might be thinking: what the heck, why not just include more includes in each include? The reason is that the C library has to include some kernel header files in order not to duplicate kernel data structures. Think sys/stat.h, although there are many more. There were two situations that got completely out of control: the first is that the C library header files themselves are exported out of ELKS for other developers to use when developing applications for ELKS. This ended up exporting kernel variables and internal types that caused compilations to fail. For instance, when Doom was ported, the kernel sector_t type ended up conflicting with Doom's own sector_t, used to track room sectors, not disk sectors. That's all fixed now, and no internal data types are leaked, except those in arch/types.h (not linuxmt/types.h).

The other problem was build times - due to includes including includes, the C library ended up actually including config.h, which meant that every time the kernel config changed, the entire C library had to be recompiled. While seeming no big deal, just waiting another 1-2 minutes every kernel compile, there was in fact no reason (and should be no reason) that the C library is dependent on the kernel configuration. Doing so would amount to expecting user applications to be relinked with libc every time the kernel config changed. (The C library build creates special "*.d" files as dependency markers to know exactly when each file needs to be recompiled, unlike the fixed kernel Makefile approach).

So - to untangle the mess - the decision was made not to include other includes in many header files, unless necessary based on a larger criteria list. After all the untangling, rather recently, init.h was enhanced to include config.h and types.h explicitly, to avoid the problem you ran into. This was because other developers also ran into the same problem, and got strange hard-to-debug behavior as a result. (BTW, all of this took months to figure out and I'm still untangling things).

@ghaerr

ghaerr commented Nov 24, 2024

it may be interesting to go back and reintroduce the problem to figure out what's really happening since there was no compiler/linker warning about whatever was wrong.

The problem is, with our current toolchain, the linker isn't seeing that in two different .o files, one function is declared extern and defaults to near .text, while in the other it is public and associated with .fartext. I suspect this has to do with the way the toolchain has been enhanced to actually work with far addresses in the first place: you may have seen that for any symbol f, there's an f! and f&. These are used in tricky fashion to learn the segment to be used for f according to the linker scripts (i.e. elks-small.ld). The symbol table in the .o file itself doesn't flag any symbol as near or far, but the compiler of course emits near or far calls/returns based on the definition seen. All of this is, in turn, because the ELF format doesn't support segments - it was designed for a flat address space - so there are no ELF bits defined for near vs far. (BTW, this is why I have disallowed "extern func" declarations scattered randomly throughout the source: specifically to prohibit this problem and force the use of a single header file with a single declaration.)

Since the toolchain itself doesn't flag symbols as near or far - that's accomplished only through linker script magic - I am guessing that when a symbol is finally being resolved, there isn't any code re-checking the input .o file to see whether the incoming code segment names differ. This may be because for far data (currently only seen as pointers to far functions, as far data itself isn't supported), there is no .data/.fardata to compare against, and it would fail in the same way.

There is an old saying that one never fully understands a system until one has studied the linker! In another issue at ELKS, we're porting the C86 compiler which produces ASM output, which we're assembling with NASM to produce dev86 AS86 .o output, only to be linked with dev86 LD86. Getting that working is reminding me of the tons of things we take for granted in a toolchain, lots of things that all have to happen in order for a memory layout to be produced with all the pointers, functions, segments and relocations correct in order to actually run on a CPU.

@Mellvik

Mellvik commented Nov 26, 2024

Again thanks for the detailed rundowns, @ghaerr. Very valuable.

The include-file cleanup turned out to be much more pervasive than I had anticipated - with far-reaching consequences into the code files, which meant the cleaning operation became a project in itself. A royal pain - and very valuable. Compared to the months you spent on the overall cleaning - most of which I pulled over - that effort was minimal (and also an educational experience in parts of the system I've never set foot in).

I hadn't thought about the including includes affecting compile time, but of course it does. I guess I'm spoiled in that department, with the M1 portable until recently and now the M4 mini. Kernel compile after kclean is a mere 16s, system compile after clean is 61s. Recommended.

@Mellvik

Mellvik commented Nov 26, 2024

There is an old saying that one never fully understands a system until one has studied the linker!

Having followed your discussions with @tkchia over the years on compiler and linker issues, I can appreciate this one. And accept my own limitations in that regard. In fact, at one point I decided not to try to expand my circle of interest to the toolchain, as there was more than enough to chew on already. That said, and having done embedded systems in the past, the conceptual understanding is there - more than enough to appreciate the results of the trickery around memory models, far pointers, thunking etc. And again, your rundown increased that understanding another notch.

So the way I understand it, there may be (error) situations, undetected by the tools, in which code gets built that will 'return to forever' (apologies to Chick Corea) if called in a certain way. That will be enough for now. After all, it was a tool error (or rather the absence of one) that revealed the problem in this case too.

I was not in favour of changing tool suites back in the day, as I really liked the simplicity of the old one. As is often the case, now that we're here and - among so many other things - can do things like placing the run-once setup code in its own segment, to be released when done, it's great. Having the expertise available to take advantage of it, equally so.

Thanks again for the rundown.
