Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic scaling & Graph meter coloring (new) #714

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

Explorer09
Copy link
Contributor

@Explorer09 Explorer09 commented Jul 29, 2021

This pull request implements dynamic scaling support and a "stacked" color for Graph meters. In addition, the Bar meter drawing routine is slightly improved for dynamically expanding the "total" value when needed.

This code would replace the current, one-color graph drawing algorithm so that all Graph meters are drawn with colors.

The data structures and algorithms for the color graphs and dynamically scaling are brand new, totally reworked from #129 and the colors are intended to represent the meter data as close as possible. It still uses braille dots to display the graph, but they now have finer details (I think).

Screenshot


Caveats

  • The graph now shows only half the number of records after this PR. The old code used to show two records per terminal column (utilizing the two columns of a braille character), but due to coloring in terminal display, I have to change it to one record per terminal column, both in ASCII and Unicode terminal display.
  • The new code no longer keeps the graph "values" in the C double (floating point) data type. It now stores the precomputed colors and braille dots instead, and the data are specific to a graph height setting.
  • The "GRAPH_1" and "GRAPH_2" colors defined in CRT.h would become unused.

Technical details

  • The main method of determining the colors of all cells (terminal characters) is the largest remainder method. It ensures the color cells chosen would be representative to the original values of a meter. (Just like how it is used in electing representatives in an electoral system)
  • When dynamic scaling and color graph are combined, the graph color computing routine would perform multiple times to render the colors in multiple scales. The "color cells in multiple scales" are then stored in an array with an indexing method mimicking the "in-order traversal" of a complete binary tree. This allows at most 2 × "graph height" cells to be used for storing colors.

The following are potential features that can be implemented after this pull request, but I doubt if they are good ideas or if I have the ability to implement them:

  • A new color for the scale unit text (%, 32K, 16M, etc.), separate from the caption color. I was considering making the text blue to separate it from the cyan text of the caption.
  • Graph height adjustment at runtime. At least allow the htoprc file to specify the graph height. The settings UI would be the next step to implement.
  • A "colorblind" display of the graph, for the "monochrome" color scheme. The bar mode characters (|#*@$%&) can be reused for this.

@BenBE BenBE added the new feature Completely new feature label Jul 29, 2021
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick note:

Please split the new functions for PrevPowerOfTwo and PopCount into their own commit. Also try to utilize configure.ac to check for them and provide static inline implementations directly in the header (XUtils.h or Macros.h both should work).

There's an example how to check for specific compiler features in configure.ac to determine whether certain warning flags are supported. You should be able to work based on that example there.

Meter.c Outdated Show resolved Hide resolved
@BenBE
Copy link
Member

BenBE commented Jul 29, 2021

Having the caption above the unit makes more sense IMHO.

@Explorer09
Copy link
Contributor Author

  1. I've done moving the caption to above the unit. It's an easy change.
  2. The code for checking __builtin_clz in configure is also done.
AC_MSG_CHECKING(for __builtin_clz)
AC_COMPILE_IFELSE([
   AC_LANG_PROGRAM([], [__builtin_clz(1); /* Requires gcc 3.4 or later */])],
   [AC_DEFINE([HAVE_BUILTIN_CLZ], 1, [Define to 1 if the compiler supports `__builtin_clz' function.])
   AC_MSG_RESULT(yes)],
   AC_MSG_RESULT(no))
  1. I think I can move the prevPowerOf2 function to XUtils.c, but I don't think it's good idea to make it static inline as the code looks non-trivial.
  2. I am reserved on using builtin_popcount. The reason is, for x86 family, POPCNT requires SSE 4.2 (instruction set) or -msse4.2 compiler flag to be efficient. If POPCNT is not available, gcc emits a libgcc call. My PopCount8 implementation is optimized (for 8-bit input specifically) and is smaller than libgcc's popcount. Do you suggest me to check for processor instruction instead of just check the compiler's popcount builtin?

@BenBE
Copy link
Member

BenBE commented Jul 29, 2021

  1. I've done moving the caption to above the unit. It's an easy change.

k. :)

  1. I think I can move the prevPowerOf2 function to XUtils.c, but I don't think it's good idea to make it static inline as the code looks non-trivial.

As inline typically just a hint to the compiler it can still decide to ignore it. Also most recent optimizers do code folding anyway, thus would remove the parts imported from different code units upon linking. Also there's some more magic going on with linkers nowadays, thus I doubt this is a real issue with recent compilers. Compilers can still opt to ignore the inline anytime and will happily do if the code becomes too complex or too wasteful when inlining (unless forced to).

  1. I am reserved on using builtin_popcount. The reason is, for x86 family, POPCNT requires SSE 4.2 (instruction set) or -msse4.2 compiler flag to be efficient. If POPCNT is not available, gcc emits a libgcc call. My PopCount8 implementation is optimized (for 8-bit input specifically) and is smaller than libgcc's popcount. Do you suggest me to check for processor instruction instead of just check the compiler's popcount builtin?

Similar argument with popcount:
Link to godbolt

For any reasonable recent compiler the libgcc function will likely be optimized enough (and likely even use SSE2 when the system has it available), thus only the first call to such a function will receive a slowdown. Also most architectures by now have a POPCNT instruction available, which likely results in that instruction used on corresponding optimization levels. It may even be, that on higher optimization levels the compiler will replace your implementation by that instruction anyway. Better to convey the intention to the compiler than have it figure things out via pattern matching IMHO.

NB: Modern compilers are usually better in optimizing your code, than you are. Better optimize the algorithm by replacing it with a better one than trying to be smart …

@Explorer09
Copy link
Contributor Author

For any reasonable recent compiler the libgcc function will likely be optimized enough (and likely even use SSE2 when the system has it available), thus only the first call to such a function will receive a slowdown. Also most architectures by now have a POPCNT instruction available, which likely results in that instruction used on corresponding optimization levels. It may even be, that on higher optimization levels the compiler will replace your implementation by that instruction anyway. Better to convey the intention to the compiler than have it figure things out via pattern matching IMHO.

NB: Modern compilers are usually better in optimizing your code, than you are. Better optimize the algorithm by replacing it with a better one than trying to be smart …

If I have not read the disassembly of __popcountdi2 I would not have made the argument in the first place. The problem is gcc doesn't provide the "popcount 8 bit" for me, so I implement one by hand. (The usual "popcount" algorithm works great for 32, 64 and 128 bits but not optimal for 8 bits.) And I really don't like the call to the sub-optimal __popcountdi2 anyway.

@Explorer09
Copy link
Contributor Author

Explorer09 commented Jul 29, 2021

Update: I figured out what to do with the popCount8 problem. Commit amended and rebased. See XUtils.h in commit b160428 to see what I mean.
(Update (v3): 28b0a60)

@BenBE
Copy link
Member

BenBE commented Jul 29, 2021

Did some tests with the prevPowerOfTwo function on godbolt:
Depending on your compiler and architecture and settings any of the three variants at times produces the shortest assembly with my tweaked variant being faster on x86, but slower on ARM, except when CLZ is actually used on ARM, where it depends heavily on the compiler version used.

Similar effects are likely visible with the popcount implementation too.

@Explorer09
Copy link
Contributor Author

Did some tests with the prevPowerOfTwo function on godbolt:
Depending on your compiler and architecture and settings any of the three variants at times produces the shortest assembly with my tweaked variant being faster on x86, but slower on ARM, except when CLZ is actually used on ARM, where it depends heavily on the compiler version used.

prevPowerOf2(0) should (in theory) output 0, not 1.
I've experimented with what you did before, but your code is not equivalent to mine, and you didn't assert(x > 0).

@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 2 times, most recently from 14e63c2 to e0b6f94 Compare July 30, 2021 10:16
XUtils.c Outdated Show resolved Hide resolved
XUtils.h Outdated Show resolved Hide resolved
XUtils.h Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
@BenBE
Copy link
Member

BenBE commented Jul 30, 2021

@Explorer09 Why write everything onto screen directly instead of using RichString as buffers? Advantage would be that you could abuse double-buffering there to shift the graph by two units (data points), thus with two sets of RichStrings you could just shift everything by one character and just draw the last cell …

@Explorer09
Copy link
Contributor Author

@BenBE

Why write everything onto screen directly instead of using RichString as buffers? Advantage would be that you could abuse double-buffering there to shift the graph by two units (data points), thus with two sets of RichStrings you could just shift everything by one character and just draw the last cell

It would break dynamic scaling. Look at Load average meter for example, the whole graph needs to be redrawn when the scale (or unit) changes.

@BenBE
Copy link
Member

BenBE commented Jul 30, 2021

Sure, but those situations can be tracked/detected …

@Explorer09
Copy link
Contributor Author

@BenBE The second reason for not using a RichString buffer is the use may change color scheme of htop at runtime, and the RichString buffer is not designed to reflect color scheme changes. Detecting when to clean the buffers like the cases above is too much work, and so I consider it doesn't worth the trouble.

@BenBE
Copy link
Member

BenBE commented Jul 30, 2021

On the other hand, re-using existing code makes maintaining the code easier once it got merged. And in the current state I'm not too convinced that these 650 LOC of new code are in a state that makes maintaining them easy enough to justify the complexity in this code. Just skimming over I saw several routines, that are justifiably quite long. And I also understand you are hesitant to refactor major parts of that code after that much effort that went in.

I had some PRs of mine that needed refactoring and bug fixing for about half a year until I could finally merge them. And even then there were small issues that needed to be addressed. So given the amount of code in this PR it should be one priority to ease maintaining the code and another one to have it working. Even if this sound's like I'm a killjoy here both are requirements to proceed with merging (this) new code.

@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 4 times, most recently from cd46429 to 3485b0d Compare August 2, 2021 03:00
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
@eworm-de
Copy link
Contributor

eworm-de commented Aug 2, 2021

First of all: Thanks a lot for picking up on this again!

The new code works and it gives even more precision.
On the other hand it looks strange in some situations... Not sure I like it that way. After all the meters are just to give a good overview, the latest bit of precision is not needed IMHO.

Either way I am happy if the makes its way into master.

@BenBE BenBE added the needs-rebase Pull request needs to be rebased and conflicts to be resolved label Aug 3, 2021
@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 2 times, most recently from 0760ab7 to 24bd56e Compare August 3, 2021 15:36
@lgtm-com
Copy link

lgtm-com bot commented Aug 3, 2021

This pull request introduces 2 alerts when merging 24bd56e into ed82ce6 - view on LGTM.com

new alerts:

  • 1 for Multiplication result converted to larger type
  • 1 for Comparison result is always the same

@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 2 times, most recently from aa10e7c to 17e8f97 Compare August 6, 2021 03:54
@eworm-de
Copy link
Contributor

Rebased this on current main... Not sure about all the details, please double check if this is ok.

@Explorer09
Copy link
Contributor Author

@eworm-de Have you tried building the code of this branch with clang-19 and would there be additional errors not covered by #1513?

Clang 19 has -Wshorten-64-to-32 warning enabled by default, which might produce warning lines for some implicit value casts.

@eworm-de
Copy link
Contributor

I tested with GCC only. Will have to take a look at clang when I have some spare time...

@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 2 times, most recently from 9cb1c10 to cee45a7 Compare August 25, 2024 21:50
@Explorer09
Copy link
Contributor Author

@eworm-de I took my time to revise this patchset.
Notable changes:

  • I implemented graph dynamic scaling on the older/original GraphData structure. This way the graph dynamic scaling feature can be pulled separately from the graph meter coloring (the latter has GraphData structure rework and a complex algorithm for computing colors).
  • Fix a potential devision by zero case when a meter's total value can be zero.
  • Added Meter.curItems > 0 checks because sumPositiveValues no longer accepts a null pointer for the Meter.values member.

@BenBE
Copy link
Member

BenBE commented Sep 27, 2024

This seems to accidentally introduce patches from #1533

@Explorer09
Copy link
Contributor Author

This seems to accidentally introduce patches from #1533

I rebased this branch on #1533 so that I don't need to resolve merge conflicts later.

@Explorer09 Explorer09 force-pushed the graph-color-new-4 branch 2 times, most recently from 240f9e6 to 76b6f29 Compare December 16, 2024 09:53
This suppresses a "-Wshorten-64-to-32" warning in Clang 19.
This property distinguishes meters that have a (relatively) fixed
"total" value and meters that do not have a definite maximum value.
The former meters would be drawn as a "100% stacked bar" or "graph", and
the latter could have their "total" values updated automatically in bar
meter mode.

This commit is a prerequisite for the feature "Graph meter dynamic
scaling and percent graph drawing".

Signed-off-by: Kang-Che Sung <[email protected]>
If "isPercentChart" of a meter is false, update its "total" value
automatically in the bar meter mode. The "total" value is capped to
DBL_MAX in order to ensure the division never produces NaN.

The newly introduced Meter_computeSum() function will be reused by the
feature "Graph meter dynamic scaling and percent graph drawing".

Signed-off-by: Kang-Che Sung <[email protected]>
Implement dynamic scaling for Graph meter mode, and separate it from
"100% graph" drawing. This is controlled by the "isPercentChart"
property of a MeterClass.

If "isPercentChart" is true, the meter would be drawn as a
"100% graph". Graph meters now expect the "total" value may change, and
the newly changed "total" value no longer affects the percent graph
drawings of earlier meter values.

If "isPercentChart" is false, the meter would be drawn with a dynamic
scale.

Signed-off-by: Kang-Che Sung <[email protected]>
Round the graph meter's dynamic scale to a power of two and print the
graph scale. For a percent graph, a "%" character is printed in place
of the scale.

Signed-off-by: Kang-Che Sung <[email protected]>
This is a code quality change that avoids dependency on the hard-coded
GRAPH_HEIGHT in GraphMeterMode_draw().

This doesn't enable variable graph heights per meter, but it makes room
for implementing such feature.

Signed-off-by: Kang-Che Sung <[email protected]>
This is a prerequisite for the feature "Graph meter coloring (with
GraphData structure rework)".

powerOf2Floor() will utilize __builtin_clz() or stdc_bit_floor_ui()
(__builtin_clz() is preferred) if either is supported.

popCount8() will utilize ARM NEON instructions and x86 POPCNT
instruction if the machine supports either of them.

I am not adopting the C23 standard interface stdc_count_ones_uc() yet,
as I am not sure C libraries would implement it as fast as our version.

Signed-off-by: Kang-Che Sung <[email protected]>
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated
Comment on lines 860 to 879
double total;
if (isPercentChart) {
total = MAXIMUM(this->total, sum);
} else {
int scaleExp = 0;
(void)frexp(sum, &scaleExp);
if (scaleExp < 0) {
scaleExp = 0;
}
// In IEEE 754 binary64 (DBL_MAX_EXP == 1024, DBL_MAX_10_EXP == 308),
// "scaleExp" never overflows.
assert(DBL_MAX_10_EXP < 9864);
assert(scaleExp <= INT16_MAX);
valueStart[0].scaleExp = (int16_t)scaleExp;
total = ldexp(1.0, scaleExp);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have several issues with this block of code:

  1. The uninitialized variable declaration. It's not wrong, as both code paths eventually initialize it, but it is prone to error and thus I'd rather like to avoid it.
  2. Also a blank above couldn't hurt to separate this block.
  3. The use of magic numbers in the else block, which aren't even commonly encountered, is a bit concerning too.
  4. The else block may need a quick comment for what it tries to do. Good to have the description for the asserts, but that's just half of the story.
  5. Do you think we can use the true-branch to init total and overwrite it with the false-branch if necessary? The compiler should see this dead store and eliminate the initial assignment if necessary. This also avoids the potential for leaving total uninitialized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we can use the true-branch to init total and overwrite it with the false-branch if necessary? The compiler should see this dead store and eliminate the initial assignment if necessary. This also avoids the potential for leaving total uninitialized.

I like the idea and it looks like it can save one line of code.

The use of magic numbers in the else block, which aren't even commonly encountered, is a bit concerning too.

The assert is actually redundant if htop only supports IEEE 754 floating points. I have explained what the magic numbers do (DBL_MAX_EXP == 1024, DBL_MAX_10_EXP == 308), but I used the 9864 for a lax checking, which might have confused you. (2^32768 ≈ 10^9864)

I think I could remove that assert to reduce confusion. Or should I assert(DBL_MAX_10_EXP <= 308) instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the hint on 2^32768 ≈ 10^9864 the 9864 is hard to understand. And with the comment above, I'd probably prefer to use the assert(DBL_MAX_10_EXP <= 308) for the check.

Meter.c Outdated Show resolved Hide resolved
Comment on lines +1069 to +1087
// Convert GraphDataCell.c.details bit representation to Unicode braille
// dot ordering.
// (Bit0) a b (Bit3) From: h g f e d c b a (binary)
// (Bit1) c d (Bit4) | | | X X |
// (Bit2) e f (Bit5) | | | | \ / | |
// (Bit6) g h (Bit7) | | | | X | |
// To: 0x2800 + h g f d b e c a
// Braille Patterns [U+2800, U+28FF] in UTF-8: [E2 A0 80, E2 A3 BF]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagram should be somewhat more obvious in the code; eg. as a comment for the details field of the structure declaration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an idea of presenting the details values not as Unicode braille characters, but as Unicode "block octant" characters (U+1CD00 - U+1CDE5). So the "transformation to braille code point" diagram would go here.

I can add a comment to explain what the details field does in the structure definition, however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diagram as is is somewhat easy to follow. It only is a bit burried within the code.

Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Meter.c Outdated Show resolved Hide resolved
Rewrite the entire graph meter drawing code to support color graph
drawing in addition to the dynamic scaling (both can work together
because of the new GraphData structure design).

The colors of a graph are based on the percentages of item values of
the meter. The rounding differences of each terminal character are
addressed through the different numbers of braille dots.

Due to low resolution of the character terminal, the rasterized colors
may not look nice, but better than nothing. :)

The new GraphData structure design has two technical limitations:
* The height of a graph meter now has a limit of 8191 (terminal rows).
* If dynamic scaling is used, the "total" value or scale of a graph now
  has to align to a power of 2.

The code is designed with the anticipation that the graph height may
change at runtime. No UI or option has been implemented for that yet.

Signed-off-by: Kang-Che Sung <[email protected]>
Specifically 'PIXPERROW_*' and 'GraphMeterMode_dots*' constants.
They were commented out rather than removed in the previous commit (for
ease of code reviewing). Now this commit removes the constant defines
for good.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Pull request needs to be rebased and conflicts to be resolved new feature Completely new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants