Replies: 18 comments
-
I think for granularity, you probably want to just match with Regex (see -Wconf in scalac for an example). Maybe rather than line numbers as in the example you specify the full path including function name. Lines change a lot, function names less so. |
Beta Was this translation helpful? Give feedback.
-
In general I think that Rule + file name is by far good enough. Else the comment solution known from other linters, would both be easy to use and familiar. |
Beta Was this translation helpful? Give feedback.
-
I understand that you may not want to introduce a new tool, but this is literally what I designed Betterer to do. Here's the example ESLint test: https://github.com/phenomnomnominal/betterer/blob/v4/packages/eslint/src/eslint.ts For elm-review to work with Betterer it would just need a JS API (probably |
Beta Was this translation helpful? Give feedback.
-
Actually I like the approach of ignoring files and I think that it should support regex when listing which files should be ignored. Even TLDR: Internally we specify which modules we want to validate. In our project we took a different approach. The frontend of our project is elm and the codebase is evolving at the same time as our elm skills. So we try new ways of doing things every now and then and we don't have a lot of consistency. Recently we tried successfully something that we call RequestResponse pattern. After we felt confident with the new approach, we realised that ReviewConfig.elm: config : List Rule
config =
[ NoUnused.Variables.rule
, NoUnused.Dependencies.rule
, ...
, RequestResponseUpdatedFields.rule
[ "API1", "API2" ]
] Context of our Rule: type alias Context =
{ ...
, reviewEnabled: Bool
} Then we use But thinking about it, I think that the approach of having this feature in elm-review will be beneficial because you will know that that file/module was specifically ignored
|
Beta Was this translation helpful? Give feedback.
-
I totally get that! And I agree that it's probably not too hard to integrate with That said, there are some benefits to doing this in the tool rather than through a third-party tool.
I hope this makes sense and why I want to explore this space, which I think would be valuable for static analysis tools out there too. |
Beta Was this translation helpful? Give feedback.
-
You mean you like ignoring files indefinitely over suppressing the errors in the files? I am open to the idea of having a
For everything? That doesn't sound like a great idea to me 🤔 Is it because you're still wondering whether review rules are a good idea? If so, I think you'd be better served if you could suppress errors, because you would at least be notified in errors showing up in ignored files, and based on that you could decide whether to fully adopt or remove this rule. For the use-case of the |
Beta Was this translation helpful? Give feedback.
-
Oh sorry, I meant suppressing the error. 🙇♂️
Only for our custom rules that target specific modules |
Beta Was this translation helpful? Give feedback.
-
I'm not using Elm currently, but I can share my experience of adopting Haskell linter HLint and Haskell Static Analyser Stan on a large Haskell codebase. When running HLint initially, it produced 130K suggestions, and it was not possible to fix them automatically. What we did is probably similar to Betterer approach, though I haven't heard about Betterer before. We've implemented a snapshot file that stores serialized output of HLint — hints name, file, source position. This snapshot file represents ignored hints, and by default, everything is ignored. Hints found in a Haskell file are filtered by the snapshot file. If the file content changes (e.g. new lines are added), all hints below this line are triggered. Our CI requires to fix all triggered hints, so when you touch new files for the first time, you need to fix all hints in them. This approach allows us to fix problems incrementally, while we work in the most active files. This makes sense because if you don't touch some source code file for years, it probably works even with some warnings 🙂 To summarize, the approach contains several parts:
We have our own build tool to build our complicated project, so our linting commands support incremental linting out-of-the-box and are rerunning only for changed files. For Stan we took a similar approach, but Stan supports ignoring observations via config natively. Each observation found in code has a unique ~15-character length ID that depends on check, module name and source code position. You can specify in the config all ignored ids. So a similar approach is taken — generate all hints, ignore them and fix gradually. Stan also allows ignoring whole directories, which is indeed helpful for ignoring auto-generated code. Our projects dynamically generates 200 Haskell modules with 100K lines of code total, and ignoring them file by file is not convenient 🙂 |
Beta Was this translation helpful? Give feedback.
-
I would love to introduce elm-review to the main NoRedInk monorepo, so I'd like to provide some context:
Due to the size of the repo it is not feasible to introduce elm-review with a single PR, so my current plan was to introduce it gradually in various areas of the codebase by writing a custom script and providing this sort of configuration: {
"groups": [
{
"name": "Teach/Classes",
"folders": [
"src/Page/Teach/Classes",
"ui/tests/Page/Teach/Classes"
],
"ignore": [
{
"path": "src/Page/Teach/Classes/Routes.elm",
"rule": "NoUnused.Exports",
"match": "classDetailsUrl"
},
{
"path": "src/Page/Teach/Classes/Routes.elm",
"rule": "NoUnused.Exports",
"match": "studentDetailsUrl"
}
]
},
{
"name": "Curriculum",
"folders": [
"src/Page/Curriculum",
"src/Page/Admin/Curriculum/CsvImportTool",
"ui/tests/Page/Curriculum"
],
"ignore": [
{
"path": "src/Page/Curriculum/Update.elm",
"rule": "NoUnused.Variables",
"match": "curriculumAPI"
}
]
}
]
} Each group represents a list of folders which are meant to be analyzed as a single area in the codebase, with a list of errors that we want to ignore. My plan is to filter through the output of In this way I'd be able to add a CI step that verifies that elm-review passes for these areas of the codebase and we would be able to introduce it gradually. Due to the structure of our teams, when you are working on a feature it is very likely you are going to touch code in those folders. Most of our shared UI components are hosted at https://github.com/NoRedInk/noredink-ui so they are outside the monorepo. I was also thinking I should be able to hack together a script to integrate this in an editor: check if the current path is part of any of the folders, find the group that folder belongs to and then run elm-review. |
Beta Was this translation helpful? Give feedback.
-
Thank you for the info @chshersh!
Hmm, so you use the brittleness of this approach as a feature rather than a problem. That's interesting. I imagine that if you do not want to fix the issue right now anyway, you simply re-run the You mention you suppress those errors by source position. Do you do so by the exact position from start to end, or do you only take the start position or just the lines?
This is a bit tricky to do with
Yes, the same thing is true for Ignored files/directories means that we do not want Suppressed errors, which I'm proposing here, would mean that these issues should be fixed but have yet to be fixed. Currently, suppressing errors is done by ignoring files, but I find that having this distinction would help much with incrementally adopting a rule. It is hard to know whether a file is ignored temporarily or ignored for good reason when using the same API for both. @Arkham Let me know how your setup works. I think that you could do most of the same things by using the current |
Beta Was this translation helpful? Give feedback.
-
Yes, that's correct! 👍
We do it only by the starting position: line start and column. |
Beta Was this translation helpful? Give feedback.
-
You could show "3 errors, 15 suppressed" in the output, similar to how |
Beta Was this translation helpful? Give feedback.
-
Here are a few thoughts regarding granularity. In my opinion, having a way to specify a specific number of errors to expect in a specific file is going to become a crutch that leads to problems in people's codebases. It will lead to not trusting the tool, because you can have a new issue crop up without any warning, so it's only giving you a false sense of security there. My thinking is that it's more realistic to present it as rules being either on or off for a file (no in-between). Then you can focus on getting a particular file sorted out. Getting the number down and partially fixing a file seems like a bad approach to incremental adoption in my opinion. |
Beta Was this translation helpful? Give feedback.
-
I think it might be a good idea to separate out temporarily ignored rules for files into a generated file. Then just use Adding to Ignored$ elm-review --introduce-ignore-entries
Found 14 errors, 0 ignored problems
Would you like me to add this ignored entry?
Main.elm, InconsistentImports..., 2 errors
[Y/n]
Would you like me to add this ignored entry?
User.elm, InconsistentImports..., 12 errors
[Y/n]
elm-review-ignored.json has been updated.
Found 0 errors, 14 ignored problems in 2 ignore entries Addressing/Removing Ignored Entrieselm-review
Found 0 errors, 14 ignored problems in 2 ignore entries
Run `elm-review --show-ignored` for a breakdown
$ elm-review --address-ignored
Found 0 errors, 14 ignored problems in 2 ignore entries
Here are some good places to start (sorted by least problems to most).
Fix all issues in a file and then re-run `elm-review --address-ignored`.
Main.elm, InconsistentImports..., 2 errors
User.elm, InconsistentImports..., 12 errors
<etc.>
# After fixing the InconsistentImport errors in `Main.elm`
$ elm-review --address-ignored
An ignored file/rule is now obsolete. I've updated elm-review-ignored.json.
Found 0 errors, 12 ignored problems in 1 ignore entry
Here are some good places to start (sorted by least problems to most).
Fix all issues in a file and then re-run `elm-review --address-ignored`.
User.elm, InconsistentImports..., 12 errors SummaryI think that having this be generated has a few advantages:
Also, I think having the tool give you suggestions for "low hanging fruit" by showing ignored items with the fewest problems at the top would be nice. It might be useful to have some extra CLI options for filtering out to only see ignored issues for specific rules (or hide for certain rules). |
Beta Was this translation helpful? Give feedback.
-
GranularityThanks for the comments @dillonkearns 😊
I am not sure what you mean with what you mean with "going to become a crutch" (I might not understand the expression). I don't see how new errors can crop up without warning. If the configuration says that there are 5 suppressed errors in a file X for rule Y, and the
I would say with this approach (which is similar to the one you currently have to take with the ignored files), it will be easier for new errors to crop up. When you add rule Y, you notice there are 20 reported errors in file X and you ask to suppress the file. Some time later you want to tackle the errors and you notice there are now 30. The same problem happens if you start fixing them but can't them all in a single session. Also, keep in mind that since I can imagine something like the migration from Generation
Yeah, I have been leaning towards this too recently. Generate it into a different file, and then either
module ReviewConfig exposing (config)
import SuppressedErrors exposing (suppressedErrors) -- new addition
config =
[ RuleA.rule
, RuleB.rule
]
|> Rule.suppressErrors suppressedErrors -- new addition And the generated file: module SuppressedErrors exposing (suppressedErrors)
suppressedErrors =
[ { ruleName = "RuleA", path = "src/A.elm", nbErrors = 1 } -- or the variant with the locations of the errors. We could have the data be in a different format/data structure too, not sure yet.
]
My current thoughts are to make the generation file to be all or nothing. I'll think about having an interactive mode, but I am not entirely sold yet. I intend to make the generated file readable and editable by the user (one reason why I'm thinking of having it be in Elm) so that they can go and set targets themselves for their refactor, like manually remove the suppressed errors for a file. I think that there will be two flags for the generation: one to (re)generate the errors from scratch, and one to remove the obsolete entries (or to reduce the number of allowed errors) that would fail if new errors have been introduced.
Instinctively, I would have put the files that need the most attention (have the most errors) at the top. But yeah, you may be right that in the idea of incrementally improving the codebase, having the low-hanging fruit fixed first will be better 👍
Yeah. I think using |
Beta Was this translation helpful? Give feedback.
-
Become a crutch is just an expression that means you'll start depending on something in a way that becomes a bad habit. I'll elaborate a little on why I think that's the case here. Say you have 1 suppressed error in this file for an InconsistentImport rule. import Html.Attributes
-- ^ InconsistentImport error, should be `as Attr` Let's imagine that this rule requires a manual fix. You might go about coding, and end up not depending on import Element.Border
-- ^ InconsistentImport error, should be `as Border` We changed the code, and in the process ended up with the same number of suppressed errors. But it's a different one. That's not necessarily a problem. But it seems misleading to give the impression that you can "lock in" the specific set of errors in a file/rule as ignored, because that's in fact not what's happening. So I see it as less misleading to just ignore a rule for a whole file, because it doesn't give the false impression that you've prevented new problems from arising in that file. That's just my 2 cents there, reasonable people can certainly disagree there. But at least I think it's worth pointing out that it could be misleading. |
Beta Was this translation helpful? Give feedback.
-
Also, it seems to me that it will be easier to gain confidence that you've fully fixed a rule for a module and keep that fixed (rather than partially fixing several modules). That's the crutch part (bad habit). I think it will be easier to keep track of addressing all problems for a rule in a specific module at a time, then moving on. And I think it's a good idea to start with the low-hanging fruit and keep those areas clean. When you start to mix modules with problems as "partially fixed", that seems like it could be harder to keep track of what's going on. |
Beta Was this translation helpful? Give feedback.
-
Ok, so what you mean is that this method does not prevent what I called "exchanging errors" in the description of the issue. I agree that it's a problem, but since you at least didn't make the codebase worse (you exchanged one error with an error of the same severity), I think it's a relatively low problem. My take is that it will at least prevent regressions in one file, which doesn't happen when you ignore per file. If we take @chshersh's approach where we ignore by location, we would get a lot less error exchanges because you're less likely to have errors at the exact same locations. The implication of that is that I am thinking we could provide both options to the user: ignore by count (more stable, but less likely to be fixed) or by range (more unrelated errors, but more likely to be fixed as you go). But I would really like it if we could suggest the "better" approach, and maybe that is @chshersh's. Because at least the team is more likely. And by re-generating the suppressed file, you can I am more worried about the full re-generation of the suppressed errors becoming the crutch (in both options). Just like people like to automatically add linter disable comments when they get annoyed, I'm worried that they'll default to using this too much. I'm hoping that the user's colleagues will have an easier time noticing this at review time and asking for a justification at least 🤷♂️ |
Beta Was this translation helpful? Give feedback.
-
Abstract
I am looking for ways to make it easier to adopt static analysis/linting rules in a project when there are many errors, and fixing all of them in one go is not feasible or too daunting.
Context
At the moment, when you add a new rule to the
elm-review
configuration (or simply runelm-review
for the first time), you're likely to end up with a lot of new errors, which you may not want to deal with right now.If the rule happens to provide automatic fixes, that will help out a lot. But if there were a lot of errors, even that might result in a lot of changes that your colleagues (and yourself!) will not fancy reviewing.
Currently the best option is probably to enable the rule, but to ignore the errors for every file containing an error using
ignoreErrorsForFiles
, and then get your team to fix the errors gradually.You can also fix the errors before enabling the rule, but there is then a risk of new errors appearing in the meantime.
I propose a solution for this problem, but I first want to go through the "obvious" alternative solutions that other tools seem to have chosen.
Alternative solutions
Severity levels
I believe that one way this is usually done in other static analysis tools (ESLint for instance), is to associate rules with a severity level, like "warning" and "error" (which we don't support). I believe this to be a non-helpful solution in the long run, as I describe here. I personally don't enable any rule as a "warning" when I work with ESLint.
I therefore really don't want to support severity levels, and prefer this problem to remain as it is than to solve it this way (except if it does solve other kinds of problems).
Add suppression comments
Again, this is something that you could see in ESLint-land. Though if there are too many errors, the rule will most certainly end up as a warning. But it would be well possible to automate adding a
// eslint-disable-line <rule-name>
on every reported line, or even a/* eslint-disable <rule-name> */
at the top of each reported file.elm-review
does not support suppression comments. I have described over here why I think this is not a good idea in my opinion, even though that can make a lot of things harder.I would add to this that people don't go hunting for disable comments to fix the errors. The reasoning is that it's often hard to know whether a rule was disabled for good reason (bug in the rule, ...) or not (lazy developer, ...), plus there is no centralized location that tells you how many disabled errors there are or where to start.
Proposed solution
My proposal is this: Add new functions that temporarily suppress a rule for a list of files. It would look very much like
ignoreErrorsForFiles
. I will for the sake of this proposal name the functiontemporarilySuppressErrors
, but the bikeshed is open.It looks very much like
ignoreErrorsForFiles
, but there would be some differences.When we notice that no error is showing up in the given file anymore (even if by coincidence),
elm-review
would report an error and fail, telling the user to remove the file from the list of ignored files in their review configuration. This is a small improvement but would help clean up step by step. Maybe the user would get a dopamine rush and feel like fixing the rest of the issues!If a file turns out not to exist, then the CLI would report that. Otherwise we'd never be able to clean up the list of files if a file was deleted.
At the moment and when possible, ignored files are not being analyzed by rules, when we know that not analyzing it will not impact the analysis of the rest of the project. For suppressed files, we would need to analyze them, because otherwise we can't report when they do not contain errors anymore.
The CLI would provide a flag (and editors would provide some toggle?) allowing the user to disable the suppressions (i.e. re-activating the rules), allowing you to easily go hunt for these errors. The user could also manually remove a file from the suppression list if they prefer. Having everything in the configuration makes it easier to see the amount of errors left, even though we don't see how many errors are in each file.
An idea from a colleague of mine was to run
elm-review
in the CI twice: Once the "usual" way, and if there are no errors, run it once again with the previously mentioned flag (but ignoring the exit code), and feed that to our monitoring service (we are a monitoring company after all), so we can follow the quantity of suppressed errors and prioritize their hunt. An alternative solution would be to makeelm-review
NOT exit with an error code if only suppressed errors were reported, so that we don't have to run the tool twice. I imagine that if some people see a good reason for the tool to fail even in that case, we could add a separate flag or make it take an argument.For this last thing to work well, we do need to output of the error to be marked as "suppressed", so I think the output will most likely look something like this
and the JSON output would have an additional field
"suppressed": true
added to it.We could have done the same thing by having the rules to be enabled in a different review configuration, but I don't believe that to be practical for several reasons (that I could go into if asked, I'm being lazy now).
After reviewing, the CLI would report the number of suppressed errors (without showing them individually). The wording is temporary, but it shows the idea.
I don't think we should mention suppressed errors when there are non-suppressed errors being reported. It's likely just distracting.
Granularity
As you may have noticed, you can't disable a rule for only a section of a file as the smallest item you can customize here is a file. The issue with this is that you can start with a file with N suppressed errors, and when you want to fix the issues, notice that they have grown to a number bigger than N since the time of suppression.
We could ignore errors at given locations (
Rule.temporarilySuppressErrors [ ("src/A.elm", [{start={row=1, column=1},end={row=2, column=1}}]) ]
, but since files change andelm-review
can't compare with previous versions of the source code (I imagine that including Git in the analysis would make several things very complex), if a reported line gets moved because another line was included 100 lines prior, you'd get a false positive. It's technically possible, but I think it would make things very annoying.We could provide an API that allow for both (example below), but that does not solve the flakiness of the suppressions.
A alternative solution - which I think would be the way to go - is to count the number of errors in each file, and if the number of errors changes, we will require it to be updated (and we'd forbid specifying 0).
This would not help prevent exchanging errors (one gets fixed while one gets removed), but it would at least help detect when one gets introduced and the codebase gets worse.
I imagine that if the count increases, we'd have to display all the suppressed errors for that file so that the user could figure what was introduced. Not great but better than not showing anything.
Making this easy to use
There would not be an equivalent to
ignoreErrorsForDirectories
. While it makes sense to haveignoreErrorsForDirectories
to disableelm-review
on generated and vendored code, temporarily suppressing an entire folder would make it much harder for users to gradually fix the errors.I have not personally felt the pain of adding the list of ignored files into the configuration, because I am good enough with editors to extract the file names from the errors and format them in the required way. But I imagine that not to be the case for most people.
I think it would be nice if the CLI helped with that. I am imagining a flag that would generate the suppression code for you. Or even modify the configuration, but that's a way more daunting task.
An example (again, just temporary names and texts)
The tricky part would be to provide a way that makes it as easy as possible to just copy-paste this in the file. At Humio, parts of our configuration looks like
Which doesn't prone itself to easy copy pasting of what I suggested above.
The CLI could suggest using this flag when the number of shown errors reaches a certain threshold (20?).
Additional alternative solution: Integrate with Betterer or do the same thing
I haven't yet mentioned @phenomnomnominal's Betterer as a solution. It is a tool that aims to solve the same issue in a more generic way. I only talk about it now because I believe it tries to solve the issue in a similar manner, and it's worth comparing. That said, I have not yet been able to use it, so my understanding is likely a bit shallow...
From what I understand, it either snapshots the results from ESLint (for the ESLint integration), stores that file in the project for it to be committed, and then checks that nothing got worse than before. I have no clue how the location gets matched with the previous run.
That or it simply counts the number of errors, but I could not figure out whether it's the total number of errors or the number of errors per file 🤷♂️
Summary
The proposed solution
What I'm looking for
I'm looking for general feedback on this feature. Do you think this is a good idea, a bad idea? Do you think you'd use it?
What do you think about the granularity issue? Can you think of a better solution?
It's not my main concern now, but I could use some bikeshedding on the names here (for which I have not yet put any effort into).
Do you think we should always show problems related to suppression errors (new errors showed up, some errors got fixed), or only if there are no more non-suppressed errors?
Beta Was this translation helpful? Give feedback.
All reactions