-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New codemod: use-generator #135
Merged
Merged
Changes from 1 commit
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
from core_codemods.use_generator import UseGenerator | ||
from integration_tests.base_test import ( | ||
BaseIntegrationTest, | ||
original_and_expected_from_code_path, | ||
) | ||
|
||
|
||
class TestUseGenerator(BaseIntegrationTest): | ||
codemod = UseGenerator | ||
code_path = "tests/samples/use_generator.py" | ||
|
||
original_code, expected_new_code = original_and_expected_from_code_path( | ||
code_path, | ||
[(5, "x = sum(i for i in range(1000))\n")], | ||
) | ||
|
||
expected_diff = """\ | ||
--- | ||
+++ | ||
@@ -3,5 +3,5 @@ | ||
yield i | ||
|
||
|
||
-x = sum([i for i in range(1000)]) | ||
+x = sum(i for i in range(1000)) | ||
y = some([i for i in range(1000)]) | ||
""" | ||
|
||
expected_line_change = "6" | ||
change_description = UseGenerator.CHANGE_DESCRIPTION |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
Imagine that someone handed you a pile of 100 apples and then asked you to count how many of them were green without putting any of them down. You'd probably find this quite challenging and you'd struggle to hold the pile of apples at all. Now imagine someone handed you the apples one at a time and asked you to just count the green ones. This would be a much easier task. | ||
|
||
In Python, when we use list comprehensions, it's like we've created the entire pile of apples and asked the interpreter to hold onto it. Sometimes, a better practice involves using generator expressions, which create iterators that yield objects one at a time. For large data sets, this can turn a slow, memory intensive operation into a relatively fast one. | ||
|
||
Using generator expressions instead of list comprehensions can lead to better performance. This is especially true for functions such as `any` where it's not always necessary to evaluate the entire list before returning. For other functions such as `max` or `sum` it means that the program does not need to store the entire list in memory. These performance effects becomes more noticeable as the sizes of the lists involved grow large. | ||
|
||
This codemod replaces the use of a list comprehension expression with a generator expression within certain function calls. Generators allow for lazy evaluation of the iterator, which can have performance benefits. | ||
|
||
The changes from this codemod look like this: | ||
```diff | ||
- result = sum([x for x in range(1000)]) | ||
+ result = sum(x for x in range(1000)) | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
import libcst as cst | ||
|
||
from codemodder.codemods.api import BaseCodemod, ReviewGuidance | ||
|
||
|
||
class UseGenerator(BaseCodemod): | ||
NAME = "use-generator" | ||
SUMMARY = "Use generators for lazy evaluation" | ||
REVIEW_GUIDANCE = ReviewGuidance.MERGE_WITHOUT_REVIEW | ||
DESCRIPTION = "Replace list comprehension with generator expression" | ||
REFERENCES = [ | ||
{ | ||
"url": "https://pylint.readthedocs.io/en/latest/user_guide/messages/refactor/use-a-generator.html", | ||
"description": "", | ||
}, | ||
{ | ||
"url": "https://docs.python.org/3/glossary.html#term-generator-expression", | ||
"description": "", | ||
}, | ||
{ | ||
"url": "https://docs.python.org/3/glossary.html#term-list-comprehension", | ||
"description": "", | ||
}, | ||
] | ||
|
||
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call): | ||
match original_node.func: | ||
# NOTE: could also support things like `list` and `tuple` | ||
# but it's a less compelling use case | ||
case cst.Name("any" | "all" | "sum" | "min" | "max"): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As you have pointed it out, those could use an extra step to check if they are builtin functions. You can do this using the |
||
match original_node.args[0].value: | ||
case cst.ListComp(elt=elt, for_in=for_in): | ||
self.add_change(original_node, self.CHANGE_DESCRIPTION) | ||
return updated_node.with_changes( | ||
args=[ | ||
cst.Arg( | ||
value=cst.GeneratorExp( | ||
elt=elt, # type: ignore | ||
for_in=for_in, # type: ignore | ||
# No parens necessary since they are | ||
# already included by the call expr itself | ||
lpar=[], | ||
rpar=[], | ||
) | ||
) | ||
], | ||
) | ||
|
||
return original_node |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
import pytest | ||
|
||
from core_codemods.use_generator import UseGenerator | ||
from tests.codemods.base_codemod_test import BaseCodemodTest | ||
|
||
|
||
class TestUseGenerator(BaseCodemodTest): | ||
codemod = UseGenerator | ||
|
||
@pytest.mark.parametrize("func", ["any", "all", "sum", "min", "max"]) | ||
def test_list_comprehension(self, tmpdir, func): | ||
original_code = f""" | ||
x = {func}([i for i in range(10)]) | ||
""" | ||
new_code = f""" | ||
x = {func}(i for i in range(10)) | ||
""" | ||
self.run_and_assert(tmpdir, original_code, new_code) | ||
|
||
def test_not_special_builtin(self, tmpdir): | ||
expected = original_code = """ | ||
x = some([i for i in range(10)]) | ||
""" | ||
self.run_and_assert(tmpdir, original_code, expected) | ||
|
||
@pytest.mark.xfail(reason="TODO: check for built-in names") | ||
def test_not_global_function(self, tmpdir): | ||
expected = original_code = """ | ||
from foo import any | ||
x = any([i for i in range(10)]) | ||
""" | ||
self.run_and_assert(tmpdir, original_code, expected) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
def some(iterable): | ||
for i in iterable: | ||
yield i | ||
|
||
|
||
x = sum([i for i in range(1000)]) | ||
y = some([i for i in range(1000)]) |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly a nitpick, but this is the kind of language I'd avoid in documentations. Be direct and on point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is a little too whimsical but it's important to remember that a big part of our vision here involves educating developers and security professionals on best practices. This means we actually need to make our content as engaging as possible. I'm not necessarily saying I've succeeded in that here, but I'm going to leave it as-is and see what feedback we receive.