feat: Add manufacture_data django command #426

Closed
wants to merge 7 commits into from
5 changes: 5 additions & 0 deletions CHANGELOG.rst
@@ -16,6 +16,11 @@ Unreleased
* Add script to get github action errors
* Add script to republish failed events

[2.1.0] - 2023-06-01
~~~~~~~~~~~~~~~~~~~~

* Adds test factory
Contributor

Two high-level comments:

  1. This PR seems light on docs for the new functionality (README, longer docstrings, etc.), including this changelog entry.
  2. I don't think this repo is the right place for this, but we should discuss. See https://github.com/edx/edx-arch-experiments/blob/main/docs/decisions/0001-purpose-of-this-repo.rst. I'm not sure that this aligns. That said, the missing docs make it hard for me to be sure.

Contributor

I agree. I think this might make more sense in devstack

Contributor

I agree. I think this might make more sense in devstack

Do we have any history of the decision to use edx-django-utils that we can provide here or on the new PR?


[2.0.0] - 2023-06-01
~~~~~~~~~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion edx_arch_experiments/__init__.py
@@ -2,4 +2,4 @@
A plugin to include applications under development by the architecture team at 2U.
"""

-__version__ = '2.0.0'
+__version__ = '2.1.0'
Empty file.
Empty file.
369 changes: 369 additions & 0 deletions edx_arch_experiments/management/commands/manufacture_data.py
@@ -0,0 +1,369 @@
"""
Management command for making things with test factories
Contributor

An example of how this might be called would be useful


Arguments
=========

--model: complete path to a model that has a corresponding test factory
--{model_attribute}: (Optional) Value of a model's attribute that will override test factory's default attribute value
--{model_foreignkey__foreignkey_attribute}: (Optional) Value of a model's attribute
that will override test factory's default attribute value


Examples
========

./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomer
This will generate an enterprise customer record with placeholder values according to the test factory

./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomer --name "FRED"
will produce the customized record:
'EnterpriseCustomer' fields: {'name': 'FRED'}

./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomerCatalog \
--enterprise_customer__site__name "Fred" --enterprise_catalog_query__title "JOE SHMO" --title "who?"
will result in:
'EnterpriseCustomerCatalog' fields: {'title': 'who?'}
'EnterpriseCustomer' fields: {}
'Site' fields: {'name': 'Fred'}
'EnterpriseCatalogQuery' fields: {'title': 'JOE SHMO'}

To supply an existing record as a FK to our object:
./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomerUser \
--enterprise_customer 994599e6-3787-48ba-a2d1-42d1bdf6c46e
'EnterpriseCustomerUser' fields: {}
'EnterpriseCustomer' PK: 994599e6-3787-48ba-a2d1-42d1bdf6c46e

or we can do something like:
./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomerUser \
--enterprise_customer__site 9 --enterprise_customer__name "joe"
which would yield:
'EnterpriseCustomerUser' fields: {}
'EnterpriseCustomer' fields: {'name': 'joe'}
'Site' PK: 9


Errors
======

But if you try and get something that doesn't exist...
./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomerUser --enterprise_customer <SOMETHING BAD>
we'd get:
CommandError: Provided FK value: <SOMETHING BAD> does not exist on EnterpriseCustomer

Another limitation of this script is that it can only fetch or customize: you cannot customize a specified, existing FK
./manage.py lms manufacture_data --model enterprise.models.EnterpriseCustomerUser \
--enterprise_customer__site__name "fred" --enterprise_customer 994599e6-3787-48ba-a2d1-42d1bdf6c46e
would yield CommandError: This script does not support customizing provided existing objects
"""

import logging
import re
import sys

import factory
from django.core.exceptions import ImproperlyConfigured
from django.core.management.base import BaseCommand, CommandError, SystemCheckError, handle_default_options
from django.db import connections
from factory.declarations import SubFactory

log = logging.getLogger(__name__)


def convert_to_pascal(string):
    """
    Helper method to convert strings to Pascal case.
    """
    return string.replace("_", " ").title().replace(" ", "")


def pairwise(iterable):
    """
    Group an iterable into non-overlapping pairs of adjacent elements.
    s -> (s0, s1), (s2, s3), (s4, s5), ...
    """
    a = iter(iterable)
    return zip(a, a)


def all_subclasses(cls):
    """
    Recursively get all subclasses of a class
    https://stackoverflow.com/a/3862957
    """
    return set(cls.__subclasses__()).union(
        [s for c in cls.__subclasses__() for s in all_subclasses(c)])
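
As a quick illustration of what `all_subclasses` returns, here is a minimal standalone sketch. The `Base`, `Child`, and `GrandChild` classes are hypothetical stand-ins for `factory.django.DjangoModelFactory` and its descendants; they are not part of this PR. Note that only classes that have already been imported can be discovered this way.

```python
def all_subclasses(cls):
    """Recursively collect every direct and indirect subclass of ``cls``."""
    return set(cls.__subclasses__()).union(
        [s for c in cls.__subclasses__() for s in all_subclasses(c)])


# Hypothetical hierarchy standing in for factory classes (assumption).
class Base:
    pass


class Child(Base):
    pass


class GrandChild(Child):
    pass


found = all_subclasses(Base)
names = sorted(c.__name__ for c in found)
print(names)
```

Both the direct subclass and the indirect one are returned, which is why the command can find a factory that inherits from an intermediate base factory.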


def convert_to_snake(string):
    """
    Helper method to convert strings to snake case.
    """
    return re.sub(r'(?<!^)(?=[A-Z])', '_', string).lower()
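
To make the behaviour of the two case helpers concrete, here is a small standalone sketch that copies them and runs them on a few inputs. The `LTIConfig` name is a hypothetical example chosen to show an edge case: the snake-case regex inserts an underscore before every capital letter, so runs of capitals are split letter by letter.

```python
import re


def convert_to_pascal(string):
    """Convert a snake_case string to PascalCase."""
    return string.replace("_", " ").title().replace(" ", "")


def convert_to_snake(string):
    """Convert a PascalCase string to snake_case."""
    return re.sub(r'(?<!^)(?=[A-Z])', '_', string).lower()


pascal = convert_to_pascal("enterprise_customer")
snake = convert_to_snake("EnterpriseCustomer")
# Hypothetical model name with an acronym: each capital gets its own underscore.
acronym = convert_to_snake("LTIConfig")

print(pascal)
print(snake)
print(acronym)
```

If a model name contains an acronym, the generated snake-case key may not match the factory's field name, so that case would need separate handling.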


class Node():
    """
    Non-binary tree node class for building out a dependency tree of objects to create with customizations.
    """
    def __init__(self, data):
        self.data = data
        self.children = []
        self.customizations = {}
        self.factory = None
        self.instance = None

    def set_single_customization(self, field, value):
        """
        Set a single customization value to the current node, overrides existing values under the same key.
        """
        self.customizations[field] = value

    def add_child(self, obj):
        """
        Add a child to the current node
        """
        self.children.append(obj)

    def find_value(self, value):
        """
        Find a value in the tree
        """
        if self.data == value:
            return self
        else:
            for child in self.children:
                found = child.find_value(value)
                if found:
                    return found
            return None

    def build_records(self):
        """
        Recursively build out the tree of objects by first dealing with children nodes before getting to the parent.
        """
        built_children = {}
        for child in self.children:
            # if we have an instance, use it instead of creating more objects
            if child.instance:
                built_children.update({convert_to_snake(child.data): child.instance})
            else:
                # Use the output of child ``build_records`` to create the current level.
                built_child = child.build_records()
                built_children.update(built_child)

        # The data factory kwargs are specified custom fields + the PK's of generated child objects
        object_fields = self.customizations.copy()
        object_fields.update(built_children)

        # Some edge case sanity checking
        if not self.factory:
            raise CommandError(f"Cannot build objects as {self} does not have a factory")

        built_object = self.factory(**object_fields)
        object_data = {convert_to_snake(self.data): built_object}
        return object_data

    def __str__(self, level=0):
        """
        Overridden str method to allow for proper tree printing
        """
        if self.instance:
            body = f"PK: {self.instance.pk}"
        else:
            body = f"fields: {self.customizations}"
        ret = ("\t" * level) + f"{repr(self.data)} {body}" + "\n"
        for child in self.children:
            ret += child.__str__(level + 1)
        return ret

    def __repr__(self):
        """
        Overridden repr
        """
        return f'<Tree Node {self.data}>'
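
The children-first build order of `build_records` can be sketched without Django or factory_boy. In the simplified example below, `SketchNode` and `make_factory` are hypothetical stand-ins (not part of this PR): each "factory" is a plain callable that records when it runs and returns a dict instead of a model instance.

```python
class SketchNode:
    """Simplified stand-in for ``Node``: a name, a factory callable, children."""

    def __init__(self, name, factory):
        self.name = name
        self.factory = factory
        self.children = []
        self.customizations = {}

    def build(self):
        # Build every child first so its result can be passed to the parent.
        kwargs = dict(self.customizations)
        for child in self.children:
            kwargs.update(child.build())
        return {self.name: self.factory(**kwargs)}


build_order = []


def make_factory(label):
    """Return a hypothetical factory that records the order it is called in."""
    def _factory(**kwargs):
        build_order.append(label)
        return {"type": label, **kwargs}
    return _factory


site = SketchNode("site", make_factory("Site"))
site.customizations["name"] = "Fred"
customer = SketchNode("enterprise_customer", make_factory("EnterpriseCustomer"))
customer.children.append(site)
user = SketchNode("enterprise_customer_user", make_factory("EnterpriseCustomerUser"))
user.children.append(customer)

result = user.build()
print(build_order)
```

The deepest dependency (`Site`) is created first and the requested model last, so each parent receives its already-built children as keyword arguments.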


def build_tree_from_field_list(list_of_fields, provided_factory, base_node, customization_value):
    """
    Builds a non-binary tree of nodes based on a list of children nodes, using a base node and its associated data
    factory as the parent node, and the user-provided value as a reference to a potential, existing record.

    - list_of_fields (list of strings): the linked list of associated objects to create. Example-
        ['enterprise_customer_user', 'enterprise_customer', 'site']
    - provided_factory (factory.django.DjangoModelFactory): The data factory of the base_node.
    - base_node (Node): The parent node of the desired tree to build.
    - customization_value (string): The value to be assigned to the object associated with the last value in the
        ``list_of_fields`` param. Can either be a FK if the last value is a subfactory, or alternatively
        a custom value to be assigned to the field. Example-
            list_of_fields = ['enterprise_customer_user', 'enterprise_customer', 'site'],
            customization_value = 9
        or
            list_of_fields = ['enterprise_customer_user', 'enterprise_customer', 'name'],
            customization_value = "FRED"
    """
    current_factory = provided_factory
    current_node = base_node
    for index, value in enumerate(list_of_fields):
        try:
            # First we need to figure out if the current field is a sub factory or not
            f = getattr(current_factory, value)
            if isinstance(f, SubFactory):
                fk_object = None
                f_model = f.get_factory()._meta.get_model_class()

                # if we're at the end of the list
                if index == len(list_of_fields) - 1:
                    # verify that the provided customization value is a valid pk for the model
                    try:
                        fk_object = f_model.objects.get(pk=customization_value)
                    except f_model.DoesNotExist as exc:
                        raise CommandError(
                            f"Provided FK value: {customization_value} does not exist on {f_model.__name__}"
                        ) from exc

                # Look for the node in the tree
                if node := current_node.find_value(f_model.__name__):
                    # Not supporting customizations and FK's
                    if (bool(node.customizations) or bool(node.children)) and bool(fk_object):
                        raise CommandError("This script does not support customizing provided existing objects")
                    # If we found the valid FK earlier, assign it to the node
                    if fk_object:
                        node.instance = fk_object
                    # Add the field to the children of the current node
                    if node not in current_node.children:
                        current_node.add_child(node)
                    # Set current node and move on
                    current_node = node
                else:
                    # Create a new node
                    node = Node(
                        f_model.__name__,
                    )
                    node.factory = f.get_factory()
                    # If we found the valid FK earlier, assign it to the node
                    if fk_object:
                        node.instance = fk_object
                    # Add the field to the children of the current node
                    current_node.add_child(node)

                    current_node = node
                current_factory = f.get_factory()
            else:
                if current_node.instance:
                    raise CommandError("This script cannot modify existing objects")
                current_node.set_single_customization(value, customization_value)
        except AttributeError as exc:
            log.error(f'Could not find value: {value} in factory: {current_factory}')
            raise CommandError(f'Could not find value: {value} in factory: {current_factory}') from exc
    return base_node


class Command(BaseCommand):
    """
    Management command for generating Django records from factories with custom attributes

    Example usage:
        $ ./manage.py manufacture_data --model enterprise.models.enterprise_customer \
            --name "Test Enterprise" --slug "test-enterprise"
    """

    def add_arguments(self, parser):
        parser.add_argument(
            '--model',
            dest='model',
            help='The model for which the record will be written',
        )

    def run_from_argv(self, argv):
        """
        Re-implemented from https://github.com/django/django/blob/main/django/core/management/base.py#L395 in order to
        support individual field customization. We will need to keep this method up to date with our current version of
        Django BaseCommand.

        Uses ``parse_known_args`` instead of ``parse_args`` to not throw an error when encountering unknown arguments

        https://docs.python.org/3.8/library/argparse.html#argparse.ArgumentParser.parse_known_args
        """
        self._called_from_command_line = True
        parser = self.create_parser(argv[0], argv[1])
        options, unknown = parser.parse_known_args(argv[2:])

        # Add the unknowns into the options for use of the handle method
        paired_unknowns = pairwise(unknown)
        field_customizations = {}
        for field, value in paired_unknowns:
            field_customizations[field.strip("--")] = value
        options.field_customizations = field_customizations

        cmd_options = vars(options)
        # Move positional args out of options to mimic legacy optparse
        args = cmd_options.pop("args", ())
        handle_default_options(options)
        try:
            self.execute(*args, **cmd_options)
        except CommandError as e:
            if options.traceback:
                raise

            # SystemCheckError takes care of its own formatting.
            if isinstance(e, SystemCheckError):
                self.stderr.write(str(e), lambda x: x)
            else:
                self.stderr.write("%s: %s" % (e.__class__.__name__, e))
            sys.exit(e.returncode)
        finally:
            try:
                connections.close_all()
            except ImproperlyConfigured:
                # Ignore if connections aren't setup at this point (e.g. no
                # configured settings).
                pass
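
The way `run_from_argv` turns unrecognized `--field value` arguments into a customization dict can be tried outside Django. This is a minimal sketch using only `argparse`; the argument values are taken from the module docstring's examples, and `lstrip("-")` is used here as an assumption-level alternative to `strip("--")` so that only leading dashes are removed.

```python
import argparse


def pairwise(iterable):
    """Group an iterable into non-overlapping pairs: (s0, s1), (s2, s3), ..."""
    a = iter(iterable)
    return zip(a, a)


parser = argparse.ArgumentParser()
parser.add_argument("--model", dest="model")

argv = [
    "--model", "enterprise.models.EnterpriseCustomerCatalog",
    "--enterprise_customer__site__name", "Fred",
    "--title", "who?",
]

# parse_known_args returns the parsed namespace plus the leftover tokens
# instead of exiting with an error on arguments it does not recognize.
options, unknown = parser.parse_known_args(argv)

# Pair each leftover flag with the token that follows it.
field_customizations = {
    field.lstrip("-"): value for field, value in pairwise(unknown)
}

print(options.model)
print(field_customizations)
```

One consequence of pairing tokens positionally is that the `--name=FRED` form yields a single leftover token and would not be picked up as a customization; flags and values need to be passed as separate arguments.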

    def handle(self, *args, **options):
        """
        Entry point for management command execution.
        """
        if not options.get('model'):
            log.error("Did not receive a model")
            raise CommandError("Did not receive a model")

        # Convert to Pascal case if the provided name is snake case/is all lowercase
        path_of_model = options.get('model').split(".")
        if '_' in path_of_model[-1] or path_of_model[-1].islower():
            last_path = convert_to_pascal(path_of_model[-1])
        else:
            last_path = path_of_model[-1]

        provided_model = '.'.join(path_of_model[:-1]) + '.' + last_path
        # Get all installed/imported factories
        factories_list = all_subclasses(factory.django.DjangoModelFactory)
        # Find the factory that matches the provided model
        for potential_factory in factories_list:
            # Fetch the model for the factory
            factory_model = potential_factory._meta.model
            # Check if the factories model matches the provided model
            if f"{factory_model.__module__}.{factory_model.__name__}" == provided_model:
                # Now that we have the right factory, we can build according to the provided custom attributes
                field_customizations = options.get('field_customizations', {})
                base_node = Node(factory_model.__name__)
                base_node.factory = potential_factory
                # For each provided custom attribute...
                for field, value in field_customizations.items():

                    # We need to build a tree of objects to be created and may be customized by other custom attributes
                    stripped_field = field.strip("--")
                    fk_field_customization_split = stripped_field.split("__")
                    base_node = build_tree_from_field_list(
                        fk_field_customization_split,
                        potential_factory,
                        base_node,
                        value,
                    )

                built_node = base_node.build_records()
                log.info(f"\nGenerated factory data: \n{base_node}")
                return str(list(built_node.values())[0].pk)

        log.error(f"Provided model: {provided_model} does not exist or does not have an associated factory")
        raise CommandError(f"Provided model: {provided_model}'s factory is not imported or does not exist")
Empty file.