Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Salesforce migration #629

Closed
wants to merge 37 commits into from
Closed

Conversation

djagoda881
Copy link

Summary

The Salesforce source was moved from viadot version 1. The source was moved along with the unit tests. The structure of the source has undergone the following changes compared to viadot 1:

  • Added class attribute config_key
  • Updated how secrets are retrieved. After the change, it looks the same as in other sources, e.g. exchange_rates.

Importance

Marge It is needed to use source in prefect-viadot container

Checklist

This PR:

  • [x ] adds new tests (if appropriate)
  • [ x] updates docstrings for any new functions or function arguments (if appropriate)
  • [ x] updates CHANGELOG.md with a summary of the changes

@djagoda881 djagoda881 requested a review from trymzet February 23, 2023 09:34
@djagoda881 djagoda881 marked this pull request as draft February 28, 2023 11:30
@djagoda881
Copy link
Author

I made some minor corrections, the code is now ready for review

@djagoda881 djagoda881 marked this pull request as ready for review March 2, 2023 12:12
Copy link
Contributor

@trymzet trymzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor adjustments and missing docstrings

viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
@trymzet
Copy link
Contributor

trymzet commented Mar 2, 2023

@djagoda881 random tip, you can just do - [x] and it will create a checked box, this is what these symbols in the checklist are for. Just remove the extra spaces around the x

@djagoda881 djagoda881 requested a review from trymzet March 6, 2023 14:52
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
tests/unit/test_salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
"LastName": ["LastName"],
}
df_restored = pd.DataFrame(data=data_restored)
salesforce.upsert(df=df_restored, table=TABLE_TO_UPSERT)
Copy link
Contributor

@trymzet trymzet Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second point remains, I think it may be confusing because you're mixing 2 things together in these fixtures, you're creating a test df but also randomly upserting records. It's actually not easy to understand the tests since you already upsert records in the fixture, I don't think this should be happening?

Perhaps you meant to insert some records before running upsert() test in order to validate that they are correctly updated in the Salesforce.upsert() method (to validate the "update" part of upsert()'s functionality)? But in that case, I don't think this needs to be in a fixture, you can do this (note also there is a cleanup at the end, pls also add it - records should be created and removed in each test).

TEST_DF_EXTERNAL = pd.DataFrame({"LastName": [TEST_LAST_NAME], "SAPContactId__c": ["111"]})

def test_upsert(salesforce):
    # Setup.
    sf = salesforce.salesforce
    test_record = TEST_DF_EXTERNAL.to_records()  # not sure if this is the data structure required by insert(), pls check
    sf.Contact.insert(test_record)
     
    # Test.
    salesforce.upsert(
        df=TEST_DF_EXTERNAL, table=TABLE_TO_UPSERT, external_id="SAPContactId__c"
    )

    result = sf.query(
        f"SELECT ID, LastName FROM {TABLE_TO_UPSERT} WHERE ID='{ID_TO_UPSERT}'"
    )
    assert result["records"][0]["LastName"] == TEST_LAST_NAME

    # Cleanup
    sf.delete(test_record)

Copy link
Contributor

@trymzet trymzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple of comments

viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
viadot/sources/salesforce.py Outdated Show resolved Hide resolved
@djagoda881
Copy link
Author

I decided to change the test structure by looking at the advice from 1:1 on Thursday.

  • I have added a bulk upsert test

  • Each test is self-contained, i.e. rows are created and deleted in each individual test

  • At the end of the session, it is checked whether any row remains in the database and is deleted if found

  • In an upsert/bulk_upsert test the insert is performed first and then the update is checked in the same test

  • I added a get_tested_records function which returns the result of the query. I added it because very often I had to call the same qyery to validate the tests

✅ Added bulk upsert test and refeactored tests

@djagoda881 djagoda881 requested a review from trymzet March 10, 2023 14:23
@dyvenia dyvenia deleted a comment from Cuppingan Mar 13, 2023
@@ -28,7 +28,7 @@ def test_row_creation(salesforce):
sf.Contact.delete(result["records"][0]["Id"])


def test_upsert_external_id_correct(salesforce, test_row_creation):
def test_upsert_external_id_column(salesforce, test_row_creation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just test_upsert?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the name of the test after the latest changes

Comment on lines 89 to 92
try:
salesforce.upsert(df=df, table="Contact", external_id_column="SAPContactId__c")
except Exception as exception:
raise exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this try/except here? Can you add a docstring explaining?

Copy link
Author

@djagoda881 djagoda881 Mar 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the all try except blocks from the tests. Inside the function in upsert and bulk upsert the commands are executed in the try except block so if they don't work correctly an error will appear.

✅ Removed try except from tests

Comment on lines 99 to 100
assert created_rows["records"][0]["LastName"] == "viadot-insert-1"
assert created_rows["records"][1]["LastName"] == "viadot-insert-2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we do result_df.equals(expected_df)?

Copy link
Author

@djagoda881 djagoda881 Mar 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unfortunately not possible to compare in this way. Simple salesforce returns data in OrderedDict form, it can be converted into a data frame but another problem is the Id. The ID is given on insertion and takes the form of a unique hash e.g. ('Id', '0035E000031FfsMQAS'). I would have to add another query that returns the name itself, but this would involve further complications in the code

sf.Contact.delete(created_rows["records"][1]["Id"])


def test_upsert_external_id_column_wrong(salesforce, test_row_creation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pls change the name to test_upsert_incorrect_external_id_column?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Args:
df (pd.DataFrame): The DataFrame to upsert.
df (pd.DataFrame): Dataframe containing the rows to Bulk upsert.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bulk -> bulk, given -> provided/specified, Dataframe -> DataFrame/pandas DataFrame

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Args:
domain (str, optional): Domain of a connection. Defaults to 'test' (sandbox).
Can be added only if built-in username/password/security token is provided.
client_id (str, optional): Client id to keep the track of API calls.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id -> id, keep the track -> keep track

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an upsert operation gives you more control over logging and error handling than using Bulk upsert.

Args:
df (pd.DataFrame): Dataframe containing the rows to upsert.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas DataFrame

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Download all data from the indicated table or the result of the specified query.

Args:
query (str, optional): Query for download the specific data. Defaults to None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query for download -> The query to be used to download the data. (appears 2x)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Args:
query (str, optional): Query for download the specific data. Defaults to None.
table (str, optional): Table name. Defaults to None.
columns (List[str], optional): List of columns which are needed,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"List of columns which are needed, requires table argument"-> "list of required columns; requires table to be specified" (appears 2x)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

columns: List[str] = None,
) -> pd.DataFrame:
"""
Downloads the indicated data and returns the Dataframe.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Downloads the indicated data returns it as a pandas DataFrame

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djagoda881 djagoda881 requested a review from trymzet March 21, 2023 14:49
@djagoda881
Copy link
Author

The DE team has created a new version of this PR.

@djagoda881 djagoda881 closed this Jun 20, 2023
@trymzet
Copy link
Contributor

trymzet commented Jun 21, 2023

The DE team has created a new version of this PR.

Link pls @djagoda881

@djagoda881
Copy link
Author

New PR: #708

@djagoda881 djagoda881 deleted the salesforce_migration branch March 20, 2024 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants