Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Option for isolated and colony profiles #30

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

shntnu
Copy link
Collaborator

@shntnu shntnu commented Jun 26, 2019

Addresses #29

I pushed all of Greg's changes verbatim

@shntnu shntnu requested a review from gwaybio June 26, 2019 14:23
metadata_dir <- paste("../..", "metadata", batch_id, sep = "/")
backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gwaygenomics Do you know why the string is being split?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed on the brief chat, indeed, it looks like I am splitting the plate ID because I am inputing the plate ID with the sc_type tagged (here).

This is a result of the output from the modified aggregate.R call above, based on this line

This will definitely break in situations where the plate ID has an underscore.

Will need to address!

Copy link
Collaborator Author

@shntnu shntnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave the line breaks in there


@gwaygenomics edit - addressed in a3675c9

@@ -98,7 +98,9 @@ for (operation in operations) {
# This is handled differently because there is no direct way yet to do filtering in cytominer
# TODO: rewrite this after cytominer has an appropriate filtering function for this
testthat::expect_false(is.null(replicates), info="replicates should be specified when performing replicate_correlation")


Copy link
Collaborator Author

@shntnu shntnu Jun 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The print debugging can go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@gwaybio gwaybio Jun 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in a3675c9

@@ -37,8 +37,7 @@ variables_selected <-
Reduce(function(df1, df2) dplyr::inner_join(df1, df2, by ="variable"), .) %>%
magrittr::extract2("variable")

backend_dir <- paste("../..", "backend", batch_id, plate_id, sep = "/")

backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably go (same as before)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

subset <- opts[["subset"]] # e.g. "Metadata_broad_sample_type == '''control'''"

backend_dir <- paste("../..", "backend", batch_id, plate_id, sep = "/")
backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably go (same as before)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator Author

@shntnu shntnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, only the logic of aggregate has changed. Additionally, there's some string processing related to plate id.

@gwaybio
Copy link
Member

gwaybio commented Jun 26, 2019

I don't remember why I did this, but I will look into before merging

Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updates are not ready to be merged in

A couple changes need to be made first

  1. Specify more options for sc_type flag (conditional statements)
  2. Remove all print debugging statements
  3. Figure out plate ID parsing from single cell profiles

aggregate.R Outdated
strata = c("Metadata_Plate", "Metadata_Well"),
operation = "mean"
) %>% collect()
object <- object %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit here is also specific to aggregating single cell profiles

It takes code from cytominer::aggregate()

aggregate.R Outdated
}

aggregated <-
aggregate_objects("cells") %>%
aggregate_cols <- c("TableNumber", "ImageNumber", "ObjectNumber",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also specific to single cells and using it in other repositories will not always work

@@ -3,56 +3,78 @@
'aggregate

Usage:
aggregate.R <sqlite_file> -o <file>
aggregate.R -s <sqlite_file> -o <file> [-t <sc_type>]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after reviewing, I don't think this change is ready for prime time. A couple more conditional statements that only behave when sc_type is specified are required.

metadata_dir <- paste("../..", "metadata", batch_id, sep = "/")
backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed on the brief chat, indeed, it looks like I am splitting the plate ID because I am inputing the plate ID with the sc_type tagged (here).

This is a result of the output from the modified aggregate.R call above, based on this line

This will definitely break in situations where the plate ID has an underscore.

Will need to address!

subset <- opts[["subset"]] # e.g. "Metadata_broad_sample_type == '''control'''"

backend_dir <- paste("../..", "backend", batch_id, plate_id, sep = "/")
backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -98,7 +98,9 @@ for (operation in operations) {
# This is handled differently because there is no direct way yet to do filtering in cytominer
# TODO: rewrite this after cytominer has an appropriate filtering function for this
testthat::expect_false(is.null(replicates), info="replicates should be specified when performing replicate_correlation")


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -37,8 +37,7 @@ variables_selected <-
Reduce(function(df1, df2) dplyr::inner_join(df1, df2, by ="variable"), .) %>%
magrittr::extract2("variable")

backend_dir <- paste("../..", "backend", batch_id, plate_id, sep = "/")

backend_dir <- paste("../..", "backend", batch_id, unlist(strsplit(plate_id, "_"))[1], sep = "/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@gwaybio
Copy link
Member

gwaybio commented Jun 27, 2019

@shntnu - I am trying to push changes directly to this branch. b/c it is a branch directly from broadinstitute and not a fork, I cannot push!

Can you grant me push access?

@shntnu
Copy link
Collaborator Author

shntnu commented Jun 27, 2019 via email

object <- tbl(src = db, compartment)

object %<>% inner_join(image, by = c("TableNumber", "ImageNumber"))

if (sc_type == "isolated") {
object <- object %>% dplyr::filter(Cells_Neighbors_NumberOfNeighbors_Adjacent == 0)
Copy link
Collaborator Author

@shntnu shntnu Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gwaygenomics This wouldn't work when compartment is Cytoplasm or Nuclei, correct?

Copy link
Member

@gwaybio gwaybio Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally wrote this (see below), but then realized I am wrong. This may have been my original plan, but I do not remember 100%

you are correct, this will fail as currently implemented! This is because we're aggregating compartment by compartment. If we merge everything first and then aggregate, it should work. It seems like some version of this must have worked at one point (because the call on L72 would have errored out for other non-cell compartments).

Original thought that is wrong

If I'm remembering correctly, sc_type should be specified only when compartment = "cells" (here).

Since we're doing an inner join on L72, this should take care of filtering for other compartments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants