The hierarchical AutoGeneS #8

Open
Chuang1118 opened this issue Jul 21, 2021 · 5 comments

Comments

@Chuang1118

Dear Author,

Thanks for this new API.

In your paper, in the paragraph "Hierarchical optimization for highly correlated cell types", you describe the following approach:
"we ran AutoGeneS separated CD4+ and CD8+ T cells ......" as AutoGeneS*
I would like to run it on my data. Some cell types in my reference seem highly correlated, e.g. memory B vs. naive B cell subtypes.
Among the Pareto-optimal solutions with low correlation, I found very few markers.
My initial reference has about 100,000 cells and over 30 cell types. I regrouped some cell types to make the deconvolution easier, but it doesn't work very well.

Now I want to use AutoGeneS*. Would you share your code?

Very nice feature selection method using GA.
Thanks in advance
Chuang
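Editorial note: one way to see which cell types are "highly correlated" is to correlate their mean expression profiles. The sketch below is not part of AutoGeneS. It uses NumPy only, the data are simulated, and the function names and the 0.9 threshold are assumptions chosen for illustration.

```python
import numpy as np

def centroid_correlation(expr, labels):
    """Return (cell_types, corr), where corr[i, j] is the Pearson correlation
    between the mean expression profiles of cell types i and j."""
    labels = np.asarray(labels)
    cell_types = np.unique(labels)
    # One centroid (mean over cells) per cell type: shape (cell_types, genes)
    centroids = np.vstack([expr[labels == ct].mean(axis=0) for ct in cell_types])
    return cell_types, np.corrcoef(centroids)

def correlated_pairs(cell_types, corr, threshold=0.9):
    """List pairs of cell types whose centroid correlation exceeds threshold."""
    pairs = []
    for i in range(len(cell_types)):
        for j in range(i + 1, len(cell_types)):
            if corr[i, j] > threshold:
                pairs.append((str(cell_types[i]), str(cell_types[j]), float(corr[i, j])))
    return pairs

# Simulated data: 3 hypothetical cell types, 50 genes, 30 cells each
rng = np.random.default_rng(0)
base_b = rng.gamma(2.0, 2.0, size=50)
profiles = {
    "naive_B": base_b,
    "memory_B": base_b + rng.normal(0, 0.1, size=50),  # nearly identical to naive_B
    "T_cell": rng.gamma(2.0, 2.0, size=50),
}
expr = np.vstack([p + rng.normal(0, 0.05, size=(30, 50)) for p in profiles.values()])
labels = np.repeat(list(profiles.keys()), 30)

cell_types, corr = centroid_correlation(expr, labels)
pairs = correlated_pairs(cell_types, corr)
print(pairs)
```

Pairs flagged this way are candidates for either merging into one group or for a separate marker search restricted to those cell types.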

@lila167
Contributor

lila167 commented Jul 22, 2021

Hi Chuang,

Thanks for your interest.
30 cell types are quite a lot. If there is a high correlation between the cell types, no method can handle it. I recommend grouping as many sub-cell types as you can. Then run autogenes with different numbers of genes (300, 400, 500) and compare the results.
For AutoGeneS*, I ran autogenes on the correlated cell types (e.g. memory B and naive B) individually with very few genes (~10-20) and concatenated the selected genes with the results of autogenes applied to the whole dataset.
Does it make sense?

Best,
Hana
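Editorial note: the concatenation step described above can be sketched as a union of two gene selections, matched by gene name. This is only an illustration under assumptions: the masks and gene lists are made up, `union_selection` is a hypothetical helper, and the two selections are assumed to have been saved from separate runs (one on the whole reference, one restricted to the correlated cell types).

```python
import numpy as np

def union_selection(gene_names, global_mask, subset_genes):
    """Return a boolean mask over gene_names that keeps every gene selected
    on the whole dataset plus every gene selected for the correlated subset.
    Names in subset_genes that are absent from gene_names are ignored."""
    gene_names = np.asarray(gene_names)
    combined = np.asarray(global_mask, dtype=bool).copy()
    combined |= np.isin(gene_names, subset_genes)
    return combined

# Hypothetical example: 8 genes, 3 of them selected on the full reference
gene_names = np.array(["CD3E", "CD4", "CD8A", "CD19", "CD27", "IGHD", "MS4A1", "NKG7"])
global_mask = np.array([True, False, False, True, False, False, False, True])
# 2 genes selected in a separate run restricted to memory B vs. naive B
subset_genes = ["CD27", "IGHD"]

combined_mask = union_selection(gene_names, global_mask, subset_genes)
print(gene_names[combined_mask])
```

Matching by name rather than by position matters because the subset run may see a different gene ordering or a different set of genes than the run on the whole dataset.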

@Chuang1118
Author

Hello Hana,

Thank you for your suggestion, this makes sense to me.
After grouping, I only have broad cell types. In my opinion, every method can handle broad cell types, so this does not show the power of AutoGeneS.
I know AutoGeneS uses only 400 genes, which is a huge advantage, but right now I am interested in the quality of the deconvolution result, regardless of how many genes go into the regression (400 or 1000).

I observed that, among the Pareto solutions with low correlation, important biological markers whose mean expression differs strongly from the other cell types are lost when I increase the number of generations (e.g. from 5000 to 8000). I see this with the 20 cell types I want to predict.

I don't have sorted bulk or flow cytometry data to support this, and my results are not robust.
How can I validate the AutoGeneS predictions?
By adding synthetic bulk samples to the bulk dataset? Manually, or with a dedicated tool? Any suggestions?

For example, my starting point is the situation in the figure below, and I want to move towards finer B cell subtypes.
1/ Can I trust the AutoGeneS result at the starting point?
I would expect the result not to change much when I change the parameters, so that I can be sure it is robust. At the moment I observe that the nuSVR and nnls results are very different.
2/ If I trust the result at the starting point and continue ...., how can I validate each step?
3/ In which situation should I stop?
4/ Or should I just trust the AutoGeneS output, because AutoGeneS came out ahead in the benchmarking?

I am a beginner in deconvolution techniques. I don't know the limits of deconvolution tools: once we reach cell subtypes, must we stop?

I'm looking forward to your reply to my naive questions.
Best,
Chuang

image

image
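Editorial note: without sorted bulk or flow cytometry data, one common sanity check is to build synthetic bulk samples with known proportions and see how well they are recovered. The sketch below is not an AutoGeneS feature. The signature matrix is simulated, the function names are hypothetical, and `scipy.optimize.nnls` stands in for whichever regression model is being evaluated.

```python
import numpy as np
from scipy.optimize import nnls

def simulate_bulk(signature, n_samples, rng, noise_sd=0.05):
    """Mix cell-type profiles with known random proportions.
    signature: (genes, cell_types). Returns (bulk, true_props)."""
    n_genes, n_types = signature.shape
    true_props = rng.dirichlet(np.ones(n_types), size=n_samples)  # rows sum to 1
    bulk = true_props @ signature.T                                # (samples, genes)
    bulk = bulk + rng.normal(0.0, noise_sd, size=bulk.shape)
    return np.clip(bulk, 0.0, None), true_props

def deconvolve_nnls(signature, bulk):
    """Estimate proportions per sample with NNLS, normalised to sum to 1."""
    est = np.vstack([nnls(signature, sample)[0] for sample in bulk])
    return est / est.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
signature = rng.gamma(2.0, 2.0, size=(200, 4))  # toy signature: 200 genes, 4 cell types
bulk, true_props = simulate_bulk(signature, n_samples=20, rng=rng)
est_props = deconvolve_nnls(signature, bulk)

# Per-cell-type agreement between known and estimated proportions
rmse = np.sqrt(((est_props - true_props) ** 2).mean(axis=0))
corr = [np.corrcoef(est_props[:, k], true_props[:, k])[0, 1] for k in range(4)]
print("RMSE per cell type:", rmse)
print("Pearson r per cell type:", corr)
```

With real data, the synthetic bulk would be built from held-out single cells rather than from the signature itself, and the same check could be repeated at each level of the cell-type hierarchy to decide where the estimates stop being reliable.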

@Chuang1118
Author

Hello Hana,

I have a question about AutoGeneS*.
I don't know how to add additional genes.
After:

ag.optimize(ngen=5000,nfeatures=400,seed=0,mode='fixed')

Each Pareto solution is a set of 400 genes.
Then

ag.select(index=0)

For each Pareto index, the number of True values equals 400.
The result is stored in the ag object. I want to add 1 gene (e.g. CD79B) to get a set of 401 genes. How can I do this?

Best,
Chuang

@lila167
Contributor

lila167 commented Jul 29, 2021

Hi Chuang,

Unfortunately, we don't support adding genes to the deconvolution, but we will consider it for the future.
For the moment, you can run the regressions yourself after concatenating the genes selected by autogenes with your own genes. You can take a look at this code:
https://github.com/theislab/AutoGeneS/blob/master/autogenes/interface.py

Just search for nusvr and nnls.

Hope this helps
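Editorial note: the workaround above can be sketched roughly as follows. This is a minimal illustration and not the code in interface.py: the reference matrix, gene names and the helper functions `add_genes` and `deconvolve` are hypothetical, and only the NNLS variant is shown, using `scipy.optimize.nnls`.

```python
import numpy as np
from scipy.optimize import nnls

def add_genes(gene_names, mask, extra_genes):
    """Return a copy of the boolean selection mask with extra_genes switched on."""
    gene_names = np.asarray(gene_names)
    known = set(gene_names.tolist())
    missing = [g for g in extra_genes if g not in known]
    if missing:
        raise KeyError(f"genes not in reference: {missing}")
    return np.asarray(mask, dtype=bool) | np.isin(gene_names, extra_genes)

def deconvolve(centroids, bulk, mask):
    """centroids: (cell_types, genes); bulk: (samples, genes).
    Fits bulk ≈ proportions @ centroids on the selected genes with NNLS."""
    X = centroids[:, mask].T  # (selected genes, cell_types)
    coefs = np.vstack([nnls(X, sample[mask])[0] for sample in bulk])
    return coefs / coefs.sum(axis=1, keepdims=True)

# Toy reference: 3 cell types x 6 genes; "CD79B" is not in the initial selection
gene_names = np.array(["G1", "G2", "G3", "G4", "G5", "CD79B"])
centroids = np.array([[5.0, 1.0, 0.5, 2.0, 1.0, 0.2],
                      [1.0, 4.0, 0.5, 1.0, 3.0, 0.1],
                      [0.5, 0.5, 6.0, 1.0, 1.0, 7.0]])
mask = np.array([True, True, True, True, False, False])  # stand-in for a 400-gene mask

mask_plus = add_genes(gene_names, mask, ["CD79B"])
bulk = np.array([[0.5, 0.3, 0.2]]) @ centroids  # one noise-free mixture
props = deconvolve(centroids, bulk, mask_plus)
print(mask.sum(), "->", mask_plus.sum())
print(props)
```

Keeping the extended mask as a separate array, rather than relying on state stored inside the ag module, also avoids the selection being overwritten by a later optimization run.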

@kapoormuskan

Hi Hana,

Do you have documentation for AutoGeneS+?

I optimized my single-cell data of 13 cell types and found highly correlated cell types. I ran the optimization on those cell types separately, thus adding 10 more genes to the signature matrix. I am not sure how to deconvolve the bulk data now. I was trying something along these lines: [ag.AutoGeneS(data=signature_matrix_np), ag.deconvolve(numeric_bulk.T, model='nusvr')] but ag is picking up values from the new optimization, which is on 2 cell types only.
