Creating instrumental variable for mendelian randomisation - in Ubuntu #13

Open
rmgpanw opened this issue Aug 22, 2019 · 1 comment
Labels
bug (Something isn't working) · command-line (Stuff we do via command line / terminal interface) · *NIX (question related to UNIX-like OS)

Comments


rmgpanw commented Aug 22, 2019

Problem summary

Marker	Chrom	Pos	Allele1	Allele2	Ncases	Ncontrols	GC.Pvalue	Overall
rs12083781	1	796375	C	T	16144	17832	0.0829	+
rs75932129	1	796767	A	G	16144	17832	0.101	-
rs58013264	1	797440	C	T	16144	17832	0.0836	+
rs10900604	1	798400	G	A	16144	17832	0.0967	+
rs11240777	1	798959	A	G	16144	17832	0.102	+
rs61768212	1	801467	C	G	16144	17832	0.0846	+
rs7553096	1	802026	A	G	16144	17832	0.0787	+
rs10157494	1	802496	T	C	16144	17832	0.103	+
rs7526310	1	804759	T	C	16144	17832	0.0799	+

or download here: sample.txt

  • I would like to filter for only those SNPs with a 'GC.Pvalue' below the significance threshold, p < 0.00001

I have tried the following, which does not work:

awk -F '{if ( $8 = 0.0000.) print $0 }' sample.txt

I get 'syntax error' when I run this with the sample.txt file

Could someone please advise me on how to improve on this? Thanks!

alhenry transferred this issue from ucl-ihi/Practicals Aug 22, 2019
Member

alhenry commented Aug 22, 2019

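I think there are a few things that caused the syntax error in the initial code:

  • The -F (field separator) option needs to be followed by an argument. As written, awk takes the quoted program as the separator and then tries to run sample.txt as the program. I think the option can be omitted here, as the default separator is whitespace, so tab / space should be fine.
  • = is the assignment operator; the comparison should use < (or == for "equal to", but that's not what we want)
  • 0.0000. is not a valid number; it should be 0.00001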
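
As a rough sketch, applying only those three fixes to the original command while keeping its if/print form could look like the snippet below. The explicit tab separator is an assumption based on the sample being tab-delimited, and `sample_head.txt` is a hypothetical file name holding just the header and first two rows of the sample so that the snippet runs on its own.

```shell
# Recreate the header and the first two rows of the sample, tab-separated.
printf 'Marker\tChrom\tPos\tAllele1\tAllele2\tNcases\tNcontrols\tGC.Pvalue\tOverall\n' > sample_head.txt
printf 'rs12083781\t1\t796375\tC\tT\t16144\t17832\t0.0829\t+\n' >> sample_head.txt
printf 'rs75932129\t1\t796767\tA\tG\t16144\t17832\t0.101\t-\n' >> sample_head.txt

# Original structure with the three fixes applied:
# -F now has an argument, = became <, and the threshold is a valid number.
strict=$(awk -F '\t' '{ if ($8 < 0.00001) print $0 }' sample_head.txt)

# Neither row is below 0.00001, so nothing is captured.
echo "rows below 0.00001: [$strict]"
```

The `{ if (...) print $0 }` form and a bare condition behave the same way, because printing the line is awk's default action when a pattern matches.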

The following should work:
awk '$8 < 0.00001' sample.txt

(It won't show any output, as there is no variant with P < 0.00001 in the sample, so to check you can change the threshold, e.g. awk '$8 < 0.1' sample.txt)

If you want to preserve the header: awk 'NR==1 || $8 < 0.1' sample.txt
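
To try this end to end, here is a minimal sketch that rebuilds the nine sample rows, keeps the header, and passes the cutoff in as a variable. The file names `sample_demo.txt` and `filtered.txt` and the variable names `threshold`, `p` and `n_hits` are made up for illustration; `-v` is the standard awk option for setting a variable from the shell.

```shell
# Rebuild the sample from the issue; tr turns the spaces typed here into tabs.
tr ' ' '\t' > sample_demo.txt <<'EOF'
Marker Chrom Pos Allele1 Allele2 Ncases Ncontrols GC.Pvalue Overall
rs12083781 1 796375 C T 16144 17832 0.0829 +
rs75932129 1 796767 A G 16144 17832 0.101 -
rs58013264 1 797440 C T 16144 17832 0.0836 +
rs10900604 1 798400 G A 16144 17832 0.0967 +
rs11240777 1 798959 A G 16144 17832 0.102 +
rs61768212 1 801467 C G 16144 17832 0.0846 +
rs7553096 1 802026 A G 16144 17832 0.0787 +
rs10157494 1 802496 T C 16144 17832 0.103 +
rs7526310 1 804759 T C 16144 17832 0.0799 +
EOF

# Pass the cutoff in with -v so the awk program itself never changes.
# NR == 1 keeps the header; $8 + 0 forces a numeric comparison on the p-value.
threshold=0.1
awk -F '\t' -v p="$threshold" 'NR == 1 || ($8 + 0) < p' sample_demo.txt > filtered.txt

# Count the data rows that passed (everything after the header).
n_hits=$(awk 'NR > 1' filtered.txt | wc -l | tr -d ' ')
echo "$n_hits SNPs with GC.Pvalue < $threshold"
```

On this sample the last line prints `6 SNPs with GC.Pvalue < 0.1`. Setting `threshold=0.00001` gives the filter asked for in the question, which leaves only the header in `filtered.txt` for these nine rows.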

alhenry added the *NIX, bug, and command-line labels Aug 22, 2019