Loading required namespace: GenomicFiles
Using local VCF.
File already tabix-indexed.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: ieu-b-4854
Constructing ScanVcfParam object.
VCF contains: 9,703,526 variant(s) x 1 sample(s)
Reading VCF file: multi-threaded (4 threads)
Renaming ID as SNP.
VCF file has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER ES LP SE SS P
Summary statistics report:
   - 9,703,526 rows
   - 9,544,931 unique variants
   - 6 genome-wide significant variants (P<5e-8)
   - 22 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Inferring genome build.
Loading SNPlocs data.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 10,000 SNPs using BSgenome::snpsById...
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
    IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
    Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
    expand.grid, I, unname
BSgenome::snpsById done in 116 seconds.
Loading SNPlocs data.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 10,000 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 93 seconds.
Inferred genome build: GRCH37
Checking SNP RSIDs.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/snp_missing_rs.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/snp_multi_colon.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
957 SNP IDs appear to be made up of chr:bp, these will be replaced by their SNP ID from the reference genome
Loading SNPlocs data.
Found Indels. These won't be checked against the reference genome as it does not contain Indels.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/snp_not_found_from_bp_chr.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 9,544,265 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 233 seconds.
Found 1,121,020 Indels. These won't be checked against the reference genome as it does not contain Indels.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for correct direction of A1 (reference) and A2 (alternative allele).
There are 113 SNPs where neither A1 nor A2 match the reference genome. These will be removed.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/alleles_dont_match_ref_gen.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
There are 6,976 SNPs where A1 doesn't match the reference genome. These will be flipped with their effect columns.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Ensuring that the N column is all integers.
The sumstats N column is not all integers, this could effect downstream analysis. These will be converted to integers.
Checking for duplicate SNPs from SNP ID.
Found 15,597 Indels. These won't be checked for duplicates based on RS ID as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
3 RSIDs are duplicated in the sumstats file. These duplicates will be removed
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/dup_snp_id.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Checking for SNPs with duplicated base-pair positions.
Found 15,597 Indels. These won't be checked for duplicates based on base-pair position as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
189,267 SNPs are non-biallelic. These will be removed.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/logs/snp_bi_allelic.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Computing Z-score from P using formula: `sign(BETA)*sqrt(stats::qchisq(P,1,lower=FALSE)`
N already exists within sumstats_dt.
Sorting coordinates with 'data.table'.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /rds/general/project/neurogenomics-lab/ephemeral/MAGMA_Files_Public/data/GWAS_munged/ieu-b-4854/ieu-b-4854.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Summary statistics report:
   - 7,469,735 rows (77% of original 9,703,526 rows)
   - 7,469,017 unique variants
   - 6 genome-wide significant variants (P<5e-8)
   - 22 chromosomes
Done munging in 42.47 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
           SNP CHR    BP A1 A2   END FILTER    BETA       LP     SE     N           P          Z
1: rs575272151   1 11008  C  G 11008   PASS  0.0064 0.166152 0.0157 37405 0.682099922  0.4095993
2: rs544419019   1 11012  C  G 11012   PASS  0.0064 0.166152 0.0157 37405 0.682099922  0.4095993
3: rs540538026   1 13110  G  A 13110   PASS -0.0185 0.433680 0.0206 37405 0.368400321 -0.8994738
4:  rs62635286   1 13116  T  G 13116   PASS  0.0340 2.604320 0.0113 37405 0.002487024  3.0249157
5:  rs62028691   1 13118  A  G 13118   PASS  0.0340 2.604320 0.0113 37405 0.002487024  3.0249157
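
Note: the P and Z columns in the preview can be cross-checked against the two conversions the log reports (LP is -log10 P, and Z is derived from P with the sign of BETA). The following is a minimal sketch in Python rather than R, and is not MungeSumstats code. It assumes the identity sqrt(qchisq(P, 1, lower=FALSE)) = qnorm(1 - P/2), which holds for a chi-square with 1 degree of freedom; the function names lp_to_p and z_from_p are hypothetical helpers introduced only for this check.

```python
from statistics import NormalDist


def lp_to_p(lp):
    """Convert a -log10 p-value (the LP column) to an unadjusted p-value."""
    return 10.0 ** (-lp)


def z_from_p(beta, p):
    """Two-sided Z-score from P, signed by BETA.

    Uses qnorm(1 - P/2) in place of sqrt(qchisq(P, 1, lower=FALSE));
    the two are equal for 1 degree of freedom. For very small P the
    subtraction 1 - P/2 loses precision, so this is only a sanity check.
    """
    magnitude = NormalDist().inv_cdf(1.0 - p / 2.0)
    if beta > 0:
        return magnitude
    if beta < 0:
        return -magnitude
    return 0.0  # mirrors sign(0) == 0 in R


# (BETA, LP, logged P, logged Z) copied from the distinct preview rows.
preview = [
    (0.0064, 0.166152, 0.682099922, 0.4095993),
    (-0.0185, 0.433680, 0.368400321, -0.8994738),
    (0.0340, 2.604320, 0.002487024, 3.0249157),
]

for beta, lp, p_logged, z_logged in preview:
    p = lp_to_p(lp)
    z = z_from_p(beta, p)
    print(f"LP={lp:.6f}  P={p:.9f} (log: {p_logged})  Z={z:.7f} (log: {z_logged})")
```

Small differences in the last digits are expected, because the LP values shown in the preview are rounded to six decimal places.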