chr1 | 779600 | 780954 |
chr1 | 874824 | 876507 |
chr1 | 1147745 | 1148979 |
chr1 | 1576270 | 1576887 |
chr1 | 2325039 | 2326135 |
chr1 | 2429436 | 2430692 |
$ R --vanilla < file.R
$ ceas -g gdb -b bed -w wig
$ ceas [options] -g gdb -b bed
$ ceas [options] -g gdb -w wig
$ ceas [options] --bg -g gdb -b bed -w wig
--version | Show program's version number and exit. |
-h, --help | Show this help message and exit. |
-b, --bed | BED file with ChIP regions. |
-w, --wig | WIG file for either wig profiling or genome background annotation. WARNING: CEAS accepts fixedStep and variableStep WIG files. The user must set the --bg flag for genome background annotation. |
-e, --ebed | BED file of extra regions of interest (e.g. non-coding regions). |
-g, --gdb | Gene annotation table file (e.g. a refGene table in sqlite3 db format provided through the CEAS web, http://liulab.dfci.harvard.edu/CEAS/download.html). If the sqlite3 file does not have the genome background annotation, the user must turn on --bg and have an input WIG file. |
--name | Experiment name. This will be used to name the output files (R script, PDF file and XLS file). If an experiment name is not given, the stem of the input BED file name will be used instead. (e.g. if BED is peaks.bed, 'peaks' will be used as a name.) If a BED file is not given, the input WIG file name will be used. |
--sizes | Promoter (also downstream) sizes for ChIP region annotation. Three comma-separated integers or a single integer will be accepted. If a single integer is given, it will be segmented into three equal fractions (e.g. 3000 is equivalent to 1000,2000,3000). DEFAULT: 1000,2000,3000. WARNING: numbers > 10000bp are automatically fixed to 10000bp. |
--bisizes | Bidirectional-promoter sizes for ChIP region annotation. The user can choose two numbers to define bidirectional promoters. Two comma-separated values or a single value can be given. If a single value is given, it will be segmented into two equal fractions (e.g. 5000 is equivalent to 2500,5000). DEFAULT: 2500,5000bp. WARNING: numbers > 20000bp are automatically fixed to 20000bp. |
--bg | Run genome BG annotation. WARNING: this flag is effective only if a WIG file is given through -w (--wig). Otherwise, ignored. |
--span | Span from TSS and TTS in the gene-centered annotation. ChIP regions within this range from TSS and TTS are considered when calculating the coverage rates of promoter and downstream by ChIP regions. DEFAULT=3000bp |
--pf-res | Wig profiling resolution, DEFAULT: 50bp. WARNING: a number smaller than the wig interval (resolution) may cause aliasing error. |
--rel-dist | Relative distance to TSS/TTS in wig profiling. DEFAULT: 3000bp |
--gn-groups | Gene groups of particular interest in wig profiling. Each gene group file must have gene names in the 1st column. The file names are separated by commas with no spaces (e.g. --gn-groups=top10.txt,bottom10.txt). |
--gn-group-names | The names of the gene groups in --gn-groups. The gene group names are separated by commas (e.g. --gn-group-names='top 10%,bottom 10%'). These group names appear in the legends of the wig profiling plots. If no group names are given, the groups are labeled 'Group 1, Group 2, ..., Group n'. |
--gname2 | Whether or not to use the 'name2' column of the gene annotation table when reading the gene IDs in the files given through --gn-groups. This flag is meaningful only with --gn-groups. |
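The single-value shorthand for --sizes and --bisizes can be illustrated with a short Python sketch. It is not taken from the CEAS source: the function name `expand_sizes` is hypothetical, and applying the upper limit to each number after segmentation is an assumption, since the table does not say in which order CEAS performs the two steps.

```python
def expand_sizes(value, parts, cap):
    """Expand a --sizes / --bisizes style string into a list of integers.

    value: the option string, e.g. "3000" or "1000,2000,3000"
    parts: how many numbers the option expects (3 for --sizes, 2 for --bisizes)
    cap:   upper limit in bp (10000 for --sizes, 20000 for --bisizes)
    """
    numbers = [int(token) for token in value.split(",")]
    if len(numbers) == 1:
        # A single value is segmented into `parts` equal fractions.
        total = numbers[0]
        numbers = [total * (i + 1) // parts for i in range(parts)]
    elif len(numbers) != parts:
        raise ValueError(
            f"expected 1 or {parts} comma-separated integers, got {len(numbers)}"
        )
    # Assumption: the cap is applied to each number individually.
    return [min(n, cap) for n in numbers]


print(expand_sizes("3000", parts=3, cap=10000))
print(expand_sizes("5000", parts=2, cap=20000))
```

The two calls reproduce the equivalences given in the table: 3000 becomes 1000,2000,3000 and 5000 becomes 2500,5000.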
$ ceas --name=H3K36me3_ceas --pf-res=20 --gn-groups=top10.txt,bottom10.txt --gn-group-names='Top 10%,Bottom 10%' -g ./hg18/refGene -b H3K36me3_MACS_pval1e-5_peaks.bed -w H3K36me3.wig
Field | Description |
chr | Chromosome of a RefSeq gene |
txStart | Transcription starting site (TSS) of a RefSeq gene |
txEnd | Transcription terminating site (TTS) of a RefSeq gene |
strand | Strand of a RefSeq Gene |
dist u txStart | Distance to the nearest ChIP region (center) upstream of txStart (bp) |
dist d txStart | Distance to the nearest ChIP region (center) downstream of txStart (bp) |
dist u txEnd | Distance to the nearest ChIP region (center) upstream of txEnd (bp) |
dist d txEnd | Distance to the nearest ChIP region (center) downstream of txEnd (bp) |
3kb u txStart | Occupancy rate of ChIP regions in 3kb upstream of txStart (0.0 - 1.0) |
3kb d txStart | Occupancy rate of ChIP regions in 3kb downstream of txStart (0.0 - 1.0) |
1/3 gene | Occupancy rate of ChIP regions in the 1st third of a gene (0.0 - 1.0) |
2/3 gene | Occupancy rate of ChIP regions in the 2nd third of a gene (0.0 - 1.0) |
3/3 gene | Occupancy rate of ChIP regions in the 3rd third of a gene (0.0 - 1.0) |
3kb d txEnd | Occupancy rate of ChIP regions in 3kb downstream of txEnd (0.0 - 1.0) |
exons | Occupancy rate of ChIP regions in the exons (0.0 - 1.0) |
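As a rough illustration of what an occupancy rate between 0.0 and 1.0 can mean, the Python sketch below computes the fraction of a window covered by the union of ChIP regions. This is an assumed definition for illustration only, not the CEAS implementation: the function name `occupancy_rate` is hypothetical, coordinates are treated as half-open intervals, and overlapping regions are counted once.

```python
def occupancy_rate(window_start, window_end, regions):
    """Fraction of [window_start, window_end) covered by the union of regions.

    regions: iterable of (start, end) pairs on the same chromosome.
    """
    length = window_end - window_start
    if length <= 0:
        return 0.0
    # Keep only regions that overlap the window, clipped to its bounds.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in regions
        if start < window_end and end > window_start
    )
    covered = 0
    cursor = window_start  # right edge of the coverage counted so far
    for start, end in clipped:
        if end > cursor:
            covered += end - max(start, cursor)
            cursor = end
    return covered / length


# Hypothetical gene with txStart = 4000 on the + strand: the 3kb upstream
# window is [1000, 4000).
chip_regions = [(500, 1500), (1200, 2000), (3900, 5000)]
print(occupancy_rate(1000, 4000, chip_regions))
```

Under these assumptions the three regions cover 1100 of the 3000 bases in the window, so the rate is about 0.37.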
$ ceas --name=SDC3_ceas --bg --gname2 --pf-res=86 --gn-groups=ce_top10.txt,ce_middle10.txt,ce_bottom10.txt --gn-group-names='Top 10%,Middle 10%,Bottom 10%' -g ./ce4/refGene -b SDC3_MA2C_peaks.bed -w SDC3_MA2Cscore.wig -e nc_regions.bed
$ build_genomeBG [options] -d db -g gt -w wig -o ot
$ build_genomeBG -d ce4 -g sangerGene -w SDC3_MA2Cscore.wig -o sangerGene
--version | Show program's version number and exit. |
-h, --help | Show this help message and exit. |
-d, --db | Genome of UCSC (e.g. hg18). If -d (--db) is not given, this script searches for a local sqlite3 file referenced by -g (--gt). WARNING: MySQLdb must be installed to use the UCSC tables. If an existing local sqlite3 file is opened, its previous tables will be reset. |
-g, --gt | Name of the gene annotation table (or local sqlite3 file) (e.g. refGene or knownGene). If -d (--db) is given, build_genomeBG will connect to UCSC and download the specified gene table. Otherwise, build_genomeBG searches for a local sqlite3 file with that name. |
-w, --wig | WIG file needed to obtain genome locations in BG annotation. variableStep and fixedStep WIG files are accepted. |
-o, --ot | Output sqlite3 db file name. The gene annotation table read from the local sqlite3 file or UCSC DB will be saved in a table named 'GeneTable', and the computed genome bg annotation will be saved in two tables named 'GenomeBGS' and 'GenomeBGP'. If this option is not given, this script generates a sqlite3 file with the same name as given through -g (--gt). WARNING: When an existing local sqlite3 file is opened and saved under the same name, the tables in the file will be overwritten. |
--promoter | Maximum promoter size to consider for genome bg annotation. This must be >= 1000bp. Any value less than 1000bp will be set to 1000bp. DEFAULT: 10000bp |
--bipromoter | Maximum bidirectional promoter size to consider for genome bg annotation. This must be >= 1000bp. Any value less than 1000bp will be set to 1000bp. DEFAULT: 20000bp |
--downstream | Maximum immediate downstream size to consider for genome bg annotation. This must be >= 1000bp. Any value less than 1000bp will be set to 1000bp. DEFAULT: 10000bp |
--binsize | Binsize with which to bin promoter, bidirectional promoter, and immediate downstream sizes. In each bin, the percentage of genome will be calculated. DEFAULT=1000bp |
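After running build_genomeBG, one way to confirm that the output file contains the three tables named under -o (--ot) is to list its tables with Python's standard sqlite3 module. This is a minimal sketch: the helper name `missing_tables` is hypothetical, and only the table names come from the option description above.

```python
import sqlite3

# Table names as given in the -o (--ot) description.
EXPECTED_TABLES = {"GeneTable", "GenomeBGS", "GenomeBGP"}


def missing_tables(conn):
    """Return the expected tables that are absent from an open sqlite3 connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    present = {name for (name,) in rows}
    return EXPECTED_TABLES - present


# Example use, assuming the output file from the command above is named
# 'sangerGene'. Note that sqlite3.connect() creates an empty file if the
# path does not exist, so check the path first.
#
#   conn = sqlite3.connect("sangerGene")
#   print(missing_tables(conn) or "all expected tables present")
#   conn.close()
```

An empty result means the gene table and both genome background tables are present; it says nothing about whether their contents are complete.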