Boris Lenhard

b.lenhard at imperial.ac.uk

Computational Regulatory Genomics, MRC Clinical Sciences Centre, United Kingdom

Introduction

TFBS Perl OO modules implement classes for the representation of objects encountered in the analysis of protein-binding sites in DNA sequences. The objects defined by TFBS classes include:

The modules within the TFBS set are fully integrated and compatible with Bioperl.

Download

The current release of TFBS is 0.7.1 (Mar 7, 2017). It has been tested on Linux and MacOS with perl 5.20.2. The tarball is here: TFBS-0.7.1.tar.gz.

Citing TFBS

If you use TFBS in your work, please cite:

Lenhard B., Wasserman W.W. (2002) TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics 18:1135-1136

Git repository

To check out the latest development snapshot of TFBS, do

git clone https://github.com/ComputationalRegulatoryGenomicsICL/TFBS.git

Recent changes

Changes in 0.7.1:

Changes in 0.7.0:

Changes in 0.6.1:

Changes in 0.6.0:

Changes in 0.5.0:

Changes in 0.4.1:

Changes in 0.4.0:

Installation

The installation procedure is fairly standard:

$ tar xvfz TFBS-0.7.1.tar.gz
$ cd TFBS-0.7.1
$ perl Makefile.PL

At this point you will be asked for MySQL server access information, which is needed for testing the TFBS::DB::JASPAR6 module. If you do not have write access to a MySQL server, just answer ‘no’ to the first question.

$ make

TFBS contains a perlxs extension, which is a (currently quick-and-dirty) adaptation of pwm_search, a short C program by James Fickett and Wyeth Wasserman for searching a DNA sequence against a position weight matrix. It is included for performance reasons. (For developers: there is also a currently undocumented way to make TFBS::Matrix::PWM’s search methods work without the extension. For details, contact the author, or wait for more extensive documentation of the TFBS internals to appear. The latter is not recommended :) )

$ make test

The test suite is not omnipotent. For access to TRANSFAC, TFBS::DB::TRANSFAC assumes that an Internet connection is present and that no proxy is required. The test of TFBS::PatternGen::Gibbs is skipped if the Gibbs executable is not found in the PATH.
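To see in advance whether the Gibbs test will run, you can check the PATH yourself. The following is a minimal sketch, not part of TFBS: the helper name have_exe is made up for illustration, and the executable name Gibbs is taken from the paragraph above.

```shell
# Hypothetical helper (not shipped with TFBS):
# returns 0 if the named executable is found on PATH.
have_exe() {
    command -v "$1" >/dev/null 2>&1
}

if have_exe Gibbs; then
    echo "Gibbs found at: $(command -v Gibbs)"
else
    echo "Gibbs not on PATH; the TFBS::PatternGen::Gibbs test will be skipped"
fi
```

If your Gibbs binary is installed under a different name or outside the PATH, this check will report it as missing even though it is present on the system.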

$ su
$ make install

Any questions? Write to b.lenhard at imperial.ac.uk.

Dependencies

Absolutely required

  • Perl 5.10.0 or later
  • bioperl 1.0 or newer
  • PDL 1.1 or later (Note for Linux users: PDL is available as an RPM package for most major Linux distributions. Since some TFBS testers were severely frustrated by problems they encountered compiling PDL, I recommend the use of binary RPMs where possible. Solaris users should upgrade to perl 5.10 and compile it without thread support for PDL or database connectivity to work. These issues are unrelated to TFBS code.)

Note for RedHat 9 users: RedHat 9 is badly broken in several important respects. (1) The PDL installed from the RPM package shipped with RedHat 9 issues “Possible precedence problem” warnings (probably harmless). (2) Some users have had trouble compiling PDL from CPAN. If you try to install PDL from the CPAN shell and get the warning “I could not locate your pod2man program…” and the error “Makefile:93: *** missing separator.”, you should unset your $LANG environment variable before starting the CPAN shell:

$ unset LANG 

The above is strictly a RedHat configuration issue, and is unrelated to TFBS code.

Optional

  • GD 1.3 or later (only required by TFBS::Matrix::ICM for drawing sequence logos)
  • DBI and DBD::mysql modules, as well as access to a MySQL server (only required for storage and retrieval of matrix objects in a MySQL database by TFBS::DB::JASPAR2)
  • Gibbs, a program by the group of C.L. Lawrence for matrix pattern generation from a set of nucleotide sequences (only required by TFBS::PatternGen::Gibbs module); write to Dr. Lawrence to obtain a copy
  • ELPH, a Gibbs sampler from TIGR
  • MEME, a popular program for pattern discovery, based on an EM algorithm

Bioperl, GD, DBI, DBD::mysql and PDL are also available from CPAN.

Example scripts

Here are two very simple code snippets that demonstrate some of the TFBS functionality.

The following two somewhat longer scripts have a fully functional command-line interface and annotated source code. Those who want to learn how to use TFBS are advised to study their code:

And finally, a simple CGI script:

Documentation (POD)

From here you can access the POD documentation for the modules. It is still far from perfect, but I think it is enough for a start. (Internal modules and internal methods are not yet documented.)

COMPLETE MODULE DOCUMENTATION