Foldtree

This is the documentation for foldtree, a combination of utility functions and a snakemake workflow for building trees from AlphaFold structures.

Installation

To use foldtree, first install snakemake: https://snakemake.readthedocs.io/en/stable/getting_started/installation.html

Now we can clone the repo and create a conda environment with the software needed to run foldtree.

Now we’re ready to run the pipeline. For most users, the fold_tree pipeline should be sufficient. You can set up a fold_tree run by creating a folder for your output.

Now we can add an identifiers.txt file containing the UniProt identifiers of all the proteins we would like to include in the tree.
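As a minimal sketch of this step, the helper below creates the run folder and writes the identifiers file. The function name write_identifiers is hypothetical and not part of foldtree, and the one-accession-per-line layout is an assumption about what the pipeline expects; check the test dataset in the repo for the authoritative format.

```python
from pathlib import Path


def write_identifiers(run_folder, uniprot_ids):
    """Create run_folder if needed and write identifiers.txt inside it.

    Hypothetical helper, not part of foldtree. Assumes the pipeline reads
    one UniProt accession per line.
    """
    folder = Path(run_folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Strip whitespace, drop empty entries, and remove duplicates
    # while keeping the order in which accessions were given.
    cleaned = list(dict.fromkeys(acc.strip() for acc in uniprot_ids if acc.strip()))

    target = folder / "identifiers.txt"
    target.write_text("".join(f"{acc}\n" for acc in cleaned))
    return target
```

For instance, calling write_identifiers("./myfam", ["P69905", "P68871"]) would create ./myfam/identifiers.txt with those two accessions (human hemoglobin alpha and beta, used here purely as an illustration).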

Alternatively, we can run the pipeline on our own set of structures. Please note that discontinuities or other defects in the PDB files may adversely affect results. Let’s make a structure directory and add some PDB files to it. In this case the identifiers file is left blank.
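The custom-structure setup can be sketched in Python as follows. The helper name stage_custom_structures is hypothetical, and the structure directory name "structs" is an assumption that is not stated in this document, so confirm the name the workflow expects before relying on it.

```python
import shutil
from pathlib import Path


def stage_custom_structures(run_folder, pdb_files, structs_dirname="structs"):
    """Copy PDB files into a structure directory and create a blank identifiers file.

    Hypothetical helper, not part of foldtree. The default directory name
    "structs" is an assumption about what the workflow looks for.
    """
    folder = Path(run_folder)
    structs = folder / structs_dirname
    structs.mkdir(parents=True, exist_ok=True)

    copied = []
    for pdb in map(Path, pdb_files):
        # Skip anything that is not a .pdb file.
        if pdb.suffix.lower() != ".pdb":
            continue
        copied.append(Path(shutil.copy(pdb, structs / pdb.name)))

    # With custom structures the identifiers file stays empty.
    (folder / "identifiers.txt").write_text("")
    return copied
```

The function returns the paths of the copied files, which makes it easy to check that every structure you intended to include was actually staged.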

Now we’re ready to build our trees. Let’s run the pipeline.

Usage

To run the snakemake workflow on the test dataset, try the command below. You can change the folder variable to the location of your data.

$ snakemake --cores 4 --use-conda -s ./workflow/fold_tree --config folder=./testdata filter=False customstructs=False

Or, if you are using a Slurm cluster, you can use the slurm profile:

$ snakemake --cores 4 --use-conda -s ./workflow/fold_tree --config folder=./testdata filter=False customstructs=False --profile slurmsimple

The fold_tree workflow builds trees from the proteins listed by UniProt identifier in the identifiers.txt file in the input folder.

To use custom structures, leave the identifiers file blank and set the customstructs variable to True.

$ snakemake --cores 4 --use-conda -s ./workflow/fold_tree --config folder=./myfam filter=False customstructs=True --profile slurmsimple
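For scripted or repeated runs, the commands in this section can be assembled programmatically. The sketch below is a hypothetical wrapper (build_foldtree_command is not part of foldtree) that uses only the flags and config keys shown above: folder, filter, customstructs, and an optional profile.

```python
import shlex


def build_foldtree_command(folder, cores=4, filter_structs=False,
                           custom_structs=False, profile=None):
    """Return the snakemake invocation for the fold_tree workflow as an argument list.

    Hypothetical wrapper: it only mirrors the flags and config keys shown
    in the commands above and adds no new options.
    """
    cmd = [
        "snakemake",
        "--cores", str(cores),
        "--use-conda",
        "-s", "./workflow/fold_tree",
        "--config",
        f"folder={folder}",
        f"filter={filter_structs}",          # rendered as True or False
        f"customstructs={custom_structs}",
    ]
    if profile:
        # For example "slurmsimple" when running on a Slurm cluster.
        cmd += ["--profile", profile]
    return cmd


if __name__ == "__main__":
    # Print the command line for a custom-structure run on a Slurm cluster.
    print(shlex.join(build_foldtree_command("./myfam", custom_structs=True,
                                            profile="slurmsimple")))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root. Building a list rather than a single string avoids shell quoting problems when the folder path contains spaces.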

To use the foldtree utility functions in your own work, first install the repo as a Python library.

$ git clone
$ cd foldtree
$ pip install -e .

Then import the libraries in your script or notebook:

from foldtree.src import foldseek2tree
from foldtree.src import AFDBtools
from foldtree.src import treescore


# use the functions somehow.
# comments/help are provided in the code

Examples of how to use the different functions are also provided in the notebooks folder.

Troubleshooting

If you encounter any issues while using foldtree, please file a bug report on our GitHub repository: https://github.com/DessimozLab/fold_tree/issues

Credits

This project was created by Dave Moi, Yannis Nevers and Charles Bernard at DessimozLab (DBC at the University of Lausanne).