foldseek2tree

Reference documentation drawn from the function docstrings.

src.foldseek2tree.consensustree(treelist)

get a consensus tree from a list of tree files

Parameters:

treelist (list) – list of tree files

src.foldseek2tree.MDS_smooth(distmat)

src.foldseek2tree.Tajima_dist(kn_ratio, bfactor=0.95, iter=100)

src.foldseek2tree.simple_logdist(kn_ratio, bfactor=0.95, iter=100)

src.foldseek2tree.runargs(args)

run a command line command

Parameters:

args (str) – command line command
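
As an illustration of running a command line command from a string, here is a minimal sketch. It is not the body of runargs: the helper name run_command is hypothetical, and the choice of shlex.split plus subprocess.run (rather than a shell invocation) is an assumption made for this example.

```python
import shlex
import subprocess
import sys


def run_command(command: str) -> str:
    """Run a command string and return its standard output (hypothetical helper)."""
    # shlex.split tokenises the string the way a POSIX shell would,
    # so the command can run without shell=True.
    tokens = shlex.split(command)
    # check=True raises CalledProcessError if the command exits with a non-zero status.
    result = subprocess.run(tokens, capture_output=True, text=True, check=True)
    return result.stdout


# The current Python interpreter stands in for an external tool such as foldseek.
output = run_command(f"{shlex.quote(sys.executable)} -c \"print('ok')\"")
```

Raising on a non-zero exit status makes a failed external step visible immediately instead of letting later steps read missing output files.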

src.foldseek2tree.runFoldseekdb(folder, outfolder, foldseekpath='../foldseek/bin/foldseek')

run foldseek createdb

Parameters:
  • folder (str) – path to folder with pdb files

  • outfolder (str) – path to output folder

  • foldseekpath (str) – path to foldseek binary

src.foldseek2tree.runFoldseek_allvall(structfolder, outfolder, foldseekpath='../foldseek/bin/foldseek', maxseqs=3000)

run foldseek search and createtsv

Parameters:
  • structfolder (str) – path to foldseek database (named dbpath in the docstring)

  • outfolder (str) – path to output folder

  • maxseqs (int) – maximum number of sequences to compare to

src.foldseek2tree.runFoldseek_allvall_EZsearch(infolder, outpath, foldseekpath='../foldseek/bin/foldseek')

run foldseek easy-search

Parameters:
  • infolder (str) – path to folder with pdb files

  • outpath (str) – path to output folder

  • foldseekpath (str) – path to foldseek binary
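
The sketch below shows how an all-vs-all easy-search command could be assembled from these three parameters. It only builds the argument list and does not run anything. The helper name build_easy_search_command, the result file name allvall.tsv and the scratch directory name tmp are assumptions for illustration; the actual function may name its outputs differently and pass extra options.

```python
from pathlib import Path


def build_easy_search_command(infolder, outpath,
                              foldseekpath="../foldseek/bin/foldseek"):
    """Build (but do not run) a foldseek easy-search command (hypothetical helper)."""
    # Assumption: these file and directory names are illustrative only.
    result_file = Path(outpath) / "allvall.tsv"
    tmp_dir = Path(outpath) / "tmp"
    # easy-search takes a query, a target, a result file and a temporary directory.
    # Using the same folder as query and target compares every structure to every other.
    return [
        foldseekpath,
        "easy-search",
        str(infolder),
        str(infolder),
        str(result_file),
        str(tmp_dir),
    ]


cmd = build_easy_search_command("structs", "out")
```

Keeping the command as a list of tokens avoids quoting problems when folder names contain spaces.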

src.foldseek2tree.kernelfun(AA, BB, AB)

src.foldseek2tree.runFastme(fastmepath, clusterfile)

run fastme

Parameters:
  • fastmepath (str) – path to fastme binary

  • clusterfile (str) – path to all-vs-all distance matrix in fastme format

src.foldseek2tree.runQuicktree(clusterfile, quicktreepath='quicktree')

run quicktree

Parameters:
  • clusterfile (str) – path to all-vs-all distance matrix in fastme format

  • quicktreepath (str) – path to quicktree binary

src.foldseek2tree.distmat_to_txt(identifiers, distmat, outfile)

write out a distance matrix in fastme format

Parameters:
  • identifiers (list) – list of identifiers for your proteins

  • distmat (np.array) – distance matrix

  • outfile (str) – path to output file
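
To make the output layout concrete, here is a minimal sketch of writing a square distance matrix as text, assuming the PHYLIP-style layout that FastME reads: the number of taxa on the first line, then one row per identifier. The helper name write_phylip_distmat and the six-decimal formatting are assumptions, not taken from the library. Plain nested lists are used so the example has no dependencies; a NumPy array iterates row by row in the same way.

```python
import os
import tempfile


def write_phylip_distmat(identifiers, distmat, outfile):
    """Write a square distance matrix in a PHYLIP-style layout (hypothetical helper)."""
    n = len(identifiers)
    rows = [list(row) for row in distmat]
    # The matrix must have one row and one column per identifier.
    if len(rows) != n or any(len(row) != n for row in rows):
        raise ValueError("distmat must be square and match the number of identifiers")
    with open(outfile, "w") as handle:
        # First line: number of taxa.
        handle.write(f"{n}\n")
        # Following lines: identifier, then its distances to every taxon.
        for name, row in zip(identifiers, rows):
            values = " ".join(f"{float(value):.6f}" for value in row)
            handle.write(f"{name} {values}\n")
    return outfile


ids = ["protA", "protB", "protC"]
dist = [[0.0, 0.2, 0.5], [0.2, 0.0, 0.4], [0.5, 0.4, 0.0]]
path = os.path.join(tempfile.mkdtemp(), "distmat.txt")
write_phylip_distmat(ids, dist, path)
with open(path) as handle:
    lines = handle.read().splitlines()
```

Identifiers containing whitespace would break this layout, so they would need to be sanitised before writing.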

src.foldseek2tree.postprocess(t, outree, delta=0)

postprocess a tree to make sure all branch lengths are positive

Parameters:
  • t (str) – path to tree file

  • outree (str) – path to write the postprocessed tree to (undocumented in the docstring)

  • delta (float) – small number to replace negative branch lengths with
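
The replacement of negative branch lengths can be sketched on a Newick string as follows. This is an illustration only: the actual function takes file paths and may rely on a tree library, whereas the hypothetical helper clamp_negative_branches below uses a regular expression and assumes that leaf and node labels contain no colons.

```python
import re

# A branch length: a colon followed by a signed decimal number with optional exponent.
BRANCH_LENGTH = re.compile(r":(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def clamp_negative_branches(newick: str, delta: float = 0.0) -> str:
    """Replace every negative branch length with delta (hypothetical helper)."""

    def fix(match):
        length = float(match.group(1))
        if length < 0:
            # Negative lengths are replaced by the small value delta.
            return f":{delta}"
        # Non-negative lengths are left exactly as written.
        return match.group(0)

    return BRANCH_LENGTH.sub(fix, newick)


tree = "((A:0.1,B:-0.02):0.3,C:-1e-05);"
fixed = clamp_negative_branches(tree, delta=0.0001)
```

Distance-based tree builders can produce slightly negative branch lengths, which is why a small positive delta is a common replacement value.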

src.foldseek2tree.read_dbfiles3di(folder, name='outdb')

src.foldseek2tree.structblob2tree(input_folder, outfolder, overwrite=False, fastmepath='fastme', quicktreepath='quicktree', foldseekpath='../foldseek/foldseek', delta=0.0001)

run structblob pipeline for a folder of pdb files without snakemake

Parameters:
  • input_folder (str) – path to folder with pdb files

  • outfolder (str) – path to output folder (named logfolder in the docstring)