AlphaFold 3 structure prediction script. AlphaFold 3 source code is licensed under CC BY-NC-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/ To request access to the AlphaFold 3 model parameters, follow the process set out at https://github.com/google-deepmind/alphafold3. You may only use these if received directly from Google. Use is subject to terms of use available at https://github.com/google-deepmind/alphafold3/blob/main/WEIGHTS_TERMS_OF_USE.md flags: /app/alphafold/run_alphafold.py: --buckets: Strictly increasing order of token sizes for which to cache compilations. For any input with more tokens than the largest bucket size, a new bucket is created for exactly that number of tokens. (default: '256,512,768,1024,1280,1536,2048,2560,3072,3584,4096,4608,5120') (a comma separated list) --conformer_max_iterations: Optional override for maximum number of iterations to run for RDKit conformer search. (an integer) --db_dir: Path to the directory containing the databases. Can be specified multiple times to search multiple directories in order.; repeat this option to specify a list of values (default: "['/home/appadm/public_databases']") --flash_attention_implementation: : Flash attention implementation to use. 'triton' and 'cudnn' uses a Triton and cuDNN flash attention implementation, respectively. The Triton kernel is fastest and has been tested more thoroughly. The Triton and cuDNN kernels require Ampere GPUs or later. 'xla' uses an XLA attention implementation (no flash attention) and is portable across GPU devices. (default: 'triton') --[no]force_output_dir: Whether to force the output directory to be used even if it already exists and is non-empty. Useful to set this to True to run the data pipeline and the inference separately, but use the same output directory. (default: 'false') --gpu_device: Optional override for the GPU device to use for inference. Defaults to the 1st GPU on the system. Useful on multi-GPU systems to pin each run to a specific GPU. (default: '0') (an integer) --hmmalign_binary_path: Path to the Hmmalign binary. (default: '/hmmer/bin/hmmalign') --hmmbuild_binary_path: Path to the Hmmbuild binary. (default: '/hmmer/bin/hmmbuild') --hmmsearch_binary_path: Path to the Hmmsearch binary. (default: '/hmmer/bin/hmmsearch') --input_dir: Path to the directory containing input JSON files. --jackhmmer_binary_path: Path to the Jackhmmer binary. (default: '/hmmer/bin/jackhmmer') --jackhmmer_n_cpu: Number of CPUs to use for Jackhmmer. Default to min(cpu_count, 8). Going beyond 8 CPUs provides very little additional speedup. (default: '8') (an integer) --jax_compilation_cache_dir: Path to a directory for the JAX compilation cache. --json_path: Path to the input JSON file. --max_template_date: Maximum template release date to consider. Format: YYYY- MM-DD. All templates released after this date will be ignored. Controls also whether to allow use of model coordinates for a chemical component from the CCD if RDKit conformer generation fails and the component does not have ideal coordinates set. Only for components that have been released before this date the model coordinates can be used as a fallback. (default: '2021-09-30') --mgnify_database_path: Mgnify database path, used for protein MSA search. (default: '${DB_DIR}/mgy_clusters_2022_05.fa') --model_dir: Path to the model to use for inference. (default: '/home/appadm/models') --nhmmer_binary_path: Path to the Nhmmer binary. (default: '/hmmer/bin/nhmmer') --nhmmer_n_cpu: Number of CPUs to use for Nhmmer. Default to min(cpu_count, 8). Going beyond 8 CPUs provides very little additional speedup. (default: '8') (an integer) --ntrna_database_path: NT-RNA database path, used for RNA MSA search. (default: '${DB_DIR}/nt_rna_2023_02_23_clust_seq_id_90_cov_80_rep_seq.fasta') --num_diffusion_samples: Number of diffusion samples to generate. (default: '5') (a positive integer) --num_recycles: Number of recycles to use during inference. (default: '10') (a positive integer) --num_seeds: Number of seeds to use for inference. If set, only a single seed must be provided in the input JSON. AlphaFold 3 will then generate random seeds in sequence, starting from the single seed specified in the input JSON. The full input JSON produced by AlphaFold 3 will include the generated random seeds. If not set, AlphaFold 3 will use the seeds as provided in the input JSON. (a positive integer) --output_dir: Path to a directory where the results will be saved. --pdb_database_path: PDB database directory with mmCIF files path, used for template search. (default: '${DB_DIR}/mmcif_files') --rfam_database_path: Rfam database path, used for RNA MSA search. (default: '${DB_DIR}/rfam_14_9_clust_seq_id_90_cov_80_rep_seq.fasta') --rna_central_database_path: RNAcentral database path, used for RNA MSA search. (default: '${DB_DIR}/rnacentral_active_seq_id_90_cov_80_linclust.fasta') --[no]run_data_pipeline: Whether to run the data pipeline on the fold inputs. (default: 'true') --[no]run_inference: Whether to run inference on the fold inputs. (default: 'true') --[no]save_embeddings: Whether to save the final trunk single and pair embeddings in the output. (default: 'false') --seqres_database_path: PDB sequence database path, used for template search. (default: '${DB_DIR}/pdb_seqres_2022_09_28.fasta') --small_bfd_database_path: Small BFD database path, used for protein MSA search. (default: '${DB_DIR}/bfd-first_non_consensus_sequences.fasta') --uniprot_cluster_annot_database_path: UniProt database path, used for protein paired MSA search. (default: '${DB_DIR}/uniprot_all_2021_04.fa') --uniref90_database_path: UniRef90 database path, used for MSA search. The MSA obtained by searching it is used to construct the profile for template search. (default: '${DB_DIR}/uniref90_2022_05.fa') Try --helpfull to get a list of all flags.