Guide to Generating and Comparing 3D Protein Structures¶
This tutorial provides a step-by-step guide for generating 3D structural predictions using the AlphaFold3 web interface and performing structural comparisons using the Foldseek and DALI web servers.
Prerequisites¶
No coding knowledge is required. The only setup needed is to create an account on the AlphaFold3 web server: https://alphafoldserver.com/welcome.
Note that this web interface is not ideal for analyses requiring large batches of 3D predictions (as you are limited to 30 jobs per day) or custom templates, and it enforces fixed parameters. For more advanced and customizable workflows, consider using the open-source version of AlphaFold3: https://github.com/google-deepmind/alphafold3, though this does require a computational background.
The AlphaFold Server already provides a detailed guide (AlphaFold3 Guide). Please read this official documentation before starting so that you fully understand the web interface.
Getting Started¶
For this tutorial, I will demonstrate the workflow using two sequences. By following along, you will be able to confidently identify and characterize each one.
Sequence 1:
Ga0531454_000088_17320_15773_17
MQTKNGLVPTLRLSFEMTDALASIEQEGLKINLDTLEEIERSYQQEMDDLEVRLKELAQDAVGDTPVNLASPDDRSMLLYSRRVTNKQEWAAIFNLGTERRGATVKPKMRRRMSRKEFNRNVGRLTEVVYKTRAERCTNCLGHGRTRVVKKDGTLGKATRVCRVCGGSGVVYHNTHEVAGFKLLPRTTYDLAAAGFRTDKDTLDERRDDLRGDGREFVESYVRYNALRTYLNTFVEGIKNNVDSKGFIHPEFMQCVTATGRLSSRNPNFQNMPRGSTFAIRKVVESRFSGGYILEGDYSQLEFRVAGFLAKDEQAYTDVKNSVDVHNYTASVIGCTRQEAKAHTFKPLYGGTSGTEDQKRYYAAFKDKYAGVTEWHEELQRQAVTKRVIALPSGREYAFPDARWTEYGTATNRTAICNYPVQGFATADLLPIALVSLHNVVKSAGIRSVICNTVHDSIVMDVHPDEKDTCIDLMKHAMLSLPFETMRRYGLAYDMPVGIELKMGKNWLDLHEVEL
Sequence 2:
Ga0531454_000088_18790_18007_20
YRVVTNDLDALIDEEVGDPDFPFEFELIREHLPGLDRGNLGILFARPEVGKTTFCSFLAASYVRQGFRVSYWANEEPAEKIMLRIAQSYFAVFTSEMRGPMREDFVRRYAEEIAPYLTIMDSVGTSIEELDDYAKLNKPDIIFADQLDKFRIGGEYNRGDERLKQTYVLAREIAKRNKCLVWAVSQASYEAHDRQFIDYSMLDNSRTGKAGEADIIIGIGKTGSSEVENTVRHICISKNKLNGYHGMINSQIDVRRGVYY
3D Structure Predictions Using AlphaFold3¶
Submission¶
Once logged in to AlphaFold3, you should be directed to the default submission page.
Paste the Sequence 1 residues into the input field and select Protein, as we are working with amino acid sequences. Set Copies to 1 since we are only interested in monomer predictions for this tutorial.
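Before pasting, it can help to confirm that a sequence contains only the 20 standard amino acid letters, since stray characters such as line breaks, numbers, or a trailing `*` are easy to copy by accident. The snippet below is an optional sketch and not part of the AlphaFold3 workflow itself; the helper name `check_protein_sequence` is hypothetical.

```python
# Optional pre-submission check (hypothetical helper, not an AlphaFold3 tool).
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def check_protein_sequence(seq):
    """Strip whitespace, upper-case the sequence, and list unexpected characters."""
    cleaned = "".join(seq.split()).upper()
    unexpected = sorted(set(cleaned) - STANDARD_AA)
    return {"sequence": cleaned, "length": len(cleaned), "unexpected": unexpected}


if __name__ == "__main__":
    # First 20 residues of Sequence 1, split across two lines on purpose.
    report = check_protein_sequence("MQTKNGLVPT\nLRLSFEMTDA")
    print(report["length"], report["unexpected"])
```

An empty `unexpected` list means the cleaned sequence can be pasted as is.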

Run Job¶
You can optionally set a Job Name to help organize your submissions.
If you want reproducible results, set a Seed value. The seed controls the random initialization of the model: using the same seed with the same input should return the same prediction. If left blank, the server will assign a random seed, which may yield slightly different outputs across runs.

Click Confirm and submit job. The prediction process typically takes 5 to 10 minutes to complete, depending on the size of the protein.
Below the sequence submission form is the Job History section. Once your job is finished, a checkmark will appear to the left of the job name. Click on the job name to view the 3D structure prediction.
Results¶

All results are displayed on this page. For more detailed explanations of each output, refer back to the guide mentioned at the beginning.
The most important score here is the pTM (predicted TM-score), which estimates the accuracy of the overall fold. Generally, a pTM above 0.5 suggests that the predicted fold may be similar to the true structure. The structure viewer is colored by per-residue confidence (pLDDT): if you are seeing a lot of yellow and orange together with a low pTM, you may not want to use this structure.
The ipTM (interface predicted TM-score) is currently blank because we are not modeling protein-protein interactions in this tutorial.
Download¶
Click the Download button at the top of the results page and unzip the directory to access the Crystallographic Information Files (.cif), which contain the atomic structure data.
AlphaFold3 returns five sets of predictions, labeled model_0 through model_4, along with their corresponding confidence scores in the summary_confidences_0-4 files. In practice, model_0 often contains the best prediction, but it is a good idea to review all confidence scores to confirm.
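If you are comfortable with a little scripting, the confidence files can be compared in one pass instead of opening each by hand. The sketch below assumes the unzipped directory contains JSON files whose names include `summary_confidences_` and that each file holds `ranking_score` and `ptm` keys. Treat those key names and the function name `rank_models` as assumptions to check against your own download.

```python
import json
from pathlib import Path


def rank_models(job_dir):
    """Return (ranking_score, ptm, file_name) tuples, best ranking_score first."""
    rows = []
    for path in sorted(Path(job_dir).glob("*summary_confidences_*.json")):
        data = json.loads(path.read_text())
        # Key names are assumptions; .get() returns None if a key is absent.
        rows.append((data.get("ranking_score"), data.get("ptm"), path.name))
    # Files without a ranking_score sort last.
    rows.sort(key=lambda r: r[0] if r[0] is not None else float("-inf"), reverse=True)
    return rows


if __name__ == "__main__":
    # "fold_sequence_1_test" is a placeholder for your unzipped job directory.
    for score, ptm, name in rank_models("fold_sequence_1_test"):
        print(name, "ranking_score =", score, "pTM =", ptm)
```

The number at the end of the first file name printed tells you which `model_N.cif` to carry forward.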
Follow Up¶
Repeat this process for Sequence 2. Once you have model_0.cif for both Sequence 1 and Sequence 2, you can move to the next step.
Structural Comparison with Foldseek¶
Foldseek is a fast, scalable tool that aligns protein 3D structures and scores their structural similarity, enabling high-throughput searches across large structure databases.
Submission¶
After unzipping your AlphaFold3 output and selecting the best model (e.g., fold_sequence_1_test_model_0.cif), head over to the Foldseek Search Server to compare your predicted structure against a wide range of structural databases.
Using the web interface limits some customization compared to running Foldseek locally, but no coding experience is needed. For a deeper understanding of Foldseek methods and advanced usage options, it is highly recommended to read through the official documentation:

Once on the Foldseek homepage, navigate to the Monomer Search tab (this should be the default view) and upload your .cif file of interest.

Databases¶
The image above shows the available subject databases and run options you can select. For phage proteins (known and unknown), I typically stick with the parameters shown in the figure. Keep in mind that the more databases you select, the longer the job will take to run.
Mode¶
I recommend selecting TM-align as the search mode. For more information on the differences between available modes, consult the Foldseek GitHub documentation.
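For context, the TM-score that TM-align reports is a length-normalized similarity measure between 0 and 1. The sketch below is not Foldseek's or TM-align's implementation. It only evaluates the standard TM-score formula for a fixed list of distances (in Å) between already aligned and superimposed residue pairs, whereas real tools also search for the superposition that maximizes the score. The function names and the short-chain clamp are assumptions made for illustration.

```python
def tm_d0(ref_length):
    """Distance scale d0 (in Å) used by the TM-score for a given reference length."""
    if ref_length <= 21:
        # Simplification for this sketch: clamp very short chains to 0.5 Å.
        return 0.5
    return max(0.5, 1.24 * (ref_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, ref_length):
    """TM-score for fixed aligned-pair distances, normalized by ref_length."""
    d0 = tm_d0(ref_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / ref_length


if __name__ == "__main__":
    # Three aligned pairs out of a 100-residue reference: unaligned residues
    # contribute nothing, so partial coverage lowers the score.
    print(round(tm_score([1.0, 2.0, 3.0], 100), 4))
```

Because the sum is divided by the reference length, a hit that matches only one domain of a long protein scores low even if that domain superimposes well.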
Sensitivity¶
Iterative searching is generally not needed if you are already identifying homologous hits to known sequences. It can increase runtime significantly but may help uncover distant homologs in more divergent or unknown sequences.
Results¶

Once your job is complete, the Results page will display a list of up to 1000 top hits for each database you selected, as well as an aggregated view under the ALL DATABASES tab, sorted by best scores.
For this example, we will focus on the top hits in the BFVD database. Results are sorted by TM-align score, so you can select the top-ranking structure as the best representative. To explore more details, click on the Target URL to view the full entry on its dedicated results page. Depending on what you select, this will redirect you to either a Foldseek BFVD or UniRef results page.
Download¶

The top hit page is fairly self-explanatory. In this example, we can see that Sequence 1 is similar to a DNA polymerase.
To download the top hit representative structure, click the PDB button located above the structure viewer.

To download the full results table, navigate back to the Results page and use the dropdown menu to the left of the hit list to select your desired format.
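Once downloaded, a tab-separated results table can be filtered with a short script rather than by scrolling. The sketch below makes no assumption about which column holds which value: you pass the zero-based column positions yourself after inspecting the file. The function name `top_hits` and the file name in the usage example are hypothetical.

```python
import csv


def top_hits(table_path, target_col, score_col, n=5):
    """Return up to n (target, score) pairs with the highest numeric score."""
    hits = []
    with open(table_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comment lines
            try:
                hits.append((row[target_col], float(row[score_col])))
            except (ValueError, IndexError):
                continue  # skip header rows or malformed lines
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:n]


if __name__ == "__main__":
    # Placeholder file name and column positions; adjust to your download.
    for target, score in top_hits("foldseek_results.tsv", target_col=1, score_col=2):
        print(target, score)
```

If the column you rank by is an E-value rather than a similarity score, lower is better, so the sort direction would need to be reversed.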
Follow Up¶
You now know how to generate a protein structure using AlphaFold3, perform a structural comparison with TM-align via Foldseek, identify top hits across multiple databases, explore annotations, and download a representative structure.
If you are searching for unknowns and still receive low scores or no informative hits, a different approach will be needed to further investigate the structure or function.
Structural Comparison with DALI¶
DALI (Distance matrix ALIgnment) compares protein structures based on intramolecular distance patterns to detect structural similarities, even among distantly related proteins.
Submission¶
If you did not find informative hits using Foldseek, you can try the DALI webserver.

In this example, I search Sequence 1 against the Protein Data Bank (PDB) using PDB Search by uploading a .pdb file and providing a Job title and Email address to receive the results.
Important: DALI does not accept .cif files, so you must convert your AlphaFold3 output to .pdb format before submitting. You can write your own script or use this online converter:
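If you prefer to write your own script, a minimal standard-library sketch might look like the following. It assumes a single-model, protein-only mmCIF file in which the `_atom_site` loop has one whitespace-separated atom per line with no quoted values, and it writes only ATOM records. The function name `cif_to_pdb` is hypothetical. For ligands, multiple models, or unusual files, an established library such as Biopython or Gemmi is the more robust choice.

```python
def cif_to_pdb(cif_path, pdb_path):
    """Write the _atom_site loop of a simple mmCIF file as PDB ATOM records."""
    columns, atoms, in_loop = [], [], False
    with open(cif_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("_atom_site."):
                # Column header, e.g. "_atom_site.Cartn_x" -> "Cartn_x"
                columns.append(line.split(".", 1)[1].split()[0])
                in_loop = True
            elif in_loop:
                if not line or line.startswith(("#", "_", "loop_", "data_")):
                    break  # end of the atom table
                atoms.append(dict(zip(columns, line.split())))

    with open(pdb_path, "w") as out:
        for serial, atom in enumerate(atoms, start=1):
            name = atom["label_atom_id"]
            element = atom.get("type_symbol", name[0])
            # PDB convention: names of one-letter elements start in column 14.
            if len(element) == 1 and len(name) < 4:
                padded = f" {name:<3}"
            else:
                padded = f"{name:<4}"
            chain = atom.get("label_asym_id", "A")[:1]
            out.write(
                f"{atom.get('group_PDB', 'ATOM'):<6}{serial % 100000:>5} {padded} "
                f"{atom['label_comp_id']:>3} {chain}{int(atom['label_seq_id']):>4}    "
                f"{float(atom['Cartn_x']):8.3f}{float(atom['Cartn_y']):8.3f}"
                f"{float(atom['Cartn_z']):8.3f}"
                f"{float(atom.get('occupancy', 1.0)):6.2f}"
                f"{float(atom.get('B_iso_or_equiv', 0.0)):6.2f}"
                f"          {element:>2}\n"
            )
        out.write("END\n")
    return len(atoms)


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # Usage: python cif_to_pdb.py input.cif output.pdb
        print(cif_to_pdb(sys.argv[1], sys.argv[2]), "atoms written")
```

The temperature-factor column is copied from `B_iso_or_equiv`, so whatever values the input file stores there carry over to the .pdb file.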
Results¶
You should receive an email with the results in about 5 to 10 minutes. Alternatively, the submission page may redirect you directly to the results once processing is complete if you have the tab open.
A detailed explanation of the DALI output can be found here:

Whether through the page redirection or the email link, you will be taken to the DALI results page. From here, you can explore individual hits or download the full results table.
For this tutorial, we will focus on the Matches against full PDB section to get a comprehensive view of structural similarities.

The DALI results table summarizes structural alignments between your query and known protein structures. Key columns include:
Z: Z-score, the statistical significance of the alignment (higher = more confident match)
RMSD: Root-mean-square deviation between aligned Cα atoms after superposition, in Å (lower = closer fit)
%ID: Percent sequence identity
PDB: Link to the matched PDB entry
Description: Functional annotation of the hit
In this example, the top hits all align to DNA polymerase proteins, suggesting strong structural similarity with the query sequence.
The top hit aligns with chain D of PDB entry 8SCR (listed as 8scr-D), showing a high Z-score (31.4) and a low RMSD (2.3 Å), indicating strong structural conservation despite some sequence divergence.
You can look up this entry on the RCSB PDB to view the full deposited structure and associated metadata:
https://www.rcsb.org/structure/8SCR
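As background for the RMSD column, the value is the square root of the mean squared distance between paired atoms. The sketch below computes it for two equal-length lists of (x, y, z) coordinates that are assumed to be already aligned and superimposed. It does not perform the alignment or superposition that DALI does, and the function name `rmsd` is only illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired, pre-superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Two atom pairs: the first coincides, the second is 2 Å apart along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(a, b), 3))  # sqrt(4 / 2) = 1.414
```

Because RMSD is averaged only over the aligned positions, it is best read together with the Z-score and the number of aligned residues rather than on its own.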
Follow Up¶
DALI provides a complementary approach to Foldseek for detecting remote structural homologs. Even when sequence identity is low, high Z-scores combined with reasonable RMSD values can indicate meaningful structural conservation, helping to refine hypotheses about protein function. Follow-up structural alignments allow you to identify precisely which regions of your unknown protein align with known structures, offering deeper insight into potential functional roles and evolutionary relationships.
Practice: Sequence 2 Analysis¶
Try running this workflow with Sequence 2 to see if you can generate a confident structural prediction. Based on the Foldseek or DALI results, assess whether any functional or structural annotations can be confidently assigned.
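If you are curious whether Sequence 2 may physically interact with Sequence 1, consider submitting them together in a single AlphaFold3 job by adding each sequence as a separate protein entity. AlphaFold3 predicts protein-protein complexes directly, and the ipTM score will then be reported for the interface.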
Authors¶
Zachary D. Schreiber zschreib@udel.edu
Acknowledgments¶
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 2024.
van Kempen, M., Kim, S., Tumescheit, C., Mirdita, M., Lee, J., Gilchrist, C.L.M., Söding, J., and Steinegger, M. Fast and accurate protein structure search with Foldseek. Nature Biotechnology, 2023.
Holm, L. Dali server: structural unification of protein families. Nucleic Acids Research, 50(W1), 2022.
Bittrich, S., Segura, J., Duarte, J.M., Burley, S.K., and Rose, Y. RCSB Protein Data Bank: exploring protein 3D similarities via comprehensive structural alignments. Bioinformatics, 40(6), 2024.