diff --git a/docs/_static/config/intermediate.yaml b/docs/_static/config/intermediate.yaml
index 58d1400d..6d93fc2f 100644
--- a/docs/_static/config/intermediate.yaml
+++ b/docs/_static/config/intermediate.yaml
@@ -100,6 +100,8 @@ datasets: # TODO update this based on the dataset that I set up
     # Relative path from the spras directory where these files live
     data_dir: "input"
 
+# TODO: add the gold standard for egfr tutorial; already in SPRAS
+
 reconstruction_settings:
 
   # Set where everything is saved
@@ -111,9 +113,10 @@ analysis:
   ml:
     # ml analysis per dataset
     include: false # set to true for step 3
+    # TODO: can I remove some of these arguments?
     # adds ml analysis per algorithm output
     # only runs for algorithms with multiple parameter combinations chosen
-    aggregate_per_algorithm: false
+    aggregate_per_algorithm: false # set to true for step 4??? Look at todo below
     # specify how many principal components to calculate
     components: 2
     # boolean to show the labels on the pca graph
@@ -127,6 +130,16 @@ analysis:
     # the coordinates of the KDE maximum (kde_peak) are also saved to the PCA coordinates output file.
     # KDE needs to be run in order to select a parameter combination with PCA because the maximum kernel density is used
     # to pick the 'best' parameter combination.
-    kde: false
+    kde: false # set to true for step 4
+    # TODO: double check that if I run step 3 without kde, then set it true in step 4 that pca is rerun
     # removes empty pathways from consideration in ml analysis (pca only)
-    remove_empty_pathways: false
+    remove_empty_pathways: false # set to true for step 4
+    # TODO: double check why this was needed for pca
+
+  evaluation:
+    # evaluation per dataset-goldstandard pair
+    # evaluation will not run unless ml include is set to true
+    include: false # set to true for step 4
+    # adds evaluation per algorithm per dataset-goldstandard pair
+    # evaluation per algorithm will not run unless ml include and ml aggregate_per_algorithm are set to true
+    aggregate_per_algorithm: false # set to true for step 4 ???  TODO: decide if it is better to demonstrate what it looks like per algorithm
diff --git a/docs/tutorial/advanced.rst b/docs/tutorial/advanced.rst
index 368d9cb4..b1227391 100644
--- a/docs/tutorial/advanced.rst
+++ b/docs/tutorial/advanced.rst
@@ -27,6 +27,9 @@ in the configuration file. When executed, SPRAS automatically runs each
 algorithm across all parameter combinations and collects the resulting
 subnetworks.
 
+# TODO maybe add in information about how parameter tuning seems to be
+done now # add in more details about two stage parameter tuning
+
 SPRAS will also support parameter refinement using graph topological
 heuristics. These topological metrics help identify parameter regions
 that produce biologically plausible outputs networks. Based on these
@@ -40,186 +43,8 @@ specific outputs for a given dataset.
 
 .. note::
 
-   Some grid search features are still under development and will be
-   added in future SPRAS releases.
-
-Parameter selection
-===================
-
-Parameter selection refers to the process of determining which parameter
-combinations should be used for evaluation on a gold standard dataset.
-
-Parameter selection is handled in the evaluation code, which supports
-multiple parameter selection strategies. Once the grid space search is
-complete for each dataset, the user can enable evaluation (by setting
-evaluation ``include: true``) and it will run all of the parameter
-selection code.
-
-PCA-based parameter selection
------------------------------
-
-The PCA-based approach identifies a representative parameter setting for
-each pathway reconstruction algorithm on a given dataset. It selects the
-single parameter combination that best captures the central trend of an
-algorithm's reconstruction behavior.
-
-.. image:: ../_static/images/pca-kde.png
-   :alt: Principal component analysis visualization across pathway outputs with a kernel density estimate computed on top
-   :width: 600
-   :align: center
-
-.. raw:: html
-
-   <div style="margin:20px 0;"></div>
-
-For each algorithm, all reconstructed subnetworks are projected into an
-algorithm-specific 2D PCA space based on the set of edges produced by
-the respective parameter combinations for that algorithm. This
-projection summarizes how the algorithm's outputs vary across different
-parameter combinations, allowing patterns in the outputs to be
-visualized in a lower-dimensional space.
-
-Within each PCA space, a kernel density estimate (KDE) is computed over
-the projected points to identify regions of high density. The output
-closest to the highest KDE peak is selected as the most representative
-parameter setting, as it corresponds to the region where the algorithm
-most consistently produces similar subnetworks.
-
-Ensemble network-based parameter selection
-------------------------------------------
-
-The ensemble-based approach combines results from all parameter settings
-for each pathway reconstruction algorithm on a given dataset. Instead of
-focusing on a single "best" parameter combination, it summarizes the
-algorithm's overall reconstruction behavior across parameters.
-
-All reconstructed subnetworks are merged into algorithm-specific
-ensemble networks, where each edge weight reflects how frequently that
-interaction appears across the outputs. Edges that occur more often are
-assigned higher weights, highlighting interactions that are most
-consistently recovered by the algorithm.
-
-These consensus networks help identify the core patterns and overall
-stability of an algorithm's output's without needing to choose a single
-parameter setting (no clear optimal parameter combination could exists).
-
-Ground truth-based evaluation without parameter selection
----------------------------------------------------------
-
-The no parameter selection approach chooses all parameter combinations
-for each pathway reconstruction algorithm on a given dataset. This
-approach can be useful for idenitifying patterns in algorithm
-performance without favoring any specific parameter setting.
-
-************
- Evaluation
-************
-
-In some cases, users may have a gold standard file that allows them to
-evaluate the quality of the reconstructed subnetworks generated by
-pathway reconstruction algorithms.
-
-However, gold standards may not exist for certain types of experimental
-data where validated ground truth interactions or molecules are
-unavailable or incomplete. For example, in emerging research areas or
-poorly characterized biological systems, interactions may not yet be
-experimentally verified or fully known, making it difficult to define a
-reliable reference network for evaluation.
-
-Adding gold standard datasets and evaluation post analysis a configuration
-==========================================================================
-
-In the configuration file, users can specify one or more gold standard
-datasets to evaluate the subnetworks reconstructed from each dataset.
-When gold standards are provided and evaluation is enabled (``include:
-true``), SPRAS will automatically compare the reconstructed subnetworks
-for a specific dataset against the corresponding gold standards.
-
-.. code:: yaml
-
-   gold_standards:
-       -
-       label: gs1
-       node_files: ["gs_nodes0.txt", "gs_nodes1.txt"]
-       data_dir: "input"
-       dataset_labels: ["data0"]
-       -
-       label: gs2
-       edge_files: ["gs_edges0.txt"]
-       data_dir: "input"
-       dataset_labels: ["data0", "data1"]
-
-   analysis:
-       evaluation:
-           include: true
-
-A gold standard dataset must include the following types of keys and
-files:
-
--  ``label``: a name that uniquely identifies a gold standard dataset
-   throughout the SPRAS workflow and outputs.
--  ``node_file`` or ``edge_file``: A list of node or edge files. Only
-   one of these can be defined per gold standard dataset.
--  ``data_dir``: The file path of the directory where the input gold
-   standard dataset files are located.
--  ``dataset_labels``: a list of dataset labels indicating which
-   datasets this gold standard dataset should be evaluated against.
-
-When evaluation is enabled, SPRAS will automatically run its built-in
-evaluation analysis on each defined dataset-gold standard pair. This
-evaluation computes metrics such as precision, recall, and
-precision-recall curves, depending on the parameter selection method
-used.
-
-For each pathway, evaluation can be run independently of any parameter
-selection method (the ground truth-based evaluation without parameter
-selection idea) to directly inspect precision and recall for each
-reconstructed network from a given dataset.
-
-.. image:: ../_static/images/pr-per-pathway-nodes.png
-   :alt: Precision and recall computed for each pathway and visualized on a scatter plot
-   :width: 600
-   :align: center
-
-.. raw:: html
-
-   <div style="margin:20px 0;"></div>
-
-Ensemble-based parameter selection generates precision-recall curves by
-thresholding on the frequency of edges across an ensemble of
-reconstructed networks for an algorithm for given dataset.
-
-.. image:: ../_static/images/pr-curve-ensemble-nodes-per-algorithm-nodes.png
-   :alt: Precision-recall curve computed for a single ensemble file / pathway and visualized as a curve
-   :width: 600
-   :align: center
-
-.. raw:: html
-
-   <div style="margin:20px 0;"></div>
-
-PCA-based parameter selection computes a precision and recall for a
-single reconstructed network selected using PCA from all reconstructed
-networks for an algorithm for given dataset.
-
-.. image:: ../_static/images/pr-pca-chosen-pathway-per-algorithm-nodes.png
-   :alt: Precision and recall computed for each pathway chosen by the PCA-selection method and visualized on a scatter plot
-   :width: 600
-   :align: center
-
-.. raw:: html
-
-   <div style="margin:20px 0;"></div>
-
-.. note::
-
-   Evaluation will only execute if ml has ``include: true``, because the
-   PCA parameter selection step depends on the PCA ML analysis.
-
-.. note::
-
-   To see evaluation in action, run SPRAS using the config.yaml or
-   egfr.yaml configuration files.
+   Grid search features are still under development and will be added in
+   future SPRAS releases.
 
 **********************
  HTCondor integration
@@ -255,3 +80,10 @@ user to set which SPRAS supported container framework to use:
        framework: docker
 
 The frameworks include Docker, Apptainer/Singularity, or dsub
+
+***********************
+ Benchmarking Datasets
+***********************
+
+# add this part in # Should link to the benchmarking repo # We are
+working on the vision of the live benchmarking website
diff --git a/docs/tutorial/beginner.rst b/docs/tutorial/beginner.rst
index a0846666..9c7b4ea6 100644
--- a/docs/tutorial/beginner.rst
+++ b/docs/tutorial/beginner.rst
@@ -50,6 +50,13 @@ Conda environment and install the SPRAS python package:
    The last command is a one-time installation of the SPRAS package into
    the environment.
 
+# The problem was that the participant downloaded the beginner config
+file into the wrong directory and then the snakemake command failed #
+They put it into the spras directory in the conda environment that was
+created after the spras package is installed, may need to watch for that
+# add a note about the folder called spras within the larger spras
+folder
+
 0.3 Test the installation
 =========================
 
@@ -75,6 +82,9 @@ Launch Docker Desktop and wait until it says "Docker is running".
    isolated containers. These containers include all the necessary
    dependencies to run each algorithm or post analysis.
 
+# Confusion about why Docker is followed by conda. Need to explain the
+interaction between these pieces. # add this as a note
+
 *****************************
  Step 1: Configuration files
 *****************************
diff --git a/docs/tutorial/intermediate.rst b/docs/tutorial/intermediate.rst
index 1055b879..9c50ba12 100644
--- a/docs/tutorial/intermediate.rst
+++ b/docs/tutorial/intermediate.rst
@@ -1124,8 +1124,262 @@ algorithms and their parameter settings.
 Higher similarity values indicate that pathways share many of the same
 edges, while lower values suggest distinct reconstructions.
 
-References
-==========
+**************************************
+ Step 4: Use Evaluation post-analysis
+**************************************
+
+In some cases, users may have a gold standard file that allows them to
+evaluate the quality of the reconstructed subnetworks generated by
+pathway reconstruction algorithms.
+
+However, gold standards may not exist for certain types of experimental
+data where validated ground truth interactions or molecules are
+unavailable or incomplete. For example, in emerging research areas or
+poorly characterized biological systems, interactions may not yet be
+experimentally verified or fully known, making it difficult to define a
+reliable reference network for evaluation.
+
+Explain two sentence high level how we do evaluation: Parameter
+selection + pr or prcs
+
+4.1 Adding evaluation post-analysis to the intermediate configuration
+=====================================================================
+
+To enable evaluation, update the analysis section in your
+configuration file by setting evaluation ``include`` to true. TODO ALSO
+UPDATE THE OTHER THINGS TOO
+
+Your analysis section in the configuration file should look like this:
+
+.. code:: yaml
+
+   analysis:
+      ml:
+         include: true
+         aggregate_per_algorithm: true
+         # ... (other parameters as previously set)
+         kde: true
+         remove_empty_pathways: true
+
+      evaluation:
+         include: true
+         aggregate_per_algorithm: true
+
+EXPLAIN WHY WE do each of these - kde we explain in parameter selection
+so skip - remove_empty_pathways we do because we don't want to
+cluster/kde and choose empty pathways that are the representative, want
+to see something - aggregate_per_algorithm we want to see how well each
+individual algorithm does on the evaluation instead of the # 1 best or
+all outputs treated the same, we want to see how each algorithm is
+performing
+
+What do gold standard datasets look like in a configuration?
+------------------------------------------------------------
+
+In the configuration file, users can specify one or more gold standard
+datasets to evaluate the subnetworks reconstructed from each dataset.
+When gold standards are provided and evaluation is enabled (``include:
+true``), SPRAS will automatically compare the reconstructed subnetworks
+for a specific dataset against the corresponding gold standards.
+
+.. code:: yaml
+
+   gold_standards:
+       -
+         label: gs1
+         node_files: ["gs_nodes0.txt", "gs_nodes1.txt"]
+         data_dir: "input"
+         dataset_labels: ["data0"]
+       -
+         label: gs2
+         edge_files: ["gs_edges0.txt"]
+         data_dir: "input"
+         dataset_labels: ["data0", "data1"]
+
+   analysis:
+       evaluation:
+           include: true
+
+A gold standard dataset must include the following types of keys and
+files:
+
+-  ``label``: A name that uniquely identifies a gold standard dataset
+   throughout the SPRAS workflow and outputs.
+-  ``node_files`` or ``edge_files``: A list of node or edge files. Only
+   one of these can be defined per gold standard dataset.
+-  ``data_dir``: The file path of the directory where the input gold
+   standard dataset files are located.
+-  ``dataset_labels``: A list of dataset labels indicating which
+   datasets this gold standard dataset should be evaluated against.
+
+# add a note that gold standard datasets must be defined as nodes or
+edges (double check that if the edges are only added if it will run node
+and edge)
+
+add the code thing of what the gold standard looks like for the one we
+will be using config
+
+When evaluation is enabled, SPRAS will automatically run its built-in
+evaluation analysis on each defined dataset-gold standard pair. This
+evaluation computes metrics such as precision, recall, and
+precision-recall curves, depending on the parameter selection method
+used.
+
+4.2 What is parameter selection?
+================================
+
+Parameter selection refers to the process of determining which parameter
+combinations should be used for evaluation on a gold standard dataset.
+
+Parameter selection is handled in the evaluation code, which supports
+multiple parameter selection strategies. Once the grid search is
+complete for each dataset, the user can enable evaluation (by setting
+evaluation ``include: true``) and SPRAS will run all of the parameter
+selection strategies.
+
+.. note::
+
+   Some parameter selection features are still under development and
+   will be added in future SPRAS releases.
+
+PCA-based parameter selection
+-----------------------------
+
+The PCA-based approach identifies a representative parameter setting for
+each pathway reconstruction algorithm on a given dataset. It selects the
+single parameter combination that best captures the central trend of an
+algorithm's reconstruction behavior.
+
+.. image:: ../_static/images/pca-kde.png
+   :alt: Principal component analysis visualization across pathway outputs with a kernel density estimate computed on top
+   :width: 600
+   :align: center
+
+.. raw:: html
+
+   <div style="margin:20px 0;"></div>
+
+For each algorithm, all reconstructed subnetworks are projected into an
+algorithm-specific 2D PCA space based on the set of edges produced by
+the respective parameter combinations for that algorithm. This
+projection summarizes how the algorithm's outputs vary across different
+parameter combinations, allowing patterns in the outputs to be
+visualized in a lower-dimensional space.
+
+Within each PCA space, a kernel density estimate (KDE) is computed over
+the projected points to identify regions of high density. The output
+closest to the highest KDE peak is selected as the most representative
+parameter setting, as it corresponds to the region where the algorithm
+most consistently produces similar subnetworks.
+
+Ensemble network-based parameter selection
+------------------------------------------
+
+The ensemble-based approach combines results from all parameter settings
+for each pathway reconstruction algorithm on a given dataset. Instead of
+focusing on a single "best" parameter combination, it summarizes the
+algorithm's overall reconstruction behavior across parameters.
+
+All reconstructed subnetworks are merged into algorithm-specific
+ensemble networks, where each edge weight reflects how frequently that
+interaction appears across the outputs. Edges that occur more often are
+assigned higher weights, highlighting interactions that are most
+consistently recovered by the algorithm.
+
+These consensus networks help identify the core patterns and overall
+stability of an algorithm's outputs without needing to choose a single
+parameter setting (a clear optimal parameter combination may not exist).
+
+Ground truth-based evaluation without parameter selection
+---------------------------------------------------------
+
+# TODO rename this to what it actually is
+
+The no-parameter-selection approach uses all parameter combinations
+for each pathway reconstruction algorithm on a given dataset. This
+approach can be useful for identifying patterns in algorithm
+performance without choosing any specific parameter setting.
+
+# add more details about this/reword this based on what is in the paper
+
+4.3 Running evaluation post analysis code
+=========================================
+
+With the updates to the intermediate.yaml config, SPRAS will run the
+full evaluation across all outputs for a given dataset and report
+results per algorithm.
+
+After saving the changes in the configuration file, rerun with:
+
+.. code:: bash
+
+   snakemake --cores 4 --configfile config/intermediate.yaml
+
+What happens when you run this command
+--------------------------------------
+
+What your directory structure should look like after this run:
+--------------------------------------------------------------
+
+4.4 Reviewing the evaluation outputs
+====================================
+
+MAKE SURE TO UPDATE IMAGES TO WHAT THEY ARE FOR THE EGFR EXAMPLE - add
+how to look up each of these images
+
+For each pathway, evaluation can be run independently of any parameter
+selection method (the ground truth-based evaluation without parameter
+selection described above) to directly inspect precision and recall for
+each reconstructed network from a given dataset.
+
+.. image:: ../_static/images/pr-per-pathway-nodes.png
+   :alt: Precision and recall computed for each pathway and visualized on a scatter plot
+   :width: 600
+   :align: center
+
+.. raw:: html
+
+   <div style="margin:20px 0;"></div>
+
+Ensemble-based parameter selection generates precision-recall curves by
+thresholding on the frequency of edges across an ensemble of
+reconstructed networks for an algorithm on a given dataset.
+
+.. image:: ../_static/images/pr-curve-ensemble-nodes-per-algorithm-nodes.png
+   :alt: Precision-recall curve computed for a single ensemble file / pathway and visualized as a curve
+   :width: 600
+   :align: center
+
+.. raw:: html
+
+   <div style="margin:20px 0;"></div>
+
+PCA-based parameter selection computes precision and recall for a
+single reconstructed network selected using PCA from all reconstructed
+networks for an algorithm on a given dataset.
+
+.. image:: ../_static/images/pr-pca-chosen-pathway-per-algorithm-nodes.png
+   :alt: Precision and recall computed for each pathway chosen by the PCA-selection method and visualized on a scatter plot
+   :width: 600
+   :align: center
+
+.. raw:: html
+
+   <div style="margin:20px 0;"></div>
+
+.. note::
+
+   Evaluation will only execute if ml has ``include: true``, because the
+   PCA parameter selection step depends on the PCA ML analysis.
+
+.. note::
+
+   To see evaluation in action, run SPRAS using the config.yaml or
+   egfr.yaml configuration files.
+
+************
+ References
+************
 
 .. [1]