Computational Model Library

Casting: A Bio-Inspired Method for Restructuring Machine Learning Ensembles (1.0.0)

The wisdom of the crowd refers to the phenomenon in which a group of individuals, each making independent decisions, can collectively arrive at highly accurate solutions, often more accurate than any individual within the group. This principle relies heavily on independence: if individual opinions are unbiased and uncorrelated, their errors tend to cancel out when averaged, reducing overall variance. However, in real-world social networks, individuals are often influenced by their neighbors, introducing correlations between decisions. Such social influence can amplify biases, disrupting the benefits of independent voting. This trade-off between independence and interdependence has striking parallels to ensemble learning methods in machine learning. Bagging (bootstrap aggregating) improves classification performance by combining independently trained weak learners, reducing variance. Boosting, on the other hand, explicitly introduces sequential dependence among learners, where each learner focuses on correcting the errors of its predecessors. This process can reinforce biases present in the data even as it reduces training error. Here, we introduce a new meta-algorithm, casting, which captures this biological and computational trade-off. Casting forms partially connected groups ("castes") of weak learners that are internally linked through boosting, while the castes themselves remain independent and are aggregated using bagging. This creates a continuum between full independence (i.e., bagging) and full dependence (i.e., boosting), allowing model performance to be tested across values of the hyperparameter that controls connectedness. We specifically investigate classification tasks, but the method can be used for regression tasks as well. Ultimately, casting can provide insight into how real systems contend with classification problems.

Release Notes

Ensemble Connectivity Analysis

Python and R scripts evaluate and visualize the effect of P (ensemble connectivity).

Files

Save these in the same folder:

ant_ensemble_class.py   # Classifier (Python)
testAllData.py          # Runs experiments, outputs CSVs
modelPerformanceGraphs.R # Generates plots (R)

1. Run Python Code in Spyder (Anaconda)

Open Spyder via Anaconda Navigator → File → Open → testAllData.py.
Set Working Directory in the toolbar to the script folder.
Install packages in Anaconda Prompt or IPython Console:

pip install numpy pandas scikit-learn mlxtend seaborn matplotlib

Run Script (Run button).
This calls ant_ensemble_class.py and creates:
- ValidationResults.csv
- VerificationResults.csv
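To confirm the run succeeded before moving on to R, you can peek at the output files from the IPython console. The column layout of the CSVs is not documented here, so this hypothetical helper only reports row counts and headers:

```python
from pathlib import Path
import pandas as pd

def summarize_results(path):
    """Return (row_count, column_names) for a results CSV, or None if missing."""
    p = Path(path)
    if not p.exists():
        return None
    df = pd.read_csv(p)
    return len(df), list(df.columns)

for name in ["ValidationResults.csv", "VerificationResults.csv"]:
    print(name, summarize_results(name))
```

If either file prints `None`, re-run testAllData.py and check the working directory.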

2. Visualize in R (Anaconda)

Open R (RStudio via Anaconda Navigator, or R in the Anaconda Prompt).
Set Working Directory:

setwd("path/to/folder")

Install packages:

install.packages(c("tidyverse","colorspace","ggpubr","scales","ggridges"))

Run script:

source("modelPerformanceGraphs.R")

The script reads both CSVs and plots:
- Bias/variance vs P
- Composite metric vs P

3. Output

Plots match those in CastingModel.pdf.

Workflow:
1. Run testAllData.py → generates CSVs.
2. Run modelPerformanceGraphs.R → creates plots.

Associated Publications

NA

Version: 1.0.0
Submitter: Colin Lynch
First published: Thu Sep 18 03:53:39 2025
Last modified: Thu Sep 18 03:53:46 2025
Status: Published, Peer Reviewed
DOI: 10.25937/8jy2-wj25
