IDC6940 Final Project
Week 3
This week we continued reading about deep learning models for symbolic regression and started looking at papers published after the PySR revision of May 2023.
Martius & Lampert (2016) “Extrapolation and Learning Equations”
Goal
Learn a model from data that can extrapolate well outside the range of the data it was trained on.
Importance
Unknown at this point
Solution
The authors' approach to the problem was a surprise to the student.
- They replace the usual activation function (e.g., ReLU) with transcendental functions such as sine and cosine.
- Each layer has multiple (different) activation functions operating in parallel.
- Each activation function serves a small number of nodes (e.g., two).
- Regularization is then used during training to drive small weights to zero.
- After training, the final equation can be recovered by tracing the non-zero weights through the network (a minimal sketch of such a layer follows this list).
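A minimal sketch of what one such layer might look like, assuming PyTorch; the unit counts, the specific activation set, and the L1 penalty are illustrative assumptions rather than the paper's exact configuration:

```python
# Sketch of an EQL-style layer: a linear map followed by several different
# elementary activations applied in parallel to small groups of units.
# Function set, group size, and the L1 penalty are assumptions for illustration.
import torch
import torch.nn as nn

class EqlLayer(nn.Module):
    def __init__(self, in_dim, units_per_func=2):
        super().__init__()
        self.funcs = [torch.sin, torch.cos, lambda z: z]  # identity keeps linear terms
        self.units_per_func = units_per_func
        self.linear = nn.Linear(in_dim, units_per_func * len(self.funcs))

    def forward(self, x):
        z = self.linear(x)
        groups = torch.split(z, self.units_per_func, dim=-1)
        return torch.cat([f(g) for f, g in zip(self.funcs, groups)], dim=-1)

def l1_penalty(model, lam=1e-3):
    # Sparsity-inducing term added to the training loss; weights driven to
    # (near) zero are pruned afterwards so the equation can be read off.
    return lam * sum(p.abs().sum() for p in model.parameters() if p.dim() > 1)
```

Stacking a few such layers and pruning the near-zero weights leaves a small computation graph that can be written out as a closed-form expression.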
Results
Sahoo et al. point out that the algorithm does not handle division!
Sahoo, Lampert & Martius (2018) “Learning Equations for Extrapolation and Control”
Goal
In this follow-up paper to the one above, the authors address two weaknesses of the original method:
- Adding division capability
- Improving model selection
Importance
Unknown at this point.
Solution
Division
Other papers on symbolic regression have discussed how numerical overflow and underflow wreak havoc on calculations of "fitness". The protected division introduced here simply sets the result to zero for "bad" (near-zero) divisors. This works for a neural network because it kills the back-propagation of information past the division node for that particular record.
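A hedged sketch of this kind of protected division, assuming PyTorch; the threshold value and function name are illustrative:

```python
import torch

def protected_div(numer, denom, theta=1e-3):
    # Output is forced to zero whenever the divisor is too small ("bad"),
    # so no gradient flows back through the division for those records.
    ok = denom > theta
    safe = torch.where(ok, denom, torch.ones_like(denom))
    return torch.where(ok, numer / safe, torch.zeros_like(numer))
```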
Selection
In the original paper, the model was chosen based on ranked validation error and ranked sparsity. In this revision, the authors decided it was better to normalize error and sparsity rather than rank them.
Interestingly, if "outlier" data is not excluded (because we want the model to perform well at these extremes, or beyond them), the authors found that the "sparsity term loses its importance".
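A sketch of what the revised selection rule might look like, assuming min-max normalization and an equal weighting of the two terms (both assumptions, not the paper's exact formula):

```python
def select_model(candidates):
    # candidates: list of (validation_error, sparsity) pairs, one per trained model.
    errs = [c[0] for c in candidates]
    spars = [c[1] for c in candidates]

    def norm(v, lo, hi):
        # Scale a value into [0, 1] relative to the batch of candidates.
        return (v - lo) / (hi - lo) if hi > lo else 0.0

    scores = [norm(e, min(errs), max(errs)) + norm(s, min(spars), max(spars))
              for e, s in zip(errs, spars)]
    return scores.index(min(scores))  # index of the best candidate
```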
Results
Petersen et al. point out "that the authors make several simplifications to the search space, ultimately precluding learning certain simple classes of expressions".
Petersen et al (2021) “Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients”
Goal
Find an efficient way to apply neural networks to Symbolic Regression.
Importance
"DSR" won first place in the Real World Track of the 2022 SRBench Symbolic Regression Competition.
Solution
- Elements that can be used to make up an expression (e.g., "sin" or "+") are sampled from a probability distribution, assembled into an expression, and that expression is used to estimate the dependent variable of the dataset.
- Based on the fitness of that expression, the probability distribution is adjusted. This "memory" is provided by a recurrent neural network.
- The weights of the RNN are adjusted not to optimize the average performance of the generated expressions, but the best-case performance of the generated expressions (see the sketch after this list).
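A minimal sketch of the risk-seeking part of the update, assuming NumPy arrays of rewards and log-probabilities for one batch of sampled expressions; the quantile cutoff and the omission of the RNN itself are simplifications:

```python
import numpy as np

def risk_seeking_loss(rewards, log_probs, epsilon=0.05):
    # rewards, log_probs: NumPy arrays, one entry per sampled expression.
    # Keep only the top (1 - epsilon) quantile of the batch so the gradient
    # pushes toward best-case, not average-case, expressions.
    cutoff = np.quantile(rewards, 1.0 - epsilon)
    elite = rewards >= cutoff
    # REINFORCE-style surrogate, baselined at the cutoff, over the elite samples.
    return -np.mean((rewards[elite] - cutoff) * log_probs[elite])
```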
Results
This algorithm apparently predates the SRBench benchmarks. The authors claim that it exceeded the performance of the gold-standard closed-source algorithm on some older benchmarks.
Landajuela et al (2022) “A Unified Framework for Deep Symbolic Regression”
Goal
Find a way to integrate disparate approaches to Symbolic Regression
Importance
This is a follow-up paper to the one above, and may actually be the algorithm that won the competition.
Solution
Identify the strengths and weaknesses of the following five technologies:
- AI Feynman ("AIF")
- Deep Symbolic Regression ("DSR") (the paper above)
- Large-Scale Pre-Training ("LSPT") (the paper below)
- Genetic Programming ("GP")
- Linear Models ("LM")
Next, carefully combine them to neutralize the identified weaknesses. The typical workflow looks like this:
DSR -> AIF -> LM -> GP -> LSPT
Results
Appears to be the top performer against the SRBench benchmarks at the time of publication.
Kamienny et al (2022) “End-to-End Symbolic Regression with Transformers”
Goal
Replace the two-step procedure typical in Symbolic Regression (first predict the form of the expression, then fit its numeric constants) with a single step.
Importance
Raises the possibility of building models in real time.
Solution
Pre-train a neural network (a transformer) on enormous amounts of synthetic data, so that at inference time it can emit a complete expression in a single pass.
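A toy sketch of how synthetic pre-training pairs could be generated, assuming a tiny illustrative expression grammar; the real training corpus and sampling scheme are far richer:

```python
# Generate (data, expression) pairs by sampling a random expression,
# sampling inputs, and evaluating. The grammar below is purely illustrative.
import random
import numpy as np

UNARY = {"sin": np.sin, "cos": np.cos, "exp": np.exp}
BINARY = ["+", "*"]

def random_expr(depth=2):
    if depth == 0 or random.random() < 0.3:
        return "x"
    if random.random() < 0.5:
        return f"{random.choice(list(UNARY))}({random_expr(depth - 1)})"
    return f"({random_expr(depth - 1)} {random.choice(BINARY)} {random_expr(depth - 1)})"

def make_pair(n_points=64):
    expr = random_expr()
    x = np.random.uniform(-2, 2, n_points)
    y = eval(expr, {"x": x, **UNARY})  # evaluate with NumPy functions
    return (x, y), expr
```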
Results
The authors claim that the results are almost as accurate as the leading two-step algorithm, but much faster.
La Cava et al (2021) “Contemporary Symbolic Regression Methods and their Relative Performance”
Goal
Establishment of a benchmark for Symbolic Regression (SRBench)
Importance
No fair way to compare models existed before this.
Solution
- Collect nearly 300 diverse regression problems
- Separate them into 2 buckets: “Black Box” and “Synthetic”
- Test available models against the problems
- Archive the data/models in an open-access format for transparency
Results
GP-based methods performed best on the "black box" datasets, i.e., datasets for which we do not know the underlying equations. AIF did best on the synthetic datasets, i.e., datasets for which we do know the underlying equations.
Dong & Zhong (June 2025) “Recent Advances in Symbolic Regression”
Goal
This is a survey paper.