You don't need permission to make progress
This essay is my response to SPARC’s Reflection Q1: Analyze something that has lasted: a building, a government, a habit or thought pattern of yours, or anything else. Why do you think it has?
A month or two ago I realized I had been stuck in a Waiting Room. The Waiting Room was an implicit belief that I was “in training” and could only start to attempt big problems once I acquired a certain undefined level of credentials / age / authority. I never explicitly decided to adopt this habit, but it guided my life and pursuits for years. Why did it last so long? I think because it felt like humility.
An example:
- When researching treatments for the Ehlers-Danlos Syndrome (EDS, a hyperfragile tissue disorder) that kept burdening me, I found out about RNA drugs as a promising research direction (modern drugs target proteins, collagen proteins can’t be drugged, but the upstream RNA can).
- I was disappointed to find RNA drug discovery still in its infancy: we still need to develop accurate computational prescreening before experiments become tractable cost-wise. In fact, we’re at a stage where someone like me could make improvements from home. And the roadmap for this stared right at me as I read the literature.
- A big bottleneck in RNA-drug prediction is scarce 3D RNA structures, but AlphaFold proved that sequence->structure prediction is solvable with evolutionary history (MSAs). RNA has similar evolutionary databases. Looking at the top models like DeepRSMA, I thought, “where’s the MSA module?” We don’t have many 3D RNA structures, but there’s no physical law that says binding affinity prediction (just how well RNA and drug bind, not the exact pocket, which is enough for screening) needs explicit structures.
- Plus, binding affinity datasets were small (thousands of examples, not billions). The field’s frontier can indeed be advanced by testing new architectures on mere Google Colab GPUs.
- Yet I didn’t pay attention. I instinctively dismissed it, thinking there had to be some reason why it wouldn’t work, even though in this case the market was not fully efficient (as I’ll explain more below).
Why? This part is interesting.
- If I had only told myself, “I shouldn’t do this because it’s too much work/too uncertain,” alarms would have sounded, but the Waiting Room gave me a much more durable excuse by disguising status regulation as humility.
- In my mind, people like Demis Hassabis (my childhood hero of sorts hah) worked on AGI and curing all disease because they had a certain aura of authority.
- I believed I needed to obtain that aura, with olympiad medals or PhDs or decades of “training”, before I had permission to even attempt to work towards those things.
- I never thought to question that the aura my brain associated with Hassabis came from having created AlphaGo and AlphaFold.
- Why did I never question it? Status.
- To attempt a Grand Problem (like curing disease) felt like claiming the status of a Grand Person, and since I knew I hadn’t earned that status yet, my brain categorized the attempt as arrogance rather than helpfulness.
- The habit was so resilient because it drew power from survival instincts. I’ve been socialized by parents and teachers and so on to filter my words and behavior for modesty, to avoid being the tall poppy that gets cut down for bragging. The Waiting Room protected me from the imaginary slapdown of acting above my place. Its vague (never-ending) proxy goal to “get competent” let me indulge in the aesthetics of science and the fun of personal growth without facing the terrifying reality that the world is burning, that we’re all dying of untreated diseases, and no one is coming to save us. It let me feel virtuous about ignoring the problem; I thought I was just “respecting the rigor” of the experts.
Luckily, this same example is when I broke out of the habit, by accident.
- When I first read about RNA drug screening and the latest models, it reminded me of AlphaFold (and AlphaGo and all that cool stuff), and I got super hooked that very weekend.
- Tired of contrived Kaggle problems and tutorial hell, I thought this would be a great chance for a learning exercise — that is, to replicate papers and learn PyTorch on something close to my life.
- So I spent the ensuing weeks on the side playing around with different architectures.
- (I started with a basic MLP and added things to get a feel for what worked and why, e.g., RNA-FM, RiNALMo, Graphormer, UniMol as sources of pretrained embeddings; secondary structure and GNNs; early vs. late fusion; and so on)
- After the initial few months, I had tried out most of the fundamentals in existing binding affinity models, so I looked to adjacent fields.
- That’s when I went back to AlphaFold and remembered its evolutionary MSA module again (Evoformer).
- This time, I was used to implementing features rather than just reading papers.
- I thought, why not give it a shot? It would be good practice to code up something with less of an answer key.
- …and the model got better!
- Holy shit improvements are possible.
- So in the last seven months, I became deeply, deeply obsessed with this project, and the idea that it might really turn into something useful. I looked for insights in papers from both near and far fields and tried adapting them to the RNA-ligand problem (e.g., gated cross attention like DeepMind’s Flamingo vision model, and semi-supervised learning). From following one little lead after another (and an embarrassing amount of Colab credits), I’ve now ended up with a model that actually outperforms state-of-the-art ones like DeepRSMA!
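To make the fusion idea above concrete, here is a minimal sketch of a Flamingo-style gated cross-attention block that lets ligand token embeddings attend to RNA token embeddings, with a tanh gate initialized at zero so the pretrained streams start out undisturbed. Everything here (names, dimensions, the toy affinity head) is illustrative, not the actual model:

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Flamingo-style gated cross-attention: ligand tokens attend to RNA tokens.

    The tanh gate is initialized to zero, so the block starts as an identity
    and the pretrained embeddings are not disturbed early in training.
    (Illustrative sketch only; dimensions and names are made up.)
    """
    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Parameter(torch.zeros(1))  # tanh(0) = 0 -> identity at init

    def forward(self, ligand: torch.Tensor, rna: torch.Tensor) -> torch.Tensor:
        # ligand: (B, L_lig, dim), rna: (B, L_rna, dim)
        attended, _ = self.attn(query=self.norm(ligand), key=rna, value=rna)
        return ligand + torch.tanh(self.gate) * attended

class AffinityHead(nn.Module):
    """Toy head: fuse the two streams, mean-pool, regress a scalar affinity."""
    def __init__(self, dim: int):
        super().__init__()
        self.fuse = GatedCrossAttention(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, ligand: torch.Tensor, rna: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(ligand, rna)
        return self.mlp(fused.mean(dim=1)).squeeze(-1)  # (B,)

# Usage with random stand-ins for pretrained embeddings
model = AffinityHead(dim=64)
lig = torch.randn(2, 30, 64)   # e.g. UniMol-style ligand token embeddings
rna = torch.randn(2, 120, 64)  # e.g. RNA-FM-style nucleotide embeddings
print(model(lig, rna).shape)   # torch.Size([2])
```

The zero-initialized gate is the key design choice: it lets you bolt cross-attention onto frozen or pretrained encoders without wrecking their representations at the start of training.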
Framing it as “learning PyTorch” let the project evade my internal status-regulator/modesty filter. But I would have been able to iterate and improve much faster if I had tried to solve the problem directly and learned the necessary skills on the go. Or, if the problem really did require a level of experience/wisdom I was still years or decades away from, I could have evaluated its tractability and my competence without status clouding my judgement. In hindsight, the inefficiency here should have been clear:
- Until recently, from what I read, targeting RNA with small molecules seemed physically impossible and was considered career suicide.
- Early RNA drug-discovery companies like Rib-X or Anadys went bankrupt or pivoted to selling lab equipment (like Repligen).
- (but they were using protein-tailored molecule libraries, so of course)
- If you said “I want to target RNA,” the response from funders would be “Why aren’t you using ASOs?” (antisense oligonucleotides), the safe, predictable route where you design a sequence instead of discovering a drug.
- Finally, in 2020, Risdiplam proved that small-molecule RNA binding was physically possible.
- By this point, the protein field was months away from AlphaFold! RNA prediction lags far behind.
- The other issue is that traditional drug discovery follows a causal progression: sequence -> 3D structure -> drug binding. All the big-name Nature papers use 3D; it’s more rigorous. But because RNA progress lagged, 3D RNA structures were very scarce. Everyone was waiting for experimental data and moved on to proteins or other problems.
- So RNA-ligand affinity predictors have received very little attention.
- As an outsider, I just noticed predicting affinity alone doesn’t need to simulate 3D folding (specific binding pockets would, but we aren’t there yet) and imported insights from adjacent fields that the specific experts here hadn’t touched yet.
Why did the Waiting Room last so long? Because it felt virtuous. I conflated attempting great work with claiming the status of a great person, and as I’ve been taught, I must Be Humble. That meant assuming the world is already the best it can be, and rationalizing reasons why there cannot be any low-hanging fruit. In reality, there are no wise authorities always calculating What Can Be Done and addressing it proportionally. The world can be much better.
I’m so done waiting :)
More updates about the project at /projects