The state of AI in 2020: Biology and healthcare’s AI moment, ethics, predictions and neural networks

The State of AI Report 2020 is a comprehensive report on all things AI. Picking up where we left off after summarizing its key findings, we continue the conversation with authors Nathan Benaich and Ian Hogarth. Benaich is the founder of Air Street Capital and RAAIS, and Hogarth is an AI angel investor and a visiting professor at UCL IIPP.

The main themes we have covered so far were AI democratization, industrialization, and the path to artificial general intelligence. We continue with biology and healthcare’s AI moment, breakthroughs in research and applications, AI ethics, and predictions.

Biology and healthcare’s AI moment

A key point discussed with Benaich and Hogarth was the democratization of AI: what it means, whether it matters, and how to compete against behemoths with the resources to train huge machine learning models at scale.

One of the ideas explored in the report is to take existing models and fine-tune them to specific domains. Benaich noted that taking a large, pre-trained model from one field and moving it to another can boost performance to a higher level:

“As far as biology and healthcare are concerned, more and more domains with lots of imaging are becoming digital, whether that concerns health issues or what cells look like when they are sick. Compiling datasets to describe these, and then using transfer learning from ImageNet to these domains, has given much better performance than starting from scratch.”

This, Benaich added, plays into one of the dominant themes in the report: biology (where Benaich has a background) and healthcare are having their AI moment. There are examples of startups at the forefront of moving research and development into production to tackle problems in biology. One area of application Benaich highlighted was drug screening:

“If I have a software product, I can generate lots of potential drugs that could work against the disease protein I am interested in targeting. How do I know, out of the thousands or hundreds of thousands of possible drugs, which one will work? And provided I can figure out which one works, how do I know if I can make it?”

In addition to computer vision, Benaich went on to add that there are several examples of AI language models being useful for protein engineering or understanding DNA, “essentially treating a sequence of amino acids encoding proteins, or DNA, as just another kind of language, a kind of string that language models can interpret in the same way they interpret the characters that spell out words.”
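A minimal sketch of the “sequences as language” idea: treat each amino acid as a character, tokenize it the way an NLP pipeline would, and count residue-to-residue transitions as the simplest possible “language model”. The tokenizer, bigram counter, and protein fragment below are illustrative stand-ins, not the models discussed in the report.

```python
# Sketch: treating an amino-acid sequence as a string a language model
# can read, via a character-level tokenizer and trivial bigram counts.
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
vocab = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def tokenize(sequence):
    """Map each residue to an integer token, as an NLP tokenizer would."""
    return [vocab[aa] for aa in sequence]

def bigram_counts(sequences):
    """Count residue-to-residue transitions: a toy 'language model'."""
    counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    return counts

# A made-up protein fragment, used purely for illustration.
fragment = "MKTAYIAKQR"
tokens = tokenize(fragment)
counts = bigram_counts([fragment])
```

Real protein language models replace the bigram counter with a transformer trained on millions of sequences, but the input representation is exactly this kind of tokenized string.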



Transformer-based language models such as GPT-3 have also been applied to tasks such as filling in images or converting code between different programming languages. Benaich and Hogarth note that the transformer’s ability to generalize is remarkable, but at the same time they offer a warning on the code example: no expert knowledge is required, but there is also no guarantee that the model has not simply memorized the functions.

This discussion was triggered by the question, posed by some researchers, of whether progress in mature areas of machine learning is stagnant. In our opinion, the fact that COVID-19 has dominated 2020 is also reflected in the impact it has had on AI. And there are examples of how AI has been used in biology and healthcare to tackle COVID-19.

Benaich used examples from biology and healthcare to establish that the field, in application as well as research, is far from stagnant. The report includes work from across this area, ranging from InVivo and Recursion to Google Health, DeepMind, and the NHS.

What’s more, the US Medicare and Medicaid system has approved reimbursement for an AI-based medical imaging product. Despite existing FDA approvals for deep learning-based medical imaging, whether for strokes, mammograms, or broken bones, this is the only product so far that has actually been reimbursed, Benaich noted:

“Many people in the field feel that reimbursement is the critical moment. It’s the financial incentive for doctors to prescribe, because they get paid back. So we think it’s an important event. A lot of work needs to be done, of course, to scale this and to ensure that more patients are eligible for this reimbursement, but it’s still a big deal.”

Interestingly enough, the FDA has also released a new proposal to embrace the highly iterative and adaptive nature of AI systems in what it calls a “total product lifecycle” regulatory approach built on good machine learning practices.

Graph neural networks: going three-dimensional

The report also includes a number of examples, as Benaich said, of “evidence that the major pharmaceutical companies are actually gaining value from working with AI-first drug discovery companies.” This discussion naturally leads to the topic of progress in a particular area of machine learning: graph neural networks.

The connection was how graph neural networks (GNNs) are used to improve the prediction of chemical properties and guide the screening of antibiotics, leading to new drugs in vivo. Most deep learning methods focus on learning from two-dimensional input data, i.e., data represented as matrices. GNNs are a growing family of methods designed to process 3D data. This may sound cryptic, but it’s a big deal: it makes more information available to the neural network.
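One way to picture what a single GNN layer does, as a toy sketch rather than any specific model from the report: atoms are nodes, bonds are edges, and each node updates its features by aggregating its neighbours’ features. The four-atom chain and random weight matrix below are invented for illustration.

```python
# Toy message-passing sketch: a molecule as a graph, where each atom
# updates its features by aggregating those of its bonded neighbours.
import numpy as np

# Hypothetical molecule: 4 atoms in a chain (0-1-2-3), as an
# adjacency matrix with A[i, j] = 1 when atoms i and j are bonded.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

H = np.eye(4)  # one-hot "atom type" features, one row per atom

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))  # learned transform (random here)

# One GNN layer: each atom sums messages from its neighbours (A @ H),
# keeps its own features (the added identity), then applies the
# learned transform and a ReLU nonlinearity.
H_next = np.maximum((A + np.eye(4)) @ H @ W, 0)
```

Stacking such layers lets information flow across the whole molecular graph, which is how GNNs pick up on structural patterns a flat matrix representation would miss.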

“I think it comes down to the question of what is the true representation of biological data: one that actually expresses all the complexity and physics and chemistry and vibrant nuances of a biological system in a compact, easy-to-describe mathematical representation that a machine learning model can do something with,” said Benaich.

Sometimes it’s hard to imagine biological systems as a matrix, so it may very well be that we are just not exploiting all the implicit information found in a biological system, he added. This is why graph representations are an interesting next step: they feel intuitive as a tool for representing something connected, such as a chemical molecule.


Graph neural networks enable deep learning on three-dimensional structures. This means capturing and using more information, which is well suited to the field of biology. Photo: M. Bronstein

Benaich noted examples in predicting molecular properties and planning chemical synthesis, but also in trying to identify new small molecules. Small molecules are treated as Lego building blocks. Using advances in DNA sequencing, all of these chemicals are mixed in a tube with a target molecule, and researchers can see which building blocks assemble and bind to the target of interest.

Once candidate molecules that appear to work have been identified, GNNs can be used to learn what these building blocks have in common that makes them good binders for the target of interest. Adding this machine learning layer to a standard, well-known chemical screening method provides a several-fold improvement over the baseline.

Hogarth, for his part, referred to a recent analysis arguing that GNNs, the transformer architecture, and the attention-based methods used in language models share the same underlying logic, as you can think of sentences as fully connected word graphs. Hogarth noted how the transformer architecture is creeping into many unusual uses, and how scaling it increases its effect:

“The meta-point around these neural networks and attention-based methods generally is that they seem to represent a general enough approach that there will be progress just by continuing to hammer very hard on that nail for the next two years. And one of the ways I challenge myself is to assume that we might see a lot more progress just by doing the same thing with a little more aggression.

And then I would assume that some of the gains found in these GNNs cross-pollinate with the work that is done with language models and transformers. And this approach remains a very fruitful area for that kind of super general, high level AGI-like research. “
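Hogarth’s observation that attention and GNNs share a logic can be made concrete with a small sketch: in single-head self-attention, every token sends a weighted “message” to every other token, which is message passing on a fully connected word graph. All dimensions below are arbitrary and the weights are random stand-ins for learned parameters.

```python
# Sketch: single-head self-attention over a 3-token "sentence",
# i.e. message passing on a fully connected graph of words.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))  # 3 tokens, 4-dim embeddings

# Random stand-ins for the learned query/key/value projections.
Wq, Wk, Wv = (rng.standard_normal((4, 4)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Every token attends to every token: a dense graph of interactions.
scores = Q @ K.T / np.sqrt(4)

# Softmax over each row turns scores into attention weights.
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)

# Each token's output is a weighted sum of "messages" from all tokens.
out = weights @ V
```

A GNN layer does the same aggregation, but only over the edges a graph actually has; attention simply assumes the graph is complete and learns the edge weights.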

AI ethics and predictions

There are plenty of topics we could choose to dissect from Benaich and Hogarth’s work, such as PyTorch surpassing TensorFlow in research, the boom in federated learning, analyses of talent and its retention by geography, progress (or lack thereof) in autonomous vehicles, AI chips, and AutoML. We encourage readers to dive into the report to learn more. But we end with something else.

Hogarth mentioned that the speculation phase in AI for biology and healthcare is starting, with plenty of capital flowing in. There will be some really amazing companies coming out of it, and we will start to see a real implementation phase kick in. But it is just as certain, he added, that there will be cases that turn out to be total fraud.

So what about AI ethics? Benaich and Hogarth cite work from pioneers in the field and touch on topics such as commercial gender classification, unregulated police face recognition, algorithm ethics, and robot regulation. For the most part, the report focuses on face recognition. Face recognition is widespread throughout the world and has led to controversy, as well as wrongful arrests. More thoughtful approaches seem to be gathering steam, Benaich and Hogarth note.

The duo’s report cites examples such as Microsoft deleting its database of 10 million faces (the largest available), which had been collected without consent; Amazon announcing a one-year moratorium on letting police use its face recognition tool Rekognition to give “Congress enough time to introduce appropriate rules”; and IBM announcing that it would wind down its general-purpose face recognition products.

Hogarth referred to an incident in which a British citizen claimed that his human rights were violated when he was photographed while shopping for Christmas. Although the judges ruled against the plaintiff, they also established an important new duty for the police to ensure that discrimination is proactively “eliminated.” This means that action against bias cannot legally be postponed until the technology has matured:

“This creates a much higher bar for implementing this software. And it creates almost a legal avenue for anyone who experiences bias at the hands of an algorithm to have grounds to sue the government or a private actor deploying the technology,” Hogarth said.


AI ethics often focuses on face recognition, but it is becoming relevant in more and more domains.

MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Datasets. © Adam Harvey

Hogarth also emphasized another approach, which he called “API-driven auditing”. He referred to a new law passed in Washington State with active support from Microsoft. The law restricts law enforcement use of face recognition technology by requiring that the software used be made available to an independent third party via an API to assess “accuracy and unfair performance differences” across traits such as race or gender.

Of course, were we to broaden our focus beyond face recognition, the list of AI ethics concerns is endless: from bias, to the use of the technology by authoritarian regimes and/or for military purposes, to AI nationalism, to a US tax code that encourages replacing humans with robots, there is no shortage of causes for concern. Benaich and Hogarth, for their part, close their report by offering a series of predictions for the coming year:

The race to build larger language models continues, and we see the first 10 trillion parameter model. Attention-based neural networks move from NLP to computer vision and achieve state-of-the-art results. A major AI lab shuts down when its parent company changes strategy. In response to U.S. DoD activity and investments in U.S.-based military AI startups, a wave of Chinese and European defense-focused AI startups raises a total of over $100 million over the next 12 months.

One of the leading AI-first drug discovery startups (e.g., Recursion, Exscientia) either IPOs or is acquired for over $1 billion. DeepMind makes a major breakthrough in structural biology and drug discovery beyond AlphaFold. Facebook makes a major breakthrough in augmented and virtual reality with 3D computer vision. And NVIDIA does not end up completing its acquisition of Arm.

The record for the predictions offered in last year’s State of AI Report was pretty good: they got 5 out of 6 right. Let’s see how this year’s set of predictions fares.