PhD proposal

Supervisors: Pierre Brousset

Team P. Brousset (CRCT)

Deep convolutional neural networks (CNNs) have achieved impressive results on most of the pattern recognition challenges of the past few decades. With the advent of efficient digitization techniques for microscopy slides (whole slide images, WSIs), the use of deep learning models on histopathology images has been widely explored, as it may save time and reduce errors in diagnosis, prognosis and the prediction of response to therapy.

In this field, most clinically relevant tasks have been addressed within the framework of supervised classification [1]–[3]. Yet, despite considerable design and optimization efforts, the clinical use and deployment of the resulting solutions remain difficult, mostly because of domain adaptation issues [4].

To overcome these limitations, the idea is to considerably widen the dictionary of structures and lesions that supervised classifiers can recognize [5], [6]. Label comprehensiveness is a key concept for producing any type of decision in a reliable, explainable and interpretable way that fits clinical routine [8]–[10].

However, in the classical supervised framework, such comprehensive labelling can only be achieved by having highly qualified experts annotate very large numbers of images.

Therefore, this work aims to lay the foundations of automated diagnosis and multiplex/complex immunostaining analysis with minimal supervision. It relates to data mining approaches, which remain under-explored in the field of WSI analysis [12]–[14]. The statistical models developed in this study will combine self-supervised strategies [15], [16] with metric learning or generative models [17]–[21], as well as general-purpose feature transfer methods [22]–[25], to extract general and widely reusable knowledge from large sets of raw, unannotated WSIs.

Beyond statistical learning, this work will focus on the structure of the results. Inspired by the Knowledge Graph, the semantic web and graph database technologies, the AI tools will handle multiple decision contexts and will make decisions based on logic and deduction. While the models will rely on "symbolic" AI, one major theoretical challenge is to formulate graph generation as an optimization problem that is compatible with gradient descent and the backpropagation algorithm.

Keywords: digital pathology, automated diagnosis, immunostaining analysis, data mining, self-supervised learning, metric learning, generative models, structured knowledge.
