Research group

Programming Languages and Tools Lab

Secondary structure analyzer

Grigorev Semyon

Active

Combination of formal grammars and artificial neural networks for analyzing secondary structure.

We propose a way to combine formal grammars and artificial neural networks for secondary structure processing. Formal grammars encode the secondary structure of the sequence and neural networks deal with detection of patterns and noise.

This approach can be applied to different types of sequences that have rich secondary structure.

We are currently applying this approach to the analysis of biological sequences (RNA, proteins). In contrast to the classical approach, where probabilistic grammars are used for secondary structure modeling, we propose using arbitrary (non-probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar that describes only features of the secondary structure. We then employ matrix-based parsing to extract features: the fact that a given substring can be derived from a given nonterminal is a feature. Finally, a neural network processes these features.
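The feature-extraction step can be illustrated with a minimal sketch. It is not the project's implementation: the toy grammar (a stem of complementary base pairs around a three-nucleotide loop), the names `parse_matrices` and `features`, and the use of plain Python lists as boolean matrices are all assumptions made for illustration. The sketch keeps one boolean matrix per nonterminal, where entry (i, j) is true when the substring from position i to j can be derived from that nonterminal, and applies each rule A -> B C as a boolean matrix product until nothing changes.

```python
# Hypothetical toy grammar in Chomsky normal form (an assumption, not the
# grammar used in the project). Terminal rules map a nonterminal to the
# characters it can produce; binary rules are (head, left, right) triples.
TERMINAL_RULES = {"A": "A", "C": "C", "G": "G", "U": "U", "Any": "ACGU"}
BINARY_RULES = [
    ("Loop2", "Any", "Any"),
    ("Loop", "Any", "Loop2"),            # loop of exactly three nucleotides
    ("Stem", "A", "XU"), ("Stem", "U", "XA"),
    ("Stem", "C", "XG"), ("Stem", "G", "XC"),
]
for base in "ACGU":
    # X<base>: an inner loop or stem followed by the closing base of a pair.
    BINARY_RULES.append(("X" + base, "Loop", base))
    BINARY_RULES.append(("X" + base, "Stem", base))


def bool_matmul(x, y):
    """Boolean matrix product of two square matrices."""
    n = len(x)
    return [[any(x[i][k] and y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def parse_matrices(word):
    """Return {nonterminal: matrix}; matrix[i][j] is True when word[i:j]
    can be derived from that nonterminal."""
    n = len(word) + 1
    symbols = set(TERMINAL_RULES)
    for head, left, right in BINARY_RULES:
        symbols.update((head, left, right))
    mats = {s: [[False] * n for _ in range(n)] for s in symbols}
    # Initialise from terminal rules: single characters.
    for i, ch in enumerate(word):
        for head, chars in TERMINAL_RULES.items():
            if ch in chars:
                mats[head][i][i + 1] = True
    # Apply every binary rule as a matrix product until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for head, left, right in BINARY_RULES:
            product = bool_matmul(mats[left], mats[right])
            for i in range(n):
                for j in range(n):
                    if product[i][j] and not mats[head][i][j]:
                        mats[head][i][j] = True
                        changed = True
    return mats


def features(word, nonterminal="Stem"):
    """Flatten the upper triangle of one nonterminal's matrix into a
    0/1 vector that could be fed to a neural network."""
    matrix = parse_matrices(word)[nonterminal]
    n = len(matrix)
    return [int(matrix[i][j]) for i in range(n) for j in range(i + 1, n)]


if __name__ == "__main__":
    print(features("GCAAAGC"))
```

For the input `GCAAAGC`, the only substrings derivable from `Stem` under this toy grammar are `CAAAG` and the whole sequence, so the vector has exactly two non-zero entries. A fixed-length input for a network would additionally require a fixed sequence length or padding, which this sketch leaves out.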

Participants

Lunina Polina
Grigorev Semyon

Materials

Publications

Improved Architecture of Artificial Neural Network for Secondary Structure Analysis

November 2019

Semyon Grigorev and Polina Lunina


The Composition of Dense Neural Networks and Formal Grammars for Secondary Structure Analysis

March 2019

Semyon Grigorev and Polina Lunina

We propose a way to combine formal grammars and artificial neural networks for processing biological sequences. Formal grammars encode the secondary structure of the sequence, and neural networks deal with mutations and noise. In contrast to the classical approach, where probabilistic grammars are used for secondary structure modeling, we propose using arbitrary (non-probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar that describes only features of the secondary structure. We then use matrix-based parsing to extract features: the fact that a given substring can be derived from a given nonterminal is a feature. Finally, we use a dense neural network to process the features.
