Circular DNA

You have an internship with a bioinformatics research group studying DNA. A single strand of DNA consists of many genes, which fall into different categories called gene types. Gene types are delimited by specific nucleotide sequences known as gene markers. Each gene type $i$ has a unique start marker $\mathtt{s}_ i$ and a unique end marker $\mathtt{e}_ i$. After many dirty jobs (growing bacteria, cell extraction, protein engineering, and so on), your research group can convert DNA into a form consisting of only the gene markers, removing all the genetic material lying between the markers.

Your research group came up with the interesting hypothesis that gene interpretation depends on whether the markers of some gene types form properly nested structures. To decide whether markers of gene type $i$ form a proper nesting in a given sequence of markers $w$, one needs to consider the subsequence of $w$ containing only the markers of gene type $i$ ($\mathtt{s}_ i$ and $\mathtt{e}_ i$), leaving none of them out. The following (and only the following) are considered to be properly nested structures:

  • $\mathtt{s}_ i \mathtt{e}_ i$

  • $\mathtt{s}_ i N \mathtt{e}_ i$, where $N$ is a properly nested structure

  • $AB$, where $A$ and $B$ are properly nested structures

Given your computing background, you were assigned to investigate this property, but there is one further complication. Your group is studying a specific type of DNA called circular DNA, which is DNA that forms a closed loop. To study nesting in circular DNA, it is necessary to cut the loop at some location, which results in a unique sequence of markers (the direction of reading is fixed by molecular properties). Whether a gene type $i$ forms a proper nesting now also depends on where the circular DNA is cut. Your task is to find the cutting location that maximizes the number of gene types that form a properly nested structure. Figure 1 shows an example corresponding to Sample Input 1. The indicated cut results in the markers for gene type $1$ being properly nested.

\includegraphics[width=0.25\textwidth ]{circular-fig}
Figure 1: Illustration of Sample Input 1 with its optimal cutting location.


The first line of input contains an integer $n$ ($1 \leq n \leq 10^6$), the length of the DNA. The next line contains the DNA sequence, that is, $n$ markers. Each marker is a character $c$ followed by an integer $i$, where $c \in \{ \texttt{s}, \texttt{e} \} $ specifies whether it is a start or an end marker and $i$ ($1 \leq i \leq 10^6$) is the gene type of the marker. The given DNA sequence has been obtained from the circular DNA by cutting at an arbitrary location.


Output one line with two integers $p$ and $m$, where $p$ is the cutting position that maximizes the number of different gene types that form a proper nesting, and $m$ is this maximum number of gene types. The DNA is cut just before the $p^{\text {th}}$ input marker (for instance, the cut shown in Figure 1 has $p=3$). If more than one cutting position yields the same maximum value of $m$, output the smallest $p$ that does so.

Sample Input 1 Sample Output 1
e1 e1 s1 e2 s1 s2 e42 e1 s1
3 1
Sample Input 2 Sample Output 2
s1 s1 e3 e1 s3 e1 e3 s3
8 2
CPU Time limit 3 seconds
Memory limit 1024 MB
Difficulty 3.3medium
Statistics Show
License Restricted, used with permission

Please log in to submit a solution to this problem

Log in