https://codebreaker.xyz/problem/sandwich

This editorial is super long overdue, writing it now for completeness.

Easier Version

For convenience, A0=An+1=.

Firstly, let us understand how to solve the problem for all queries with (xi,yi)=(1,i).

We will need to find how many ranges (l,r) are a sandwich for a fixed r.

Let Sr be the set of l<r such that Almax(Al+1,Al+1,,Ar1). That is Sr is the indices of all the suffix maxima when considering the array [1,r1]. Obviously, the values of elements in Sr will be a non-increasing sequence. It is natural to maintain Sr as a stack to easily transition to Sr+1.

Now, for some lSr, (l,r) is a sandwich if nxt(l), the element to the right of l in Sr satisfies Anxt(l)Ar, which is a suffix of Sr since Sr is non-increasing.

Furthermore, if Anxt(l)<Ar, note the strict inequality, then nxt(l)Sr+1.

So we can quickly solve the above problem by maintaining a stack of Sr of (value,occurance):

  1. delete all value<Ar in Sr and add occurance to answer
  2. if value=Ar is in Sr, add occurance to answer
  3. add a single occurrence of Sr to the stack
  4. Sr has been changed to Sr+1 now

Note that we need a stack of (value,occurance) and not only (value) is so that step 1 will not run in O(n2) time, if we have an array with all same values. Now, the above algorithm clearly runs in O(n) time.

Solution 1

This solution was the original solution.

We will proceed with DnC.

Assume that for some fixed m, all queries satisfy xim<yi. Then we will solve this subproblem O(n), so that the entire algorithm will run in O(nlogn).

There are three types of queries are concerned about:

  1. xilrm
  2. m<lryi
  3. xilm<ryi

For types 1 and 2, use the solution presented in easy version. This can be done in O(n).

Now for type 3, we will want to consider the l[1,m] which satisfy Almax(Al+1,Al+2,,Am) and the r(m,n] which satisfy max(Am+1,Al+2,,Ar1)Ar. Define Bl:=max(Al+1,Al+2,,Am) and Br:=max(Am+1,Al+2,,Ar1) for convenience.

Thus, (l,r) is a sandwich if it additionally satisfy max(Bl,Br)min(Al,Ar).

Since we are sweeping l from right to left, it is helpful to consider both Ai and Bi to be non-decreasing here in our mental model and also note that BiAi.

Let us focus on the range of r is a sandwich when l is fixed. We have to satisfy both BlAr and BrAl, which is a contiguous segment in r. So from this alone we can just do a dynamic range add and obtain O(nlogn) complexity for this problem and O(nlog2n) complexity in total. But we can do better.

We don’t only have the condition that Ai and Bi are non-decreasing with BiAi but we also have Bnxt(i)=Ai.

For example, when A[1,m]=[5,1,4,4,1,3,0], we have (in reverse order) (Al,Bl)=[(0,),(3,0),(4,3),(4,4),(5,4)].

We will study the case of:

  • (Al,Bl)=[(x1,),(x2,x1),(x3,x2),(x4,x3),]
  • (Ar,Br)=[(y1,),(y2,y1),(y3,y2),(y4,y3),]

These two special cases basically generalises everything. Define x0=y0= for convinience.

So the ranges of r that each l will take is xl1yr and yr1xl.

So from here, we split the (Ar,Br) into the following disjoint categories:

  • Xl is the set of all r such that xl1yr and yr<xl
  • Yl is the set of all r such that yr=xl

Then, observe that the set of r that satisfies xl1yr and yr1xl is simply XlYlXl+1 and at most one extra element with xl=yr1<yr.

Therefore, if we compress the elements of r into the important categories above, we only need to range add onto O(1) ranges which can be done using prefix sums instead of doing more general dynamic range adds.

Implementation wise, you don’t need to explicitly generate the above categories but just generate the ranges of r for each element of l and then “discretize” those ranges. The above proof is only showing that when you discretize those ranges, you will have need to update O(1) number of ranges.

Code
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/rope>
using namespace std;
using namespace __gnu_pbds;
using namespace __gnu_cxx;
#define ll long long
#define ii pair<ll,ll>
#define iii pair<ii,ll>
#define fi first
#define se second
#define endl '\n'
#define debug(x) cout << #x << ": " << x << endl

#define pub push_back
#define pob pop_back
#define puf push_front
#define pof pop_front
#define lb lower_bound
#define ub upper_bound

#define rep(x,start,end) for(auto x=(start)-((start)>(end));x!=(end)-((start)>(end));((start)<(end)?x++:x--))
#define all(x) (x).begin(),(x).end()
#define sz(x) (int)(x).size()

#define indexed_set tree<ll,null_type,less<ll>,rb_tree_tag,tree_order_statistics_node_update>
//change less to less_equal for non distinct pbds, but erase will bug

mt19937 rng(chrono::system_clock::now().time_since_epoch().count());

int n,q;
int arr[3000005];

struct Q{
	int l,r;
	int idx;
};

vector<Q> qu;

ll ans[3000005];

ll curr=0;
vector<ii> v;
ll stkl[3000005],stkr[3000005];

ii id[3000005];

int used[3000005];
bool impt[3000005];
int block[3000005];

ll pref[3000005];
int num[3000005];

void dnc(int l,int r,vector<Q> qu){
	if (qu.empty()) return;
	int m=l+r>>1;
	
	if (l==r){
		for (auto &it:qu) ans[it.idx]=1;
		return;
	}
	
	curr=0;
	v.clear();
	rep(x,m+1,l){
		while (!v.empty() && v.back().fi<arr[x]) curr+=v.back().se,v.pob();
		if (!v.empty() && v.back().fi==arr[x]){
			if (sz(v)>=2) curr++;
			curr+=v.back().se;
			v.back().se++;
		}
		else{
			if (sz(v)>=1) curr++;
			v.pub(ii(arr[x],1));
		}
		
		curr++;
		stkl[x]=curr;
	}
	
	curr=0;
	v.clear();
	rep(x,m+1,r+1){
		while (!v.empty() && v.back().fi<arr[x]) curr+=v.back().se,v.pob();
		if (!v.empty() && v.back().fi==arr[x]){
			if (sz(v)>=2) curr++;
			curr+=v.back().se;
			v.back().se++;
		}
		else{
			if (sz(v)>=1) curr++;
			v.pub(ii(arr[x],1));
		}
		
		curr++;
		stkr[x]=curr;
	}
	
	vector<int> idx1;
	vector<int> idx2;
	
	int p=-1;
	rep(x,m+1,l){
		if (arr[x]>=p){
			idx1.pub(x);
			p=arr[x];
		}
		id[x]=ii(-1,-1);
	}
	
	p=-1;
	rep(x,m+1,r+1){
		if (arr[x]>=p){
			idx2.pub(x);
			p=arr[x];
			used[x]=sz(idx2)-1;
		}
		else{
			used[x]=-1;
		}
	}
	
	rep(x,0,sz(idx2)) impt[x]=false;
	int pl=0,pr=0;
	rep(x,0,sz(idx1)){
		while (pr<sz(idx2)-1 && arr[idx2[pr]]<=arr[idx1[x]]) pr++;
		while (pl<sz(idx2) && x && arr[idx2[pl]]<arr[idx1[x-1]]) pl++;
		
		if (pl<=pr) id[idx1[x]]=ii(pl,pr);
		impt[pl]=impt[pr+1]=true;
	}
	
	int IDX=0;
	rep(x,m+1,r+1){
		if (used[x]!=-1){
			if (impt[used[x]]) IDX++,num[IDX]=0;
			num[IDX]++;
		}
		id[x]=ii(IDX,num[IDX]);
	}
	
	vector<Q> vl,vr;
	curr=m+1;
	int mx=0;
	
	for (auto &it:qu){
		if (it.r<=m) vl.pub(it);
		else if (m<it.l) vr.pub(it);
		else{
			while (it.l<curr){
				curr--;
				
				if (id[curr]==ii(-1,-1)) continue;
				
				int bl=id[idx2[id[curr].fi]].fi,br=id[idx2[id[curr].se]].fi;
				ll psum=0;
				
				rep(x,bl,br+1){
					if (x<=mx){
						pref[x]+=num[x]+psum;
						psum+=num[x];
					}
					else{
						pref[x]=pref[x-1]+num[x];
						mx++;
					}
				}
			}
			
			ll temp=stkl[it.l]+stkr[it.r];
			int br=id[it.r].fi;
			if (mx<br) temp+=pref[mx];
			else temp+=pref[br-1]+(pref[br]-pref[br-1])/num[br]*id[it.r].se;
			
			ans[it.idx]=temp;
		}
	}
	
	dnc(l,m,vl);
	dnc(m+1,r,vr);
}

int main(){
	ios::sync_with_stdio(0);
	cin.tie(0);
	cout.tie(0);
	cin.exceptions(ios::badbit | ios::failbit);
	
	cin>>n>>q;
	rep(x,1,n+1) cin>>arr[x];
	
	int a,b;
	rep(x,0,q){
		cin>>a>>b;
		qu.pub(Q({a,b,x}));
	}
	
	sort(all(qu),[](Q i,Q j){
		return i.l>j.l;
	});
	
	dnc(1,n,qu);
	
	rep(x,0,q) cout<<ans[x]<<endl;
}

Solution 2

This solution was found by Yan Hao during testing. It is asymptotically faster at O(n), but practically it is about the same speed.

The main idea is to also handle the ranges by cases, but we split by the maximum element in the range.

Indeed, suppose m1 and m2 is the indices of the first and last maximum elements in the range [xi,yi], possibly m1=m2.

There are no sandwiches that satisfy xil<m1<ryi and xil<m2<ryi. Otherwise, this implies that Am1<Al and Am2<r respectively, which contradicts how we selected m1 and m2.

Note that we can find m1 and m2 in O(n),O(1) using 4 Russians or something.

Let us go back to the solution of solving the problem for all queries with (xi,yi)=(1,i). Instead of only tracking the answer, we need to track the answer for all elements in Sr when we are sweeping on r.

In the original version, our algorithm is:

  1. delete all value<Ar in Sr and add occurance to answer
  2. if value=Ar is in Sr, add occurance to answer
  3. add a single occurrence of Sr to the stack
  4. Sr has been changed to Sr+1 now

In the new version, we will store ansl to denote the answer if (xi,yi)=(l,r) when sweeping on l. ansl will only maintained if it is a strict suffix maxima. Our algorithm will become:

  1. delete all value<Ar in Sr and add occurance to all ansl
  2. if value=Ar is in Sr, add occurance to all ansl (this is why we can only maintain for strict suffix maxima, or this will be O(n2))
  3. add a single occurrence of Sr to the stack, and set ansr:=0
  4. Sr has been changed to Sr+1 now

The range adds above are easy to handle since they are global range adds.

The only case we need to handle now is those sandwiches between m1 and m2. But those can be easily handled by modifying the above line sweep a bit.

The below code actually runs in O(nα(n)) cause I didn’t use O(n),O(1) RMQ.

Code
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/rope>
using namespace std;
using namespace __gnu_pbds;
using namespace __gnu_cxx;
#define ll long long
#define ii pair<int,int>
#define iii pair<ii,int>
#define fi first
#define se second
#define endl '\n'
#define debug(x) cout << #x << ": " << x << endl

#define pub push_back
#define pob pop_back
#define puf push_front
#define pof pop_front
#define lb lower_bound
#define ub upper_bound

#define rep(x,start,end) for(auto x=(start)-((start)>(end));x!=(end)-((start)>(end));((start)<(end)?x++:x--))
#define all(x) (x).begin(),(x).end()
#define sz(x) (int)(x).size()

#define indexed_set tree<ll,null_type,less<ll>,rb_tree_tag,tree_order_statistics_node_update>
//change less to less_equal for non distinct pbds, but erase will bug

mt19937 rng(chrono::system_clock::now().time_since_epoch().count());

struct UFDS{
	int p[2500005];
	int r[2500005];
	int v[2500005];
	
	void reset(){
		rep(x,0,2500005){
			p[x]=v[x]=x;
			r[x]=0;
		}
	}
	
	int par(int i){
		if (i==p[i]) return i;
		else return p[i]=par(p[i]);
	}
	
	void unions(int i,int j,int k){
		i=par(i),j=par(j);
		if (r[i]>r[j]) swap(i,j);
		p[i]=j;
		if (r[i]==r[j]) r[j]++;
		v[j]=k;
	}
} ufds;

inline void read(int &x){
	x=0;
	char ch=getchar_unlocked();
	while (ch&16){
		x=x*10+(ch&15);
		ch=getchar_unlocked();
	}
}

int n,q;
int arr[2500005];
vector<int> pos[2500005];

int nxtr[2500005];
ll valr[2500005];

int nxtl[2500005];
ll vall[2500005];

int pos_grp[2500005];
ll pref[2500005];

ii qu[2500005];
ii qu2[2500005];
vector<iii> Q;
ll ans[2500005];

int main(){
	ios::sync_with_stdio(0);
	cin.tie(0);
	cout.tie(0);
	cin.exceptions(ios::badbit | ios::failbit);
	
	read(n),read(q);
	rep(x,1,n+1) read(arr[x]);
	arr[0]=arr[n+1]=1e9+100;
	
	rep(x,0,q){
		read(qu[x].fi),read(qu[x].se);
		Q.pub({qu[x],x});
	}
	
	sort(all(Q),[](iii i,iii j){
		return i.fi.se<j.fi.se;
	});
	
	int idx=0;
	vector<int> stk;
	ufds.reset();
	
	rep(x,1,n+2){
		valr[x]=1;
		while (!stk.empty() && arr[stk.back()]<arr[x]){
			int temp=stk.back(); stk.pob();
			
			ll cnt=1;
			
			rep(y,0,sz(pos[temp])-1){
				pref[pos[temp][y+1]]=valr[pos[temp][y]]+pref[pos[temp][y]];
			}
			
			reverse(all(pos[temp]));
			rep(y,0,sz(pos[temp])){
				nxtr[pos[temp][y]]=x;
				ufds.unions(pos[temp][y],x,x);
				
				valr[pos[temp][y]]+=cnt;
				cnt=valr[pos[temp][y]]+y+2;
			}
			if (!stk.empty()) valr[stk.back()]+=cnt-1;
		}
		if (!stk.empty() && arr[stk.back()]==arr[x]){
			swap(pos[stk.back()],pos[x]);
			stk.back()=x;
		}
		else{
			stk.pub(x);
		}
		pos_grp[x]=sz(pos[x]);
		pos[x].pub(x);
		while (idx<q && Q[idx].fi.se==x){
			qu2[Q[idx].se].fi=ufds.v[ufds.par(Q[idx].fi.fi)];
			idx++;
		}
	}
	
	sort(all(Q),[](iii i,iii j){
		return i.fi.fi>j.fi.fi;
	});
	
	idx=0;
	stk.clear();
	rep(x,1,n+1) pos[x].clear();
	ufds.reset();
	
	rep(x,n+1,0){
		vall[x]=1;
		while (!stk.empty() && arr[stk.back()]<arr[x]){
			int temp=stk.back(); stk.pob();
			
			ll cnt=1;
			
			reverse(all(pos[temp]));
			rep(y,0,sz(pos[temp])){
				nxtl[pos[temp][y]]=x;
				ufds.unions(pos[temp][y],x,x);
				
				vall[pos[temp][y]]+=cnt;
				cnt=vall[pos[temp][y]]+y+2;
			}
			if (!stk.empty()) vall[stk.back()]+=cnt-1;
		}
		if (!stk.empty() && arr[stk.back()]==arr[x]){
			swap(pos[stk.back()],pos[x]);
			stk.back()=x;
		}
		else{
			stk.pub(x);
		}
		pos[x].pub(x);
		while (idx<q && Q[idx].fi.fi==x){
			qu2[Q[idx].se].se=ufds.v[ufds.par(Q[idx].fi.se)];
			idx++;
		}
	}
	
	rep(x,n+1,1) valr[x]+=valr[nxtr[x]];
	rep(x,1,n+1) vall[x]+=vall[nxtl[x]];
	
	int a,b;
	rep(x,0,q){
		tie(a,b)=qu2[x];
		
		ans[x]=1+valr[qu[x].fi]-valr[qu2[x].fi]+vall[qu[x].se]-vall[qu2[x].se];
		
		ans[x]+=pref[b]-pref[a];
		ll temp=pos_grp[b]-pos_grp[a];
		ans[x]+=temp*(temp+1)/2;
		
		cout<<ans[x]<<endl;
	}
}

<
Previous Post
High Dimensional Prefix Sums
>
Next Post
CS3233 Final Contest 2024