When doing lazy propagation in segment tree, people usually have a function to propagate values inside a node.

When I learnt lazy propagation, I did not use reference code and just coded it myself. I apparently did not know how to spell, so I named my function propo. This is how I managed to troll an many Singaporean CPers into spelling propagation as propogation. Even know as I am typing this blog, I still default to writing propogate… I will try to spell it correctly in this blog.

The earliest record I have of the propo thing being public is in the dec course 2020 lecture notes on DS III. If my memory serves me right, it was Rui Yuan and I who were preparing this lecture and I was the one who wrote the code with propogate as function names and as comments. I also distinctly remember coding lazy propagation during an online lecture. I definitely spelled propagation wrongly there as well.

Now, 4 years later. I realized that many people were actually spelling propagation wrongly as well. How many people exactly?

To test this, I decided to check people’s submission on Segment Tree 2 on codebreaker. This is what the top 20 people on codebreaker named their propagate function. Note that I only checked their first AC submission.

Username Name of Function Year O/A
PlayVoltz propagate 2023 A
Pan propogate 2023 O
TheRaptor propogate 2021 O
penguin133 propo 2020 O
LCJLY lazy_propagate 2022 A
ryangohca value 2020  
syy value 2020  
AlphanumericUsername lazy_propagate 2022 A
bribritt notlazy 2022  
errorgorn propo 2020 O
shoryu386 lazy_propagate 2022 A
chenweilian - 2022  
blackscreen1 propogate 2021 O
siewjh propo 2021 O
hmm value 2021  
mofumofu prop 2021  
Sans12345 forceProp 2022  
dsyz propagate 2020 A
kxd subnode 2022  

Note that chenweilian does not have a propagation function because he used a fenwick tree.

But in the top 20, we have more people spelling propagate wrongly! What about the rest of the users?

To do this, I wrote a script to scrape codebreaker (please ask the judge devs for permission before doing this).

LOGIN_SESSION="google-login-session=dQw4w9WgXcQ"

headers = {'Cookie' : LOGIN_SESSION}

import requests
from bs4 import BeautifulSoup
import pandas as pd
import json
import time

def getSubIds():
	res=[]
	URL = r"https://codebreaker.xyz/submissions?problem=segmenttree2&page="
	for x in range(1,81):
		time.sleep(1)
		
		page = requests.get(URL+str(x))
		soup = BeautifulSoup(page.content, "html.parser")
		table = pd.read_html(str(soup.find(id="myTable")))[0]
		
		table = table[table["Score"]==100]
		res += table["ID"].tolist()
	return res
	
def getSubInfo(subId):
	time.sleep(1)
	print(subId)
	
	URL = r"https://codebreaker.xyz/submission/"
	page = requests.get(URL+str(subId), headers=headers)
	soup = BeautifulSoup(page.content, "html.parser")
	table = pd.read_html(str(soup.find_all("table")[0]))[0]
	
	code = soup.find_all("textarea")[0].get_text()
	
	res = {
		"username" : table.at[1,1],
		"id" : subId,
		"date" : table.at[2,1],
		"code" : code
	}
	
	return res
	
subs = getSubIds()
print("total ",len(subs))
subs = [getSubInfo(x) for x in subs]

with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(subs, f, ensure_ascii=False, indent=4)

You can find the json file here.

Please note that the codes that are written by me (errorgorn) are broken. I don’t know why. I think it is because there are Chinese characters in my code (I used to put 雪花飘飘 in my code header as a joke).

Let’s do some analysis on this data.

There are 430 total AC submissions. 109 of them contain the substring “propo” and 145 of them contain the substring “propa”.

Note that people might have submitted multiple AC codes. Especially, so when testing codes of other people. So it is a good to only use statistics when filtering by the first AC of each user.

When ignoring counts of repeated ACs, 211 distinct users with ACs in total and have 58 and 72 users using “propo” and “propa” respectively. Wow, there is actually a substantial number of people spelling propagte wrongly….

If we filter this by the year of the submission, we can actually observe the number of people using propo decreasing. So I guess someone finally fixed the spelling along the way.

Year 2020 2021 2022 2023 2024
propo 24 15 12 4 3
propa 8 7 22 22 13
$\frac{\text{propo}}{\text{propo}+\text{propa}}$ 0.75 0.68 0.35 0.15 0.19

Strangely, there are 4 submissions with the substring “propo” and “propa” inside their code. They have IDs 563732, 343819, 343816, 187268. It turns out they directly copied the lazy propagation code from that lecture notes, but they changed the function name propogate to propagate, but not instances of propagate in the comments. Perhaps the spelling was fixed in lectures of later years?

I suppose it would be a good idea to do some sort of comparison with this code to see who copied it to get AC. To do this, we will define the similarity of a code $A$ to our “model code” $B$ as $\frac{\text{LCS}(A,B)}{\text{len}(B)}$ where $\text{LCS}$ is the longest common subsequence. We will strip all whitespace.

We can try setting $B$ as the segment tree code with comments:

structnode{ints,e,m;//rangeis[s,e],misthemiddlepointintval;//sumof[s,e]intlazy;//lazytagof[s,e]node*l,*r;//createtwochildrenlandr,wherelis[s,m]and[m+1,e]node(intS,intE){//constructorcallednodes=S,e=E,m=(s+e)/2;val=0;//initiallyallvaluesare0lazy=0;//lazytagof0willmeanthereisnoupdate(sentinelvalue)if(s!=e){//nodeisnotyetaleaf,socreatetwochildrenl=newnode(s,m);//createleftchildr=newnode(m+1,e);//createrightchild}}voidpropagate(){if(lazy==0)return;//nothinghappensval+=lazy*(e-s+1);//(e-s+1)isthelengthoftherangeif(s!=e){//notaleaf,sendlazytagstochildrenl->lazy+=lazy;r->lazy+=lazy;}lazy=0;//setourlazytagvaluebacktothesentinel}voidupdate(intS,intE,intV){//increment[S,E]byVif(s==S&&e==E)lazy+=V;//updatecoversrange,updatelazytagelse{//gowehavetogodeeperif(E<=m)l->update(S,E,V);//[S,E]isintheleftchildelseif(m<S)r->update(S,E,V);//[S,E]isintherightchildelsel->update(S,m,V),r->update(m+1,E,V);l->propagate(),r->propagate();//remembertopropagateyourchildrenbeforeupdateyourselfval=l->val+r->val;//updatetherangesum}}intquery(intS,intE){propagate();//remembertopropagateif(s==S&&e==E)returnval;//case1elseif(E<=m)returnl->query(S,E);//case2,recursetoleftchildelseif(S>=m+1)returnr->query(S,E);//case3,recursetorightchildelsereturnl->query(S,m)+r->query(m+1,E);//case4,splitthequeryrange,recursetobothchilds}}*root;

Or without comments:

structnode{ints,e,m;intval;intlazy;node*l,*r;node(intS,intE){s=S,e=E,m=(s+e)/2;val=0;lazy=0;if(s!=e){l=newnode(s,m);r=newnode(m+1,e);//createrightchild}}voidpropagate(){if(lazy==0)return;val+=lazy*(e-s+1);if(s!=e){l->lazy+=lazy;r->lazy+=lazy;}lazy=0;}voidupdate(intS,intE,intV){if(s==S&&e==E)lazy+=V;else{//gowehavetogodeeperif(E<=m)l->update(S,E,V);elseif(m<S)r->update(S,E,V);elsel->update(S,m,V),r->update(m+1,E,V);l->propagate(),r->propagate();val=l->val+r->val;}}intquery(intS,intE){propagate();if(s==S&&e==E)returnval;elseif(E<=m)returnl->query(S,E);elseif(S>=m+1)returnr->query(S,E);elsereturnl->query(S,m)+r->query(m+1,E);range,recursetobothchilds}}*root;

We will graph out the similarity to the code with and without comments. The points are colored:

  • Green if they contain “propo” and “propa”
  • Red if they contain “propo”
  • Blue if they contain “propa”
  • Black otherwise

We obtain this plot:

Also this is a gif when we add these points by time.

Notice that most of the red dots occur before 2022 and they do not even substantially match the code given in the lecture notes. I think this highly suggests that I actually made people type out propo

This is the script I used for my analysis. Shoutout to ChatGPT.

B0="structnode{ints,e,m;//rangeis[s,e],misthemiddlepointintval;//sumof[s,e]intlazy;//lazytagof[s,e]node*l,*r;//createtwochildrenlandr,wherelis[s,m]and[m+1,e]node(intS,intE){//constructorcallednodes=S,e=E,m=(s+e)/2;val=0;//initiallyallvaluesare0lazy=0;//lazytagof0willmeanthereisnoupdate(sentinelvalue)if(s!=e){//nodeisnotyetaleaf,socreatetwochildrenl=newnode(s,m);//createleftchildr=newnode(m+1,e);//createrightchild}}voidpropagate(){if(lazy==0)return;//nothinghappensval+=lazy*(e-s+1);//(e-s+1)isthelengthoftherangeif(s!=e){//notaleaf,sendlazytagstochildrenl->lazy+=lazy;r->lazy+=lazy;}lazy=0;//setourlazytagvaluebacktothesentinel}voidupdate(intS,intE,intV){//increment[S,E]byVif(s==S&&e==E)lazy+=V;//updatecoversrange,updatelazytagelse{//gowehavetogodeeperif(E<=m)l->update(S,E,V);//[S,E]isintheleftchildelseif(m<S)r->update(S,E,V);//[S,E]isintherightchildelsel->update(S,m,V),r->update(m+1,E,V);l->propagate(),r->propagate();//remembertopropagateyourchildrenbeforeupdateyourselfval=l->val+r->val;//updatetherangesum}}intquery(intS,intE){propagate();//remembertopropagateif(s==S&&e==E)returnval;//case1elseif(E<=m)returnl->query(S,E);//case2,recursetoleftchildelseif(S>=m+1)returnr->query(S,E);//case3,recursetorightchildelsereturnl->query(S,m)+r->query(m+1,E);//case4,splitthequeryrange,recursetobothchilds}}*root;"

B1="structnode{ints,e,m;intval;intlazy;node*l,*r;node(intS,intE){s=S,e=E,m=(s+e)/2;val=0;lazy=0;if(s!=e){l=newnode(s,m);r=newnode(m+1,e);//createrightchild}}voidpropagate(){if(lazy==0)return;val+=lazy*(e-s+1);if(s!=e){l->lazy+=lazy;r->lazy+=lazy;}lazy=0;}voidupdate(intS,intE,intV){if(s==S&&e==E)lazy+=V;else{//gowehavetogodeeperif(E<=m)l->update(S,E,V);elseif(m<S)r->update(S,E,V);elsel->update(S,m,V),r->update(m+1,E,V);l->propagate(),r->propagate();val=l->val+r->val;}}intquery(intS,intE){propagate();if(s==S&&e==E)returnval;elseif(E<=m)returnl->query(S,E);elseif(S>=m+1)returnr->query(S,E);elsereturnl->query(S,m)+r->query(m+1,E);range,recursetobothchilds}}*root;"

import json
import pylcs

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

with open('data.json', 'r') as f:
	data = json.load(f)

data=data[::-1] #dates are reverse order in the json file

tot=0
num=[[0]*5,[0]*5]

X=[]
Y=[]
C=[]
D=[]

names=set()

for dat in data:
	if (dat["username"] in names):
		continue
		
	names.add(dat["username"])	#uncomment this line if you do not want distinct names
	
	tot+=1
	
	A=dat["code"]
	
	X.append(pylcs.lcs_sequence_length(A, B0)/len(B0))
	Y.append(pylcs.lcs_sequence_length(A, B1)/len(B1))
	
	propo = "propo" in A
	propa = "propa" in A
	
	if (propo and propa):
		print(dat["username"],dat["id"])
	
	year=ord(dat["date"][3])-ord('0')
	if (propo):
		num[0][year]+=1
	if (propa):
		num[1][year]+=1
	
	if (propo and propa):
		C.append("green")
	elif (propo):
		C.append("red")
	elif (propa):
		C.append("blue")
	else:
		C.append("black")
		
	D.append(dat["date"])
	

print(tot)
print(num)
print([num[0][i]/(num[0][i]+num[1][i]) for i in range(5)])
print(list(map(sum,num)))

fig, ax = plt.subplots(figsize=(15,10))

ax.set_xlim(0,1)
ax.set_ylim(0,1)
ax.set_xlabel('With Comments')
ax.set_ylabel('Without Comments')

line_y = [0,1]
line_x = [0, len(B1)/len(B0)]
plt.plot(line_x, line_y, color='red', linestyle='-')


scat = ax.scatter([], [], s=100)
title = ax.set_title('Scatter Plot Animation')

def update(frame):
    current_x = X[:frame + 1]
    current_y = Y[:frame + 1]
    current_colors = C[:frame + 1]
    scat.set_offsets(np.column_stack((current_x, current_y)))
    scat.set_color(current_colors)
    title.set_text(D[frame][:7])
    return scat, title

ani = FuncAnimation(fig, update, frames=len(X), interval=100, blit=True)

ani.save("scatter_plot_animation.gif", writer='pillow')

update(len(X) - 1)
fig.savefig("scatter_plot.png", dpi=100)

<
Previous Post
Haskell for Advent of Code (Day 4-7)
>
Blog Archive
Archive of all previous blog posts