A data-driven study of award-winning essays (Literary Essays and Social Essays) from Vietnam's National Excellent Student Competition in Literature, tracing patterns across six decades, 1961 to 2024.
About this competition
First held in 1961, this national competition is for high school students (grades 10–12). Participants go through intense training and advance successively through school, district, provincial, and national rounds. A standard exam has two parts: a Social Essay (8 points) on societal issues and a Literary Essay (12 points) on literary topics. Award-winning students are nationally recognized and often receive direct university admission.
In this project, we analyze winning essays from the National Excellent Student Competition in Literature. Unlike other subjects, Literature has no fixed grading rubric; it demands deep critical thinking and advanced analytical skills, so evaluation is largely qualitative. This made us curious: across the 63 years of the competition, what patterns and differences emerge when these top-scoring national essays are analyzed with both qualitative and quantitative methods?
This analysis not only helps students preparing for the competition gain deeper insights and better anticipate exam topics, but also serves as a resource for teachers and the academic community by revealing which authors and works are most frequently cited and what themes tend to appear again and again in award-winning essays. These patterns can help educators understand what the exam values over time, and offer a starting point for deeper discussions about how literary merit is assessed in the national contest.
We analyze high-scoring essays across two complementary dimensions: horizontal (across all years) and vertical (temporal trends).
The notes below explain why each name appears so frequently in award-winning literary essays.
Nam Cao
A major realist writer whose works focus on peasants and marginalized intellectuals, highlighting their
psychological struggles and social conditions. Widely regarded as one of the most important figures in
modern Vietnamese literature.
Nguyễn Du
A classical Vietnamese poet best known for The Tale of Kiều, a foundational work of Vietnamese literature.
His writing reflects deep humanism and explores themes of fate, morality, and compassion.
Tố Hữu
A revolutionary poet whose works reflect Vietnam's socialist ideology and wartime experiences. His poetry
combines political commitment with strong emotional expression.
Hồ Chí Minh – "Bác"
The founding leader of modern Vietnam and also a writer and poet, with works like Prison Diary. His writing
blends simplicity with strong moral and ideological messages.
Liên — Hai đứa trẻ
The central character in Hai đứa trẻ, representing a quiet, observant perspective on rural life. Her
character reflects subtle emotions and a deep awareness of surrounding stagnation.
Kim Trọng — Truyện Kiều
Kiều's first love and the ideal Confucian scholar — loyal, moral, and devoted. He symbolizes enduring love
within a turbulent society.
Truyện Kiều
The Tale of Kiều tells the tragic story of Thúy Kiều and explores themes of fate, sacrifice, and human
dignity. It is widely considered the most important work in Vietnamese literature.
Chí Phèo
A realist short story about a peasant who is dehumanized by society. It critically portrays social injustice
and loss of identity.
Tây Tiến
A poem about soldiers during the anti-French resistance, blending romanticism with realism. It captures both
hardship and heroism.
Numbers reflect how many essays feature each theme. Multiple themes can appear within a single essay, as prompts typically require students to address more than one argumentative dimension.
Does length equal quality? The answer differs dramatically between literary and social essays.
For literary essays, lengths are consistently high across all prize levels — approximately 2,500–2,900 words. Depth, detailed textual analysis, and extensive evidence are essential regardless of ranking.
For social essays, a striking contrast emerges: First and Second prize winners average only ~1,500 words, while Third prize winners average ~2,850 words — nearly double.
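The averages above come down to a grouped mean over word counts. A minimal sketch of that arithmetic follows, assuming each essay is a dict with a `solution` text and an `award` label plus a hypothetical `type` key ("literary" or "social"), and assuming that a "word" is a whitespace-separated token. The study's actual counting rule is not stated, and the sample records in the usage note are placeholders, not real essays.

```python
from collections import defaultdict


def mean_length_by_group(essays):
    """Return the mean whitespace-token count per (type, award) group."""
    totals = defaultdict(lambda: [0, 0])  # key -> [total words, essay count]
    for essay in essays:
        key = (essay["type"], essay["award"])
        totals[key][0] += len(essay["solution"].split())
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

For example, `mean_length_by_group([{"type": "social", "award": "First", "solution": "a b c"}])` yields one group keyed by `("social", "First")`. Comparing group means this way is what makes the contrast between prize levels visible.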
How cited authors, literary traditions, and intellectual frameworks evolved from 1961 to 2024.
114 award-winning essays (77 literary, 37 social) — collected, digitized, and analyzed across 63 years.
Since competition solutions are unofficial and non-public, only selected high-award essays have been preserved by educational organizations as teaching materials for future generations of competition participants. In this project, we keep all excerpts in the original language to preserve the integrity of the authors' intended meaning and to avoid translation inaccuracies caused by cultural and linguistic differences.
Tesseract detects word-level boxes on each page. Invalid detections are filtered out; the remaining boxes are merged into line-level regions, sorted top to bottom, and padded to avoid clipping characters at the edges.
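The detection step can be sketched as a pure function over Tesseract's word-level output. This is an illustration under assumptions, not the project's actual code: the input dict mirrors the keys returned by `pytesseract.image_to_data(..., output_type=Output.DICT)`, words are grouped by Tesseract's own block, paragraph, and line indices, and the `pad` and `min_conf` defaults are hypothetical values.

```python
from collections import defaultdict


def merge_words_into_lines(data, page_width, page_height, pad=4, min_conf=0.0):
    """Merge word-level boxes into padded line-level boxes, top to bottom.

    `data` is a dict of parallel lists keyed like pytesseract's DICT output:
    text, conf, left, top, width, height, block_num, par_num, line_num.
    """
    groups = defaultdict(list)
    for i, text in enumerate(data["text"]):
        w, h = data["width"][i], data["height"][i]
        # Filter invalid detections: blank text, low confidence, empty boxes.
        if not text.strip() or float(data["conf"][i]) < min_conf or w <= 0 or h <= 0:
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        x, y = data["left"][i], data["top"][i]
        groups[key].append((x, y, x + w, y + h))

    lines = []
    for boxes in groups.values():
        # Union of the word boxes, grown by `pad` and clamped to the page.
        x0 = max(min(b[0] for b in boxes) - pad, 0)
        y0 = max(min(b[1] for b in boxes) - pad, 0)
        x1 = min(max(b[2] for b in boxes) + pad, page_width)
        y1 = min(max(b[3] for b in boxes) + pad, page_height)
        lines.append((x0, y0, x1, y1))

    # Sort top to bottom, breaking ties left to right.
    lines.sort(key=lambda b: (b[1], b[0]))
    return lines
```

Each returned tuple is `(left, top, right, bottom)` and can be used directly as a crop box. Clamping after padding keeps the crop inside the page image.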
Each line region is cropped from the page image and passed to VietOCR for text recognition. Empty or invalid predictions are discarded to ensure clean output.
Recognized lines are joined with newlines per page, then appended in document order across all pages to produce the complete extracted text.
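The recognition and assembly steps can be sketched without tying the code to a specific OCR model. In the sketch below, `recognize` is a hypothetical callable that maps one cropped line image to a string; in a real pipeline it might wrap a VietOCR predictor. Joining pages with a single newline is also an assumption, since the page separator is not specified.

```python
def recognize_page(line_crops, recognize):
    """Recognize each cropped line and join the valid results with newlines."""
    lines = []
    for crop in line_crops:
        text = recognize(crop)
        # Discard empty or invalid predictions (None, non-strings, whitespace).
        if isinstance(text, str) and text.strip():
            lines.append(text.strip())
    return "\n".join(lines)


def assemble_document(pages_of_crops, recognize):
    """Append page texts in document order to form the full extracted text."""
    page_texts = [recognize_page(crops, recognize) for crops in pages_of_crops]
    # Skip pages that produced no text so they leave no blank lines behind.
    return "\n".join(text for text in page_texts if text)
```

Passing the recognizer in as an argument keeps the filtering and ordering logic testable with a stub, independent of the model that produces the text.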
Watermarks, page numbers, and embedded links are stripped from the OCR output, for example "Chuyên trang ôn Văn", "TaiLieuOnThi.Net", and residual page numbers.
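A cleaning pass of this kind could look like the sketch below. The two watermark strings are the ones named above; the URL pattern and the rule for spotting residual page numbers (a line holding only one to three digits, optionally prefixed by "Trang" or "Page") are assumptions and would need tuning against the real scans, since a legitimate line consisting only of a short number would also be dropped.

```python
import re

WATERMARKS = ("Chuyên trang ôn Văn", "TaiLieuOnThi.Net")
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# Assumed page-number shape: a line holding only 1-3 digits, e.g. "12" or "Trang 12".
PAGE_NUMBER_RE = re.compile(r"^\s*(?:(?:Trang|Page)\s*)?\d{1,3}\s*$", re.IGNORECASE)


def clean_ocr_text(text):
    """Strip watermarks, links, and residual page numbers from OCR output."""
    cleaned = []
    for line in text.splitlines():
        for mark in WATERMARKS:
            line = re.sub(re.escape(mark), "", line, flags=re.IGNORECASE)
        line = URL_RE.sub("", line)
        if PAGE_NUMBER_RE.match(line):
            continue  # the line was only a page number
        line = line.strip()
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)
```

Limiting the page-number pattern to three digits means that a line containing only a four-digit year such as 1961 is kept.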
Each cleaned document is converted into structured JSON entries with four fields: year, problem, solution, award.
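One possible shape for these entries is sketched below. The four field names come from the description above; the field types (an integer year, string values elsewhere) and the `EssayEntry` and `entries_to_json` names are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EssayEntry:
    year: int      # competition year (integer type is an assumption)
    problem: str   # the exam prompt
    solution: str  # the cleaned essay text
    award: str     # prize level


def entries_to_json(entries):
    """Serialize entries to JSON, keeping Vietnamese characters readable."""
    return json.dumps([asdict(e) for e in entries], ensure_ascii=False, indent=2)
```

Setting `ensure_ascii=False` writes Vietnamese diacritics as-is instead of as `\u` escapes, which fits the decision to keep all excerpts in the original language.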
Evolving Social Themes Over Time
How topics in social essays shifted from enduring moral values toward contemporary societal concerns. Each bar shows the approximate span of a theme's presence.
Theme labels recovered from the chart: Community; Cultural Conduct; Nature Protection; Media – Information.